Given that an estimated 0.6% of the U.S. population is transgender (trans) and that large health disparities have been documented for this population, government and research organizations are increasingly expanding measures of sex/gender to be trans inclusive. Options suggested for trans community surveys, such as expansive check-all-that-apply gender identity lists and write-in options that offer maximum flexibility, are generally not appropriate for broad population surveys, which require a limited number of questions and a small number of categories for analysis. Trans-inclusive population survey measures for sex/gender, including those currently in use, have received little evaluation. Using an internet survey and follow-up of 311 participants, and cognitive interviews with a maximum-diversity sub-sample (n = 79), we conducted a mixed-methods evaluation of two existing measures: a two-step question developed in the United States and a multidimensional measure developed in Canada. For both measures, we found very low levels of item missingness and no indicators of confusion on the part of cisgender (non-trans) participants. However, a majority of interview participants indicated problems with each question item set. Agreement between the two measures in assessment of gender identity was very high (K = 0.9081), but gender identity was a poor proxy for other dimensions of sex or gender among trans participants. Issues that emerged from analysis to inform measure development or adaptation included the dimensions of sex/gender measured, whether non-binary identities were trans, Indigenous and cultural identities, proxy reporting, temporality concerns, and the inability of a single item to provide a valid measure of sex/gender.
Based on this evaluation, we recommend that population surveys meant for multi-purpose analysis consider testing a new Multidimensional Sex/Gender Measure that includes three simple items (one asked only of a small sub-group) to assess gender identity and lived gender, with optional additions. We provide considerations for adapting this measure to different contexts.