god likes scientists

I don’t believe in astrology, but I do firmly believe in the Resurrection. As far as scientific evidence goes, both prove equally absurd. WTF? Why accept even one of them? Why not both? I have no good answer for this!

I could say that “events” in my life confirm my Christian experience, but that might be pattern recognition bias—seeing confirmation of my faith in patterns that my brain constructs out of non-patterned signal (because brains do that [1]).

Similarly, because I’ve never bothered to look for confirmation of astrological interpretations of my life, I’ve never “found” them in my life narrative. Again, pattern recognition bias.

Perhaps (and I am too unfamiliar with astrology to know for sure) astrology is about finding comfort in the universe’s design, the sense that there is a “plan”. Is there anything in astrology that is meant to be uncomfortable? I don’t know!

The Christian experience is not comfortable; at least I don’t seek it out for comfort with regard to my place in the universe. If God asks me to, I’ll perform God’s work in Hell.

Maybe it’s about love: I do not perceive that the universe, as expressed in stars, planets, mass, and energy, “loves” me. But I need to feel love and the Christian narrative offers that. The Resurrection itself is a love story.

Perhaps I created God in my image—an image of a human who needs love. And the need for love comes from evolutionary psychology; human-to-human attachment driving tribal cooperation, driving survival, driving gene propagation. The selfish gene [2].

“What is truth?” retorted Pilate [3].

I’m going to continue trusting God and continue trusting my faith in God, even without these questions answered. And the God I believe in wants us to wrestle with these matters; God gave us brains and expects us to use them critically.

God likes scientists: “Doubting” Thomas just wanted evidence. He was not rejected for asking for it.

“Faith” and “belief” mean different things. I see “belief” as getting hung up on the facts, where science and logic matter to defining reality. “Belief” has its place: for example, I believe in “F=m*a” at appropriate velocities and definitely believe in God’s existence and love.

But I do not emphasize belief in my spiritual practice, which is where “faith” comes in. “Faith”, in my book, is trusting the divine deep within my soul without needing to understand all the particulars about where things are headed.

Sometimes faith doesn’t even require much commitment to reality. I tell a story in my article “an allegory of affection from a Hindu goddess” about a visitation by Durga that I experienced during a dream. I do not worry about whether this visit really happened or not; the experience enriched my faith in the Christian god while enhancing my understanding of Hinduism. I’m not going to argue about what is “real” in this situation. Rather, I’ll just accept the personal growth that came of it.

This is the embrace of faith.


  1. https://en.wikipedia.org/wiki/Apophenia
  2. https://en.wikipedia.org/wiki/The_Selfish_Gene
  3. John 18:38 (New International Version)
  4. The image below is copied from https://www.wikihow.com/Be-a-Good-Scientist.

toward a gene panel for psychiatric violence

I recently developed a method for specifying a comprehensive gene list for investigating genes related to psychiatric violence, which I describe below. First though, here’s a cool picture from the analysis:


I started by extracting a list of diseases involving violence from [1], removing epilepsy, dementia, mental retardation (is there a better word for this?), and Alzheimer’s disease. I also removed sexual sadism from the list, as one might debate whether or not this qualifies as a “disease”. I then matched those diseases by name (more accurately, by components within each name) to diseases contained in DisGeNET [2] to determine the genes associated with those diseases. Next, I built a network graph of the genes where an edge between two genes indicates one or more diseases in common. A tractable subset of this graph is pictured above for demonstration. Finally, I computed the size of the ego graph for each gene (node) and ranked them, as listed below. Greater ego network size indicates greater probable biological importance vis-a-vis psychiatric diseases having violence as a symptom.

I posted my code used for this analysis at [3].
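The graph construction and ranking described above can be sketched in a few lines. This is only an illustration, not the posted code: the gene and disease names are hypothetical placeholders rather than DisGeNET data, and I assume here that "ego network size" means the number of edges among a gene and its direct neighbors (counting nodes instead would be an equally plausible reading).

```python
from itertools import combinations

# Hypothetical gene -> disease associations (illustrative only, not DisGeNET data).
gene_diseases = {
    "GENE_A": {"disease_1", "disease_2"},
    "GENE_B": {"disease_2"},
    "GENE_C": {"disease_2", "disease_3"},
    "GENE_D": {"disease_3"},
    "GENE_E": {"disease_4"},
}

def build_gene_graph(gene_diseases):
    """Return adjacency sets: two genes are linked if they share at least one disease."""
    adjacency = {gene: set() for gene in gene_diseases}
    for g1, g2 in combinations(gene_diseases, 2):
        if gene_diseases[g1] & gene_diseases[g2]:
            adjacency[g1].add(g2)
            adjacency[g2].add(g1)
    return adjacency

def ego_network_size(adjacency, gene):
    """Count edges in the radius-1 ego graph (assumed definition of 'size')."""
    members = adjacency[gene] | {gene}
    return sum(1 for a, b in combinations(sorted(members), 2) if b in adjacency[a])

adjacency = build_gene_graph(gene_diseases)

# Rank genes from largest to smallest ego network.
ranking = sorted(adjacency, key=lambda g: ego_network_size(adjacency, g), reverse=True)
```

The pairwise loop is quadratic in the number of genes, which is fine for a toy example; a real gene list of this size would likely call for a graph library.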


  1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686644/
  2. http://www.disgenet.org
  3. https://github.com/whole-systems-enterprises/blog/tree/master/adoption_study/genetics


Here is a portion of the gene list I put together:

Ego Network Size | NCBI Gene Symbol | NCBI Gene ID
1729008 | NR3C1 | 2908
1723895 | GRN | 2896
1723751 | PBRM1 | 55193
1721624 | DAOA-AS1 | 282706
1721482 | OPRM1 | 4988
1721103 | LINC00273 | 649159
1720930 | TDO2 | 6999
1720404 | MTHFR | 4524
... | ... | ...

tracking my gender transition through computational linguistics and machine learning

I wrote 299 blog posts in the last decade, roughly half on badassdatascience.com and half on genderpunk360.com. I produced most of the Badass Data Science content while publicly expressing as a man, and most of the Gender Punk 360 content as a woman. Some articles appear on both blogs (this one, for example), and in the analysis described below I account for such duplication.

My speech therapist observed that I successfully employ feminine language in my recent video “radical forgiveness”. This led me to thinking: Has the language I use in my prose evolved as I blossomed into femininity? I detail my attempt to answer this question using mathematical analysis below.

Two Caveats

I make two major assumptions in this analysis, assumptions I will address in future work:

First, I assume my writing skill remained constant throughout the last ten years. Not a great assumption in the long haul but necessary to simplify the math for this “back of the envelope” analysis.

Second, the two blogs cover different subjects, and the first one even contains source code on occasion. This may distort the clustering process described below. Again, ignoring this concern proves acceptable for this “quick-and-dirty” calculation to enable exploration of the problem domain.


I downloaded each of my blog posts and then calculated the part of speech (POS) for each word in the post. After that I computed the frequency distribution of the POSs. I then performed hierarchical clustering using a similarity matrix defined by the dot product of each pair of posts’ POS use frequency distribution vectors. The resulting dendrogram looks like:

I recommend downloading the image to view it at full size.

Each vertical line represents a blog post, and the trees linking the vertical lines indicate the degree of similarity between any two blog posts. For example, in the above image, the cyan and magenta colored posts prove similar but the green and black posts diverge significantly in terms of their POS use frequency distributions. The asterisks indicate posts created after I started expressing publicly as a woman full-time. The colors divide the tree into sections that group similar blog posts. Please note that I chose the grouping threshold manually (but rationally).
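The similarity matrix behind the dendrogram can be illustrated with a small sketch. The POS tags below are hypothetical stand-ins for the output of a real tagger, and the post names are placeholders; the sketch stops at the pairwise dot products and leaves the hierarchical clustering step to whatever clustering routine is at hand.

```python
from collections import Counter

# Hypothetical POS tag sequences, one per post (a real tagger would produce these).
posts = {
    "post_a": ["NOUN", "VERB", "NOUN", "ADJ"],
    "post_b": ["NOUN", "VERB", "NOUN", "ADJ"],
    "post_c": ["PRON", "VERB", "ADV", "ADV"],
}

# Fixed tag order so every post's vector uses the same coordinates.
vocabulary = sorted({tag for tags in posts.values() for tag in tags})

def pos_frequencies(tags, vocabulary):
    """Relative frequency of each POS tag in one post."""
    counts = Counter(tags)
    return [counts[tag] / len(tags) for tag in vocabulary]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

vectors = {name: pos_frequencies(tags, vocabulary) for name, tags in posts.items()}

# Pairwise similarity: larger values mean more similar POS usage.
similarity = {(a, b): dot(vectors[a], vectors[b]) for a in posts for b in posts}
```

One design note: a plain dot product of frequency vectors rewards posts that concentrate on a few tags, so a cosine similarity (dividing by the vector lengths) might be worth comparing against in future work.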


By visually inspecting the density of these asterisks for the different color groups we derive an indication of how “feminine” or how “masculine” we might regard each group of blog posts. For example, we see sparse femininity in the green, yellow, and black groups, while we see enriched femininity in the cyan and purple groups. The algorithm clearly found little distinction between the posts within the large red group, but even there we visually recognize sections of diminished femininity and sections of enhanced femininity.

So a linguistic difference between my pre- and post-transition writing appears to exist. But is it real? Can we conclude that my prose grew more feminine after my public transition? Not so fast! We must build a model that includes time as a variable to cancel out possible influence of improvement in my writing skill, and then test that model for significance. I’ll save this work for a later date.

Grrl on Grrl Podcast interviewed me!

Today my interview with Grrl on Grrl Podcast came out! We discuss, among other things,

  • The science of gender identity
  • The music of Axis Evil
  • “Ladylike” behavior as a source of personal empowerment
  • Cultural appropriation
  • Psychosexuality
  • Model minorities

Big thanks to June Owatari of Grrl on Grrl Podcast for working so hard to put this together! The music presented during the interviews may be downloaded here.


Thermo Fisher: ten years at an uncommonly fabulous company

Many laid-off employees trash their former employer. But my decade at Thermo Fisher stands as one of the richest experiences of my life, despite significant challenges along the way. So I want to remind current Thermo Fisher employees and leadership what they can take pride in:

Exceptional Handling of my On-the-Job Gender Change

I joined the company at the Austin site as “Daniel Edmund Williams” and left from the Carlsbad site as “Emily Marie Williams”. No easy feat.

The (public) transition took place one year into my tenure at the Carlsbad site. My colleagues there embraced my chosen identity completely. Sure there were a few initial hiccups in name and pronoun use, but those faded quickly. No one fussed about the bathrooms or showers.

Yes, a few folks were uncomfortable at first. I took them to lunch. I turned the other cheek. They came around.

Thermo Fisher employees and leadership can therefore take pride in their openness.


Thermo Fisher’s HR department knows what they did for me, along with the challenges I faced. These stories are of course not for public consumption.

I thank them for all their tremendous support. I thank them for all the collaborative problem-solving and for delivering substantial grace.

Thermo Fisher employees and leadership can therefore take pride in their Human Resources Department.

Learned to “Manage Up”

Working at a large corporation for a decade usually means reporting to multiple bosses. Most managed exceptionally well; a few struggled. One was downright abusive. Immersed in this environment, I became skilled at collaborative problem-solving and team-centered idea promotion, skills I’m extremely thankful for.

I also learned how to stand up to the abusive boss—proudly setting an example for my less experienced colleagues.

Company employees and leadership can therefore (mostly) take pride in their management.

Learned to Manage (Down)

An intern reported to me one summer, allowing me to develop my talents at management. While no one specifically coached me on management skills during this period, the many good (and a few bad) management examples set around me directed my compass.

Acquired Technical Skills and Sharpened my Business Acumen

Immediately following my layoff last July I founded Whole-Systems Enterprises, Inc. Employing all the data science skills I learned at Thermo Fisher, we are developing and optimizing day-trading algorithms. We are also selling bioinformatics and data science consulting services. My experience at Thermo Fisher made this possible.

Thermo Fisher employees and leadership can therefore take pride in their technical development.

Why Am I Saying All This?

This blog, and the book I’m writing based on it, covers transgender issues. Employment is a major transgender issue, not just during the public act of transition but encompassing the whole life experience of work. I wanted to celebrate an organization that is getting it right.

The whole proves greater than the sum of its parts.

vocal frequency response

I now can speak consistently for an hour in a feminine voice—decent pitch, resonance, and inflection—before needing to rest. Moreover, my voice now passes on the phone.

So my voice therapist and I decided to tackle my singing range, to feminize that as well. (Followers of Axis Evil know I sing with a masculine voice despite functioning in all other parts of my life using a feminine one.)

I needed data to see where I stand currently:

Starting at D3 (146.832 Hz), which lies in the gender-neutral pitch range, I recorded myself singing the words “I am Emily” up the scale in half-step intervals toward D5 (587.330 Hz), though I couldn’t make it that far in practice. I used a synthesizer to provide the pitch at each interval.
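For reference, the pitch at each half step can be computed rather than looked up. The sketch below assumes twelve-tone equal temperament with A4 tuned to 440 Hz and uses MIDI note numbers (D3 = 50, middle C = 60, D5 = 74) purely as a convenient index; whether the synthesizer used exactly this tuning is an assumption.

```python
A4_HZ = 440.0   # assumed concert pitch
A4_MIDI = 69    # MIDI note number of A4

def midi_to_hz(midi_note):
    """Equal-temperament frequency: each half step multiplies by 2**(1/12)."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)

D3_MIDI, D5_MIDI = 50, 74

# Every half step from D3 up to and including D5 (two octaves, 25 notes).
scale_hz = [midi_to_hz(m) for m in range(D3_MIDI, D5_MIDI + 1)]

# Middle C, a useful landmark inside that range.
middle_c_hz = midi_to_hz(60)

for midi_note, hz in zip(range(D3_MIDI, D5_MIDI + 1), scale_hz):
    print(f"MIDI {midi_note}: {hz:.3f} Hz")
```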

I then cut the synthesizer track and ran the vocal track through a frequency analysis algorithm to get a frequency response (Bode) plot:

As you can see from the plot, I can hold it up to about middle C, but can’t currently sustain volume beyond that.

Good baseline information.

at the genetics conference

We (scientists) suspect a genetic component to gender dysphoria. While that may not be the full explanation, it might be a factor. See “the science of gender identity (part 1: genetics)” for my previous analysis of this subject.

CRISPR/Cas9 technology allows us to edit genomes early in life. The idea is that we can replace genetic mutations that correlate to future disease with nucleotide sequences that correlate to healthy outcomes.

Last October I presented a poster at the American Society of Human Genetics’ annual conference. Most of the workshops I attended were deeply technical, but I attended one about the ethics of using CRISPR/Cas9.

There were hundreds of people in the workshop’s audience, and many big names in the human genetics field. Researchers and M.D.-PhDs filled the room. We agreed that the CRISPR/Cas9 technology could be beneficial to disease prevention.

I stepped up to the microphone at that point (audience feedback was invited) and stated that

We must be careful not to confuse diversity and disease. I like being transgender and would not want that to have been eradicated. Similarly, we must not be hasty to “treat” other socially challenging conditions such as homosexuality and Asperger’s syndrome.

I made it clear that just because someone, somewhere finds my identity distasteful and culturally and/or religiously problematic, it should never be “cured” a priori as if it were a disease.

My statements elicited a full chorus of cheers from the audience, and many complimented me for my courage afterwards.

selecting DNA targets for a transgender gene panel

This is a work in progress…


There are several genes that some researchers believe are correlated with gender dysphoria and transsexualism. Please see my earlier work discussing some of these genes at “the science of gender identity (part 1: genetics)”.

In particular, there are single-nucleotide polymorphisms (SNPs) and variations in tandem repeat length in these genes that have been identified. I am therefore designing a targeted DNA sequencing panel for investigating whether a person has any of these variations.

Started with DisGeNET 3.0, 4.0 and 5.0

DisGeNET [1] provides scored relationships between diseases and genes (not that trans stuff is a disease!) based on, among other things, text mining of the literature. Fortunately they also report PubMed IDs for the papers providing evidence of a given gene’s connection to a disease, which allows me to begin my literature search. The following graph and table show the DisGeNET content I found related to the keywords “transsexualism” and “gender”. (I allowed partial string matches in the search.)

NCBI Gene Symbol | NCBI Gene ID | Condition | PubMed ID | Sentence
AR | 367 | Transsexualism | 18962445 | Androgen receptor repeat length polymorphism associated with male-to-female transsexualism.
COMT | 1312 | Gender disorders | 17419009 | Future studies on the COMT gene in mentally ill subjects should be stratified by clinical subtypes of the disorder, gender and ethnicity.
CYP17A1 | 1586 | Transsexualism | 17765230 | A polymorphism of the CYP17 gene related to sex steroid metabolism is associated with female-to-male but not male-to-female transsexualism.
CYP19A1 | 1588 | Transsexualism | 15854782 | However, binary logistic regression analysis revealed significant partial effects for all three polymorphisms, as well as for the interaction between the AR and aromatase gene polymorphisms, on the risk of developing transsexualism.
CYP19A1 | 1588 | Transsexualism | 18962445 | No associations for transsexualism were evident in repeat lengths for CYP19 or ERbeta genes.
CYP19A1 | 1588 | Transsexualism | 25124466 | Association study of ERβ, AR, and CYP19A1 genes and MtF transsexualism.
ESR2 | 2100 | Transsexualism | 24274329 | The (CA)n polymorphism of ERβ gene is associated with FtM transsexualism.
ESR2 | 2100 | Transsexualism | 25124466 | We investigated the association between genotype and transsexualism by performing a molecular analysis of three variable regions of genes ERβ, AR, and CYP19A1 in 915 individuals (442 MtFs and 473 control males).
HSD17B13 | 345275 | Gender Dysphoria | 23045263 | Patients with 5α-reductase 2 (5α-RD2) and 17β-hydroxysteroid dehydrogenase 3 (17β-HSD3) deficiencies exhibit the highest rates of gender dysphoria (incidence of up to 63%).
HSD17B3 | 3293 | Gender Dysphoria | 23045263 | Patients with 5α-reductase 2 (5α-RD2) and 17β-hydroxysteroid dehydrogenase 3 (17β-HSD3) deficiencies exhibit the highest rates of gender dysphoria (incidence of up to 63%).
HSD17B7 | 51478 | Gender Dysphoria | 23045263 | Patients with 5α-reductase 2 (5α-RD2) and 17β-hydroxysteroid dehydrogenase 3 (17β-HSD3) deficiencies exhibit the highest rates of gender dysphoria (incidence of up to 63%).
LITAF | 9516 | Transsexualism | 1483176 | As her twin sister had no sexual identity problems, it appears that transsexualism is not transmitted by a simple genetic mechanism.
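The partial string matching that produced the table above can be sketched as a case-insensitive substring test on the condition name. The rows below reuse a few gene and condition names from the table plus one hypothetical non-matching row; the function name and row layout are illustrative assumptions, not DisGeNET’s schema.

```python
KEYWORDS = ("transsexualism", "gender")

def matches_keywords(condition_name, keywords=KEYWORDS):
    """True if any keyword appears anywhere in the condition name, ignoring case."""
    name = condition_name.lower()
    return any(keyword in name for keyword in keywords)

# (gene symbol, condition) pairs; the last row is a hypothetical non-match.
rows = [
    ("AR", "Transsexualism"),
    ("COMT", "Gender disorders"),
    ("HSD17B3", "Gender Dysphoria"),
    ("EXAMPLE_GENE", "Epilepsy"),
]

hits = [(gene, condition) for gene, condition in rows if matches_keywords(condition)]
```

A loose match like this casts a wide net on purpose: anything with “gender” in the condition name gets through, which is why each hit still needs its supporting sentence read by hand.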


  1. Janet Piñero, Àlex Bravo, Núria Queralt-Rosinach, Alba Gutiérrez-Sacristán, Jordi Deu-Pons, Emilio Centeno, Javier García-García, Ferran Sanz, Laura I. Furlong; DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res 2017; 45 (D1): D833-D839. doi: 10.1093/nar/gkw943

the science of gender identity (part 4: summary)

To prepare for a book I intend to write on the science of gender identity, I drafted the following three blog posts to collect my thoughts. They are highly technical; I need to recast the content for the layperson. I also assembled some of my own biological data to analyze.

The first post covers the very little we know about the genetics involved. It specifically examines polymorphisms in particular genes and tests their correlation to transsexualism.

The second post I think is the most compelling. And it has the best pictures! It investigates brain anatomy in transsexuals and how it differs from that of cisgendered individuals.

Finally, the third post covers some of the most recent psychological information I could get my hands on. There were two problems here: since I’m not a psychologist I only understood the statistical arguments in the papers, and I have very little access to psychological literature. Nonetheless I did my best. The most notable discussion in this post is the description of a study examining the stability of gender identity in very young children.

My Own Data


While no correlation between testosterone level and the male-to-female transgender experience has ever been established, it is interesting that my natural testosterone level is extremely low. (This was measured before I started blocking my testosterone with spironolactone.) Here is where I sit on the curve for natal males my age:

The “normal boundaries” are those that my HMO says are healthy. To produce the curve I extracted the mean and standard deviation from [1]. I asked this source for the raw data so I could produce the actual data distribution, rather than the normal approximation, but they did not respond.
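Placing a single measurement on a normal approximation takes only a mean and a standard deviation. The numbers in this sketch are hypothetical placeholders in arbitrary units, not the values from [1] and not my lab result; only the method is the point.

```python
from statistics import NormalDist

# Hypothetical placeholder values (arbitrary units), NOT taken from [1].
population_mean = 500.0
population_sd = 150.0
measured_level = 250.0

# Normal approximation of the reference population.
reference = NormalDist(mu=population_mean, sigma=population_sd)

# Standard score: how many standard deviations the measurement sits from the mean.
z_score = (measured_level - population_mean) / population_sd

# Percentile: share of the approximated population at or below the measurement.
percentile = reference.cdf(measured_level) * 100

print(f"z = {z_score:.2f}, percentile = {percentile:.1f}")
```

Hormone levels are often right-skewed, so a normal approximation can misplace values in the tails; that is the reason the raw data would have been preferable.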

Brain Anatomy

I have a brain MRI recorded before I started taking hormones. This is important because hormones can alter brain anatomy. I’m attempting to use the 3DSlicer program [2] to measure the sizes of my various brain regions using image recognition. My intent is to compare the measurements to a body of (sort of) age-matched female and male brain MRIs I downloaded from [3].

Right now I’m struggling with the image recognition for my particular MRI, but I’ll figure it out and report the results on this blog.

An example of what this effort looks like in 3DSlicer is:

Related Posts

the science of gender identity (part 1: genetics)

the science of gender identity (part 2: brain anatomy)

the science of gender identity (part 3: psychology)


  1. http://www.ncbi.nlm.nih.gov/pubmed/21697255 (supplemental data)
  2. http://www.slicer.org
  3. http://www.loni.usc.edu

the science of gender identity (part 1: genetics)

This is the first in a multi-part series surveying the current science of gender identity, particularly with regard to the transgendered population. I intend to discuss the genetic, brain anatomic, and neuropsychological findings of recent studies on the matter. As always, I will incorporate my own statistical analysis of raw study data wherever possible.

Here I start by discussing four studies involving genetic variations thought to be correlated with transsexualism. Some of these studies show promising leads toward increasing our understanding; others report limited or no findings. Limited or no findings do not imply that no genetic factors relate to transsexualism, just that none were found for the particular gene variant examined by the study.

My only beef with these studies is that they consider only one or a few genetic variations at a time. This is a limitation of the technology used. As the cost of whole-genome sequencing decreases, we’ll be able to look for simultaneous genetic variations that play a role in concert with each other.

Code and data for the analyses presented below are attached.

A Bit About the Words I’m Using

Two words I use in this post bother me, so I thought I’d explain my choice to use them.

First, I’d prefer to use the umbrella term “transgender” to label the study participants described below. However, “transgender” is too broad, as the research I describe focused on those who particularly modify their bodies to become a member of a different sex, which not all transgendered individuals want to do. Therefore I use the medical term for this population: “transsexuals”.

Second, “nucleotide variation”, which I associate below through analysis with transsexualism, implies there is a “normal” non-variation. The word is used to indicate that the particular DNA sequence involved is not present in most individuals’ genomes. More common DNA variations are those that result in blue eyes vs. the more frequent brown, and certainly nothing is pathological about having blue eyes. In the same vein, I assert that nothing is pathological about transsexualism; its hypothesized genetic component is simply part of our genetic diversity.

Gene Promoter Variation rs549669867

A nucleotide variation (rs549669867) in the promoter for the gene CYP17A1 associates with female-to-male transsexualism according to a study outlined in [1]. CYP17A1 is a key gene involved in steroid metabolism, and this particular variation causes carriers to possess higher concentrations of both testosterone and estradiol in their bodies [1]. These findings are consistent with a prevailing theory that extra testosterone causes masculinization of the female brain during fetal development, thereby contributing to development of gender dysphoria.

Here I present independent statistical reasoning based on data obtained from the study paper, which supports the researchers’ conclusions. These conclusions do not fully explain the origins of female-to-male transsexualism, as there were non-transsexuals included in the study who had the nucleotide variation, and there were transsexuals in the study who did not. However, the difference in frequencies of the variation’s occurrence between the transsexual and non-transsexual study participant groups is statistically significant.

First I’ll discuss the nucleotide variation itself. The following screenshot from the UCSC Genome Browser [2] shows 50 nucleotides upstream and downstream from the start of gene CYP17A1 on chromosome 10 of the human genome:

The variation we are examining is shown in the lower left, 34 nucleotides before the start of CYP17A1 (this is inside the “promoter” region of the gene). For the genomic strand sequenced in the study (either of the two could have been chosen), the normal nucleotide at this position is a “T” and the variation is a “C”. From analysis of 1000 Genomes Project data, this variation is expected to occur on one of an individual’s two copies of chromosome 10 with a frequency of 0.02% [3].

Now the statistical analysis:

The study recruited 49 female-to-male transsexuals and 913 female controls, then sequenced their DNA in the promoter region of gene CYP17A1 to determine their genotype. The genotype could be one of three outcomes: “TT”, indicating lack of the nucleotide variation on both copies of chromosome 10; “CT”, indicating the variation occurs on only one of the chromosome 10 copies; and “CC”, indicating the variation is present on both copies of chromosome 10. The genotypes and their frequencies by group are listed in the following table:

We make two comparisons: The number of recessive genotypes vs. non-recessive genotypes (CC vs. CT + TT), and the number of dominant genotypes vs. non-dominant genotypes (TT vs. CT + CC). A variation often has to be recessive (present on both copies of its chromosome) to be biologically active, though this is not always the case.

Testing recessive vs. non-recessive genotype counts by study group using a Chi-square test yields a p-value of 0.04034, indicating a statistically significant difference exists between the transsexual and non-transsexual groups with regard to presence or absence of the recessive genotype.

Testing dominant vs. non-dominant genotype counts by study group using a Chi-square test yields a p-value of 0.06322, which is just over the commonly used threshold for declaring statistical significance.

From this data and analysis we can conclude that the recessive genotype is associated with female-to-male transsexualism. Again, this association does not explain all cases, e.g., why some non-transsexuals also have the recessive genotype, but it contributes to scientific efforts to understand transsexualism’s origins.
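The kind of test used above can be written out for a 2x2 table (study group by genotype class). The sketch below computes the Pearson Chi-square statistic without a continuity correction and gets the p-value from the one-degree-of-freedom identity p = erfc(sqrt(chi2 / 2)). The counts are hypothetical, not the study’s data, and I am not claiming this reproduces the p-values quoted above; statistical packages that apply a continuity correction by default will give somewhat different numbers.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Rows are study groups, columns are genotype classes.
    Returns (chi2 statistic, p-value) with one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the Chi-square survival function reduces to erfc.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, for illustration only.
chi2, p_value = chi_square_2x2(10, 20, 20, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

With small cell counts, as can happen when a rare genotype is split across groups, an exact test is often a safer choice than the Chi-square approximation.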

Gene Variation rs743572

Nucleotide variation rs743572 also impacts gene CYP17A1. Rather than residing in the promoter region of the gene as did rs549669867, this variation lies within the gene itself.

In my analysis of this variation’s study data discussed below [4], the association between the variation and transsexualism (comparing transsexuals vs. controls) is not significant. However, the difference in the frequency of the variation between female-to-male transsexuals and male-to-female transsexuals is significant according to the statistical test I conducted. (The study authors concluded the same thing, just with different p-values.) Therefore I’m reporting this variation as notable with regard to our efforts to understand the genetic underpinnings of transsexualism. The difference between this variation’s frequency in female-to-male transsexuals vs. male-to-female transsexuals may lead to insight into the origin of each outcome separately (per nominal biological sex), rather than help provide a “one size fits all” explanation for transsexualism.

rs743572 resides 139 nucleotide positions from the start of gene CYP17A1. It occurs on one of individuals’ two copies of chromosome 10 with a frequency of 41% [5]. The fact that this variation is much more common than rs549669867 probably explains why the transsexualism vs. control association for the variation I investigate below does not prove statistically significant. The following screenshot from the UCSC Genome Browser [2] shows the variation on gene CYP17A1 within chromosome 10 of the human genome:

The study [4] whose data I analyze here recruited 151 male-to-female and 142 female-to-male transsexuals. The researchers also recruited 167 male and 168 female non-transsexuals. All were Spaniards with no possibly confounding health issues. Of these subjects, 36% of the male-to-female and 45% of the female-to-male transsexuals carried the variation. 39% of the male and 38% of the female non-transsexuals also carried the variation. Presence or absence of the variation was determined through DNA sequencing. From this data I constructed the following contingency table, rounding to get whole numbers:

Performing pairwise comparisons of the count proportions using a Chi-squared goodness of fit test yields the following p-values:

As mentioned above, the only significant difference in variation proportions is in the comparison of female-to-male vs. male-to-female transsexuals. Therefore this variation does not by itself seem a strong contributor to our effort to explain the transgendered experience in terms of genetics. However, a whole-genome comparison study on similar test subjects could elucidate whether this variation interacts with other variations to form a combined association with transsexualism.
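The contingency counts described above follow directly from the group sizes and carrier percentages reported in the text. A short sketch of that reconstruction, using only those numbers (the variable names are mine):

```python
# Group sizes and carrier fractions as reported in the text above.
groups = {
    "male-to-female transsexuals": (151, 0.36),
    "female-to-male transsexuals": (142, 0.45),
    "male controls": (167, 0.39),
    "female controls": (168, 0.38),
}

# Round to whole people: (carriers, non-carriers) per group.
contingency = {}
for name, (n_subjects, carrier_fraction) in groups.items():
    carriers = round(n_subjects * carrier_fraction)
    contingency[name] = (carriers, n_subjects - carriers)

for name, (carriers, non_carriers) in contingency.items():
    print(f"{name}: {carriers} carriers, {non_carriers} non-carriers")
```

Because the published percentages are themselves rounded, each reconstructed count could be off by a person or two from the study’s true tallies.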

Androgen Receptor Repeat Length Variation rs193922933

A study [6] correlated the androgen receptor (AR) gene’s CAG repeat length variation (rs193922933) with male-to-female transsexualism. I feel the researchers did not perform their statistical analysis correctly, and have remedied the situation below. However, my conclusion was the same.

The AR gene’s CAG repeat length is highly variable between individuals. Each occurrence of the repeat appends an extra amino acid to the androgen receptor protein, as shown below. No information about the frequency distribution of this variation was readily available [7].

Longer CAG repeat lengths are known to diminish testosterone signaling, which impacts masculinization of the brain during development [6].

The study authors sequenced the CAG repeat region of 112 male-to-female transsexuals and 258 male controls. They report the length data in the following plot (but not their raw data) [6]:

Using the GNU Image Manipulation Program, I measured each bar to determine the percentages and reconstructed the source data, re-plotted as follows:

Here we see that the CAG repeat length medians between the transsexual subjects and the controls differ by one (with the transsexual group’s median being longer), and that the interquartile limits are identical. The control group has a heavier lower tail.
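Turning measured bar heights back into per-subject data amounts to converting each bar’s percentage into a head count and repeating the repeat length that many times. The percentages and group size below are hypothetical placeholders, not my measurements from the published plot.

```python
from statistics import median

def reconstruct_lengths(bar_percent, n_subjects):
    """Expand {repeat_length: percent_of_group} into one value per subject."""
    lengths = []
    for repeat_length, percent in sorted(bar_percent.items()):
        count = round(percent / 100 * n_subjects)
        lengths.extend([repeat_length] * count)
    return lengths

# Hypothetical bar heights (percent of group at each CAG repeat length).
hypothetical_bars = {18: 10.0, 19: 20.0, 20: 40.0, 21: 20.0, 22: 10.0}

sample = reconstruct_lengths(hypothetical_bars, n_subjects=50)
print(len(sample), median(sample))
```

Rounding each bar separately means the reconstructed counts may not sum exactly to the group size, so it is worth checking the total against the number of subjects the study reports.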

The researchers compared the means using a t-test, which I am uncomfortable with due to the skew in the male controls’ distribution. Therefore I performed a quasi-Poisson regression since this is underdispersed count data. That analysis reported a statistically significant difference between the two groups (p = 0.0269).

I could not find data on the practical significance of a median difference of one CAG repeat length.

Negative Results

Another study [8] found no association between CAG repeat length variation in the AR gene and transsexualism. Furthermore, it found no association between transsexualism and variations in four other sex hormone-related genes: estrogen receptors alpha and beta, aromatase CYP19, and progesterone receptor PGR.

More Research Needed

A search of DisGeNET (a database of disease*-gene annotations) [9] for the term “transsexualism” shows only five genes and five PubMed publications covering the subject. This reveals the dearth of research on the matter. The image below showing the genes and PubMed articles extracted from the search comes from my own implementation of DisGeNET’s data within a graph database, which I discuss here.

*I of course object to DisGeNET’s labeling of “transsexualism” as a disease, and to its connection with the MeSH term “mental disorders”. I’ve contacted DisGeNET and MeSH about this issue and will report back on their response shortly.

Related Posts

the science of gender identity (part 2: brain anatomy)

the science of gender identity (part 3: psychology)

Code and Data



  1. http://www.ncbi.nlm.nih.gov/pubmed/17765230
  2. https://genome.ucsc.edu/
  3. http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?type=rs&rs=rs549669867
  4. http://www.ncbi.nlm.nih.gov/pubmed/25929975
  5. http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?searchType=adhoc_search&type=rs&rs=rs743572
  6. http://www.ncbi.nlm.nih.gov/pubmed/18962445
  7. http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?type=rs&rs=rs193922933
  8. http://www.ncbi.nlm.nih.gov/pubmed/19604497
  9. http://www.disgenet.org/web/DisGeNET/menu