Endometriosis Whitepaper

Orchid's team of genetic experts has developed a genetic risk score (GRS) for endometriosis.
Written by Orchid Team

Orchid has developed advanced genetic risk scores (GRS) for a variety of diseases. Here we present data on our GRS for endometriosis.

Endometriosis

Endometriosis is a chronic inflammatory disease characterized by the presence of uterine-like tissue outside the uterus, primarily within the pelvic region. It commonly causes severe pelvic pain, painful intercourse, and infertility. Symptoms can vary widely and are often accompanied by chronic fatigue. The disease usually occurs between menarche and menopause but can occasionally appear outside this range. Risk factors include early menarche, shorter menstrual cycles, low BMI, and family history.[1]

Genetic Risk Score

Endometriosis is shaped by both environmental and genetic factors. Monogenic testing is not available because no single gene causes the condition. Genetic risk scores (GRS), which combine the small effects of many variants into a single score, are currently the only way to estimate genetic risk. Although not diagnostic, a GRS can indicate how likely an individual is to develop the disease.
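To illustrate what "combining the small effects of many variants" means, the following minimal sketch computes a score as a weighted sum of effect-allele dosages. The variant IDs, weights, and dosages are hypothetical placeholders, and treating a missing variant as dosage 0 is a simplifying assumption; this is not Orchid's production pipeline.

```python
from typing import Mapping


def genetic_risk_score(dosages: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant).

    Variants absent from `dosages` are treated as dosage 0, which is a
    simplifying assumption made only for this sketch.
    """
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


def standardize(raw_score: float, reference_mean: float,
                reference_sd: float) -> float:
    """Express a raw score in standard deviations of a reference population."""
    return (raw_score - reference_mean) / reference_sd


# Hypothetical per-variant weights (log-odds scale) and one person's dosages.
weights = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.03}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

raw_score = genetic_risk_score(dosages, weights)  # 2*0.12 + 1*(-0.05) + 0*0.03
```

A real score sums over far more variants (over a million here), but the arithmetic per variant is the same.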

Orchid’s endometriosis GRS was trained following current industry standards.[2][3] The GRS was constructed using the SBayesRC algorithm trained on publicly available FinnGen summary statistics.[4][5] The summary statistics include 20,190 cases and 130,160 controls.[6] The resulting GRS contains over a million variants. 

Risk predictions are adjusted to each individual’s ancestry, with predictive power decaying as genetic distance from the predominantly European training data increases.[7] Orchid considers a GRS meaningfully predictive if individuals at roughly the 97.7th percentile (two standard deviations above the mean) have an odds ratio (OR) greater than 2. The endometriosis GRS meets this criterion for the European and Central South Asian ancestry groups and is available to individuals in these groups. Availability for an individual may vary due to admixture.
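The link between the 97.7th percentile and two standard deviations (SD) can be checked with a short sketch. It assumes the GRS is standard normal and that log-odds are linear in the score, so the OR at z SDs is the per-SD OR raised to the power z. The per-SD OR of 1.56 is the value reported in Table 4; the function names are illustrative.

```python
from statistics import NormalDist


def percentile_at_z(z: float) -> float:
    """Percentile of a standard normal score that lies z SDs above the mean."""
    return 100.0 * NormalDist().cdf(z)


def odds_ratio_at_z(or_per_sd: float, z: float) -> float:
    """OR relative to a median score, assuming log-odds are linear in the GRS."""
    return or_per_sd ** z


def meets_inclusion_threshold(or_per_sd: float, z: float = 2.0,
                              min_or: float = 2.0) -> bool:
    """True when the predicted OR at z SDs exceeds the minimum OR."""
    return odds_ratio_at_z(or_per_sd, z) > min_or


pct = percentile_at_z(2.0)                # about 97.7
or_2sd = odds_ratio_at_z(1.56, 2.0)       # 1.56 squared, about 2.43
included = meets_inclusion_threshold(1.56)
```

Under these assumptions a per-SD OR of 1.56 clears the threshold, consistent with the OR per 2 SD of 2.43 in Table 4.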

Clinical Impact and Prevalence

Endometriosis affects up to 10% of reproductive-age women worldwide, roughly 176 million individuals.[1][8] Most cases go undiagnosed because definitive diagnosis requires surgical confirmation, complicating efforts to obtain precise prevalence estimates. However, the condition is notably common among women experiencing infertility, affecting up to 50% in some populations studied.[1]

Treatment of severe cases includes surgical removal of lesions and hormonal therapies to manage symptoms. Unfortunately, current therapies often have adverse effects, are contraceptive, or fail to provide lasting relief.[1]

Performant Risk Stratification

We evaluated the predictive performance of Orchid’s endometriosis GRS using the UK Biobank (UKB), a research database of roughly 500,000 genotyped individuals from the United Kingdom. We restricted the analysis to women of British ancestry and defined endometriosis as any diagnosis under ICD-10 codes N80.1 through N80.9, yielding 2,577 cases and 218,278 controls (1.2% prevalence). We then grouped individuals by GRS percentile and compared the observed disease prevalence within each group to our model’s predictions (Figure 1). For additional technical details, see the Supplementary Data.
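The grouping step can be sketched as follows: sort individuals by score, split them into equal-sized bins, and compare the observed case fraction in each bin with the mean predicted risk. The function name, the equal-count binning, and the toy inputs are assumptions for illustration; the source does not specify the bin boundaries used for Figure 1.

```python
from typing import List, Sequence, Tuple


def stratify_by_score(scores: Sequence[float],
                      is_case: Sequence[int],
                      predicted_risk: Sequence[float],
                      n_bins: int = 10) -> List[Tuple[float, float]]:
    """Return (observed prevalence, mean predicted risk) per score bin.

    Bins are equal-count groups ordered from lowest to highest score.
    """
    if not (len(scores) == len(is_case) == len(predicted_risk)):
        raise ValueError("all inputs must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not members:
            continue  # more bins than individuals; skip empty groups
        observed = sum(is_case[i] for i in members) / len(members)
        predicted = sum(predicted_risk[i] for i in members) / len(members)
        rows.append((observed, predicted))
    return rows


# Toy data: four individuals split into a lower and an upper half.
rows = stratify_by_score(scores=[0.3, 0.1, 0.4, 0.2],
                         is_case=[1, 0, 1, 0],
                         predicted_risk=[0.6, 0.2, 0.8, 0.4],
                         n_bins=2)
```

Plotting the two values of each row against each other gives a predicted-versus-observed comparison of the kind shown in Figure 1.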

Figure 1: Risk Stratification. Predicted risk compared to observed risk in the UKB grouped by GRS percentile.

UKB participants tend to be healthier than the general population, which leads to lower observed disease prevalence.[9] Eskenazi and Warner report a prevalence of up to 10%, much higher than the 1.2% observed in the UKB.[8] We adjust our model so that its average predicted risk aligns with this estimate (see Figure 2).[10] Individuals in the upper tail of the GRS distribution were at elevated risk compared to the mean (see Table 3), with those in the top 1% being 2.2x as likely to develop endometriosis as average (22.4% vs 10.0%).
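One way to align average predicted risk with a target prevalence is to keep the per-SD effect fixed and shift only the intercept of a logistic model. The sketch below does this by bisection. It is an assumption for illustration: the source cites reference 10 but does not spell out the exact procedure. It also assumes a standard normal GRS and uses the per-SD OR of 1.56 from Table 4.

```python
import math
from statistics import NormalDist
from typing import Sequence


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mean_risk(intercept: float, log_or_per_sd: float,
              z_scores: Sequence[float]) -> float:
    """Average predicted risk over a set of standardized scores."""
    return sum(logistic(intercept + log_or_per_sd * z)
               for z in z_scores) / len(z_scores)


def calibrate_intercept(target_prevalence: float, log_or_per_sd: float,
                        z_scores: Sequence[float]) -> float:
    """Bisection on the intercept; mean risk increases monotonically with it."""
    lo, hi = -20.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if mean_risk(mid, log_or_per_sd, z_scores) < target_prevalence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Deterministic grid of standard normal quantiles standing in for a population.
n = 2001
z_grid = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

log_or = math.log(1.56)                        # per-SD OR from Table 4
intercept = calibrate_intercept(0.10, log_or, z_grid)
median_risk = logistic(intercept)              # risk at a median score (z = 0)
```

Because risk rises faster than linearly with the score at low prevalence, the risk at a median score comes out somewhat below the 10% mean, the same direction as the 9.32% baseline reported in Table 4.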

Figure 2: Adjusted Risk Stratification. Predicted risk estimates adjusted so that overall prevalence matches Eskenazi and Warner’s 10% estimate.
GRS group                   Lifetime Risk   Relative Risk
Average (mean)              10.0%           1.0x
Top 5% of distribution      17.6%           1.8x
Top 3% of distribution      19.2%           1.9x
Top 1% of distribution      22.4%           2.2x
Top 0.5% of distribution    24.4%           2.4x
Table 3: Prevalence and relative risk at elevated genetic risk. Individuals at the tail end of the GRS distribution were at an elevated risk of developing endometriosis.

References

1. Zondervan, K.T., Becker, C.M., Koga, K. et al. Endometriosis. Nat Rev Dis Primers 4, 9 (2018). https://doi.org/10.1038/s41572-018-0008-5

2. Moore S, Davidson I, Anomaly J, et al. Development and validation of polygenic scores for within-family prediction of disease risks. medRxiv. 2025. doi:10.1101/2025.08.06.25333145.

3. Cordogan S, Starr DB, Treff NR, et al. Within- and between-family validation of nine polygenic risk scores developed in 1.5 million individuals: implications for IVF, embryo selection, and reduction in lifetime disease risk. medRxiv. 2025. doi:10.1101/2025.10.24.25338613.

4. Zheng, Z., Liu, S., Sidorenko, J. et al. Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries. Nat Genet 56, 767–777 (2024). https://doi.org/10.1038/s41588-024-01704-y

5. FinnGen. FinnGen Summary Statistics. Available at: https://console.cloud.google.com/storage/browser/finngen-public-data-r12. Accessed 2025-12-05.

6. FinnGen. FinnGen Phenotypes. Available at: https://r12.finngen.fi/. Accessed 2025-12-15.

7. Privé F, et al. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. Am J Hum Genet. 2022;109(1):12–23. doi:10.1016/j.ajhg.2021.11.008

8. Eskenazi B, Warner ML. Epidemiology of endometriosis. Obstet Gynecol Clin North Am. 1997;24(2):235–258. doi:10.1016/S0889-8545(05)70302-8

9. Fry A, Littlejohns TJ, Sudlow C, et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol. 2017;186:1026–1034. doi:10.1093/aje/kwx246.

10. Chatterjee N, Shi J, García-Closas M et al. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat Rev Genet. 2016;17:392–406. doi:10.1038/nrg.2016.27

Supplementary Data

Baseline Risk   OR per SD   OR per 2 SD
9.32%           1.56        2.43
Table 4: OR per SD. The baseline risk for an individual with a median GRS, and the predicted OR at one and two SDs, respectively. A GRS must have a predicted OR >2 at 2 SD to be included in Orchid clinical reports.
UKB Prevalence   Population Prevalence   Liability R²
1.17%            10%[8]                  4.41%
Table 5: Liability R². The estimated liability R² using a population prevalence of 10%.
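For readers unfamiliar with the liability scale, the sketch below shows one widely used conversion from an observed-scale R² to a liability-scale R², the transformation of Lee et al. (2012). The source does not state which method produced the 4.41% figure, so this is an illustrative assumption, and the observed R² passed in is a hypothetical input. Here K is the population prevalence and P is the case fraction in the evaluation sample.

```python
from statistics import NormalDist


def liability_r2(r2_observed: float, population_prevalence: float,
                 sample_case_fraction: float) -> float:
    """Convert an observed-scale R2 (0/1 outcome) to the liability scale."""
    K = population_prevalence
    P = sample_case_fraction
    normal = NormalDist()
    t = normal.inv_cdf(1.0 - K)      # liability threshold for being a case
    z = normal.pdf(t)                # normal density at the threshold
    m = z / K                        # mean liability of cases
    # C rescales for prevalence and for the case fraction in the sample.
    C = (K * (1.0 - K)) ** 2 / (z ** 2 * P * (1.0 - P))
    # theta corrects for the sample case fraction differing from K.
    shift = m * (P - K) / (1.0 - K)
    theta = shift * (shift - t)
    return r2_observed * C / (1.0 + r2_observed * theta * C)


# Hypothetical observed R2 with the prevalences from Table 5.
example = liability_r2(r2_observed=0.002,
                       population_prevalence=0.10,
                       sample_case_fraction=0.0117)
```

When the sample case fraction equals the population prevalence, the correction term vanishes and the conversion reduces to multiplying by K(1 − K)/z².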
Figure 6: GRS histograms. GRS distributions for cases and controls. Both are approximately normal, with the case distribution shifted noticeably higher.
Figure 7: The receiver operating characteristic (ROC) used to compute the ROC area under the curve (AUC). The ROC curve is a graphical representation of a binary classifier’s performance, plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) across different decision thresholds. A curve closer to the top-left indicates a better model, while a diagonal line (AUC = 0.5) represents random guessing.
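The AUC also has a direct interpretation: it is the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as half. The sketch below computes it that way on toy scores. The pairwise loop is chosen for clarity rather than speed, and the function name and inputs are illustrative.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float],
            control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly.

    A pair counts 1 when the case scores higher, 0.5 on a tie, 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy scores: three of the four case/control pairs are ranked correctly.
auc = roc_auc(case_scores=[0.5, 2.0], control_scores=[1.0, 0.0])
```

A value of 0.5 corresponds to the diagonal line described in the caption, and 1.0 to perfect separation of cases from controls.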
Figure 8: Calibration Curve. Calibration plot showing observed disease prevalence versus predicted risk across GRS deciles.

Acknowledgements

This research has been conducted using the UK Biobank Resource under Application Number 80545. 
