Inflammatory Bowel Disease Whitepaper

Orchid's team of genetic experts has developed a genetic risk score (GRS) for inflammatory bowel disease.
Written by Orchid Team

Orchid has developed advanced genetic risk scores (GRS) for a variety of diseases. Here we present data on our GRS for inflammatory bowel disease.

Inflammatory Bowel Disease

Inflammatory bowel disease (IBD) is a term for two conditions, Crohn’s disease and ulcerative colitis, that are characterized by chronic inflammation of the gastrointestinal tract. It can cause chronic abdominal pain, diarrhea, ulcers, fatigue, weight loss, and anemia. Crohn’s disease can also cause perianal lesions. Surprisingly, smoking is associated with an increased risk of Crohn’s disease but a reduced risk of ulcerative colitis. Diet may also play a role in the development and management of both diseases.[1]

Genetic Risk Score

IBD is shaped by both environmental and genetic factors. Monogenic testing is not available because no single gene causes the condition. Genetic risk scores (GRS), which combine the small effects of many variants into a single score, are currently the only way to estimate genetic risk. Although not diagnostic, a GRS can indicate how likely an individual is to develop the disease.
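As a minimal sketch of what "combining the small effects of many variants" means, a GRS can be written as a weighted sum of allele dosages. The variant IDs and weights below are hypothetical placeholders, not values from Orchid's model, and treating a missing genotype as dosage 0 is an assumption made for brevity.

```python
from typing import Mapping


def genetic_risk_score(dosages: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele).

    Variants absent from `dosages` contribute 0 (an illustrative assumption).
    """
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


# Hypothetical variant IDs and per-allele weights, for illustration only.
weights = {"var_a": 0.25, "var_b": -0.25, "var_c": 0.125}
dosages = {"var_a": 2, "var_b": 1, "var_c": 0}

score = genetic_risk_score(dosages, weights)
print(score)  # 0.25
```

A production score sums over more than a million variants in the same way; only the size of the weight table changes.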

Orchid’s IBD GRS was trained following current industry standards.[2][3] The GRS was constructed using the SBayesRC algorithm trained on publicly available FinnGen and Million Veteran Program summary statistics.[4][5] The summary statistics include 20,764 cases and 1,104,431 controls.[6] The resulting GRS contains over a million variants.

Risk predictions are adjusted to each individual’s ancestry, with predictive power decaying as genetic distance from the predominantly European training data increases.[7] Orchid considers a GRS meaningfully predictive if individuals at roughly the 97.7th percentile (two standard deviations above the mean) have an odds ratio (OR) greater than 2. The IBD GRS meets this criterion for all common ancestry groups.
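The link between the 97.7th percentile and two standard deviations (SD) can be checked directly. The sketch below assumes the GRS is standard normal in the population and that log-odds are linear in the score, so the OR compounds multiplicatively per SD; both are modeling assumptions rather than statements about Orchid's pipeline. The OR per SD of 2.00 is the value reported in Table 4.

```python
from statistics import NormalDist


def percentile_at_sd(z: float) -> float:
    """Percentile of a standard normal score that sits z SDs above the mean."""
    return 100.0 * NormalDist().cdf(z)


def odds_ratio_at_sd(or_per_sd: float, z: float) -> float:
    """OR relative to the median when log-odds are linear in the score."""
    return or_per_sd ** z


print(round(percentile_at_sd(2.0), 1))  # 97.7
print(odds_ratio_at_sd(2.0, 2.0))       # 4.0
```

Under these assumptions an OR per SD of 2.00 gives an OR of 4.00 at two SD, which clears the inclusion threshold.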

Clinical Impact and Prevalence

IBD affects approximately 1.3% of U.S. adults per a 2015 survey.[8] Current treatments emphasize mucosal healing through the reduction of gut inflammation, with biologics associated with increased remission and mucosal healing rates in moderate to severe cases.[1]

Performant Risk Stratification

We evaluated the predictive performance of Orchid’s IBD GRS using the UK Biobank (UKB), a research database of roughly 500,000 genotyped individuals from the United Kingdom. We restricted the analysis to participants of British ancestry and defined IBD using the K50.x (Crohn’s disease) and K51.x (ulcerative colitis) ICD-10 codes, yielding 5,987 cases and 402,533 controls (1.5% prevalence). We then grouped individuals by GRS percentile and compared the observed disease prevalence within each group to our model’s predictions (Figure 1). For additional technical details, see the Supplementary Data.
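The grouping step can be sketched as follows: sort individuals by score, split them into equal-sized bins, and compute the fraction of cases in each bin. The cohort here is synthetic, and the intercept of -4.6 and OR per SD of 2 are hypothetical parameters used only to generate plausible data; none of it is UKB data.

```python
import math
import random
from typing import List, Sequence


def prevalence_by_bin(scores: Sequence[float],
                      is_case: Sequence[bool],
                      n_bins: int = 10) -> List[float]:
    """Observed case fraction in each of n_bins equal-sized score groups.

    Assumes there are at least n_bins individuals, so no bin is empty.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        members = order[lo:hi]
        result.append(sum(is_case[i] for i in members) / len(members))
    return result


# Synthetic cohort for illustration only.
random.seed(0)
n = 20000
scores = [random.gauss(0.0, 1.0) for _ in range(n)]
# Case probability from a logistic model with hypothetical parameters.
is_case = [random.random() < 1.0 / (1.0 + math.exp(4.6 - math.log(2.0) * z))
           for z in scores]

observed = prevalence_by_bin(scores, is_case)
for decile, prevalence in enumerate(observed, start=1):
    print(f"decile {decile}: {100 * prevalence:.2f}%")
```

Comparing these observed fractions against the model's predicted risk in each bin is what a plot like Figure 1 shows.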

Figure 1: Risk Stratification. Observed vs predicted risk in the UKB grouped by GRS percentile.

Xu et al. estimate a 1.3% prevalence of IBD,[8] similar to the 1.5% prevalence computed in the UKB. We adjust our model so that its average predicted risk matches the Xu et al. estimate (see Figure 2).[9] People at the upper tail of the GRS distribution were at elevated risk compared to the mean (see Table 3), with adults at the 99th percentile 3.8x as likely to develop IBD as average (5.0% vs 1.3%).

Figure 2: Adjusted Risk Stratification. Predicted risk estimates adjusted so that overall prevalence matches Xu et al.’s 1.3% estimate.

GRS group                    Lifetime Risk    Relative Risk
Average (mean)               1.3%             1.0x
Top 5% of distribution       3.2%             2.5x
Top 3% of distribution       3.7%             2.8x
Top 1% of distribution       5.0%             3.8x
Top 0.5% of distribution     5.9%             4.5x

Table 3: Prevalence and relative risk at elevated genetic risk. Individuals at the tail end of the GRS distribution were at an elevated risk of developing IBD.
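One way to picture the prevalence adjustment is a logistic model whose intercept is tuned until the population-average risk equals the target prevalence. The sketch below is illustrative, not Orchid's confirmed procedure: it assumes a standard normal GRS, log-odds linear in the score with the OR per SD of 2.00 from Table 4, and risk evaluated at each percentile cutoff. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def mean_risk(intercept: float, beta: float,
              n_grid: int = 2001, z_max: float = 8.0) -> float:
    """Average of sigmoid(intercept + beta * z) over z ~ N(0, 1), on a grid."""
    step = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        z = -z_max + k * step
        total += sigmoid(intercept + beta * z) * STD_NORMAL.pdf(z) * step
    return total


def calibrate_intercept(target_prevalence: float, beta: float) -> float:
    """Bisection on the intercept; mean risk increases with the intercept."""
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mean_risk(mid, beta) < target_prevalence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


target = 0.013         # population prevalence from Xu et al.
beta = math.log(2.0)   # log of the OR per SD reported in Table 4
intercept = calibrate_intercept(target, beta)
baseline = sigmoid(intercept)  # risk at the median score (z = 0)
print(f"baseline risk at median: {100 * baseline:.2f}%")

for top_fraction in (0.05, 0.03, 0.01, 0.005):
    z = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    risk = sigmoid(intercept + beta * z)
    print(f"top {100 * top_fraction:g}%: risk {100 * risk:.1f}%, "
          f"relative risk {risk / target:.1f}x")
```

Under these assumptions the computed risks land close to the values in Table 3, for example roughly 5% at the 99th percentile, and the relative risk column is simply each risk divided by the 1.3% average.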

References

1. Roda G, Ng SC, Kotze PG, et al. Crohn’s disease. Nat Rev Dis Primers. 2020;6(1):22. doi:10.1038/s41572-020-0156-2

2. Moore S, Davidson I, Anomaly J, et al. Development and validation of polygenic scores for within-family prediction of disease risks. medRxiv. 2025. doi:10.1101/2025.08.06.25333145

3. Cordogan S, Starr DB, Treff NR, et al. Within- and between-family validation of nine polygenic risk scores developed in 1.5 million individuals: implications for IVF, embryo selection, and reduction in lifetime disease risk. medRxiv. 2025. doi:10.1101/2025.10.24.25338613

4. Zheng Z, Liu S, Sidorenko J, et al. Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries. Nat Genet. 2024;56:767–777. doi:10.1038/s41588-024-01704-y

5. FinnGen. FinnGen+MVP+UKBB Summary Statistics. Available at: https://mvp-ukbb.finngen.fi/about. Accessed 2025-12-05.

6. FinnGen. FinnGen+MVP+UKBB Phenotypes. Available at: https://mvp-ukbb.finngen.fi. Accessed 2025-12-15.

7. Privé F, et al. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. Am J Hum Genet. 2022;109(1):12–23. doi:10.1016/j.ajhg.2021.11.008

8. Xu F, Dahlhamer JM, Zammitti EP, et al. Health-Risk Behaviors and Chronic Conditions Among Adults with Inflammatory Bowel Disease — United States, 2015 and 2016. MMWR Morb Mortal Wkly Rep. 2018;67(6):190–195. doi:10.15585/mmwr.mm6706a4

9. Chatterjee N, Shi J, García-Closas M, et al. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat Rev Genet. 2016;17:392–406. doi:10.1038/nrg.2016.27

Supplementary Figures

Baseline Risk    OR per SD    OR per 2 SD
1.03%            2.00         4.00

Table 4: OR per SD. The baseline risk for an individual with a median GRS, and the predicted OR at one and two SDs, respectively. A GRS must have a predicted OR >2 at 2 SD to be included in Orchid clinical reports.

UKB Prevalence    Population Prevalence    Liability R²
1.5%              1.3%[8]                  6.38%

Table 5: Liability R². The estimated liability R² using a population prevalence of 1.3%.
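For readers unfamiliar with the liability scale, the sketch below shows one widely used linear transformation from an observed-scale R² (from a 0/1 case-control outcome) to the liability scale. It is an assumption that a transformation of this form applies here; the whitepaper does not state which estimator was used, and the observed-scale R² in the usage example is a hypothetical input, not a reported result.

```python
from statistics import NormalDist


def liability_r2(r2_observed: float,
                 population_prevalence: float,
                 sample_case_fraction: float) -> float:
    """Convert observed-scale R^2 to the liability scale.

    K is the population prevalence, P the case fraction in the evaluation
    sample, and z the standard normal density at the liability threshold
    that leaves a fraction K of the population above it.
    """
    K = population_prevalence
    P = sample_case_fraction
    threshold = NormalDist().inv_cdf(1.0 - K)
    z = NormalDist().pdf(threshold)
    return r2_observed * (K * (1.0 - K) / z ** 2) * (K * (1.0 - K) / (P * (1.0 - P)))


# Prevalences from Table 5; the observed-scale R^2 is a made-up illustration.
example = liability_r2(r2_observed=0.005,
                       population_prevalence=0.013,
                       sample_case_fraction=0.015)
print(f"liability-scale R^2: {100 * example:.2f}%")
```

The second factor corrects for the difference between the sample case fraction (1.5% in the UKB) and the assumed population prevalence (1.3%).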
Figure 6: GRS histograms. GRS distributions for cases and controls. Both are approximately normal, with the case distribution shifted noticeably higher.

Figure 7: The receiver operating characteristic (ROC) used to compute the ROC area under the curve (AUC). The ROC curve is a graphical representation of a binary classifier’s performance, plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) across different decision thresholds. A curve closer to the top-left indicates a better model, while a diagonal line (AUC = 0.5) represents random guessing.

Figure 8: Calibration Curve. Calibration plot showing observed disease prevalence versus predicted risk across GRS deciles.
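The AUC described in the Figure 7 caption has an equivalent pairwise reading: it is the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as half. The sketch below computes it that way on made-up scores. The quadratic loop is for clarity only; a rank-based computation would be preferable for a cohort the size of the UKB.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float],
            control_scores: Sequence[float]) -> float:
    """Fraction of case-control pairs in which the case scores higher.

    Ties count as half a win, so identical score distributions give 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up scores for illustration only.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.4, 0.3, 0.8]
auc = roc_auc(cases, controls)
print(round(auc, 3))  # 0.792
```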

Acknowledgements

This research has been conducted using the UK Biobank Resource under Application Number 80545. 
