Systematic functional regulatory assessment of disease-associated variants.

Posted on November 14, 2013

Genome-wide association studies have discovered many genetic loci associated with disease traits, but the functional molecular basis of these associations is often unresolved. Genome-wide regulatory and gene expression profiles measured across individuals and diseases reflect downstream effects of genetic variation and may allow for functional assessment of disease-associated loci. Here, we present a unique approach for systematic integration of genetic disease associations, transcription factor binding among individuals, and gene expression data to assess the functional consequences of variants associated with hundreds of human diseases. In an analysis of genome-wide binding profiles of NFκB, we find that disease-associated SNPs are enriched in NFκB binding regions overall, and specifically for inflammatory-mediated diseases, such as asthma, rheumatoid arthritis, and coronary artery disease. Using genome-wide variation in transcription factor-binding data, we find that NFκB binding is often correlated with disease-associated variants in a genotype-specific and allele-specific manner. Furthermore, we show that this binding variation is often related to expression of nearby genes, which are also found to have altered expression in independent profiling of the variant-associated disease condition. Thus, using this integrative approach, we provide a unique means to assign putative function to many disease-associated SNPs.

http://www.ncbi.nlm.nih.gov/pubmed/

NFκB regions are enriched for disease-associated variants. (A) Fold-enrichments for disease-associated variants in NFκB regions: all associations and genome-wide significant variants include variants with nominal P < 0.01 and P < 10⁻⁷, respectively, in the original study. Error bars reflect the 95% confidence interval of Fisher’s exact test. (B) This enrichment is more pronounced when considering variants in LD with disease-associated variants. (C) Initial (lead SNP) enrichment analysis was repeated on a per-disease basis. Enrichment is shown for all associations relative to all diseases (red) and genome-wide significant associated variants (blue). (D) NFκB binding sites with disease-associated variants are stronger binding peaks than the average NFκB binding site. Error bars are based on the t-test 95% confidence interval.
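
To make the enrichment statistic in panel (A) concrete, here is a minimal sketch of how a fold-enrichment and a one-sided Fisher's exact P-value could be computed from a 2×2 table of SNP counts. It is not the authors' pipeline: the function names, the choice of a one-sided test, and the counts in the usage lines are all hypothetical, and the 95% confidence interval shown as error bars in the figure is not computed here.

```python
from math import comb


def fold_enrichment(in_hits, in_total, out_hits, out_total):
    """Fraction of disease-associated SNPs inside binding regions
    divided by the same fraction outside binding regions."""
    return (in_hits / in_total) / (out_hits / out_total)


def fisher_exact_greater(in_hits, in_total, out_hits, out_total):
    """One-sided Fisher's exact P-value for enrichment.

    Computed as the upper tail of the hypergeometric distribution:
    the probability of seeing at least `in_hits` disease-associated
    SNPs among the `in_total` SNPs that fall in binding regions,
    given the table margins.
    """
    total = in_total + out_total      # all SNPs tested
    hits = in_hits + out_hits         # all disease-associated SNPs
    upper = min(in_total, hits)       # largest possible overlap
    tail = sum(
        comb(hits, k) * comb(total - hits, in_total - k)
        for k in range(in_hits, upper + 1)
    )
    return tail / comb(total, in_total)


# Hypothetical counts, for illustration only (not taken from the study):
# 8 of 10 SNPs in binding regions are disease-associated, versus 2 of 10 outside.
fe = fold_enrichment(8, 10, 2, 10)
p = fisher_exact_greater(8, 10, 2, 10)
print(f"fold enrichment = {fe:.2f}, one-sided P = {p:.4f}")
```

In practice the same table could be passed to a statistics library; the pure-Python version is used here only to keep the sketch self-contained.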
