Transcription regulation: models for combinatorial regulation and functional specificity

Thomas, David John (2014) Transcription regulation: models for combinatorial regulation and functional specificity. Doctoral thesis (PhD), University of Sussex.


Gene regulation is controlled by transcription factor proteins that bind to specific DNA sequences, known as transcription factor binding sites (TFBSs). Combinations of transcription factors, working co-operatively in cis-regulatory modules (CRMs), play a role in regulating gene expression. Current computational methods for TFBS prediction cannot distinguish between functional and non-functional sites, and predict very large numbers of false positives.

The thesis focuses on the development of a novel computational model, based on artificial neural networks (ANNs), for the identification of functional TFBSs and the CRMs within which they operate in the human genome. Datasets of 12,239 experimentally verified true positive (TP) TFBSs and 130,199 false positive (FP) TFBSs were extracted using a combination of position weight matrices from the JASPAR database and experimentally verified sites from the Encyclopedia of DNA Elements (ENCODE). A number of machine learning algorithms were tested using a range of genetic information including gene expression, nucleosome positioning, DNA methylation states and DNA entropy. The best model, which gave a mean area under the receiver operating characteristic curve (AUC) of 0.800, was a feedforward ANN trained using backpropagation.
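As an illustration of the evaluation metric quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores, using the pairwise (Mann-Whitney) identity. It is not the code used in the thesis: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for a true positive site, 0 for a false positive site.
    scores: classifier outputs, higher meaning "more likely functional".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count (positive, negative) pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (not data from the thesis).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

In this toy example 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.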

This model was then used to predict functional TFBSs in a number of gene sets from the human genome. The predictions, combined with experimentally proven TFBSs from ENCODE, were used to investigate combinatorial patterns of TFBSs operating in CRMs. CRM patterns were analysed in disease-associated genes located in linkage disequilibrium blocks containing SNPs obtained from genome-wide association studies (GWAS).

The potential for the model's functional TFBS predictions to aid in the annotation of orphan genes of unknown function is discussed. In addition, this thesis presents computational work on a number of smaller published studies.

Item Type: Thesis (Doctoral)
Schools and Departments: School of Life Sciences > Biochemistry
Subjects: Q Science > QH Natural history > QH0301 Biology > QH0426 Genetics > QH0447 Genes. Alleles. Genome
Depositing User: Library Cataloguing
Date Deposited: 29 May 2014 13:04
Last Modified: 21 Sep 2015 13:30
