|Title||Bayesian Markov random field analysis for integrated network-based protein function prediction|
|Source||University. Promotor(s): Cajo ter Braak, co-promotor(s): Roeland van Ham. - [S.l.] : S.n. - ISBN 9789085859598 - 113|
PRI BIOS Applied Bioinformatics
|Publication type||Dissertation, internally prepared|
|Keyword(s)||statistics - bayesian theory - markov processes - network analysis - biostatistics - applied statistics - bioinformatics - proteins - genes - molecular biology|
|Categories||Mathematical Statistics / Bioinformatics (General)|
Unravelling the functions of proteins is one of the most important aims of modern biology. Experimental inference of protein function is expensive and does not scale to large datasets. In this thesis, a probabilistic method for protein function prediction is presented that integrates different types of data, such as sequences and networks. The method is based on Bayesian Markov Random Field (BMRF) analysis. BMRF was initially applied to genome-wide protein function prediction using network data in yeast, and also in Arabidopsis by integrating protein domains (i.e., InterPro signatures), expression data, and protein-protein interactions. Several of the predictions were confirmed by experimental evidence. Further, an evolutionary discrete optimization algorithm is presented that integrates function predictions for different Gene Ontology (GO) terms into a single prediction that is consistent with the True Path Rule as imposed by the GO Directed Acyclic Graph. This integration leads to predictions that are easy to interpret. Evaluation of this algorithm using Arabidopsis data showed improved prediction performance compared to single GO term predictions.
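The True Path Rule mentioned in the abstract states that a protein annotated with a GO term is implicitly annotated with every ancestor of that term in the GO Directed Acyclic Graph. The sketch below only illustrates what a consistent prediction means. It is not the evolutionary optimization algorithm of the thesis, and the toy graph `PARENTS` and the function names are hypothetical, chosen for illustration.

```python
from typing import Dict, Set

# Hypothetical toy DAG (not real GO identifiers): term -> direct parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A", "B"},  # a term may have several parents in a DAG
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all ancestors of `term`, excluding the term itself."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_consistent(labels: Set[str], parents: Dict[str, Set[str]]) -> bool:
    """True if every predicted term also has all of its ancestors predicted."""
    return all(ancestors(term, parents) <= labels for term in labels)


def close_upward(labels: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Smallest consistent superset: add every ancestor of each predicted term."""
    closed = set(labels)
    for term in labels:
        closed |= ancestors(term, parents)
    return closed
```

Under these assumptions, a prediction of `{"C"}` alone violates the rule because its ancestors are missing, while `close_upward({"C"}, PARENTS)` yields a consistent set. Adding ancestors is only the simplest repair; the thesis instead searches for a single consistent prediction with an evolutionary algorithm.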