De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology
Nijkamp, J.F.; Broek, M. van den; Datema, E.; Kok, S. de; Bosman, L.; Luttik, M.A.; Daran-Lapujade, P.; Vongsangnak, W.; Nielsen, J.; Heijne, W.H.M.; Klaassen, P.; Paddon, C.J.; Platt, D.; Kotter, P.; Ham, R.C.H.J. van; Reinders, M.J.T.; Pronk, J.T.; Ridder, D. de; Daran, J.M. - 2012
Microbial Cell Factories 11 (2012). - ISSN 1475-2859
l-arabinose - alcoholic fermentation - biotin-prototrophy - chemostat cultures - gene prediction - yeast genome - glucose - evolutionary - protein - xylose
Saccharomyces cerevisiae CEN.PK113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNVs), insertions/deletions (indels) and differences in genome organization compared to the reference strain S. cerevisiae S288C were analyzed. In addition to a few large deletions and duplications, nearly 3000 indels were identified in the CEN.PK113-7D genome relative to S288C. These differences were overrepresented in genes whose functions are related to transcriptional regulation and chromatin remodelling. Some of these variations were caused by unstable tandem repeats, suggesting an innate evolvability of the corresponding genes. Besides a previously characterized mutation in adenylate cyclase, the CEN.PK113-7D genome sequence revealed a significant enrichment of non-synonymous mutations in genes encoding components of the cAMP signalling pathway. Some phenotypic characteristics of the CEN.PK113-7D strain were explained by the presence of additional specific metabolic genes relative to S288C. In particular, the presence of the BIO1 and BIO6 genes correlated with the biotin prototrophy of CEN.PK113-7D. Furthermore, the copy number, chromosomal location and sequences of the MAL loci were resolved. The assembled sequence reveals that CEN.PK113-7D has a mosaic genome that combines characteristics of laboratory strains and wild-industrial strains.