The interspecific fungal hybrid Verticillium longisporum displays sub-genome-specific 1 gene expression 2 3

Hybridization is an important evolutionary mechanism that can enable organisms to adapt to environmental challenges. It has previously been shown that the fungal allodiploid species Verticillium longisporum, causal agent of Verticillium stem striping in rape seed, has originated from at least three independent hybridization events between two haploid Verticillium species. To reveal the impact of genome duplication as a consequence of the hybridization, we studied the genome and transcriptome dynamics upon two independent V. longisporum hybridization events, represented by the hybrid lineages “A1/D1” and “A1/D3”. We show that the V. longisporum genomes are characterized by extensive chromosomal rearrangements, including between parental chromosomal sets. V. longisporum hybrids display signs of evolutionary dynamics that are typically associated with the aftermath of allodiploidization, such as haploidization and a more relaxed gene evolution. Expression patterns of the two sub-genomes within the two hybrid lineages are more similar than those of the shared A1 parent between the two lineages, showing that expression patterns of the parental genomes homogenized within a lineage. However, as genes that display differential parental expression in planta do not typically display the same pattern in vitro, we conclude that sub-genome-specific responses occur in both lineages. Overall, our study uncovers the genomic and transcriptomic plasticity during evolution of the filamentous fungal hybrid V. longisporum and illustrate its adaptive potential.Verticillium is a genus of plant-associated fungi that include a handful of plant pathogens that collectively affect a wide range of hosts. On several occasions, haploid Verticillium species hybridized into the stable allodiploid species Verticillium longisporum, which is, in contrast to haploid Verticillium species, a Brassicaceae specialist. Here, we studied the evolutionary genome and transcriptome dynamics of V. longisporum and the impact of the hybridization. V. longisporum genomes display a mosaic structure due do genomic rearrangements between the parental chromosome sets. Similar to other allopolyploid hybrids, V. longisporum displays an ongoing loss of heterozygosity and a more relaxed gene evolution. Also, differential parental gene expression is observed, with an enrichment for genes that encode secreted proteins. Intriguingly, the majority of these genes displays sub-genome-specific responses under differential growth conditions. In conclusion, hybridization has incited the genomic and transcriptomic plasticity that enables adaptation to environmental changes in a parental allele-specific fashion.

rearrangements between homeologous chromosomes. To identify chromosomal 254 rearrangements after the hybridization event that led to the A1/D1 lineage, the genome of V. 255 longisporum strain VLB2 was aligned to that of strain VL20, revealing 24 syntenic breaks 256 ( Figure 3B). Rearrangement occurred in the majority of the chromosomes as only 2 and 3 257 chromosomes did not have syntenic breaks in VLB2 and VL20, respectively ( Figure 3B). As 258 genomic rearrangements are often associated with repeat-rich genome regions, such as in V. 259 dahliae (46, 48, 49), the synteny break points were tested for their association to repetitive 260 regions. Since the median repeat fraction in a 20 kb window around the repeats is 15.5%, 261 which is significantly more than the median repeat fraction based on random sampling 1 5 (average = 3.4%, σ = 1.7%) ( Figure S4), it can be concluded that the chromosomal 263 rearrangements are similarly associated with repeats also in V. longisporum. In conclusion, 264 chromosomal rearrangement rather than gene conversion is the main mechanism explaining 265 the mosaic structure of the V. longisporum genome. 266 267

V. longisporum loses heterozygosity through deletions 281
To study putative gene losses in the aftermath of hybridization, we determined genes that 282 have no homeolog or paralog, and can thus be considered to occur in single copy. For the 283 A1/D1 isolates, 15.3-15.4% of the genes occur in single copy, whereas this is 19.9% for 284 A1/D3 isolate PD589 ( Figure 2B). We checked if proteins encoded by single copy genes are 285 enriched for particular Gene Ontology (GO) terms, Clusters of Orthologous Groups (COGs), 286 or encoding a protein with a signal peptide, which suggests that these proteins are secreted. 287 No GO terms and COGs were enriched for the single copy genes in any of the V. 288 longisporum strains (Fisher's exact test with Benjamini-Hochberg correction, p-value < 289 0.05). In total, 7.8-10.2% of the single copy genes encodes a protein with a signal peptide, 290 which is a significantly lower than the 11.9-12.3% for genes with a homologous copy in the 291 same genome (Fisher's exact test, p-value < 0.05). Of the A1/D1 single copy genes, 52% 292 reside in the A1 sub-genome and 47% in the D1 sub-genome. Similarly, for PD589, 49% and 293 50% reside in the A1 and D3 sub-genome, respectively. Thus, single copy genes are equally 294 distributed across the two sub-genomes in V. longisporum. Single copy genes can either 295 originate from gene loss or from parent-specific contributions to the hybrid. Since VLB2 and 296 VL20 originate from the same hybridization event (40), we can quantify how many single 297 copy genes originate from gene loss during divergence of VLB2 and VL20. In total, 14.7-298 14.8% of the singly copy genes have at least one copy in each sub-genome of the other 299 A1/D1 strain, suggesting that gene deletion is an on-going process in V. longisporum 300 evolution. Of the single copy genes that lost their homeolog after the hybridization event, 301 48% resided in the species A1 sub-genome, whereas 51-52% in the D1 sub-genome, 302 suggesting that gene losses occurred to a similar extent in each of the sub-genomes.

Acceleration of gene evolution upon hybridization 304
To investigate the gene sequence evolution subsequent to hybridization, we compared the 305 ratio of non-synonymous (Ka) and synonymous (Ks) substitutions (ω) for branches leading to 306 Verticillium species (Figure 4). To exclude the putative impact of the (partial) chromosome 307 13 duplication in PD589, we excluded genes of this chromosome from the analysis. nubilum orthologs, but also higher than their lineage A1/D1 homologs from the 342 corresponding A1 and D sub-genomes to select genes that quickly evolved after the A1 and 343 D1/D3 last common ancestor. In total, 1,350 of the 3,823 (35.3%) analyzed genes were 344 quickly evolving in PD589 A1 sub-genome and 1,084 (28.4%) for the D3 sub-genome. We 345 screened for GO term, COG and secreted protein enrichments in these fast evolving A1/D3 346 genes and no enrichments for the COGs and for genes encoding secreted proteins were found.

9
In the A1 sub-genome 3 GO terms with a molecular function were significantly enriched, 348 associated with molecule binding (protein and ATP) and ATPase activity. In the D3 sub-349 genome, "ATP binding" was the only significantly enriched GO term, which was also 350 enriched in the A1 sub-genome. In conclusion, the more pronounced "gene relaxation" in the 351 A1/D3 lineage when compared with the A1/D1 lineage does not clearly seem to affect genes 352 with particular functions. with orthologs that are present in two homeologous copies in three V. longisporum strains 359 (VLB2, VL20, and PD589). Genes on chromosome 13 from strain PD589 and their homologs 360 were excluded from the analysis to avoid putative biases due to a (partial) chromosome 361 duplication, and in total 5,604 expressed genes were compared. RNA sequencing reads were 362 mapped to the predicted V. longisporum genes of which 50-51% mapped to species A1 363 homeologs and 49-50% to the D homeologs. Thus, we observed no global differences in 364 overall contribution to gene expression of the sub-genomes. Over half of the V. longisporum 365 homeologs display no differential expression with their V. dahliae ortholog, indicating that 366 the majority of the genes did not evolve differential expression patterns ( Figure 5A). In both 367 lineages, higher numbers of differently expressed genes were found in the A1 sub-genome 368 than in the D sub-genomes; 27 vs. 23% for A1/D1 and 38 vs. 34% for A1/D3, respectively. 369 The higher fraction of differentially expressed A1 genes is in accordance with the more 370 distant phylogenetic relationship of parent A1 with V. dahliae than of the D parents ( Figure  371 S3). Intriguingly, although D3 diverged more recently from V. dahliae than D1, D3 has more 372 2 0 differentially expressed orthologs with V. dahliae than D1. When comparing expression 373 patterns between sub-genomes, 11-13% of the genes display differential expression between 374 their A1 and D homeologs. Intriguingly, this is more than half the number of differentially 375 expressed D and V. dahliae orthologs (23-34%), despite the fact that the D parents diverged 376 more recently from V. dahliae than from species A1 ( Figure S3). In general, the gene 377 expression patterns of the A1 and D sub-genomes of the same hybridization event are highly 378 correlated (0.93-0.96), higher than D sub-genomes and V. dahliae strain JR2 (0.85-0.89) and 379 higher than the A1 sub-genomes between hybridization events (0.82-0.84) ( Figure 5B; Table  380 S2). To compare these expression patterns with the gene expression variation between 381 different V. dahliae strains, we sequenced RNA from the cotton-infecting V. dahliae strain 382 CQ2 grown in potato dextrose broth. Although JR2 and CQ2 belong to the same species, 383 their overall gene expression pattern is more dissimilar (ρ = 0.89) than that of V. longisporum 384 sub-genomes ( Figure 5B; Table S2). The overall discrepancy in phylogenetic relationship and 385 expression pattern similarities suggests that sub-genome expression patterns of the sub-386 genomes in V. longisporum homogenized upon hybridization. 387

Differential homeolog expression occurs in particular gene categories 395
Although parental gene expression patterns appear to have globally homogenized upon 396 hybridization, differential homeolog expression occurs as well ( Figure 5). To assess if genes 397 with differential homeolog expression belong to specific gene groups, we screened for 398 functional enrichments. In total, 10% of the fast-evolving PD589 genes (defined in the 399 previous section) have different homeolog expression, which is significantly lower than the 400 12% different homeolog expression for the remainder of the genes (Fisher's exact test, P < 401 0.05). In both A1/D1 and A1/D3 lineages, genes with differential homeolog expression are 402 enriched for GO terms related to oxidation-reduction processes, transmembrane transport and 403 FAD binding ( Figure 6A, Table S3). Additionally, the COGs "carbohydrate transport and 404 metabolism" and "secondary metabolites biosynthesis, transport, and catabolism" (Q) are 405 enriched in both lineages (Table S3). Furthermore, we tested if genes encoding secreted 406 proteins were significantly enriched among the genes with differential homeolog expression. 407 Indeed, 23 and 16 % of the genes with different homeolog expression code for a secreted 408 protein in the lineage A1/D1 isolates and in the A1/D3 isolate, respectively, whereas this is 409 9% of the genes that do not display differential expression among homeologs (VLB2 P 410 =1.23E-32, VL20 P =3.71E-29 and PD589 P =1.14E-08, Fisher's exact test). In conclusion, 411 differential homeolog expression seems to be important for particular gene categories, 412 including categories that can be implicated in plant pathogenicity. the V. longisporum strains VLB2, VL20, and PD589. As observed previously, oilseed rape 420 plants inoculated with VLB2 and PD589 developed typical Verticillium symptoms including 421 stunted plant growth and leaf chlorosis (51). In contrast, oilseed rape plants inoculated with 422 VL20 did not display any disease symptoms. Consequently, we performed total RNA 423 sequencing for oilseed rape plants inoculated with V. longisporum strains VLB2 and PD589. 424 For strain PD589, genes on chromosome 13 and their homeologs were removed from the 425 analysis. For VLB2 and PD589, 51% of the reads mapped to the A1 sub-genome and 49 to 426 the D sub-genome. Thus, similar to in vitro grown V. longisporum, we did not observe any 427 global difference in overall contribution to gene expression of one of the sub-genomes in 428 planta. In total, 1.1% and 2.7% of the homeologs displayed differential expression in planta, 429 which is less than the 11.3 and 13.4% found for VLB2 and PD589 grown in vitro, 430 respectively. Genes with differential homeolog expression in planta were not enriched for 431 any GO term in the A1/D1 strain VLB2 (Table S3), whereas in the A1/D3 strain PD589, 432 differentially expressed homeologs were enriched for GO terms associated with oxidation-433 reduction processes and molecular binding (iron ion, heme, flavin adenine dinucleotide and 434 FAD) ( Figure 6A, Table S3). For A1/D1 and A1/D3, genes with different homeolog 435 expression were enriched for those encoding secreted proteins; 25% of differentially 436 expressed homeologs encode secreted proteins and 8-9% of the non-differentially expressed 437 homeologs encode other proteins (P < 0.05, Fisher's exact test). Thus, similar to in vitro V. 438 longisporum grown, differential homeolog expression in planta is especially important for 439 genes encoding secreted proteins. In 33% of these secretome genes with differential 440 homeolog expression in planta, no Pfam domain could be annotated, which is a feature often 441 observed for effector proteins as they are often examples of biological innovation (52). Of 442 these genes that could be functionally annotated, a carbohydrate-active enzyme (CAZyme) 443 function was annotated in 32% of the cases. The remaining part of the functionally annotated 444 2 4 genes with differential homeolog expression included other enzymes, such as proteases, 445 lipases, carboxylesterases and peroxidases. We compared genes with differential homeolog 446 expression in planta and in vitro to assess potential correlation. Intriguingly, over half (54-447 55%) of the differentially expressed homeologs in planta are not differentially expressed in 448 culture medium or have the inverse expression pattern, e.g. in vitro: A1>D and in planta 449 A1<D ( Figure 6B). Thus, over half of the genes with a differential homeolog expression in 450 planta display a homeolog-specific response compared to in vitro growth. For VLB2, 19% of 451 these genes with a homeolog-specific response encode secreted proteins, whereas 32% of 452 genes with similar differential homeolog expression in planta and in vitro encode secreted 453 proteins. The opposite pattern in observed for PD589, i.e. 30% with a homeolog-specific 454 response and 19% with similar differential homeolog expression in planta and in vitro. 455 However, these differences were not significant (P > 0.05, Fisher's exact test). In conclusion, 456 different growing conditions cause homeolog-specific changes in the majority of the V. 457 longisporum genes with differential homeolog expression, which is enriched in genes that 458 encode secreted proteins. thermotolerance. Thus, more generally, hybrid fungi, comprising natural as well as artificial 485 hybrids, respond to environmental change in an allele-specific manner, especially for genes 486 that manipulate or mitigate environmental changes. Secretome genes with a differential 487 homeolog expression in planta often have an enzymatic function or lack an annotated Pfam 488 domain, which is a feature often observed for effector proteins that act in pathogenicity (52). 489 Thus, conceivably, homeolog specific responses in planta occurs in genes that are important 490 for host colonization. Similarly, differential homeolog expression in the hybrid opportunistic 491 human pathogen Candida orthopsilosis involves genes that are implicated in host 492 interactions, related to superoxide dismutase activity and zinc metabolism (55). 493 2 7 Although differential homeolog expression occurs, the general tendency is that 494 expression patterns between the A1 and D sub-genomes homogenized upon hybridization 495 ( Figure 5). Despite the absence of A1 and D1 species due to their enigmatic nature, we can 496 conclude that parental gene expression patterns homogenized in the aftermath of 497 hybridization as sub-genome expression patterns display more resemblance than the 498 expression pattern between V. dahliae and the D parents and between the A1 sub-genomes of 499 different hybridization events ( Figure 5B; Table S2). Homogenization of parental expression 500 patterns has been similarly observed in the fungal allopolyploid Epichloë Lp1 (36) as well as 501 in the artificial hybrid S. cerevisiae x S. uvarum where the extent of differential ortholog 502 expression between the parents was diminished upon hybridization (56). Thus, gene 503 expression homogenization seems to be a more general phenomenon in fungi. Gene 504 expression divergences may evolve through mutations in regulatory sequences of the gene 505 itself (cis-effects), such as promoter elements, or alterations in other regulatory factors (trans-506 effects), such as chromatin regulation (57, 58). Conceivably, the higher correlation in 507 homeolog expression patterns than parental ortholog expression patterns originates from 508 changes in trans regulators as homeologs, in contrast to orthologs, share the same nuclear 509 environment (58). Intriguingly, parent D3 has more genes that are differentially expressed to 510 V. dahliae orthologs than parent D1, despite that D3 diverged more recently from V. dahliae 511 than D1 ( Figure 5, Figure S4). Correspondingly, the A1 sub-genome of the lineage A1/D3 512 displays more differential gene expression with V. dahliae than the A1 sub-genome of the 513 A1/D1 lineage. This can indicate that A1 and D3 hybridized before A1 and D1, as more 514 distinct expression patterns may have evolved over time. 515 In addition to the transcriptomic plasticity of homeolog expression upon 516 environmental changes, V. longisporum is also plastic on a genomic level, which is displayed 517 by its mosaic structure ( Figure 1A, Table S1). Mosaicism is also observed in the grass 518 2 8 pathogen Zymoseptoria pseudotritici, which is a close relative of the prominent wheat 519 pathogen Zymoseptoria tritici (29). Z. pseudotritici is a homoploid hybrid that displays 520 mosaicism on a population level where genome regions inherited from one parent display low 521 variation, whereas high variable genome regions were transmitted from both parents. V. 522 longisporum mosaicism is caused by extensive genomic rearrangements after hybridization 523 ( Figure 2B The V. longisporum D genomes globally display accelerated evolution when 540 compared with their V. dahliae orthologs (Figure 4), which may be a consequence of genome 541 doubling. Interestingly, the V. longisporum A1/D3 lineage strain PD589 encountered a more 542 divergent gene evolution when compared with the A1/D1 lineage strains VLB2 and VL20 in 543 2 9 both sub-genomes, indicating that the A1/D3 hybridization occurred prior to the A1/D1 544 hybridization as a longer allodiploid state could facilitate extended sequence divergence (65). 545 However, accelerated evolution is not consistently observed in fungi as deceleration upon 546 allopolyploidization has been recorded in the fungal genus Trichosporon (66). Arguably, 547 environmental cues play an important role in the speed and grade of gene diversification upon 548 allopolyploidization (67). Possibly, accelerated gene evolution in V. longisporum is cued by a 549 host range alteration as it is, in contrast to haploid Verticillium species, a Brassicaceae 550 specialist (42). However, we did not find functional enrichments in fast evolving genes that 551 points towards that hypothesis. Moreover, as the A1 species remains enigmatic, we cannot be 552 sure a host shift occurred (39, 41). 553 Whole-genome duplication events are typically followed by extensive gene loss, often 554 leading to reversion to the original ploidy state (68). For instance, the artificial interspecific 555 hybrid S. cerevisiae x S. uvarum encountered nine independent events where loss of 556 heterozygosity occurred after evolving for hundreds of generations under nutrient-limited 557 conditions (69). Heterozygosity loss has only proceeded to a limited extent in V. 558 longisporum, as 84% of lineage A1/D1 genes and 79% of lineage A1/D3 genes are present in 559 two copies, whereas the haploid V. dahliae only contains 0.4% of its genes in two copies 560 ( Figure 2B). Thus, the V. longisporum genome displays the symptoms of a recent allodiploid, 561 with gene loss being an on-going process that by now has only progressed marginally. 562 Heterozygosity loss can indicate deleterious epistatic interactions between parental genomes 563 that need to homogenize in order for the hybrid to be viable. Similar to other fungal hybrids 564 (69, 70), we did not observe a specific group of genes where loss of heterozygosity was 565 selected for. The degree of haploidization is a third indication that the A1/D3 lineage likely 566 hybridized prior to A1/D1, as haploidization progress further in A1/D3 than in A1/D1 (Figure  567 2B). C. orthopsilosis hybrids from different hybridization events have different degrees of 568 3 0 heterozygosity loss but genes were homeologs are maintained in both hybrids are enriched 569 for those to have differential homeolog expression (55). Although species often revert to their 570 original ploidy state after polyploidization, a retention of both homeolog copies can also be 571 evolutionary advantageous, for instance to respond in a parental allele-specific fashion to 572 environmental cues (Figure 6). 573 574

Conclusion 575
Allodiploidization is an intrusive evolutionary mechanism in fungi where two chromosome 576 sets from parents with a distinct evolutionary history merge. Consequently, most genes obtain 577 an additional gene copy that can be differentially regulated according to the environmental 578 conditions. Thus, allodiploid fungi can respond in a parental allele-specific fashion to 579 environmental cues. Besides such parental allele-specific gene expression, allodiploidization 580 furthermore contributed to a dynamic genome evolution through rearrangements between 581 parental chromosome sets and accelerated gene evolution in V. longisporum. Thus, in 582 comparison to haploid Verticillium species, V. longisporum has a high adaptive potential that 583 can contribute to host immunity evasion and may explain its specialization towards 584 Brassicaceous plant hosts. 585 3 1

V. longisporum genome sequencing and assembly 587
Genome assemblies of the V. longisporum strains VLB2 and VL20 were previously 588 constructed using long reads obtained through single-molecule real-time (SMRT) sequencing 589 (40). Here, we sequenced V. longisporum strains PD589 using Oxford Nanopore Technology 590 (ONT). In order to obtain DNA of PD589, spores were harvest from PDA plates and grown 591 2% CTAB) and Sarkosyl (10% w/v) in a 2:2:1 ratio. Subsequently, ½ volume of 596 phenol/chloroform/isoamyl alcohol (25:24:1) was added, shaken vigorously and incubated at 597 room temperature (RT) for 5 minutes before centrifugation at maximum speed (16,000 rpm) 598 for 15 minutes (RT). The upper (aqueous phase) layer was transferred to a new tube, 5 μ L of 599 RNAase (10 mg/μL) was added and incubated at 37°C for one hour. Next, ½ volume of 600 chloroform was added, mixed and centrifuged at maximum speed for 10 minutes at RT. The 601 upper layer was transferred to a new tube and a second chloroform wash step was performed. 602 After transferring the upper layer to a new tube, it was mixed with 1 volume (~ 800 μ L) of 603 100% ice-cold ethanol by gently inverting the tube and finally the DNA was fished out and 604 washed twice by applying 500 μ L of 70% ethanol. Finally, the DNA was air-dried, 605 resuspended in nuclease-free water and stored at 4°C overnight. The DNA quality, size and 606 quantity were assessed by nanodrop, gel electrophoresis and Qubit analyses, respectively. 607 To sequence the V. longisporum strain PD589 DNA, a library was prepared as 608 described in the manufactures protocol provided by ONT (SQK-RAD004) with an initial 609 amount of ~ 400 ng HMW DNA. The library was loaded onto a R9.4.1 flow cell which ran 610