cerevisiae with a much higher number. This yeast therefore seems to differ clearly from filamentous fungi in that it possesses a considerably lower number of O-glycosylated proteins (Table 1), only partially explained by its smaller genome size, but these proteins are more extensively O-glycosylated (Figure 2).

Figure 2 Frequency distribution of the number of O-glycosylation sites per protein predicted by NetOGlyc. The inset displays the average number of O-glycosylated residues per protein, corrected by multiplying by 0.68 to compensate for the overestimation of O-glycosylated sites produced by the server on fungal proteins. See details in the text.

Looking at individual proteins, we can find some with an extremely high number of O-glycosylation sites (Additional file 2). The protein with the highest proportion of predicted O-glycosylated residues is the M. grisea protein MG06773.4, of unknown function, with about half of its 819 amino acids predicted to be O-glycosylated. Next is the S. cerevisiae protein YIR019C (Muc1), a mucin-like protein necessary for the yeast to grow in a filamentous pseudohyphal form. Muc1 is a 1367-amino-acid protein, of which 42% of the residues are predicted to be O-glycosylated.
Similar examples can be found in the rest of the genomes, with at least a few proteins predicted to have more than 25% of their residues O-glycosylated.

Fungal proteins are rich in pHGRs

The glycosylation positions obtained from NetOGlyc were analyzed with the MS Excel macro XRR in search of O-glycosylation-rich regions. The raw results can be found in Additional file 3 and a summary is presented in Table 2. All the genomes analyzed code for plenty of secretory proteins with pHGRs. Between 18% (S. cerevisiae) and 31% (N. crassa) of all proteins with a predicted signal peptide contain at least one pHGR. The average length of pHGRs was similar for the eight genomes, varying between 32.3 residues (U. maydis) and 66.9 residues (S. cerevisiae), although pHGRs of any length could be found, from the minimum of 5 residues to several hundred. All genomes coded for proteins predicted to have quite large pHGRs, the record being the 821-aa pHGR found in the S. cerevisiae protein Muc1 discussed above. Globally, we could summarize these data by saying that among the set of secretory fungal proteins predicted by NetOGlyc to be O-glycosylated, about one fourth show at least one pHGR having a mean length of 23.6 amino acids and displaying, on average, an O-glycosylated Ser or Thr residue every four amino acids.
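To make the idea of searching predicted glycosylation positions for O-glycosylation-rich regions more concrete, the following Python sketch groups predicted sites into dense regions and then computes the two summary figures discussed above (the fraction of proteins with at least one region, and the mean region length). It is not the XRR macro, whose algorithm and parameters are not given in this section. The gap-based clustering rule, the function names `find_rich_regions` and `summarize`, the parameters `max_gap` and `min_sites`, and the demonstration data are all assumptions made for illustration; only the 5-residue minimum length is taken from the text.

```python
from typing import Dict, List, Tuple

# (start, end, number of predicted sites); positions are 1-based and inclusive.
Region = Tuple[int, int, int]


def find_rich_regions(sites: List[int], max_gap: int = 4,
                      min_len: int = 5, min_sites: int = 2) -> List[Region]:
    """Cluster predicted O-glycosylation positions into dense regions.

    Hypothetical rule (NOT the XRR macro): consecutive sites at most
    `max_gap` residues apart belong to the same region. A region is kept
    if it spans at least `min_len` residues (5, the minimum quoted in the
    text) and contains at least `min_sites` sites (an assumption).
    """
    regions: List[Region] = []
    ordered = sorted(set(sites))
    if not ordered:
        return regions

    def close(start: int, end: int, count: int) -> None:
        # Keep the candidate only if it passes both thresholds.
        if end - start + 1 >= min_len and count >= min_sites:
            regions.append((start, end, count))

    start = prev = ordered[0]
    count = 1
    for pos in ordered[1:]:
        if pos - prev <= max_gap:
            count += 1                 # site extends the current region
        else:
            close(start, prev, count)  # gap too large: finish the region
            start = pos
            count = 1
        prev = pos
    close(start, prev, count)
    return regions


def summarize(proteins: Dict[str, List[int]]) -> Tuple[float, float]:
    """Return (fraction of proteins with >= 1 region, mean region length)."""
    lengths: List[int] = []
    with_region = 0
    for sites in proteins.values():
        regions = find_rich_regions(sites)
        if regions:
            with_region += 1
        lengths.extend(end - start + 1 for start, end, _ in regions)
    fraction = with_region / len(proteins) if proteins else 0.0
    mean_len = sum(lengths) / len(lengths) if lengths else 0.0
    return fraction, mean_len


if __name__ == "__main__":
    # Made-up site positions, for demonstration only.
    demo = {
        "protA": [3, 5, 6, 9, 30, 50, 52, 53, 55, 58],
        "protB": [10, 40],
        "protC": [],
    }
    print(find_rich_regions(demo["protA"]))
    print(summarize(demo))
```

The default `max_gap` of 4 loosely mirrors the average spacing of one glycosylated Ser or Thr every four amino acids reported above, but that figure is an observed average and not a detection threshold, so a real analysis would need the parameters actually used by the authors.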