However, the availability of complete genome sequences for only a few strains is insufficient to interrogate the extent of the genetic diversity of H. influenzae and its close species relatives. In this study, a detailed analysis of 18 H. influenzae type b (Hib) strains compared to a common reference identified regions of high SNP density or sequence mismatches consistent with inter-strain exchange of DNA, most plausibly derived from other H. influenzae strains through transformation rather than phage or conjugative transfer. Further evidence for the role of transformation in the import of novel sequence flanked by regions of DNA found in both the donor and recipient was obtained by sequencing DNA from a pool of strains, each transformed with DNA from a heterologous donor Hib strain.

Results

Whole genome sequencing of 85 strains of Haemophilus spp

The genomes of 96 strains of Haemophilus spp. (Table 1) were sequenced using the Illumina GAII platform. For the 85 strains where sufficient coverage had been attained, genome sequences of between 1.27 Mbp and 1.91 Mbp in length were assembled by Velvet [14] (Table 1). The sequencing and assembly resulted in between 351 and 1521 contigs per strain, with a median of 785 contigs per assembled genome. The genome sequences were partial, and their %G+C content (37.94 to 40.39%) was higher than expected based on data from other completed H. influenzae genomes (38.01-38.15%). DNA similarity searches and mapping of the sequence reads using MAQ [15] confirmed that the higher %G+C regions of the genomes had been preferentially sequenced, a known issue with early versions of the Illumina sequencing chemistry. We estimated the average genome coverage to be 83%, based on comparison with extant complete H. influenzae genome sequences; these data represent a ten-fold increase in the amount of genome sequence information
available for H. influenzae.

Table 1 Haemophilus strains selected for study

| Strain name | Type | Geographic location | Year | Length of sequence (Mb) | Disease/Site of isolation |
|---|---|---|---|---|---|
| RM7190 | a | Malaysia | 1973 | 1.5 | meningitis |
| RM6062 | a | England | 1965 | 1.5 | nasopharynx |
| RM6064 | a | England | 1966 | 1.5 | pleural fluid |
| RM6073 | a | England | 1966 | 1.6 | bronchitis |
| RM7017 | b | Ghana | 1983 | 1.6 | CSF |
| RM7060 | b | New York, USA | 1971 | 1.5 | nasopharynx |
| RM7414 | b | Kenya | 1980s | 1.5 | |
| RM7419 | b | Kenya | 1980s | 1.5 | |
| RM7651 | b | Norway | 1976 | 1.7 | |
| DC11238 | b | UK | 2003 | 1.8 | meningitis |
| DC800 | b | UK | 1989 | 1.9 | meningitis |
| DC8708 | b | UK | 2000 | 1.8 | |
| DCG1574 | b | Gambia | 1993 | 1.8 | nasopharynx |
| Eagan | b | | | 1.5 | |
| RM7578 | b | Switzerland | 1983 | 1.8 | |
| RM7582 | b | RSA | 1980s | 1.8 | |
| RM7598 | b | USA | 1985 | 1.8 | |
| RM7018 | b* | Ghana | 1983 | 1.4 | CSF |
| RM7122 | b* | Australia | <1984 | 1.5 | meningitis |
| RM7459 | b* | Iceland | 1984 | 1.4 | CSF |
| RM7465 | b* | Iceland | 1985 | 1.6 | CSF |
| RM7617 | b* | Malaysia | 1970s | 1.5 | CSF |
| RM6132 | c | England | 1964 | 1.

