Professor of Biological Sciences
of Biological Sciences,
Director, Center for Science Education, University of South Carolina
Bacterial Genome Evolution
We are studying the genomes of the caulobacters and the bacteriophage that attack them to understand the roles of genome rearrangements, mutations, and horizontal gene transfer in genome evolution. Caulobacters have an interesting life cycle (shown above) that includes the production of an immature cell which must go through a process of differentiation before it can divide. We have shown that extensive genome rearrangements have occurred among the Caulobacter species studied to date. However, similar rearrangements have not occurred among the phages that infect these bacteria. This project involves the isolation of bacteria and bacteriophage from the wild, characterization of the isolates, genome sequencing, and comparing the resulting genomes using bioinformatic tools to identify patterns of mutation, gene acquisition, and genome rearrangements.
Selected Publications (list of 112 total):
Bacteriophages remain an understudied component of bacterial communities. Therefore, our laboratory has initiated an effort to isolate large numbers of bacteriophages that infect Caulobacter crescentus to provide an estimate of the diversity of bacteriophages that infect this common environmental bacterium. The majority of the new isolates are phicbkviruses, a genus of giant viruses that appear to be Caulobacter specific. However, we have also isolated several Podoviruses with icosahedral heads and small tails. One of these Podoviruses, designated Lullwater, is similar to two previously isolated Caulobacter phages, Cd1 and Percy. All three have genomes that are approximately 45 kb and contain approximately 30 genes. The gene order is conserved among the three genomes, with one of the genes coding for a DNA polymerase that has homology to the family of T7 DNA polymerases. Phylogenetic trees based on either the DNA polymerase or the RNA polymerase amino acid sequences suggest that the three phages represent a new branch of the T7virus tree. Based on these similarities, we concluded that Cd1, Lullwater, and Percy comprise a new group in the T7virus genus.
Ash, K., K. M. Drake, W. S. Gibbs, and B. Ely. 2017. Genomic diversity of type B3 bacteriophages of Caulobacter crescentus. Current Microbiology 74:779-786. DOI: 10.1007/s00284-017-1248-4.
The genomes of the type B3 bacteriophages that infect Caulobacter crescentus are among the largest phage genomes thus far deposited into GenBank with sizes over 200 kb. In this study, we introduce six new bacteriophage genomes which were obtained from phage collected from various water systems in the southeastern United States and from tropical locations across the globe. A comparative analysis of the 12 available genomes revealed a “core genome” which accounts for roughly 1/3 of these bacteriophage genomes and is predominately localized to the head, tail, and lysis gene regions. Despite being isolated from geographically distinct locations, the genomes of these bacteriophages are highly conserved in both genome sequence and gene order. We also identified the insertions, deletions, translocations, and horizontal gene transfer events which are responsible for the genomic diversity of this group of bacteriophages and demonstrated that these changes are not consistent with the idea that modular reassortment of genomes occurs in this group of bacteriophages.
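As a rough illustration of the "core genome" idea in this abstract, the sketch below treats each genome as a set of gene-family identifiers and takes the intersection across all genomes. The genome names and family labels are hypothetical placeholders, not data from the study, and a real analysis would first have to cluster homologous genes into families.

```python
from functools import reduce

def core_genome(genomes):
    """Return the gene families present in every genome.

    `genomes` maps a genome name to a set of gene-family identifiers.
    """
    if not genomes:
        return set()
    return reduce(set.intersection, genomes.values())

# Hypothetical toy data: three phage genomes with made-up family labels.
toy = {
    "phageA": {"capsid", "tail_fiber", "lysin", "orf17"},
    "phageB": {"capsid", "tail_fiber", "lysin", "orf42"},
    "phageC": {"capsid", "tail_fiber", "lysin", "orf17", "orf99"},
}

core = core_genome(toy)
# Fraction of each genome's gene families that belong to the core.
core_fraction = {name: len(core) / len(fams) for name, fams in toy.items()}
```

Counting gene families rather than base pairs is a simplification; the "roughly 1/3" figure in the abstract may have been computed differently.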
Scott, D., and B. Ely. 2016. Conservation of the essential genome among Caulobacter and Brevundimonas species. Current Microbiology 72:503-510. DOI: 10.1007/s00284-014-0721-6.
When the genomes of Caulobacter isolates NA1000 and K31 were compared, numerous genome rearrangements were observed. In contrast, similar comparisons of closely related species of other bacterial genera revealed only minimal rearrangements. A phylogenetic analysis of the 16S rRNA indicated that K31 is more closely related to Caulobacter henricii CB4 than to other known Caulobacters. Therefore, we sequenced the CB4 genome and compared it to all of the available Caulobacter genomes to study genome rearrangements, discern the conservation of the NA1000 essential genome, and address concerns about using 16S rRNA to group Caulobacter species. We also sequenced a novel bacterium, Brevundimonas DS20, a representative of the genus most closely related to Caulobacter, and used it as part of an outgroup for phylogenetic comparisons. We expected to find that there would be fewer rearrangements when comparing more closely related Caulobacters. However, we found that relatedness was not correlated with the amount of observed “genome scrambling”. We also discovered that nearly all of the essential genes previously identified for C. crescentus are present in the other Caulobacter genomes and in the Brevundimonas genomes as well. However, a few of these essential genes were only found in NA1000, and some were missing in a combination of one or more species, while other proteins were 100% identical across species. Also, phylogenetic comparisons of highly conserved genomic regions revealed clades similar to those identified by 16S rRNA-based phylogenies, verifying that 16S rRNA sequence comparisons are a valid method for grouping Caulobacters.
Callahan, C. T., K. M. Wilson, and B. Ely. 2015. Characterization of the proteins associated with Caulobacter crescentus bacteriophage CbK particles. Current Microbiology 72:75-80. DOI: 10.1007/s00284-015-0922-7.
Bacteriophage genomes contain an abundance of genes that code for hypothetical proteins with either a conserved domain or no predicted function. The Caulobacter phage CbK has an unusual shape, designated morphotype B3, that consists of an elongated cylindrical head and a long flexible tail. To identify CbK proteins associated with the phage particle, intact phage particles were subjected to SDS-PAGE, and the resulting protein bands were digested with trypsin and analyzed using MALDI mass spectrometry to provide peptide molecular weights. These peptide molecular weights were then compared with the peptides that would be generated from the predicted amino acid sequences that are coded by the CbK genome. This comparison of the actual and predicted peptide masses resulted in the identification of single genes that could code for the set of peptides derived from each of the 20 phage proteins. We also found that CsCl density gradient centrifugation resulted in the separation of empty phage heads, phage heads containing material organized in a spiral, isolated phage tails, and other particulate material from the intact phage particles. This additional material proved to be a good source of additional phage proteins, and preliminary results suggest that it may include a CbK DNA replication complex.
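The comparison of observed and predicted peptide masses described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the paper: it assumes complete tryptic digestion (cleavage after K or R unless followed by P, no missed cleavages), unmodified residues, neutral monoisotopic masses, and an arbitrary 0.5 Da tolerance. The protein names and sequences are hypothetical, and splitting on an empty match requires Python 3.7 or later.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def match_proteins(observed, proteins, tol=0.5):
    """Count, per protein, the observed masses that fall within `tol` Da
    of one of that protein's predicted tryptic peptide masses."""
    hits = {}
    for name, seq in proteins.items():
        predicted = [peptide_mass(p) for p in tryptic_peptides(seq)]
        hits[name] = sum(
            any(abs(m - q) <= tol for q in predicted) for m in observed
        )
    return hits

# Hypothetical example: two made-up proteins and two "observed" masses.
proteins = {"gp1": "AGKSVR", "gp2": "LLKTTR"}
observed = [peptide_mass("AGK"), peptide_mass("SVR")]
hits = match_proteins(observed, proteins)
```

Real MALDI spectra usually report singly protonated ions, so observed values would need a proton mass subtracted before a comparison like this one.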
Ely, B., W. Gibbs, S. Diez, and K. Ash. 2015. The Caulobacter crescentus transducing phage Cr30 is a unique member of the T4-like family of myophages. Current Microbiology 70:854-858. DOI: 10.1007/s00284-015-0799-5.
Bacteriophage Cr30 has proven useful for the transduction of Caulobacter crescentus. Nucleotide sequencing of Cr30 DNA revealed that the Cr30 genome consists of 155,997 bp of DNA that codes for 287 proteins and five tRNAs. In contrast to the 67 % GC content of the host genome, the GC content of the Cr30 genome is only 38 %. This lower GC content causes both the codon usage pattern and the amino acid composition of the Cr30 proteins to be quite different from those of the host bacteria. As a consequence, the Cr30 mRNAs probably are translated at a rate that is slower than the normal rate for host mRNAs. A phylogenetic comparison of the genome indicates that Cr30 is a member of the T4-like family that is most closely related to a new group of T4-like phages exemplified by ϕM12.
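The GC content figures quoted in this abstract are simple base-composition fractions. A minimal sketch of the calculation is shown below; it assumes that ambiguous bases such as N should be ignored, which is a choice made here for illustration rather than something stated in the paper.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = sum(seq.count(b) for b in "GC")
    total = gc + sum(seq.count(b) for b in "AT")
    return gc / total if total else 0.0

# Toy sequences only; a real genome would be read from a FASTA file.
example = gc_content("GGCCAT")  # four of six bases are G or C
```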
Patel, S., B. Fletcher, D. C. Scott, and B. Ely. 2015. Genome sequence and phenotypic characterization of Caulobacter segnis. Current Microbiology 70:355-363. DOI: 10.1007/s00284-014-0726-1.
Caulobacter segnis is a unique species of Caulobacter that was initially deemed Mycoplana segnis because it was isolated from soil and appeared to share a number of features with other Mycoplana. After a 16S rDNA analysis showed that it was closely related to Caulobacter crescentus, it was reclassified as Caulobacter segnis. Because the C. segnis genome sequence available in GenBank contained 126 pseudogenes, we compared the original sequencing data to the GenBank sequence and determined that many of the pseudogenes were due to sequence errors in the GenBank sequence. Consequently, we used multiple approaches to correct and reannotate the C. segnis genome sequence. In total, we deleted 247 bp, added 14 bp, and changed 8 bp, resulting in 233 fewer bases in our corrected sequence. The corrected sequence contains only 15 pseudogenes compared to 126 in the original annotation. Furthermore, we found that unlike Mycoplana, C. segnis divides by fission, producing swarmer cells that have a single, polar flagellum.
Ash, K., T. Brown, T. Watford, L. E. Scott, C. Stephens, and B. Ely. 2014. A comparison of the Caulobacter NA1000 and K31 genomes reveals extensive genome rearrangements and differences in metabolic potential. Open Biology 4:140128. DOI: 10.1098/rsob.140128.
The genus Caulobacter is found in a variety of habitats and is known for its ability to thrive in low nutrient conditions. K31 is a novel Caulobacter isolate that has the ability to tolerate copper and chlorophenols and can grow at 4 °C with a doubling time of 40 hours. K31 contains a 5.5 Mb chromosome that codes for more than 5500 proteins and two large plasmids, 234 kb and 178 kb, that code for 438 additional proteins. A comparison of the K31 and the C. crescentus NA1000 genomes revealed extensive rearrangements of gene order, suggesting that the genomes had been randomly scrambled. However, a careful analysis revealed that the distance from the origin of replication was conserved for the majority of the genes and that many of the rearrangements involved inversions that included the origin of replication. On a finer scale, numerous small indels were observed. K31 proteins involved in essential functions shared 80-95% amino acid sequence identity with their C. crescentus homologs, while other homolog pairs tended to have lower levels of identity. In addition, the K31 chromosome contains more than 1600 genes with no homolog in NA1000.
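One way to see why inversions that include the origin of replication can scramble gene order while conserving distance from the origin is the small sketch below. It models the chromosome as a circle and an inversion as a reflection about its midpoint. When the midpoint coincides with the origin, each gene keeps its distance from the origin even though it moves to the other replichore. The coordinates are hypothetical, and real inversions need not be perfectly symmetric, so this is an idealized illustration rather than the analysis used in the paper.

```python
def origin_distance(pos, ori, length):
    """Shortest distance from `pos` to the origin on a circular chromosome."""
    d = (pos - ori) % length
    return min(d, length - d)

def invert_about(pos, center, length):
    """New position of `pos` after an inversion whose midpoint is `center`."""
    return (2 * center - pos) % length

LENGTH = 1000  # hypothetical chromosome length (arbitrary units)
ORI = 0        # hypothetical origin position

gene = 100
moved_symmetric = invert_about(gene, ORI, LENGTH)  # inversion centered on the origin
moved_offset = invert_about(gene, 250, LENGTH)     # inversion centered elsewhere

before = origin_distance(gene, ORI, LENGTH)
after_symmetric = origin_distance(moved_symmetric, ORI, LENGTH)
after_offset = origin_distance(moved_offset, ORI, LENGTH)
```

In this toy case the origin-centered inversion moves the gene from position 100 to 900 but leaves its origin distance at 100, whereas the inversion centered at 250 changes the distance to 400.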
Scott, D., and B. Ely. 2015. Comparison of genome sequencing technology and assembly methods for the analysis of a GC-rich bacterial genome. Current Microbiology 70:338-344. DOI: 10.1007/s00284-014-0721-6
Improvements in technology and decreases in price have made de novo bacterial genomic sequencing a reality for many researchers, but it has created a need to evaluate the methods for generating a complete and accurate genome assembly. We sequenced the GC-rich Caulobacter henricii genome using the Illumina MiSeq, Roche 454, and Pacific Biosciences RS II sequencing systems. To generate a complete genome sequence, we performed assemblies using eight readily available programs and found that builds using the Illumina MiSeq and the Roche 454 data produced accurate yet numerous contigs. SPAdes performed the best followed by PANDAseq. In contrast, the Celera Assembler produced a single genomic contig using the Pacific Biosciences data after error correction with the Illumina MiSeq data. In addition, we duplicated this build using the Pacific Biosciences data with HGAP2.0. The accuracy of these builds was verified by Pulsed Field Gel Electrophoresis of genomic DNA cut with restriction enzymes.
Ely, B. and L. E. Scott. 2014. Correction of the Caulobacter crescentus NA1000 genome annotation. PLoS One 9(3): e91668. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0091668
Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. In addition, a comparison of the locations of peaks in third codon position GC content with the locations of protein-coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%.
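The third codon position GC signal mentioned in this abstract can be approximated with a short calculation. The sketch below is a simplified illustration and not the Artemis or MICheck implementation: for one window of sequence it reports the GC fraction at every third base in each of the three forward reading frames. In a GC-rich genome, the frame that actually encodes a protein would be expected to show the highest value. The function name, window handling, and forward-strand-only treatment are assumptions made for brevity.

```python
def gc3_by_frame(seq, start, window):
    """GC fraction at the third codon position for each of the three
    forward reading frames within seq[start:start + window]."""
    seq = seq.upper()
    fractions = []
    for frame in range(3):
        # Third positions of codons that begin at start + frame.
        thirds = seq[start + frame + 2 : start + window : 3]
        if thirds:
            fractions.append(sum(b in "GC" for b in thirds) / len(thirds))
        else:
            fractions.append(0.0)
    return fractions

# Toy sequence: ATG repeated, so only frame 0 has G at every third position.
toy_profile = gc3_by_frame("ATG" * 10, start=0, window=30)
```

Sliding the window along a genome and plotting the three values per position gives a profile similar in spirit to a GC frame plot.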
Friedman, R. and B. Ely. 2012. Codon usage methods for horizontal gene transfer detection generate an abundance of false positive and false negative results. Current Microbiology 65:639-642. doi:10.1007/s00284-012-0205-5
Bacteria acquire new DNA in a process known as horizontal gene transfer (HGT). To investigate the evolutionary impact of this transfer of DNA, various methods have been developed to detect past HGT events. For example, codon usage-based methods detect the presence of transferred genes by identifying atypical patterns of codon usage. However, some inherited genes exhibit atypical codon usage and some transferred genes have codon usage patterns similar to those of the inherited genes. In this study, we used a comparative phylogenetic approach with Methylobacterium and Caulobacter species to demonstrate that even well-designed codon usage methods fail to detect many HGT events and generate a high rate of false positives (60–75 %) and false negatives (23–61 %). Therefore, we recommend caution when employing codon usage methods to identify transferred genes and suggest that the rapidly increasing availability of bacterial genome sequences makes the phylogenetic approach the method of choice.
Guerrero-Ferreira, R. C., P. H. Viollier, B. Ely, J. S. Poindexter, M. Georgieva, G. J. Jensen, and E. R. Wright. 2011. A novel mechanism for bacteriophage adsorption to the motile bacterium Caulobacter crescentus. Proceedings of the National Academy of Sciences USA 108:9963-9968. doi:10.1073/pnas.1012388108.
2D and 3D cryo-electron microscopy, together with adsorption kinetics assays of ϕCb13 and ϕCbK phage-infected Caulobacter crescentus, provides insight into the mechanisms of infection. ϕCb13 and ϕCbK actively interact with the flagellum and subsequently attach to receptors on the cell pole. We present evidence that the first interaction of the phage with the bacterial flagellum takes place through a filament on the phage head. This contact with the flagellum facilitates concentration of phage particles around the receptor (i.e., the pilus portals) on the bacterial cell surface, thereby increasing the likelihood of infection. Phage head filaments have not been well characterized and their function is described here. Phage head filaments may systematically underlie the initial interactions of phages with their hosts in other systems and possibly represent a widespread mechanism of efficient phage propagation.
Lightfield, J., N. R. Fram, and B. Ely. 2011. Across bacterial phyla distantly-related genomes with similar genomic GC content have similar patterns of amino acid usage. PLoS ONE 6(3): e17677.
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both Gram negative and Gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.
Wilson, J. L., B. Ely and B. A. Jackson. 2010. Evaluating African-derived mtDNA haplotype diversity via independent sample collections. Canadian Journal of Forensic Sciences 43:65-74.
Since the sample sizes for forensic cases, as well as studies of African ethnic groups, are usually low, and independent samples are rare, it has been difficult to determine whether small samples contain an accurate representation of a sampled population. In this study, two independent samples of the Bamileke had similar values with regard to standard and molecular diversity indices and selective neutrality. The two Fulbe samples were also similar, but they differed with respect to selective neutrality values. In both ethnic groups, shared haplotypes were present at low frequencies. However, the Fulbe had fewer exclusive matches with outside ethnic groups compared to the Bamileke. Nevertheless, in both ethnic groups, within-group matches were more common than matches to any other Cameroonian ethnic group. Only a small percentage of the observed mtDNA haplotypes have the potential for being ethnic and/or region specific. To assess this potential, sample sizes will have to be orders of magnitude larger in order to observe significant numbers of those relatively rare haplotypes. However, as database size is increased, haplotype sharing will correspondingly increase; and many haplotypes that are common in a single ethnic group will also be found in multiple ethnic groups.
J. and B. Ely. 2010. Evolution of an
A nucleotide sequence analysis of a fragment of a Morone MHC class Ia gene detected high levels of polymorphism in striped bass Morone saxatilis, white perch Morone americana and yellow bass Morone mississippiensis. Extremely low levels of MHC diversity, however, were detected in white bass Morone chrysops, suggesting the possibility of a severe population bottleneck for this species.
Cunningham, J. E., A. J. Montero, E. Garrett-Mayer, H. J. Berkel, and B. Ely. 2010. Racial differences in the incidence of breast cancer subtypes defined by combined histologic grade and hormone receptor status. Cancer Causes and Control 21:399-409.
Breast cancer encompasses several distinct clinical entities of very different characteristics and behaviors, a fact which likely contributes to the higher breast cancer mortality in African-Americans (AA) despite the higher incidence in European-Americans (EA). We are interested in how incidence variability in cancer subtypes defined by combined estrogen receptor (ER) and grade contributes to racial mortality disparities. As an initial step, we compared age-specific and age-adjusted incidence rates for each ER/Grade subtype in South Carolina (SC—a southern state) with Ohio (a northern mid-western state), using state registry data for 1996–2004. Each ER/Grade subtype had a distinct incidence pattern and rate, with three striking racial/geographic differences. First, the racial incidence disparity in ER negative (ER−) cancers was mostly within the ER−/G3 subtype, of which AAs had ~65% higher incidence than did EAs; ER−/G2 was much less common, but of significantly higher incidence in AAs. Second, the racial disparity in ER positive (ER+) cancers was in the ER+/lower-grade cancers, with a marked EA excess in both states. Third, AA incidence of the ER+/lower-grade subtypes was ~26% higher in Ohio than in SC. The other subtypes (ER−/G1 and ER+/G3) varied minimally by race and state, and the latter showed a strong association with age. Age adjustment halved the racial difference in mean age at diagnosis to about 2 years younger in AAs, compared to 4 years younger in case comparisons. Use of age-adjusted and age-specific rates of breast cancer subtypes may improve understanding of racial incidence and mortality disparities over time and geography. This approach also may aid in estimating the race-specific incidence rates of triple-negative breast cancer.
Liu, J. and B. Ely. 2009. Sibship reconstruction demonstrates the extremely low effective population size of striped bass Morone saxatilis in the Santee-Cooper system, South Carolina, USA. Molecular Ecology 18:4112-4120.
For organisms with great fecundity and high mortality in early life stages, such as shellfish or fishes, the need to match reproductive activity with environmental conditions conducive to spawning, fertilization, larval development and recruitment may result in extreme variance in reproductive success among individuals. The main objective of this study was to investigate evidence of large variance in the reproductive success of the striped bass Morone saxatilis in the Santee–Cooper system, South Carolina, USA. Seven microsatellite loci were analysed in 603 recruits representing three yearly cohorts from 1992 to 1994, and a group analysis was performed to identify full-sib families. Large variance in reproductive success was detected, with a few large, full-sib families contributing disproportionately to each of the cohorts. The severity of sweepstakes reproductive success varied among cohorts depending on environmentally imposed mortality. Estimations of the effective number of breeders in these long-lived fish ranged from 24 in 1992 to 44 in 1994. Furthermore, the estimated genetic effective population size (Ne = 93) is approximately four orders of magnitude lower than estimates of adult census size (N = 362 000). In addition, the presence of large full-sib families indicates that striped bass engage in pair mating in the wild. Heterogeneity in genetic composition was also observed among cohorts, suggesting that genetically different adults contribute to different cohorts and that chance rather than fitness variation determines reproductive success.
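The gap between effective and census population size quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two figures given in the text (Ne = 93 and N = 362 000); the function name is an illustrative choice.

```python
import math

def orders_of_magnitude_below(ne, n):
    """How many powers of ten separate the census size from Ne."""
    return math.log10(n / ne)

NE = 93          # effective population size reported in the abstract
N = 362_000      # adult census size reported in the abstract

ratio = NE / N                              # Ne/N
orders = orders_of_magnitude_below(NE, N)   # log10(N/Ne)
```

With these inputs, Ne/N is about 0.00026 and the difference is about 3.6 orders of magnitude, which rounds to the "approximately four" stated in the abstract.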
Liu, J. and B. Ely. 2009. Complex evolution of a highly-conserved microsatellite locus in several fish species. Journal of Fish Biology 75:442-447.
The evolutionary dynamics of a highly conserved microsatellite locus (Dla 11) were studied in several fish species. The data indicated that multiple types of compound microsatellites arose through point mutations that were sometimes followed by expansion of the derived motif. Furthermore, extensive length variation was detected among species in the regions immediately flanking the repeat region.
Ely, B., J. L. Wilson, F. Jackson, and B. A. Jackson. 2006. African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups. BMC Biology 4:34. Published correction Can a database of mtDNA HvsI sequences properly assign Africans to their country of origin?
Alvarado Bremer, J. R., J. Mejuto, J. Gómez-Márquez, F. Boán, P. Carpintero, J. M. Rodríguez, J. Viñas, T. W. Greig, and B. Ely. 2005. Hierarchical analyses of genetic variation of samples from breeding and feeding grounds confirm the genetic partitioning of northwest Atlantic and
Ely, B., J. Viñas, J. R. Alvarado Bremer, D. Black, L. Lucas, K. Covello, A. Labrie, and E. Thelen. 2005. Consequences of the historical demography on the global population structure of two highly migratory cosmopolitan marine fishes: the yellowfin tuna (Thunnus albacares) and the skipjack tuna (Katsuwonus pelamis). BMC Evolutionary Biology 5:19.
Jackson, B. A., J. L. Wilson, S. Kirbah, S. S. Sidney, J. Rosenberger, L. Bassie, J. A. D. Alie, D. C. McLean, W. T. Garvey, and B. Ely. 2005. Genetic diversity among four ethnic groups in Sierra Leone. American Journal of Physical Anthropology 128:156-163.
Alvarado Bremer, J. R., J. Viñas, J. Mejuto, B. Ely, and C. Pla. 2005. Comparative phylogeography of Atlantic bluefin tuna and swordfish: The combined effects of vicariance, secondary contact, introgression, and population expansion on the regional phylogenies of two highly migratory pelagic fishes. Molecular Phylogenetics and Evolution 26:169-187.
Bulak, J. S., C. S. Thomason, K. Han, and B. Ely. 2004. Distinctiveness and management of striped bass populations in the coastal rivers of South Carolina. North American Journal of Fisheries Management 24:1322-1329.
Osborne, R. L., L. O. Taylor, K. Han, B. Ely, and J. H. Dawson. 2004. A. ornata dehaloperoxidase: enhanced activity for the catalytic active globin using MCPBA. Biochemical and Biophysical Research Communications, 324 (4): 1194-1198.
Ely, B., D. S. Stoner, J. R. Alvarado Bremer, J. M. Dean, P. Addis, A. Cau, E. J. Thelen, W. J. Jones, D. E. Black, L. Smith, K. Scott, I. Naseri and J. M. Quattro. 2002. Analyses of nuclear ldhA gene and mtDNA control region sequences of Atlantic northern bluefin tuna populations. Marine Biotechnology 4:583-588.
William C. Nierman and 36 others. 2001. Complete genome sequence of Caulobacter crescentus. Proceedings of the National Academy of Sciences 98:4136-4141.
K., S. A. Woodin, D. E. Lincoln, K. T. Fielman, and B. Ely. 2001. Amphitrite ornata, a marine worm, contains two dehaloperoxidase genes. Marine Biotechnology 3:287-292.
Han, K., L. Li, G. M. Leclerc, A. M. Hays, and B. Ely. 2000. Isolation and characterization of microsatellite loci for striped bass (Morone saxatilis). Molecular Biotechnology 2:405-408.
Diaz, M., D. Wethey, J. Bulak, and B. Ely. 2000. Effect of harvest and effective population size on genetic diversity in a striped bass population. Transactions of the American Fisheries Society 129:1250-1255.
Ely, B., T. W. Ely, W. B. Crymes, Jr., and S. A. Minnich. 2000. A family of six flagellin genes contributes to the Caulobacter crescentus flagellar filament. Journal of Bacteriology 182:5001-5004.