TB Genome Annotation Portal

Mycobacterium tuberculosis is one of the most successful pathogens in the world, still responsible for millions of deaths each year. Nearly half of its protein-coding genes have functions that are unknown. This website aggregates information and data from our group and others on the functions and updated annotations of genes in the M. tuberculosis H37Rv genome.

Functional classes of Mtb genes

Latest News

  • 6-16-2024: Added annotations from TBCAP and links to BioCyc on each gene page
  • 1-9-2024: Posted RNAseq and TnSeq datasets on Datasets.html page
  • 2-24-2023: Added gene modules based on analysis of transcriptomic datasets
  • 8-25-2022: Added links to genes with correlated TnSeq profiles (TnSeqCorr)
  • 7-26-2022: Added updated annotations based on recent publications
  • 10-13-2021: Updated top 10 homologs in PDB for each gene; added links to PATRIC
  • 10-01-2021: Added links to Vulnerability Index and AlphaFold model for each gene;
    Improved Search (so keywords work, not just ORF ids and gene names)
  • 04-28-2021: Added Gene Expression data on transcriptional responses to drug treatments (Boshoff et al, 2004)
  • 03-14-2018: Updated GO terms from Uniprot
  • 03-14-2018: Added enzyme reactions from iSM810 metabolic model
  • 09-19-2017: New TnSeq data added for PPE68, EccD1, PE35 and EspI

Project Goals

Vast quantities of new genome sequence are added to public databases on a daily basis. But while we are constantly deluged with new gene sequences, we still have a limited ability to define their functions. Almost all functions are defined by comparison with genes from other strains in which experimental data are available. But these experimental data have not kept pace with the availability of new sequence. In fact, for most bacterial species, many genes have no known function and even those that are annotated have only limited information. This is particularly striking for Mycobacterium tuberculosis (Mtb), where only 52% of protein-coding genes have a putative function.

We plan to discover the roles of genes with previously unknown functions in Mtb, an important and widespread human pathogen. Critically, we will target genes that fulfill vital roles in bacterial growth and survival, genes we have previously identified in genome-wide screens. This offers three major advantages. First, we will concentrate on only the most important unannotated genes. Second, phenotypes allow us to use the power of synthetic lethality to identify interactions. And third, phenotypes provide us with a context in which we can understand the outcome of biochemical and genetic assays. In addition to defining the roles for Mtb genes, we aim to establish an efficient pathway for identifying gene function that can serve as a paradigm for other bacterial species. To accomplish this, we will undertake an ambitious program to construct large numbers of Mtb mutants. This will be possible because we will take advantage of substantial mycobacterial genetic expertise among the participants. Moreover, we will use a number of analytic modalities brought in through a set of highly interconnected projects and cores.


This project was originally funded by the NIH as part of the NIAID Functional Genomics Program, under grant U19 AI107774 (2013-2018). The NIAID Functional Genomics Program for understanding the functions of uncharacterized genes in infectious disease pathogens will generate experimental data to determine the biochemical function(s) of hypothetical genes, unknown open reading frames, and noncoding RNAs. The program will apply state-of-the-art technologies to determine the biochemical and physiological roles of these gene components. Obtaining a more comprehensive understanding of uncharacterized genes in infectious disease pathogens will lead to improved genomic annotation and allow for the development of potential new targets for medical diagnostics, therapeutics and vaccines. The program will distribute data, software, and reagents generated from the research projects to the broader scientific community.

The successor of the original project has been funded by NIH as P01 AI143575 (Pathway Analysis in Tuberculosis; program director: S. Ehrt; 2020-2025).

This site is maintained by Tom Ioerger at Texas A&M.
Email questions to: ioerger@cs.tamu.edu