Welcome to the Ioerger Bioinformatics Lab.

We are part of the Department of Computer Science at Texas A&M University.

The Ioerger Bioinformatics Lab does interdisciplinary research that spans computer science and biology. We apply statistical algorithms and machine learning to biological data (genomics, transcriptomics, etc.) to study antibiotic resistance and to contribute to drug discovery for tuberculosis and other infectious diseases. Our projects are often highly collaborative and involve working with a wide range of researchers in the life sciences. Our work ranges from genomics (e.g. whole-genome sequencing, RNA-seq, TnSeq), to phylogenetics, to structural biology (analysis of protein structures, protein-ligand interactions, docking).

Computationally, many of our projects require developing and implementing novel statistical algorithms for analyzing unique datasets from newly developed experimental technologies and data types. These algorithms aim to quantify the significance of inferences while dealing appropriately with the uncertainty inherent in the data. We also focus on methods to identify interactions in the data.

The biological applications of our research are primarily focused on tuberculosis (TB), caused by the bacterial pathogen Mycobacterium tuberculosis. TB infects many people around the world, and outbreaks of multi-drug-resistant TB have been increasing at an alarming rate. The TB research community is collectively engaged in trying to understand basic pathways for survival, adaptations to stress, and host-pathogen interactions (e.g. within macrophages). However, only about half of the ~4000 genes in the Mycobacterium tuberculosis (Mtb) genome are annotated, and some of these are just generic annotations based on homology (e.g. 'oxidoreductase'). This lack of knowledge about basic functions for so many genes in the Mtb genome hampers drug discovery efforts, because we don't know enough about drug targets and the conditions under which they are vulnerable. Our lab has been employing bioinformatics methods to try to better annotate the genome, understand biological pathways, interpret drug resistance mutations in isogenic mutants (to identify new potential drug targets), and analyze patterns of resistance mutations in clinical cohorts (to try to understand how resistance to existing drugs arises and spreads in a natural population). These methods can be applied to other infectious pathogens as well, such as methicillin-resistant Staphylococcus aureus, and other clinically important mycobacteria like M. avium and M. abscessus.

A major focus of our lab is determining which genes are essential under what conditions using TnSeq (sequencing of transposon-insertion mutant libraries). TnSeq yields information on conditional essentiality of genes and genetic interactions that is useful for elucidating the functions of genes and identifying good drug targets. TnSeq data is intrinsically noisy, and we have been developing tools for rigorous assessment of the statistical significance of essentiality predictions derived from this data. We distribute a Python-based software package for TnSeq analysis called TRANSIT, which includes implementations of many of the statistical methods we have developed. This work has contributed many insights into aspects of mycobacterial biology, and is ultimately oriented toward facilitating the discovery of new drug candidates for TB.

Current Lab Members

Past Lab Members

Projects in the Ioerger Bioinformatics Lab

TnSeq - Sequencing of Transposon Mutant Libraries

TnSeq is a genome-wide screen that determines which genes are essential for survival under different conditions. It has wide-ranging uses, from understanding pathways to virulence to host-pathogen interactions. A major focus in the lab is the development of statistical methods for analysis of TnSeq data: converting large files of raw sequencing reads into lists of essential (or conditionally essential) genes and quantifying their statistical significance. The challenge is that TnSeq data (insertion counts in the genome) is intrinsically noisy, and we attempt to draw rigorous inferences while avoiding false positives, using a combination of frequentist and Bayesian methods. We have combined our algorithms in a software package we distribute called TRANSIT, which is used by labs around the world for TnSeq. TRANSIT encodes best practices for processing TnSeq data, quality assessment, normalization, and analysis, enabling users to draw statistically rigorous inferences from their data. We have used TnSeq to study metabolism (e.g. growth on various carbon sources), antibiotic stress, cell wall synthesis, and variability of essentiality among clinical isolates. We also use TnSeq in knock-out strains to try to infer gene functions through genetic interactions, and thus to annotate genes in the H37Rv genome. This work is done with multiple collaborators, including Chris Sassetti at UMass Medical School.
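
As a rough illustration of the kind of frequentist analysis described above, the sketch below runs a generic permutation test on insertion counts for one gene in two conditions. It is not taken from TRANSIT; the function name, parameters, and example counts are hypothetical, and it assumes the counts have already been normalized.

```python
import random


def permutation_test(counts_a, counts_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean insertion counts.

    counts_a, counts_b: insertion counts at the sites of one gene in two
    conditions (hypothetical inputs, assumed already normalized).
    Returns an empirical p-value with a +1 correction so it is never zero.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = abs(mean(counts_a) - mean(counts_b))
    pooled = list(counts_a) + list(counts_b)
    n_a = len(counts_a)

    extreme = 0
    for _ in range(n_perm):
        # Shuffle the condition labels by shuffling the pooled counts.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up counts: many insertions in condition A, almost none in B.
    condition_a = [120, 95, 130, 110, 88, 140, 105, 99, 125, 117]
    condition_b = [3, 0, 5, 1, 0, 2, 4, 0, 1, 2]
    print(permutation_test(condition_a, condition_b, n_perm=2000))
```

In a genome-wide analysis, a test like this would be run once per gene, so the resulting p-values would also need a multiple-testing correction.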

Drug Target Identification


We participate in the Tuberculosis Drug Accelerator (TBDA), which is a consortium of academic and pharmaceutical labs funded by the Bill & Melinda Gates Foundation. Drug discovery is a complex endeavor with many stages, and the Gates Foundation assembled a team of academic labs and pharmaceutical partners, with experts at each stage (from high-throughput screening of compound libraries, to medicinal chemistry, to mechanism and structure determination, to pharmacokinetics in animal models). Our contribution to this pipeline is whole-genome sequencing of isogenic resistant mutants to identify the targets and mechanisms of novel inhibitors. This requires identifying and interpreting resistance mutations (SNPs, indels, duplications, transposon hopping...), assessed against the backdrop of what is currently known about mycobacterial growth (including essentiality), metabolic pathways, regulation, and stress response. Our work on this project has led to downstream development of lead compounds targeting a variety of enzymes like FadD32 and Pks13 (mycolic acid synthesis), GlcB (malate synthase, glyoxylate shunt), biotin protein ligase, and PptT (CoA biosynthesis). These projects also involve some cheminformatics, docking, modeling of protein-ligand interactions, and SAR. This is joint work with Jim Sacchettini in the TAMU Dept. of Biochemistry & Biophysics.

Chemical Genomics

An important step in drug discovery is identifying the protein targets of inhibitors (such as compounds from high-throughput screens). A novel methodology being developed for this is to construct a library (pool) of knock-down mutants (e.g. where intracellular levels of proteins like DNA gyrase can be artificially reduced), and to profile their behavior (e.g. growth impairment) in the presence of inhibitors using next-generation sequencing. We are developing statistical models to quantify the relative depletion of mutants in the library, and machine learning methods to detect patterns that will enable us to infer which protein or biological process is the target of a given compound. This work is a collaboration with Dirk Schnappinger at Weill Cornell Medical College in NY, and is funded by the Bill & Melinda Gates Foundation.
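
To make "relative depletion" concrete, here is a minimal sketch that computes a log2 fold change per mutant between a treated and an untreated pool. It is a simplified stand-in for the statistical models mentioned above, not the lab's method; the function name, the pseudocount, and the example read counts are all assumptions for illustration.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Log2 fold change of each mutant's relative abundance (treated vs control).

    treated, control: dicts mapping a mutant name to its read count
    (hypothetical inputs). Counts are scaled by the total reads of each
    sample, and a pseudocount avoids taking the logarithm of zero.
    Negative values mean the mutant is depleted under treatment.
    """
    total_treated = sum(treated.values())
    total_control = sum(control.values())
    if total_treated <= 0 or total_control <= 0:
        raise ValueError("each sample needs a positive total read count")

    changes = {}
    for mutant, control_count in control.items():
        treated_freq = (treated.get(mutant, 0) + pseudocount) / total_treated
        control_freq = (control_count + pseudocount) / total_control
        changes[mutant] = math.log2(treated_freq / control_freq)
    return changes


if __name__ == "__main__":
    # Made-up read counts for three knock-down mutants.
    control = {"gyrA_kd": 1000, "rpoB_kd": 1000, "inhA_kd": 1000}
    treated = {"gyrA_kd": 100, "rpoB_kd": 1400, "inhA_kd": 1500}
    for mutant, lfc in sorted(log2_fold_changes(treated, control).items()):
        print(f"{mutant}\t{lfc:+.2f}")
```

A mutant whose knocked-down gene is the target of the compound would be expected to show a strongly negative value compared with the rest of the library.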

Evolution of Drug Resistance

What we find in the lab about how a drug works does not necessarily tell us how the bacteria are going to respond clinically (in terms of frequency and mechanisms of resistance). In order to better understand how resistance to existing antibacterial drugs arises, we are sequencing the genomes of large collections of clinical isolates, doing phylogenetic analysis, and determining resistance mutation profiles for existing drugs (isoniazid, rifampicin, pyrazinamide) and novel drugs (bedaquiline, pretomanid, linezolid). We are actively involved in sequencing clinical isolates from disease outbreaks around the world, and performing statistical analyses of associations of polymorphisms with drug resistance (GWAS). We are also examining the acquisition of resistance mutations in animal models of disease. We are interested in understanding the effect of fitness costs, compensatory mutations (epistasis), lineage-specific effects, novel mechanisms (efflux, detoxification, metabolic shift, etc.), high- vs low-level resistance (e.g. stepping stone mutations), etc. Mtb generally evolves clonally (no recombination) and does not have plasmids (which often facilitate exchange of drug resistance genes in other pathogens like S. aureus). Yet, we have observed that drug resistance often arises easily and independently in different strains of Mtb. We are interested in studying how acquisition of resistance is affected by other drugs in combination therapies, roles of latency and transmission, and interactions with diet, patient compliance, and co-morbidities (HIV, diabetes, etc.).
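
The simplest form of the association analysis mentioned above is a test on a 2x2 table of isolates (mutation present or absent, resistant or susceptible). The sketch below implements a two-sided Fisher's exact test from scratch as an illustration only; the function name and example counts are hypothetical. Because Mtb evolves clonally, a naive test like this can be confounded by shared ancestry, so a real analysis would also need to account for the phylogeny.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 contingency table.

    a: resistant isolates with the mutation
    b: resistant isolates without the mutation
    c: susceptible isolates with the mutation
    d: susceptible isolates without the mutation
    Returns the sum of the probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x resistant isolates carrying the mutation.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


if __name__ == "__main__":
    # Made-up cohort: mutation seen in 18 of 20 resistant isolates
    # and in 2 of 30 susceptible isolates.
    print(fisher_exact_two_sided(18, 2, 2, 28))
```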

Other Mycobacteria


The lab is expanding its research and methods from M. tuberculosis to other clinically important mycobacteria, such as M. avium and M. abscessus. We have been doing essentiality studies using TnSeq in these organisms, as well as sequencing of clinical isolates to understand drug resistance (and hope to be doing drug screening soon). These mycobacteria are more genetically diverse than M. tuberculosis. The projects employ comparative genomics to try to understand how the biology, pathways, and virulence of these organisms are similar to (or different from) those of M. tuberculosis. Many of the first- and second-line antitubercular drugs are not effective against these bacteria, so drug discovery is just as urgently needed. This work is being done in collaboration with various colleagues, including Eric Rubin (Harvard School of Public Health) and Thomas Dick (Rutgers).

Design of Peptidomimetic Inhibitors of Protein-Protein Interfaces


Many important biological and disease processes involve interactions between proteins. Peptidomimetics are synthetic small molecules that mimic peptides of 3-5 amino acids and can potentially bind in protein-protein (P-P) interfaces and disrupt interactions. Previously, we designed a search algorithm (called EKO) for finding clusters of amino acids in interfaces that match a target geometric configuration. It can be used to assist in the design of peptidomimetics with favorable properties and synthetic routes. This work also involves protein structure modeling and computational evaluation of conformational and interaction energies. EKO is currently being applied to several protein targets involved in cancer. This is a collaboration with Kevin Burgess, TAMU Dept. of Chemistry, and is funded in part by a grant from the Cancer Prevention and Research Institute of Texas (CPRIT).

Resource

We are located in the Interdisciplinary Life Sciences Building on the main campus (College Station) of Texas A&M University.

If you are interested in joining the lab, please send me an email at ioerger@cs.tamu.edu. Any level welcome.