Banting & Best
Molecular Genetics
Donnelly CCBR
U of Toronto




We are currently pursuing a variety of bioinformatics and genomics projects. Our own research interests fall into the following areas: comparative genomics, genome evolution, and microRNAs. We are also collaborating with colleagues on large-scale yeast genomics projects, cancer profiling projects, and sequencing projects.

Comparative genomics and genetic variation
We are interested in using computational and experimental approaches to identify and characterize functional sequences and motifs in the intergenic regions of the genome. We are particularly interested in primate-specific noncoding RNA transcripts, i.e. those that arose only after the human-mouse split. Recent deep sequencing experiments have revealed many such candidate transcript regions in the human genome. Using these datasets as a starting point, we applied a number of filters, such as the probability of forming a stable secondary structure, to derive a set of candidate regions in the human genome. In particular, we are interested in RNAs involved in chromatin remodeling, aging, and male reproduction.

Genome evolution
We are interested in studying the evolutionary history of genes, genomes, transcriptomes, microRNAs, and biological networks. Because we work closely with colleagues who are producing large amounts of genomics and proteomics data, we hope to address some of the fundamental questions in evolutionary biology and genomics. Specifically, we are interested in the following questions:

  • the evolutionary trajectory of paralogous genes in yeast as a result of the whole-genome duplication (WGD) event (sub-functionalization vs. neo-functionalization) and its impact on yeast cell networks;
  • the implications of horizontal gene transfer (HGT) for bacterial operons (transcriptional units) and for protein interaction networks;
  • the expansion of gene families in vertebrate genomes, and the evolution of gene sequence and expression profiles in the context of broader evolutionary theory;
  • repetitive elements in vertebrate genomes.

Analysis of TF binding sites
We are developing new algorithms to improve existing transcription factor (TF) binding site prediction methods. Specifically, we are trying to characterize the structural information and positional co-variations that are hidden in traditional Position-Specific Weight Matrix (PSWM) approaches. We demonstrated that the structural properties of the double helix, as shown below, are important for DNA-protein recognition and are likely the “hidden code” behind the degenerate weight matrices.
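
To make concrete what a traditional PSWM can and cannot capture, the sketch below builds a log-odds matrix from a handful of aligned sites, scores a sequence by summing per-position terms, and measures co-variation between two columns with mutual information. It is only an illustration under stated assumptions, not the algorithms described above: the four sites are made-up toy data, the uniform background and pseudocount are arbitrary choices, and the function names (build_pswm, score, mutual_information) are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pswm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per column) from aligned sites.

    Assumes a uniform background frequency and a flat pseudocount;
    both are illustrative choices, not values from the text.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pswm = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        pswm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pswm

def score(pswm, seq):
    """Sum per-position log-odds; every position contributes independently."""
    return sum(col[b] for col, b in zip(pswm, seq))

def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of the sites."""
    n = len(sites)
    pi = Counter(s[i] for s in sites)
    pj = Counter(s[j] for s in sites)
    pij = Counter((s[i], s[j]) for s in sites)
    return sum(
        (c / n) * math.log2((c / n) / ((pi[a] / n) * (pj[b] / n)))
        for (a, b), c in pij.items()
    )

# Toy, made-up sites: the first and last columns co-vary (A..T or T..A).
sites = ["AGCT", "AGCT", "TGCA", "TGCA"]
pswm = build_pswm(sites)

observed_pair = score(pswm, "AGCT")   # pairing seen in the toy sites
unobserved_pair = score(pswm, "AGCA") # pairing never seen in the toy sites
covariation = mutual_information(sites, 0, 3)
```

In this toy set the two sequences receive the same PSWM score even though only one of them respects the pairing between the first and last positions, while the mutual information between those columns is high. That dependence between positions is exactly the kind of information a sum of independent column scores cannot represent.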