In the post-genomic era, biologists have uncovered a striking level of conservation among genes from diverse organisms. This conservation goes beyond protein sequence similarity. In many cases, one or more aspects of the function of related genes are shared in distantly related species.
Through genetic study of common model organisms such as yeast, flies, fish, frogs, worms, and mice, we have learned a tremendous amount about the biological functions of many genes. This creates an opportunity to use information learned about a gene in one species to inform our understanding of a similar gene (“ortholog”) in another species, such as humans.
The primary goal of G2F is to facilitate the development of new hypotheses regarding the function of a given gene based on what is known about the function of orthologs of that gene in other species. Notably, this is relevant to both biological and biomedical research studies.
G2F provides a quick view of:
which human genes have been associated with a disease (disease term search)
the predicted orthologs of a human or other gene in common model organisms (gene search)
the specific amino acid residues and functional domains that are shared (alignments)
the extent to which each ortholog has been studied (publication and GO term counts)
whether the gene/protein interacts with other genes/proteins (interaction counts)
where to find out more (links to model organism database gene pages, publications, and GO terms)
G2F is designed to support a search with one gene or one disease term. For a gene search, select the species that corresponds to the gene you are entering, then enter the gene name, gene symbol, or other identifier for the gene itself.
A search with a gene term displays the search term you entered, predicted orthologs in other species, information regarding confidence in the ortholog relationship, and summary information. Many of the result displays link to additional information either at G2F (e.g. protein alignments) or elsewhere on the web (e.g. model organism databases (MODs), NCBI PubMed).
A search with a disease term first retrieves a list of synonyms and similar disease names so you can select the subset that best fits what you are looking for, then displays a list of human genes that have been associated with the disease(s). From the human gene names, you can go to the gene-level ortholog report, as described for a gene search.
G2F is currently designed to support single gene or single disease term searches. For searches of multiple genes, including the ability to download results as a spreadsheet, we recommend our previously developed, related tool, the DRSC Integrative Ortholog Prediction Tool (DIOPT) from our group at Harvard Medical School. The latest versions of DIOPT include many of the same functions, including multiple sequence alignment and a view of summary information regarding gene function. For searches of multiple genes originating from a disease term, including the option to download results, we recommend DIOPT-Diseases and Traits (DIOPT-DIST).
G2F provides a multi-sequence protein alignment and ortholog function information that might help in the analysis of a gene variant, for example by helping you predict the function of the normal human gene and determine whether the variant affects a highly conserved residue. For those interested in starting with a variant analysis that includes mining of human gene variant data from multiple public resources, we recommend the MARRVEL online tool, which was designed and built at Baylor College of Medicine, with collaboration from our group on ortholog mapping and gene function information.
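As a rough illustration of the conserved-residue check mentioned above, the following minimal sketch takes a set of already-aligned protein sequences and reports what fraction of the orthologs share the human residue at a given position. It is not how G2F or DIOPT computes anything; the function name `residue_conservation`, the species keys, and the toy sequences are hypothetical, and gaps are assumed to be written as `-`.

```python
from typing import Dict


def residue_conservation(alignment: Dict[str, str], reference: str, position: int) -> float:
    """Fraction of non-reference sequences that match the reference residue.

    `position` is 1-based and counted along the ungapped reference sequence.
    """
    ref_seq = alignment[reference]

    # Translate the ungapped reference position into an alignment column.
    seen = 0
    column = None
    for i, aa in enumerate(ref_seq):
        if aa != "-":
            seen += 1
            if seen == position:
                column = i
                break
    if column is None:
        raise ValueError("position is outside the reference sequence")

    ref_residue = ref_seq[column]
    others = [seq for name, seq in alignment.items() if name != reference]
    if not others:
        raise ValueError("alignment contains no ortholog sequences")

    # A gap in an ortholog never equals a residue, so it counts as a mismatch.
    matches = sum(1 for seq in others if seq[column] == ref_residue)
    return matches / len(others)


# Hypothetical toy alignment (invented sequences, all the same aligned length).
toy_alignment = {
    "human": "MK-LVRA",
    "mouse": "MK-LVRA",
    "fly":   "MKQLIRA",
    "yeast": "MR-LVKA",
}

if __name__ == "__main__":
    for pos in range(1, 7):
        print(pos, round(residue_conservation(toy_alignment, "human", pos), 2))
```

A value near 1.0 suggests the residue is shared across the aligned orthologs, whereas a low value suggests the position varies between species. Real analyses would typically also weigh similar amino acids and evolutionary distance, which this sketch ignores.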
The tables below summarize the sources of information that are used to provide summary information at G2F. We would like to acknowledge the NIH NIGMS-funded, curated model organism databases (MODs) including FlyBase, WormBase, ZFIN, SGD, PomBase, Xenbase, MGI, RGD and HGNC as well as InterMine.
G2F was developed by the DRSC/TRiP Functional Resources in the laboratory of Norbert Perrimon at Harvard Medical School in collaboration with the FlyBase Consortium. We also acknowledge helpful input from Hugo Bellen, Joseph Loscalzo, Richard Maas, and Calum MacRae.
To report a bug, please use our Bug Report Form.
Table 1: Sources of the information displayed at G2F
Information | Source(s) |
---|---|
Human disease-to-gene relationships | Online Mendelian Inheritance in Man (OMIM); Genome Wide Association Studies (GWAS) |
Ortholog information (DIOPT score, best score, best score with the reverse search, confidence) | DRSC Integrative Ortholog Prediction Tool (DIOPT) |
Publications | NCBI gene2pubmed |
Gene ontology (GO component, GO function, GO process) | NCBI gene2go |
Interactions (protein interactions, genetic interactions) | BioGrid |
Data | see additional table below |
Alignments | DIOPT |
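Table 1 lists NCBI gene2pubmed as the source of publication information. That file is tab-delimited, with the columns tax_id, GeneID, and PubMed_ID. As a minimal sketch of how a per-gene publication count could be derived from data in that layout (an assumption for illustration, not G2F's actual pipeline), the snippet below counts distinct PubMed IDs per gene. The function name `count_publications` and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def count_publications(lines: Iterable[str]) -> Dict[str, int]:
    """Count distinct PubMed IDs per GeneID from gene2pubmed-style lines."""
    pubs: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the header, which starts with '#'.
        if not line or line.startswith("#"):
            continue
        _tax_id, gene_id, pubmed_id = line.split("\t")[:3]
        pubs[gene_id].add(pubmed_id)
    return {gene: len(ids) for gene, ids in pubs.items()}


# Invented sample rows in the gene2pubmed column layout (not real records).
sample = [
    "#tax_id\tGeneID\tPubMed_ID",
    "9606\t1001\t111",
    "9606\t1001\t222",
    "9606\t1001\t222",  # duplicate row, counted once
    "7227\t2002\t333",
]

counts = count_publications(sample)

if __name__ == "__main__":
    print(counts)
```

Using a set per gene keeps duplicate gene-publication pairs from inflating the count. With the real file, the same function could be passed an open file handle instead of a list.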
Table 2: Sources for the “Data” column in a gene search results table
Phenotype Data | Expression Data |
---|---|
HumanMine | HumanMine |
MouseMine | MouseMine |
Rat Data [not available] | Rat Data [not available] |
XenMine (GO terms) | XenMine |
ZebrafishMine | ZebrafishMine |
FlyMine | DRSC Gene Expression Tool (DGET) |
WormBase | WormBase |
Saccharomyces Genome Database | Saccharomyces Genome Database |
[not available] | PomBase |