In the post-genomic era, biologists have uncovered a striking level of conservation among genes from diverse organisms. This conservation goes beyond protein sequence similarity. In many cases, one or more aspects of the function of related genes are shared among distantly related species.

Through genetic study of common model organisms such as yeast, flies, fish, frogs, worms, and mice, we have learned a tremendous amount about the biological functions of many genes. This creates an opportunity to use information learned about a gene in one species to inform our understanding of a similar gene (“ortholog”) in another species, such as humans.

What is Gene2Function?

The primary goal of Gene2Function (G2F) is to facilitate the development of new hypotheses regarding the function of a given gene based on what is known about the function of orthologs of that gene in other species. This is relevant to both biological and biomedical research studies.

How does G2F achieve this goal?

G2F provides a quick view of

  • what human genes have been associated with a disease (disease term search)

  • what are the predicted orthologs of a human or other gene in common model organisms (gene search)

  • what specific amino acid residues and functional domains are shared (alignments)

  • to what extent has each ortholog been studied (publication and GO term counts)

  • does the gene/protein interact with other genes/proteins (interaction counts)

  • where to find out more (links to model organism database gene pages, links to publications, links to GO terms)

How can I get started?

G2F is designed to support a search with one gene or one disease term. For a gene search, select the species that the gene belongs to, then enter the gene name, gene symbol, or other identifier for the gene itself.

What do the “gene” search results show?

A search with a gene term displays the search term you entered, predicted orthologs in other species, information regarding confidence in the ortholog relationship, and summary information. Many of the result displays link to additional information, either at G2F (e.g. protein alignments) or elsewhere on the web (e.g. model organism databases (MODs), NCBI PubMed).

What do the “disease” search results show?

A search with a disease term will first retrieve a list of synonyms and similar disease names so you can select the subset that best fits what you are looking for, and will then display a list of human genes that have been associated with the disease(s). From the human gene names, you can go to the gene-level ortholog report, as described for a gene search.

How can I mine similar data for many genes at one time?

G2F is currently designed to support single gene or disease term searches. For searches of multiple genes, including the ability to download results as a spreadsheet, we recommend our previously developed, related tool, the DRSC Integrative Ortholog Prediction Tool (DIOPT) from our group at Harvard Medical School. The latest versions of DIOPT include many of the same functions, including multiple sequence alignment and a view of summary information regarding gene function. For searches of multiple genes originating from a disease term, including the option to download results, we recommend DIOPT-Diseases and Traits (DIOPT-DIST).

What if I want to start from a human gene variant?

G2F provides a multi-sequence protein alignment and ortholog function information that might help in the analysis of a gene variant, for example by helping you predict the function of the normal human gene and determine whether the variant affects a highly conserved residue. For those who want to start with a variant analysis that includes mining of human gene variant data from multiple public resources, we recommend the MARRVEL online tool, which was designed and built at Baylor College of Medicine, in collaboration with our group on ortholog mapping and gene function information.
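
To illustrate what checking whether a variant affects a conserved residue can look like, here is a minimal Python sketch. It is not G2F code. It assumes you have already obtained aligned ortholog protein sequences as equal-length strings with "-" marking gaps; the function names and the toy sequences are hypothetical and made up for illustration only.

```python
from collections import Counter


def alignment_column(aligned_seqs, ref_name, ref_position):
    """Map a 1-based residue position in the ungapped reference
    sequence to its 0-based column in the alignment."""
    seen = 0
    for col, char in enumerate(aligned_seqs[ref_name]):
        if char != "-":
            seen += 1
            if seen == ref_position:
                return col
    raise ValueError("position is beyond the end of the reference sequence")


def column_conservation(aligned_seqs, col):
    """Return the most common non-gap residue in a column and the
    fraction of all sequences that carry it."""
    residues = [seq[col] for seq in aligned_seqs.values()]
    counts = Counter(r for r in residues if r != "-")
    if not counts:
        return None, 0.0
    residue, n = counts.most_common(1)[0]
    return residue, n / len(residues)


# Toy, made-up aligned fragments (NOT real protein sequences).
toy_alignment = {
    "human": "MKT-AYLG",
    "mouse": "MKT-AYLG",
    "fly":   "MRTSAYIG",
    "yeast": "MKS-AYVG",
}

# Suppose a variant changes residue 5 of the human protein.
col = alignment_column(toy_alignment, "human", 5)
residue, fraction = column_conservation(toy_alignment, col)
print(f"Column {col}: most common residue {residue}, shared by {fraction:.0%}")
```

A residue shared by all or nearly all orthologs is a hint, not proof, that a change at that position matters; a real analysis would also weigh residue chemistry, domain context, and how reliable each ortholog prediction is.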

What are the sources of the information displayed at G2F?

The tables below summarize the sources of information that are used to provide summary information at G2F. We would like to acknowledge the NIH NIGMS-funded, curated model organism databases (MODs) including FlyBase, WormBase, ZFIN, SGD, PomBase, Xenbase, MGI, RGD and HGNC as well as InterMine.

Who are the people behind G2F?

G2F was developed by the DRSC/TRiP Functional Resources in the laboratory of Norbert Perrimon at Harvard Medical School in collaboration with the FlyBase Consortium. We also acknowledge helpful input from Hugo Bellen, Joseph Loscalzo, Richard Maas, and Calum MacRae.

Who should I contact for further help, with questions, to report a bug, etc.?

Please use our Bug Report Form.

Table 1: Sources of the information displayed at G2F

Human disease-to-gene relationships: Online Mendelian Inheritance in Man (OMIM); Genome Wide Association Studies (GWAS)

Ortholog information: DRSC Integrative Ortholog Prediction Tool (DIOPT)

  • DIOPT score

  • Best score

  • Best score with the reverse search

  • Confidence

Publications: NCBI gene2pubmed

Gene ontology: NCBI gene2go

  • GO component

  • GO function

  • GO process

  • Protein interactions

  • Genetic interactions

“Data” column: see Table 2 below

Table 2: Sources for the “Data” column in a gene search results table

Rat Data [not available]

Rat Data [not available]

XenMine (GO terms)

ZebrafishMine

DRSC Gene Expression Tool (DGET)

Saccharomyces Genome Database

Saccharomyces Genome Database

[not available]