Background

In the post-genomic era, biologists have uncovered a striking level of conservation among genes from diverse organisms. This conservation goes beyond protein sequence similarity. In many cases, one or more aspects of the function of related genes are shared among distantly related species.

Through genetic study of common model organisms such as yeast, flies, fish, frogs, worms, and mice, we have learned a tremendous amount about the biological functions of many genes. This creates an opportunity to use information learned about a gene in one species to inform our understanding of a similar gene (“ortholog”) in another species, such as humans.

Citation

To cite Gene2Function in a paper, please cite:

Yanhui Hu, Aram Comjean, Stephanie E. Mohr, The FlyBase Consortium, Norbert Perrimon. Gene2Function: An Integrated Online Resource for Gene Function Discovery. G3: Genes, Genomes, Genetics, August 1, 2017, vol. 7, no. 8, 2855-2858.

PubMed abstract: 28663344

What is Gene2Function?

The primary goal of Gene2Function (G2F) is to facilitate the development of new hypotheses about the function of a given gene, based on what is known about the function of its orthologs in other species. This is relevant to both biological and biomedical research.

How does G2F achieve this goal?

G2F provides a quick view of:

  • which human genes have been associated with a disease (disease term search)

  • the predicted orthologs of a human or other gene in common model organisms (gene search)

  • which specific amino acid residues and functional domains are shared (alignments)

  • the extent to which each ortholog has been studied (publication and GO term counts)

  • whether the gene/protein interacts with other genes/proteins (interaction counts)

  • where to find out more (links to model organism database gene pages, links to publications, links to GO terms)


How can I get started?

G2F is designed to support a search with one gene or one disease term. For a gene search, choose the species corresponding to the gene you are entering, then enter the gene name, gene symbol, or other identifier for the gene itself.


What do the “gene” search results show?

A search with a gene term displays the search term you entered, predicted orthologs in other species, information regarding confidence in the ortholog relationship, and summary information. Many of the result displays link to additional information, either at G2F (e.g. protein alignments) or elsewhere on the web (e.g. MODs, NCBI PubMed).


What do the “disease” search results show?

A search with a disease term first retrieves a list of synonyms and similar disease names so that you can select the subset that best fits what you are looking for. It then displays a list of human genes that have been associated with the selected disease(s). From the human gene names, you can go to the gene-level ortholog report, as described for a gene search.


How can I mine similar data for many genes at one time?

G2F is currently designed to support single gene or disease term searches. For searches of multiple genes, including the ability to download results as a spreadsheet, we recommend a related tool previously developed by our group at Harvard Medical School, the DRSC Integrative Ortholog Prediction Tool (DIOPT). The latest versions of DIOPT include many of the same functions, including multiple sequence alignment and a view of summary information regarding gene function. For searches of multiple genes originating from a disease term, including the option to download results, we recommend DIOPT-Diseases and Traits (DIOPT-DIST).


What if I want to start from a human gene variant?

G2F provides a multiple-sequence protein alignment and ortholog function information that might help in the analysis of a gene variant, for example by helping you predict the function of the normal human gene and determine whether the variant affects a highly conserved residue. For those who want to start with a variant analysis that includes mining of human gene variant data from multiple public resources, we recommend the MARRVEL online tool, which was designed and built at Baylor College of Medicine, in collaboration with our group on ortholog mapping and gene function information.
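As an illustration of the conserved-residue check described above, here is a minimal Python sketch. It is not G2F or MARRVEL code. The function names, the toy alignment, and the simple "fraction of orthologs with the same residue" measure are all assumptions made for this example. It assumes you have copied the aligned protein sequences (with "-" for gaps) into a dictionary keyed by species.

```python
from __future__ import annotations

# Hypothetical toy alignment, invented for illustration only.
# Every sequence has the same length; "-" marks an alignment gap.
TOY_ALIGNMENT = {
    "human": "MK-TAYLA",
    "mouse": "MK-TAYLA",
    "fly":   "MRSTAYIA",
    "yeast": "M--SAYLG",
}


def alignment_column(aligned_ref: str, residue_number: int) -> int:
    """Map a 1-based residue number in the ungapped reference protein
    to a 0-based column index in the gapped alignment."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch != "-":
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds reference protein length")


def conservation_at(alignment: dict[str, str], ref: str,
                    residue_number: int) -> tuple[str, float]:
    """Return the reference residue and the fraction of the other
    aligned sequences that carry the same residue in that column."""
    col = alignment_column(alignment[ref], residue_number)
    ref_residue = alignment[ref][col]
    others = [seq[col] for name, seq in alignment.items() if name != ref]
    matches = sum(1 for residue in others if residue == ref_residue)
    return ref_residue, matches / len(others)


if __name__ == "__main__":
    # Check the human residues at positions 5 and 2 in the toy alignment.
    for position in (5, 2):
        residue, fraction = conservation_at(TOY_ALIGNMENT, "human", position)
        print(f"human residue {residue}{position}: "
              f"identical in {fraction:.0%} of aligned orthologs")
```

A variant at a position where most or all orthologs share the human residue may deserve closer attention than one at a poorly conserved position. Identity alone is a crude measure, since it ignores conservative substitutions and treats a gap as a mismatch.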


What are the sources of the information displayed at G2F?

The tables below summarize the sources of information that are used to provide summary information at G2F. We would like to acknowledge the NIH NIGMS-funded, curated model organism databases (MODs) including FlyBase, WormBase, ZFIN, SGD, PomBase, Xenbase, MGI, RGD and HGNC as well as InterMine.


Who are the people behind G2F?

G2F was developed by the DRSC/TRiP Functional Resources in the laboratory of Norbert Perrimon at Harvard Medical School in collaboration with the FlyBase Consortium. We also acknowledge helpful input from Hugo Bellen, Joseph Loscalzo, Richard Maas, and Calum MacRae.

Who should I contact for further help, with questions, to report a bug, etc.?

Please use our Bug Report Form


Table 1: Sources of the information displayed at G2F

| Information | Source(s) |
| --- | --- |
| Human disease-to-gene relationships | Online Mendelian Inheritance in Man (OMIM), https://www.omim.org; Genome Wide Association Studies (GWAS), https://www.ebi.ac.uk/gwas |
| Ortholog information (DIOPT score, best score, best score with the reverse search, confidence) | DRSC Integrative Ortholog Prediction Tool (DIOPT), http://www.flyrnai.org/diopt |
| Publications | NCBI gene2pubmed |
| Gene ontology (GO component, GO function, GO process) | NCBI gene2go |
| Interactions (protein interactions, genetic interactions) | BioGRID, https://thebiogrid.org |
| Data | see Table 2 below |
| Alignments | DIOPT |



Table 2: Sources for the “Data” column in a gene search results table

| PhenotypeData | ExpressionData |
| --- | --- |
| HumanMine, http://www.humanmine.org | HumanMine, http://www.humanmine.org |
| MouseMine, http://www.mousemine.org | MouseMine, http://www.mousemine.org |
| Rat data [not available] | Rat data [not available] |
| XenMine (GO terms), http://www.xenmine.org | XenMine, http://www.xenmine.org |
| ZebrafishMine, http://www.zebrafishmine.org | ZebrafishMine, http://www.zebrafishmine.org |
| FlyMine, http://www.flymine.org | DRSC Gene Expression Tool (DGET), http://www.flyrnai.org/tools/dget/web/ |
| WormBase, http://www.wormbase.org | WormBase, http://www.wormbase.org |
| Saccharomyces Genome Database, http://www.yeastgenome.org | Saccharomyces Genome Database, http://www.yeastgenome.org |
| [not available] | PomBase, http://www.pombase.org |


Versions

  • 1.2.2 - Feb 2019 - Updates/fixes to publication count to match new backend.
  • 1.2.1 - June 2017 - Fixed bug that could cause the number of diseases to be incorrectly reported in the result table.
  • 1.2 - Nov 2017 - Added GO/GO Slim heatmap, updated RNAi search to use exact match.
  • 1.1 - Oct 2017 - Added PI publication counts.
  • 1.0 - July 2017 - Bug fixes, Gene annotation overview section.
  • 0.9.5 - May 2017 - Added Max DIOPT Score column, better ambiguous search, and MARRVEL link.
  • 0.9.4 - April 2017 - Improved gene search.
  • 0.9.3 - April 2017 - Added download links.