GOLEM Instructions

About GOLEM

Starting the Program

Searching

Find Enriched Nodes

Browsing the Ontology

Viewing Annotated Genes

FAQ


About GOLEM

Hello and welcome to GOLEM! GOLEM (Gene Ontology Local Exploration Map) is a visualization and analysis tool for focused exploration of the gene ontology graph. GOLEM allows the user to dynamically expand and focus the local graph structure of the gene ontology hierarchy in the neighborhood of any chosen term. It also supports rapid analysis of an input list of genes to find enriched gene ontology terms. The GOLEM application permits the user either to utilize local gene ontology and annotations files in the absence of an Internet connection, or to access the most recent ontology and annotation information from the gene ontology web page. GOLEM supports global and organism-specific searches by gene ontology term name, gene ontology id and gene name. Read the instructions below for help using GOLEM.

 

 


 

Launching GOLEM:

Choose an ontology file                      

-(Application version only) When GOLEM first loads it will prompt the user to choose to either load a local ontology file, or to download the latest version of the "gene_ontology.obo" file from the Gene Ontology Consortium webpage.  Loading a local file is typically faster as no download wait is involved, but to always stay up to date you have the option to download the latest copy.  (The applet version of GOLEM uses a cached copy of the ontology and cannot offer the ability to use your own copy due to the security restrictions of Java.)

Initial GOLEM window                     

-Once GOLEM has loaded the gene ontology, the main window is displayed.  From this window you can navigate the current view of the gene ontology, search for specific terms or genes, load a species annotation file, find GO terms enriched for a list of genes, and display these instructions.

Loading an annotations file                

-Clicking on the "Choose Organism" button in the upper right displays a dialog that allows you to load a species annotation file.  The listed choices automatically download the latest version of an annotation file from the Gene Ontology Consortium webpage for yeast, fly, human, mouse, or worm.  The "Show all nodes" option reverts GOLEM to a state where no annotation file is loaded.  (Application version only:  If you wish to use your own set of annotations, you may click on the "Browse" button to choose a local annotations file.  This file should be in the format specified by the GO Consortium.)


Searching:

Searching by GO term name               

-To search for a gene ontology node by name, first select "Search by GOTerm name" from the dropdown menu. Then, type the name of the node in the text field in the toolbar, and press enter. For example: typing "acetate biosynthesis" will display the acetate biosynthesis node, its immediate children, and all possible paths back to the root node. As a search aid, a popup window will display all nodes whose names begin with the typed letters.
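
The prefix-matching search aid described above can be pictured with a small sketch. This is not GOLEM's code: the class name TermNameIndex and its methods are hypothetical, and the sketch assumes case-insensitive matching, which these instructions do not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Hypothetical illustration only; not taken from the GOLEM source. */
final class TermNameIndex {
    // Sorted set of lower-cased term names, so names sharing a prefix are adjacent.
    private final TreeSet<String> names = new TreeSet<>();

    void add(String termName) {
        names.add(termName.toLowerCase(Locale.ROOT));
    }

    /** Returns every stored name that begins with the typed text (assumed case-insensitive). */
    List<String> suggestions(String typed) {
        String prefix = typed.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        // tailSet starts at the first name >= prefix; stop at the first name without the prefix.
        for (String name : names.tailSet(prefix, true)) {
            if (!name.startsWith(prefix)) {
                break;
            }
            matches.add(name);
        }
        return matches;
    }
}
```

Keeping the names sorted means each keystroke only scans the block of names that share the typed prefix, rather than the whole ontology.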

Searching by GO term id

-To search for a gene ontology node using its identification number, first select "Search by GO Term id" from the dropdown menu at the top of the screen. Then, type the gene ontology identification number in the text field in the toolbar, and press enter. For example, entering "GO:0000001" will display the mitochondrion inheritance node, its immediate children, and all possible paths back to the root node.

Searching by Gene Name

-To search for all nodes annotated to a particular gene in a specific ontology, select GeneFinder(Process), GeneFinder(Component), or GeneFinder(Function) from the dropdown menu. (These tools search the biological process, cellular component, and molecular function ontologies, respectively.) Then, type the name of the gene of interest in the text field, and press enter.

NOTE: An annotations file must be selected to use this tool.  Gene names must match the DB_Object_Symbol or DB_Object_Synonyms field of the annotations file.  For yeast these are the common name and the YORF (such as PUP1 or YOR157C); check the GO Consortium guidelines for other organisms.

 

 


 

Find Enriched Nodes

-Once an annotations file is loaded, a "Find Enriched Nodes" option is available from the dropdown menu. Selecting this option opens a dialog window. Choose a multiple hypothesis testing correction method and enter the desired cutoff p-value or false discovery rate. Then, enter a tab, space, comma, or newline-delimited list of genes of interest, and press the "Find Enriched Nodes" button. GOLEM will return a table showing enriched nodes, the query genes annotated to these nodes, and the p-values of the nodes.
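
As an illustration of the accepted input format, the following sketch splits a pasted gene list on tabs, spaces, commas, and newlines. The class GeneListParser is hypothetical and not part of GOLEM; dropping repeated gene names is an assumption made here, not documented GOLEM behavior.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration only; not taken from the GOLEM source. */
final class GeneListParser {
    /**
     * Splits on runs of whitespace (tabs, spaces, newlines) or commas.
     * Duplicates are dropped and input order is kept (an assumption of this sketch).
     */
    static Set<String> parse(String text) {
        Set<String> genes = new LinkedHashSet<>();
        for (String token : text.split("[\\s,]+")) {
            if (!token.isEmpty()) { // a leading delimiter yields one empty token
                genes.add(token);
            }
        }
        return genes;
    }
}
```

For example, GeneListParser.parse("PUP1, YOR157C\tPUP1") yields the two names PUP1 and YOR157C.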

Statistical Analysis

GOLEM computes p-values based on the hypergeometric distribution. If an annotations file contains N genes, a given GO term has M annotated genes, and the user inputs a list of n genes of interest, the probability of seeing k or more genes of interest annotated to a given GO term is computed as:

$$P(X \ge k) = \sum_{i=k}^{\min(n, M)} \frac{\binom{M}{i} \binom{N-M}{n-i}}{\binom{N}{n}}$$
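
The hypergeometric tail probability described above can be sketched in Java as follows. This is a minimal illustration using the symbols N, M, n, and k from the text; the class name HypergeometricSketch and the log-factorial approach are choices made here and are not taken from GOLEM's Hypergeometric.java.

```java
/** Hypothetical illustration only; not taken from the GOLEM source. */
final class HypergeometricSketch {
    /** P(X >= k) when n genes are drawn from N genes, M of which are annotated to the term. */
    static double upperTail(int N, int M, int n, int k) {
        if (N < 0 || M < 0 || M > N || n < 0 || n > N) {
            throw new IllegalArgumentException("require 0 <= M <= N and 0 <= n <= N");
        }
        // log(i!) for i = 0..N, so binomial coefficients can be handled in log space.
        double[] logFact = new double[N + 1];
        for (int i = 1; i <= N; i++) {
            logFact[i] = logFact[i - 1] + Math.log(i);
        }
        int lo = Math.max(k, Math.max(0, n - (N - M))); // smallest feasible count that is >= k
        int hi = Math.min(n, M);                         // largest feasible count
        double p = 0.0;
        for (int i = lo; i <= hi; i++) {
            double logTerm = logChoose(logFact, M, i)
                    + logChoose(logFact, N - M, n - i)
                    - logChoose(logFact, N, n);
            p += Math.exp(logTerm);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }

    // log of "a choose b", valid for 0 <= b <= a.
    private static double logChoose(double[] logFact, int a, int b) {
        return logFact[a] - logFact[b] - logFact[a - b];
    }
}
```

As a check, with N = 10, M = 5, n = 2, and k = 1 the sum equals 1 - 10/45 = 35/45, about 0.778. Working with logarithms avoids overflow in the binomial coefficients for genome-sized values of N.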

 

Multiple Hypothesis Testing Correction Options

GOLEM provides the user with a choice between three multiple hypothesis testing alternatives: Bonferroni correction, false discovery rate (Benjamini & Yekutieli algorithm, 2001), and no correction.   The Bonferroni correction is a very conservative correction method, and is computed by multiplying the p-value by the number of GO nodes examined. The correction for false discovery rate is less conservative.  The FDR correction method controls the expected proportion of positive results that are false, rather than the probability that the list contains at least one false result.
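
The two corrections can be sketched as adjusted p-values. This is an illustration under stated assumptions, not GOLEM's implementation: the class MultipleTestingSketch is hypothetical, capping adjusted values at 1 is a convention chosen here, and the FDR method is written as the standard Benjamini-Yekutieli step-up adjustment, in which the p-value of rank i among m tests is multiplied by m * c(m) / i with c(m) = 1 + 1/2 + ... + 1/m.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration only; not taken from the GOLEM source. */
final class MultipleTestingSketch {
    /** Bonferroni: multiply each p-value by the number of tests, capped at 1. */
    static double[] bonferroni(double[] p) {
        double[] adjusted = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            adjusted[i] = Math.min(1.0, p[i] * p.length);
        }
        return adjusted;
    }

    /** Benjamini-Yekutieli step-up adjusted p-values, returned in the input order. */
    static double[] benjaminiYekutieli(double[] p) {
        int m = p.length;
        double harmonic = 0.0; // c(m) = 1 + 1/2 + ... + 1/m
        for (int i = 1; i <= m; i++) {
            harmonic += 1.0 / i;
        }
        // Indices of p sorted by ascending p-value.
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> p[i]));
        double[] adjusted = new double[m];
        double running = 1.0;
        // Walk from the largest p-value down so adjusted values never increase as rank decreases.
        for (int rank = m; rank >= 1; rank--) {
            int idx = order[rank - 1];
            running = Math.min(running, p[idx] * m * harmonic / rank);
            adjusted[idx] = running;
        }
        return adjusted;
    }
}
```

A node is then reported when its adjusted value falls at or below the cutoff entered in the dialog. With three p-values 0.01, 0.04, and 0.03, Bonferroni gives 0.03, 0.12, and 0.09, while the step-up adjustment gives smaller values for the two largest p-values, which illustrates why the FDR method is less conservative.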

 

NOTE:  The background of genes used for the calculation of p-values is determined by the annotations file loaded into GOLEM.  If the appropriate background of genes is anything other than all known genes in an organism, you should load a customized annotation file containing only those genes appropriate for your needs.  (For example, if a high throughput assay tested all viable yeast knockouts, the appropriate background would be only non-essential genes, rather than all yeast genes.)
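
One way to build such a customized annotation file is to filter an existing one down to the background genes. The sketch below assumes a tab-delimited GO Consortium annotation file in which comment lines start with "!" and the third column holds the gene symbol (DB_Object_Symbol); verify this against the file you use. The class BackgroundFilter is hypothetical and not part of GOLEM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not taken from the GOLEM source. */
final class BackgroundFilter {
    /**
     * Keeps comment lines and those annotation rows whose gene symbol
     * (assumed to be column 3) belongs to the background set.
     */
    static List<String> filter(List<String> lines, Set<String> background) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("!")) {
                kept.add(line); // header or comment line, kept unchanged
                continue;
            }
            String[] cols = line.split("\t", -1);
            if (cols.length > 2 && background.contains(cols[2])) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

The input lines can be read with java.nio.file.Files.readAllLines and the result written with Files.write; the filtered file is then loaded through the "Browse" button as a local annotations file.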

 

NOTE:  There are other alternatives to the hypergeometric distribution for calculating enrichment scores.  If you wish to use another method to determine enrichment, the GOLEM source code is modularized such that this change only requires altering one file.  (See the FAQ below for details.)

Visualizing Enriched Nodes

Selecting a single term in the table of enriched GO terms displays the GO graph focused around the selected node.  Selecting multiple terms within the same ontology in the table of enriched GO terms displays the GO graph focused around all selected terms.


Browsing the Ontology:

-Clicking on a node selects it.

-Right-clicking (Apple-clicking on a Mac) on a node displays a popup menu, with the following choices:

· Focus Graph: Shows all children and all paths back to the root node for the selected node.

· Select Node: Selects this node in the graph.

· Show Children: Shows the children of the selected node.

· Show Ancestors: Displays all possible paths back to the root node.
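
Because a gene ontology term can have more than one parent, showing all paths back to the root amounts to collecting every ancestor of the selected node. The sketch below shows one way to do that with a breadth-first walk over parent links. The class OntologyTraversal and the map-of-parents representation are hypothetical and are not GOLEM's internal data structures.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not taken from the GOLEM source. */
final class OntologyTraversal {
    /** Collects every term reachable by following parent links upward from start. */
    static Set<String> ancestors(Map<String, List<String>> parents, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String term = queue.remove();
            // Terms without an entry in the map (such as the root) have no parents.
            for (String parent : parents.getOrDefault(term, Collections.emptyList())) {
                if (seen.add(parent)) { // a term reached along several paths is expanded once
                    queue.add(parent);
                }
            }
        }
        return seen;
    }
}
```

The same walk over a map of children instead of parents would collect descendants, which corresponds to repeatedly applying "Show Children".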

-Clicking and holding a node permits you to move it around. Clicking on a node and moving it slightly brings the node that you have clicked on to the top.


Viewing Annotated Genes:

-The bottom panel displays the name and gene ontology identification number of the selected node, as well as a button showing the number of genes annotated to the node. Clicking on the button displays a dialog panel with a list of these annotated genes.


FAQ

Q. The program crashed when I ran it using a large annotations file, such as the human annotations file. How can I run GOLEM on a large annotations file?
A. You need to allocate more memory for the Java virtual machine. Try running the jar file from the command line with the command "java -Xmx512m -jar GOLEM.jar", which raises the maximum heap size to 512 MB.
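
To confirm that a larger heap setting is accepted on your machine, a small stand-alone check such as the following can be compiled and run with the same -Xmx option. The class HeapCheck is a hypothetical helper and is not shipped with GOLEM.

```java
/** Hypothetical helper; not part of GOLEM. */
final class HeapCheck {
    /** Maximum heap the Java virtual machine will attempt to use, in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as "java -Xmx512m HeapCheck" should report a value close to 512; the exact figure depends on the virtual machine.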

Q. I want to compute p-values for nodes enriched in a list of genes that I found to be regulated in a microarray experiment. How can I use only the genes on my microarray as the background set for computing p-values?
A. Define your own annotations file containing information only about the genes on your microarray. Then, load your annotations file with the "Browse" button in the "Choose Organism" dialog (application version only).

Q. I don't like using the hypergeometric distribution to compute the enriched GO-term p-values. How can I modify the GOLEM source code so that p-values are computed using my favorite distribution?
A. The GOLEM source code is modularized so that this change is easy to make.  P-values are computed in the function "computepValues" in the file "Hypergeometric.java". This function is called by the function "goTermFind" in "OntologyWeb.java". You can implement your favorite function for computing p-values, and call your function instead of our implementation of computepValues. (If you want to use a multiple hypothesis testing correction method that we haven't implemented, you can also make this alteration in the goTermFind function.)
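
As a rough picture of what a replacement might look like, the sketch below defines a scoring interface and a binomial alternative to the hypergeometric test. Everything here is hypothetical: the signature of computepValues is not given in these instructions, so the interface EnrichmentScorer, the class BinomialScorer, and the parameter order are assumptions that would need to be adapted to the actual code in Hypergeometric.java and OntologyWeb.java.

```java
/** Hypothetical interface; the real computepValues signature is not shown in these instructions. */
interface EnrichmentScorer {
    /** N genes in the background, M annotated to the term, n query genes, k of them annotated. */
    double pValue(int N, int M, int n, int k);
}

/** Binomial upper tail: treats each query gene as an independent draw with success rate M/N. */
final class BinomialScorer implements EnrichmentScorer {
    @Override
    public double pValue(int N, int M, int n, int k) {
        double rate = (double) M / N;
        double p = 0.0;
        for (int i = Math.max(k, 0); i <= n; i++) {
            p += choose(n, i) * Math.pow(rate, i) * Math.pow(1.0 - rate, n - i);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }

    // Binomial coefficient as a double, built multiplicatively; suitable for modest list sizes.
    private static double choose(int n, int i) {
        double c = 1.0;
        for (int j = 1; j <= i; j++) {
            c = c * (n - i + j) / j;
        }
        return c;
    }
}
```

The binomial test samples with replacement, so it approximates the hypergeometric result when the query list is small relative to the background. Keeping the scorer behind a single method is what would let goTermFind call one implementation or another without further changes.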