Hello and welcome to GOLEM! GOLEM (Gene Ontology Local Exploration Map) is a visualization and analysis tool for focused exploration of the gene ontology graph. GOLEM allows the user to dynamically expand and focus the local graph structure of the gene ontology hierarchy in the neighborhood of any chosen term. It also supports rapid analysis of an input list of genes to find enriched gene ontology terms. The GOLEM application permits the user either to utilize local gene ontology and annotations files in the absence of an Internet connection, or to access the most recent ontology and annotation information from the gene ontology web page. GOLEM supports global and organism-specific searches by gene ontology term name, gene ontology id and gene name. Read the instructions below for help using GOLEM.
-(Application version only) When GOLEM first loads it will prompt the user
to choose to either load a local ontology file, or to download the latest
version of the "gene_ontology.obo" file from the Gene Ontology Consortium webpage. Loading a local file is typically
faster because nothing has to be downloaded, while downloading the latest copy
ensures that the ontology is up to date.
(The applet version of GOLEM uses a cached copy of the ontology and
cannot offer the ability to use your own copy due to the security restrictions
of Java.)
-Once GOLEM has loaded the gene ontology, the main window will be
displayed. From this window you
can navigate the current view of the gene ontology, search for specific terms
or genes, load a species annotation file, find GO terms enriched for a list of
genes, and display these instructions.
-Clicking on the "Choose Organism" button in the upper right will
display a dialog allowing you to load a species annotation file. The listed choices
automatically download the latest version of the annotation file from the Gene Ontology Consortium webpage for
yeast, fly, human, mouse, or worm.
The "Show all nodes" option reverts GOLEM to a state where no
annotation file is loaded.
(Application version only:
If you wish to use your own set of annotations, you may click on the
"Browse" button to choose a local annotations file. This file should be in the format
specified by the GO
Consortium.)
-To search for a gene ontology node by name, first select "Search by
GOTerm name" from the dropdown menu. Then, type the name of the node in
the text field in the toolbar, and press enter. For example: typing
"acetate biosynthesis" will display the acetate biosynthesis node,
its immediate children, and all possible paths back to the root node. As a
search aid, a popup window will display all nodes whose names begin with the
typed letters.
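The "names that begin with the typed letters" lookup can be sketched with a sorted map. This is only an illustration of the idea, not GOLEM's actual code: the class TermPrefixIndex, its methods, and the placeholder ids are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

// Hypothetical helper (not a GOLEM class): prefix lookup over GO term names.
final class TermPrefixIndex {
    // Lower-cased term name -> GO id, kept in sorted key order.
    private final TreeMap<String, String> idsByName = new TreeMap<>();

    void add(String termName, String goId) {
        idsByName.put(termName.toLowerCase(Locale.ROOT), goId);
    }

    // Returns, in alphabetical order, every stored (lower-cased) name
    // that begins with the typed text.
    List<String> suggest(String typed) {
        String prefix = typed.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        // tailMap starts at the first key >= prefix; matching keys are
        // contiguous from there, so stop at the first non-match.
        for (String name : idsByName.tailMap(prefix, true).keySet()) {
            if (!name.startsWith(prefix)) {
                break;
            }
            matches.add(name);
        }
        return matches;
    }

    public static void main(String[] args) {
        TermPrefixIndex index = new TermPrefixIndex();
        index.add("mitochondrion inheritance", "GO:0000001");
        index.add("acetate biosynthesis", "PLACEHOLDER-1"); // placeholder id
        index.add("acetate metabolism", "PLACEHOLDER-2");   // placeholder name and id
        System.out.println(index.suggest("acet"));
    }
}
```

A sorted map keeps each lookup proportional to the number of matches rather than to the size of the whole ontology, which matters when suggestions are refreshed on every keystroke.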
-To search for a gene ontology node using its identification number, first
select "Search by GO Term id" from the dropdown menu at the top of
the screen. Then, type the gene ontology identification number in the text
field in the toolbar, and press enter. For example, entering
"GO:0000001" will display the mitochondrion inheritance node, its
immediate children, and all possible paths back to the root node.
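GO identifiers consist of the prefix "GO:" followed by seven digits, so a typed id can be checked before searching. The sketch below is a hypothetical helper (the class GoIdFormat is not part of GOLEM) and shows only the format check.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a GOLEM class): checks the "GO:" + seven digits format.
final class GoIdFormat {
    private static final Pattern GO_ID = Pattern.compile("GO:\\d{7}");

    // True when the text, ignoring surrounding whitespace, looks like a GO id.
    static boolean isValid(String text) {
        return text != null && GO_ID.matcher(text.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("GO:0000001")); // well-formed id from the example above
        System.out.println(isValid("GO:1"));       // too few digits
    }
}
```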
-To search for all nodes annotated to a particular gene in a specific
ontology, select GeneFinder(Process), GeneFinder(Component), or
GeneFinder(Function) from the dropdown menu. (These tools search the biological
process, cellular component, and molecular function ontologies, respectively.) Then, type the
name of the gene of interest in the text field, and press enter.
NOTE: An annotations file must be selected to use this tool. Gene names must be in the format used in the DB_Object_Symbol or DB_Object_Synonyms field of the annotations file. For yeast, these are the common name and the YORF (such as PUP1 or YOR157C); check the GO Consortium guidelines for other organisms.
-Once an annotations file is loaded, a "Find Enriched
Nodes" option becomes available in the dropdown menu. Selecting this option
opens a dialog window. Choose a multiple hypothesis testing
correction method and enter the desired cutoff p-value or false discovery rate.
Then, enter a tab, space, comma, or newline-delimited list of genes of
interest, and press the "Find Enriched Nodes" button. GOLEM will
return a table showing enriched nodes, the query genes annotated to these
nodes, and the p-values of the nodes.
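Splitting a tab, space, comma, or newline-delimited gene list can be sketched with a single regular expression. This is an assumption about one reasonable way to do it, not GOLEM's actual parser. The class GeneListParser is hypothetical, and dropping duplicate names is a choice made for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper (not a GOLEM class): splits a pasted gene list.
final class GeneListParser {
    // Splits on any run of whitespace (space, tab, newline) and/or commas.
    // Duplicates are dropped and first-seen order is kept (an assumption).
    static Set<String> parse(String raw) {
        Set<String> genes = new LinkedHashSet<>();
        for (String token : raw.split("[\\s,]+")) {
            if (!token.isEmpty()) { // a leading delimiter yields one empty token
                genes.add(token);
            }
        }
        return genes;
    }

    public static void main(String[] args) {
        System.out.println(parse("PUP1\tYOR157C, GENE3\nPUP1 "));
    }
}
```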
GOLEM computes p-values based on the hypergeometric distribution. If an annotations file contains N genes, a given GO term has M annotated genes, and the user inputs a list of n genes of interest, the probability of seeing k or more genes of interest annotated to a given GO term is computed as:

$$P(X \ge k) = \sum_{i=k}^{\min(n, M)} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$
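A minimal Java sketch of this upper-tail hypergeometric probability follows. It sums C(M, i) C(N-M, n-i) / C(N, n) for i from k to min(n, M). It is not GOLEM's Hypergeometric.java: the class HypergeometricTail and its method names are assumptions, and log-factorials are used here only to avoid overflow for large N.

```java
// Hypothetical sketch (not GOLEM's implementation) of P(X >= k)
// under the hypergeometric distribution.
final class HypergeometricTail {
    // Table of ln(i!) for i = 0..max.
    private static double[] logFactorials(int max) {
        double[] lf = new double[max + 1];
        for (int i = 1; i <= max; i++) {
            lf[i] = lf[i - 1] + Math.log(i);
        }
        return lf;
    }

    // ln of the binomial coefficient C(n, k), for 0 <= k <= n.
    private static double logChoose(double[] lf, int n, int k) {
        return lf[n] - lf[k] - lf[n - k];
    }

    // N: genes in the annotations file, M: genes annotated to the term,
    // n: genes in the query list, k: query genes annotated to the term.
    // Assumes 0 <= M <= N and 0 <= n <= N.
    static double upperTail(int N, int M, int n, int k) {
        double[] lf = logFactorials(N);
        int lo = Math.max(k, Math.max(0, n - (N - M))); // smallest feasible overlap
        int hi = Math.min(n, M);                        // largest feasible overlap
        double logDenominator = logChoose(lf, N, n);
        double p = 0.0;
        for (int i = lo; i <= hi; i++) {
            p += Math.exp(logChoose(lf, M, i) + logChoose(lf, N - M, n - i) - logDenominator);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }

    public static void main(String[] args) {
        // (C(4,2)C(6,1) + C(4,3)C(6,0)) / C(10,3) = 40/120, about 0.333
        System.out.println(upperTail(10, 4, 3, 2));
    }
}
```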

GOLEM provides the user with a choice among three multiple hypothesis testing alternatives: Bonferroni correction, false discovery rate (Benjamini & Yekutieli algorithm, 2001), and no correction. The Bonferroni correction is very conservative and is computed by multiplying the p-value by the number of GO nodes examined. The false discovery rate (FDR) correction is less conservative: it controls the expected proportion of positive results that are false, rather than the probability that the list contains at least one false result.
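The two corrections can be sketched as follows. This is an illustration under stated assumptions, not GOLEM's code: the class MultipleTesting is hypothetical, the number of GO nodes examined is taken to be the length of the p-value array, and the FDR method is assumed to be the Benjamini-Yekutieli step-up procedure with the harmonic-sum factor c(m) = 1 + 1/2 + ... + 1/m.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch (not a GOLEM class) of the two correction methods.
final class MultipleTesting {
    // Bonferroni: multiply each p-value by the number of tests, capped at 1.
    static double[] bonferroni(double[] p) {
        double[] adjusted = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            adjusted[i] = Math.min(1.0, p[i] * p.length);
        }
        return adjusted;
    }

    // Benjamini-Yekutieli step-up: find the largest rank r with
    // p_(r) <= r * q / (m * c(m)) and reject the r smallest p-values.
    static boolean[] benjaminiYekutieli(double[] p, double q) {
        int m = p.length;
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> p[i]));
        double c = 0.0; // harmonic sum c(m)
        for (int j = 1; j <= m; j++) {
            c += 1.0 / j;
        }
        int cutoff = 0;
        for (int rank = 1; rank <= m; rank++) {
            if (p[order[rank - 1]] <= rank * q / (m * c)) {
                cutoff = rank;
            }
        }
        boolean[] rejected = new boolean[m];
        for (int rank = 1; rank <= cutoff; rank++) {
            rejected[order[rank - 1]] = true;
        }
        return rejected;
    }

    public static void main(String[] args) {
        double[] p = {0.001, 0.04, 0.03, 0.5};
        System.out.println(Arrays.toString(bonferroni(p)));
        System.out.println(Arrays.toString(benjaminiYekutieli(p, 0.05)));
    }
}
```

Bonferroni returns adjusted p-values that are compared against the cutoff, whereas the step-up procedure returns a reject/keep decision per node at the chosen false discovery rate q.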
NOTE: The background of genes used for the calculation of p-values is determined by the annotations file loaded into GOLEM. If the appropriate background of genes is anything other than all known genes in an organism, you should load a customized annotation file containing only those genes appropriate for your needs. (For example, if a high throughput assay tested all viable yeast knockouts, the appropriate background would be only non-essential genes, rather than all yeast genes.)
NOTE: There are other alternatives to the hypergeometric distribution for calculating enrichment scores. If you wish to use another method to determine enrichment, the GOLEM source code is modularized such that this change only requires altering one file. (See the FAQ below for details.)
Selecting a single term in the table of enriched GO terms displays the GO graph focused around the selected node. Selecting multiple terms within the same ontology in the table of enriched GO terms displays the GO graph focused around all selected terms.
-Clicking on a node selects it.
-Right-clicking (Apple-clicking on a Mac) on a node displays a popup menu,
with the following choices:
  -Focus Graph: Shows all children and all paths back to the root node for the selected node.
  -Select Node: Selects this node in the graph.
  -Show Children: Shows the children of the selected node.
  -Show Ancestors: Displays all possible paths back to the root node.
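The node sets behind "Show Children", "Show Ancestors", and "Focus Graph" can be sketched as traversals of a directed acyclic graph. This is an assumption about how such a view could be computed, not GOLEM's OntologyWeb code. The class GoGraphSketch and the node names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch (not a GOLEM class) of the popup-menu node sets.
final class GoGraphSketch {
    private final Map<String, Set<String>> parents = new HashMap<>();
    private final Map<String, Set<String>> children = new HashMap<>();

    // A term may have several parents, so the ontology is a DAG, not a tree.
    void addEdge(String child, String parent) {
        parents.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
        children.computeIfAbsent(parent, k -> new LinkedHashSet<>()).add(child);
    }

    // "Show Children": direct children only.
    Set<String> childrenOf(String term) {
        return children.getOrDefault(term, Collections.emptySet());
    }

    // "Show Ancestors": every node lying on some path from the term to the root.
    Set<String> ancestors(String term) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(term);
        while (!queue.isEmpty()) {
            for (String parent : parents.getOrDefault(queue.poll(), Collections.emptySet())) {
                if (seen.add(parent)) { // visit each ancestor once
                    queue.add(parent);
                }
            }
        }
        return seen;
    }

    // "Focus Graph": the term, its direct children, and all of its ancestors.
    Set<String> focus(String term) {
        Set<String> shown = new LinkedHashSet<>();
        shown.add(term);
        shown.addAll(childrenOf(term));
        shown.addAll(ancestors(term));
        return shown;
    }

    public static void main(String[] args) {
        GoGraphSketch graph = new GoGraphSketch();
        graph.addEdge("b", "root");
        graph.addEdge("c", "root");
        graph.addEdge("d", "b");
        graph.addEdge("d", "c"); // d has two parents
        graph.addEdge("e", "d");
        System.out.println(graph.focus("d"));
    }
}
```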
-Clicking and dragging a node moves it. Clicking on a node and moving it
slightly brings that node to the top.
-The bottom panel displays the name and gene ontology identification number
of the selected node, as well as a button showing the number of genes annotated
to the node. Clicking on the button displays a dialog panel with a list of these
annotated genes.
Q. The program crashed when I ran it using a large annotations file, such as
the human annotations file. How can I run GOLEM on a large annotations file?
A. You need to allocate more memory to the Java virtual machine. Try running
the jar file from the command line with a larger maximum heap size:
    java -Xmx512m -jar GOLEM.jar
Q. I want to compute p-values for nodes enriched in a list of genes that I
found to be regulated in a microarray experiment. How can I use only the genes
on my microarray as the background set for computing p-values?
A. Define your own annotations file containing information only about the genes
on your microarray. Then, load this file with the "Browse" button in the "Choose
Organism" dialog (application version only).
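One way to build such a restricted annotations file is to filter an existing one. The sketch below assumes the tab-delimited GO annotation file layout, in which comment lines start with "!" and the third column is DB_Object_Symbol. The class AnnotationFilter is hypothetical, and the sketch matches on the symbol column only, ignoring synonyms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a GOLEM class): restricts annotations to a background set.
final class AnnotationFilter {
    // Keeps comment/header lines and any annotation whose DB_Object_Symbol
    // (third tab-delimited column) is in the background gene set.
    static List<String> keepBackground(List<String> lines, Set<String> background) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("!")) {
                kept.add(line);
                continue;
            }
            String[] columns = line.split("\t", -1); // -1 keeps empty columns
            if (columns.length > 2 && background.contains(columns[2])) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

The input lines could be read with java.nio.file.Files.readAllLines and the result written back with Files.write. The resulting file can then be loaded as a local annotations file.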
Q. I don't like using the hypergeometric distribution to compute the
enriched GO-term p-values. How can I modify the GOLEM source code so that
p-values are computed using my favorite distribution?
A. The GOLEM source code is modularized so that this change is easy to
make. P-values are computed in the
function "computepValues" in the file
"Hypergeometric.java". This function is called by the function
"goTermFind" in "OntologyWeb.java". You can implement your
favorite function for computing p-values, and call your function instead of our
implementation of computepValues. (If you want to use a multiple hypothesis
testing correction method that we haven't implemented, you can also make this
alteration in the goTermFind function.)
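As an illustration of a replacement p-value function, the sketch below computes an upper-tail binomial probability, treating each of the n query genes as annotated with probability M/N. It does not reproduce the real signature of computepValues, which is not shown here. The interface EnrichmentTest and the class BinomialTail are hypothetical names, and you would need to adapt the arguments to what goTermFind actually passes.

```java
// Hypothetical interface (not part of GOLEM): one p-value per GO term.
interface EnrichmentTest {
    // N: background genes, M: genes annotated to the term,
    // n: query genes, k: query genes annotated to the term.
    double pValue(int N, int M, int n, int k);
}

// Binomial alternative: P(X >= k) with X ~ Binomial(n, M / N).
final class BinomialTail implements EnrichmentTest {
    @Override
    public double pValue(int N, int M, int n, int k) {
        double f = (double) M / N;
        if (k <= 0) {
            return 1.0;
        }
        if (k > n || f <= 0.0) {
            return 0.0;
        }
        if (f >= 1.0) {
            return 1.0;
        }
        double sum = 0.0;
        double logChoose = 0.0; // ln C(n, 0)
        for (int i = 0; i <= n; i++) {
            if (i >= k) {
                sum += Math.exp(logChoose + i * Math.log(f) + (n - i) * Math.log(1.0 - f));
            }
            if (i < n) {
                // C(n, i + 1) = C(n, i) * (n - i) / (i + 1)
                logChoose += Math.log(n - i) - Math.log(i + 1);
            }
        }
        return Math.min(1.0, sum);
    }

    public static void main(String[] args) {
        EnrichmentTest test = new BinomialTail();
        System.out.println(test.pValue(10, 5, 3, 2)); // (3 + 1) / 8 = 0.5 up to rounding
    }
}
```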