
Reactome Database User Guide


Introduction

This document is an overview of the Reactome database of biological pathways and processes and its web site. This is not a comprehensive guide, but should provide you with enough information to browse the database and use its principal tools for data analysis. Please read through it and contact us with any comments or questions.

To explain some of the terms used in this User Guide, we have created a Glossary.

The Frequently Asked Questions (FAQ) about Reactome is also available.

What is Reactome?

Reactome is a curated database of pathways and reactions (pathway steps) in human biology. The Reactome definition of a 'reaction' includes many events in biology that are changes in state, such as binding, activation, translocation and degradation, in addition to classical biochemical reactions. Information in the database is authored by expert biologist researchers, maintained by Reactome editorial staff, and extensively cross-referenced to other resources, e.g. NCBI, Ensembl, UniProt, UCSC Genome Browser, HapMap, KEGG (Gene and Compound), ChEBI, PubMed and GO. Inferred orthologous reactions are available for over 20 non-human species including mouse, rat, chicken, puffer fish, worm, fly, yeast, rice, Arabidopsis and E. coli.

The Reactome Front Page

This is the normal entry point for most users of Reactome. To view this page, type the address of the Reactome website into the address bar of your browser.

Reactome Home Page


This page is divided into 3 main parts:

  • The Navigation Bar at the top of the page has dropdown menus giving access to data and functionality available on the Reactome website.
  • The blue-colored Tool bar on the left-hand side of the page provides shortcut buttons to the most popular analysis tools and downloadable Reactome datasets.
  • The Main Text section includes a description of Reactome, a featured pathway, an online tutorial and the latest Reactome news.

The Navigation Bar

The Navigation Bar is found on many pages of the Reactome website, and most of its items are dropdown menus. Links from the Navigation Bar include:

  • Home. Return to the Reactome home page.
  • About. Background information such as a description of Reactome and the people involved.
  • Content. An overview of the Reactome database, the areas of biology we cover, our plans for content expansion, the structure of our data, and statistics about the current content.
  • Documentation. Includes this User Guide, plus technical aspects of Reactome's functionality and data.
  • Tools. Tools in addition to those linked as shortcut buttons on the Tool bar. Some of these tools are covered in more detail in the Reactome Tools section.
  • Download. Links to a page where Reactome data and code are available in various formats. See below for more details.
  • Contact Us. Launches your email program.
  • Outreach. Details of Reactome training, publications and representation at conferences.

The Tool bar

The tool bar is divided into upper and lower sections. The upper section provides easy access to tools, the lower has shortcuts to popular Downloads, Featured tools, and Help. From top to bottom, the features are:

  • Search. Type in text (words) or identifiers to locate them in Reactome. For details see the Searching Reactome section. Search examples are listed below the Search textbox.
  • Tool buttons. Directly access the most popular Reactome tools; see Reactome Tools for detailed descriptions.
  • Downloads. Popular Reactome data downloads.
  • Try this. A featured Reactome tool, intended to explain tools to new users or introduce new features.
  • Comments. Send anonymous feedback to Reactome.

The Main Text

This is the text area under the Navigation Bar, to the right of the Tool bar. This area may change from time to time. Currently, it contains the following:

  • About Reactome. A one paragraph statement linked to more detailed information.
  • Featured pathway. A pathway of topical interest.
  • Tutorial. Link to a video tutorial introducing Reactome.
  • News and Notes. What's new in Reactome.

The Pathway Browser and Tools

The Pathway Browser is the primary means of viewing and interacting with specific pathways. It includes a search tool, interactive pathway viewer and set of tools for several types of analysis including:

  • Comparison of a pathway with its equivalent in another species
  • The overlay of user-supplied expression data onto a pathway
  • The overlay of protein-protein or protein-compound data from external databases or user-supplied data onto a pathway.

The Tool bar on Reactome's Homepage gives access to extended versions of these tools that query across all Reactome pathways (see Reactome Tools).

The Pathway Browser

The Pathway Browser is launched by clicking the top button on the Tool bar. The Pathway Browser has 4 panels:

Pathway Browser


Analyze bar – across the top

  • Includes a key to the most common pathway objects, and a link to a full Diagram Key.
  • The Analyze, Annotate & Upload button opens a panel that controls the interactive tools associated with pathway diagrams (explained in detail in the section ‘The Analyze, Update and Annotate Button’ below).
  • On the far left is a Home button that returns you to the homepage.

Sidebar - left side

  • At the top is a species selector, with a dropdown list of species. The default is Homo sapiens. Change to another species and the pathway displayed will be computationally inferred from the curated human pathway. N.B. Reactome data is human-centric; data for other species is inferred from human pathways, so pathway steps may be missing for other organisms if they are not identified by the inference process (described here). Currently, infectious disease pathways involving pathogen proteins are only listed under the species "Homo sapiens". These include the HIV, Influenza, and botulinum neurotoxicity pathways.
  • Below the species selector the Sidebar has two tabs:
    • Pathways tab - displays the hierarchy of Reactome pathways. Similar to the Windows File Manager, sub-pathways can be revealed by clicking on the + symbol to the left of the pathway name.
    • Help tab - help.
  • Click the arrowhead on the right edge of the Sidebar to hide/reveal it. The width of the Sidebar can be adjusted by clicking on and dragging the grey line that separates the Sidebar from the pathway browser to the intended width.

Pathway Diagram Panel - upper right

  • This is where pathway diagrams are displayed when selected in the Pathways tab. This panel will display a brief guide until you select a pathway. Top-left of this panel is a navigate/zoom tool. Click on the arrows to move across the diagram. The central button resizes the pathway diagram to fit the available space. Click on the circles to zoom in and out (the mouse wheel also zooms).

Details Panel - at the bottom of the Pathway Browser.

  • It gives details of the selected pathway, reaction, complex, set or proteins, when they are selected in the pathway diagram or Pathways tab. This panel can be revealed/hidden using the small yellow triangle on the border with the Pathway Diagram panel.


The Pathways tab and pathway hierarchy

One of the tabs on the Sidebar is the Pathways tab. This contains a list of Reactome pathway topics, sorted alphabetically. Some topics, such as apoptosis, are too large to represent as a single pathway; most Reactome pathway topics are therefore divided into sub-pathways, which may themselves be further divided. Individual steps in a pathway are known as reactions. The organisation of pathways, sub-pathways and reactions is represented on the Pathways tab as a pathway hierarchy. This view functions in a similar manner to the Windows File Manager; sub-pathways are revealed by clicking on the + symbol to the left of the pathway name, and hidden by clicking on the – symbol.

Pathways and reactions can be differentiated by a representative symbol to the left of the name, see the image Pathway Hierarchy Symbols below. You may also see a symbol representing 'black-box' reactions, where complete details have been omitted as unnecessary or are not completely determined.

Within a pathway, the order of reactions in the hierarchy from top to bottom usually follows their order in the pathway, so that preceding reactions are above the subsequent reaction, but note that this is not always the case. Reactome has a more formal way of identifying connected reactions called Preceding and Following Events, visible in the Reaction Details (see Details Panel below).


Pathway Hierarchy
Pathway Hierarchy Symbols
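
The pathway hierarchy described in this section can be pictured as a tree whose rows are revealed or hidden by the + and – symbols, with Preceding Events stored as explicit links rather than implied by position. The following Python sketch is illustrative only: the `Event` class, its field names and the example event names are hypothetical and do not reflect Reactome's actual data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Event:
    """A pathway, sub-pathway or reaction in a toy hierarchy (hypothetical model)."""
    name: str
    children: list[Event] = field(default_factory=list)  # empty for reactions
    preceding: list[str] = field(default_factory=list)   # names of preceding events


def visible_rows(event: Event, expanded: set[str], depth: int = 0) -> list[str]:
    """Return the rows a file-manager style tree would currently show.

    Events with children get '+' when collapsed and '-' when expanded;
    reactions (no children) get a blank marker.
    """
    if not event.children:
        marker = " "
    else:
        marker = "-" if event.name in expanded else "+"
    rows = [f"{'  ' * depth}{marker} {event.name}"]
    if event.name in expanded:
        for child in event.children:
            rows.extend(visible_rows(child, expanded, depth + 1))
    return rows


# Hypothetical example data.
step_a = Event("Reaction A")
step_b = Event("Reaction B", preceding=["Reaction A"])
sub = Event("Example sub-pathway", children=[step_a, step_b])
top = Event("Example pathway", children=[sub])

for row in visible_rows(top, expanded={"Example pathway"}):
    print(row)
```

Keeping `preceding` separate from the order of `children` mirrors the point made above: the position of a reaction in the hierarchy usually, but not always, matches its order in the pathway.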

Pathway Diagrams

Pathway diagrams represent the steps of a pathway as a series of interconnected pathway steps, known in Reactome as 'reactions'. Reactions are the core unit of Reactome's data model. They encapsulate 'changes of state' in biology, such as the familiar biochemical reaction where substrates are converted into products by the action of a catalyst, but also include processes such as transport of molecules from one cellular compartment to another, binding, dissociation, phosphorylation, degradation and others.

A "pop out" diagram key is found in the upper right corner of the webpage.

Prediagramkey.jpeg

The diagram key provides descriptions of the icons used in the diagrams:

Diagram Key

Objects on the diagram represent reaction inputs and outputs, and the catalyst if relevant (see A below).

  • Inputs, outputs and catalysts are represented as boxes or ovals.
  • Green ovals are small molecules or sets of small molecules.
  • Green boxes with rounded off corners are individual proteins or sets of proteins or mixed sets.
  • Green boxes with square corners are proteins that have no UniProt accession (or did not at the time the reaction was created).
  • Blue boxes are complexes, i.e. proteins and/or small molecules that are bound to each other.
  • Input and output molecules are joined by lines to a central 'reaction node' (surrounded by a green box in Reaction Objects A, below). Clicking this node selects the reaction.
  • The outputs of a reaction have an arrowhead on the line connecting them to the reaction node.
  • Numbered boxes on the line between an input/output and the reaction node indicate the number of units of this input/output in the reaction (when n >1).
  • Reaction input/output molecules are often connected by arrows to molecules that take part in preceding or subsequent reactions (i.e. the preceding/subsequent steps in the pathway).
  • Catalysts are connected to the reaction node by a line ending in an open circle.
  • Molecules that regulate a reaction are connected to the reaction node by a line ending in an open triangle for positive regulation or a 'T'-shaped head for negative regulation (see B below).


Reaction Objects A
Reaction Objects B

Reactions are superimposed onto pink boxes that represent cellular compartments - a typical diagram has a box representing the cytoplasm, bounded by a double-line representing the plasma membrane. The white background beyond this represents the extracellular space. Other organelles are represented as additional labelled boxes within the cytoplasm. Reaction objects are placed in the physiologically correct cellular compartment, or span the boundary of a compartment to indicate they are in the corresponding membrane, e.g. the boundary of the cytosol is the plasma membrane.

The reaction node has 5 subtypes, indicating subclasses of reaction (see Reaction Objects C, below):

  • Open squares represent a 'transition'
  • Filled circles represent 'association', i.e. binding
  • Double-circles represent 'dissociation'
  • Squares with two slashes represent 'omitted process'. This is used to denote a reaction where the full details have been deliberately omitted. This is most commonly used for events that include specific members of a protein family to illustrate the general behaviour of the larger group. It is also used for reactions that occur with no fixed order or stoichiometry, and for degradation where the output is a random set of peptide fragments.
  • Squares containing a question mark represent 'uncertain process', where some details of the reaction are known, but the process is thought to be more complex than it is represented. Explanatory details are typically included in the Description.


Reaction Objects C
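
The five reaction node subtypes listed above amount to a small lookup table from glyph to reaction subclass. A minimal Python sketch of that table follows; the `ReactionNode` enum and the `subclass_for_glyph` helper are hypothetical names introduced for illustration and are not part of any Reactome software.

```python
from enum import Enum


class ReactionNode(Enum):
    """Reaction subclasses, each valued by a description of its node glyph."""
    TRANSITION = "open square"
    ASSOCIATION = "filled circle"
    DISSOCIATION = "double circle"
    OMITTED_PROCESS = "square with two slashes"
    UNCERTAIN_PROCESS = "square containing a question mark"


def subclass_for_glyph(glyph: str) -> ReactionNode:
    """Look up the reaction subclass for a glyph description (case-insensitive)."""
    return ReactionNode(glyph.strip().lower())


print(subclass_for_glyph("Filled circle").name)
```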

Navigating Pathway Diagrams

The Pathways tab, Pathway Diagram panel and Details panel are connected. Clicking on a pathway or reaction name in the Pathways tab causes it to be highlighted in green, opens the corresponding pathway diagram in the Pathway Diagram panel and populates the Details panel. N.B. The Details panel is hidden by default; click on the small blue triangle at the bottom of the Pathway Browser to reveal it.

When a pathway diagram is visible in the Pathway Diagram panel, moving the mouse pointer over the name of one of its sub-pathways or reactions on the Pathways tab will highlight the reaction node(s) for that sub-pathway/reaction in green on the diagram. If the sub-pathway or reaction name is selected (clicked) on the Pathways tab, it is highlighted in green, its parent pathway or pathways are highlighted in yellow-green, and the reaction node(s) corresponding to the sub-pathway or reaction are surrounded by a green box in the pathway diagram. In addition, the Details panel will update to show details of the sub-pathway or reaction selected. If the selected sub-pathway or reaction is not currently visible in the pathway diagram, the diagram will re-centre on the selected object(s).

Selecting a reaction in the Pathway Diagram will cause its name and the name of its parent pathway(s) in the hierarchy to be highlighted on the Pathways tab.

Highlighted reactions are also visible in the thumbnail diagram and can be used to navigate quickly to the region of interest in the diagram using the thumbnail window, bottom left of the Pathway Diagram.

There is a navigation tool top-left of the Pathway Diagram. With this you can navigate left, right, up or down and zoom in or out of the diagram. You can also zoom using the mouse wheel, and click and drag the diagram.

Pathway Browser showing a selected sub-pathway highlighted in green and parent pathways highlighted in yellow-green, reaction nodes for all reactions in the sub-pathway are surrounded by green squares in the Pathway Diagram


Sub-pathway Diagrams

Large topics such as apoptosis contain too much information to be displayed as a single pathway or diagram. Where this is the case, the topic is divided into sub-pathways in the pathway hierarchy, and sub-pathway diagrams in the Pathway Diagram Panel. If you select a pathway in the Pathways tab that only has sub-pathway diagrams, the Pathway Diagram displays an overview diagram containing boxes with green borders, the symbol for a sub-pathway diagram. This symbol indicates that a detailed diagram is available but is not part of the currently displayed Pathway Diagram. The sub-pathway diagram symbol can also be used within a pathway diagram if the details of a sub-pathway are too complicated for inclusion, to indicate that the sub-pathway has its own, separate diagram.

There are two ways to access a sub-pathway diagram. The simplest is to right-click on the sub-pathway diagram symbol and select the option 'Go To Pathway' from the menu that appears. Alternatively, left-click to select the sub-pathway diagram symbol; the corresponding sub-pathway name will be highlighted on the Pathways tab. In the example shown below, the sub-pathway diagram symbol for 'Intrinsic Pathway for Apoptosis' is selected (boxed in green), causing the corresponding sub-pathway and its parent pathway to be highlighted green on the Pathways tab. The sub-pathway name has not been clicked. Clicking on the sub-pathway name in the hierarchy causes the corresponding pathway diagram to open in the Pathway Diagram panel.

Pathway Browser showing a selected sub-pathway diagram symbol surrounded by green square, corresponding sub-pathway and parent pathway are highlighted in green on the Pathways tab.

Details Panel

The Details Panel is at the bottom of the Pathway Browser. It gives details of the selected pathway, reaction, complex, set or proteins, when they are selected in the pathway diagram or Pathways tab. This panel can be revealed/hidden using the small blue triangle on the border with the Pathway Diagram panel. A feature shared by the Pathway, Reaction and Complex details panels is the "Participating molecules" button that lists all proteins, nucleic acids, complexes and small molecules, and complexes of these entities that are involved in the pathway, reaction or complex. The height of the Details panel can be adjusted by clicking on and dragging the grey line that separates the Details panel from the pathway browser to the intended height.

Pathway Details

Pathway details typically include a description of the pathway, the GO biological process term for the pathway, the GO cellular compartment term (if applicable), and references linked to PubMed providing background information relevant to the pathway. A figure may be included. If the pathway is not supported directly by human experimental data but has been inferred from another species, this is indicated by the phrase 'This event is deduced on the basis of event(s)' and a link to that pathway. The field 'Equivalent Event in Other Species' lists species that have been computationally predicted to have the same pathway. Also available are links to download/export the pathway in several formats.

Details panel displaying the pathway description below the pathway diagram.



The molecules that participate in a pathway, reaction, complex or set can be displayed and/or downloaded by clicking on the "Load participating molecules" button at the bottom of the details section.

Participating molecule1.jpeg


A list of participating molecules, grouped by type, may be viewed in the "List" tab. The number of each type is shown in parentheses.

Participating molecule2.jpeg


Each list can be unfurled, and the download options may be revealed by clicking on the download options tab.

Participating molecule3.jpeg


The user can select the type(s) of molecules to download, the type of information desired for each molecule type, and the format of the download file.

Participating molecule4.jpeg


This event is deduced on the basis of event(s) in other organism(s)
This indicates that the event has not been experimentally demonstrated in humans, but has been inferred on the basis of data acquired for another species. The link points to a web page containing the experimental data for the reaction in that species.
Equivalent event(s) in other organism(s)
Provides links to descriptions of the events in other species that are either confirmed to occur in a very similar way in both species, or have been electronically inferred.

Note that you can view the descriptions of these "equivalent events" in the bottom pane of the web page. Another way to view the diagram for a non-human species pathway is to select the species (in the species tab at the upper left corner of the page) and the pathway in the pathway hierarchy panel in the pathway browser.
Note that, when viewing pathways that have been electronically inferred in Gallus gallus, it is currently not possible to view the hierarchy of inferred events in the left panel. By default, the manually curated pathways for Gallus are listed there. To navigate within these inferred Gallus pathway diagrams, you can select individual reaction nodes or reaction participants and see their descriptions in the details panel.

Download pathway in one of the formats
This slot provides export options for the pathway being displayed. These are:
  • [SBML]. An exchange format used by systems biologists for their models.
  • [BioPAX2]. An exchange format used by systems biologists for their models.
  • [BioPAX3]. An exchange format used by systems biologists for their models.
  • [PDF]. Text dump of the pathway, organized to look like a research report.
  • [Protege]. A format used for ontology exchange.

SBML and BioPAX are exchange formats of interest to bioinformaticians. PDF is the familiar document format, providing you with a convenient "document" of the pathway. Protégé is an extensible, platform-independent environment for creating and editing ontologies and knowledge bases; this download is likely to be of interest to those wishing to extend Reactome functionality.

The complete Reactome textbook of biological processes in PDF or RTF format, the complete set of human reactions in Reactome (in SBML or BioPAX level 2 or 3 format), and a list of human protein-protein interaction pairs are available to download from the Download page, linked to the Menu Bar on the Reactome homepage.

To find out about how Reactome generates SBML, see SBML At Reactome.

For general information about SBML, click here. For more information about BioPAX click here.
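
As an illustration of what can be done with a pathway exported through the [SBML] link, the Python sketch below lists the species and reactions in an SBML document using only the standard library. It relies on the standard SBML `species` and `reaction` elements and ignores the XML namespace. The `SAMPLE_SBML` text is a hypothetical toy document, not actual Reactome output, and `summarize_sbml` is a name introduced here; a real export may carry additional elements and annotations.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Hypothetical toy document standing in for a file saved from the [SBML] link.
SAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_pathway">
    <listOfCompartments>
      <compartment id="c" name="cytosol"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="s1" name="Ligand" compartment="c"/>
      <species id="s2" name="Receptor" compartment="c"/>
      <species id="s3" name="Ligand:Receptor" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1" name="Ligand binds Receptor"/>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize_sbml(xml_text: str) -> dict[str, list[str]]:
    """Collect species and reaction names, matching elements in any namespace."""
    root = ET.fromstring(xml_text)
    # "{*}" is the ElementTree namespace wildcard (Python 3.8 and later).
    species = root.findall(".//{*}species")
    reactions = root.findall(".//{*}reaction")
    return {
        "species": [s.get("name", s.get("id")) for s in species],
        "reactions": [r.get("name", r.get("id")) for r in reactions],
    }


summary = summarize_sbml(SAMPLE_SBML)
print(len(summary["species"]), "species;", len(summary["reactions"]), "reaction(s)")
```

For anything beyond a quick inventory, a dedicated SBML library is likely a better choice than hand-written XML parsing.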

Reaction Details

Reaction details

Reaction details example


Included in the details are a summary of the reaction and possibly a figure. Reactions must contain literature references that contain experimental data verifying the reaction, or an 'Inferred From' link if the reaction has been manually inferred from experimental data from model organisms. For pathway details see the previous section.

Other details within this section include:

Authored
the expert biologists that contributed materials that allowed this pathway to be created in Reactome.
Reviewed
the expert biologists that verified the content for this pathway.
Input/Output
identifies the input/output molecules, sets or complexes for this reaction. Icons to the right of these named items link to further information in external resources, e.g. the red U on a grey background links to UniProt. Other linked resources include ENSEMBL, EC, Entrez Gene, HAPMAP, IntEnz, KEGG, OMIM, CTD Gene, RefSeq:NM, RefSeq:NP, UCSC, Protein Data Bank, BioGPS, Brenda, dbSNP, ChEBI, COMPOUND, PubChem Substance and DOCK Blaster.
Icons associated with input/output molecules provide hyperlinks to many external data resources
Catalyst (when relevant)
the protein or complex that catalyzes the reaction.
Essential catalyst component (when relevant)
if the catalyst is a complex, the component, or domain in a simple catalyst, that enables the reaction to occur.
GO molecular function
Gene Ontology term that represents the activity of a catalyst or transporter within the reaction. For further description of the Gene Ontology, click here. A description of the GO term can be viewed via the GOID which links out to the QuickGO Gene Ontology browser.
Preceding event(s)
A list of events that occur immediately before the event being viewed.
Section of Reaction details showing preceding event(s)
Following event(s)
A list of events that occur immediately after the event being viewed.
Cellular compartment
The location within the cell where the reaction occurs.
References
The research publications that describe the evidence supporting the reaction, each hyperlinked to its PubMed abstract (when applicable).
Taxon
The species in which the event occurs.
This event is deduced on the basis of event(s) in other organism(s)
This indicates that the event has not been experimentally demonstrated in humans, but has been inferred on the basis of data acquired for another species. The link points to a web page containing the experimental data for the reaction in that species.
Equivalent event(s) in other organism(s)
Provides links to descriptions of the events in other species that are either confirmed to occur in a very similar way in both species, or have been electronically inferred.

Note that you can view the descriptions of these "equivalent events" in the bottom pane of the web page. It is possible to navigate directly to the corresponding pathway diagram for the non-human species by selecting the event in the description section. To view the diagram for a non-human species pathway, you may also select the species and the pathway in the pathway hierarchy panel.
Note that, when viewing pathways that have been electronically inferred in Gallus gallus, it is currently not possible to view the hierarchy of inferred events in the left panel. By default, the manually curated pathways for Gallus are always listed in the event hierarchy. To navigate within these inferred Gallus pathway diagrams, you can select individual reaction nodes or reaction participants and see their descriptions in the details panel.

Section of the Details panel displaying "Equivalent event(s) in other organism" of the reaction.


Download reaction in one of the formats
This slot provides export options for the reaction being displayed. These are:
  • [SBML]. An exchange format used by systems biologists for their models.
  • [BioPAX2]. An exchange format used by systems biologists for their models.
  • [BioPAX3]. An exchange format used by systems biologists for their models.
  • [PDF]. Text dump of the pathway, organized to look like a research report.
  • [Protege]. A format used for ontology exchange.

SBML and BioPAX are exchange formats of interest to bioinformaticians. PDF is the familiar document format, providing you with a convenient "document" of the pathway. Protégé is an extensible, platform-independent environment for creating and editing ontologies and knowledge bases; this download is likely to be of interest to those wishing to extend Reactome functionality.

The complete Reactome textbook of biological processes in PDF or RTF format, the complete set of human reactions in Reactome (in SBML or BioPAX level 2 or 3 format), and a list of human protein-protein interaction pairs are available to download from the Download page, linked to the Menu Bar on the Reactome homepage.

To find out about how Reactome generates SBML, see SBML At Reactome.

For general information about SBML, click here. For more information about BioPAX click here.

Protein, Small Molecule, Complex and Set Details

The details of any protein, small molecule, complex or set represented in the pathway diagrams can be displayed in the Details panel by selecting the object within the diagram. For pathway details see the previous section. These views are all very similar; an example of protein details is represented below.

Fields within the details for proteins include:

Links to corresponding entries in other databases
Cross-reference to identifiers used for this molecule in external reference databases, with hyperlinks to the record.
Other identifiers related to this sequence
Additional external identifiers associated with this molecule
Reference entity
This identifies the primary external source used to derive the Reactome record for this molecule. The preferred source for proteins in Reactome is UniProt.
Cellular compartment
The cellular compartment that contains this molecule, with hyperlink to the corresponding GO term.
Component of
If this molecule is part of any complexes, they are listed here
Consumed by events
If this molecule is 'consumed', e.g. becomes incorporated into a complex or degraded, the appropriate reactions are listed.
Other forms of this molecule
If this molecule exists within Reactome in an alternate, post-translationally modified form, the appropriate molecules are listed.
Entities deduced on the basis of this entity
Lists equivalent molecules in other species if these have been inferred to exist. A description of the inference process can be found here.

Molecules that are catalysts will have these additional fields:

Biochemical activities
Identifies the GO molecular activity term(s) associated with the activity of the catalyst
Catalyses events
Lists reactions that include this molecule as a catalyst

The details for complexes and sets will include:

Hierarchical view of the components
A tool for examining the hierarchy of objects within a set or complex. This functions in a similar way to the pathway hierarchy. Clicking on a name will update the Details panel with details of the selected object. Note that you cannot subsequently return to the previous details without re-selecting the object on the pathway diagram.
Hierarchical view of the components in a set, with hierarchy type symbols shown

Clicking on the Show/Hide symbols button reveals/hides symbols that indicate the status of the objects within the complex or set.

  • Objects within a complex have a blue 'component' symbol.
  • Objects within a set may have red or yellow symbols for known member and possible member respectively. Possible member is used for sets that have 'candidates' - members of the set that have not been proven experimentally to participate in the reaction, but are widely believed to have properties identical to the known members. This is typically used to indicate that close family members are considered to have similar functions to their experimentally determined relatives.
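
The hierarchical view of complexes and sets can be thought of as a nested structure in which each object carries a role: 'component' inside a complex, and 'member' or 'candidate' inside a set. The Python sketch below flattens such a structure into a list of individual molecules, similar in spirit to a participating-molecules list. The `Entity` class, the role strings and the example names are hypothetical and are not Reactome's data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A protein, small molecule, complex or set (hypothetical model)."""
    name: str
    kind: str = "protein"  # "protein", "small molecule", "complex" or "set"
    # Each part is (role, entity); role is "component", "member" or "candidate".
    parts: list[tuple[str, Entity]] = field(default_factory=list)


def participating_molecules(entity: Entity) -> list[str]:
    """Flatten a complex or set into the names of its individual molecules."""
    if not entity.parts:
        return [entity.name]
    names: list[str] = []
    for _role, part in entity.parts:
        for name in participating_molecules(part):
            if name not in names:  # keep first occurrence, preserve order
                names.append(name)
    return names


# Hypothetical example: a complex of a protein set and a small molecule.
kinase_a = Entity("Kinase A")
kinase_b = Entity("Kinase B")
family = Entity("Kinase family", "set", [("member", kinase_a), ("candidate", kinase_b)])
atp = Entity("ATP", "small molecule")
bound = Entity("Kinase family:ATP", "complex", [("component", family), ("component", atp)])

print(participating_molecules(bound))
```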

Context Sensitive Menus for Molecules in Pathway Diagrams

Within pathway diagrams right-clicking on the box representing a molecule, complex, set or reaction presents the user with a menu or list of features dependent on the nature of the item selected. Note that unavailable options do not appear, so very few items will have the full range of options.

Right-click Menu Items
Other Pathways
Lists other pathways that include the selected item as a participant. Clicking on any of the pathway names will display that pathway in the Pathway Diagram Panel.


Popup menu with Other Pathways highlighted


Display Interactors
Surrounds the selected diagram item with a set of boxes representing protein-protein or protein-small molecule interactors. For more information see Molecular Interaction Overlay. N.B. Interactors can only be displayed for individual proteins, not for complexes or molecule sets.
Hide Interactors
Removes interactors from the selected diagram item, unless it is an interactor of another item in the display. For more information see Molecular Interaction Overlay.
Popup menu with Display Interactors highlighted
Export Interactors
Exports all interactors of the selected diagram item from the pre-selected source interaction database and displays them in a new browser tab/window. Interaction data is displayed in PSI-MITAB format. For more information see Molecular Interaction Overlay.
Participating Molecules
Lists the component molecules for the selected item, e.g. lists the proteins that make up a complex. Clicking on any molecule in that list will open a new window with details.
Participating Molecules highlighted on the popup menu
Display Participating Molecules
This popup menu item is displayed following Species Comparison or Expression Analysis. It causes the display of a grid of cells that represent the components of the set or complex. See the relevant sections of the User Guide for more details.
Go To Pathway
Associated with symbol for sub-pathway diagrams, selecting this from the popup menu causes the sub-pathway diagram to be displayed. See Sub-pathway Diagrams
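
The PSI-MITAB output produced by Export Interactors is a tab-delimited text format, so it can be read with a few lines of code. The Python sketch below assumes the 15-column PSI-MI TAB 2.5 layout; the guide does not state which MITAB version the export uses, so treat the column list as an assumption. The short column keys, the `parse_mitab` helper and the identifiers in the sample record are hypothetical.

```python
from __future__ import annotations

# Short keys for the 15 columns of PSI-MI TAB 2.5 (assumed version).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab(text: str) -> list[dict[str, str]]:
    """Parse MITAB text into one dict per interaction, skipping '#' header lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # zip() silently ignores any columns beyond the 15 named here.
        rows.append(dict(zip(MITAB25_COLUMNS, line.split("\t"))))
    return rows


# Hypothetical record; '-' marks an empty field in MITAB.
record = (
    ["uniprotkb:X00001", "uniprotkb:X00002"]
    + ["-"] * 7
    + ["taxid:9606", "taxid:9606"]
    + ["-"] * 4
)
sample = "#ID(s) interactor A\tID(s) interactor B\n" + "\t".join(record) + "\n"

for row in parse_mitab(sample):
    print(row["id_a"], "interacts with", row["id_b"])
```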

Navigating Disease Pathway Diagrams

Biological processes are captured in Reactome by identifying the molecules (DNA, RNA, protein, small molecules) involved in them and describing the details of their interactions. From this molecular viewpoint, human disease pathways have three mechanistic causes:

  • the inclusion of microbially-expressed proteins
  • altered functions of human proteins
  • changed expression levels of otherwise functionally normal human proteins

For some Reactome disease pathways, such as "Signaling by EGFR in cancer", it is helpful to describe the effects of disease related reactions and entities on the normal human pathway. In this case, disease related events and normal events are grouped together in the pathway hierarchy and the disease events (reactions and associated molecular entities) are overlaid on top of the diagram of the normal human pathway that they affect. This "merged" diagram can be viewed by clicking on the "parent" disease pathway. Disease related entities and reactions are outlined in red whereas the normal events and entities are "greyed out". In the event hierarchy, the parent pathway contains both the normal pathway (Signaling by EGFR) and a pathway containing only the disease associated reactions (Signaling by constitutively active EGFR).

Parent Disease pathway

Clicking on the "normal" subpathway hides the disease entities and reactions entirely from the diagram, allowing the user to browse just the normal events.

Normal pathway

Clicking on the "disease" subpathway shows the merged (disease/normal) diagram again. The disease reaction nodes are highlighted with green boxes. Disease-related pathways, reactions, and entities may be flagged, in the details section, with the most relevant EBI Human disease ontology term. Please note that it is not currently possible to link to the Human disease ontology website directly from the disease-tagged entity/event, but this feature will appear in the V40 release.

Disease pathway


Some disease pathways in Reactome have not been grouped, in the hierarchy, with a separately annotated "normal" pathway and diagram. These include infectious disease pathways (and the Amyloids pathway) in which the majority of annotated reactions involve pathogen (abnormal) proteins only and/or involve interactions among pathogen (abnormal) and human proteins. These disease pathways have each been placed in a single diagram. Red lines are used to emphasize disease events. Reactions involved in disease or disease progression are highlighted in red as are any entities that are disease-associated or derived from another species (pathogen). Host/human or normal entities are black. Complexes that contain both host and disease associated/pathogen derived entities are colored red.

Disease single ELV1.jpg

View Equivalent Pathway In Another Species

Reactome is human-centric and aims to represent human biology. Pathways in other species are electronically inferred from curated human pathways - a description of the inference process can be found here.

To view the predicted conservation of the displayed pathway in another species, use the "Switch Species" dropdown menu at the top of the Pathway Browser Sidebar. Selecting a species causes the pathway diagram to re-draw (a revolving arrow icon on the Pathways tab indicates the diagram is re-drawing). The layout of the diagram is preserved, but any reactions that were not inferred will be absent. The event hierarchy is also updated to display the events present in the selected species. Note, however, that when the species Gallus gallus is selected (this species has both manually curated and electronically inferred pathways in Reactome), it is currently only possible to view the manually curated Gallus pathways. To view the electronically inferred Gallus pathways, you must go to a pathway of interest in human and select the electronically inferred Gallus event in the details section. The inferred Gallus pathway diagram will then be displayed, but the manually curated pathways for Gallus will be listed in the event hierarchy. To navigate within these inferred Gallus pathway diagrams, you can select individual reaction nodes or reaction participants and see their descriptions in the details panel.


A more sophisticated method of comparing pathways between species is available as the Species Comparison tool.


Selection of a species to compare the predicted pathway coverage

Molecular Interaction Overlay

Molecular Interaction (MI) overlay allows protein-protein or protein-compound interactions to be overlaid (superimposed) onto the pathway diagram. The source depends on the currently selected interaction database. The default is IntAct; other sources of interaction data (protein-protein and protein-compound), including a user-supplied list, can be selected using the Analyze, Update & Annotate button on the Menu Bar.

A maximum of 10 interactors are displayed as a ring of blue-bordered boxes connected by blue lines to the selected protein. A white box superimposed onto the selected protein displays the number of interactors up to a value of 50; if more than 50 are available, 50+ will be displayed.
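The display limits described above can be summarised in a short Python sketch. The function name and the choice of which 10 interactors to draw (here simply the first 10 in the list) are assumptions made for illustration; the guide does not say how Reactome selects them.

```python
MAX_DISPLAYED = 10  # interactors drawn in the ring around the selected protein
MAX_COUNTED = 50    # highest number shown in the white count box


def interactor_display(interactors):
    """Return (interactors to draw, label for the count box).

    Assumption: the first MAX_DISPLAYED entries are the ones drawn.
    """
    shown = interactors[:MAX_DISPLAYED]
    total = len(interactors)
    label = f"{MAX_COUNTED}+" if total > MAX_COUNTED else str(total)
    return shown, label
```

A protein with 73 interactors would therefore be drawn with 10 boxes and a count box reading 50+.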

Several items can be sequentially selected for interactor display. If the same interactor is connected to more than one item it is re-used, i.e. connected to all the selected items in the diagram.

Interactors displayed for Syk.
  • Hovering the mouse pointer over a protein interactor produces a pop-up containing the name and identifier of the protein.
  • Hovering over a chemical interactor displays the formula and name of the chemical.
  • Clicking on a protein interactor opens the UniProt entry for that protein in a new window.
  • Clicking on a chemical interactor opens up the ChEBI entry for that chemical in a new window.
  • Clicking on the line that connects the interactor to the pathway item opens a new window containing details of the interaction at the source database (see example below). Please note that the link to the source database does not currently work in Internet Explorer 8.


The IntAct entry for the interaction between Syk and CD18


Details of up to 50 interactors for every protein in the pathway diagram can be viewed as a table accessed via the Analyze, Update & Annotate button. An extended version of this table with no limit to the number of interactors per protein can be exported. See Analyze, Update & Annotate button for details.

While interactors are displayed, right clicking on the selected protein produces two new options, Hide Interactors which removes them, and Export Interactors. Interactions are exported in PSI-MITAB format.

The Analyze, Update & Annotate button

This button, located on the Menu bar across the top of the Pathway Browser, opens a control panel that has several functions. It has two tabs - one allows configuration of Molecular Interactions, the other contains tools for overlaying expression data onto the pathway diagram and for generating an alternative inter-species pathway-comparison view.


The Analyze, Update & Annotate control panel with MI Overlay tab selected
MI Overlay tab

This tab contains the following features:

  • Interaction database - a dropdown list allows selection of the source of interactors. This is automatically populated by querying the PSICQUIC Registry. If a new database is selected while interactors are displayed on the pathway diagram, the proteins represented by those items will be used as queries in the new database and the display will automatically update. If the table of interactions is displayed, it will automatically update to list the interactions found in the new database.
  • Upload a file - allows a user supplied list to be used as the source of interactors. Data must be in PSI-MITAB format though the only columns that need to be filled in are the accession number, gene name and confidence score columns (confidence score is only necessary if you want to use the color interactions feature). If the upload is successful, the label you submit when prompted will appear on the 'Interaction Database' drop down and can be selected as a source of interactions. A data set submitted in this manner will persist for the user session.
Upload dialogue
  • Clear overlay - removes all interactors from the pathway diagram.
  • Submit a New PSICQUIC service - can be used to add a new source interactions database. The URL field should contain the URL of the PSICQUIC service (REST interface). A service added this way will appear on the interaction database drop down for a period of three days, assuming you connect from the same computer.
  • Set Confidence Level Threshold - this allows the user to set a confidence threshold used for colouring interactors. The default threshold is 0.5. Interactions with a confidence level below this threshold will be coloured according to the colour set as the 'Below' colour; interactions with confidence scores equal to or above the threshold will be coloured with the 'Above' colour.
  • Colour button - activates colouring of the pathway diagram - interactions will be coloured only if a confidence score is available at the source database. The colours used can be changed by clicking on the coloured squares for 'Above' and 'Below'. A dialog allows selection of an alternative colour. Click 'Apply' to update the colours displayed.
  • Colouring Off button - removes colouring.
Interactions coloured based on confidence scores: the interaction with ABL exceeds the threshold and is coloured green; other interactions are below the threshold and are coloured red.
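The threshold rule for interaction colouring can be written as a small Python sketch. The function name is hypothetical, and the green/red defaults are taken from the figure caption above rather than from any documented default; an interaction without a confidence score is left uncoloured, as the Colour button description states.

```python
def interaction_colour(score, threshold=0.5, above="green", below="red"):
    """Pick a colour for one interaction, or None if it has no score."""
    if score is None:
        return None  # no confidence score at the source database
    # Scores equal to the threshold count as 'Above'.
    return above if score >= threshold else below
```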

The colours used for interaction colouring can be changed by clicking on the coloured squares for 'Above' and 'Below'. A dialog allows selection of an alternative colour; click 'Apply' to update the colours displayed.

Interaction colouring selection
  • Display/Hide Table of all interactors for pathway - this button switches on/off the table of pathway item-interactor pairs.
    • Clicking on the blue 'toggle' squares within this table will cause the pathway diagram to centre on the protein represented in the first column and display its interactors.
    • All entries (rows in the table) for that pathway protein will be shaded a pale blue when interactors are displayed. Clicking the toggle for a particular pathway protein a second time will hide the interactors and its table entries will be coloured white.
    • Clicking on a pathway protein ID (listed in the second column of the table) will re-centre the pathway diagram on that protein.


Table view of interactors. The first column contains a toggle that controls the display of interactors. Blue-shaded rows indicate a protein whose interactors are currently displayed. The second column identifies proteins in the pathway. The third column lists the interactors retrieved from the current interaction database (in this case IntAct) for each protein in the pathway.
  • Export all interactors for pathway - exports a full list of interactors for each protein on the pathway diagram, in PSI-MITAB format, as a new window.
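For the 'Upload a file' option above, the following Python sketch builds one row in the 15-column PSI-MITAB 2.5 layout, filling only the accession, gene name and confidence score columns and leaving the rest as '-'. The accessions and gene names are placeholders, the 'score' label in the confidence column is illustrative, and it is an assumption that these are exactly the columns the Reactome uploader reads.

```python
MITAB_COLUMNS = 15  # PSI-MITAB 2.5 defines 15 tab-separated columns


def mitab_row(id_a, id_b, gene_a, gene_b, score=None):
    """Build a minimal MITAB 2.5 row; unused columns are set to '-'."""
    cols = ["-"] * MITAB_COLUMNS
    cols[0] = f"uniprotkb:{id_a}"                # column 1: identifier of interactor A
    cols[1] = f"uniprotkb:{id_b}"                # column 2: identifier of interactor B
    cols[4] = f"uniprotkb:{gene_a}(gene name)"   # column 5: aliases of interactor A
    cols[5] = f"uniprotkb:{gene_b}(gene name)"   # column 6: aliases of interactor B
    if score is not None:
        cols[14] = f"score:{score}"              # column 15: confidence (illustrative label)
    return "\t".join(cols)


# Placeholder values only, not real interaction data.
print(mitab_row("P12345", "Q99999", "GENE_A", "GENE_B", score=0.8))
```

The confidence column is only needed if you intend to use the colour interactions feature.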
The Expression & Species tab

This tab controls the colouring of pathways according to expression values or generation of an alternative species comparison view for the pathway.

Expression Analysis - expression painter

To activate colouring by expression values, the user must submit a file of protein identifiers and numeric values, typically expression levels. The first column must be a protein identifier, ideally a UniProt ID or another identifier that can be mapped to proteins, e.g. Affymetrix or Illumina probe IDs. Many ID types are acceptable; see the User Guide for a complete list. The second and any subsequent columns contain numeric (expression) values. Each column is treated as a new 'sample' and used to generate an independently-coloured pathway diagram. These images can be viewed sequentially as a 'movie'. This is particularly useful for timepoints or disease progressions. Use the Browse button to identify the file, and once the name is displayed, click the Submit button to upload. To 'paint' the pathway diagram, click the Expression Painting On checkbox (if the file did not upload this will produce a warning message). This will overlay colours representing the numeric values on the pathway diagram (see below).
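To illustrate how each numeric column becomes an independent 'sample', here is a minimal Python sketch that reads a tab-separated table into one dictionary per column. It assumes the first row is a header naming the samples; the function name, the sample names and the second identifier are hypothetical.

```python
import csv
import io


def read_expression_table(text):
    """Return {sample name: {identifier: value}} from tab-separated text.

    Assumes a header row; the first column holds the protein identifiers.
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    header, body = rows[0], rows[1:]
    samples = {name: {} for name in header[1:]}
    for row in body:
        identifier, values = row[0], row[1:]
        for name, value in zip(header[1:], values):
            samples[name][identifier] = float(value)
    return samples


# Hypothetical two-timepoint dataset.
example = "ID\t0h\t6h\nP12345\t1.5\t3.0\nP67890\t2.0\t0.5\n"
samples = read_expression_table(example)
print(list(samples))
```

Each entry of the result corresponds to one frame of the 'movie' described above.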


Pathway diagram with proteins coloured by expression values following Expression Analysis.

Proteins in the pathway diagram are coloured according to their associated numeric values (typically expression levels, but could be differential expression, or any other measure). The colours form a continuous spectrum from red for the highest values to dark blue for the lowest values. The scale automatically adjusts to fit the range represented in the dataset. The protein identifier and numeric value are overlaid onto the protein box. Grey boxes are proteins (or small molecules) with no associated values in the input data. Black nodes represent complexes that include at least one protein represented by numeric data.

The values associated with each component of a complex can be viewed; right click on the complex and select 'Display Participating Molecules'. A popup is displayed representing the complex as a grid of coloured cells corresponding to proteins within the complex. Grey cells represent proteins with no associated numeric values in the data. Hovering the mouse pointer over a cell displays the name of the protein. Clicking on this name opens Reactome details for that protein. The grid can be closed by clicking the 'x' in the top right corner.

At the bottom of the diagram a pale blue Experiment Browser toolbar is displayed. This allows you to step through timepoints or experiments if more than one column of numeric values was included in the submitted data. Move between these by pressing the arrow buttons. The header of the data column is displayed between these arrows. The pathway diagram will re-colour to reflect the new numeric values. To turn off expression painting mode, uncheck the box to the right side of the Analyze, Update & Annotate button.
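The self-adjusting colour scale can be sketched in Python as follows. This is a simplification under stated assumptions: it blends linearly between one dark blue and one red RGB value, whereas the real spectrum presumably passes through intermediate hues, and the exact colours and function names are hypothetical.

```python
def value_to_rgb(value, lowest, highest):
    """Blend linearly from dark blue (lowest) to red (highest)."""
    low_rgb, high_rgb = (0, 0, 139), (255, 0, 0)  # assumed end colours
    if highest == lowest:
        fraction = 0.5  # a single-valued dataset has no range to scale over
    else:
        fraction = (value - lowest) / (highest - lowest)
    return tuple(round(lo + fraction * (hi - lo)) for lo, hi in zip(low_rgb, high_rgb))


def colour_sample(sample):
    """Colour every protein in one sample, scaled to that sample's range."""
    lowest, highest = min(sample.values()), max(sample.values())
    return {pid: value_to_rgb(v, lowest, highest) for pid, v in sample.items()}
```

Because the range is taken from the submitted data, the same value can receive a different colour in a dataset with a different minimum or maximum.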

For more details see Expression Analysis.

Species comparison - Other species view

To produce the Other species pathway comparison view, select a species from the dropdown list. The pathway diagram will be coloured according to the results of Reactome's pre-computed inference of equivalent reactions in the non-human species. Objects on the pathway diagram are colour coded:

  • Yellow indicates that the protein's orthologue is present in the comparison species.
  • Blue indicates that the protein is only known in human, no orthologue could be found in the comparison species.
  • Grey indicates that the molecule was not inferred to exist in the comparison species, but is also used for small molecules where this comparison is not relevant.
  • Black means the entity is a complex. Right click on the complex to reveal a grid representing the proteins in that complex:
Detail grid for a complex following Species comparison

Within the grid each cell represents one of the proteins in that complex; hover the mouse over it to see its name. The cells are colour coded as described above. N.B. The grid is always 3 cells wide; pale grey is the background, not a protein 'missing' in the comparison species.
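The fixed-width grid can be pictured with a small Python sketch. It assumes that unused positions fall at the end of the last row and uses None to stand for the pale grey background; both choices, and the function name, are illustrative only.

```python
GRID_WIDTH = 3
BACKGROUND = None  # stands for a pale grey filler cell, not a protein


def complex_grid(cells):
    """Arrange a list of protein cell colours into rows of GRID_WIDTH cells."""
    rows = [cells[i:i + GRID_WIDTH] for i in range(0, len(cells), GRID_WIDTH)]
    if rows:
        # Pad the last row so that every row is exactly GRID_WIDTH wide.
        rows[-1] = rows[-1] + [BACKGROUND] * (GRID_WIDTH - len(rows[-1]))
    return rows
```

A complex of four proteins thus produces two rows, the second containing one protein cell and two background cells.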

For more details see Species Comparison.

Searching Reactome

A number of options are available for searching Reactome, depending on your requirements. Most users, most of the time, will probably find the "simple search" quite sufficient. This is available from the Reactome home page. If you need to make more complex, logical, queries, then you will need to use the Advanced search.

Simple searches

The simple text search tool is located top left of the homepage. Type the term of interest, and then click on the "Search" button. You will be presented with a results page similar to this:

Text search.png

Results have an associated type: Reaction; Pathway; Protein or Other (includes literature). Type is indicated by an icon and a type name preceding the title of the result. Click on the title to go to the corresponding Reactome web page. Most results will have descriptive text details. Your search terms will be highlighted if they appear within the title or descriptive text.

At the top of the results page is a set of tick boxes, the Type Selector Bar. This includes a count by type (pathways, reactions, proteins, other) for the displayed results, and allows you to limit results by type. Uncheck boxes for type categories that you don't want to see, e.g. if you only want results of type Proteins, uncheck Pathways, Reactions and Others, then click the Show button.


Ten hits are displayed per page. At the bottom of the page is a navigation tool to see additional hits. You can also display all the results in a single page by clicking 'Show all results'.

Each result returned is clickable and will take you to the appropriate Reactome pathway diagram when selected.


Advanced Search

The Advanced Search form can be accessed under Tools on the Reactome homepage Menu Bar, visible from all pages except the Pathway Browser. This search method allows the combination of query terms for exact phrases, any of a list of words, all of a list of words (but in any order), regular expressions, the NOT logical operator, any value or no value. Query terms are joined by 'AND'. This can generate powerful and highly specific searches that make use of fields in the Reactome database schema. To make full use of the Advanced Search we recommend that you read the document Data Model (available under Documents from the Menu Bar on the Reactome homepage) and refer to the Database Schema (linked to the Data Model document and under Contents from the Menu Bar).

Advanced Search form. This example would identify Reactome items (pathway events or molecules) with names that include the words serotonin or dopamine, where the cellular compartment matches the exact phrase 'Plasma membrane' and the database has no value for 'inferredFrom' (therefore any reactions or pathways in the results will be based on human experimental data, not inferred from a model organism).


Small Molecule Search Tool

This is accessed via the Menu bar on the Reactome homepage, select Tools, Small molecule search. This will open the Small molecule search page where you can specify the small molecule by name or by structure. Structures can be hand-drawn, or pasted as a SMILES string. Options include search for molecules that contain this structure or those that resemble this structure. Click on the Search button to start the search. The tool will search the ChEBI database. A list of matching compounds is returned, with links to the appropriate Reactome pathways.

Reactome Tools

Pathway Analysis

The Pathway Analysis tool has two alternative functions. In ID mapping mode it takes a user-supplied set of gene or protein identifiers and shows the Reactome pathways they match. In Over-representation analysis mode, it takes a user-supplied set of gene or protein identifiers and performs a statistical test to determine whether any Reactome pathways are over-represented (enriched) in the submitted data, i.e. it answers the question 'does the list represent the proteins within a specific pathway more than would be expected if the set were random?'. A one-tailed Fisher's exact test is used to calculate the probability. Note that p-values are not corrected for the multiple testing that arises from evaluating the submitted list of identifiers against every pathway.
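As an illustration of the statistic, a one-tailed Fisher's exact test for over-representation is equivalent to the upper tail of a hypergeometric distribution, which the Python sketch below computes with the standard library. The function and parameter names are hypothetical, and the choice of background (the total number of proteins the list is drawn from) is an assumption; the guide does not state what Reactome uses.

```python
from math import comb


def overrepresentation_p(hits, list_size, pathway_size, background_size):
    """P(at least `hits` of the submitted proteins fall in the pathway by chance).

    hits            -- submitted proteins found in the pathway
    list_size       -- number of submitted proteins
    pathway_size    -- number of proteins in the pathway
    background_size -- number of proteins in the assumed background
    """
    max_hits = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

Because the result is uncorrected, a p-value from this calculation should be read with the number of pathways tested in mind.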

Pathway Analysis is launched by pressing the "Pathway Analysis" button on the sidebar, on the left-hand side of the homepage.

The tool opens as a data entry page with a box for pasting your list of protein identifiers (one per line). Alternatively, use the Browse button to locate a file. Valid identifiers include UniProt accession numbers and IDs, GenBank/EMBL/DDBJ, RefPep, RefSeq, EntrezGene, MIM and InterPro IDs, Affymetrix and Agilent probe IDs, Ensembl protein, transcript and gene identifiers. Identifiers that contain only numbers, such as OMIM and EntrezGene IDs, must be prefixed by the source database name and a colon, e.g. Uniprot:P12345, MIM:602544, EntrezGene:55718. Lists with mixed identifier types can be used. Mixed species lists can be used, but only identifiers from the most frequent species are submitted for analysis.
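If your list contains purely numeric identifiers, a few lines of Python can add the required prefix before you paste it in. This is only a convenience sketch with a hypothetical function name; you still need to know which database the numeric identifiers come from.

```python
def prefix_numeric_ids(identifiers, database):
    """Prefix purely numeric identifiers with '<database>:'; leave others unchanged."""
    prefixed = []
    for raw in identifiers:
        identifier = raw.strip()
        if not identifier:
            continue  # skip blank lines
        if identifier.isdigit():
            identifier = f"{database}:{identifier}"
        prefixed.append(identifier)
    return prefixed


print("\n".join(prefix_numeric_ids(["602544", "P12345"], "MIM")))
```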


Pathway analysis.png

You can type or paste your IDs into the text area provided, or upload a file of identifiers from your computer using the "Browse" button.

If you would just like to see what the expected data format is, or you would like to try the tool without submitting your own data, click on the "Example" button, and a test dataset will automatically be loaded for you.

Select one of the two radio buttons to determine whether simple ID matching or over-representation analysis will be performed. Further options may be added in the future.

Click the Analyse button to begin the analysis. This may take several minutes, depending on the size of the data set submitted and load on the server. While the analysis is in progress a progress indicator is displayed:

Progress bar.png


The subsequent subsections will discuss the results pages produced by Pathway Analysis.

ID mapping and pathway assignment

If you selected "ID mapping and pathway assignment" analysis (the default setting), this is how to interpret the results. Typically, you will end up with a table like this:

An example Pathway Analysis result with the default 'ID mapping and pathway assignment' selected

The table represents every protein identified from the submitted list. If the ID was not recognized, it is displayed in column 1 but all other columns will be blank. The columns can be sorted. Click once on the title of the column. After a while, a small white arrow appears next to the column title, indicating the direction of the sort. Click the title a second time to sort in the opposite direction. N.B. Large datasets may take a couple of minutes to sort – please be patient!

The columns represent:

  1. The IDs you supplied.
  2. The corresponding UniProt ID.
  3. Species.
  4. List of names of pathways in which this protein is found.


Clicking a pathway name opens the Pathway Browser (in a new window) and displays the appropriate pathway diagram. The protein corresponding to the ID in the pathway analysis results table will be highlighted in the pathway diagram. See Pathway Diagrams and Navigating Pathway Diagrams to learn how to interpret and navigate these diagrams.

The table of results can be exported in several formats using the Download button at the top of the table.

Overrepresentation analysis

If the option 'Overrepresentation analysis' was selected, the top of the Pathway Analysis results will look something like those below:

Overrep.png

This uppermost results section, 'Statistically over-represented events in hierarchy', represents all Reactome pathways that contain proteins identified from the submitted list.

N.B. Clicking on a pathway name navigates away from the results and opens the pathway diagram in the Pathway Browser.

The colour used to highlight the pathway name indicates the level of over-representation, i.e. the bias in your dataset towards proteins in that pathway; the warmer the color, the higher the level of overrepresentation in the pathway. To the right of the pathway name is the p-value, followed by the number of proteins from the submitted set that matched the pathway/the total number of proteins in the pathway.

The order of pathways is also determined by p-value though this may not be immediately obvious:

By default, over-representation analysis results display only 'top-level' pathways, i.e. the list of pathways seen in the Table of Contents or Pathway Browser Pathways tab when the list is unexpanded. These pathways represent the top of a hierarchical tree, and may contain sub-pathways with lower p-values that might therefore be of greater interest. To draw attention to sub-pathways with highly significant scores, the list of top-level pathways is ordered by the most significant p-value within its hierarchy. In the image above, the pathway 'Cell Cycle, Mitotic' highlighted in dark blue contains the sub-pathway 'G1/S transition' highlighted in yellow, which has a considerably lower p-value and consequently has pushed 'Cell Cycle, Mitotic' up the list.
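The ordering rule can be sketched in Python: each top-level pathway is ranked by the smallest p-value found anywhere in its own hierarchy. The dictionary layout, the function names and the p-values are invented for illustration; only the two pathway names come from the example above.

```python
def best_p_value(pathway):
    """Smallest p-value of a pathway or any of its sub-pathways."""
    child_values = [best_p_value(child) for child in pathway.get("children", [])]
    return min([pathway["p"]] + child_values)


def order_top_level(pathways):
    """Sort top-level pathways by the most significant p-value in their hierarchy."""
    return sorted(pathways, key=best_p_value)


# Hypothetical p-values.
example = [
    {"name": "Other top-level pathway", "p": 0.01},
    {"name": "Cell Cycle, Mitotic", "p": 0.2,
     "children": [{"name": "G1/S transition", "p": 0.0004}]},
]
print([p["name"] for p in order_top_level(example)])
```

Here 'Cell Cycle, Mitotic' is listed first even though its own p-value is the larger of the two, because its sub-pathway carries the smallest p-value overall.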

The top-level pathways, and any sub-pathways they contain, can be expanded using the + symbols, or alternatively, the entire hierarchy can be expanded/compressed using the Open All and Close All buttons.

For each pathway, there is a 'Matching identifiers' list of the identifiers and associated proteins that contributed to the over-representation score. Clicking on the + symbol next to the pathway name reveals the list.

Matching identifiers.png

The second section 'Statistically over-represented events as an ordered list' gives the same information in a downloadable tabular form.

The third section, 'Reactions coloured according to the number of genes or compounds (as specified by the submitted list of identifiers) participating in the given reaction' is a map of all reactions coloured by the number of participants in the reaction that were included in the submitted list. Reactions with no participants represented in the list are coloured grey. Additional sections allow downloads of the graphics, and a breakdown of the mapping from submitted identifiers to reactions, including those that did not match - this is useful to identify problems with the list, or identify proteins that could not contribute to the over-representation scores.

Species Comparison

Reactome uses manually-curated human pathways to electronically 'infer' their equivalents in 19 other species. A full description of the inference process can be found here. The Species Comparison tool allows you to compare human pathways with these predicted pathways, to see what is common to both or perhaps missing in the model organism.

Species Comparison is launched using a button on the sidebar, on the left side of the homepage. On the resulting page is a selection tool that reveals a dropdown list of species.


Species comparison.png

Choose one and click the Apply button. It may take some minutes before the results appear. The results page will look something like this:

Species comparison table.png

The table contains one row for each Reactome pathway. The columns represent:

  1. Pathway name.
  2. The species used for comparison with human.
  3. Number of proteins in the human pathway.
  4. Number of proteins inferred to exist in the comparison species.
  5. Graphic representing the ratio of the values in columns 3 and 4.
  6. A View button that launches the Pathway Browser and displays the relevant pathway diagram.

These columns can be sorted by clicking on the title of the column at the top of the table. A small white arrow appears next to the title, indicating the direction in which the column's contents have been sorted. Clicking the title a second time causes the column to be sorted in the opposite direction.

Clicking on a view button launches the Pathway Browser and displays the relevant pathway diagram (see example below).

Species comparison diagram.png

The nodes are colour coded:

  • Yellow indicates that the protein's orthologue is present in the comparison species.
  • Blue indicates that the protein is only known in human, no orthologue could be found in the comparison species.
  • Grey indicates that inference was not possible, used for small molecules and genomic objects that have no UniProt entry (or did not at the time the pathway was constructed).
  • Black means the entity is a complex. Right click on the complex to reveal a grid representing the proteins in that complex:
Speciescomparison complex.png

See the Pathway Diagrams and Navigating Pathway Diagrams sections for more information on pathway diagrams.

Expression Analysis

This tool allows you to visualize user-supplied expression data (or any other numeric value, e.g. differential expression) superimposed onto Reactome pathways.

It is launched using the Expression Analysis button on the Sidebar, on the left-hand side of the homepage. A submission form will appear:

The Expression Analysis submission form with an example dataset

To see an example of the expected data format, or to try the tool without submitting your own data, click on the 'Example' button. You can type or paste your IDs into the text area provided, or upload a file using the Browse button. Microsoft Excel and tab separated text files are both accepted. If tab-separated text is used there must be a newline at the end of each row.

The first column of data must identify the protein, ideally with UniProt IDs. Subsequent columns should be numeric, representing expression levels. The following identifier types are currently supported: UniProt accession numbers and IDs, GenBank/EMBL/DDBJ/IPI protein ids, RefPep, RefSeq, EntrezGene, OMIM, InterPro, Affymetrix, Agilent, Illumina and Ensembl protein, transcript and gene identifiers. All purely numeric identifiers, such as those from OMIM and EntrezGene, must be preceded by the abbreviated database name and a colon, e.g. MIM:602544, EntrezGene:55718.

Having identified the data to submit, click the "Apply" button. The results may take some minutes to appear. The results page will look something like this:

Expression analysis table.png

The table contains one row for each Reactome pathway. The columns represent:

  1. Pathway name
  2. Species
  3. Total number of proteins in the pathway.
  4. Proteins in the pathway represented in the submitted data.
  5. A graphic representing the ratio of the values in columns 3 and 4.
  6. A View button that launches the Pathway Browser and displays the relevant pathway diagram.


The columns can be sorted by clicking on the title of the column at the top of the table. A small white arrow appears next to the title, indicating the direction in which the column's contents have been sorted. Clicking the title a second time causes the column to be sorted in the opposite direction.

Clicking on any of the view buttons launches the Pathway Browser and displays the relevant pathway diagram (see example below).

Expression overlay.png
  • Proteins in the pathway diagram are coloured according to their values.
  • The colours form a continuous spectrum from red for the highest values to dark blue for the lowest values. The scale automatically adjusts to fit the range represented in the dataset.
  • The submitted identifier and value are overlaid onto the protein box.
  • Grey boxes are proteins or small molecules with no associated values in the input data.
  • Black nodes represent complexes that have values for at least one of the proteins. The value associated with each component of a complex can be viewed; right click on the complex and select 'Display Participating Molecules'. A popup is displayed representing the complex as a grid of coloured cells, each representing a protein within the complex. Grey cells represent proteins with no associated numeric values in the data. Hovering the mouse pointer over a cell displays the name of the protein. Clicking on the cell displays details of the protein in the Details pane. The grid can be closed by clicking the 'x' in the top right corner.
Expression complex.png
  • The Experiment Browser toolbar (pale blue, at the bottom of the Pathway Diagram) allows you to step through the columns of your data, e.g. time-points or disease progression. Move between them by pressing the arrow buttons. The header of the data column (if present) is displayed between the arrows. The pathway diagram will re-colour to reflect the new values.
Experiment toolbar.png

See the Pathway Diagrams and Navigating Pathway Diagrams sections for more information on pathway diagrams.

Reactome BioMart

What is Reactome BioMart?

This section gives a basic idea of how Reactome BioMart helps biologists search for data of interest. The Reactome BioMart is a tool that allows fast bulk querying and download of Reactome data and associated data from a number of other databases, including UniProt and Ensembl. BioMart can link queries together, so that the results contain information from more than one database. For example, it is possible to find the Affymetrix IDs associated with the genes in selected Reactome pathways by linking a Reactome query to an Ensembl query.

To access Reactome BioMart, click on the "BioMart: query, link" item in the Tools menu on the navigation bar.

There are two ways to use the Mart: the Reactome canned queries, which can be accessed at the top of the page, and the regular BioMart query interface, which is below the canned query selector.

Entry Page.jpg

Canned Queries

Reactome provides a small set of canned queries. You can use these without needing to understand the details of the BioMart query interface. The canned query selector allows you to choose a canned query. Once you have done so, clicking on the "Go!" button takes you to the page where you enter your data.

The following queries are currently available:

  • Find list of pathways for specific species (single data item). You can use this to list all pathways known to Reactome for a species of your choice.
  • Find list of reactions for specific pathways (multiple data items). If you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the reactions involved in the pathways. If you use this query without submitting any data values, all reactions involved in all known pathways will be returned.
  • Find list of proteins for specific pathways (multiple data items). If you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the proteins involved in the pathways. If you use this query without submitting any data values, all proteins involved in all known pathways will be returned. Proteins are characterized by their UniProt IDs.
  • Find list of complexes for specific proteins (multiple data items). If you have a list of protein UniProt IDs, you can use this canned query to find all of the complexes in Reactome involving those proteins. If you use this query without submitting any data values, all complexes and their associated proteins will be returned.
  • Find list of pathways for specific genes (multiple data items). If you have a list of Entrez gene IDs, you can use this canned query to find all of the pathways in Reactome involving those genes. If you use this query without submitting any data values, all pathways and their associated genes will be returned.
  • Find list of genes for specific pathways (multiple data items). This is the converse of the previous query: if you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the genes involved in the pathways. If you use this query without submitting any data values, all genes involved in all known pathways will be returned. Genes are characterized by their Entrez gene IDs.
  • Find list of reactions for specific genes (multiple data items). If you have a list of Entrez gene IDs, you can use this canned query to find all of the reactions in Reactome involving those genes. If you use this query without submitting any data values, all reactions and their associated genes will be returned.
Canned Query.jpg

The data entry page differs depending on whether a single data item or multiple data items are allowed. If only a single item is allowed, you will be presented with a selector to choose the item, e.g. species.

Canned Query Pathways.jpg

If multiple data items are allowed, you will get a text area, which you can use to enter the items separated by newlines, e.g. a set of UniProt IDs.

Canned Query Pathway Proteins.jpg

In this case you will also see some extra buttons. Above the text field will be the button "Show example". Pressing this causes example values to be loaded into the text area, which you can use for guidance or testing purposes. Below the text field is the "Browse..." button, which allows you to choose a file from your local computer to upload as data. By default, the contents of this file will not be displayed, but you can examine (and edit) it by clicking on the button "Preview file content".
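If your identifiers come from another program, a small script can prepare the upload file. The following is a minimal sketch, assuming one identifier per line as described above; the function name, the output file name and the sample accessions are illustrative and not part of Reactome.

```python
from pathlib import Path
import tempfile


def write_id_file(ids, path):
    """Write identifiers one per line, dropping blanks and duplicates but keeping order."""
    seen = set()
    cleaned = []
    for raw in ids:
        item = raw.strip()
        if item and item not in seen:
            seen.add(item)
            cleaned.append(item)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


# Sample accessions used only as illustrative input values.
example_ids = ["P04637", " P38398", "", "P04637", "P00533"]
out_path = Path(tempfile.gettempdir()) / "uniprot_ids.txt"  # hypothetical file name
written = write_id_file(example_ids, out_path)
print(written)
```

The resulting file can then be chosen with the "Browse..." button and checked with "Preview file content".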

Once you have your data, you can click on "Run query" to get the results. Additionally, there is a button "Reset", which clears the page and allows you to start again, if you wish. If you do not enter any data, then the query will be performed over all known data items.

Once the query has been performed, the results are presented in a regular BioMart results page. This allows you to export the results as tab-separated values or as an Excel file, and additionally, to perform more complex queries. See below for more details.

Regular BioMart Query Interface

The regular BioMart query interface is situated directly below the canned query selector. It is powered by the BioMart engine and you can find full documentation at www.biomart.org .

On the right hand side of the page, you can select database and dataset. In addition to the Reactome database, there are a number of other databases available, currently UniProt, ENSEMBL and PRIDE.

Select Database.jpg

Reactome provides three datasets, "complex", "pathway" and "reaction". The interactions dataset is a test dataset. Choose the one most appropriate to the kind of query you want to make. E.g. if you would like to find all pathways associated with a given GO accession, start by selecting the "pathway" dataset.

Select Dataset.jpg

Filters are also split into categories, but in a different way to the attributes. The first one is labelled "Limit to ... containing these IDs". This allows you to enter a list of IDs that will restrict the results returned by the query. E.g. if you have selected the "complex" dataset, you can supply a list of UniProt IDs. This returns only those complexes containing the proteins corresponding to the UniProt IDs.

Regular Query Filters.jpg

The next filter is labelled simply "Species". If you do not use this filter, then the results will contain information from all species known to Reactome. If you select a species, then the results will be restricted to the (single) chosen species.

For the "pathway" and "reaction" datasets, the next filter will be labelled "GO accession". This allows you to enter a GO biological process accession number, and restrict the results to reactions or pathways containing that accession.

Finally, the "Miscellaneous" filter allows you to restrict either by version number or name. Version number means the stable ID version. This is a number that gets incremented every time something gets changed by a curator. E.g. if you are only interested in things that have never been changed, you could set the contents of this slot to "1". If you want to restrict by name, you should note that you need to enter the full name.

The attributes to be displayed in the results table can be selected on the left hand side of the page. They are split into several categories, which are largely the same for all datasets. The first category contains attributes that are taken directly from the dataset itself. E.g. for "complex", you will find things like the stable ID, but also the associated complex name and species name. Subsequent categories may include other Reactome classes you can link to, plus in all cases, DNA, protein and small molecule. This means, for example, that you can show all of the proteins and small molecules associated with the reactions that interest you, assuming you have initially selected the "reaction" dataset.

Regular Query Attributes.jpg

You can use the second "Dataset" link in the left hand panel to choose another dataset to link to. E.g. if you want to find the UniProt Protein existence (type of evidence that supports the existence of the protein) associated with a set of pathways, select "pathway" as your first dataset, then select "UNIPROT (EBI UK)] UNIPROT" as your second dataset. In the second dataset, click on "Attributes" in the leftmost panel, and expand the "Protein attributes" category by clicking the "+" symbol in the right panel. Select the "Protein existence" to include this attribute in the final results display.

Second Dataset.jpg

There is a little pitfall that you will need to be aware of when you link from Reactome to other datasets. Let's take linking pathways from Reactome to ENSEMBL transcripts as an example. On the Reactome side, there is gene and protein information associated with a pathway, and, if you wanted, you could select ENSEMBL gene ID or UniProt ID as Reactome attributes. However, if the database you are linking to also provides these attributes (ENSEMBL does) then you are strongly advised to select these attributes only in the linked-to database (ENSEMBL, in this case).

The reason is, you are making the link from the pathway and not from the gene or protein IDs. So, there will be no correspondence between, say, UniProt IDs coming from Reactome and transcript IDs coming from ENSEMBL.
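The effect can be illustrated with a simplified model of a link keyed on the pathway. This is only a sketch: the identifiers are made up, and the way BioMart actually assembles result rows is not described here, so treat the join logic as an assumption.

```python
# Hypothetical identifiers, for illustration only.
reactome_proteins = {"PATHWAY_A": ["PROT_1", "PROT_2"]}
ensembl_transcripts = {"PATHWAY_A": ["TRANSCRIPT_X", "TRANSCRIPT_Y", "TRANSCRIPT_Z"]}


def link_on_pathway(left, right):
    """Join two per-pathway tables using the pathway as the only shared key."""
    rows = []
    for pathway, left_ids in left.items():
        for right_id in right.get(pathway, []):
            for left_id in left_ids:
                rows.append((pathway, left_id, right_id))
    return rows


rows = link_on_pathway(reactome_proteins, ensembl_transcripts)
for row in rows:
    print(row)
```

Because the pathway is the only shared key, every protein ID is paired with every transcript ID in the same pathway. A row containing PROT_1 and TRANSCRIPT_Z therefore says nothing about whether that transcript encodes that protein.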

To run a regular BioMart query, click on the "Results" link.

Once you have run your query and have produced a results page, you have a number of options for viewing the information. By default, the format will be "HTML", and you will be presented with the first 10 lines of the results on the web page.

Regular Query Results.jpg

Using the selector labeled "Display maximum", you can increase the number of lines displayed by your browser to a maximum of 200. If you would like to see more lines, you need to export the results to a file by clicking the "Go" button. Make sure you select the right output format before you do this, because the default "HTML" might not be what you want. Other options are "TSV" (tab-separated values, generates a .txt file with columns separated by tabs) and "XLS" (Excel spreadsheet, generates a .xls file that you can display with Excel, but that does not necessarily look good in OpenOffice).

If you have used other BioMart sites, you might be wondering why there is no "CSV" (comma-separated value) output format. This is because many of the values that are returned by Reactome, e.g. pathway names, contain commas, which would lead to confusing and unparsable output.
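A TSV export can be read with any tab-aware parser. The sketch below uses Python's standard csv module on a made-up export; the column headers and the pathway name are invented for illustration and are not actual Reactome output.

```python
import csv
import io

# Made-up export: the second column is a pathway name that contains commas.
tsv_text = (
    "Pathway ID\tPathway name\tSpecies\n"
    "ID_1\tSynthesis, secretion, and transport of example hormone\tHomo sapiens\n"
)

rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
header, first = rows[0], rows[1]
print(first[1])

# Splitting the same name naively on commas would break it into several pieces.
naive_pieces = first[1].split(",")
print(len(naive_pieces))
```

Splitting on tabs keeps the pathway name intact as one field, whereas a naive comma split would cut it apart.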


The Reactome Pathfinder

The Pathfinder tool is used to identify or discover pathways that connect a given input and one or more output molecules or events. When multiple output molecules/events are designated, the shortest path is displayed (see notes on requirements below). A link to the Pathfinder tool can be found in the Tools menu of the main menu bar on each page.

For example, to search for a human pathway between the enzyme G6PD, which catalyzes the first reaction of the pentose phosphate pathway, and xylulose 5-phosphate, one of the intermediate products of the pathway, G6PD is entered as the input and xylulose as the output. Very common molecules are excluded to reduce the number of irrelevant pathways generated. These are listed as "non-connecting compounds". Additional molecules can be excluded by adding their names to the precompiled list. Hitting GO! will search Reactome for the start and end compounds/events entered.
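The idea of a shortest path that avoids "non-connecting compounds" can be sketched with a breadth-first search over a toy reaction set. This is an illustration of the concept only, not Pathfinder's actual implementation; all reaction and molecule names are made up, and "COMMON" stands in for a very common molecule.

```python
from collections import deque

# Toy reactions: name -> (inputs, outputs). All names are hypothetical.
reactions = {
    "R1": ({"A", "COMMON"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"C"}, {"D"}),
    "R4": ({"A"}, {"COMMON"}),
    "R5": ({"COMMON"}, {"D"}),
}


def shortest_path(reactions, start, goal, non_connecting=frozenset()):
    """Breadth-first search over molecules; returns the reactions used, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        molecule, path = queue.popleft()
        if molecule == goal:
            return path
        for name, (inputs, outputs) in sorted(reactions.items()):
            if molecule not in inputs:
                continue
            for product in sorted(outputs):
                # Excluded molecules may not be used to connect two reactions.
                if product in visited or product in non_connecting:
                    continue
                visited.add(product)
                queue.append((product, path + [name]))
    return None


with_hub = shortest_path(reactions, "A", "D")
without_hub = shortest_path(reactions, "A", "D", non_connecting={"COMMON"})
print(with_hub)
print(without_hub)
```

Without the exclusion, the search takes a two-step shortcut through the common molecule. With it, the search returns the longer route through the specific intermediates B and C.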

Pathfinder Start.jpg

A drop-down list of hits to the entered names is provided. Here G6PD is selected as the input and several reactions that yield xylulose 5-phosphate are selected as an output. The search is initiated by clicking on GO!

Pathfinder Select.jpg

If a path is found, a list of the events and molecules connecting the input and output molecules is shown at the bottom of the page. The path between the input and the output is displayed by highlighting these reactions in the starry sky display of reaction space.

Pathfinder Result.jpg

This graphical display feature uses an old view of Reactome content that will soon be retired. Work is underway to develop a graphical display for Pathfinder output that uses the same pathway diagrams as the rest of the Reactome web site.


FI Network Tool

Overview

The Reactome FI Cytoscape plugin was designed to find network patterns related to cancer and other types of diseases. The plugin accesses the Reactome Functional Interaction (FI) network, a highly reliable, manually curated pathway-based protein functional interaction network covering close to 50% of human proteins. It allows you to construct a FI sub-network based on a set of genes, query the FI data source for the underlying evidence for an interaction, build and analyze network modules of highly-interacting groups of genes, perform functional enrichment analysis to annotate the modules, expand the network by finding genes related to the experimental data set, display pathway diagrams, and overlay a variety of information sources such as cancer gene index annotations. For an example of how we use Reactome FIs for cancer data analysis, please see our publication: A human functional protein interaction network and its application to cancer data analysis.

Download and Launch the Reactome FI plugin

  • If you have installed Cytoscape already (version 2.7.0 or above), please save this jar file, caBigR3.jar, into your Cytoscape plugins folder, and restart Cytoscape.
  • You can also launch FI Cytoscape plug-in using Java Web Start by clicking this link: Cytoscape.jnlp. Please choose "Allow" if you see a dialog similar to the following screenshot so that the FI plugin can open your local file and save results:
Reactome FI Plugin Java Web Start
  • If you need the Java source code for this Cytoscape plug-in, you can download it from this link: FICytoscapePlugInSrc.jar

Use the Reactome FI plugin

After starting Cytoscape, you should see a menu item called "Reactome FIs" under the Plugins menu. Clicking this menu shows three sub-menus: Gene Set/Mutation Analysis, Microarray Data Analysis and User Guide. Gene Set/Mutation Analysis performs FI network-based data analysis for a set of genes or a mutation data file. Microarray Data Analysis performs MCL (Markov Cluster Algorithm, http://micans.org/mcl/) based FI network clustering, converting the non-weighted FI network to a weighted network by using correlations among genes in the network. User Guide brings you to this user guide.
Reactome FI Plugin Menu
Gene Set/Mutation Analysis
  1. Currently the FI plug-in supports three file formats for gene set/mutation analysis:
    1. Simple gene set: one line per gene. For example, GWASFuzzyGenes.txt, a list of T2D GWAS genes.
    2. Gene/sample number pair. For example, GeneSampleNumber.txt, which contains two required columns, gene and number of samples having gene mutated, and an optional third column listing sample names (delimited by ";").
    3. NCI MAF (mutation annotation file). For example, GlioblastomaMutationTable.txt, the mutation file from the TCGA GBM project.
  2. Choose a file containing genes you want to use to construct a functional interaction network. Select an appropriate file format and parameters to load genes and construct FI network in the dialog. Click the "OK" button to start the FI network building process.
    Open Gene Set File
  3. The constructed FI network will be displayed in the network view panel. A FI specific visual style will be created automatically for the FI network.
    Reactome FI Sub-Network
  4. The main features of Reactome FI plug-in should be invoked from a popup menu, which can be displayed by right clicking an empty space in the network view panel.
    Popup Menu for Network
    1. Select nodes: select nodes from a list of node ids delimited by ", ".
    2. Fetch FI annotations: query detailed information on selected FIs. Three FI related edge attributes will be created: FI Annotation, FI Direction, and FI Score. Edges will be displayed based on FI direction attribute values. In the following screenshot, "->" indicates activating/catalyzing, "-|" inhibition, "-" FIs extracted from complexes or inputs, and "---" predicted FIs. See the "VizMapper" tab, Edge Source Arrow Shape and Edge Target Arrow Shape values for details.
      FI Annotations
    3. Analyze network functions: pathway or GO term enrichment analysis for the displayed network. You can choose to filter enrichment results by a FDR cutoff value. You can also choose to display only the nodes for a selected row or rows in the network panel by checking "Hide nodes in not selected rows". The following screenshot shows results from a pathway enrichment analysis.
      Pathways in FI Sub-Network

      Tip: To analyze pathway or GO term enrichment on a set of genes that are not linked together, select the "Show genes not linked to others" option in the "Set Parameters for FI Network" dialog.
    4. Cluster FI network: run a network clustering algorithm (spectral partition based network clustering by Newman 2006) on the displayed FI network. Nodes in different network modules will be shown in different colors (different colors used only for first 15 modules based on sizes).
      Network Modules
    5. Analyze module functions: pathway or GO term enrichment analysis for each individual network module. You can select a size cutoff to filter out network modules that are too small, choose a FDR cutoff to view enriched pathways or GO terms under a certain FDR value, and view nodes in a selected row or rows only in the network diagram.
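If you need to produce or check a file in the gene/sample number pair format listed under the supported file formats, the following sketch parses it. It assumes tab-separated columns and no header row, neither of which is stated above, so adjust as needed; the gene and sample names are made up.

```python
import io


def parse_gene_sample_lines(lines, delimiter="\t"):
    """Return {gene: (sample_count, [sample names])} from gene/sample number rows."""
    genes = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        fields = line.split(delimiter)
        gene, count = fields[0].strip(), int(fields[1])
        samples = []
        # The optional third column lists sample names delimited by ";".
        if len(fields) > 2 and fields[2].strip():
            samples = [s.strip() for s in fields[2].split(";") if s.strip()]
        genes[gene] = (count, samples)
    return genes


# Hypothetical file content: gene, number of mutated samples, optional sample names.
example = io.StringIO("GENE_A\t2\tS1;S2\nGENE_B\t1\n")
parsed = parse_gene_sample_lines(example)
print(parsed)
```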
Microarray Data Analysis

The Reactome FI Cytoscape plugin can load a gene expression data file, calculate correlations among genes involved in the same FIs, use the calculated correlations as weights for edges (i.e. FIs) in the whole FI network, apply the MCL graph clustering algorithm to the weighted FI network, and generate a sub-network for a list of selected network modules based on module size and average correlation. The generated FI sub-network will be displayed in the network panel, and can be used for analysis as in Gene Set/Mutation Analysis.

An array data file should be a tab-delimited text file with table headers. The first column should be gene names. All other columns should be expression values in different samples. The data set in the file should be pre-normalized.
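The file layout and the edge-weighting step can be sketched as follows. This is not the plugin's code: the use of Pearson correlation is an assumption (the text above only says "correlations"), and the gene names, sample names and values are made up.

```python
import csv
import io
import math


def read_expression(handle):
    """Read a tab-delimited file: header row, then gene name followed by sample values."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the table header
    return {row[0]: [float(v) for v in row[1:]] for row in reader if row}


def pearson(xs, ys):
    """Pearson correlation of two equally long value lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def weight_edges(expression, edges, use_absolute=True):
    """Assign each FI (gene pair) a weight from the correlation of its two genes."""
    weights = {}
    for a, b in edges:
        if a in expression and b in expression:
            r = pearson(expression[a], expression[b])
            weights[(a, b)] = abs(r) if use_absolute else r
    return weights


# Hypothetical, already normalized data.
data = io.StringIO(
    "gene\ts1\ts2\ts3\n"
    "G1\t1.0\t2.0\t3.0\n"
    "G2\t2.0\t4.0\t6.0\n"
    "G3\t3.0\t2.0\t1.0\n"
)
expression = read_expression(data)
weights = weight_edges(expression, [("G1", "G2"), ("G1", "G3")])
signed = weight_edges(expression, [("G1", "G3")], use_absolute=False)
print(weights, signed)
```

With absolute values, a strongly anti-correlated pair such as G1 and G3 receives the same weight as a strongly correlated pair.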

  1. Select a microarray data file and run MCL network clustering: After selecting sub-menu "Microarray Data Analysis" from menu Plugins/Reactome FIs, you should see the following dialog. Choose a microarray data file, check if you want to use absolute values as weights for edges, and input an inflation parameter (-I) for the MCL clustering algorithm. The smaller the inflation parameter is, the bigger the average size of generated network modules. Based on our own experience, we use 5.0 for the inflation parameter, the highest recommended value, and choose the absolute value for edge weights. For more details on how to choose the inflation parameter, please see http://micans.org/mcl/. After you have set these parameters, click the OK button to load the data file, calculate correlations, and apply the MCL clustering algorithm.
    Set Parameters for Microarray Data Analysis
  2. Select network modules and build a FI sub-network: The generated network modules are listed in the MCL clustering results dialog (see below). Only modules having more than 2 genes are listed and used in the FI sub-network building. You can choose a module size or an average correlation value (absolute value if absolute has been checked before) to filter out modules that may not be significant (Note: after setting these cutoff values, please press the "Enter" key to commit your changes). In our analysis, we choose modules having 7 or more genes with average correlation values no less than 0.25. These values are used as defaults in the dialog. In the dialog, you can see how many modules and genes will be chosen for building the FI sub-network under your selected filter values. Click the OK button to start the sub-network building. The built sub-network will be displayed, and can be analyzed as with sub-networks generated from the gene set/mutation analysis.
    Choose MCL Network Modules
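The module filter described in step 2 amounts to two cutoffs. A minimal sketch, using the default values quoted above (7 or more genes, average correlation no less than 0.25); the function name and the module data are hypothetical.

```python
def select_modules(modules, min_size=7, min_avg_corr=0.25):
    """Keep modules given as (genes, average_correlation) that pass both cutoffs."""
    return [
        (genes, corr)
        for genes, corr in modules
        if len(genes) >= min_size and corr >= min_avg_corr
    ]


# Made-up modules: (list of genes, average correlation).
modules = [
    ([f"G{i}" for i in range(10)], 0.40),  # large and well correlated: kept
    ([f"H{i}" for i in range(8)], 0.10),   # large but weakly correlated: dropped
    ([f"K{i}" for i in range(3)], 0.90),   # well correlated but too small: dropped
]
selected = select_modules(modules)
gene_count = sum(len(genes) for genes, _ in selected)
print(len(selected), gene_count)
```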

Other features in FI plugin

Query FI source
Select an edge and right click it to get the popup menu for edge. Select a menu called "Reactome FI/Query FI Source". If a FI is extracted from curated pathways or reactions, a dialog for the original data source(s) will be displayed. Double click a row in the displayed table to show a detailed web page for the source of the FI. If the selected FI is a predicted one, the evidence for this FI should be displayed.
Query FI Source
Reactome FI Plugin Menu

Fetch FIs for node
All FIs for a node can be queried. Select a node in the network panel, and right click it to get the popup menu for node. Select a menu called "Reactome FI/Fetch FIs". FI partners for the selected node will be displayed in two sections: partners already displayed in the network and partners not displayed in the network. You can select partners from the second section to expand the displayed network.
Query Node FIs
Show Node FIs

Show pathway diagram
Pathway diagrams can be shown for pathway hits. Select a pathway in the "Pathways in Network" or "Pathways in Modules" tab, and right click to get the popup menu for pathway. Select "Show Pathway Diagram" from the popup menu.
Show Pathway Diagram
If pathways are imported from KEGG, KEGG pathway diagram pages will be shown in a browser with node genes listed in the "Nodes" column highlighted in red (for text and borders in pathway diagrams). If pathways are from Reactome or other non-KEGG databases, pathway diagrams should be shown in a separate window. If pathways are curated by the Reactome project, human laid-out diagrams should be displayed if available. Otherwise, auto-laid-out diagrams should be displayed. Genes or proteins from the displayed network should be highlighted in blue. Detailed annotations for nodes and reactions displayed in the diagram window can be viewed by using a popup menu called "View Instance". Diagrams can be zoomed in/out using the zoom slider at the bottom of the window, and panned using the overview window at the top-right corner.
KEGG Focal Adhesion
Reactome Signaling by PDGF

Load cancer gene index annotations

Reactome FI plug-in can load cancer gene index annotations for genes/proteins displayed in the network. There are two ways to show these annotations: use a popup menu called "Load Cancer Gene Index" when no object is selected (left figure), and use another popup menu "Fetch Cancer Gene Index" for a selected node (right figure).


Load Gene Index
Load Node Cancer Gene Index

By using the first method, the user can load the tree of NCI disease terms and display the tree in the left panel. When the user selects a disease term in the tree, all genes or proteins that have been annotated for the selected disease and its sub-terms will be selected.
Cancer Gene Index Overlay

By using the second method, the user can view detailed annotations for the selected gene or protein. The user can sort these annotations based on PubMedID, Cancer type, and annotation status, and also filter annotations based on several criteria.


Cancer Gene Index Annotations for Node

Survival analysis
Survival analysis is based on a server-side R script that performs either coxph or Kaplan-Meier survival analysis. To do survival analysis, a tab-delimited text file containing at least three columns should be provided. The names of the three columns should be: Samples, OSDURATION, and OSEVENT. To start the analysis, use the popup menu "Analyze Module Functions/Survival Analysis..." (see below)
Survival Analysis Menu
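Before submitting a survival file, it can be worth checking that the three required column names are present. The following sketch does only that; the sample rows are made up, and the values are left as text because their expected encoding is not specified here.

```python
import csv
import io

REQUIRED_COLUMNS = ("Samples", "OSDURATION", "OSEVENT")


def read_survival(handle):
    """Read a tab-delimited survival file and check the required column names."""
    reader = csv.DictReader(handle, delimiter="\t")
    present = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))
    return list(reader)


# Hypothetical file content.
good = io.StringIO("Samples\tOSDURATION\tOSEVENT\nS1\t12\t1\nS2\t30\t0\n")
records = read_survival(good)
print(records[0]["Samples"], records[0]["OSDURATION"], records[0]["OSEVENT"])
```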

In the survival analysis dialog (below), double click the text field to select a file containing survival information for the samples used to build the displayed FI sub-network (Note: you cannot do survival analysis if you used only a gene set file to construct the displayed FI sub-network). You can choose either the coxph or the Kaplan-Meier model to do survival analysis. If you choose the Kaplan-Meier model, you have to select a module for analysis. In the Kaplan-Meier analysis, all samples will be divided into two groups: samples having no mutated genes in the selected module (group 1) and samples having mutated genes in the module (group 2). It is recommended to run the coxph model first without selecting any module in order to see which module is most significantly related to survival times. After that, you can focus on some specific modules for survival analysis.


Survival Analysis Dialog
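The two-group split used for the Kaplan-Meier analysis can be sketched as follows. This only illustrates the grouping rule described above, not the server-side R script; the sample names, gene names and function name are hypothetical.

```python
def split_samples(all_samples, gene_to_samples, module_genes):
    """Return (group 1, group 2): samples without and with mutated genes in the module."""
    mutated = set()
    for gene in module_genes:
        mutated.update(gene_to_samples.get(gene, ()))
    group1 = sorted(s for s in all_samples if s not in mutated)
    group2 = sorted(s for s in all_samples if s in mutated)
    return group1, group2


# Made-up data: which samples carry a mutation in which gene.
all_samples = ["S1", "S2", "S3", "S4"]
gene_to_samples = {"GENE_A": ["S1", "S2"], "GENE_B": ["S2"], "GENE_C": ["S4"]}
group1, group2 = split_samples(all_samples, gene_to_samples, ["GENE_A", "GENE_B"])
print(group1, group2)
```

Sample S4 carries a mutation, but in a gene outside the selected module, so it falls into group 1.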

The results from survival analysis will be displayed in the right Results Panel with a tab labeled "Survival Analysis" (below left). You can do multiple survival analyses. All results returned from the server-side R script will be displayed in this panel with labels based on your parameter selections in the survival analysis dialog. The last result will be selected by default. At most three sections are displayed in the result panel for each analysis: Output, Error, and Plot. If no warning or error is returned from an analysis, the Error section may not be shown. Rows for modules having p-values less than 0.05 from a coxph (all modules) analysis are displayed in blue with underlined text. You can click these modules to do a quick single-module survival analysis without going through the above steps. A single-module Kaplan-Meier analysis will show a plot file. You can click the file to view the actual plot (below right). You may want to save the plot file for future use.


Survival Analysis Results
Kaplan-Meier Survival Plot



Download Reactome

To download Reactome data and code, please go to the Download page, linked from the Menu Bar on the Reactome homepage.

Exporting Data From Reactome

Reactome content can be exported in a number of different formats. For your convenience, export files for the entirety of Reactome are available in the download section; see Download Reactome for details. The details panel for pathway diagrams also provides a way to export individual pathways or even reactions; see Pathway Details. The following export formats are available:

  • SBML. An exchange format used by systems biologists for their models. See SBML At Reactome for comprehensive information about SBML export.
  • BioPAX. An exchange format for biological pathway data.
  • PDF. Text dump of the pathway, organized to look like a research report.
  • Word. Text dump of the pathway, organized to look like a research report. The file generated is of type RTF, and can also be understood by OpenOffice and LibreOffice.
  • Protege. A format used for ontology exchange.
  • PSI-MITAB. A tabular format containing protein-protein interactions derived from Reactome's reactions and complexes. Take a look at PSI-MITAB Interactions for details.
  • MySQL. A database dump of all of Reactome's data. To understand the schema properly, please take a look at objectrelational-mapping.