Introduction to BioCyc and Pathway Tools

BioCyc and Pathway Tools accelerate science by providing comprehensive data for sequenced organisms and a wide suite of bioinformatics tools.

BioCyc is a database collection and website with extensive search and analysis tools.

The Pathway Tools software can be installed at your site to compute metabolic reconstructions and create BioCyc-like databases for genomes of interest.

BioCyc Database Collection

The BioCyc collection of Pathway/Genome Databases (PGDBs) provides a reference on the genomes, metabolic pathways, and (in some cases) regulatory networks of thousands of sequenced organisms. Each database combines information from three sources:

  • Computational inferences: Our Pathway Tools software predicts the metabolic pathways of an organism, predicts which genes code for missing enzymes in metabolic pathways, and predicts operons. We compute orthologs across BioCyc databases.

  • Imported data: BioCyc integrates information from other bioinformatics databases, such as protein feature and Gene Ontology information from UniProt, gene-essentiality datasets from OGEE, and regulatory information from RegTransBase.

  • Manual curation: The curated databases (called Tier 1 and Tier 2 PGDBs) have received literature-based curation to enter new gene functions, pathways, protein complexes, regulation, and more.

    • The EcoCyc DB is the result of more than 20 person-years of effort to enter information from 40,000+ E. coli articles about gene function, metabolism, transport, and regulatory processes.

    • The MetaCyc DB describes metabolic pathways, enzymes, and metabolites from all domains of life, curated from 65,000+ publications.

The EcoCyc and MetaCyc databases are freely available to all users because their curation is supported by NIH funding. The database for the cyanobacterium Arthrospira platensis NIES-39 is also free, as an example of a Tier 3 database. The other BioCyc databases are available via subscription, which supports their curation. To obtain free access to the other BioCyc databases for teaching purposes, please click here.

BioCyc data files may be downloaded to your site, and BioCyc data can be queried via web services.

Bioinformatics Tools

BioCyc provides a suite of bioinformatics tools (see Tools menu) for accessing and analyzing the BioCyc databases. The tools provide search and visualization, omics data analysis, comparative genomics, and comparative pathway analysis:

  • Search: Multiple search tools enable users to find genes, pathways, and metabolites of interest, which are presented in corresponding information pages. Most searches apply to the currently selected organism database, which can be changed with the "Change Current Database" button at the top of most pages. There are two ways to search across multiple databases: (1) use Tools → Search → Cross Organism Search, or (2) in commands such as Tools → Search → Search Genes, Proteins, and RNAs, select "Search across multiple organisms/databases" under the list of buttons.

  • Visualization: A variety of visualization tools are provided, such as metabolic-pathway diagrams, and zoomable diagrams depicting the complete metabolic chart of each organism [example].

  • Genome Browser: The BioCyc genome browser [example] enables analysis of positional genome datasets via tracks.

  • Omics Data Analysis: Tools include statistical over-representation analysis; and visualization of gene expression, proteomics, or metabolomics data on metabolic-chart diagrams [example] and on the Omics Dashboard [example].

  • SmartTables: Biologist-friendly analysis capabilities for groups of genes or metabolites that are stored in your BioCyc account.

  • Metabolic Route Search: Search for reaction paths connecting specified metabolites in the metabolic network, with the option of adding new reactions from the MetaCyc DB.

  • Comparative Analysis: Tools include comparison of pathways, metabolites, transporters, and regulatory networks (see menu command Analysis → Comparative Analysis).

  • Sequence Analysis: Perform BLAST searches, sequence pattern searches, and multiple alignments.
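As a rough illustration of what a route search over a metabolic network involves, the sketch below runs a breadth-first search over a toy reaction set. The reaction names, metabolites, and the find_route function are hypothetical; they are not BioCyc data, and this is not the algorithm used by the Metabolic Route Search tool. The sketch assumes, for simplicity, that a reaction can convert any one of its substrates into each of its products.

```python
from collections import deque

# Hypothetical toy network: reaction name -> (substrates, products).
# These entries are illustrative placeholders, not BioCyc data.
TOY_REACTIONS = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"A"}, {"D"}),
    "R4": ({"D"}, {"E"}),
    "R5": ({"E"}, {"C"}),
}


def find_route(reactions, start, goal):
    """Return a shortest list of reaction names from start to goal, or None."""
    queue = deque([(start, [])])   # (current metabolite, reactions used so far)
    visited = {start}
    while queue:
        metabolite, path = queue.popleft()
        if metabolite == goal:
            return path
        for name, (substrates, products) in reactions.items():
            if metabolite not in substrates:
                continue
            for product in products:
                if product not in visited:
                    visited.add(product)
                    queue.append((product, path + [name]))
    return None  # no route connects the two metabolites


# Search the toy network, then again after adding one extra reaction.
print(find_route(TOY_REACTIONS, "A", "C"))
extended = {**TOY_REACTIONS, "R6": ({"A"}, {"C"})}
print(find_route(extended, "A", "C"))
```

Merging a second dictionary of reactions into the network before searching, as in the last lines, loosely mirrors the option of adding new reactions from the MetaCyc DB.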

Pathway Tools Software

Pathway Tools is an enterprise genome and pathway data management tool and is among the most extensive bioinformatics software packages. It is the software used to create BioCyc databases and it powers the website. Its capabilities are described in detail here.

Pathway Tools can run both as a desktop application and as a web server.

Installing Pathway Tools at your site brings these advantages:

  • Install a private local set of BioCyc PGDBs on your intranet.

  • Create new PGDBs from your own genome data, generating metabolic reconstructions, operon inferences, and more.

  • Apply its extensive search, visualization, and analysis tools to your own genome data.

  • Edit PGDBs interactively to add new gene functions and pathways.

  • Build quantitative metabolic flux models using Flux-Balance Analysis with the MetaFlux tool.

How to Learn More About BioCyc

The following additional information exists about the BioCyc site:

Definitions of Terminology on the BioCyc Website

Here we define a few key terms. See the glossary for more definitions.

Pathway/Genome Database (PGDB). A database that describes:

  • The genome of an organism -- its chromosome(s), genes, and genome sequence
  • The product of each gene
  • The metabolic network of the organism -- its pathways, reactions, enzymes, and metabolites
  • The transporter complement of the organism
  • The regulatory network of the organism, including its operons, transcription factors, and the interactions between transcription factors and their small-molecule ligands and DNA binding sites
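
The components listed above can be pictured as a nested record. The sketch below is only an illustration in Python: the class names, field names, and sample entries are hypothetical placeholders and do not reflect the actual Pathway Tools schema or any real organism.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Gene:
    name: str
    product: str  # the product of the gene, e.g. an enzyme or an RNA


@dataclass
class Pathway:
    name: str
    reactions: List[str] = field(default_factory=list)


@dataclass
class PGDB:
    """Hypothetical, simplified view of what a PGDB describes."""
    organism: str
    chromosomes: List[str] = field(default_factory=list)       # genome
    genes: List[Gene] = field(default_factory=list)            # genes and products
    pathways: List[Pathway] = field(default_factory=list)      # metabolic network
    transporters: List[str] = field(default_factory=list)      # transporter complement
    operons: Dict[str, List[str]] = field(default_factory=dict)  # operon -> gene names

    def products_of_operon(self, operon: str) -> List[str]:
        """Return the products of the genes assigned to the given operon."""
        members = set(self.operons.get(operon, []))
        return [gene.product for gene in self.genes if gene.name in members]


# Placeholder content, invented purely for illustration.
example = PGDB(
    organism="Example organism",
    chromosomes=["chromosome 1"],
    genes=[Gene("geneA", "enzyme A"), Gene("geneB", "enzyme B"), Gene("geneC", "transporter C")],
    pathways=[Pathway("example pathway", ["R1", "R2"])],
    transporters=["transporter C"],
    operons={"operon-1": ["geneA", "geneB"]},
)
print(example.products_of_operon("operon-1"))
```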

Tier 1 PGDB. PGDBs in Tier 1, such as EcoCyc, MetaCyc, and HumanCyc, have received at least one year of literature-based curation by scientists. More information about curation practices is available in the Curator Guide.

Tier 2 PGDB. PGDBs in Tier 2 were generated by the PathoLogic program, which predicted their metabolic pathways, their operons (for bacteria only), and some missing enzymes in their predicted pathways (pathway hole fillers). The resulting PGDBs underwent manual review to remove any false-positive pathway predictions that the reviewer could detect, and to perform refinements such as defining protein complexes. They also underwent a period of literature-based curation, for example to enter metabolic pathways that had been experimentally elucidated in the organism but were not inferred by PathoLogic. [list of Tier 2 PGDBs]

Tier 3 PGDB. PGDBs in Tier 3 were generated by PathoLogic, which predicted metabolic pathways, operons (for bacteria only), pathway hole fillers, and transport reactions. The resulting PGDBs did not undergo manual review of the pathway predictions, nor subsequent literature curation. Therefore, the pathway predictions should be treated with due caution. [list of Tier 3 PGDBs]

Pathway Tools Software. Pathway Tools is used to construct, update, visualize, query, and analyze PGDBs, such as the BioCyc collection. It is freely available to academics interested in creating PGDBs for organisms of interest to them. Components of Pathway Tools are:

  • The Pathway/Genome Navigator supports querying, visualization, and analysis of PGDBs
  • The Pathway/Genome Editors support interactive updating and refinement of PGDBs
  • PathoLogic performs computational inferences such as pathway prediction
  • MetaFlux enables creation of quantitative metabolic models from PGDBs

BioCyc. The collection of PGDBs at URL is called the BioCyc Database Collection. EcoCyc and MetaCyc are component databases within the BioCyc collection.

BioCyc Organism Home Pages

BioCyc contains home pages for the following organisms. You can visit these pages to pre-select these organism databases for searches.

 Bacillus subtilis
 Saccharomyces cerevisiae
 Salmonella enterica