KEGG (Kyoto Encyclopedia of Genes and Genomes) is one of the most complete and widely used
databases containing metabolic pathways (372 reference pathways) from a wide variety of
species (>700). These pathways are hyperlinked to metabolite and protein/enzyme information.
Currently KEGG has >15,000 compounds (from animals, plants and bacteria), 7742 drugs (including
different salt forms and drug carriers) and nearly 11,000 glycan structures.
MetaCyc is a database of nonredundant, experimentally elucidated metabolic pathways. MetaCyc
contains more than 1,100 pathways from more than 1,500 different species. MetaCyc is curated
from the scientific experimental literature and contains pathways involved in both primary and
secondary metabolism, as well as associated compounds, enzymes, and genes.
HumanCyc is a bioinformatics database that describes the human metabolic pathways and the human
genome. The current version of HumanCyc was constructed using Build 31 of the human genome.
The resulting pathway/genome database (PGDB) includes information on 28,783 genes,
their products and the metabolic reactions and pathways they catalyze.
BioCyc is a collection of 371 Pathway/Genome Databases. Each database in the BioCyc collection
describes the genome and metabolic pathways of a single species. The databases within the
BioCyc collection are organized into tiers according to the amount of manual review and
updating they have received. Tier 1 DBs have been created through intensive manual efforts and
include EcoCyc, MetaCyc and the BioCyc Open Compounds Database (BOCD). BOCD includes metabolites,
enzyme activators, inhibitors, and cofactors derived from hundreds of species. Tier 2 and Tier 3
databases contain computationally predicted metabolic pathways, as well as predictions as to which
genes code for missing enzymes in metabolic pathways, and predicted operons.
Reactome is a curated, peer-reviewed knowledgebase of biological pathways, including metabolic
pathways as well as protein trafficking and signaling pathways. Reactome includes several
types of reactions in its pathway diagram collection including experimentally confirmed, manually
inferred and electronically inferred reactions. Reactome has pathway data on more than 20
different species but the primary species of interest is Homo sapiens. Reactome has data
and pathway diagrams for >2700 proteins, 2800 reactions and 860 pathways for humans.
WikiPathways is an open, collaborative platform dedicated to the curation of biological pathways. It is based on the MediaWiki open source software used by Wikipedia, coupled to a custom graphical pathway editing tool and integrated databases covering major gene, protein, and small-molecule systems. WikiPathways currently contains 544 species-specific pathways for human, mouse, rat, zebrafish, fruit fly, worm, and yeast.
Medical Biochemistry Page
The Medical Biochemistry Page focuses on human pathways, with highly detailed descriptions of human processes, hormones, and metabolite/protein interactions. Additionally, the website contains clinical information on disease states and inborn errors of metabolism.
Pathway Commons is a collection of publicly available pathways for many species from multiple sources, including other pathway databases. Pathway information available from Pathway Commons includes biochemical reactions, complex assembly, transport and catalysis events, and physical interactions involving proteins, DNA, RNA, small molecules and complexes. It is a central repository of pathway information which uses the standardized Biological Pathway Exchange (BioPAX) format to consolidate pathway information.
Biocarta's pathway maps focus on protein interactions in the field of proteomics, the study of protein expression and function. Biocarta aims to enhance genomic information and to provide an alternative route for basic science investigation and drug discovery. Biocarta is an open source database of pathways highlighting molecular relationships from areas of active research as well as classical pathway maps. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species.
Cell Signalling Technology
Cell Signalling Technology is dedicated to providing innovative research tools that are used to help define mechanisms underlying cell function and disease, and has a database of pathway maps focusing on protein signalling. It also provides genetic information for disease states associated with dysfunction in protein-based regulation of cellular processes.
Sigma Aldrich life sciences pathway slides focus on cell signalling and neuroscience. This resource includes 13 apoptosis and cell cycle pathways, 25 cytokine, growth factor and hormone pathways, 5 cytoskeleton and extracellular matrix pathways, 20 gene regulation and expression pathways, 3 pathways depicting G-protein and nucleotide interactions, 2 pathways involved with immune signalling and blood, 12 ion channel slides, 5 lipid cell signalling pathways, 4 pathways involved in multi-drug resistance, 12 neurobiology and neurotransmission pathways, 3 protein phosphorylation pathways, and 9 nitric oxide and cell stress pathways, comprising a valuable resource of 115 pathways relating to human signalling processes.
Ambion has a wide range of pathway information, focusing on signalling pathways, with over 386 interactive pathway maps containing gene, protein and product information. These pathways range from transport and cell signalling/cycle processes and regulation to disease state pathways.
Calbiochem/Merck has 18 interactive intracellular signalling and cell cycle regulation pathways as well as a pathway for Alzheimer's disease. It contains hyperlinks to succinct notes on protein function and reactivity in a variety of species.
ProteinLounge is an extensive database of 769 metabolic, disease, and signalling pathways in many species. It includes pathway information from other pathway sites such as Ambion. It additionally includes siRNA, peptide-antigen and kinase-phosphatase databases which provide additional information on protein interactions and signalling.
Transpath (from BioBase Inc.)
Transpath is a key component of BioBase Inc.'s Cell Illustrator Online, which is a pathway drawing tool for modeling metabolic pathways, signal transduction cascades, gene regulatory pathways and dynamic interactions of various biological entities such as genomic DNA, mRNA and proteins. Transpath is a commercially available resource of pre-drawn signalling and metabolic pathways using the Cell Illustrator models.
PathArt (from Jubilant Biosys Inc.)
PathArt is an extensive collection of manually curated information from literature as well as public domain databases on more than 1000 signaling and metabolic pathways. PathArt currently contains 3527 regulatory, signalling and disease pathways, with emphasis on 39 high priority diseases and their pathway and response genes, as well as information on 8783 knockouts and ~18000 mutation data points, with human, mouse and rat cell and tissue specific information. PathArt is a commercially available resource which also includes a module of ~219,598 protein interactions and 350 drug molecules.
MetaBase (from GeneGo Inc.)
MetaBase is a vast, manually curated database of mammalian biology and medicinal chemistry. It contains over 6 million experimental findings on protein-protein, protein-DNA and protein-compound interactions. MetaBase also includes thousands of signaling and metabolic pathways, ligand-receptor information for known drugs, drug targets and diseases, kinetic information on drug-metabolizing enzymes and signaling proteins, as well as ontologies for diseases, functional processes, toxicities, proteins and drugs. MetaBase is a commercially available resource with additional data on tissues, cell localizations, cellular processes, disease annotations, protein hierarchy, complexes and families.
Ingenuity Pathways Analysis (Ingenuity Systems Inc.)
Ingenuity Pathways Analysis (IPA) is a commercially available software package that allows dynamic pathway modeling and analysis of biological and chemical systems. IPA has a pathway building capability which includes genes, chemicals, cell processes and disease states.
The DrugBank database is a blended bioinformatics and cheminformatics resource that combines detailed
drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e.
sequence, structure, and pathway) information. The database contains nearly 4800 drug entries including
>1,350 FDA-approved small molecule drugs, 123 FDA-approved biotech (protein/peptide) drugs,
71 nutraceuticals and >3,243 experimental drugs. DrugBank also contains extensive SNP-drug data that
is useful for pharmacogenomics studies.
Therapeutic Target DB
The Therapeutic Target Database (TTD) is a drug database designed to provide information about the known
therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions,
the pathway information and the corresponding drugs/ligands directed at each of these targets. The database
currently contains 1535 targets and 2107 drugs/ligands.
The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and
clinical information about people who have participated in pharmacogenomics research studies. The data
includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the
cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains. Its aim is to aid researchers
in understanding how genetic variation among individuals contributes to differences in reactions to drugs.
PharmGKB contains searchable data on genes (>20,000), diseases (>3000), drugs (>2500) and pathways (53).
It also has detailed information on 470 genetic variants (SNP data) affecting drug metabolism.
STITCH ('search tool for interactions of chemicals') is a searchable database that integrates information
about interactions from metabolic pathways, crystal structures, binding experiments and drug–target
relationships. Text mining and chemical structure similarity are used to predict relations between chemicals.
Each proposed interaction can be traced back to the original data sources. The database contains interaction
information for over 68 000 different chemicals, including 2200 drugs, and connects them to 1.5 million
genes across 373 genomes.
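As an illustration of how chemical structure similarity can suggest relations between chemicals, the following minimal sketch scores pairs of compounds with the Tanimoto coefficient. It is not STITCH's actual implementation: the fingerprints (sets of "on" bit indices), the compound names and the 0.6 threshold are all hypothetical values chosen for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similar_pairs(fingerprints, threshold=0.6):
    """Return (name_a, name_b, score) for every pair scoring at or above the threshold."""
    names = sorted(fingerprints)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = tanimoto(fingerprints[a], fingerprints[b])
            if score >= threshold:
                pairs.append((a, b, round(score, 3)))
    return pairs


# Hypothetical fingerprints; real ones would come from a cheminformatics toolkit.
fingerprints = {
    "chem_A": {1, 4, 7, 9, 12},
    "chem_B": {1, 4, 7, 9, 15},
    "chem_C": {2, 3, 20},
}

candidate_relations = similar_pairs(fingerprints)
print(candidate_relations)
```

Pairs that pass the threshold would only be candidate relations; in a resource of this kind they would still need to be weighed against the experimental and curated evidence.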
SuperTarget is a database that contains a core dataset of about 7300 drug-target relations of which 4900
interactions have been subjected to a more extensive manual annotation effort. SuperTarget provides tools
for 2D drug screening and sequence comparison of the targets. The database contains more than 2500 target
proteins, which are annotated with about 7300 relations to 1500 drugs; the vast majority of entries have
pointers to the respective literature source. A subset of 775 more extensively annotated drugs is provided
separately through the Matador database (Manually Annotated Targets And Drugs Online Resource).
The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed
information about small molecule metabolites found in the human body. It contains experimental
MS/MS data for 800 compounds, experimental 1H and 13C NMR data (and assignments) for 790 compounds
and GC/MS spectral and retention index data for 260 compounds. Additionally, predicted 1H and 13C
NMR spectra have been generated for 3100 compounds. All spectral databases are downloadable.
The BioMagResBank (BMRB) is the central repository for experimental NMR spectral data, primarily for
macromolecules. The BMRB also contains a recently established subsection for metabolite data.
The current metabolomics database contains structures, structure viewing applets, nomenclature data,
extensive 1D and 2D spectral peak lists (from 1D, TOCSY, DEPT, HSQC experiments), raw spectra and FIDs
for nearly 500 molecules. The data is both searchable and downloadable.
The Madison Metabolomics Consortium Database (MMCD) is a database on small molecules of biological
interest gathered from electronic databases and the scientific literature. It contains approximately
10,000 metabolite entries and experimental spectral data on about 500 compounds. Each metabolite
entry in the MMCD is supported by information in an average of 50 separate data fields, which provide
the chemical formula, names and synonyms, structure, physical and chemical properties, NMR and MS
data on pure compounds under defined conditions where available, NMR chemical shifts determined by
empirical and/or theoretical approaches, information on the presence of the metabolite in different
biological species, and extensive links to images, references, and other public databases.
MassBank is a mass spectral database of experimentally acquired high resolution MS spectra of metabolites.
Maintained and supported by the JST-BIRD project, it offers various query methods for standard spectra
obtained from Keio University, RIKEN PSC, and other Japanese research institutions. It is officially
sanctioned by the Mass Spectrometry Society of Japan. The database has very detailed MS data and
excellent spectral/structure searching utilities. More than 13,000 spectra from 1900 different
compounds are available.
Golm Metabolome Database
The Golm Metabolome Database provides public access to custom GC/MS libraries which are stored
as Mass Spectral (MS) and Retention Time Index (RI) Libraries (MSRI). These libraries of mass
spectral and retention time indices can be used with the NIST/AMDIS software to identify metabolites
according to their spectral tags and RIs. The libraries are both searchable and downloadable and have
been carefully collected under defined conditions on several types of GC/MS instruments (quadrupole and TOF).
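Identification from a combined mass spectral and retention index library can be sketched as follows. This is a minimal illustration, not the NIST/AMDIS algorithm: it keeps library entries whose RI falls within a window of the query RI and ranks them by cosine similarity of their spectra. The library entries, the RI window of 5 units and the score cutoff of 0.8 are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dictionaries."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_library(query_spec, query_ri, library, ri_window=5.0, min_score=0.8):
    """Filter library entries by RI window, then rank the survivors by spectral score."""
    hits = []
    for entry in library:
        if abs(entry["ri"] - query_ri) > ri_window:
            continue  # retention index too far from the query
        score = cosine_similarity(query_spec, entry["spectrum"])
        if score >= min_score:
            hits.append((entry["name"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical library entries with nominal-mass spectra.
library = [
    {"name": "compound_X", "ri": 1500.0, "spectrum": {73: 100.0, 147: 40.0, 217: 20.0}},
    {"name": "compound_Y", "ri": 1502.0, "spectrum": {73: 100.0, 103: 60.0, 205: 30.0}},
    {"name": "compound_Z", "ri": 1800.0, "spectrum": {73: 100.0, 147: 40.0, 217: 20.0}},
]

hits = match_library({73: 95.0, 147: 42.0, 217: 18.0}, 1501.5, library)
print(hits)
```

In this example compound_Z has the same spectrum as compound_X but is rejected on retention index alone, which is why the RI is stored alongside each spectrum.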
The METLIN Metabolite Database is a repository for mass spectral metabolite data. All metabolites are
neutral or free acids. It is a collaborative effort between the Siuzdak and Abagyan groups and Center
for Mass Spectrometry at The Scripps Research Institute. METLIN is searchable by compound name, mass,
formula or structure. It contains 15,000 structures, including more than 8000 di- and tripeptides.
METLIN contains MS/MS, LC/MS and FTMS data that can be searched by peak lists, mass range, biological
source or disease.
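A search by mass of the kind described above can be sketched as a lookup of neutral monoisotopic masses within a ppm tolerance. This is an illustrative sketch and not METLIN's interface: the small compound table, the function names and the 10 ppm default are assumptions, and the masses are rounded values computed from the molecular formulas.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_by_mass(query_mass, compounds, tolerance_ppm=10.0):
    """Return (name, mass, ppm error) for compounds within the tolerance, closest first."""
    hits = []
    for name, mass in compounds.items():
        err = ppm_error(query_mass, mass)
        if abs(err) <= tolerance_ppm:
            hits.append((name, mass, round(err, 2)))
    return sorted(hits, key=lambda hit: abs(hit[2]))


# Neutral monoisotopic masses (Da), rounded; an illustrative table only.
compounds = {
    "alanine": 89.04768,      # C3H7NO2
    "glucose": 180.06339,     # C6H12O6
    "fructose": 180.06339,    # C6H12O6, isomer of glucose
    "citric acid": 192.02700, # C6H8O7
}

hits = search_by_mass(180.0640, compounds)
print(hits)
```

The query returns both glucose and fructose, which shows why a mass match alone is not an identification and why MS/MS data or a structure search is needed to separate isomers.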
Fiehn GC-MS Database
This library contains data on 713 compounds (name, structure, CAS ID, other links) for which GC/MS
data (spectra and retention indices) have been collected by the Fiehn laboratory. A locally
maintained program called BinBase/Bellerophon filters input GC/MS spectra and uses the spectral
library to identify compounds. The actual GC/MS library is available from several different GC/MS vendors.