Connectivity Map

Unravel biology with the world’s largest perturbation-driven gene expression dataset.


We are creating a genome-scale library of cellular signatures that catalogs transcriptional responses to chemical, genetic, and disease perturbation. To date, the library contains 1,800,255 profiles resulting from perturbations of multiple cell types.


CMap is a resource that uses gene expression signatures to probe relationships between diseases, cell physiology, and therapeutics. The pattern of gene expression (a "signature") that arises from a disease, a genetic perturbation (knockdown or overexpression of a gene), or treatment with a small-molecule compound is compared for similarity to all perturbational signatures in the database. Perturbagens that give rise to highly similar (or opposing) expression signatures are "connected" and thus may have related effects on the cell. Our goal is to use these connections to uncover novel treatments for a variety of diseases, including cancers, neurological diseases, and infectious diseases.
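To make the idea of "connected" signatures concrete, here is a minimal toy sketch in Python. It is not the scoring method CMap uses. Cosine similarity is an assumption chosen here for simplicity, and the function names, perturbagen names, and expression values are all hypothetical. It only illustrates how a query signature might be ranked against a small library so that similar signatures score near +1 and opposing ones near -1.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length expression vectors.

    Assumption: cosine similarity is a stand-in metric for illustration,
    not the connectivity score computed by CMap.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero signature carries no direction to compare
    return dot / (norm_a * norm_b)

def rank_connections(query, library):
    """Score every library signature against the query, most similar first."""
    scores = {name: cosine_similarity(query, sig) for name, sig in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: each vector holds differential expression values
# for the same four genes, in the same order.
query = [2.0, -1.5, 0.5, -2.0]
library = {
    "compound_A": [1.8, -1.2, 0.4, -2.2],   # similar to the query
    "compound_B": [-2.1, 1.4, -0.6, 1.9],   # roughly opposite to the query
    "knockdown_C": [0.1, 0.2, -0.1, 0.0],   # weak, largely unrelated
}

ranked = rank_connections(query, library)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")
```

In this toy setting, the top of the ranking suggests perturbagens with related effects and the bottom suggests opposing ones, and both ends are of interest when looking for connections.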


The data is a massive catalog of gene expression profiles representing transcriptional responses to a wide variety of chemical, genetic, and disease perturbations.


Big data sets can be an enigmatic monolith without the proper interface to access and interpret the information they hold. We offer command-line interfaces (CLIs) for computational biologists, APIs for software engineers, and web-based software applications for all. Check out our collection of Web Apps and Developer's Tools.


We are grateful for the important contributions from the Broad community, the CMap Team, our research collaborators, and third-party code developers.

Contact CMap