About CLUE

CLUE is a cloud-based software platform for the analysis of perturbational datasets generated using gene expression (L1000) and proteomic (P100 and GCP) assays. The CLUE platform provides integrated access to datasets, results from the processing and analysis of these data, and software tools that the community can leverage to advance their research.


The CLUE Platform

Recent technological improvements have resulted in a dramatic increase in high-dimensional perturbational datasets available to the biomedical community. However, the sheer size of the data and the complexity of integrating across multiple assays, cell types, doses, and treatment times mean that users need considerable computational expertise to ask questions of the data.

Biologists need intuitive and performant user interfaces to explore and query the dataset and evaluate hypotheses. Even computational researchers must spend considerable effort downloading and formatting data at this scale, which is sometimes a barrier to their engaging with the data at all.

To address these needs, we have developed a computational environment built from the ground up to execute on state-of-the-art cloud-based systems. This environment, which we call CLUE, is built to meet the following goals:

Lower the barrier to access by making data and tools available on the cloud, thereby 1) eliminating the need to download massive files and 2) allowing users to stay in sync with the latest data releases
Facilitate interoperability between perturbational data types by harmonizing datasets
Implement web applications with user-friendly graphical interfaces that give access to the sophisticated algorithms underneath

Availability and Use

Enhancements continue to be made to the CLUE platform, and the scope of the available datasets will continue to grow. As of today, it has already been loaded with over 1 million gene expression profiles from the Connectivity Map dataset, related perturbational datasets, analytical tools, and web-based applications. These data and tools are freely available to academic users. Recognizing that drug-discovery companies will want to leverage this work for their own research programs while keeping their proprietary data confidential, we also offer CLUE as a subscription. See details at our subscription page.

Web Apps

Touchstone App

“Touchstone” refers to a dataset of compound and genetic perturbagens that are well-studied and generate robust gene expression signatures in cells. Thus, the Touchstone dataset serves as a benchmark for assessing connectivity among perturbagens. Use the Touchstone app to learn more about these perturbagens and explore their connectivities.

Query App

Use the Query app to find positive and negative connections between your gene expression signature of interest and all the signatures in CMap.
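
To make "positive" and "negative" connections concrete, here is a toy sketch in Python. It is not the scoring method used by the Query app, which this page does not describe. It assumes a query made of an up-regulated and a down-regulated gene set, and reference signatures stored as gene-to-score mappings. The function name, gene names, and perturbagen names are all hypothetical.

```python
from typing import Dict, Iterable


def toy_connectivity(up: Iterable[str], down: Iterable[str],
                     reference: Dict[str, float]) -> float:
    """Illustrative score only: half the difference between the mean
    reference value of the query's up genes and of its down genes.
    Positive means the reference moves the genes the same way as the
    query; negative means it moves them the opposite way."""
    up_vals = [reference[g] for g in up if g in reference]
    down_vals = [reference[g] for g in down if g in reference]
    if not up_vals or not down_vals:
        return 0.0  # nothing to compare against
    up_mean = sum(up_vals) / len(up_vals)
    down_mean = sum(down_vals) / len(down_vals)
    return (up_mean - down_mean) / 2


# Hypothetical reference signatures (gene -> differential expression score).
references = {
    "pert_A": {"G1": 2.0, "G2": 1.0, "G3": -1.5, "G4": -2.5},
    "pert_B": {"G1": -2.0, "G2": -1.0, "G3": 1.5, "G4": 2.5},
}

# Hypothetical query: G1 and G2 go up, G3 and G4 go down.
scores = {
    name: toy_connectivity(["G1", "G2"], ["G3", "G4"], sig)
    for name, sig in references.items()
}
```

In this made-up example, pert_A scores positive because it shifts the query genes in the same direction, while pert_B scores negative because it reverses them.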

Proteomics Query App

This query app finds positive and negative connections between your protein sets of interest and the Touchstone-P reference set of P100 and GCP proteomics data.

Morpheus App

Morpheus is an interactive version of the ICV that lets you manipulate and annotate an existing dataset or one of your choice.

Repurposing App

Explore the Broad Institute’s repurposing collection of ~5000 tool compounds and drugs for drug discovery opportunities.

Analysis Tools

Large datasets can be an enigmatic monolith without the proper interface to access and interpret the information they hold. We offer command line interfaces (CLI) for computational biologists, APIs for software engineers, and web-based software applications for all. Check out our collection of Web Apps and Developer's Tools.

Our API provides metadata about compounds, genes, cell lines, and signatures. We have also developed command line interfaces with tools for computationalists and developers.
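
As a rough illustration of what a metadata lookup might look like, the Python sketch below builds a query URL and, optionally, fetches the result. The details are assumptions not stated on this page: the base URL `https://api.clue.io/api`, the resource name `perts`, the `pert_iname` field, the JSON `filter` parameter, and the `user_key` header. Check the API documentation before relying on any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed base URL; confirm against the API documentation.
API_BASE = "https://api.clue.io/api"


def build_query_url(resource: str, where: dict, limit: int = 10) -> str:
    """Build a URL for a metadata resource (assumed names: perts, genes,
    cells, sigs) with a JSON-encoded filter in the query string."""
    filter_spec = {"where": where, "limit": limit}
    return f"{API_BASE}/{resource}?" + urlencode({"filter": json.dumps(filter_spec)})


def fetch_metadata(resource: str, where: dict, user_key: str, limit: int = 10):
    """Send the request and decode the JSON response.
    Assumes the API key is passed in a 'user_key' header."""
    request = Request(
        build_query_url(resource, where, limit),
        headers={"user_key": user_key, "Accept": "application/json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the URL needs no network access or key.
example_url = build_query_url("perts", {"pert_iname": "vorinostat"}, limit=1)
```

Calling `fetch_metadata("perts", {"pert_iname": "vorinostat"}, user_key="YOUR_KEY")` would then return the decoded records, assuming the endpoint and field names above are correct.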

Acknowledgements

We are grateful for the important contributions from the Broad community and third-party code developers.