CMap Docker Resource
General Guide to Dockers
To download a docker container from the command line:
docker pull DOCKERNAME
About This Guide
Throughout this guide, variables are indicated by curly brackets, e.g. {variable_name}.
Many of the Dockers require the user to mount a local directory as a volume inside the Docker, using the -v argument. The mounted directory gives the Docker access to those local files. This flag is used in two ways in this document:
Volume binding for inputs and outputs: all paths refer to file locations within the Docker relative to this binding, indicated by {desired alias}.
Pre-computed background files: some Dockers require these, and their paths are hard-coded inside the Docker, indicated by an alias path without brackets.
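To make the first use-case concrete, the sketch below shows how a local file path translates into the path the Docker sees once a directory is bound with -v. The directory name, the /mnt alias, and the variable names are illustrative assumptions borrowed from the examples in this guide; they are not required values.

```shell
# Local directory to mount and the alias it receives inside the Docker
# (both values are illustrative assumptions).
local_dir="$HOME/my_directory"
alias_dir="/mnt"

# A file that lives somewhere under the local directory.
local_file="$local_dir/input/example.gctx"

# Strip the local prefix and prepend the alias to get the in-Docker path.
docker_file="$alias_dir/${local_file#"$local_dir"/}"

# The -v argument that creates this binding, and the resulting path.
echo "-v $local_dir:$alias_dir"
echo "$docker_file"
```

Any path passed as an argument to a tool should be written in the second form, since the tool only sees the file system inside the Docker.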
cmap/gctx-to-gct
Convert a GCTX file to GCT
Arguments:
filepath
: path (mounted within docker) to GCTX file to be converted
outdir
: path (mounted within docker) to subdirectory where output will be saved (requires a trailing /)
outpath
: output filename (will be located within the outdir)
Command:
docker run -it --rm \
    --name {name} \
    -v {local path to mount}:{desired alias} \
    cmap/gctx-to-gct \
    --filepath {filepath} \
    --outdir {outdir} \
    --outpath {outpath}
Example:
Input full path: ~/my_directory/example.gctx
Docker command:
docker run -it --rm --name gctx_converter -v ~/my_directory/:/mnt/ \
    cmap/gctx-to-gct --filepath /mnt/example.gctx --outdir /mnt/converted/ \
    --outpath example.gct
Output full path: ~/my_directory/converted/example.gct
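Because outdir requires a trailing /, it can help to normalize the value before building the command. The following is a minimal sketch of that check; the variable name and the example value are assumptions for illustration only.

```shell
# Example value for --outdir (illustrative; note the missing trailing slash).
outdir="/mnt/converted"

# Append a trailing slash only if one is not already present.
case "$outdir" in
  */) ;;                      # already ends with /, leave unchanged
  *)  outdir="$outdir/" ;;    # add the required trailing /
esac

echo "--outdir $outdir"
```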
cmap/sig_slice_tool
Extract a subset from a larger dataset
(can also be used to convert GCT↔GCTX if cid and rid are .grp files listing all column and row ids)
Arguments:
create_subdir
: (boolean) whether or not to create a subdirectory for output
cid
: path (mounted within docker) to .grp file of column ids to extract from input
rid
: path (mounted within docker) to .grp file of row ids to extract from input
ds
: path (mounted within docker) to input file to be sliced
out
: path (mounted within docker) where output will be saved, including the file name
use_gctx
: (boolean) whether to save output as a .gctx file; 0 saves a .gct file instead
Command:
docker run --rm \
    --name sig_slice_tool \
    -v {local path to mount}:{desired alias} \
    -it cmap/sig_slice_tool \
    --create_subdir {0 or 1} \
    --cid {cid} \
    --rid {rid} \
    --ds {ds} \
    --out {out} \
    --use_gctx {use_gctx}
Example:
My local directory structure:
~/my_directory/
    input/
        cid.grp
        rid.grp
        input.gct
Docker command:
docker run --rm --name sig_slice_tool \
    -v ~/my_directory/:/mnt/ -it cmap/sig_slice_tool --create_subdir 1 \
    --cid /mnt/input/cid.grp --rid /mnt/input/rid.grp --ds /mnt/input/input.gct \
    --out /mnt/output/ --use_gctx 1
Output full path: ~/my_directory/output/subset.gctx
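The cid and rid arguments both point to .grp files, which are plain-text lists with one id per line. The sketch below writes two such files into a temporary directory laid out like the example above. The ids shown are hypothetical placeholders; real files would list ids that actually occur in the input dataset.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir="$(mktemp -d)"
mkdir -p "$workdir/input"

# One id per line. These ids are placeholders, not real CMap identifiers.
printf '%s\n' sample_1 sample_2 sample_3 > "$workdir/input/cid.grp"
printf '%s\n' feature_1 feature_2 > "$workdir/input/rid.grp"

# Count the non-empty lines to confirm how many ids each file holds.
cid_count="$(grep -c . "$workdir/input/cid.grp")"
rid_count="$(grep -c . "$workdir/input/rid.grp")"

echo "column ids: $cid_count"
echo "row ids: $rid_count"
```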
cmap/sig_collate_tool
Combine multiple GCT(X) files into a single GCT(X)
Arguments:
files
: path (mounted within docker) to .grp file of names of files to be collated
parent_folder
: path (mounted within docker) to parent directory where files listed in grp are located
out
: path (mounted within docker) to output directory
Command:
docker run --name sig_collate_tool \
    -v {local path to mount}:{desired alias} \
    -w {working directory is desired alias} \
    -t cmap/sig_collate_tool \
    --files {files} \
    --parent_folder {parent_folder} \
    --out {out}
Example:
My local directory:
~/my_directory/
    input/
        files.grp
        uncollated/
            file1.gct
            file2.gct
            file3.gct
            ...
Docker command:
docker run --name sig_collate_tool -v ~/my_directory/:/mnt/ \
    -w /mnt/ -t cmap/sig_collate_tool --files /mnt/input/files.grp \
    --parent_folder /mnt/input/uncollated --out /mnt/output
Output full path: ~/my_directory/output/result.gctx
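Rather than typing files.grp by hand, it can be generated from the contents of the parent folder. The sketch below assumes that the file expects bare file names (one per line) that the tool resolves against parent_folder, as the argument descriptions suggest; the empty .gct files are stand-ins created only so the example is self-contained.

```shell
# Work in a throwaway directory laid out like the example above.
workdir="$(mktemp -d)"
mkdir -p "$workdir/input/uncollated"

# Empty stand-ins for real GCT files.
touch "$workdir/input/uncollated/file1.gct" \
      "$workdir/input/uncollated/file2.gct" \
      "$workdir/input/uncollated/file3.gct"

# Write one file name per line, without the directory part.
for f in "$workdir"/input/uncollated/*.gct; do
  basename "$f"
done > "$workdir/input/files.grp"

cat "$workdir/input/files.grp"
```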
cmap/sig_prot_query_tool
Run a proteomics query for connectivity of a custom GCT with Touchstone-P
Yml contents:
assay: (P100 || GCP)
name: (string)
introspect: (true or false, whether or not to compute internal connectivity)
input_file: (path to input mounted within docker)
fields_to_aggregate: [(list of strings referring to the set of columns that will be aggregated to identify unique perturbagens)]
out_dir: (path to output directory mounted within docker)
psp_on_clue_yml: clue/psp_on_clue.yml
Arguments:
config
: path (mounted within docker) to yml configuration
out
: path (mounted within docker) where output will be saved, including filename [NB: overrides the out_dir argument in the yml]
Command:
docker run --rm \
    --name sig_prot_query_tool \
    -v {local path to mount}:{desired alias} \
    -it cmap/sig_prot_query_tool \
    --config {config} --out {out}
Example:
My local directory:
~/my_directory/
    input/
        my_configuration.yml
        input.gct
my_configuration.yml contents:
assay: P100
name: my_query
introspect: true
input_file: /mnt/input/input.gct
fields_to_aggregate: ["pert_id", "cell_id", "pert_time"]
out_dir: /mnt/this_is_overridden
psp_on_clue_yml: clue/psp_on_clue.yml
Docker command:
docker run --rm --name sig_prot_query_tool \
    -v ~/my_directory/:/mnt/ -it cmap/sig_prot_query_tool \
    --config /mnt/input/my_configuration.yml \
    --out /mnt/output
Output full path: ~/my_directory/output/
Expected files:
INTROSPECT_CONN.gct
CONCATED_CONN.gct
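The configuration file for this example can be written from the command line as ordinary key: value YAML. The sketch below creates it in a temporary directory and then checks that every key listed under "Yml contents" is present. The values are copied from the example above; the key check is an illustrative convenience and not part of the tool.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir="$(mktemp -d)"
mkdir -p "$workdir/input"
config="$workdir/input/my_configuration.yml"

# Write the example configuration (values taken from the example above).
cat > "$config" <<'EOF'
assay: P100
name: my_query
introspect: true
input_file: /mnt/input/input.gct
fields_to_aggregate: ["pert_id", "cell_id", "pert_time"]
out_dir: /mnt/this_is_overridden
psp_on_clue_yml: clue/psp_on_clue.yml
EOF

# Confirm that every key described in this section appears in the file.
missing=0
for key in assay name introspect input_file fields_to_aggregate out_dir psp_on_clue_yml; do
  if ! grep -q "^${key}:" "$config"; then
    echo "missing key: $key"
    missing=$((missing + 1))
  fi
done
echo "missing keys: $missing"
```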
cmap/cmappy
Conda environment for CMapPy
Arguments:
None. Running the Docker with no additional command puts the user in a shell environment with cmappy_env activated.
Command:
docker run -it \
    -v {local path to mount}:{desired alias} \
    cmap/cmappy \
    {any additional command will be run in the shell}
Example:
Input full path: -
Docker command: docker run -it cmap/cmappy python
Output: Docker is now running python, with the ability to import cmapPy
cmap/sig_recall_tool
Compare replicate signatures to assess similarity
Arguments:
ds_list
: path (mounted within docker) to a .grp file of file paths (for a single replicate set) or a .tsv file with the columns group_id and file_path (for multiple replicate sets)
metric
: one of 'spearman' or 'wtcs'
set_size
: (for the wtcs metric only) recommended value is 50
Command:
docker run --rm \
    --name sig_recall \
    -v {local path to mount}:{desired alias} \
    -it cmap/sig_recall_tool \
    --ds_list {ds_list} \
    --metric {metric}
Example:
My local directory:
~/my_directory/
    input.tsv
    classA/
        input1.gctx
        input2.gctx
    classB/
        input1.gctx
        input2.gctx
        input3.gctx
input.tsv file contents (tab-separated; paths are given as seen inside the docker, under the /mnt/ mount used in the command below):
group_id    file_path
A           /mnt/classA/input1.gctx
A           /mnt/classA/input2.gctx
B           /mnt/classB/input1.gctx
B           /mnt/classB/input2.gctx
B           /mnt/classB/input3.gctx
Docker command:
docker run --rm --name sig_recall -v ~/my_directory/:/mnt/ \
    -it cmap/sig_recall_tool --ds_list /mnt/input.tsv --metric 'spearman'
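For more than a handful of replicates, the ds_list table can be generated from the directory layout instead of being typed by hand. The sketch below assumes one subdirectory per replicate group, named class followed by the group id, and writes a tab-separated file with the two required columns. The container_prefix variable is an assumption: set it to whatever prefix the files have inside the docker (for example the alias given to -v). The empty .gctx files are stand-ins so the example is self-contained.

```shell
# Work in a throwaway directory laid out like the example above.
workdir="$(mktemp -d)"
mkdir -p "$workdir/classA" "$workdir/classB"
touch "$workdir/classA/input1.gctx" "$workdir/classA/input2.gctx"
touch "$workdir/classB/input1.gctx" "$workdir/classB/input2.gctx" \
      "$workdir/classB/input3.gctx"

# Prefix the files will have inside the docker (assumption; adjust as needed).
container_prefix="/mnt"

{
  # Header row with the two required column names.
  printf 'group_id\tfile_path\n'
  for dir in "$workdir"/class*/; do
    dir_name="$(basename "$dir")"      # e.g. classA
    group="${dir_name#class}"          # e.g. A
    for f in "$dir"*.gctx; do
      printf '%s\t%s\n' "$group" "$container_prefix/$dir_name/$(basename "$f")"
    done
  done
} > "$workdir/input.tsv"

cat "$workdir/input.tsv"
```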
cmap/build_synopsis_tool
Given a build directory, generates a report containing functional and technical QC plots.
Arguments:
--inpath
: path to the build directory
--out
: the output directory [default: .]
--rpt
: prefix to append to the output directory; only applies if --create_subdir is also passed [default: my_analysis]
--opts
: RDS file containing argument values
--title
: title for the report [default: ]
Command:
docker run \
    -it \
    -v /path/to/output/:/output \
    cmap/sig_build_synopsis_tool \
    --runtests \
    --out /output
Example:
Run the tool in standard mode:
sig_build_synopsis_tool --inpath /path/to/L-build --title BuildName