*How to make a build you have created externally available on CLUE:*
The first thing you'll need to do is fill out the build request form for an external build, selecting only "build" in the sig tools list. The build tool is autoselected and read-only, so you shouldn't need to select any other options on the page.
Submit the request, and the dataset detailed view will be available at clue.io/data/PROJECT#DATASET.
Once the build is submitted, you can use the "Add/ReRun Tools" button to select further analyses to run on the build, if you require them. Additional tools are optional for external builds because we assume that, if you are uploading an external build, the L-build portion has already been completed on your end (if the build is L1000).
Once the tools have been selected, you can upload your build to S3. The files should be stored at the following location: s3://macchiato.clue.io/builds/BUILD_NAME/build/.
The build folder should have the modz and siginfo files, if they exist for the build.
If you want macchiato tools to be run against your build, it should have the following structure:
    build_name/
        build/
            name_MODZ.gctx (example: build_name_MODZ_n296x978.gctx)
            dn50_lm.gmt
            instinfo.txt
            siginfo.txt
            up50_lm.gmt
The up and down gene sets should contain Entrez gene IDs.
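Before uploading, it can help to check locally that the build folder matches the structure above. The following is a minimal sketch, not part of the CLUE tooling: the function name `missing_build_files` and the glob pattern for the MODZ file are assumptions based on the example file name given above.

```python
from pathlib import Path

# File names come from the structure listed above. The MODZ glob pattern is an
# assumption derived from the example build_name_MODZ_n296x978.gctx.
EXPECTED_FILES = ["dn50_lm.gmt", "instinfo.txt", "siginfo.txt", "up50_lm.gmt"]
MODZ_PATTERN = "*_MODZ*.gctx"


def missing_build_files(build_root):
    """Return the expected entries that are absent from <build_root>/build."""
    build_dir = Path(build_root) / "build"
    if not build_dir.is_dir():
        # Nothing to inspect: report every expected entry as missing.
        return EXPECTED_FILES + [MODZ_PATTERN]
    missing = [name for name in EXPECTED_FILES if not (build_dir / name).is_file()]
    if not any(build_dir.glob(MODZ_PATTERN)):
        missing.append(MODZ_PATTERN)
    return missing
```

An empty list means every expected file was found; anything else names the files still to be added before upload.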
After all necessary files have been uploaded to the proper locations, you can use the "Start build" button to begin running tools on the build, and the status will be set to "Running" until all tools have completed.
The results will be posted to the #ntf-build channel on Slack. Clicking on the link that appears there will take you to the dataset detailed view in Data Library.
If you have files that you would like to appear in the Downloads section on CLUE, you must provide a manifest file that lists them (please refer to the second portion of this documentation, which explains how to use a MANIFEST file).
*How to use a MANIFEST file:*
This section explains how to use a MANIFEST file that you have already uploaded to S3 to update the resources for a particular build on CLUE. A sample manifest file can be found at s3://macchiato.clue.io/builds/PBENCH_A/MANIFEST.txt.
Assumptions:
1. A build request already exists.
2. You know the ID of the build. If you don't know the ID, you can find it in one of two ways.
Option 1: Go to clue.io/data. Click on the project, then on the build name. On the build page, click on the "Info" icon at the top right and look for the field "ID".
Option 2: Use the API.
```curl -X GET --header "Accept: application/json" --header "user_key: API_KEY" 'https://api.clue.io/api/data?filter={"where": {"BAR": "FOO_BAR"}, "fields": ["id", "name"]}'```
Where "BAR" is either the string "name" or "display_name", and "FOO_BAR" is the name or the display_name value that you used in your build request. So if "BAR" is "display_name", then "FOO_BAR" should be the display_name used in your request; if "BAR" is "name", then "FOO_BAR" should be the name used in your request.
Note that you will have to URL encode the where clause.
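The URL encoding step can be sketched in Python as follows. This is only an illustration: the helper name `build_lookup_url` is hypothetical, and it percent-encodes the entire filter value (not just the where clause), which is an assumption about what the API accepts.

```python
import json
from urllib.parse import quote

API_BASE = "https://api.clue.io/api/data"


def build_lookup_url(field, value):
    """Return the lookup URL with the JSON filter percent-encoded."""
    if field not in ("name", "display_name"):
        raise ValueError("field must be 'name' or 'display_name'")
    # Same filter shape as the curl example above.
    query_filter = {"where": {field: value}, "fields": ["id", "name"]}
    # safe="" encodes every reserved character, including braces and quotes.
    return API_BASE + "?filter=" + quote(json.dumps(query_filter), safe="")
```

The resulting URL can be passed to curl as-is, together with the "Accept" and "user_key" headers shown above.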
3. The assets, i.e. all the files listed in MANIFEST.txt, have already been uploaded somewhere under s3://macchiato.clue.io/builds/DATASET_NAME/
DATASET_NAME is the name of the folder that holds all the assets listed in MANIFEST.txt
4. The manifest file is named exactly "MANIFEST.txt". Note the capitalization
5. The manifest file has one column named "file_name" and another named "level_desc"
"file_name" is the relative path of a file that should be downloadable from the data library.
For example, foo/bar/blah.txt means that the file is located on S3 at s3://macchiato.clue.io/builds/DATASET_NAME/foo/bar/blah.txt
If the file is at the top level, just use its name, for example blah.txt
"level_desc" holds the description for that file
6. Calling the API endpoint below replaces the current downloadable resources for the build with the contents of this manifest file
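As an illustration of the format described in the list above, here is a minimal Python sketch that writes a manifest with the two required columns. The tab delimiter is an assumption (the delimiter is not stated here, so compare against the sample manifest on S3 before relying on it), and the helper name `write_manifest` is hypothetical.

```python
import csv


def write_manifest(path, entries, delimiter="\t"):
    """Write (file_name, level_desc) pairs to a manifest file.

    The tab delimiter is an assumption; check the sample manifest.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        writer.writerow(["file_name", "level_desc"])
        for file_name, level_desc in entries:
            # file_name must be relative to the DATASET_NAME folder.
            if file_name.startswith("/") or file_name.startswith("s3://"):
                raise ValueError(f"file_name must be a relative path: {file_name}")
            writer.writerow([file_name, level_desc])
```

For example, `write_manifest("MANIFEST.txt", [("foo/bar/blah.txt", "Example description")])` produces a header row followed by one row per downloadable file.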
**API EndPoint:**
To make the resources available on CLUE, execute a cURL request to the CLUE API that looks something like the following:
```curl -X PUT --header "Content-Type: application/x-www-form-urlencoded" --header "Accept: application/json" --header "user_key: XYZ" -d "s3_path_to_manifest=NAME_OF_PARENT_FOLDER_OF_MANIFEST_FILE_ON_S3" "https://api.clue.io/api/data/ID_OF_BUILD/updateWithManifest"```
Replace:
1. "XYZ" with your API key.
2. "NAME_OF_PARENT_FOLDER_OF_MANIFEST_FILE_ON_S3" with the name of the immediate parent folder of your manifest file on S3. For example, if your MANIFEST.txt file is located under s3://macchiato.clue.io/builds/PCAL127-155_T3A/, use PCAL127-155_T3A.
3. "ID_OF_BUILD" with the database ID of the build. You can find the database ID as described under the assumptions above.
After executing the curl request, confirm that the resources show up on the relevant build as downloadable resources.
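For scripted use, the same PUT request can be assembled in Python with the standard library. This is a sketch under the assumption that the endpoint behaves exactly as in the curl example above; the function name `build_manifest_update_request` is hypothetical, and the function only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_manifest_update_request(build_id, manifest_folder, api_key):
    """Build (but do not send) the PUT request from the curl example."""
    url = f"https://api.clue.io/api/data/{build_id}/updateWithManifest"
    # Form-encoded body, matching the -d option of the curl command.
    body = urlencode({"s3_path_to_manifest": manifest_folder}).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
        "user_key": api_key,
    }
    return Request(url, data=body, headers=headers, method="PUT")
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes it easy to inspect the URL and body first, which matters because the call replaces the build's existing downloadable resources.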
**Configuring data table and ICV for a build:**
The following is an example configuration for setting the columns in the build's data table and the view in ICV:
```
{
  "table": [
    {
      "field": "pert_iname",
      "dataType": "string",
      "visible": true
    },
    {
      "field": "moa",
      "dataType": "string",
      "visible": true
    },
    {
      "field": "cell_iname",
      "dataType": "string",
      "visible": true
    },
    {
      "field": "strength",
      "dataType": "number",
      "visible": true
    },
    {
      "field": "modz_core_72h",
      "dataType": "number",
      "visible": true
    },
    {
      "field": "sqrt_auc_prep",
      "dataType": "number",
      "visible": true
    },
    {
      "field": "is_sensitive_72h",
      "dataType": "string",
      "visible": true
    }
  ],
  "icv": {
    "columns": [
      {
        "field": "pert_iname",
        "display": "text"
      },
      {
        "field": "moa",
        "display": "text"
      },
      {
        "field": "cell_id",
        "display": "text"
      },
      {
        "field": "strength",
        "display": "bar"
      },
      {
        "field": "modz_core_72h",
        "display": "bar"
      },
      {
        "field": "sqrt_auc_prep",
        "display": "bar"
      },
      {
        "field": "is_sensitive_72h",
        "display": "color"
      }
    ]
  }
}
```
The columns in the table are specified by an array of column objects. Each column object is structured as follows:
```
{
  "field": "XXX",
  "dataType": "YYY",
  "visible": Z
}
```
Where XXX represents the field name in the data as a string, YYY represents the data type (such as string, number, boolean, etc.) as a string, and Z is a boolean indicating whether or not the field should be visible when the table is first loaded.
The configurations should go in a file called data-app.json, which should be placed in the arfs directory of the build.
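Since a malformed data-app.json is easy to produce by hand, a quick structural check before uploading may be useful. The sketch below is an assumption-laden illustration rather than an official validator: it checks only the keys and types that appear in the example configuration above, and the function name `validate_data_app_config` is hypothetical.

```python
import json


def validate_data_app_config(config):
    """Return a list of problems found in a data-app.json style dict."""
    problems = []
    # Table columns: "field" and "dataType" are strings, "visible" is a boolean.
    for i, column in enumerate(config.get("table", [])):
        for key in ("field", "dataType"):
            if not isinstance(column.get(key), str):
                problems.append(f"table[{i}]: '{key}' must be a string")
        if not isinstance(column.get("visible"), bool):
            problems.append(f"table[{i}]: 'visible' must be true or false")
    # ICV columns: "field" and "display" are strings in the example.
    for i, column in enumerate(config.get("icv", {}).get("columns", [])):
        for key in ("field", "display"):
            if not isinstance(column.get(key), str):
                problems.append(f"icv.columns[{i}]: '{key}' must be a string")
    return problems


def validate_data_app_text(text):
    """Parse JSON text and validate it; invalid JSON raises json.JSONDecodeError."""
    return validate_data_app_config(json.loads(text))
```

An empty list means the configuration has the expected shape. The check does not verify that the field names exist in the build's data.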