Thierry_S
Super User

JMP 16.1 > WIN > JSL > Parse Complex JSON

Hi JMP Community,

 

I need to retrieve biological information from the Human Protein Atlas public website based on a unique identifier.

So far, I can easily retrieve and parse the JSON file from the website using the simple code below.

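Names Default To Here( 1 );

// Build the request URL from the Ensembl gene identifier.
ID = "ENSG00000108691";
HTM_c = "https://www.proteinatlas.org/XXX.json";
HTM_str = Substitute( HTM_c, "XXX", ID );

// Fetch the JSON text and parse it into JSL lists and associative arrays.
request = New HTTP Request( URL( HTM_str ), Method( "Get" ) );
temp = request << Send();
js_par = Parse JSON( temp );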

The problem arises when I try to collect only a selected portion of the JSON document (see attached):

1) The command "js_par << Get Keys" retrieves only the top-level keys, leaving the "sub-keys" unavailable.

2) The command "js_par << Get Values" does not return the actual values, but the data "containers" that hold them.

 

Hence, how can I break down the JSON document to enable the retrieval of specific Keys and Values?

 

Any hint would be welcome.

 

Thank you.

Best,

TS

 

 

Thierry R. Sornasse
1 ACCEPTED SOLUTION
Craige_Hales
Super User

Re: JMP 16.1 > WIN > JSL > Parse Complex JSON

Actually, after reformatting the file (Visual Studio Code worked), it probably makes more sense to use stack(0) and get 1 row. Each column then holds a list of attributes for that column (I'm guessing). The synonym column has 8 possibilities and the protein class column has 5. They should not be aligned on rows the way the first example shows; I'm guessing that's not what was intended.

The table will have expression columns instead of character columns with delimiters if you ask for them:

capture2.png

 

[
    {
        "Gene": "CCL2",
        "Gene synonym": [
            "GDCF-2",
            "HC11",
            "MCAF",
            "MCP-1",
            "MCP1",
            "MGC9434",
            "SCYA2",
            "SMC-CF"
        ],
        "Ensembl": "ENSG00000108691",
        "Gene description": "C-C motif chemokine ligand 2",
        "Uniprot": [
            "P13500"
        ],
        "Chromosome": "17",
        "Position": "34255274-34257208",
        "Protein class": [
            "Cancer-related genes",
            "Candidate cardiovascular disease genes",
            "Human disease related genes",
            "Plasma proteins",
            "Predicted secreted proteins"
        ],
        "Biological process": [
            "Chemotaxis",
            "Inflammatory response"
        ],
        "Molecular function": [
            "Cytokine"
        ],
        "Disease involvement": [
            "Cancer-related genes"
        ],
        "Evidence": "Evidence at protein level",
        "HPA evidence": "Evidence at protein level",
        "UniProt evidence": "Evidence at protein level",
        "NeXtProt evidence": "Evidence at protein level",
        "RNA tissue specificity": "Tissue enhanced",
        "RNA tissue distribution": "Detected in all",
        "RNA tissue specificity score": null,
        "RNA tissue specific nTPM": {
            "urinary bladder": "726.7"
        },
        "RNA single cell type specificity": "Cell type enhanced",
        "RNA single cell type distribution": "Detected in many",
        "RNA single cell type specificity score": null,
        "RNA single cell type specific nTPM": {
            "Ductal cells": "2990.5",
            "Exocrine glandular cells": "1242.1",
            "Pancreatic endocrine cells": "1696.3",
            "Secretory cells": "2061.7",
            "Smooth muscle cells": "1284.8"
        },
        "RNA cancer specificity": "Low cancer specificity",
        "RNA cancer distribution": "Detected in all",
        "RNA cancer specificity score": null,
        "RNA cancer specific FPKM": null,
        "RNA brain regional specificity": "Low region specificity",
        "RNA brain regional distribution": "Detected in all",
        "RNA brain regional specificity score": null,
        "RNA brain regional specific nTPM": null,
        "RNA blood cell specificity": "Group enriched",
        "RNA blood cell distribution": "Detected in some",
        "RNA blood cell specificity score": "6",
        "RNA blood cell specific nTPM": {
            "classical monocyte": "7.6",
            "eosinophil": "3.1"
        },
        "RNA blood lineage specificity": "Group enriched",
        "RNA blood lineage distribution": "Detected in many",
        "RNA blood lineage specificity score": "6",
        "RNA blood lineage specific nTPM": {
            "granulocytes": "3.1",
            "monocytes": "7.6"
        },
        "RNA cell line specificity": "Low cancer specificity",
        "RNA cell line distribution": "Detected in many",
        "RNA cell line specificity score": null,
        "RNA cell line specific nTPM": null,
        "RNA tissue cell type enrichment": [
            "Pancreas - Ductal cells"
        ],
        "RNA mouse brain regional specificity": "Low region specificity",
        "RNA mouse brain regional distribution": "Detected in some",
        "RNA mouse brain regional specificity score": null,
        "RNA mouse brain regional specific nTPM": null,
        "RNA pig brain regional specificity": "Group enriched",
        "RNA pig brain regional distribution": "Detected in all",
        "RNA pig brain regional specificity score": "4",
        "RNA pig brain regional specific nTPM": {
            "midbrain": "13.1",
            "spinal cord": "52.4"
        },
        "Antibody": [
            "CAB013676",
            "HPA019163"
        ],
        "Reliability (IH)": "Approved",
        "Reliability (Mouse Brain)": null,
        "Reliability (IF)": "Approved",
        "Subcellular location": [
            "Golgi apparatus",
            "Vesicles"
        ],
        "Secretome location": "Secreted to blood",
        "Secretome function": "Chemokine",
        "CCD Protein": "NA",
        "CCD Transcript": "NA",
        "Blood concentration - Conc. blood IM [pg\/L]": 323000,
        "Blood concentration - Conc. blood MS [pg\/L]": null,
        "Blood expression cluster": null,
        "Tissue expression cluster": "Cluster 7: Adipose tissue - Mixed function",
        "Brain expression cluster": "Cluster 12: Non-specific - Vasculature",
        "Cell line expression cluster": "Cluster 49: HMC-1 - Innate immune response",
        "Single cell expression cluster": "Cluster 48: Smooth muscle cells - Signal transduction",
        "Interactions": 4,
        "Subcellular main location": [
            "Golgi apparatus"
        ],
        "Subcellular additional location": [
            "Vesicles"
        ],
        "Antibody RRID": {
            "CAB013676": null,
            "HPA019163": "AB_1846179"
        },
        "Pathology prognostics - Breast cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "6.43e-2"
        },
        "Pathology prognostics - Cervical cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "4.14e-2"
        },
        "Pathology prognostics - Colorectal cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "4.78e-2"
        },
        "Pathology prognostics - Endometrial cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "5.57e-2"
        },
        "Pathology prognostics - Glioma": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "2.37e-3"
        },
        "Pathology prognostics - Head and neck cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "8.64e-2"
        },
        "Pathology prognostics - Liver cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "2.78e-2"
        },
        "Pathology prognostics - Lung cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "7.53e-2"
        },
        "Pathology prognostics - Melanoma": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "2.90e-1"
        },
        "Pathology prognostics - Ovarian cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "1.19e-1"
        },
        "Pathology prognostics - Pancreatic cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "9.33e-2"
        },
        "Pathology prognostics - Prostate cancer": {
            "prognostic type": "favorable",
            "is_prognostic": false,
            "p_val": "2.65e-1"
        },
        "Pathology prognostics - Renal cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": true,
            "p_val": "4.51e-4"
        },
        "Pathology prognostics - Stomach cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "1.33e-1"
        },
        "Pathology prognostics - Testis cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "2.50e-2"
        },
        "Pathology prognostics - Thyroid cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "1.81e-1"
        },
        "Pathology prognostics - Urothelial cancer": {
            "prognostic type": "unfavorable",
            "is_prognostic": false,
            "p_val": "4.92e-2"
        }
    }
]
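
For readers who want specific keys and values rather than a table, here is a minimal sketch of drilling into the parsed structure. It assumes Parse JSON returns JSON arrays as JSL lists and JSON objects as associative arrays; the variable names and the shortened record are illustrative only and not part of the original script.

```jsl
Names Default To Here( 1 );

// Shortened excerpt of the record above, embedded as a raw string so the sketch runs offline.
json_text = "\[[{"Gene": "CCL2", "Gene synonym": ["GDCF-2", "HC11"], "RNA blood cell specific nTPM": {"classical monocyte": "7.6", "eosinophil": "3.1"}}]]\";

parsed = Parse JSON( json_text );

// The document is a one-element array, so take the first item when a list comes back.
rec = If( Is List( parsed ), parsed[1], parsed );

gene = rec["Gene"];                        // plain string value
synonyms = rec["Gene synonym"];            // nested JSON array, assumed to arrive as a list
blood = rec["RNA blood cell specific nTPM"];  // nested JSON object, assumed to arrive as an associative array

// Sub-keys are reachable by sending Get Keys to the nested array, not the top-level one.
Show( gene, synonyms[1], blood << Get Keys, blood["classical monocyte"] );
```

The nTPM values are stored as strings in this file, so wrap them in Num() before doing arithmetic.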
Craige


3 REPLIES
Craige_Hales
Super User

Re: JMP 16.1 > WIN > JSL > Parse Complex JSON

See if the JSON import wizard does what you need. There are several ways to attack it; a JSL solution might look like:

dt = Open( "z:/ENSG00000108691.json", Guess( "huge", Stack( 1 ) ), JSON Wizard( 0 ) );

Change the 0 to a 1 to see the wizard dialog.
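
If saving the file to disk first is not convenient, a sketch along the following lines may work by passing the downloaded text to JSON To Data Table. It relies on JMP's default layout guess; whether the guess and stack options shown above carry over to this function is not confirmed here, and the variable names are illustrative.

```jsl
Names Default To Here( 1 );

// Same URL pattern as in the question.
ID = "ENSG00000108691";
url_str = Substitute( "https://www.proteinatlas.org/XXX.json", "XXX", ID );

request = New HTTP Request( URL( url_str ), Method( "GET" ) );
json_text = request << Send();

// Assumption: a successful request returns the body as a string.
If( Is String( json_text ),
	dt = JSON To Data Table( json_text ),
	Write( "Request failed; no JSON text returned.\!N" )
);
```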

capture.png

This file seems to have 8 records with the best-guess choice. If that doesn't match your expectation, you might want to hand-select what you want via the GUI, but I'm not sure you can do much more than remove unneeded columns.

(file->open, select the .json extension, select the data using preview, open.)
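
If the table import does not give enough control, another option is to walk the parsed structure directly in JSL. The sketch below uses a hypothetical helper named walk (not a JMP built-in) that visits every nested key and list item and collects a "path = value" string for each leaf. It assumes Parse JSON yields lists and associative arrays, and it uses a shortened record with no null or boolean values.

```jsl
Names Default To Here( 1 );

// Shortened excerpt of the file, embedded as a raw string so the sketch runs offline.
json_text = "\[[{"Gene": "CCL2", "Uniprot": ["P13500"], "Antibody RRID": {"HPA019163": "AB_1846179"}}]]\";

// Hypothetical helper: returns the list `out` extended with one entry per leaf value.
walk = Function( {node, path, out},
	{keys, i},
	If(
		Is Associative Array( node ),
			// JSON object: descend into each key.
			keys = node << Get Keys;
			For( i = 1, i <= N Items( keys ), i++,
				out = Recurse( node[keys[i]], path || "/" || keys[i], out )
			),
		Is List( node ),
			// JSON array: descend into each item by position.
			For( i = 1, i <= N Items( node ), i++,
				out = Recurse( node[i], path || "[" || Char( i ) || "]", out )
			),
		// Leaf value: record its path and value.
		Insert Into( out, path || " = " || Char( node ) )
	);
	out;
);

leaves = walk( Parse JSON( json_text ), "", {} );
Show( leaves );
```

Filtering the resulting list for a path of interest (for example, entries containing "Antibody RRID") is then an ordinary string search.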

The JSON (and XML) dialogs are almost identical.

Craige
lala
Level VII

Re: JMP 16.1 > WIN > JSL > Parse Complex JSON

 

Thanks!