cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
andykpdf
Level I

JSON to data table: how to avoid NULL columns being automatically dropped?

Hi All,

 

I'm using JMP 15 and have JSL that retrieves a JSON payload via Http Request(), runs it through Char to Blob(), and then open()s it in a data table. The JSON payload content is dynamic, but is always mappable to a tabular format (rows and columns).

 

The issue: if any keys in the JSON have all NULL values, they are not included in the data table (i.e., no columns are created for these keys). This is problematic, because I want all of the keys in the JSON to be displayed as columns in the data table, whether all values are NULL or not.

 

If I save the payload in a local file and open it directly using Open(), I get the desired result: a data table where all keys are converted to columns, and all NULL values are displayed as NULLs. However, this won't work for my use case.

 

Here is basic JSL that replicates the problem. Instead of Http Request(), I retrieve the data with Load Text File(), which gives the same results. The file "data.json" is attached.

dt1 = Open("/Users/abc/data.json", JSON ); // works as desired

raw = Load Text File("/Users/abc/data.json");
jsonblob = Char To Blob( raw ); 
dt2 = open(jsonblob, "JSON",  Guess("wide")); // does not work as desired -- drops all-NULL columns

Can anyone suggest a method to convert JSON in a variable into a data table that preserves the all-NULL keys? 

 

Thanks!

 

Andy

 

6 REPLIES 6
Craige_Hales
Super User

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

I wrote an answer below that might help, then I realized there is a simple answer you should try first. The data seems to be one of the pandas formats, so change "wide" to "pandas".

 

If that turns out wrong, try...

 

Use the wizard to explicitly select the columns, then use that script.

You'll probably want the stack option too.You'll probably want the stack option too.

I also changed the Row generator from root to items, because it is a little more correct. Probably won't make a real difference.

The item selected for Row causes a new row to be written to the data table each time that item occurs. Turn on the Stats and you can see items is what repeats 2 times.

 

Craige
andykpdf
Level I

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

Thanks for the suggestions, Craige. I tried them and have a few follow-up questions:

 

1. changing Guess from "wide" to "pandas" does successfully include the NULL columns in the data table -- unfortunately, it introduces a new problem: the last "row" of JSON data is repeated for every row of the data table (see "pandas.png", attached -- Lot Number 21055 is repeated twice, and Lot Number 21054 is not included. I replicated this same behavior with a larger data set).

 

Is there any other Open() parameter or config that might correct this?

 

2. I was able to successfully import the data.json file, including NULL columns, using the wizard -- however, the script it produced is specific to the payload in data.json (see "script_view.png", attached). My use case is more general -- the payloads that my JSL receives will vary in number of columns and column labels. Is there a way to use the output of the wizard to create a more generic config option? Or is my only option to manually iterate over the payload and construct the data table column-by-column?

 

2b. Related to this -- can you point me to any documentation for using the "JSON Settings" parameter with Open()? I couldn't find anything in the Syntax Reference or Scripting Index (and when I google 'JMP JSL "JSON settings"', the top hit is my previous question on this forum).

 

3. Is there a way to force Open() to import empty columns as Numeric? I tried "Treat Empty Columns as Numeric (1)" in the Import Settings and it caused the data.json file to be treated as a single line/record (see "pandas_numeric.png, attached), so I assume I'm doing something incorrect.

 

Thanks!

 

Andy

Craige_Hales
Super User

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

  • OK, I think the wide option is the correct option to use and it is a bug that the all-missing columns are removed from the import. @Wendy_Murphrey   It certainly makes things complicated.
  • If you know all the possible columns, and only a subset will be in the JSON, you can list all of them in the JSON Settings. Do a small test first, see if it works the way you need (I'm not sure if you'll get the full set of columns or just the subset that appear in the JSON.)
  • There is still access to a previous version of the JSON import code; specifying JSON (without the JSON SETTINGS or Guess parameter) may do what you want.
blob=loadtextfile("$DESKTOP/data.json",blob);
Open(blob,JSON)
  • The JSON does not look exactly like any of the pandas formats. But the pandas option does seem to retrieve the column names you want; if nothing else works that might be a starting point for creating the JSON Settings on the fly for a second import to really get the data.
  • There is no specific doc on the JSON Settings that I'm aware of; it is intended to be written by the wizard and still human-readable. It matches the wizard dialog values, and there is some doc there.
  • I'm pretty sure the ImportSettings are used by CSV import, and maybe something else. They are not used by the JSON wizard.

 

(update: couple of small edits)

Craige
Craige_Hales
Super User

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

And if you have any control over the source of the json and can get it as a real pandas format, 

orient='table' (in python code)

might be a good choice. JMP's import can use some of that metadata. JMP can read the other orientations as well.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html

 

Craige

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

Thanks for the note, @Craige_Hales.  I have reported the deficiency noted in bullet 1 to the Development team for consideration in a future release of JMP. 

 

For what it's worth, it seems that these do not drop the null columns in JMP 14.3.

raw = Load Text File( "$DOWNLOADS/data.json" );
jsonblob = Char To Blob( raw );
dt2 = Open( jsonblob, "JSON", Guess( "wide" ) ); 

JSON To Data Table( raw, Guess( "wide" ) );

 

Wendy
DaveB
Staff

Re: JSON to data table: how to avoid NULL columns being automatically dropped?

JSON arrays that have no elements are problematic for JMP.  One option is to replace empty arrays with arrays that have at least one child.  You can make that child null or use the missing value character - [.].   I think this means you won't have valid JSON any more but JMP 16 will still read it and at this time there is no plan to break that behavior.  Using null will create a character column and using the missing value will create a numeric column.  If you know there is no string data in your JSON that contains [] then you could simply make this change

 

raw = Load Text File("$downloads/data.json");
jsonblob = Char To Blob( substitute(raw,"[]",".") ); 
dt2 = open(jsonblob, "JSON",  Guess("wide"));

If you do want to protect against replacing [] in JSON strings it will take a few more lines of jsl to skip strings but should still not be too hard.