cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
ron_horne
Super User (Alumni)

parse json - do we need to get keys at every level?

Dear members of the community,

once we parse json the program knows all of its keys in all levels. can we extract values purely based on location or do we have to get the keys at all the different levels every time?

for example:

 

 

Names Default To Here( 1 );
l = Parse JSON(
	"[ { \!"name\!": \!"KATIE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 59, \!"weight\!": 95 }, { \!"name\!": \!"LOUISE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 61, \!"weight\!": 123 }, { \!"name\!": \!"JANE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 55, \!"weight\!": 74 } ]"
);

For( i = 1, i <= Length( l ), i++,
	Show( l[i] );
); // works just fine


For( i = 1, i <= Length( l ), i++, Show( l[i] ); For( ii = 1, ii <= Length( l[i] ), ii++, Show( l[i][ii] ); // not working ); ); l[1]["age"]; // brings "12" without getting the keys of l[1] l[1][1]; // not working - subscripted key not in map keys = l[1] << get keys; l[1][keys[1]]; // brings 12 works fine For( i = 1, i <= Length( l ), i++, keys = l[i] << get keys; // can we skip this step? For( ii = 1, ii <= Length( keys ), ii++, Show( l[i], l[i][keys[ii]] ) ); );

many thanks!

 

3 REPLIES 3
Craige_Hales
Super User

Re: parse json - do we need to get keys at every level?

 

Your example data is a Python list of Python dictionaries.

 

 

View more...
Python and JMP use the { } and [ ] more-or-less backwards from each other. The Python [ ] makes a list, indexed by a number. JMP uses [ ] for a matrix or associative array. Python uses { } to make a dictionary. JMP's equivalent to a dictionary is the [ ] associative array. JMP's list uses { }.

The dictionaries are converted to associative arrays, and JMP's AAs don't use a numeric subscript like that. But you can get really close. Before JMP 16, you could iterate through the elements in an associative array using <<first and <<next, but it was a little clumsy. In JMP 16, I'd suggest this

 

 

jlist = Parse JSON(
	"[ 
	{ \!"name\!": \!"KATIE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 59, \!"weight\!": 95 }, 
	{ \!"name\!": \!"LOUISE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 61, \!"weight\!": 123 }, 
	{ \!"name\!": \!"JANE\!", \!"age\!": 12, \!"sex\!": \!"F\!", \!"height\!": 55, \!"weight\!": 74 } 
	]"
);

for each({record,index1}, jlist,
	show(record,index1); // ... record = ["age" => 12, "height" => 55, "name" => "JANE", "sex" => "F", "weight" => 74]; index1 = 3;
	for each({{key,value},index2}, record, show(key,value,index2)); // key = "age";value = 12;index2 = 1;key = "height";value = 55;index2 = 2;...
)

The outer for each knows it is processing a list, and the inner for each knows it is processing an AA. I made the first argument to for each in both cases extra complicated so it could include the index1 and index2; you might not actually need the numeric index.

 

You didn't say what your data actually looks like. You might want to use the JSON import wizard to create a script to load the data directly into a data table. It might be as simple as using the best-guess option if the JSON is amenable.

 

Craige
ron_horne
Super User (Alumni)

Re: parse json - do we need to get keys at every level?

thank you @Craige_Hales 

I have a few issues with my json parsing actually. I get millions of them from a data base so i download them all into a jmp data table using sql. Then, parse them one by one to extract the information i need. i would like to have a row for each child, and all the information of the person duplicated per child.

consider the following script. this was produced manually using the wizard. since the names of the children are in the titles the wizard is not aggregating them by default. if i had the word "child" repeating instead of tish, mimi and berta it would guess it perfectly using the long guess.
how do i go from there to make this accommodate any number of different children the json may have? how do i change the hard coding of tish, mimi and berta? also, some children's attributes may be missing so i would need to be able to skip them.

JSON To Data Table(
	"\[
{
    "Person": "sami",
    "Age": "45",
    "event": "came home",
    "when": 1639682643.7996492,
    "Children": {
        "tish": [
            {
                "Name": "tish",
                "Father": "bob",
                "Age": "14",
                "event": "went to school",
                "when": 1639682630.1256561
            }
        ],
        "mimi": [
            {
                "Name": "mimi",
                "Father": "bob",
                "Age": "25",
                "event": "went to play",
                "when": 1639682630.1256561
            }
        ],
        "berta": [
            {
                "Name": "berta",
                "Father": "bob",
                "Age": "23",
                "event": "went to work",
                "when": 1639682630.1256561
            }
        ]
    }
}
]\", 

	JSON Settings(
		Stack( 0 ),
		Row( "/root/Children/tish" ),
		Row( "/root/Children/mimi" ),
		Row( "/root/Children/berta" ),
		Col(
			"/root/Person",
			Column Name( "Person" ),
			Fill( "Use Forever" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Age",
			Column Name( "person Age" ),
			Fill( "Use Forever" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/event",
			Column Name( "person event" ),
			Fill( "Use Forever" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/when",
			Column Name( "person when" ),
			Fill( "Use Forever" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children",
			Column Name( "Children" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/tish/Name",
			Column Name( "Name" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/tish/Father",
			Column Name( "Father" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/tish/Age",
			Column Name( "Age" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/tish/event",
			Column Name( "event" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/tish/when",
			Column Name( "when" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/mimi/Name",
			Column Name( "Name" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/mimi/Father",
			Column Name( "Father" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/mimi/Age",
			Column Name( "Age" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/mimi/event",
			Column Name( "event" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/mimi/when",
			Column Name( "when" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/berta/Name",
			Column Name( "Name" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/berta/Father",
			Column Name( "Father" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/berta/Age",
			Column Name( "Age" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/berta/event",
			Column Name( "event" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		),
		Col(
			"/root/Children/berta/when",
			Column Name( "when" ),
			Fill( "Use Once" ),
			Type( "Character" ),
			Format( {"Best", 15} ),
			Modeling Type( "Continuous" )
		)
	),
	JSON Wizard( 1 )
);

so far i was looping through the keys at every level (brut-force)  but it took over three days to go over a sample of 6 million jsons.

 

 

 

 

Craige_Hales
Super User

Re: parse json - do we need to get keys at every level?

Away from desk. You are already on the best path. Your example is using the keys to hold the data, and the Json importer does not understand that.
Use the jsl debugger/profiler to figure out where the time is going. You don't want to waste your time optimizing the wrong thing.
Craige