Solved: How to use JSL to process the txt files into tabular data?

lwx228 · Nov 5, 2019 02:14 AM

The txt file downloaded from the web page, this file has data, how to use JSL to process such txt file as tabular data?
I can't handle it with JSON.Thank Experts!

Craige_Hales · Nov 5, 2019 04:21 PM

// the U.txt file has 2 extra characters at the 
// start and 3 extra characters at the end.
// visual inspection is the only way I know to
// figure that out.

// using JMP 15's json wizard
x = Load Text File( "E:/U.txt" );
x = Substr( x, 3, Length( x ) - 5 );
Open( Char To Blob( x ), jsonwizard( 0 ) );

// using JMP 14's basic json support
x = Load Text File( "E:/U.txt" );
x = Substr( x, 3, Length( x ) - 5 );
Open( Char To Blob( x ), "json" );

If you have JMP 15, you can also use the wizard interactively (use JsonWizard(1) to see the GUI.) It still needs help removing the extra characters. The JSL above loads the txt into x, removes the 2 and 3 characters with substr(), then passes the data to open(), via a blob, without re-writing to disk. If you rewrite to disk, you can also change the extension to .json.

JMP 15's wizard also guesses the datatypes better in this example

@Anonymous

Craige

View solution in original post

Mark_Bailey · Nov 5, 2019 8:02 AM

Create a data table with the necessary data columns (e.g., "name", "cname", and so on). There are extraneous fences "({[" at the start and "]})" at the end of this data. Substitute "{" and "}" so that you have a simple list of sub-lists. Then iterate over the list and process the sub-lists one at a time. The variable, value pairs are in a simple order so you can add one row to the data table at a time.

I do not have time now to write a simple script to demonstrate this approach.

martindemel · Nov 5, 2019 10:25 AM

Hi Mark,
this is a nice approach, which would work for many cases. Unfortunately it seems the fields are not consistent as described below in my response . So the value pairs won't fit in all cases unfortunately.
Or do I miss something?
Martin

/****NeverStopLearning****/

Mark_Bailey · Nov 5, 2019 11:13 AM

No, you are correct. I spent two hours writing a script based on my approach and ran into many tedious problems. I don't have more time now.

martindemel · Nov 5, 2019 10:19 AM

Hi lwx228,

Below you will find a script which will import the data, recode appropriately, put texct to columns, stack, split to get the final table.

Some remarks as this is quite a weird data format:

there is inconsistency with the delimiters and compared to usual txt files:
- comma is used in the fields and as seperator,
- quotation marks are used as well, but not for each element,
- brackets are used in several ways
- Sometimes there is a 'null' statement for a missing value, but sometimes not!
- There is no first row with column names, they are within the data fields. That makes it necessary to extract those and do the stack/split operation to get the columns as you need them for analysis (may someone has an easier approach - would be welcome to hear about it ;) )
the default settings won't work therefore as you will get wrong number of fields per row!
Best is to read in the whole string per row and do the rest in JMP using recode and column utilitites. Hopefully these can be used with a similar file as well.
To script these things you have to test each step and if the step seems to be ok, save/copy the script into a script window and do the next step.
That's how the script below was created.

Names Default to Here(1);
//Open your txt file using special settings, like } as end of line, uncheck column names, data starts row 1, ...

dt = Open(
	"YOURPATH\U.txt",  // exchange YOURPATH with your path to the file
	columns( New Column( "c000001", Character, "Nominal" ) ),
	Import Settings(
		End Of Line( Semicolon, Other( "}" ) ),
		End Of Field( CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 0 ),
		Column Names Start( 1 ),
		Data Starts( 1 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);

// Now you have one column with in each row all information after "name"
// Remove last two lines as they are just filled with brackets
dt << Clear Selected Row States;
m = matrix({N Rows(dt)-1, N Rows(dt)});
dt << Select Rows( m);
wait(2);
dt << Delete Rows();

// Best to use Recode and Column utility functions to seperate all information 
dt << New Column("recoded","Character",Formula(Recode(
	:c000001,
	{Substitute( _rcNow, ",{", "" ), Substitute( _rcNow, "({[{", "" ),
	Substitute( _rcNow, ":", "," ), Substitute( _rcNow, "\!",\!"", ";"  ),
	Substitute( _rcNow, ";;", ";null;"  ),
	Substitute( _rcNow, "\!"", ""  ), Substitute( _rcNow, ",null,", ";null;"  ),
	Substitute( _rcNow, ",null", ";null"  )
	}
)));

// Get all text seperated by ; in seperated columns
dt << Text To Columns( delimiter( ";" ), columns( :recoded ) );
dt << Delete Columns(Column(1),Column(2));

// Now you have to stack the data appropriately
colNames = dt << Get Column Names();
dt_stack = dt << Stack(
	Columns(colNames),
	Source Label Column( "ColName" ),
	Stacked Data Column( "Data" ),
	Number of Series( 2 )
);
dt_stack << Delete Columns(1,3);

// Just to Split it again to get the columns with the correct names as well the data respectively
dt_final = dt_stack << Split(
	Split By( :Data ),
	Split( :Data 2 ),
	Sort by Column Property
);

// Close data tables you do not need anymore
close(dt_stack, NoSave);
close(dt, NoSave);

// Finally you get an alphabetic order and all data is in character. So you now need to change this appropriately.

It was fun to go through this, but took some time to overcome the weird data challenges described above. However due to the built-in scripts and formulas not too much :)

One more tip: To get the recode formulas, use recode and save it as formula column, look into the formula editor and there you go.

Best,

Martin

/****NeverStopLearning****/

Craige_Hales · Nov 5, 2019 04:21 PM

// the U.txt file has 2 extra characters at the 
// start and 3 extra characters at the end.
// visual inspection is the only way I know to
// figure that out.

// using JMP 15's json wizard
x = Load Text File( "E:/U.txt" );
x = Substr( x, 3, Length( x ) - 5 );
Open( Char To Blob( x ), jsonwizard( 0 ) );

// using JMP 14's basic json support
x = Load Text File( "E:/U.txt" );
x = Substr( x, 3, Length( x ) - 5 );
Open( Char To Blob( x ), "json" );

If you have JMP 15, you can also use the wizard interactively (use JsonWizard(1) to see the GUI.) It still needs help removing the extra characters. The JSL above loads the txt into x, removes the 2 and 3 characters with substr(), then passes the data to open(), via a blob, without re-writing to disk. If you rewrite to disk, you can also change the extension to .json.

JMP 15's wizard also guesses the datatypes better in this example

@Anonymous

Craige

martindemel · Nov 6, 2019 03:08 AM

Nice! One additional comment for those in Europe: Take care of your preference settings for display language (under Windows specific) with this file and set it to English as well check the check bock below it to use JMPs language rather than system language. Otherwise the continuous data will import the decimal point numeric data as empty cells. With the recommended setting decimal points will be recognized correctly and the data imported as numeric data type. Other option is to import all as text and change it aferwards, using e.g. recode. In JMP 14 all has been imported as character type automatically.

/****NeverStopLearning****/

lwx228 · Nov 6, 2019 04:10 AM

Nice! JMP handles JSON very quickly.Thanks for the experts' help!

How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?

Re: How to use JSL to process the txt files into tabular data?