Subscribe Bookmark RSS Feed

Importing a file with multiple "other" delimiters

MuttonChops

Occasional Contributor

Joined:

Mar 30, 2017

JMP Pro 13.  I'm attempting to import a .dat file.  Each column is numeric.  Each line begins with a ( and ends with a ) so I can't seem to figure out how to remove both of those symbols at import.  The problem is that I cannot seem to get rid of that trailing ) so that column is imported as a character.  Even if I setup script to replace the ) with nothing the column is still character and I can't figure out how to force it to numeric.

 

For( i = 1, i<= n rows(dtSFileMasterModule), i++,
		dtSPMFile = Open( PMxPath || PMx[i] ||"\file.dat", Import Settings(
				End Of Line( CRLF, CR, LF ),
				End Of Field( Space, Other( "(" ), CSV( 0 ) ),
				Strip Quotes( 1 ),
				Use Apostrophe as Quotation Mark( 0 ),
				Use Regional Settings( 0 ),
				Scan Whole File( 1 ),
				Treat empty columns as numeric( 0 ),
				CompressNumericColumns( 0 ),
				CompressCharacterColumns( 0 ),
				CompressAllowListCheck( 0 ),
				Labels( 1 ),
				Column Names Start( 3 ),
				Data Starts( 4 ),
				Lines To Read( "All" ),
				Year Rule( "20xx" )));
		dtSPMxFileMaster = dtSPMxFileMaster << concatenate(dtSPMFile, append to first table(1));
		Close( dtSPMFile, "no save" );
	);
1 ACCEPTED SOLUTION

Accepted Solutions
txnelson

Super User

Joined:

Jun 22, 2012

Solution

Oops, I forgot to subract "1" from the length of the string to be subsetted.  The code should be:

Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i] = Substr(
	Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i],
	1,
	Contains( Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i], ")" ) - 1
)
Jim
3 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

I believe that if you add this after the all of the individual data tables have been concatenated that it will correct the issue

// Loop through all rows and get rid of the ")" on the last column
For( i = 1, I <= N Rows( dtSPMxFileMaster ), i++,
	Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i] = Substr(
		Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i],
		1,
		Contains( Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i], ")" )
	)
);
// Change the last column to numeric and continuous
Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) ) << data type( numeric ) << modeling type( continuous );
Jim
MuttonChops

Occasional Contributor

Joined:

Mar 30, 2017

That seems very very close.  The one issue I'm having is that the whole cell is being deleted, instead of just the closing parenthesis.  For example, that cell contains a number like 26.1234) and I just want to trim off the closing parenthesis.  The script you provide deletes the cell, but I don't think you meant it to do that, right?  

 

I can replace a string in your script with a "7" to trim it after the 7th digit but it makes the script not very flexible in the event that the data changes to some other string length.  This will work most likely but if you have any suggestions I'm for it!  Thanks!!!

For( i = 1, i <= N Rows( dtSPMxFileMaster ), i++,
	Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i] = Substr(Column( 
		dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i],
		1,
		7 )
	);
// Change the last column to numeric and continuous
Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) ) << data type( numeric ) << modeling type( continuous );

  

txnelson

Super User

Joined:

Jun 22, 2012

Solution

Oops, I forgot to subract "1" from the length of the string to be subsetted.  The code should be:

Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i] = Substr(
	Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i],
	1,
	Contains( Column( dtSPMxFileMaster, N Cols( dtSPMxFileMaster ) )[i], ")" ) - 1
)
Jim