cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
SteveTerry
Level III

How can I prevent 3-digit version numbers from getting read as dates into data table?

In JMP 14.2.0, when I read in (or 'Open') a CSV file that contains a column with 3-digit version number values, those values get interpreted as a date.  For example, if verTest.csv contains:

Product,Station,Version,Date
Widget,Audio-Test,2.2.2044,3/2/19
Widget,RF-Test,2.2.2045,3/2/20

all the column values get interpreted as expected except the Version values which get read in as dates, e.g., 2/2/2044 and 2/2/2045, with Numeric data type, Continuous modeling type, and with the m/d/y format.  I find that 2-digit and 4-digit version numbers are read ok, that is, as character strings.

 

A further complication is that I don't know in advance which columns might contain this format, so I can't explicitly set the column types ahead of time, (nor is it clear that it's even possible to set the column types prior to opening a CSV file).  And if I try to manually change the column types after the fact, it seems that the '/' characters are baked in such that changing the type to Character leaves the value with a "2/2/2044" string. 

 

Is there a way to inhibit this interpretation so that 3-digit version numbers read in as strings, i.e., Character, Nominal, (and without affecting how dates are interpreted using the traditional formats such as m/d/y)?

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Use Multiple File Import. It will not try to guess date formats. It will import one file at a time if that's all you select, but it is really targeted at importing a bunch of similar files from a folder. Look at the source script from a manual import if you need to repeat the process.

Craige

View solution in original post

6 REPLIES 6
cwillden
Super User (Alumni)

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Hi @SteveTerry ,

When you open the .csv from with in JMP, you should have some options on the navigation window (at least for Windows.  I don't have a Mac).  Choose Data (Using Preview) and you will have an opportunity to manually set formats for imported columns.  Here's what it looks like in Windows:

OpenFile.PNG

 

When you click Open, you'll get a series of dialog screens to set parameters for the import.  On the second screen, you can click at the top of the preview by the header you want to change and make sure that that column is imported as a character column.  Once you import it the way you like, you can just copy the Source script from the table and re-use it in the future to automate the import.

-- Cameron Willden
SteveTerry
Level III

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Hi Cameron,

Thank you for your response.  Per your suggestion, on the last import screen, I was able to click the icon to the left of the red arrow to change it from a Numeric data column to a Character data column.  With that, it imported the data as I had wanted it.

SteveTerry_1-1583186145597.png

The source for this gave me the following:

 

Open(
	"/Users/st/src/analysis_jmp/testVer.csv",
	columns(
		New Column( "Product", Character, "Nominal" ),
		New Column( "Station", Character, "Nominal" ),
		New Column( "Version", Character, "Nominal" ),
		New Column( "Date",
			Numeric,
			"Continuous",
			Format( "m/d/y", 10 ),
			Input Format( "m/d/y" )
		)
	),

This indicates that it's possible to set the column types prior to opening a CSV file.  However, when I target just the Version column within my JSL script, things go haywire – (JSL bug maybe??).

 

My script:

 

path = "./verTest.csv";
dt = Open( path, columns( New Column( "Version", Character, "Nominal" ) ),

Resultant source:

 

 

Open(
	"/Users/st/src/analysis_jmp/verTest.csv",
	columns(
New Column( "Product", Character, "Nominal" )
),

Thus, it looks like the importer disregards the name in the New Column specification and simply takes them in order of appearance in the file, ('Product' being the first one – I added a second New Column with similar results thus strengthening this hypothesis).

 

Even if this did work, it would presume I knew the name of the column in advance of the import, (which I don't).  In some cases I do, so I could at least apply this logic (assuming it worked) for those;  but I would eventually miss some of them.

 

I was hoping there might be an import option that would limit or constrain how dates are interpreted generally, (so that '.' characters are not considered as m/d/y delimiters, for example).

 

Craige_Hales
Super User

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Use Multiple File Import. It will not try to guess date formats. It will import one file at a time if that's all you select, but it is really targeted at importing a bunch of similar files from a folder. Look at the source script from a manual import if you need to repeat the process.

Craige
SteveTerry
Level III

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Hi @Craige_Hales,

Thank you so much for the suggestion!  I implemented Multiple File Import and have been testing it heavily;  I can't find any issues, and it's reading in my 3-digit Version column (and others like it) with the delimiters intact, but still reads the dates in correctly.

From the source produced by the manual Multiple File Import, I got the following:

Multiple File Import(
	<<Set Folder( "./" ),
	<<Set Name Filter( "verTest.csv" ),
	<<Set Name Enable( 1 ),
	<<Set Size Filter( {100, 100} ),
	<<Set Size Enable( 0 ),
	<<Set Date Filter( {3666002110.40841, 3666002110.40841} ),
	<<Set Date Enable( 0 ),
	<<Set Add File Name Column( 0 ),
	<<Set Add File Size Column( 0 ),
	<<Set Add File Date Column( 0 ),
	<<Set Import Mode( "CSVData" ),
	<<Set Charset( "Best Guess" ),
	<<Set Stack Mode( "Table Per File" ),
	<<Set CSV Has Headers( 1 ),
	<<Set CSV Allow Numeric( 1 ),
	<<Set CSV First Header Line( 1 ),
	<<Set CSV Number Of Header Lines( 1 ),
	<<Set CSV First Data Line( 2 ),
	<<Set CSV EOF Comma( 1 ),
	<<Set CSV EOF Tab( 0 ),
	<<Set CSV EOF Space( 0 ),
	<<Set CSV EOF Spaces( 0 ),
	<<Set CSV EOF Other( "" ),
	<<Set CSV EOL CRLF( 1 ),
	<<Set CSV EOL CR( 1 ),
	<<Set CSV EOL LF( 1 ),
	<<Set CSV EOL Semicolon( 0 ),
	<<Set CSV EOL Other( "" ),
	<<Set CSV Quote( "\!"" ),
	<<Set CSV Escape( "" )
) << Import Data;

To integrate that into my import function (which takes the path as a parameter), I only had to populate the first two methods correctly:

Convert_csv_to_jmp = Function( {path},
	:
	:
	// determine folder and file args for Multiple File Import function.
	lastSlash = Contains( path, "/", -1 );
	dir = Substr( path, 1, lastSlash );
	If( Length(dir) == 0,
		dir = "./";
	);
	file = Substr( path, lastSlash+1 );
	Write( "Path:  ", dir, "\!nFile:  ", file, "\!n" );

	Multiple File Import(
		<<Set Folder( dir ),
		<<Set Name Filter( file ),  // import files with this name.
		:

I would appreciate if someone from the community could link to some good documentation on this function.  I would like to understand what all the methods are doing.  For example, Set Date Filter appears to want some very arcane arguments.

Craige_Hales
Super User

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

Glad that is working for you!

The syntax serves three purposes: remembering the full state of the dialog if you want to re-launch the dialog with pre-populated values, running the import, and showing the possibilities. The filters are an example of all of those:

 

	<<Set Name Filter( "verTest.csv" ),
	<<Set Name Enable( 1 ),
	<<Set Size Filter( {100, 100} ),
	<<Set Size Enable( 0 ),
	<<Set Date Filter( {3666002110.40841, 3666002110.40841} ),
	<<Set Date Enable( 0 ),

Comparing that to the dialog,

using <<CreateWindow instead of <<ImportDatausing <<CreateWindow instead of <<ImportData

there are check boxes in the dialog that enable and disable the filter. Above, the name filter is enabled and the size and date filters are not in use. The values in the size and date filters are either (1) the last values you experimented with, or (2) the size range and date range of the file you selected. Most likely (2). So your file is probably 100 bytes long. The number is a JMP date-time value, you can use something like asdate( 3666002110.40841 ) to get 

02Mar2020:13:55:10

You can delete those lines and (probably) many others from the script; they are mostly default values that show how to use MFI.

Finally, what the script is really doing: the MFI(...) function returns an object. The <<ImportData method at the very end tells the object to actually import the data. There is an example in the Scripting Index that shows << CreateWindow as another method. The MFI object is used immediately and then destroyed in the <<ImportData case. In the <<CreateWindow case the MFI object is not destroyed until the window closes. You can also use xyz = MultipleFileImport(...); and send messages to the variable xyz (including ImportData and CreateWindow); the object won't be destroyed until the variable is cleared. ShowProperties(xyz) will list the messages to the log window; they are used in the script too.

 

Craige
Craige_Hales
Super User

Re: How can I prevent 3-digit version numbers from getting read as dates into data table?

...and you can use dates like this

	<<Set Date Filter( {2jan2020, 10jan2020} ),
	<<Set Date Enable( 1 ),

<<CreateWindow with pre-populated date range<<CreateWindow with pre-populated date range

Craige