cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
ih
Super User (Alumni) ih
Super User (Alumni)

Tricks to prevent incorrect input date formats from resulting in missing values?

When importing a data set with a date column that defaults to character, it is common to change it to numeric and continuous.  Unfortunately if the input date format is incorrect any values that cannot be parsed are set to missing, thus you don't get a second chance at setting the input format.  How do you typically handle this in an exploratory/graphical user interface analysis?

 

I sometimes

  • duplicate the column before converting to numeric
  • Create a new formula column to parse the date

To demonstrate the issue, run this code to create a table with some dates in m/d/y format, and then set that column to use d/m/y format, as shown in the screenshot below.

 

names default to here(1);

//Make a table with dates in m/d/y format
vals = index(today()-in days(100), today(), incr=in hours(1));
dt = New Table( "Example Data for Input Date Format",
	Add Rows( n items(vals) ),
	New Column( "Dates mdy",
		Numeric,
		"Nominal",
		Format( "m/d/y h:m:s", 23, 0 ),
		Input Format( "m/d/y h:m" ),
		Set Values( vals )
	)
);

dt << run formulas;
wait(1);
dt:Dates mdy << data type("Character");

Now set all values at once in column info and note how some data is set to missing, even after changing the column back to character.

ih_0-1621427935764.png

 

2 ACCEPTED SOLUTIONS

Accepted Solutions
peng_liu
Staff

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

The resulting missing values from Char to Num is due to the Input Format.

Missing results start around row 113 with this "02/13/2021 12:23:50 AM". "d/m/y h:m" Input Format interprets it as the 13th month, which is invalid.

peng_liu_0-1621430693056.png

When there is a mistake in conversion, changing format back won't cut it, because the underlying values have been changed. The missing values no longer hold a valid time value. Use "Undo" under Edit instead.

 

View solution in original post

jthi
Super User

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

I usually just create custom formulas to parse the dates.

 

One option that came into my mind, what if you would first change it to Numeric, Nominal with Best as format. I think this would leave it for JMP to determine the correct format and make it a numerical if it can. After making it numerical you could go back to Format settings and choose the desired formatting.

-Jarmo

View solution in original post

5 REPLIES 5
peng_liu
Staff

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

The resulting missing values from Char to Num is due to the Input Format.

Missing results start around row 113 with this "02/13/2021 12:23:50 AM". "d/m/y h:m" Input Format interprets it as the 13th month, which is invalid.

peng_liu_0-1621430693056.png

When there is a mistake in conversion, changing format back won't cut it, because the underlying values have been changed. The missing values no longer hold a valid time value. Use "Undo" under Edit instead.

 

ih
Super User (Alumni) ih
Super User (Alumni)

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

Thank you @peng_liu,

 

Yes this perfectly describes the issue.  Sometimes either the top few rows appear to parse correctly and the user doesn't think to scroll down to see that lower rows in the table are set to missing, or most values parsed correctly but a few did not due to a data entry issue.  I am hoping to find a different way to help users parse dates so they are less likely to find out hours or days later that they have missing values.

jthi
Super User

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

I usually just create custom formulas to parse the dates.

 

One option that came into my mind, what if you would first change it to Numeric, Nominal with Best as format. I think this would leave it for JMP to determine the correct format and make it a numerical if it can. After making it numerical you could go back to Format settings and choose the desired formatting.

-Jarmo
ih
Super User (Alumni) ih
Super User (Alumni)

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

Thanks @jthi,

 

Okay I'm glad I'm not the only one to parse dates with a new column, I did just submit a wish list item to make this easier at Add Parse Date as a New Formula Column option.

 

That 'best' trick is a good idea, I need to play with it a little more but at least for the m/d/y v d/m/y problem it would help.

jthi
Super User

Re: Tricks to prevent incorrect input date formats from resulting in missing values?

JMP16's Custom Date-Time Formatting with Format Patterns is also fairly helpful with date formatting. Previously I did either use the Data Table Tools Add-in or write my own parse formulas, but hopefully JMP16 will make it easier.

-Jarmo