When sending data tables from JMP to pandas DataFrames (Python), datetime columns, which JMP stores as numeric values, arrive as strings.
I did not expect this behavior, since it means I have to convert them manually (and datetimes are represented differently in Python and JMP).
Is there a workaround? I imagine the easiest would be to change the JMP column format to a plain number, send that, and then rebuild the date in Python using the logic behind JMP dates (number of seconds since a fixed date).
ClearLog();
dt = Open( "$SAMPLE_DATA/Functional Data/Weekly Weather Data.jmp");
t = dt << Subset(
Allrows,
Columns(:DATE)
);
Python Init();
Python Send( t );
Python Submit( "
import pandas as pd
print(t.dtypes)
");
Python Term();
// Output
// DATE object
// dtype: object
// Types: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.html
//float float64
//int int64
//datetime datetime64[ns] << Expected
//string object
//dtype: object
Python Send() saves the table to a temporary CSV file, which is then read with a default pd.read_csv('myfile.csv') call. No parse_dates argument is passed, so date columns come through as strings.
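To illustrate the effect described here, a minimal sketch with an in-memory CSV. The CSV content is a made-up stand-in, not the actual temporary file JMP writes, and the sketch only shows how pd.read_csv behaves with and without parse_dates; it does not reproduce JMP's internals.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the temporary file; the real
# file written by JMP is not shown in this thread.
csv_text = "DATE\n2016-01-01\n2016-01-10\n2016-02-07\n"

# Default read: the date column is left as text, not datetimes.
default_df = pd.read_csv(io.StringIO(csv_text))

# Same data, but asking pandas to parse the column as dates.
parsed_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"])

print(default_df.dtypes)
print(parsed_df.dtypes)
```

Since the read_csv call made by Python Send() cannot be changed from JSL, the conversion has to happen after the table arrives in Python.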
One way to deal with this that does not depend on the date format is to send the raw numeric value and convert it in pandas:
ClearLog();
dt = Open( "$SAMPLE_DATA/Functional Data/Weekly Weather Data.jmp");
t = dt << Subset(
Allrows,
Columns(:DATE)
);
t:DATE << Format( "Best" ); // show the raw numeric value (seconds since 1904-01-01) instead of a date string
Python Init();
Python Send( t ); // send the subset table t to Python
Python Submit( "
import pandas as pd
print(t.dtypes)
print(t.head())
# JMP datetimes count seconds from 1904-01-01; replace the column so its dtype becomes datetime64
t['DATE'] = pd.to_datetime(t['DATE'], unit='s', origin=pd.Timestamp('1904-01-01'))
print(t.dtypes)
print(t.head())
");
Python Term();
// Output
// DATE int64
// dtype: object
// DATE
// 0 3534451200
// 1 3535228800
// 2 3537648000
// 3 3538252800
// 4 3538857600
// DATE datetime64[ns]
// dtype: object
// DATE
//0 2016-01-01
//1 2016-01-10
//2 2016-02-07
//3 2016-02-14
//4 2016-02-21
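The arithmetic behind that conversion can be checked without pandas. This is a minimal sketch using only the standard library; it relies on JMP datetimes being seconds since midnight on January 1, 1904, and the helper name jmp_seconds_to_datetime is made up for illustration.

```python
from datetime import datetime, timedelta

# JMP stores datetimes as seconds since midnight, January 1, 1904.
JMP_EPOCH = datetime(1904, 1, 1)

def jmp_seconds_to_datetime(seconds):
    """Convert a JMP datetime value (seconds since 1904-01-01) to a datetime."""
    return JMP_EPOCH + timedelta(seconds=seconds)

# First two raw values from the output above.
for raw in (3534451200, 3535228800):
    print(raw, "->", jmp_seconds_to_datetime(raw).date())
# 3534451200 -> 2016-01-01
# 3535228800 -> 2016-01-10
```

For a whole column the pd.to_datetime call above is the better fit, since it is vectorized; the helper is mainly useful for spot-checking single values.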
If there is a better way, let me know.
This thread was about R, but I think the same concept applies to Python.
https://community.jmp.com/t5/Discussions/How-to-send-dates-as-characters-from-JMP-to-R/td-p/50112