<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic JMP18: python: conversion from pandas to dataTable skips columns with missing values in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762060#M94055</link>
    <description>&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;import jmp, numpy as np, pandas as pd
import jmputils
from pandas.api.types import is_string_dtype

def p2j(df):
	dt2 = jmp.DataTable('dt2', df.shape[0])
	# get the column names from the data frame
	names = list(df.columns)
	# loop across the columns of the data frame
	for j in range( df.shape[1]):
		# check if the column data type is string or numeric
		if is_string_dtype(df[names[j] ]):
			dt2.new_column(names[j], jmp.DataType.Character)
		else:
			dt2.new_column(names[j], jmp.DataType.Numeric)
		# populate the JMP column with data
		dt2[j] = list(df.iloc[:,j])
	return(dt2)

fname = r"export.csv";
df = pd.read_csv(fname)
filt_csv = p2j(df);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I am using the code above to read in a CSV file. The p2j function does not seem to import col3 correctly, since it has missing values.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="minion_0-1717096857056.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/64646iEC1656B3AE21CCFD/image-size/medium?v=v2&amp;amp;px=400" role="button" title="minion_0-1717096857056.png" alt="minion_0-1717096857056.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;CSV file contents:&lt;/P&gt;&lt;P&gt;"col1","col2","col3"&lt;BR /&gt;"a","b","c"&lt;BR /&gt;"a","b",""&lt;BR /&gt;"a","b","e"&lt;BR /&gt;"a","b",""&lt;/P&gt;</description>
    <pubDate>Thu, 30 May 2024 19:22:58 GMT</pubDate>
    <dc:creator>minion</dc:creator>
    <dc:date>2024-05-30T19:22:58Z</dc:date>
    <item>
      <title>JMP18: python: conversion from pandas to dataTable skips columns with missing values</title>
      <link>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762060#M94055</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;import jmp, numpy as np, pandas as pd
import jmputils
from pandas.api.types import is_string_dtype

def p2j(df):
	dt2 = jmp.DataTable('dt2', df.shape[0])
	# get the column names from the data frame
	names = list(df.columns)
	# loop across the columns of the data frame
	for j in range( df.shape[1]):
		# check if the column data type is string or numeric
		if is_string_dtype(df[names[j] ]):
			dt2.new_column(names[j], jmp.DataType.Character)
		else:
			dt2.new_column(names[j], jmp.DataType.Numeric)
		# populate the JMP column with data
		dt2[j] = list(df.iloc[:,j])
	return(dt2)

fname = r"export.csv";
df = pd.read_csv(fname)
filt_csv = p2j(df);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I am using the code above to read in a CSV file. The p2j function does not seem to import col3 correctly, since it has missing values.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="minion_0-1717096857056.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/64646iEC1656B3AE21CCFD/image-size/medium?v=v2&amp;amp;px=400" role="button" title="minion_0-1717096857056.png" alt="minion_0-1717096857056.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;CSV file contents:&lt;/P&gt;&lt;P&gt;"col1","col2","col3"&lt;BR /&gt;"a","b","c"&lt;BR /&gt;"a","b",""&lt;BR /&gt;"a","b","e"&lt;BR /&gt;"a","b",""&lt;/P&gt;</description>
      <pubDate>Thu, 30 May 2024 19:22:58 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762060#M94055</guid>
      <dc:creator>minion</dc:creator>
      <dc:date>2024-05-30T19:22:58Z</dc:date>
    </item>
    <item>
      <title>Re: JMP18: python: conversion from pandas to dataTable skips columns with missing values</title>
      <link>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762062#M94057</link>
      <description>&lt;P&gt;The short answer is that is_string_dtype( col3 ) returns False. It is not a string column; it is a column of objects, some of which are strings and some NaN. The code above treats everything that is not a string as numeric. The NaNs go in properly as missing, but the character values 'c' and 'e' are invalid for a numeric column and become missing values. I'm looking for a better way to read the pandas column as character without testing every value.&lt;/P&gt;</description>
      <pubDate>Thu, 30 May 2024 21:08:49 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762062#M94057</guid>
      <dc:creator>Paul_Nelson</dc:creator>
      <dc:date>2024-05-30T21:08:49Z</dc:date>
    </item>
    <item>
      <title>Re: JMP18: python: conversion from pandas to dataTable skips columns with missing values</title>
      <link>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762067#M94059</link>
      <description>&lt;P&gt;Updated code is shown here. If the column type is neither numeric nor string, the function scans the values until it finds the first non-missing one, then uses that value to set the column type to character or numeric.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;import jmp
import numpy as np, pandas as pd
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype

def p2j(df):
	dt2 = jmp.DataTable('dt2', df.shape[0])
	# get the column names from the data frame
	names = list(df.columns)
	# loop across the columns of the data frame
	for j in range( df.shape[1]):
		# check whether the column data type is numeric or string
		if is_numeric_dtype( df[ names[j] ] ):
			dt2.new_column(names[j], jmp.DataType.Numeric)
			dt2[j] = list(df.iloc[:,j])			# populate the JMP column with data
		elif is_string_dtype( df[ names[j]] ):
			dt2.new_column(names[j], jmp.DataType.Character)
			dt2[j] = list(df.iloc[:,j])			# populate the JMP column with data
		else:
			col = list()
			dtype = None
			for i in range(dt2.nrows):
				cell_value = df.iloc[i,j]
				# delay setting column type until we see first non-null value
				if pd.isna( cell_value ):
					col.append(None)
				else:
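					if dtype is None:
						# is_numeric_dtype() expects an array or dtype, so test the scalar's type directly
						if isinstance( cell_value, (int, float, np.number) ):
							dtype = jmp.DataType.Numeric
						else:
							# assume it's character - could be datetime64,...
							dtype = jmp.DataType.Character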
					col.append( cell_value )
			print(col)
			if dtype is not None:
				dt2.new_column(names[j], dtype)
				dt2[j] = col				# populate the JMP column with data
		
	return(dt2)

fname = r"export.csv"
df = pd.read_csv(jmp.HOME + 'Downloads/' + fname)
print(df)
print(df.dtypes)
# names is local to p2j(), so rebuild it here before reporting on each column
names = list(df.columns)
print( f'col1 is a string: {is_string_dtype(df[names[0]])}' )
print( f'col2 is a string: {is_string_dtype(df[names[1]])}' )
print( f'col3 is a string: {is_string_dtype(df[names[2]])}' )

filt_csv = p2j(df);&lt;/PRE&gt;
&lt;P&gt;This gives the results:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;/*:

  col1 col2 col3
0    a    b    c
1    a    b  NaN
2    a    b    e
3    a    b  NaN


col1    object
col2    object
col3    object
dtype: object


col1 is a string: True


col2 is a string: True


col3 is a string: False


['c', None, 'e', None]
&lt;/PRE&gt;
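&lt;P&gt;As a possible alternative to testing cell by cell, pandas offers infer_dtype( ) in pandas.api.types, which can skip missing values when it classifies a column. The sketch below is only an illustration under assumptions: classify_column, to_jmp_values and NUMERIC_KINDS are hypothetical names, the set of numeric kinds is a guess at what is useful here, and the code deliberately makes no JMP calls. It only decides which column type each pandas column would map to and prepares the values with None for missing entries.&lt;/P&gt;
&lt;PRE&gt;
```python
import io

import pandas as pd
from pandas.api.types import infer_dtype

# Kinds reported by infer_dtype() that this sketch treats as numeric (an assumption).
NUMERIC_KINDS = {"integer", "floating", "mixed-integer-float"}


def classify_column(series):
    """Return 'Numeric' or 'Character' for a Series, ignoring missing values."""
    kind = infer_dtype(series, skipna=True)
    return "Numeric" if kind in NUMERIC_KINDS else "Character"


def to_jmp_values(series):
    """Return the column as a list with every missing value replaced by None."""
    return [None if pd.isna(value) else value for value in series]


# Same layout as the updated export.csv from this thread.
csv_text = '''"col1","col2","col3","col4"
"a","b","c",
"a","b","",2.718
"a","b","e",1.618
"a","b","",3.14159
'''

df = pd.read_csv(io.StringIO(csv_text))
plan = {name: classify_column(df[name]) for name in df.columns}
col3_values = to_jmp_values(df["col3"])

print(plan)
print(col3_values)
```
&lt;/PRE&gt;
&lt;P&gt;Inside p2j( ), the returned label could pick between jmp.DataType.Numeric and jmp.DataType.Character. A column with no values at all is reported as "empty" by infer_dtype( ) and would fall through to Character in this sketch.&lt;/P&gt;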
&lt;P&gt;Notice that in my function I converted NaN, which is numeric, to None. The JMP DataTable code accepts None in both Numeric and Character columns and does the 'right' thing: it puts a NaN in a Numeric column and an empty string "" in a Character column.&lt;/P&gt;</description>
      <pubDate>Fri, 31 May 2024 01:10:03 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762067#M94059</guid>
      <dc:creator>Paul_Nelson</dc:creator>
      <dc:date>2024-05-31T01:10:03Z</dc:date>
    </item>
    <item>
      <title>Re: JMP18: python: conversion from pandas to dataTable skips columns with missing values</title>
      <link>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762069#M94060</link>
      <description>&lt;P&gt;Using the above p2j( ) with an updated export.csv, which adds a numeric column whose first value is missing, gives:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Paul_Nelson_0-1717118219358.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/64647i688228392B6A53E0/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Paul_Nelson_0-1717118219358.png" alt="Paul_Nelson_0-1717118219358.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;% more ~/Downloads/export.csv&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"col1","col2","col3","col4"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"a","b","c",&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"a","b","",2.718&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"a","b","e",1.618&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"a","b","",3.14159&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 31 May 2024 01:18:59 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/JMP18-python-conversion-from-pandas-to-dataTable-skips-columns/m-p/762069#M94060</guid>
      <dc:creator>Paul_Nelson</dc:creator>
      <dc:date>2024-05-31T01:18:59Z</dc:date>
    </item>
  </channel>
</rss>

