- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
.csv Import error
Hello
New JMP user here. I'm just trying to import data from a .csv and I'm getting this error:
The original data in the .csv looks like this:
This is how it imports using best guess:
This is how it imports as plain text:
Does anyone know what's going on?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
Yes, the nul characters are a problem. Here's a possible workaround:
// make a CSV file with nuls in a field
filename = savetextfile("$temp/nuls.csv",chartoblob("a,b,c~0D~0A1.1,~00~00~00,1.2~0D~0A2.1,~00~00~00,2.2"));
// load the file into a blob so we can clean it up
blobdata = loadtextfile(filename,blob);
// convert the blob to text so we can manipulate the nuls as ~00
textdata = blobtochar(blobdata);
write("\!nbefore = ", textdata);
// fix it up, convert nul to space
substitute into(textdata,"~00"," ");
write("\!n after = ", textdata);
// convert it back to a blob, handling the ~0D and ~0A that remain
repairedblobdata = Char To Blob(textdata, "ascii~hex");
// at this point, you could save it (like above) or open it from memory:
dt = open(repairedblobdata,"text"); // tell JMP to interpret the blob as a CSV text file
JMP strings are limited to 2GB, so if you have extremely large files this will not work.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
The issue is clearly the barcode values.
Use the text import wizard instead of the default settings. Change the data type for the barcode column to character and see if it reads the file better. First, let's get all the data into a table. Then we can worry about what to do with barcodes.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
Thanks for the advice Mark.
I've gone ahead and used the text import wizard. I've tried all the different charsets and none of them seem to help. Also, the wizard doesn't seem to have a way to change the charset of just one specific column. Is there another way of doing that? It seems like it's having an issue deliminating after the first barcode value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
I found it on the next page. I'll see if I can set to "character". Note: If I exclude this column, it seems to exclude every data cell that comes after it. Like it doesn't see the comma after the first barcode value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
It successfully imports in Excel, and it reads the barcodes as just empty spaces. I can save this as an excel file and then import it into JMP, but I'd prefer to avoid that step if possible since there's a whole bunch of data files and I'll be using live data in the future.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
If you have JMP 14, try using Multiple File Import.
You can use this to peek at the file and see the problem characters in hex
blob = loadtextfile("$Downloads\data_LCS.csv",blob(readlength(1000)));
write(char(blob));
Char To Blob( "Q1~09Q2~09Q3~09Q4~09Q5~09Q6 ...
will tell us more about the data if the 1000, above, is long enough to include a barcode value. In this example, the ~09 is a TAB character separating fields.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
This is what it shows for the blob:
In Notepad++ they show as NULL characters
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
Yes, the nul characters are a problem. Here's a possible workaround:
// make a CSV file with nuls in a field
filename = savetextfile("$temp/nuls.csv",chartoblob("a,b,c~0D~0A1.1,~00~00~00,1.2~0D~0A2.1,~00~00~00,2.2"));
// load the file into a blob so we can clean it up
blobdata = loadtextfile(filename,blob);
// convert the blob to text so we can manipulate the nuls as ~00
textdata = blobtochar(blobdata);
write("\!nbefore = ", textdata);
// fix it up, convert nul to space
substitute into(textdata,"~00"," ");
write("\!n after = ", textdata);
// convert it back to a blob, handling the ~0D and ~0A that remain
repairedblobdata = Char To Blob(textdata, "ascii~hex");
// at this point, you could save it (like above) or open it from memory:
dt = open(repairedblobdata,"text"); // tell JMP to interpret the blob as a CSV text file
JMP strings are limited to 2GB, so if you have extremely large files this will not work.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: .csv Import error
And for a slightly better answer
// make a CSV file with nuls in a field
filename = savetextfile("$temp/nuls.csv",chartoblob("a,b,c~0D~0A1.1,~00~00~00,1.2~0D~0A2.1,~00~00~00,2.2"));
// load the file into a blob so we can clean it up
blobdata = loadtextfile(filename,blob);
// convert the blob to text so we can manipulate the nuls as ~00
textdata = blobtochar(blobdata);
write("\!nbefore = ", textdata);
// fix it up, convert nul to SYMBOL FOR NULL (U+2400)
// https://www.fileformat.info/info/unicode/char/2400/index.htm
// replace ~00 with three byte UTF-8 sequence for a printable character
substitute into(textdata,"~00","~E2~90~80");
write("\!n after = ", textdata);
// convert it back to a blob, handling the ~0D and ~0A that remain
repairedblobdata = Char To Blob(textdata, "ascii~hex");
// at this point, you could save it (like above) or open it from memory:
dt = open(repairedblobdata,"text"); // tell JMP to interpret the blob as a CSV text file
the diagonal NUL is a single Unicode character
It's really hard to talk about the real nul character that won't print and tends to break things and the printable character 2400 that isn't ~00 but shows a nice graphic that suggests it might be ~00. I'll make my font smaller now.