cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
MathStatChem
Level VI

Import Excel data from specific Excel cells

I have situation where the data I need to analyze in JMP is stored somewhat haphazardly in in an Excel file.  I fortunately can easily get a list of the Excel cells that the data is in, and I would like to automate with JSL importing those values.  I am looking for ideas on how to do this effectively and efficiently.  

 

For example, let's say the data I need in a single column in JMP  is stored in these excel cells
E284,E400,E507,D615,E727,D852,E1003,E1114,D1220,D1339,E1493

 

I tried using Open() and specifying the worksheet, starting Row, Ending Row, Starting Column, Ending Column to just get each value in a JMP file with a single row/column, e.g.

 

 

Open(
	"C:\temp\worksheet.xlsx",
	Worksheets( "sheet1" ),
	Use for all sheets( 0 ),
	Concatenate Worksheets( 0 ),
	Create Concatenation Column( 0 ),
	Worksheet Settings(
		1,
		Has Column Headers( 0 ),
		Number of Rows in Headers( 1 ),
		Headers Start on Row( 1 ),
		Data Starts on Row( 284 ),  // get data on row 284
		Data Starts on Column( 5 ), // get data on column "E" (the fifth column)
		Data Ends on Row( 284 ), // get data on row 284
		Data Ends on Column( 5 ), // get data on column "E" (the fifth column)
		Replicated Spanned Rows( 1 ),
		Replicated Spanned Headers( 0 ),
		Suppress Hidden Rows( 1 ),
		Suppress Hidden Columns( 1 ),
		Suppress Empty Columns( 1 ),
		Treat as Hierarchy( 0 ),
		Multiple Series Stack( 0 ),
		Import Cell Colors( 0 ),
		Limit Column Detect( 0 ),
		Column Separator String( "-" )
	)
)

I would then get a separate data file for each value I want to retrieve from a specific cell and then gather the values in the separate JMP files all together in to one data table in one column.

 

One problem I am facing is that it seems that when you open an excel file, all the blank rows at the top of the excel worksheet are ignored?  So row 1 is not really row 1 unless there is some data in row 1.  Is that true?

 

Any other ideas would be helpful.

 

7 REPLIES 7
David_Burnham
Super User (Alumni)

Re: Import Excel data from specific Excel cells

Do you have information that you can use to identify the cells other than the Excel cell reference.  For example, perhaps a field label?  I usually load the data from Excel into JMP with minimal processing, then simply iterate over rows and columns searching for key information.  Typically the first scan will identify the column that contains the field labels that I am interested in, and the second scan will then load the field values.  In this way I can build up either lists of data from which I can build a table, or header information which I can set as table variables.

 

For example - scanning to find a column containing a specific field label

			identifier = "Description";
			cols = dt<<getcolumnnames(string,Nominal);
			lblCol = 0;
			success = 0;
			for (r=1,r<=nrows(dt),r++,
				if (success,break());
				for (c=1,c<=nitems(cols),c++,
					cell = column(dt,cols[c])[r];
					if (contains(cell,identifier),
						success = 1;
						lblCol = cols[c];
						break()
					)
				)
			)	 

And then scanning for fields

		fields = [=>];
		fields["Description"] = 0;
		fields["USL"] = 0;
		fields["LSL"] = 0;
		fields["Nominal"] = 0;
		keys = fields << getKeys;
		for (i=1,i<=nitems(keys),i++,
			field = keys[i];
			success = 0;
			for (r=1,r<=nrow(dt),r++,
				cell = column(dt,lblCol)[r];
				if (contains(cell,field),
					success = 1;
					fields[field] = r;
					break()
				)
			)
		);
-Dave
MathStatChem
Level VI

Re: Import Excel data from specific Excel cells

Simple answer is "no".  If if did have some other way to reference the location of the data, I'd try your approach, but in this case the Excel cell reference is the primary way to find the data.

MathStatChem
Level VI

Re: Import Excel data from specific Excel cells

I did just remember that, yes indeed, for an Excel worksheet with empty cells at the top of the sheet, those rows are ignored.  See https://www.jmp.com/support/notes/60/026.html

and Excel Import Improvement: Show Row Numbers and Column Labels in import preview 

 

 

David_Burnham
Super User (Alumni)

Re: Import Excel data from specific Excel cells

This is a slightly extreme hack, although I use similar techniques to create PPTX and DOC files ...

 

XLSX is just a zip file containing XML files.  You can use the XML to reference cell coordinates:

 

xl-xml.png

 

You can use ZipArchive to get to the XML files.

-Dave
MathStatChem
Level VI

Re: Import Excel data from specific Excel cells

That is an interesting idea.  I had no idea that was what was "under the hood" of an XLSX file.

David_Burnham
Super User (Alumni)

Re: Import Excel data from specific Excel cells

Here's an example (probably better ways of parsing XML but pattern matching functions do the job for me:

 

	/*----------------------------------------------------------------------
	Method: getWorksheets
	----------------------------------------------------------------------*/	
	getWorksheets = method({username=1},
	
		// this technique (including the overhead of creating the class,
		// is >300 x faster than getExcelWorksheets on Mac
		// and >100 x faster on Windows

		workbook = "xl/workbook.xml";
		za = open(_excelFilePath,zip);	
		contents = za<<read(workbook);
		pat = "<sheet name=\!"" 
			+ PatRepeat(PatNotAny("\!""))>>sheet + "\!"" 
			+ " sheetId=\!"" + PatRepeat(PatNotAny("\!"")) + "\!""
			+ (" r:id=\!"" + PatRepeat(PatNotAny("\!"")) + "\!"" | " state=\!"hidden\!"" + " r:id=\!"" + PatRepeat(PatNotAny("\!"")) + "\!"")>>x;
		ismatch = 1;
		lstWorksheets = {};
		while(ismatch,
			ismatch = patmatch(contents,pat,"");
			if (!contains(x,"hidden"),
				substituteinto(sheet,"&amp;","&");
				if (!contains(lstWorksheets,sheet),	
					insertinto(lstWorksheets,sheet)
				)
			)
		);		
				
		return(lstWorksheets);
	),
-Dave
MathStatChem
Level VI

Re: Import Excel data from specific Excel cells

I found an easier workaround in Excel.  There is an INDIRECT(cellref) function in Excel that you use to get a result from a specific cell.  So I was able to make a new excel sheet and create columns that indicate the sheet and cell that the data is in and use the INDIRECT function to get a table of results that I needed for analysis.

 

I do think, though, this problem highlights some improvements needed for the Excel import in JMP.