cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
  • Register to attend Discovery Summit 2025 Online: Early Users Edition, Sept. 24-25.
  • New JMP features coming to desktops everywhere this September. Sign up to learn more at jmp.com/launch.
Choose Language Hide Translation Bar
markschahl
Level V

How to query a large data table based on a column level and return columns whose name contains that level?

We want to build a quality monitoring application to deploy across dozens of production facilities (eventually publish to JMP Live).

 

For each facility, we have a very tall & wide data table (let's call it dtBigBoy, <1000 cols, 10k-100k rows) that is refreshed from a process data historian which cannot be queried directly (we get all the data between start/stop timestamps and concatenate the fresh data to the existing table). One of the columns :Grade contains the product name being made at the time (for that row). The column naming follows: Grade.Property.#. There are at least 1 Grade.Property.# column per Grade. One of the user cases is user is interested in a certain grade (say ABCD001) and wants to see all the properties data.

 

  1. What is the most efficient way to do this? Use local data filter on dtBigBoy and proceed with control charts, etc.?
    OR select only rows where :Grade == Chosen.Grade and column names contain Chosen.Grade and create a subset?
  2. How does one create a selection dialog box which reads the column :Grade and presents the levels in :Grade as choices? The user choice is saved to variable Chosen.Grade.
  3. How to select columns whose name contains Chosen.Grade?
  4. Is it more efficient to do 1-3 OR use JMP Table Query to get the rows & columns we need from dtBigBoy?

Thanks in advance from wet & steamy Kuala Lumpur, Malaysia!

5 REPLIES 5
Craige_Hales
Super User

Re: How to query a large data table based on a column level and return columns whose name contains that level?

Not sure, but I think the local data filter is going to be similar to a linked subset; the filtered table depends on the table remaining open. 1e3 rows * 1e5 cols * 8 bytes/number is under 1GB and should fit in 16GB+ memory OK. Linked subsets should not significantly increase the memory required.

You can try the summary platform interactively to get a summary table of the grade column. You can get the script from that.

You can script the subset platform in much the same way. You'd write a loop to choose the cols for the subset.

I think it is unlikely anyone knows the answer to the last question without trying both. I think either might be fast enough.

Craige
markschahl
Level V

Re: How to query a large data table based on a column level and return columns whose name contains that level?

Craige: Thanks for the information on memory and the suggestions! We will try them out.

jthi
Super User

Re: How to query a large data table based on a column level and return columns whose name contains that level?

Depending on what type of graphs you have, that shouldn't be too much data. Are you plotting all the 1000 columns at the same time? Would user be able to digest that much data at once? I would start with Local Data Filter (use Conditional if it is what you want) and then go to scripting if necessary (most likely won't work in JMP live). Changing columns is most likely the most difficult part if you cannot just rely on column switchers. Providing some simple example data with some extra explanation could be helpful.

 

When I had to deal with 4million+ rows I had to rely on scripting with local data filter BUT it wasn't really due to slowness of showing single but rather how clunky the local data filter updating is: if anything is changed it will immediately update and I cannot it force to update only on specific changes. So I used Summary table with the local data filter with change function and then updated the shown data based on that. You could also go with "totally" custom filter using List Boxes and linked subsets.

-Jarmo
markschahl
Level V

Re: How to query a large data table based on a column level and return columns whose name contains that level?

Jarmo:

Thanks for the suggestions. 

No, we are not plotting all the columns at the same time. We want to make one IMR chart per Grade|Property. Since each Grade and Property has different specifications, have to split the dt by Grade|Property. This results in many sparse columns. See snippet below for just one property for 12 Grades. Control Chart Builder does not like all the blank rows.


Snippet of datatableSnippet of datatable

jthi
Super User

Re: How to query a large data table based on a column level and return columns whose name contains that level?

Situation might have changed since 2021 but this post discusses some options regarding missing values in control chart builder How to draw Control Charts that ignore missing data?  (and of course via scripting there are plenty of different options).

 

Edit: Quality and Process Methods > Control Chart Builder > Excluded and Hidden Samples in Control Chart ... 

-Jarmo

Recommended Articles