FOR Loop on String data

melanie_william

Community Trekker

Joined:

May 5, 2014

Sorry if this is a crazy basic question but have not been able to find an answer on this one (maybe asking the wrong question). 

I'm trying to write a script that would calculate the N, Mean, and Median of a parameter (string data) within a particular geographical area (HUC - this is a number but set as a string), grouped by year.  I have attached some sample data for clarity, but basically I have multiple sampling stations with multiple parameters over several years.  The idea is, for each geographical area and each parameter within that area, to find the yearly N, Mean, and Median, and to have the script run through each parameter for each area.

My questions:

1. Am I on the right track?

2. If so, what would I use for the initialization and the iteration portions of the For loop?

dt = Data Table( "Data_TP" );

For (

  While (parameter = "Turbidity", HUC = "03020101"

  Summary(

  Group(ActivYear),

  N(Value), Mean(Value), Median(Value),

  output table name("Test"))

  //save table;

)

);

Any help or reference suggestions would be greatly appreciated!

1 ACCEPTED SOLUTION

ms

Super User

Joined:

Jun 23, 2011

Solution

You do not need a loop to make a summary table. The code below gives you a table with N, mean, and median for each HUC, year, and variable (this can easily be done interactively too: Tables -> Summary).

dt = Data Table( "Example_Data.jmp" );

// One row per HUC / year / parameter combination
dtsum = dt << Summary(
  Group( :HUC, :ActivYear, :CharacteristicName ),
  N( :Result Value as Number ),
  Mean( :Result Value as Number ),
  Median( :Result Value as Number )
);

// However, if your goal is to get a separate table for each combination, you still do not need to loop. You can make subsets by one or more columns. For example:

dtsum << Subset(
  By( :CharacteristicName ),
  All rows,
  Selected columns only( 0 )
);

But do NOT try this on the example table unless you want the screen crowded with data tables. (If that happens, run Close All( Data Tables ) to close all tables with one command.)


But of course it is also possible to extract the data one combination at a time in a nested loop. The Summarize command is quite handy, but I don't think it supports the median. However, it can still be used to generate the lists to loop through. I don't have time to give an example now, but maybe later if there's interest.
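For readers who do want the loop, here is a minimal sketch of the nested-loop idea described above. It is an illustration, not tested against the attached data. It assumes the sample table is open under the name "Example_Data", that the column names match the scripts in this thread, and that your JMP version accepts a table reference as the first argument to Summarize. The variable names (hucs, params, matchRows, sub, yearly) are made up for the example.

```jsl
// Sketch only: assumes the sample table is already open as "Example_Data"
// and that the column names match those used elsewhere in this thread.
dt = Data Table( "Example_Data" );

// Summarize with By() collects the distinct values of a column into a list of strings.
Summarize( dt, hucs = By( :HUC ) );
Summarize( dt, params = By( :CharacteristicName ) );

// Outer loop over areas, inner loop over parameters.
For( i = 1, i <= N Items( hucs ), i++,
	For( j = 1, j <= N Items( params ), j++,
		// Row numbers belonging to this HUC / parameter combination.
		matchRows = dt << Get Rows Where(
			:HUC == hucs[i] & :CharacteristicName == params[j]
		);
		// Skip combinations that do not occur in the data.
		If( N Rows( matchRows ) > 0,
			// Hidden temporary subset holding only those rows.
			sub = dt << Subset( Rows( matchRows ), Selected Columns Only( 0 ), Invisible );
			// Yearly N, mean, and median; unlinked so the subset can be closed afterwards.
			yearly = sub << Summary(
				Group( :ActivYear ),
				N( :Result Value as Number ),
				Mean( :Result Value as Number ),
				Median( :Result Value as Number ),
				Link to original data table( 0 )
			);
			yearly << Set Name( hucs[i] || " " || params[j] );
			Close( sub, No Save );
		);
	);
);
```

The initialization and iteration parts of each For are just a counter (i = 1 and i++) walking through a list of distinct values. Like the Subset-by approach, this opens one table per combination, so it can crowd the screen on a large data set.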

Btw, nice data set. I too use JMP to evaluate monitoring data.




3 REPLIES
melanie_william

Community Trekker

Joined:

May 5, 2014

MS,

This was incredibly helpful! I was trying to make it much harder than it needed to be.  I will need it split out into individual tables, as the boss requested, but I also wanted an all-in-one table for database reasons.  So I will be running both of these scripts.

This worked perfectly and very fast.

Thanks so much!

Mel

PS: Wish I could take credit for the data but it's just a much abbreviated version of what I pulled from EPA STORET. 

mikekutz

Community Trekker

Joined:

Sep 16, 2013

I don't think you need a loop.

You can group by multiple columns.

I used the GUI's "summary" to do what (I think) you want and it produced this script:

Data Table( "Example_Data" ) << Summary(
  Group( :HUC, :ActivYear, :CharacteristicName ),
  N( :Result Value as Number ),
  Mean( :Result Value as Number ),
  Median( :Result Value as Number )
);

-- MK