cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
jlrouquette
Level III

Return a Specific Row With Multiple Criteria - For Beginners

I'm not new to JMP but I'm pretty much brand new to the scripting of this sort.  

 

I have a table with ≈8mm rows and 80 columns.  Ive just updated it with 3 additional columns and 2,000 rows.  I need to return a specific "Formation" from the 2,000-row update for each of the 8mm rows from the original table. 

 

I need to do so under a couple criteria.  IF "well.name" = "Well.Name" AND "BitDepth" < "FormationTVD" THEN "Formation".

 

I can break the update back out of the table if that makes more sense but I couldn't figure out how to manage this another way either.  

 

Lastly, I've read several discussions on this or similar topics and I've seen things similar to "dt = blahblah; dt << blah" which I don't understand at all.  If explaining this in those terms makes the most sense please do as thoroughly as you're able to.        

 

1 ACCEPTED SOLUTION

Accepted Solutions
julian
Community Manager Community Manager

Re: Return a Specific Row With Multiple Criteria - For Beginners

Hi again, @jlrouquette,

After reading more carefully through this thread it seems like this boils down to having a table of lookup values that you want to update your main table with based on matching conditions. If you were simply matching based on certain columns being equal you could use Tables > Update (and specify match conditions), but since you need to match when one column is less than another, that won't work. Also, since you have a very large main table (with millions of rows), and a relatively small lookup table (around 2000 rows), I think you will get better performance going about this in reverse: rather than having a column formula that works through your main table row by row, have a script that works through your lookup row by row table to find all matching rows in your main table at once. Below is such a script. I've also attached it as a script file (which you can open and click Run on), as well as the tables you will need (the main table, and a *separate* lookup table, what I think it probably the best practice). 

 

Some important points:

-This script will only fill a cell once. That is, once a match has been made, even if a cell matches later on in the lookup table the data in the main table will not be overwritten. This uses a JMP 13 feature of the Select Where() function to restrict a selection set. If you aren't using JMP 13 this will not work. 

- The matches from this script are mostly the same as the ones you provided, but because of the point above they seem to differ. As you said, many formations match a given set of conditions and this script will simply select the first to match in the lookup table.

 

I suspect there is an even more efficient way to do this but is this along the lines of what you're looking for?
@julian

  

IterativeLookup.gif

 

Names Default To Here(1);

//get handles for the tables
dt1 = Data Table( "MainTable" );
dt2 = Data Table( "LookupTable" );

//Make a column for the matching formation in the main table
dt1 << New Column("FormationReference", Character);

//for better performance, stop showing table updates until complete
dt1 << Begin Data Update;

//loop through the rows of the lookup table to find the matching cells in the main table

For(i = 1, i <= nRows(dt2), i++, 

	dt1 << Select Where( 
		(:WellName == dt2:WellReference[i]) & 
		(:BitDepth < dt2:TVD Reference[i]) ); 

	// remove selection for rows that already have a match in "Formation Reference"
	// note: jmp 13 or later supports restricting selection in the select where function

	dt1 << Select Where(dt1:FormationReference=="", Current Selection("Restrict"));

	// set the cells in FormationReference for the selected 
	// rows to FormationReference in the lookup table

	dt1:FormationReference[dt1 << Get Selected Rows] = dt2:FormationReference[i];
);

//show table updates again
dt1 << End Data Update;

 

View solution in original post

28 REPLIES 28
txnelson
Super User

Re: Return a Specific Row With Multiple Criteria - For Beginners

I am not sure I follow exactly what you have done, and what you need to do, but let me at least start a response.

If I understand what you said, is that you have a data table with about 8 million rows and 80 columns.  You have recently updated the table with 3 new columns, and 2,000 rows.  How did you apply the update?  Did you use a 

     Tables==>Join       or a      Tables==>Update 

to do this update?

And now with the updated table, do you have 83 columns, and 2,002,000 rows?

Moving on to your next issue.

JSL will allow one to easily find all of the rows that meet a specific criteria, and then operations can be performed on those rows and columns.

dt = current data table();
returnedRows = dt << get rows where( :Well.Name == "a well's name" &
     :BitDepth == :FormationTVD;

The above JSL would create a matrix of row numbers (returnedRows) after searching the entire data table to find where a column named Well.Name has a name of "a well's name" and a column named BitDepth equals the the same value in a column named FormationTVD.  From you explanation, I am just guessing what Well.Name, BitDepth, FormationTVD and Formation really are.  But once the rows that you need to be identified, are selected, then the returnedRows matrix can be stepped through, and actions taken on those rows.

Jim
Craige_Hales
Super User

Re: Return a Specific Row With Multiple Criteria - For Beginners

<< is the JSL "send" operator. You send a message to an object. The data table is an object, and you can send it messages like dt<<Sort or dt<<GetRowsWhere (if dt is the JSL variable for the data table.)

A message is pretty much the same as a function call. There are some differences; objects are often designed to ignore messages they don't understand (check spelling if it doesn't work). Functions may need an explicit parameter to tell which object.

To finish complicating it: messages are sometimes called methods of the object. Same thing.

Craige
jlrouquette
Level III

Re: Return a Specific Row With Multiple Criteria - For Beginners

Its probably an odd fix but I couldnt figure out how to reference another table so its 2mm rows with 80 columns then an additional 3 columns at the end with only 2k rows; it is not a semetrical block of data.  I used Tables==>Update.

 

I put in the JSL that you referenced and got the attached error message.

 

I corrected the column references to represent what they really are.  There are two variations of well name, well.name and Well.Name.  Im using them as primary keys in a way to make sure the jsl only references the formations where the well name is the same.

 

I dont see anywhere in the JSL where its references the formation columns (what I need returned) though.  How is that happenening?     

 Error Message.pngFormula2.pngJSL2.png

 

uday_guntupalli
Level VIII

Re: Return a Specific Row With Multiple Criteria - For Beginners

@jlrouquette
      Looking at Formula2.png , 
        A column should be referenced with a colon in front ":"   . So, if BitDepth and Well.Name are columns in the statement, try re-writing your statement as  :

 

dt = Current Data Table(); // Make sure data table you want to work with is open

dt << Select Where(:Name("Well.Name") == "well.name" & :BitDepth < FormationTVD); 

// There is a couple of things happening here 
// 1. Well.Name - presuming that is the column name has a period in it. When
// column names have other characters like period or space , use the name          
// function
// 2. I am selecting the rows first - so I am separating the code into 2 steps. 

SR = dt << Get Selected Rows(); // This will save the row numbers of selected rows that met your condition 

  Looking at your error screenshot tells me that the variable or function you are trying to call is not defined. So - probably try to make sure - that column name exists ? 

Best
Uday
jlrouquette
Level III

Re: Return a Specific Row With Multiple Criteria - For Beginners

Alright, still no luck yet on this one.  I liked your reply a lot because you put it in terms of the variables I'm using (incredibly helpful since I'm new to this).

 

My machine had a mixture of running slow and "not responding" for a couple hours after I ran the code.  Not sure if its something I did wrong or not. 

 

I attached the output and a copy of the code. 

jsl.PNGoutput.png

jlrouquette
Level III

Re: Return a Specific Row With Multiple Criteria - For Beginners

I'm not sure if its necessary to inform you that I added a subset of data to my question here or if it happens automatically so just in case... I just responded to someone else and added a subset of my data in case that helps address my question.  

uday_guntupalli
Level VIII

Re: Return a Specific Row With Multiple Criteria - For Beginners

@jlrouquette
 Also , try to execute the following sample

dt1 = Open( "$SAMPLE_DATA/Air Traffic.jmp" ); // Open Sample Data Table 1 

dt2 = Open( "$SAMPLE_DATA/Airline Delays.jmp" ); // Open Sample Data Table 2

// Selecting and  Identifying selected rows 

Current Data Table(dt1); // Ensuring that the operation you want to perform is actually happening on the table you want 

dt1 << Select Where(:Airline == "Delta"); // Select Rows where Airline is Delta 

SR = dt1 << Get Selected Rows(); // Get the row numbers that have been selected 

Show(SR); // Print Selected Rows 


Close All(Data Tables,"No Save"); // Close data tables 

code to understand this easily and then apply it to your case 

Best
Uday
jlrouquette
Level III

Re: Return a Specific Row With Multiple Criteria - For Beginners

I'm not sure if its necessary to inform you that I added a subset of data to my question here or if it happens automatically so just in case... I just responded to someone else and added a subset of my data in case that helps address my question.  

uday_guntupalli
Level VIII

Re: Return a Specific Row With Multiple Criteria - For Beginners

@jlrouquette
      JMP has a programming language called JSL . In JMP, the inherent data container that is used is a data table. So, when you see : 
       

dt = Current Data Table(); 

// or 

dt = Open( "$SAMPLE_DATA/Cities.jmp" ); // Opening Sample Data Table 

// or
Data Table("Cities")

 

     Essentially - "dt" is the reference to your data table. In order to work with the data table, the reference allows you to perform actions of different kinds and work with the data table. Alternatively you can use Data Table(Table Name) as a way to perform actions on or with the data table. 

 

dt = Open( "$SAMPLE_DATA/Cities.jmp" ); 

dt << Select Where(:State == "IN"); // select all rows where State is Indiana 

     Using the reference, you can perform actions on the data table. This enables you to process your data in the way you want. While these are over simplified examples, I hope they expalin what you are seeing in other posts. 
 
     A couple of humble suggestions for future posts in the community which will also help you get answers faster, since other people on the community are not aware of the problem you are trying to solve : 
1. Please try to provide a reproducible example i.e. either provide a part of your code or try to demonstrate what you are doing by using sample data 

2. If you are not using the scripting language, kindly provide screenshots explaining in detail what your question is - so others can immediately try to help you


     JMP Community is an excellent resource and has helped me many a time. Welcome to the community. Some other resources to be aware of : 
1. https://www.jmp.com/en_dk/events/getting-started-with-jmp/new-user-welcome-kit.html - JMP New User Welcome Kit 
2. In JMP , Help -> Scripting Index 

3. In JMP , Help -> Books 


Best
Uday