cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
HSS
HSS
Level IV

Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

Hi,

I am trying to create a new column (in my file "Required values") which can have only binary values - 0/1 or Start/End. Its values depends upon a another column (in my file "Number of Occurrence") which has repeated pattern based on other two columns (in my file - "ID" and "Sub_set").  Too much wording, please see the attached image and data file.  I tried few formulas in new columns and few for loops as well. But I am not able to get both Start and End values. I am able to get only Start value. 

Any help ?

Hari_0-1639487038546.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

Here is a formula that is one way of calculating what you need

If(
	Row() == 1, 0,
	:Number of Occurance != Lag( :Number of Occurance ), 0,
	:Number of Occurance != :Number of Occurance[Row() + 1], 1,
	Row() == N Rows( Current Data Table() ), 1,
	.
)
Jim

View solution in original post

4 REPLIES 4
SDF1
Super User

Re: Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

Hi @HSS ,

 

  Can you provide some basic logic example of how the "required values" column depends on the "number of occurrence", "sub-set", and "ID"? Based on your description, it's unclear how these columns lead to "required values" having either a 0 or 1 output.

 

  For example, you have both row 1 and row 13 as 0, while rows 12 and 25 are both 1. It's unclear from the structure of the data table why these values should be set this way. If you can provide a little more logic explanation, it would help in generating the right kind of of column formula.

 

For example,

If( :"data"n ==0, :"required values"n = 0,

  :"data"n>=0.24, :"required values"n=1,

 .

);

 

  In the above example, the logic is that if the "data" column is equal to 0, then the entry in the "required values" column is equal to 0; else if the value of "data" is greater than or equal to 0.24, then the entry for "required values" is equal to 1. This would satisfy the elements for required values, but might also include some other entries that aren't the last element of the sub-set type.

 

  I know this isn't exactly what you're looking for to apply to the whole column, but getting a better idea of the logic that you need to apply will help in creating the right type of column formula that will apply to the whole column.

 

DS

HSS
HSS
Level IV

Re: Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

Hi @SDF1 

There are more than 2 M rows in the actual data set. And this is by chance that when I was creating a sample data set I tool values less than 0.25 and hence you are seeing it all the "Data" <= 0.25. That login will not work.

"ID" is the main column. Each ID can have multiple "Sub-set" like "AAA", "CCC" etc.  (The same "Sub-set" are exist for different "ID" as well.) Now for each "Sub-set", we have "No. of Occurrence" (repeated pattern) like - 2,2,2,2,2,2,2,2,2,2  or 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4 with different length and this may not be in consecutive order!  Now same "No. of Occurrence" (repeated pattern) can be there for other "Sub-set" as well.

The only thing from "Data" if we want to use is - for each patterns ("No. of Occurrence") starts at the lowest "Data" value (but not a fixed number) and end at highest "Data" value for that pattern (again not a fix value).

I hope things are bit clear now!

Thanks.

txnelson
Super User

Re: Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

Here is a formula that is one way of calculating what you need

If(
	Row() == 1, 0,
	:Number of Occurance != Lag( :Number of Occurance ), 0,
	:Number of Occurance != :Number of Occurance[Row() + 1], 1,
	Row() == N Rows( Current Data Table() ), 1,
	.
)
Jim
HSS
HSS
Level IV

Re: Finding a pattern in a column based on other two column pattern and then storing the start and end values in a new column

I was trying to do very similar thing, using function 'Dif' but somehow it did not work for me. 
thank you for your help Jim.