Subscribe Bookmark RSS Feed

Specific case of tables joining

Highlighted
miguello

Community Trekker

Joined:

Jan 27, 2016

All,

 

I have a specific problem to solve, and, although I succefully solved it on a tiny subset of original data manually, I won't be able to do this to the whole huge set of data.

 

So, here's the problem. I have two tables.

First table - some measurements done on some set of tools

 

ToolDateMeasurement

A001

1/1/20170.005
B0062/2/20170.003

 

Second table - history of some parts replacement on the same tools:

 

ToolDatePart S/N
A0011/1/20161234567
B0062/2/20160987654
A00112/29/20161234568

 

What I would like to see, how measurements for each tool depend on the installed part age. What I need as a resultant table is the same table with measurements, only with few added columns:

 

 

ToolDateMeasurementInstalled partInstallation dateAge of part

A001

1/1/20170.005123456812/29/20164 days
B0062/2/20170.003098765412/29/2016365 days

 

 

Basically, I did this manually for one tool - just created column and filled it with the ear;liest replacement date. Then moved down the measurement table until the date of next install - and from there filled the same column to the end with new install date, then moved further.... Tedious.

 

I need to somehow automate that, but I can't even start wrapping my head around it.

So, for each line in measurement table, I need to find a line in parts replacement table, that has matched tool, and has install date earlier than the date for line under consideration. Then I would need to take that serial number, that date and put it in measurement table. Then it's easy to calculate part age. Some parts install dates are so long ago that they are not covered by both these tables, so I might have lots of missing values.

Could somebody help me with methodology and maybe a little bit with starting the script?

 

Thanks a lot!

 

7 REPLIES
miguello

Community Trekker

Joined:

Jan 27, 2016

By the way, sorry for ugly tables - in the edit window they looked OK. Can't find any way to go back to editing topic to make tables more readable. Is it even possible?

Thanks!
txnelson

Super User

Joined:

Jun 22, 2012

you can insert a table by clicking on the table icon

 

11 44 222
333 444 66
88 55 99
Jim
miguello

Community Trekker

Joined:

Jan 27, 2016

That is exactly what I did. Looked good until I posted.
ian_jmp

Staff

Joined:

Jun 23, 2011

You could try something like this:

NamesDefaultToHere(1);
dt1 = DataTable("Measurements On Tools");
dt2 = DataTable("Parts Replaced");

// Get a list of the tools on which measurements were made
toolsUsed = Column(dt1, "Tool") << getValues;
// Get a list of the tools which had one or more parts replaced
toolsWithReplacedPart = Column(dt2, "Tool") << getValues;
// Get the corresponding replacement dates
replacedOn = Column(dt2, "Date") << getValues;
// And the corresponding part numbers
partSN = Column(dt2, "Part SN") << getValues;
// Make a new vector and a list that will be populated via the loop below
dates = J(NRow(dt1), 1, .);
parts = {};
// Loop over each measurement made
for(t=1, t<=NItems(toolsUsed), t++,
	thisTool = toolsUsed[t];
	// Find the rows in dt2 in which this tool is referenced
	rowsForThisTool = Loc(toolsWithReplacedPart, thisTool);
	// Find the latest date on which a part was replaced and store it
	latestReplacementDate = Max(replacedOn[rowsForThisTool]);
	dates[t] = latestReplacementDate;
	// Identify the new part used and store it
	rowForLatestPart = Loc(replacedOn[rowsForThisTool] == latestReplacementDate);
	newPart = partSN[rowsForThisTool][rowForLatestPart];
	InsertInto(parts, newPart);
);
// Add the new columns to dt1
dt1 << newColumn("Last Installed Part", Character, Values(parts));
dt1 << newColumn("Installed Date", Numeric, Continuous, Values(dates), Format( "m/d/y", 12 ));
dt1 << New Column( "Age of Part", Numeric, Continuous, Format( ":day:hr:m", 14 ), Formula( :Date - :Installed Date ));

The code is verbose to try to show the logic used. Bear in mind this has had no testing, and could be much improved: For example the '<<getValues()' are not needed (in JMP 13 at least).

miguello

Community Trekker

Joined:

Jan 27, 2016

Ian,

 

Thanks a lot for such a detailed answer.

Quick question on this:

 

latestReplacementDate = Max(replacedOn[rowsForThisTool]);

Parts replacements are done regularly as measurements go. So I need not only find the latest, I need to find the part that was on the tool during that measurement, simple latest won't do. Basically, I need to find the latest replacedOn date such that the date of measurement is greater than the repplacedOn date. Any suggestion on how to do it in a most efficient and elegant way? I'm having years worth of daily data on tens of tools with parts replacements every month or so, so it better be efficient - so far I can only come up with another loop comparing dates one by one.

 

If you have a more elegant solution right out of your mind, I'd appreciate this very much.

miguello

Community Trekker

Joined:

Jan 27, 2016

Ian,

 

Do you think something like this would work:

 

currentReplacementDate = Max(replacedOn[rowsForThisTool]<thisDate);

Where thisDate is analog of thisTool:

 

thisDate = currentMeasDate[t];

and currentMeasDate is just a date from measurement table.

 

 

Now, since these two tables are produced by different systems, dates formats are different (y/m/d vs m/d/y hh:mm:ss  or something like that). Do I need to somehow translate one to another, or it'll be done automatically?

miguello

Community Trekker

Joined:

Jan 27, 2016

Ian,

 

Are you sure that

Loc(replacedOn[rowsForThisTool] == latestReplacementDate);

is correct syntaxis?

 

I mean I get what you wanted to do here, but  it gives me error and syntaxis help says that it whould be Loc(x,y) - gives you all indices in x where it is equal to y. Equal only, not any logical statement. Any ideas how to re-write this statement?

 

Thanks!