Subscribe Bookmark RSS Feed

add rows based on a column values in the same data table

qais_hatim_fda_

Community Trekker

Joined:

Sep 22, 2015

Dear All,

I have data table that contain ID, last inspection data, and last inspection decision columns. I want to combine the last inspection data column with the last inspection decision in order to have one column (stacked) to be used in association analysis. For instance, one cluster of my data structure looks like the following:

ID             Last Inspection Decision            Last Inspection Data 

101                                                             PRODUCT SPECIFICATIONS DEFECT
101                                                            CGMP Deviations
101                                                            21 CFR 211.113
101            VAI                                          LIQ_US

I would like to move Last Inspection Decision, in this case VAI, under Last Inspection Data column and create new row that has the same ID, in this case 101. Therefore I need my stacked data to be like:

ID                         Last Inspection Data 

101                      PRODUCT SPECIFICATIONS DEFECT
101                      CGMP Deviations
101                      21 CFR 211.113
101                     LIQ_US

101                      VAI

 

Moreover, some of these IDs are repeated but with different value in one or more of the Last Inspection Data. For instance, Lab DEFECT can be another problem for the same ID so I create another cluster of rows when ever any of these values differ from previous one.

ID             Last Inspection Decision            Last Inspection Data 

101                                                             Lab DEFECT
101                                                            CGMP Deviations
101                                                            21 CFR 211.113
101            VAI                                          LIQ_US

 

Thank you very much for your help in advance.

 

Best,

Qais 

3 REPLIES
ron_horne

Super User

Joined:

Jun 23, 2011

This is not the most robust solution but give it a go:

 

dt = current data table ();

for each row( if(!is missing (:Last Inspection Decision),
dt << add rows(1, row()),
is missing (:Last Inspection Data), :Last Inspection Data[row()] = :Last Inspection Decision[row()-1])
 );

 

qais_hatim_fda_

Community Trekker

Joined:

Sep 22, 2015

Hi,

It is not working as I have more than 100,000 rows. I tried to break my data to three parts, each one with almost 35,000 rows, but it is not working. Any suggestion?

 

Best,

Qais 

ian_jmp

Staff

Joined:

Jun 23, 2011

I'm not sure if I have understood correctly, but here's an alternative. If it's what you want, you also need to see how well it will scale.

 

NamesDefaultToHere(1);

// Example table . . .
dt = New Table( "Inspection",
					Add Rows( 10 ),
					New Column( "ID",
						Character,
						"Nominal",
						Set Values(
							{"101", "101", "101", "101", "102", "102", "102", "101", "101", "101"}
						)
					),
					New Column( "Last Inspection Decision",
						Character,
						"Nominal",
						Set Values( {"", "", "", "VAI", "", "", "Code 1", "", "", "Code 2"} )
					),
					New Column( "Last Inspection Data",
						Character,
						"Nominal",
						Set Values(
							{"PRODUCT SPECIFICATIONS DEFECT", "CGMP Deviations", "21 CFR 211.113",
							"LIQ_US", "PRODUCT SPECIFICATIONS DEFECT", "21 CFR 211.113", "LIQ_US",
							"PRODUCT SPECIFICATIONS DEFECT", "CGMP Deviations", "21 CFR 211.113"}
						)
					)
				);

// Copy the table
dt2 = Eval(dt << getScript);
dt2 << setName(dt << getName ||" Stacked");

// Get the values from the two columns of interest into lists
ids = Column(dt2, "ID") << getValues;
vals = Column(dt2, "Last Inspection Data") << getValues;

// Find out where we need to insert new values, get the required values, and insert them into the right positions
insertAfter = dt2 << getRowsWhere(!IsMissing(:Last Inspection Decision));
for(r=1, r<= NItems(insertAfter), r++,
	InsertInto(ids, Column(dt2, "ID")[insertAfter[r]], insertAfter[r] + r);
	InsertInto(vals, Column(dt2, "Last Inspection Decision")[insertAfter[r]], insertAfter[r] + r);
);

// Add rows to teh copy of the table, insert the column values, and delete the column we don't need
dt2 << addRows(NRows(insertAfter));
Column(dt2, "ID") << setValues(ids);
Column(dt2, "Last Inspection Data") << setValues(vals);
dt2 << deleteColumns(:Last Inspection Decision);