Discussions

Yass · Jul 9, 2025 10:44 AM

Hi everyone,

I’m working in JMP with two tables:

TPMS_Sensor_install_DATA.jmp : contains a single key column ID_HEX.
TableID Globale.jmp contains the key column ID along with all the descriptive columns (Ref Produit, ProductName, Date, Operator, etc.).

Goal: Filter the global table so that it only keeps rows whose ID appears in my ID_HEX list, and then return only the Ref Produit and ProductName columns.

Here’s the script I’ve tried—it ends up returning the entire global table unfiltered:

// 1. Open both tables invisibly
dtSensor = Open( "D:\\Project JMP\\IDEnorme\\TPMS_Sensor_install_DATA.jmp", Invisible );
dtGlobal = Open( "D:\\Project JMP\\Tables Données\\TableID Globale.jmp", Invisible );

// 2. Extract the list of ID_HEX values from dtSensor
hexList = Column( dtSensor, "ID_HEX" ) << Get Values;

// 3. Subset dtGlobal: keep only rows where :ID is in hexList,
//    and select only Ref Produit + ProductName
dtResult = dtGlobal << Subset(
    Where( Contains( hexList, :ID ) ),
    Select(
        :"Ref Produit",
        :"ProductName"
    ),
    Output Table Name( "Résultats Correspondance" )
);

// 4. Show the result
dtResult << Show Window;

Can someone help me correct this script so that it: actually filters TableID Globale by the hexList of ID_HEX values, and returns only the Ref Produit and ProductName columns with the ID matched.

Any pointers or example JSL snippets would be greatly appreciated!

Thanks in advance !

jthi · Jul 9, 2025 11:15 AM

You can get rid off the duplicates on your dtSensor for example by using Summarize, it will help slightly

Summarize(dtSensor, hexList = By(:ID_HEX));

-Jarmo

View solution in original post

Craige_Hales · Jul 9, 2025 03:56 PM

using an associative array, you should see <2 minutes

dtSensor = Open( "$DESKTOP/TPMS_Sensor_install_DATA.jmp" );
hexList = Column( dtSensor, "ID_HEX" ) << Get Values; //1048575 
lookup = Associative Array( hexList );//231356
hexList = lookup << getkeys; // the keep list.
// manufacture a 12 million row table with 50% matches
n = 12e6; // about 1/4 minute to build a test table...
dtGlobal = New Table( "Untitled",
	Add Rows( n ),
	New Column( "ID_HEX",
		Character,
		"Nominal",
		Set Values( Transform Each( {v}, Repeat( {"1", "0"}, n / 2 ), If( v == "1", "badc0de", hexlist[Random Integer( 1, N Items( hexlist ) )] ) ) )
	),
	New Column( "Ref Produit", setvalues( 1 :: n ) ), // even rows match the keep list
	New Column( "ProductName", setvalues( 1 :: n ) ) // odd rows are badc0de, not in keep list
);


unwantedrows = dtGlobal << get rows where( !(lookup << Contains( ID_HEX )) ); // about a minute to identify the rows

dtGlobal << deleterows( unwantedrows );

Craige

View solution in original post

jthi · Jul 9, 2025 10:52 AM

Have you tried if join would be enough for you? Other option would be to use << Get Rows Where to get the rows of interest and subset based on that

rows_of_interest = dtGlobal << Get Rows Where(Contains(hexList, :ID));
dt_new = dtGlobal << Subset(Rows(rows_of_interest), Columns(:Ref Produit, :ProductName), Output table("Filtered"));

or something similar

-Jarmo

Yass · Jul 9, 2025 8:10 AM

I’ve tried it, but the script takes an extremely long time and produces no output it seems to hang completely. For context, dtGlobal contains 12 million rows and dtSensor contains 400 000.

dtSensor = Open( "D:\\Project JMP\\IDEnorme\\TPMS_Sensor_install_DATA.jmp", Invisible );
dtGlobal = Open( "D:\\Project JMP\\Tables Données\\TableID Globale.jmp", Invisible );

hexList = Column( dtSensor, "ID_HEX" ) << Get Values;

rows_of_interest = dtGlobal << Get Rows Where(
    Contains( hexList, :ID )
);
dtResult = dtGlobal << Subset(
    Rows( rows_of_interest ),
    Select( 
        :Ref Produit, 
        :ProductName 
    ),
    Output Table Name( "Résultats Correspondance" )
);

dtResult << Show Window;

jthi · Jul 9, 2025 11:15 AM

You can get rid off the duplicates on your dtSensor for example by using Summarize, it will help slightly

Summarize(dtSensor, hexList = By(:ID_HEX));

-Jarmo

Craige_Hales · Jul 9, 2025 03:56 PM

using an associative array, you should see <2 minutes

dtSensor = Open( "$DESKTOP/TPMS_Sensor_install_DATA.jmp" );
hexList = Column( dtSensor, "ID_HEX" ) << Get Values; //1048575 
lookup = Associative Array( hexList );//231356
hexList = lookup << getkeys; // the keep list.
// manufacture a 12 million row table with 50% matches
n = 12e6; // about 1/4 minute to build a test table...
dtGlobal = New Table( "Untitled",
	Add Rows( n ),
	New Column( "ID_HEX",
		Character,
		"Nominal",
		Set Values( Transform Each( {v}, Repeat( {"1", "0"}, n / 2 ), If( v == "1", "badc0de", hexlist[Random Integer( 1, N Items( hexlist ) )] ) ) )
	),
	New Column( "Ref Produit", setvalues( 1 :: n ) ), // even rows match the keep list
	New Column( "ProductName", setvalues( 1 :: n ) ) // odd rows are badc0de, not in keep list
);


unwantedrows = dtGlobal << get rows where( !(lookup << Contains( ID_HEX )) ); // about a minute to identify the rows

dtGlobal << deleterows( unwantedrows );

Craige

Yass · Jul 9, 2025 06:23 PM

I removed duplicates from both data tables as Jthi suggested, used the associative‐array method, and it worked !
Thanks so much for the help !

jthi · Jul 10, 2025 02:35 AM

I would suggest using Summarize if just the list is being utilized AND the values are strings. Using associative array for just duplicate removal does get slow with larger datasets. Summary table is fastest with large data tables, but it does require slightly more code, but it does have benefits such as keeping numbers as numbers. The ordering may also be different between Associative Array and Summary/Summarize

Names Default To Here(1);

dttemp = Open("$DOWNLOADS/TPMS_Sensor_install_DATA.jmp");
hexList = Column(dttemp, "ID_HEX") << Get Values;
lookup = Associative Array(hexList);
hexList = lookup << getkeys;

n = 12e6; 
dt = New Table("Untitled",
	Add Rows(n),
	New Column("ID_HEX",
		Character,
		"Nominal",
		Set Values(
			Transform Each({v}, Repeat({"1", "0"}, n / 2),
				If(v == "1",
					"badc0de",
					uniq[Random Integer(1, N Items(uniq))]
				)
			)
		)
	),
	Private
);
Close(dttemp, no save);

wait(0);
s = Tick Seconds();
Summarize(dt, uniq_summarize = By(:ID_HEX));
wait(0);
summarize_dur = Tick Seconds() - s;
wait(0);


s = Tick Seconds();
uniq_aa = Associative Array(:ID_HEX) << get keys;
wait(0);
aa_dur = Tick Seconds() - s;
wait(0);


s = Tick Seconds();
vals = Column(dt, "ID_HEX") << Get Values;
aa = Associative Array(vals);
uniq_aa2 = Associative Array(aa) << get keys;
wait(0);
aa2_dur = Tick Seconds() - s;
wait(0);


s = Tick Seconds();
dt_summary = dt << Summary(
	Group(:ID_HEX),
	Freq("None"),
	Weight("None"),
	output table name("Summary of Untitled grouped by ID_HEX"),
	Private
);
uniq_summary = dt_summary[0, 1];
Close(dt_summary, no save);
wait(0);
summary_dur = Tick Seconds() - s;

Close(dt, no save);
Show(N Items(uniq_summarize), N Items(uniq_aa), N Items(uniq_aa2), N Items(uniq_summary));
Show(summarize_dur, aa_dur, aa2_dur, summary_dur);

Names Default To Here(1); dttemp = Open("$DOWNLOADS/TPMS_Sensor_install_DATA.jmp"); hexList = Column(dttemp, "ID_HEX") << Get Values; lookup = Associative Array(hexList); hexList = lookup << getkeys; n = 12e6; dt = New Table("Untitled", Add Rows(n), New Column("ID_HEX", Character, "Nominal", Set Values( Transform Each({v}, Repeat({"1", "0"}, n / 2), If(v == "1", "badc0de", uniq[Random Integer(1, N Items(uniq))] ) ) ) ), Private ); Close(dttemp, no save); wait(0); s = Tick Seconds(); Summarize(dt, uniq_summarize = By(:ID_HEX)); wait(0); summarize_dur = Tick Seconds() - s; wait(0); s = Tick Seconds(); uniq_aa = Associative Array(:ID_HEX) << get keys; wait(0); aa_dur = Tick Seconds() - s; wait(0); s = Tick Seconds(); vals = Column(dt, "ID_HEX") << Get Values; aa = Associative Array(vals); uniq_aa2 = Associative Array(aa) << get keys; wait(0); aa2_dur = Tick Seconds() - s; wait(0); s = Tick Seconds(); dt_summary = dt << Summary( Group(:ID_HEX), Freq("None"), Weight("None"), output table name("Summary of Untitled grouped by ID_HEX"), Private ); uniq_summary = dt_summary[0, 1]; Close(dt_summary, no save); wait(0); summary_dur = Tick Seconds() - s; Close(dt, no save); Show(N Items(uniq_summarize), N Items(uniq_aa), N Items(uniq_aa2), N Items(uniq_summary)); Show(summarize_dur, aa_dur, aa2_dur, summary_dur);

N Items(uniq_summarize) = 231357;
N Items(uniq_aa) = 231357;
N Items(uniq_aa2) = 231357;
N Items(uniq_summary) = 231357;
summarize_dur = 2.36666666666861;
aa_dur = 32.5166666666628;
aa2_dur = 32.25;
summary_dur = 1.30000000000291;

Depending on the real data, this can most likely made quite fast as there aren't too many rows for JMP to easily handle.

-Jarmo

Discussions

Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Re: Matching ID_HEX List to Global Reference Table

Recommended Articles