cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
djhanson
Level V

col<

Anyone know of any 'simple' syntax to do this?  col<<get values gets all of the column's row values (excluded, included, etc).  Would be cool if their was some obscure JSL column message to get a column's rows which are not excluded.  Let me know if you know... thx, dj

1 ACCEPTED SOLUTION

Accepted Solutions
ms
Super User (Alumni) ms
Super User (Alumni)

col<

Not pretty but it returns a matrix of the values of non-excluded rows.

col[dt << select where( Excluded(Row State()) != 1 ) << get selected rows]

View solution in original post

8 REPLIES 8
ms
Super User (Alumni) ms
Super User (Alumni)

col<

Not pretty but it returns a matrix of the values of non-excluded rows.

col[dt << select where( Excluded(Row State()) != 1 ) << get selected rows]

djhanson
Level V

col<

MS, thanks a bunch!  That's exactly what I was looking for and works perfectly for my needs .  dj

hogi
Level XII

Re: col<

Very old post ...
is there now a better way to do this?

Why is there a

dt << Get Excluded Rows();

but no

dt << Get nonExcluded Rows();

?

hogi
Level XII

Re: col<

what is the fastest way to get 

 

dt << Get nonExcluded Rows();

 

in JMP 17 for a data set with 10 mio rows - without changing the selection?

txnelson
Super User

Re: col<

nonExcluded = dt << get rows where( Excluded( Row State( Row() ) ) == 0 );
Jim
jthi
Super User

Re: col<

Not sure about the fastest but this should work (it might need some additional error handling)

Names Default To Here(1);
/* init */
dt = Open("$SAMPLE_DATA/Big Class.jmp");
dt << Select Rows([2,3,4,30]) << Exclude(1) << Clear Select;

// get row indices
excl_rows = dt << Get Excluded Rows;
all_rows = 1::N Rows(dt);
all_rows[excl_rows] = .;
all_rows = Loc(all_rows);

Also a good idea to keep in mind that fastest isn't always the best

-Jarmo
hogi
Level XII

Re: col<

Thanks.

 

I also added a version which scales linearly with # of excluded rows (ecluded rows are removed one by one).

It's quite competitive for small numbers of excluded rows, but performs worse and worse for large numbers.

Around 200 excluded rows it reached the performance of Jim's version and gets then really bad ...

 

In addition, I found a version ("Jim X"), which is ~ factor 2 faster than Jim's version - and even shorter code:

	nonExludedRows = dt << get rows where( Excluded() == 0 );

Most important:

Jarmo's version is by far the best.

hogi_4-1667508531636.png

View more...
dt = New Table( "dummy table" );
NR = 10E6;
dt << add rows( NR );
dt << New Column( "random", set each value( Random Uniform( 0, 100 ) ) );

For Each( {percent, idx},
	{0.00001, 0.00002, 0.00005, 0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 15, 20, 30, 40, 50, 60,
	70, 80, 90, 100},
	Print( percent );
	dt << Clear Row States;
	dt << select where( :random < percent );
	dt << Hide and Exclude;
	nexcl = N Rows( dt << Get Excluded Rows );
	dt << Clear Select;
	Wait();

	// Jim
	bt = HP Time();
	allRows = dt << get rows where( Excluded( Row State( Row() ) ) == 0 );
	et = HP Time();
	time1 = et - bt;
	
	// Jim X
	bt = HP Time();
	allRows = dt << get rows where( Excluded() == 0 );
	et = HP Time();
	time2 = et - bt;

	// Jarmo
	bt = HP Time();
	excl_rows = dt << Get Excluded Rows;
	et = HP Time();
	time3a = et - bt;
	all_rows = 1 :: N Rows( dt );
	all_rows[excl_rows] = .;
	all_rows = Loc(all_rows);
	et = HP Time();
	time3 = et - bt;
	

	// hogi
	If( idx < 11,
		bt = HP Time();
		excl = dt << Get Excluded Rows;
		allRows = (1 :: N Rows( dt ));
		For Each( {i, j}, Reverse( excl ), Remove From( allRows, i, 1 ) );
		et = HP Time();
		time4 = et - bt;
	);

	Wait();

	If( idx == 1,
		New Window( "timing results",
			tb = Table Box(
				Number Col Box( "# excluded", {Nexcl} ),
				Number Col Box( "Jim", {time1 / 1000000} ),
				Number Col Box( "Jim X", {time2 / 1000000} ),
				Number Col Box( "Jarmo", {time3 / 1000000} ),
				Number Col Box( "Jarmo (get ... rows)", {time3a / 1000000} ),
				Number Col Box( "hogi", {time4 / 1000000} ), 
				
			)
		)
	,
		If( idx < 11,
			tb << add row( {Nexcl, time1 / 1000000, time2 / 1000000, time3 / 1000000,time3a / 1000000, time4 / 1000000} ),
			tb << add row( {Nexcl, time1 / 1000000, time2 / 1000000, time3 / 1000000, time3a / 1000000} )
		)
		
	);
);

results = tb << Make Combined Data Table;

gb = results << Graph Builder(
	Show Control Panel( 0 ),
	Variables( X( :"# excluded"n ), Y( :Jim ), Y( :Jim X, Position( 1 ) ), Y( :Jarmo, Position( 1 ) ), Y( :hogi, Position( 1 ) ) ),
	Elements( Points( X, Y( 1 ), Y( 2 ), Y( 3 ), Y( 4 ), Legend( 5 ) ), Smoother( X, Y( 1 ), Y( 2 ), Y( 3 ), Y( 4 ), Legend( 6 ), Lambda( 0.001 ) ) )
);

Report( gb )[AxisBox( 1 )] << {Scale( "Log" ), Min( 1 ), Max( 10000000 )};
Report( gb )[AxisBox( 2 )] << {Scale( "Log" ), Min( 0.1 ), Max( 10 ),Label Row( Show Minor Grid( 1 ) )};
Report( gb )[Text Edit Box( 4 )] << Set Text( "time [s]" )  

 

Re: col<

Thank you for the suggestions. JMP 18 will implement a new series of JSL functions that improves the performance and readability of the specific query discussed here and other similar queries. Documentation has been updated also to reflect the preferred/faster methods. 

The JSL looks something like this:

//instead of Get Rows Where()
Where( dt, Excluded() );
Where( dt, !Excluded() );
Where( dt, Hidden() );
Where( dt, !Hidden() );
Where( dt, Labeled() ); 
Where( dt, !Labeled() );
Where( dt, Is Missing() );