LaserGuy
Level II

Read Text File, Find Lines Containing a Certain String

Hello Everyone,

 

Here is a script I have that works. I want to know if there is a better way to do the same thing.

 

I first use "load text file" to read a text file into the string variable "file_text1".

 

Next, I use "words" with "\!n" to separate the lines and load them into the string array "file_text2".

 

Then I load "file_text2" into a data table and use "get rows where" combined with "contains" to find the line numbers in the original text file that contain a certain string.

 

Is there a function equivalent to "get rows where" but usable for string arrays? This way, I can skip the portion to load into data table.

 

file_text1 = load text file( "example.txt" );
file_text2 = words(file_text1 , "\!n");
dt0 = new table("file_dt", new column("file_content", character) );
	dt0:file_content << set values(file_text2);

pass_qty_row_array = dt0 << get rows where(contains( :file_content, "Pass") );

Accepted Solution
jthi
Super User

Re: Read Text File, Find Lines Containing a Certain String

There is also For Each, Filter Each (and Transform Each) which can help with cases like this and it should be fairly fast.

 

Names Default To Here(1);

file_txt = Load Text File("$SAMPLE_IMPORT_DATA/Animals.txt");
file_list = words(file_txt, "\!n");
search_word = "fall";

pass_qty_row_array = [];
For Each({line, idx}, file_list,
	If(Contains(line, search_word),
		Insert Into(pass_qty_row_array, idx);
	);
);
show(pass_qty_row_array);
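
For reference, here is a minimal sketch of how Filter Each and Transform Each could be applied to the same search. It assumes JMP 16 or later, where these functions are available. The names matching_lines, flags, and match_idx are only illustrative. Note that Filter Each returns the matching lines themselves, so the positions are recovered here with Transform Each plus Loc, the same Loc( Matrix( ... ) ) pattern used elsewhere in this thread.

```jsl
Names Default To Here( 1 );

file_txt = Load Text File( "$SAMPLE_IMPORT_DATA/Animals.txt" );
file_list = Words( file_txt, "\!n" );
search_word = "fall";

// Filter Each keeps the items for which the expression is true,
// so this returns the matching lines, not their positions.
matching_lines = Filter Each( {line}, file_list, Contains( line, search_word ) > 0 );

// Transform Each builds one 0/1 flag per line;
// Loc() then returns the positions of the nonzero flags.
flags = Transform Each( {line}, file_list, Contains( line, search_word ) > 0 );
match_idx = Loc( Matrix( flags ) );

Show( matching_lines, match_idx );
```

Which of the two is more convenient probably depends on whether you need the text of the matching lines or their positions.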

 

 

-Jarmo

txnelson
Super User

Re: Read Text File, Find Lines Containing a Certain String

Here is another way to handle this: open the text file directly as a one-column data table. It may be faster or slower than your approach; I have not timed it.

names default to here(1);
dt0 = Open(
	"example.txt",
	columns( New Column( "Line", Character, "Nominal" ) ),
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 0 ),
		Column Names Start( 1 ),
		Data Starts( 1 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);

pass_qty_row_array = dt0 << get rows where(contains( :Line, "Pass") );
Jim
ErraticAttack
Level VI

Re: Read Text File, Find Lines Containing a Certain String

Personally, I've created Map(), Filter(), and Reduce() functions for problems such as this. Here is an example using the Map() function; on my system it is roughly 50% - 60% faster than creating a table to do the search.

 

Here is the map function (explicitly globalized):

::map = Function( {inputs /* list, function */ },
	/* uses a single underscore _ as the wild-card */
	{__i__, __result__, __list__, _, __, __keys__},
	__list__ = Eval( Arg( inputs, 1 ) );
	If( Is List( __list__ ),
		__result__ = {};
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						_ = __list__[__i__];
						__result__[__i__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __list__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			);
		);
	,
		Is Associative Array( __list__ ),
		__result__ = [=>];
		__keys__ = __list__ << Get Keys;
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						__ = __keys__[__i__];
						_ = __list__[__];
						__result__[__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __keys__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			)
		)
	);
	__result__
);

and here is a comparison of using a table vs. the map function:

Names Default To Here( 1 );
filename = Convert File Path( "$SAMPLE_IMPORT_DATA/UN Malaria 2009.csv", absolute, windows );
result = Load Text File( filename );
Show( result );

file_text2 = Words( result, "\!N" );
N = 10000;
word = "malaria";
s = HP Time();
Summation( i = 1, N,
	dt0 = New Table( "file_dt", New Column( "file_content", character ), Private );
	dt0:file_content << set values( file_text2 );

	pass_qty_row_array 1 = dt0 << get rows where( Contains( :file_content, word ) );
	close( dt0, No Save );
	0
);
Show( time 1 = (HP Time() - s) / 1000000 );

s = HP Time();
Summation( i = 1, N,
	pass_qty_row_array 2 = loc( Matrix( ::map({ file_text2, Contains( _, word ) }) ) );
	0
);
Show( time 2 = (HP Time() - s) / 1000000 );

Show( All( pass_qty_row_array 1 == pass_qty_row_array 2 ) ); 1 - (time 1 - time 2) / time 1

Once the map function is defined, code that uses it is usually much cleaner than any other solution.

Jordan
LaserGuy
Level II

Re: Read Text File, Find Lines Containing a Certain String

Thank you everyone and especially jthi.

 

I have determined that using a for-loop on N Items(file_text2) was faster than using a data table. And then I saw jthi's method using "For Each", which from time measurements is even faster.
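
For anyone reading later, the plain For loop mentioned above might look roughly like the sketch below. This is a minimal example under assumptions, not the exact script that was timed: it reuses "example.txt" and the "Pass" search string from the original question, and match_idx is an illustrative name. One thing to keep in mind is that Words() treats consecutive delimiters as one, so blank lines are dropped and the positions refer to the list of non-empty lines rather than to raw line numbers in the file.

```jsl
Names Default To Here( 1 );

file_text1 = Load Text File( "example.txt" );
file_text2 = Words( file_text1, "\!n" );

// Collect the positions of every line that contains the search string
match_idx = {};
For( i = 1, i <= N Items( file_text2 ), i++,
	If( Contains( file_text2[i], "Pass" ),
		Insert Into( match_idx, i )
	)
);
Show( match_idx );
```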

 
