LaserGuy
Level II

Read Text File, Find Lines Containing a Certain String

Hello Everyone,

 

Here is a script I have that works. I want to know if there is a better way to do the same thing.

 

I first use "load text file" to read a text file into the string variable "file_text1".

 

Next, I use "words" with "\!n" to separate the lines and load them into the string array "file_text2".

 

Then I load "file_text2" into a data table and use "get rows where" combined with "contains" to find the line numbers in the original text file that contain a certain string.

 

Is there a function equivalent to "get rows where" that works on string arrays? That way, I could skip loading the lines into a data table.

 

file_text1 = load text file( "example.txt" );
file_text2 = words(file_text1 , "\!n");
dt0 = new table("file_dt", new column("file_content", character) );
dt0:file_content << set values(file_text2);

pass_qty_row_array = dt0 << get rows where(contains( :file_content, "Pass") );
Accepted Solutions
jthi
Super User

Re: Read Text File, Find Lines Containing a Certain String

There are also For Each, Filter Each, and Transform Each, which can help in cases like this and should be fairly fast.

 

Names Default To Here(1);

file_txt = Load Text File("$SAMPLE_IMPORT_DATA/Animals.txt");
file_list = words(file_txt, "\!n");
search_word = "fall";

pass_qty_row_array = [];
For Each({line, idx}, file_list,
	If(Contains(line, search_word),
		Insert Into(pass_qty_row_array, idx);
	);
);
show(pass_qty_row_array);
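
A minimal sketch of the Filter Each variant mentioned above, assuming JMP 16 or later. The list `lines` and the search word are hypothetical stand-ins for the contents of a real file. Note that Filter Each returns the matching items themselves rather than their positions, so it fits when the text of the lines is what you need.

```jsl
Names Default To Here( 1 );

// Hypothetical in-memory lines, standing in for Words( Load Text File( ... ), "\!n" )
lines = {"Pass 10", "Fail 2", "Pass 7"};
search_word = "Pass";

// Keep only the items for which the predicate is true.
// Contains() returns the match position, so compare against 0 explicitly.
matching_lines = Filter Each( {line}, lines,
	Contains( line, search_word ) > 0
);
Show( matching_lines );
```

If the line numbers are needed instead, the For Each loop above with the `idx` argument is the more direct route.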

 

 

-Jarmo


4 REPLIES
txnelson
Super User

Re: Read Text File, Find Lines Containing a Certain String

Here is another way to handle this. It may be faster or it may be slower; I don't know.

names default to here(1);
dt0 = Open(
	"example.txt",
	columns( New Column( "Line", Character, "Nominal" ) ),
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 0 ),
		Column Names Start( 1 ),
		Data Starts( 1 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);

// The imported column is named "Line" (see the Open() call above)
pass_qty_row_array = dt0 << get rows where(contains( :Line, "Pass") );
Jim
ErraticAttack
Level VI

Re: Read Text File, Find Lines Containing a Certain String

Personally, I've created Map(), Filter(), and Reduce() functions for problems such as this. Here is an example using the Map() function; on my system it is roughly 50% to 60% faster than creating a table to do the search.

 

Here is the map function (explicitly globalized):

::map = Function( {inputs /* list, function */ },
	/* uses a single underscore _ as the wild-card */
	{__i__, __result__, __list__, _, __, __keys__},
	__list__ = Eval( Arg( inputs, 1 ) );
	If( Is List( __list__ ),
		__result__ = {};
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						_ = __list__[__i__];
						__result__[__i__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __list__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			);
		);
	,
		Is Associative Array( __list__ ),
		__result__ = [=>];
		__keys__ = __list__ << Get Keys;
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						__ = __keys__[__i__];
						_ = __list__[__];
						__result__[__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __keys__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			)
		)
	);
	__result__
);

and here is a comparison of using a table vs. the map function:

Names Default To Here( 1 );
filename = Convert File Path( "$SAMPLE_IMPORT_DATA/UN Malaria 2009.csv", absolute, windows );
result = Load Text File( filename );
Show( result );

file_text2 = Words( result, "\!N" );
N = 10000;
word = "malaria";
s = HP Time();
Summation( i = 1, N,
	dt0 = New Table( "file_dt", New Column( "file_content", character ), Private );
	dt0:file_content << set values( file_text2 );

	pass_qty_row_array 1 = dt0 << get rows where( Contains( :file_content, word ) );
	close( dt0, No Save );
	0
);
Show( time 1 = (HP Time() - s) / 1000000 );

s = HP Time();
Summation( i = 1, N,
	pass_qty_row_array 2 = loc( Matrix( ::map({ file_text2, Contains( _, word ) }) ) );
	0
);
Show( time 2 = (HP Time() - s) / 1000000 );

Show( All( pass_qty_row_array 1 == pass_qty_row_array 2 ) );
// Algebraically equal to time 2 / time 1
1 - (time 1 - time 2) / time 1

Once the map function is defined, code that uses it usually looks much cleaner than any other solution.
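
For comparison, the built-in Transform Each that comes up elsewhere in this thread can play a similar role to the custom map function, assuming JMP 16 or later. This is only a sketch of the same Loc( Matrix( ... ) ) pattern used in the timing script; the list `lines` and the word "Pass" are hypothetical sample data, and no timing claim is made for it.

```jsl
Names Default To Here( 1 );

// Hypothetical in-memory lines, standing in for the contents of a text file
lines = {"Pass 10", "Fail 2", "Pass 7"};
word = "Pass";

// Map each line to the position of the match (0 when the word is absent)
flags = Transform Each( {line}, lines, Contains( line, word ) );

// Loc() returns the indices of the nonzero elements, i.e. the matching line numbers
hit_rows = Loc( Matrix( flags ) );
Show( hit_rows );
```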

Jordan
LaserGuy
Level II

Re: Read Text File, Find Lines Containing a Certain String

Thank you everyone and especially jthi.

 

I have determined that using a for-loop over N Items(file_text2) was faster than using a data table. Then I saw jthi's method using "For Each", which my timing measurements show is even faster.
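
The for-loop version is not shown in the thread, so the following is only a sketch of what such a loop might look like. The list contents and the names `lines` and `hits` are hypothetical.

```jsl
Names Default To Here( 1 );

// Hypothetical in-memory lines, standing in for Words( Load Text File( ... ), "\!n" )
lines = {"Pass 10", "Fail 2", "Pass 7"};

// Collect the index of every line that contains the search string
hits = {};
For( i = 1, i <= N Items( lines ), i++,
	If( Contains( lines[i], "Pass" ) > 0,
		Insert Into( hits, i )
	)
);
Show( hits );
```

The result here is a list of indices rather than a matrix; wrapping it in Matrix( hits ) would give a form closer to what "get rows where" returns.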