LaserGuy
Level II

Read Text File, Find Lines Containing a Certain String

Hello everyone,

Here is a script I have that works. I want to know if there is a better way to do the same thing.

I first use "load text file" to read a text file into the string variable "file_text1".

Next, I use "words" with "\!n" to split the text into lines and store them in the string array "file_text2".

I then load "file_text2" into a data table and use "get rows where" combined with "contains" to find the line numbers in the original text file that contain a certain string.

Is there a function equivalent to "get rows where" that works on string arrays? That way, I could skip loading the lines into a data table.

file_text1 = load text file( "example.txt" );
file_text2 = words(file_text1 , "\!n");
dt0 = new table("file_dt", new column("file_content", character) );
	dt0:file_content << set values(file_text2);

pass_qty_row_array = dt0 << get rows where(contains( :file_content, "Pass") );

Accepted Solution
jthi
Super User

Re: Read Text File, Find Lines Containing a Certain String

There are also For Each, Filter Each, and Transform Each, which can help with cases like this and should be fairly fast.

 

Names Default To Here(1);

file_txt = Load Text File("$SAMPLE_IMPORT_DATA/Animals.txt");
file_list = words(file_txt, "\!n");
search_word = "fall";

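// Collect matching line numbers in a list, since Insert Into appends to lists
pass_qty_row_array = {};
// Iterate over file_list, the variable defined above
For Each({line, idx}, file_list,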
	If(Contains(line, search_word),
		Insert Into(pass_qty_row_array, idx);
	);
);
show(pass_qty_row_array);

 

 

-Jarmo
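
The reply above names Transform Each but only demonstrates For Each. The following is a minimal sketch of the Transform Each route, not code from the original reply. It assumes a JMP version that provides Transform Each, and it uses a hypothetical in-line list in place of the lines read from a file. It follows the same Loc( Matrix( ... ) ) pattern used elsewhere in this thread.

```jsl
Names Default To Here( 1 );

// Hypothetical sample lines standing in for Words( Load Text File( ... ), "\!n" )
file_list = {"Test A: Pass", "Test B: Fail", "Test C: Pass"};
search_word = "Pass";

// Build a list of flags: 1 where the line contains the search word, 0 otherwise
flags = Transform Each( {line}, file_list, Contains( line, search_word ) > 0 );

// Loc() returns the positions of the nonzero flags, i.e. the matching line numbers
pass_qty_row_array = Loc( Matrix( flags ) );
Show( pass_qty_row_array );
```

Like Get Rows Where, this returns the matching positions as a matrix without creating a data table.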


Replies
txnelson
Super User

Re: Read Text File, Find Lines Containing a Certain String

Here is another way to handle this. It may be faster or it may be slower; I have not timed it.

names default to here(1);
dt0 = Open(
	"example.txt",
	columns( New Column( "Line", Character, "Nominal" ) ),
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 0 ),
		Column Names Start( 1 ),
		Data Starts( 1 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);

// The imported column is named "Line" (see New Column above), so search that column
pass_qty_row_array = dt0 << get rows where(contains( :Line, "Pass") );
Jim
ErraticAttack
Level VI

Re: Read Text File, Find Lines Containing a Certain String

Personally, I've created Map(), Filter(), and Reduce() functions for problems such as this. Here is an example using the Map() function; on my system it is roughly 50% - 60% faster than creating a table to do the search.

 

Here is the map function (explicitly globalized):

::map = Function( {inputs /* list, function */ },
	/* uses a single underscore _ as the wild-card */
	{__i__, __result__, __list__, _, __, __keys__},
	__list__ = Eval( Arg( inputs, 1 ) );
	If( Is List( __list__ ),
		__result__ = {};
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						_ = __list__[__i__];
						__result__[__i__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __list__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			);
		);
	,
		Is Associative Array( __list__ ),
		__result__ = [=>];
		__keys__ = __list__ << Get Keys;
		Eval(
			Substitute(
				Expr(
					For( __i__ = 1, __i__ <= __N__, __i__++,
						__ = __keys__[__i__];
						_ = __list__[__];
						__result__[__] = __function__
					)
				)
			,
				Expr( __N__ ), N Items( __keys__ ),
				Expr( __function__ ), Arg( inputs, 2 )
			)
		)
	);
	__result__
);

and here is a comparison of using a table vs. the map function:

Names Default To Here( 1 );
filename = Convert File Path( "$SAMPLE_IMPORT_DATA/UN Malaria 2009.csv", absolute, windows );
result = Load Text File( filename );
Show( result );

file_text2 = Words( result, "\!N" );
N = 10000;
word = "malaria";
s = HP Time();
Summation( i = 1, N,
	dt0 = New Table( "file_dt", New Column( "file_content", character ), Private );
	dt0:file_content << set values( file_text2 );

	pass_qty_row_array 1 = dt0 << get rows where( Contains( :file_content, word ) );
	close( dt0, No Save );
	0
);
Show( time 1 = (HP Time() - s) / 1000000 );

s = HP Time();
Summation( i = 1, N,
	pass_qty_row_array 2 = loc( Matrix( ::map({ file_text2, Contains( _, word ) }) ) );
	0
);
Show( time 2 = (HP Time() - s) / 1000000 );

Show( All( pass_qty_row_array 1 == pass_qty_row_array 2 ) );
1 - (time 1 - time 2) / time 1;

Once the map function is defined, code that uses it usually looks much cleaner than the other solutions.

Jordan
LaserGuy
Level II

Re: Read Text File, Find Lines Containing a Certain String

Thank you everyone and especially jthi.

 

I have determined that using a for-loop over N Items(file_text2) was faster than using a data table. Then I saw jthi's method using "For Each", which my timing measurements show is even faster.

 
