
How do I identify numbers in a paragraph and extract them as a column in a new table?

I have a text paragraph containing several numbers in different sentences, and I would like to extract those numbers and stack them in a column in a new table.

The paragraph looks like this:

 

The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred to 100 ml waterproof plastic bottle and sealed.

 

In addition, the number of different primary samples can vary from 1 up to 20 or 30.

I would be very thankful if somebody could point me in the right direction with a script. I tried to write one with the help of Copilot, but did not get the result I wanted.

Thanks a heap!

 

2 REPLIES
txnelson
Super User

Re: How do I identify numbers in a paragraph and extract them as a column in a new table?

Here is one way to do this:

Names Default To Here( 1 );
paragraph="The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. 
The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. 
The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to 
pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred 
to 100 ml waterproof plastic bottle and sealed.";

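// Delimiters: space, period and hyphen. The hyphen is included so that
// ranges such as 800-1000 split into separate numbers.
delims = " .-";

// Collect every token that Num() can convert to a number.
theNumbers = {};
i = 1;
// Word() returns "" once i runs past the last token.
While( Word( i, paragraph, delims ) != "",
	theWord = Word( i, paragraph, delims );
	// Num() returns missing for non-numeric text, so only numeric tokens are kept.
	If( !Is Missing( Num( theWord ) ),
		Insert Into( theNumbers, Num( theWord ) )
	);
	i++;
);
Show( theNumbers );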

I am sure there is someone who will also provide a RegEx() solution.

Jim
jthi
Super User

Re: How do I identify numbers in a paragraph and extract them as a column in a new table?

You can use Words() for this, but you need to supply the non-numeric characters as delimiters. You can then use Concat Items() to turn the resulting list back into a string.

 

Names Default To Here(1);

str = "The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred to 100 ml waterproof plastic bottle and sealed.";

matchchar = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" || Get Punctuation Characters() || Get Whitespace Characters();
nums = Words(str,  matchchar);
// {"5", "10", "20", "30", "40", "50", "24", "105", "800", "1000", "100", "20", "100"}

Concat Items(nums, ", "); // "5, 10, 20, 30, 40, 50, 24, 105, 800, 1000, 100, 20, 100"
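
Since the question asks for the numbers stacked in a column of a new table, here is a minimal sketch of that last step. It assumes the Words() approach above; the shortened sample string, the table name "Extracted Numbers" and the column name "Value" are placeholders chosen for illustration, not anything from the original post.

```jsl
Names Default To Here( 1 );

// Shortened sample text; substitute the full paragraph here.
str = "The primary sample of A: 10 kg, B: 20 kg was dried for 24 hours at 105 C.";

// Letters, punctuation and whitespace all act as delimiters,
// so only runs of digits remain as tokens.
matchchar = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" ||
	Get Punctuation Characters() || Get Whitespace Characters();
nums = Words( str, matchchar );

// Words() returns strings, so convert each token to a number.
values = {};
For( i = 1, i <= N Items( nums ), i++,
	Insert Into( values, Num( nums[i] ) )
);

// One row per extracted number (table and column names are placeholders).
dt = New Table( "Extracted Numbers",
	New Column( "Value", Numeric, Continuous, Set Values( Matrix( values ) ) )
);
```

Because punctuation is treated as a delimiter, a decimal such as 10.5 would be split into 10 and 5, so this only suits text with whole numbers like the example paragraph.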

 

If more complicated matching is required, Regex() or Pat Match() are options, but in my opinion it isn't really worth learning Pat Match() given its current documentation and the examples found there. Adding a flag to Regex Match() to find all non-overlapping occurrences of a pattern would make tasks like this much easier to complete.
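
For reference, one possible Regex() workaround for this simple case is to replace everything that is not a digit with a space using GLOBALREPLACE and then split the result with Words(). This is only a sketch under the assumption that the numbers are unsigned whole numbers; the sample string is a shortened piece of the paragraph and the variable names are made up for the example.

```jsl
Names Default To Here( 1 );

str = "leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve";

// Replace each run of non-digit characters with a single space...
digitsOnly = Regex( str, "[^0-9]+", " ", GLOBALREPLACE );

// ...then split on spaces. Words() ignores leading, trailing and repeated delimiters.
nums = Words( digitsOnly, " " );
Show( nums );
```

To keep decimals or signs, the character class would need to be adjusted, which also makes it easier to pick up stray periods and hyphens from the text.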

-Jarmo