- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
How do I identify numbers in a paragraph and extract as column in new tabel?
I have a text paragraph containing several numbers in different sentences and I would like to extract those numbers and get them stacked in a column in a new tabel.
The sentence looks like this:
The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg,
In addition the number of different primary samples can vary from 1 up to 20 or 30.
Very thankful if somebody could point me in the right direction with a script. Tried to write it with the help of Copilot, but did not get the result I wanted.
Thanks a heap!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
It's the commas. "kg," throws it off.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
The Words() function is only using spaces as the delimiters, therefore, kg words with commas( kg, ) or periods( kg. ) are leaving those characters in the extracted word. This can easily be changed by adding the comma and period as delimiters to check for along with the space
Change
words = Words( text );
to
words = Words( text," ,." );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
You can use one while loop for something like this (this will modify the original string so you might want to use copy of it)
Names Default To Here(1);
text = "The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred to 100 ml waterproof plastic bottle and sealed.";
res = {};
While(!Is Missing(match = Regex(text, "\d+.kg")),
Insert Into(res, match);
text = Substr(text, Contains(text, match) + Length(match));
);
show(res);
Or Pat Match if you can figure out how it works (I have never really been able to do that properly)
Names Default To Here(1);
text = "The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred to 100 ml waterproof plastic bottle and sealed.";
found_list = {};
Pat Match(
text,
Pat Any("., ") + Pat Regex("(\d+.kg)") >> match + Pat Any("., ") + Pat Test(
Insert Into(found_list, match);
0;
)
);
Show(found_list);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
You can use Words for this but you have to have a list of non-numeric characters. You can then use Concat Items() to turn the list you get back into a string
Names Default To Here(1);
str = "The bags are weighed on a mobile scale at storage location. Each 5. bag is spear sampled. The primary sample of A: 10 kg, B: 20 kg, C: 30 kg, 40 kg and E: 50 kg was dried for 24 hours at 105 C. The sample was subject to repeated splitting, leaving some 800-1000 grams which was further milled to pass a 100-mesh test sieve. The crushed sample is blended in a V-blender for 20 minutes and transferred to 100 ml waterproof plastic bottle and sealed.";
matchchar = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" || Get Punctuation Characters() || Get Whitespace Characters();
nums = Words(str, matchchar);
// {"5", "10", "20", "30", "40", "50", "24", "105", "800", "1000", "100", "20", "100"}
Concat Items(nums, ", "); // "5, 10, 20, 30, 40, 50, 24, 105, 800, 1000, 100, 20, 100"
If more complicated matching is required, Regex() or Pat Match() are options but in my opinion it isn't really worth it to learn Pat Match() with it's current documentation and examples found from documentation. Add flag to Regex Match() to find all non-overlapping occurances of pattern would make tasks like this much easier to complete.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
Thanks for all input so far - testing it now
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How do I identify numbers in a paragraph and extract as column in new tabel?
@CalibrationBear, have you found a solution and will you share it? Thanks
- « Previous
-
- 1
- 2
- Next »