Hello there,
I just saw a great talk at the 2023 Discovery Summit in Sitges on Regex and I wondered if this function could be used to extract key variables from my filenames which I get into JMP using the PDF Import Wizard. The problem is that the filenames are created by multiple operators and there is no common standard. I usually just use the re-code function and some general cleaning to get out what I need but it's very time-consuming, especially when I'm importing 100+ PDFs into JMP per week.
Here is an example script to show you what my data looks like. The key variables I want to extract are "Product Type", "Batch Number" and "Box Sampled". It was mentioned in the talk that another JMP User had a similar problem and a script was created to pull out the data they need. I'm not an expert in JSL so would love to get some help with this problem but appreciate it may be too complex to automate as there are so many inconsistences in the filenames. Any help/advice would be amazing! Best regards, Alicia
New Table( "REGEX Function Filename Example",
Add Rows( 8 ),
New Column( "Source Table",
Character,
"Nominal",
Set Selected,
Set Values(
{
"TYPE-A Z012345 BOX 10 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"TYPE-A Z012345 BOX 129 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"TYPE-B Z012346 BOX12 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"Z012347 TYPEB BOX16 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"TYPEC Z12347 BOX 103 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"BOX 14 TYPE-C Z012350 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"TYPE-D Z012450 BOX 142 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)",
"TYPED Z12451 BOX7 ImpactTemplate_01.pdf - 2023-02-14 14-09-03 Page(2) Table(1)"
}
)
),
New Column( "Product Type",
Character,
"Nominal",
Set Selected,
Set Values(
{"TYPE-A", "TYPE-A", "TYPE-B", "TYPE-B", "TYPE-C", "TYPE-C", "TYPE-D",
"TYPE-D"}
)
),
New Column( "Batch Number",
Character,
"Nominal",
Set Selected,
Set Values(
{"Z012345", "Z012345", "Z012346", "Z012347", "Z012347", "Z012350",
"Z012450", "Z012451"}
)
),
New Column( "Box Sampled",
Character,
"Nominal",
Set Selected,
Set Values(
{"BOX 10", "BOX 129", "BOX 12", "BOX 16", "BOX 103", "BOX 14", "BOX 142",
"BOX 7"}
)
)
)