Problem
You have some XML data that you want to import into a JMP Data Table.
In this example, I am using the JMP User Community's RSS feed for add-ins (the feed URL appears in the script below). Here is a snippet of the feed.
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
	<channel>
		<title>JMP Add-Ins articles</title>
		<link>
			https://community.jmp.com/t5/JMP-Add-Ins/tkb-p/add-ins
		</link>
		<description>JMP Add-Ins articles</description>
		<pubDate>Tue, 19 Dec 2017 21:15:06 GMT</pubDate>
		<dc:creator>add-ins</dc:creator>
		<dc:date>2017-12-19T21:15:06Z</dc:date>
		<item>
			<title>JMP User Community RSS Viewer Add-In</title>
			<link>
				https://community.jmp.com/t5/JMP-Add-Ins/JMP-User-Community-RSS-Viewer-Add-In/ta-p/45501
			</link>
			<description>Description Here
			</description>
			<pubDate>Wed, 04 Oct 2017 20:56:38 GMT</pubDate>
			<guid>
				https://community.jmp.com/t5/JMP-Add-Ins/JMP-User-Community-RSS-Viewer-Add-In/ta-p/45501
			</guid>
			<dc:creator>Justin_Chilton</dc:creator>
			<dc:date>2017-10-04T20:56:38Z</dc:date>
		</item>
		<item> ... </item>
		<item> ... </item>
	</channel>
</rss>
Solution
Use the Parse XML JSL function to grab the text from the XML structure and insert the data into a JMP data table.
There are three named arguments that you need to understand to use Parse XML(): On Element, Start Tag, and End Tag.
- On Element: defines what to do for a particular tag (using the Start Tag and/or End Tag arguments)
- Start Tag: defines a script to run when the start tag (e.g. "<name>") is reached. Within the Start Tag script, you can access an XML attribute using the XML Attr() function.
- End Tag: defines a script to run when the end tag (e.g. "</name>") is reached. Within the End Tag script, you can access the text of an element (i.e. the text between the start and end tags) using the XML Text() function.
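The three arguments above can be sketched on a tiny, made-up XML string. The element names (books, book, name), the id attribute, and the variable names xml_str, ids, and names are all hypothetical and exist only for this illustration; the sketch assumes XML Attr() takes the attribute name as a string.

```jsl
// a small, invented XML string; the "\[ ... ]\" form lets the string contain double quotes
xml_str = "\[<books><book id="b1"><name>First</name></book><book id="b2"><name>Second</name></book></books>]\";
// lists that collect what the scripts find
ids = {};
names = {};
Parse XML( xml_str,
	// Start Tag: read the id attribute when <book ...> is reached
	On Element( "book",
		Start Tag( Insert Into( ids, XML Attr( "id" ) ) )
	),
	// End Tag: read the element text when </name> is reached
	On Element( "name",
		End Tag( Insert Into( names, XML Text() ) )
	)
);
// print both lists to the log
Show( ids, names );
```

Collecting values into lists first keeps the sketch independent of any data table; the full script further down writes to table columns instead.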
The order in which you list the On Element arguments within Parse XML() does not matter. Each one defines what should happen whenever the parser encounters that element as it reads the document from top to bottom.
Use the Start Tag script when you need to read an attribute with XML Attr(), or when you need to create or set variables that the element's child elements (or its own End Tag script) will use. Use the End Tag script to get an XML element's text (the text between the start and end tags). All Start Tag and End Tag scripts are defined before JMP starts parsing the XML string, which is why their order does not matter: JMP looks up the matching script whenever it finds a start or end tag while parsing.
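As a minimal sketch of that division of labor, the following uses an invented XML string in which a parent element carries an attribute that its children need. The element names (library, shelf, book), the label attribute, and the variables current_shelf and rows are hypothetical.

```jsl
// invented XML: each shelf has a label attribute and one or more book children
xml_str = "\[<library><shelf label="A"><book>First</book><book>Second</book></shelf><shelf label="B"><book>Third</book></shelf></library>]\";
current_shelf = "";
rows = {};
Parse XML( xml_str,
	On Element( "shelf",
		// remember the parent's attribute so the child elements can use it
		Start Tag( current_shelf = XML Attr( "label" ) )
	),
	On Element( "book",
		// combine the stored parent value with this element's text
		End Tag( Insert Into( rows, current_shelf || ": " || XML Text() ) )
	)
);
// print the collected strings to the log
Show( rows );
```

The book script never sees the shelf element directly; it only reads the variable that the shelf's Start Tag script set earlier in the pass.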
When Parse XML() reaches an element in the XML string, it checks whether a Start Tag script is defined for that element and, if so, evaluates it. It does the same for each child element until it reaches an end tag. At an end tag, it looks up the End Tag script for that element and evaluates it, if one exists.
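One way to watch this sequence is to record a message from each script as it runs. The sketch below is only an illustration: the XML string and the variable names xml_str and trace are made up, and the link element deliberately has no Start Tag script.

```jsl
// invented XML with one parent element and two child elements
xml_str = "<item><title>Example</title><link>Example link</link></item>";
// list that records which scripts ran, in the order they ran
trace = {};
Parse XML( xml_str,
	On Element( "item",
		Start Tag( Insert Into( trace, "start item" ) ),
		End Tag( Insert Into( trace, "end item" ) )
	),
	On Element( "title",
		Start Tag( Insert Into( trace, "start title" ) ),
		End Tag( Insert Into( trace, "end title: " || XML Text() ) )
	),
	// only an End Tag script is defined here, so nothing is recorded at <link>
	On Element( "link",
		End Tag( Insert Into( trace, "end link: " || XML Text() ) )
	)
);
// print the recorded sequence to the log
Show( trace );
```

If parsing proceeds as described above, the entries for the child elements should fall between "start item" and "end item", in document order.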
rssFeed_str = Load Text File( "https://community.jmp.com/kvoqx44227/rss/board?board.id=add-ins" );
// create the data table
dt = New Table( "RSS Data",
	New Column( "ChannelTitle", "Character" ),
	New Column( "ChannelPubDate", "Character" ),
	New Column( "ChannelLink", "Character" ),
	New Column( "ChannelDescription", "Character" ),
	New Column( "Title", "Character" ),
	New Column( "Link", "Character" ),
	New Column( "Description", "Character" ),
	New Column( "PubDate", "Character" ),
	New Column( "Creator", "Character" )
);
// boolean flag to know whether we are inside an item tag
inItem = 0;
Parse XML( rssFeed_str,
	On Element(
		"item",
		Start Tag(
			// turn on the inItem flag
			// this is needed to differentiate between
			// tags for the channel and ones for an item
			inItem = 1;
			// add a row to the data table
			dt << Add Rows( 1 );
			// copy the channel data saved earlier into the new row
			dt:ChannelTitle = channel_title;
			dt:ChannelPubDate = channel_pubDate;
			dt:ChannelLink = channel_link;
			dt:ChannelDescription = channel_description;
		),
		End Tag(
			// reset the inItem flag
			inItem = 0;
		)
	),
	On Element(
		"title",
		End Tag(
			// since the title tag exists for both the channel and each item,
			// we want to preserve the channel title separately from the
			// current item's title
			// This same logic is used for the remaining elements
			If( inItem == 0,
				channel_title = XML Text(),
				dt:Title = XML Text()
			)
		)
	),
	On Element(
		"link",
		End Tag(
			If( inItem == 0,
				channel_link = XML Text(),
				dt:Link = XML Text()
			)
		)
	),
	On Element(
		"description",
		End Tag(
			If( inItem == 0,
				channel_description = XML Text(),
				dt:Description = XML Text()
			)
		)
	),
	On Element(
		"pubDate",
		End Tag(
			If( inItem == 0,
				channel_pubDate = XML Text(),
				dt:PubDate = XML Text()
			)
		)
	),
	On Element(
		"dc:creator",
		// the creator is only recorded for items, not for the channel
		End Tag( If( inItem == 1, dt:Creator = XML Text() ) )
	)
);
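Result: the "RSS Data" data table, with one row per item in the feed and the channel fields repeated on each row.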
Discussion
This script is built specifically for this RSS feed, so your script will almost certainly look very different. I hope it gives you a good example of how you can use JSL variables to make data available to child elements.
See Also
Scripting Guide > Extending JMP > Parsing XML
Depending on what you need, the XML Importer Add-In might do all the work for you (but it does not use Parse XML).