- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
How can I set a cell value to "missing" through a script?
I have a script that I use to remove out of family data points from large numeric data sets. When a value was found that was outside a defined distribution then that value was set to missing by using the code "columnReference[specificCell] = ."
I used to work great but now it seems the script does not like me setting the cell value to "." Any suggestions on how I can set a cell value to "missing"?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
Feels like something else must have changed in your code. For example, this works:
dt = Open("$SAMPLE_DATA/Big Class.jmp");
hVals = Column(dt, "height") << getValues;
hVals[Loc(hVals > 60)] = .;
Column(dt, "height") << setValues(hVals);
What error do you see?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
BTW, you do not need to explicitly loop through the target rows. You can simply and simultaneously assign missing to them using the multiple subscripts this way:
cleaningColumnRef[badDataPoints] = .;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
Feels like something else must have changed in your code. For example, this works:
dt = Open("$SAMPLE_DATA/Big Class.jmp");
hVals = Column(dt, "height") << getValues;
hVals[Loc(hVals > 60)] = .;
Column(dt, "height") << setValues(hVals);
What error do you see?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
The only feedback I get from the log is the following.
For( badPt = 1, badPt <= N Row( badDataPoints ), badPt++,
cleaningColumnRef[badDataPoints[badPt]] = /*###*/. /*###*/
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
BTW, you do not need to explicitly loop through the target rows. You can simply and simultaneously assign missing to them using the multiple subscripts this way:
cleaningColumnRef[badDataPoints] = .;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
It should be straightforward to assign a missing value to a cell.
Maybe the column has been locked, given a formula property or set to character data type?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
I double checked column being locked or of the wrong data type.
Thanks for the feedback. A capability report was added to the data I was analyzing. I guess since this was referencing the data table it would not allow me to delete individual cells through a script. It will however let me delete the bad data points manually. If I close the capability report then my script works again.
Seems like there should be a script function that will let me delete the data even though the capability report is open since I can do it manually.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
You can explore mimicking a manual delete with the MainMenu("Delete") command. A warning, save your files before running because a forced "manual" delete will erase whatever is selected in the front window.
Example using your terminology but without a loop:
dt = Open("$SAMPLE_DATA/Big Class.jmp");
badDataPoints = [1, 10];
cleaningColumnRef = Column("height");
dt << Go To(cleaningColumnRef);
dt << select rows(badDataPoints);
dt << Bring Window To Front;
Main Menu("Delete");
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
Just use:
"columnReference[specificCell] = empty()"
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
Does anyone have a recommendation on which approach for executing a number of these types of data cleaning operations has the best performance?
It seems that there are quite a few ways to accomplish these replacements, but for large data sets selecting the fastest may be important.
I used the multiple subscripts approach on a 10,000 column by 750 rows data set, and it took several minutes to replace 0 with missing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: How can I set a cell value to "missing" through a script?
I take it from this thread that you would like to do such operations programmatically. It's difficult to give general guidelines because there are so many possibilities. If you are really concerned about efficiency, you can profile your code to see where the bottlenecks are:
http://www.jmp.com/support/help/Debug_or_Profile_Scripts.shtml
In the specific case of replacing 0 with missing, if all the columns in your data table (dt) are numeric continuous, you can do:
m = dt << GetAsMatrix;
m[Loc(m == 0)] = .;
dt2 = AsTable(m);
Then you would have to fix up the column names and properties in dt2 of course.