How do I replace Missing Values w/ '0' using JSL?
I can do this manually using 'Find and Replace All', but I need to incorporate it in a code that runs automatically everyday.
Yet another way to do it but with a single loop (Numeric columns only).
dt = Current Data Table();
nc = dt << get column names( Numeric );
For( i = 1, i <= N Items( nc ), i++,
nc[i][dt << get rows where( Is Missing( nc[i][] ) )] = 0
);
Hi,
This is based on book 'Jump into scripting'. I replace all the missing values with z.
allCols = dt2 << Get Column Names( String );
For( i = 1, i <= N Items( allCols ), i++,
/* Loop through each row. */
For( j = 1, j <= N Rows( dt2 ), j++,
/* The IsMissing function will return a ‘1’ or true if the
value is missing. If true, cell is assigned zero. */
If( Is Missing( Column( dt2, allCols ) ),
Column( dt2, allCols ) = "z"
)
)
);
Hope this helps.
I would avoid using j as a looping variable because it is also a JSL function.
Function: Creates a matrix of identical values.
nrows: Number of rows in matrix. If ncols
is not specified, nrows
is also used as ncols
.
ncols: Number of columns in matrix.
value: The value used to populate the matrix. If value
is not specified, 1 is used.
This is essentially the same as exj's response but a little cleaner. It does; however, only change missing in Numeric columns.
dt = current data table();
for(ij = 1, ij <= NCol(dt), ij +=1,
CurCol = Column(dt, ij);
If(CurCol << get data type == "Numeric",
For Each Row(
IF(IsMissing(CurCol[]), CurCol[] = 0)
);
);
);
Yet another way to do it but with a single loop (Numeric columns only).
dt = Current Data Table();
nc = dt << get column names( Numeric );
For( i = 1, i <= N Items( nc ), i++,
nc[i][dt << get rows where( Is Missing( nc[i][] ) )] = 0
);
thanks everone!
cheers!
Sorry to be late to this party, but I feel compelled to interject: Setting missing values to zero (or the mean or any other single imputation method) has serious drawbacks. Most of these approaches produce biased parameter estimates, even in ideal conditions where the data are Missing Completely At Random. There are much better ways to deal with missing data.
While what you say is true, it makes assumptions about the data. For example, maybe his data isn't continuous? Sometimes 0 isn't recorded and instead left as missing. If someone were to try and do a Oneway analysis using 0 as false and 1 as true they'd have a problem if their zero's ended up displaying as missing, since Oneway analysis ignores missing rows in JMP. They'd end up with ONLY ONE category!
There are other reasons 0 could show up as missing. Usually storing 0 in a large database is a waste of space. Since Pokemon Go is big right now, I'll use it as an example. I might want to record all the pokemon I caught in a day, 12 ratattas and 6 pidgeys. I wouldn't however, want to then store 0 for the other 148 pokemon I didn't catch that day. If I did I'd quickly fill up my database with useless information. Thus, if I queried my data I'd get missing for most of my pokemon at any given time, but missing is = to 0.
In similar situations you could get missing data that are equivalent to 0 b/c of data conversion issues, poorly designed queries, ect.
That's all true. I felt like it was time to put up a Caution flag. Just because we can do something in JMP doesn't always mean we should.
Absolutely, I appreciate the word of caution. I for sure, didn't think about it.
It's also important to remember that missing data can be valuable data, and sometimes it's best to leave it as is!