Hi,
Subject to Dan's caveats regarding random assignment, this will do it (in this case, for 75% training data).
dt << New Column( "Validation", Nominal, << Set Values( Random Shuffle( (1 :: N Row( dt ))` > 0.75 * N Row( dt ) ) ) );
Why this works:
1) The (1::nrow(dt))` piece creates a row vector [1 2 3 ... nrow(dt)], and transposes it (using the ` operator) into a column vector [1, 2, 3, ... nrow(dt)].
2) Then, this column vector is compared element by element to 0.75*nrow(dt). Each element greater than that threshold becomes 1; every other element becomes 0. So, suppose we have nrow(dt) = 100. Then the original vector is:
[1, 2, 3, ... , 74, 75, 76, 77, ... 100]. After the comparison with 75, the result vector is:
[0, 0, 0, ... , 0, 0, 1, 1, ... 1]. That is, 75 0s followed by 25 1s.
3) Random Shuffle() returns the contents of a vector in random order, so the 75 0s and 25 1s (still using a 100-row table as an example) end up scattered randomly across the rows while the 75/25 split stays exact.
4) Finally, the << set values message fills the column with this random assortment of 75 0s and 25 1s.
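If it helps to see the intermediate results, the four steps above can be pulled apart as in the sketch below. The 8-row demo table and the names n, idx, flags and shuffled are made up for illustration; only the expressions themselves come from the one-liner.

```jsl
// Sketch only: a hypothetical 8-row table so each step can be inspected in the log.
dt = New Table( "Shuffle demo", Add Rows( 8 ), New Column( "x", Numeric, Continuous ) );

n = N Row( dt );                     // 8 rows in this demo table
idx = (1 :: n)`;                     // step 1: column vector [1, 2, ..., 8]
flags = idx > 0.75 * n;              // step 2: 0 for elements 1-6, 1 for elements 7-8
shuffled = Random Shuffle( flags );  // step 3: the same six 0s and two 1s, in random order

// The order changes from run to run, but the number of 1s does not.
Show( idx, flags, shuffled, Sum( shuffled ) );

// Step 4: fill the new column with the shuffled 0/1 values.
dt << New Column( "Validation", Nominal, << Set Values( shuffled ) );
```

Changing 0.75 to another fraction changes the split; when the fraction times the row count is not a whole number, the comparison decides which side the boundary row falls on.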
FWIW, another way to do this interactively is to select Cols > New Columns... from the main menu, then fill out the dialog as below:
Cheers,
Brady