For such an ideal table (i.e. if the proportions in your data set match the fractions you want), you can draw random samples from the full data set, random samples per variant, or a specific number of samples per variant. You can also combine all 3.
In every case you will get the subgroups with the requested fractions (1/2, 1/3, 1/3 and 1/5).
// random sampling: full data set
If( Not( Current Data Table() << Has Column( "cum_prob" ) ),
    New Column( "cum_prob",
        Formula(
            Col Rank( Random Uniform() ) / (
            Col Number( 1 ))
        )
    )
);
// random sampling: per variant
If( Not( Current Data Table() << Has Column( "cum_prob_indiv" ) ),
    New Column( "cum_prob_indiv",
        Formula(
            tmp = Random Uniform(); // tmp = 1; // **)
            Col Rank( tmp, :gender, :income, :region, :age ) / (
            Col Number( tmp, :gender, :income, :region, :age ))
        )
    )
);
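// force ratios: 1/2, 1/3, 1/3, 1/5
// (random rank within each variant; filtering on rank_indiv <= k keeps at most k rows per variant)
If( Not( Current Data Table() << Has Column( "rank_indiv" ) ),
    New Column( "rank_indiv",
        Formula( Col Rank( Random Uniform(), :gender, :income, :region, :age ) )
    )
);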
Graph Builder(
    Size( 518, 448 ),
    Show Control Panel( 0 ),
    Graph Spacing( 4 ),
    Variables( X( :gender ), X( :income ), X( :region ), X( :age ) ),
    Elements( Position( 1, 1 ), Bar( X, Summary Statistic( "N" ) ) ),
    Elements( Position( 2, 1 ), Bar( X, Summary Statistic( "N" ) ) ),
    Elements( Position( 3, 1 ), Bar( X, Summary Statistic( "N" ) ) ),
    Elements( Position( 4, 1 ), Bar( X, Summary Statistic( "N" ) ) ),
    Local Data Filter(
        Title( "how many samples do you want ? " ),
        Add Filter(
            columns( :cum_prob, :cum_prob_indiv, :rank_indiv )
        )
    )
);
**) Instead of using CDFs of Random Uniform(), one can randomize the row order and use CDFs of "1".
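
If you would rather pull the sample out by script than with the Local Data Filter sliders, something along these lines might work. It is only a sketch: it assumes the columns cum_prob_indiv and rank_indiv from the script above already exist, and the names fraction, sample20 and sample5 as well as the values 0.2 and 5 are made up for illustration.

```jsl
dt = Current Data Table();

// Hypothetical target: about 20 % of the rows of every
// gender/income/region/age combination.
fraction = 0.2;

// cum_prob_indiv runs from 1/n to 1 within each combination (n = rows in
// that combination), so "<= fraction" keeps Floor( fraction * n ) rows of it.
dt << Select Where( :cum_prob_indiv <= fraction );
sample20 = dt << Subset( Selected Rows( 1 ), Selected Columns Only( 0 ) );

// Alternative: a fixed number of rows per combination via rank_indiv
// (hypothetical choice: 5 rows each).
dt << Clear Select;
dt << Select Where( :rank_indiv <= 5 );
sample5 = dt << Subset( Selected Rows( 1 ), Selected Columns Only( 0 ) );
dt << Clear Select;
```

Combinations with fewer than 5 rows contribute all of their rows to sample5, so the ratios are only forced exactly when every combination has at least that many rows.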