I need to randomize a list.
randomizeList = Function({List}, {Default Local},
//Code goes here
Return( List );
);
list = {1,2,3,4,5};
randomList = randomizeList(list);
print(randomList); //Expected output something like {2,4,5,1,3}
There are lots of ways to do this and no wrong answer. I appreciate any feedback, thanks!
Here's one way to do it. Would love to see other approaches!
randomizeList = Function({List}, {Default Local},
nr = nitems(List);
dt = New Table( "", invisible, Add Rows( nr ),
New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),
Set Values( List ) ),
New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),
Formula( Random Uniform() ) )
);
dt << sort(replace table, by(:Randomized));
random_list = dt:Data << get values;
close(dt, nosave);
random_list;
);
list = {1,2,3,4,5};
randomList = randomizeList(list);
print(randomList); //Expected output something like {2,4,5,1,3}
Here's my implementation.
randomizeList =Function({List},{Default Local},
//Code goes here
List[rank(j(nitems(List), 1, random uniform()))]
);
list = {100,200,300,400,500};
randomList = randomizeList(list);
print(randomList);
In case it's not clear what I'm doing here
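The one-liner above can be unpacked into three steps. This is only a sketch of the same logic with intermediate variables; the names n, keys, and order are illustrative and not from the original post.

```jsl
Names Default To Here( 1 );

aList = {100, 200, 300, 400, 500};
n = N Items( aList );

// Step 1: an n x 1 column vector of random sort keys, one per list item.
// The one-liner relies on J() evaluating Random Uniform() for each cell.
keys = J( n, 1, Random Uniform() );

// Step 2: Rank() returns the positions of the keys as if they were sorted
// from lowest to highest, which gives a random ordering of 1..n.
order = Rank( keys );

// Step 3: subscripting a list with a vector of indices returns the
// items in that order.
shuffled = aList[order];
Show( shuffled );
```

Because the list is only subscripted, never sorted by its own values, this works for numeric and character items alike.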
-Jeff
In JMP there are almost always at least three ways to do anything. So here's the third.
Starting with alist, make rslist, a shuffled vector of the index numbers 1 through the number of items in alist. Then make blist with the values of alist in the order given by rslist.
alist={a,b,c,1,2,3};
rslist= random shuffle(1::nitems(alist));
blist=alist[rslist[1::nitems(alist)]];
It doesn't matter if you are using numbers or characters or a combination in the list.
I'm pretty sure I got this idea from the JSL companion book.
-Byron
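For comparison with the function template in the original question, here is a minimal sketch of the same Random Shuffle() idea wrapped in a function. The name shuffleList is hypothetical. Since rslist[1::nitems(alist)] selects every element of rslist, the sketch subscripts the list with the shuffled index vector directly.

```jsl
Names Default To Here( 1 );

// shuffleList is a hypothetical name used only for this sketch.
shuffleList = Function( {aList}, {Default Local},
	// Shuffle the index vector 1..n, then use it to subscript the list.
	aList[Random Shuffle( 1 :: N Items( aList ) )]
);

// Mixed character and numeric items, as in the post above.
mixed = {"a", "b", "c", 1, 2, 3};
Show( shuffleList( mixed ) );
```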
Very interesting. My table solution seemed kind of brute force but it outperforms the other two functions when the number of list elements gets above 15,000. Table functions are very efficient apparently.
Here's the code I used:
// pmroz version
pmroz_randomizeList = Function({aList}, {Default Local},
nr = nitems(aList);
dt = New Table( "", invisible, Add Rows( nr ),
New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),
Set Values( aList ) ),
New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),
Formula( Random Uniform() ) )
);
dt << sort(replace table, by(:Randomized));
random_list = dt:Data << get values;
close(dt, nosave);
random_list;
);
// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)
jeff_randomizeList =Function({aList},{Default Local},
aList[rank(j(nitems(aList), 1, random uniform()))]
);
// Byron's function
byron_randomizeList = Function({aList},{Default Local},
nr = nitems(alist);
rslist = random shuffle(1::nr);
blist = alist[rslist[1::nr]];
);
nm = 4000; // Set the number of list elements
my_list = (as list(1::nm))[1];
start = today();
randomList = pmroz_randomizeList(my_list);
elapsed = today() - start;
print("pmroz: " || char(elapsed));
wait(0);
my_list = (as list(1::nm))[1];
start = today();
randomList = jeff_randomizeList(my_list);
elapsed = today() - start;
print("Jeff: " || char(elapsed));
wait(0);
my_list = (as list(1::nm))[1];
start = today();
randomList = byron_randomizeList(my_list);
elapsed = today() - start;
print("Byron: " || char(elapsed));
wait(0);
The results will change in JMP 13.
With 400,000 elements, the results look like this in JMP 13:
"pmroz: 0.533333333332848"
"Jeff: 0.233333333333576"
"Byron: 0.216666666667152"
So the JMP 13 Insert Into change will fix the same issue? This is great news!
Table functions are very efficient for large data sets. One of the main reasons is that JMP takes advantage of multiple cores while JSL does not. However, for small data sets, creating the table adds a lot of overhead. For my application I actually had something similar to PMroz's answer, but since my lists all had fewer than 1000 items, I noticed it was adding quite a bit of time.
Thanks Everyone!
Note: Today() is only accurate to the second, Tick Seconds() is accurate to 1/60 of a second, and HP Time() is accurate to the microsecond. Of course, this all depends on your computer. Also, to make the comparison fairer, I changed your table to private and removed the Close statement, which speeds up your function.
Names Default to Here(1);
// pmroz version
pmroz_randomizeList = Function({aList}, {Default Local},
dt = New Table( "", private,
New Column( "Data", Set Values( aList ) ),
New Column( "Randomized", Numeric, Continuous, Formula( Random Uniform() ) )
);
dt << sort(replace table, by(:Randomized));
random_list = dt:Data << get values;
);
// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)
jeff_randomizeList =Function({aList},{Default Local},
aList[rank(j(nitems(aList), 1, random uniform()))]
);
// Byron's function
byron_randomizeList = Function({aList},{Default Local},
nr = nitems(alist);
rslist = random shuffle(1::nr);
blist = alist[rslist[1::nr]];
);
nm = 100; // Set the number of list elements
imax = 100; // Set number of iterations
my_list = (as list(1::nm))[1];
start = HP TIME();
For(i=1, i<=imax, i++,
randomList = pmroz_randomizeList(my_list);
);
elapsed = HP TIME() - start;
print("pmroz: " || char(elapsed));
wait(0);
start = HP TIME();
For(i=1, i<=imax, i++,
randomList = jeff_randomizeList(my_list);
);
elapsed = HP TIME() - start;
print("Jeff: " || char(elapsed));
wait(0);
start = HP TIME();
For(i=1, i<=imax, i++,
randomList = byron_randomizeList(my_list);
);
elapsed = HP TIME() - start;
print("Byron: " || char(elapsed));
wait(0);
//OUTPUT
"pmroz: 121646"
"Jeff: 3990"
"Byron: 3488"
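HP Time() reports microseconds, so the totals above cover all 100 iterations of each loop. A small sketch of the conversion to seconds and to average milliseconds per call, using the numbers copied from the output above (the variable names are illustrative):

```jsl
Names Default To Here( 1 );

imax = 100; // iterations used in the timing loops above

// Elapsed HP Time() values from the output above, in microseconds:
// pmroz, Jeff, Byron
elapsed = [121646, 3990, 3488];

total_seconds = elapsed / 1e6;       // whole loop, in seconds
ms_per_call = elapsed / 1000 / imax; // average per call, in milliseconds

Show( total_seconds, ms_per_call );
```

On these numbers the table version averages about 1.2 ms per call, against roughly 0.04 ms and 0.03 ms for the other two.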
The private keyword is interesting - the table doesn't even show up in the window list. I imagine that it saves a bit of memory. I can use this in a bunch of places. Thanks msharp!
Private tables are super useful, but be careful: while each table uses less memory, they are harder to track. You can quickly take up a lot of memory by creating and opening private tables, and it's easy to miss since they don't have a physical window (like in my example above).
However, they can save you a lot of time since you don't need to close them; you can just reassign the variable that holds them (which is what I intended above, but I didn't realize your function had a {Default Local}. DOH!).
From the scripting guide:
Private Data Tables
Completely hide the data table from view by including the private argument:
dt = Open( "$SAMPLE_DATA/Big Class.jmp", "private" );
Making a table private can prevent memory problems when a script opens and closes many
tables. Private data tables have no physical window, so less memory is required than with
invisible tables.
As with invisible tables, analyses that are run on a private data table are linked to the table.
To avoid losing a private data table, you must assign it a reference as shown in the preceding
example. Otherwise, JMP immediately removes the private data table from memory.
Additional uses of the table later in the script generate errors.
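Putting the points above together, here is a minimal sketch of a private scratch table that is kept alive by a reference and then closed explicitly, so private tables do not pile up in memory. The table name "scratch" and the sample values are made up for illustration.

```jsl
Names Default To Here( 1 );

// Keep the reference in dt: per the Scripting Guide excerpt above,
// a private table without a reference is removed from memory immediately.
dt = New Table( "scratch", private,
	New Column( "Data", Numeric, Continuous, Set Values( [3, 1, 2] ) )
);

// Read the column back as a column vector.
vals = dt:Data << Get Values;

// Close explicitly once done, since there is no window to remind you
// that the table is still open.
Close( dt, NoSave );

Show( vals );
```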