Solved: Re: Randomize a List

msharp · Jul 13, 2016 12:32 PM

I need to randomize a list.

randomizeList = Function({List}, {Default Local},

//Code goes here

Return( List );

);

list = {1,2,3,4,5};

randomList = randomizeList(list);

print(randomList); //Expected output something like {2,4,5,1,3}

There are lots of ways to do this and no wrong answer. I appreciate any feedback, thanks!

pmroz · Jul 13, 2016 01:13 PM

Here's one way to do it. Would love to see other approaches!

randomizeList = Function({List}, {Default Local},

nr = nitems(List);

dt = New Table( "", invisible, Add Rows( nr ),

New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),

Set Values( List ) ),

New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),

Formula( Random Uniform() ) )

);

dt << sort(replace table, by(:Randomized));

random_list = dt:Data << get values;

close(dt, nosave);

random_list;

);

list = {1,2,3,4,5};

randomList = randomizeList(list);

print(randomList); //Expected output something like {2,4,5,1,3}

Jeff_Perkinson · Jul 13, 2016 02:00 PM

Here's my implementation.

randomizeList = Function({List}, {Default Local},
    // Index the list by the sort order of a column of random numbers
    List[rank(j(nitems(List), 1, random uniform()))]
);
list = {100,200,300,400,500};
randomList = randomizeList(list);
print(randomList);

In case it's not clear what I'm doing here:

The j() function creates a matrix of random numbers, e.g. [0.04757597646676, 0.738883488578722, 0.935981905087828, 0.0233044682536274, 0.493906983872876].
The rank() function returns a matrix of the positions that would sort those numbers (e.g. for above: [4, 1, 5, 2, 3]), which is then used to index the original list, returning it in the order of the sorted random numbers.
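
A minimal sketch of those two steps, with the random draws replaced by the example values quoted above (rounded here) so the result is repeatable. The variable names are only for illustration:

```jsl
Names Default To Here( 1 );
lst = {100, 200, 300, 400, 500};
// Fixed stand-in for J( 5, 1, Random Uniform() ), using the example values above
r = [0.0476, 0.7389, 0.9360, 0.0233, 0.4939];
idx = Rank( r );        // positions of r from lowest to highest: [4, 1, 5, 2, 3]
shuffled = lst[idx];    // {400, 100, 500, 200, 300}
Show( idx, shuffled );
```

With real random draws, every call produces a different idx and therefore a different ordering.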

-Jeff

Byron_JMP · Jul 13, 2016 03:51 PM

In JMP there are almost always at least three ways to do anything. So here's the third.

Starting with alist, make rslist, a shuffled vector of the index numbers 1 through the number of items in alist. Then make blist from the values of alist, taken in the order given by rslist.

alist={a,b,c,1,2,3};

rslist= random shuffle(1::nitems(alist));

blist=alist[rslist[1::nitems(alist)]];

It doesn't matter if you are using numbers or characters or a combination in the list.
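
As a quick check of that claim, here is a small sketch using a made-up list that mixes strings and numbers. The list contents and variable names are assumptions for illustration only:

```jsl
Names Default To Here( 1 );
mixed = {"a", 2, "c", 4.5, "e"};              // hypothetical mixed-type list
n = N Items( mixed );
shuffled = mixed[Random Shuffle( 1 :: n )];   // index the list by a shuffled vector of 1..n
Show( shuffled );
Show( N Items( shuffled ) == n );             // same number of items, just reordered
```

Only the index vector is shuffled, so the item types never matter to Random Shuffle().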

I'm pretty sure I got this idea from the JSL companion book.

-Byron

JMP Systems Engineer, Health and Life Sciences (Pharma)

pmroz · Oct 18, 2016 8:22 PM

Very interesting. My table solution seemed kind of brute force, but it outperforms the other two functions when the number of list elements gets above 15,000. Apparently table functions are very efficient.

Here's the code I used:

// pmroz version

pmroz_randomizeList = Function({aList}, {Default Local},

nr = nitems(aList);

dt = New Table( "", invisible, Add Rows( nr ),

New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),

Set Values( aList ) ),

New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),

Formula( Random Uniform() ) )

);

dt << sort(replace table, by(:Randomized));

random_list = dt:Data << get values;

close(dt, nosave);

random_list;

);

// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)

jeff_randomizeList =Function({aList},{Default Local},

aList[rank(j(nitems(aList), 1, random uniform()))]

);

// Byron's function

byron_randomizeList = Function({aList},{Default Local},

nr = nitems(alist);

rslist = random shuffle(1::nr);

blist = alist[rslist[1::nr]];

);

nm = 4000; // Set the number of list elements

my_list = (as list(1::nm))[1];

start = today();

randomList = pmroz_randomizeList(my_list);

elapsed = today() - start;

print("pmroz: " || char(elapsed));

wait(0);

my_list = (as list(1::nm))[1];

start = today();

randomList = jeff_randomizeList(my_list);

elapsed = today() - start;

print("Jeff: " || char(elapsed));

wait(0);

my_list = (as list(1::nm))[1];

start = today();

randomList = byron_randomizeList(my_list);

elapsed = today() - start;

print("Byron: " || char(elapsed));

wait(0);

Craige_Hales · Nov 9, 2016 2:06 AM

The results will change in JMP 13.

Fast List

A 400,000-element list looks like this in JMP 13:

"pmroz: 0.533333333332848"

"Jeff: 0.233333333333576"

"Byron: 0.216666666667152"

Craige

msharp · Jul 13, 2016 05:56 PM

So the JMP 13 Insert Into change will fix the same issue? This is great news!

msharp · Jul 13, 2016 05:29 PM

Table functions are very efficient for large data sets. One of the main reasons is that JMP takes advantage of multiple cores while JSL does not. However, for small data sets the process of creating the table adds a lot of overhead. For the application I am using this for, I actually had something similar to pmroz's answer, but since my lists were all <1000 items, I noticed it was adding quite a bit of time.

Thanks Everyone!

Note: Today() is only accurate to the second, Tick Seconds() is accurate to 1/60 of a second, and HP Time() is accurate to the microsecond. Of course, this all depends on your computer. Also, to make the comparison fairer, I changed your tables to private and removed the close statement, which speeds up your script.

Names Default to Here(1);

// pmroz version

pmroz_randomizeList = Function({aList}, {Default Local},

dt = New Table( "", private,

New Column( "Data", Set Values( aList ) ),

New Column( "Randomized", Numeric, Continuous, Formula( Random Uniform() ) )

);

dt << sort(replace table, by(:Randomized));

random_list = dt:Data << get values;

);

// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)

jeff_randomizeList =Function({aList},{Default Local},

aList[rank(j(nitems(aList), 1, random uniform()))]

);

// Byron's function

byron_randomizeList = Function({aList},{Default Local},

nr = nitems(alist);

rslist = random shuffle(1::nr);

blist = alist[rslist[1::nr]];

);

nm = 100; // Set the number of list elements

imax = 100; // Set number of iterations

my_list = (as list(1::nm))[1];

start = HP TIME();

For(i=1, i<=imax, i++,

randomList = pmroz_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("pmroz: " || char(elapsed));

wait(0);

start = HP TIME();

For(i=1, i<=imax, i++,

randomList = jeff_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("Jeff: " || char(elapsed));

wait(0);

start = HP TIME();

For(i=1, i<=imax, i++,

randomList = byron_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("Byron: " || char(elapsed));

wait(0);

//OUTPUT (elapsed microseconds from HP Time(), total over all iterations)

"pmroz: 121646"

"Jeff: 3990"

"Byron: 3488"

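To put those totals in per-call terms, the arithmetic can be sketched as below. It assumes the figures are the ones printed above, for imax = 100 iterations on a 100-item list:

```jsl
Names Default To Here( 1 );
imax = 100;                        // iterations per function, as in the script above
totals = [121646, 3990, 3488];     // pmroz, Jeff, Byron: total microseconds from the output above
perCall = totals / imax;           // microseconds per call: about 1216, 40 and 35
ratio = totals[1] / totals[3];     // table version vs. Random Shuffle version: about 35x
Show( perCall, ratio );
```

So on a list this small, each call to the table-based version costs a little over a millisecond, most of which is presumably table creation overhead.
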
pmroz · Jul 14, 2016 08:19 AM

The private keyword is interesting - the table doesn't even show up in the window list. I imagine that it saves a bit of memory. I can use this in a bunch of places. Thanks msharp!

msharp · Jul 14, 2016 10:02 AM

Private tables are super useful, but be careful: while each table uses less memory, they are harder to track. You can quickly take up a lot of memory by creating and opening private tables, and it's easy to miss since they don't have a physical window (like in my example above).

However, they can save you a lot of time since you don't need to close them; you can just reassign the variable that holds them (which is what I intended above, but I didn't realize your function had a {Default Local} -- DOH!).

From the scripting guide:

Private Data Tables

Completely hide the data table from view by including the private argument:

dt = Open( "$SAMPLE_DATA/Big Class.jmp", "private" );

Making a table private can prevent memory problems when a script opens and closes many tables. Private data tables have no physical window, so less memory is required than with invisible tables.

As with invisible tables, analyses that are run on a private data table are linked to the table.

To avoid losing a private data table, you must assign it a reference as shown in the preceding example. Otherwise, JMP immediately removes the private data table from memory. Additional uses of the table later in the script generate errors.
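
One possible middle ground for the function discussed in this thread, sketched here rather than benchmarked: keep the table private but close it before returning, so repeated calls do not accumulate hidden tables. The function name private_randomizeList is made up for illustration, and the body simply combines the two table versions posted above:

```jsl
Names Default To Here( 1 );
// Hypothetical variant: private table (no window), but closed explicitly before returning
private_randomizeList = Function( {aList}, {Default Local},
	dt = New Table( "", private,
		New Column( "Data", Set Values( aList ) ),
		New Column( "Randomized", Numeric, Continuous, Formula( Random Uniform() ) )
	);
	dt << Sort( Replace Table, By( :Randomized ) );
	shuffled = dt:Data << Get Values;   // values of the Data column in shuffled order
	Close( dt, NoSave );                // release the private table before the local reference is lost
	shuffled;                           // last expression is the function's return value
);
Show( private_randomizeList( {1, 2, 3, 4, 5} ) );
```

The explicit Close() costs some of the time saved by skipping it, so whether this is worthwhile depends on how many times the function is called.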