Randomize a List

msharp

Super User

Joined:

Jul 28, 2015

I need to randomize a list. 

randomizeList = Function({List}, {Default Local},

     //Code goes here

     Return( List );

);

list = {1,2,3,4,5};

randomList = randomizeList(list);

print(randomList); //Expected output something like {2,4,5,1,3}

There are lots of ways to do this and no wrong answer.  I appreciate any feedback, thanks! 

13 REPLIES
pmroz

Super User

Joined:

Jun 23, 2011

Here's one way to do it.  Would love to see other approaches!

randomizeList = Function({List}, {Default Local},

    nr = nitems(List);

    dt = New Table( "", invisible, Add Rows( nr ),

        New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),

            Set Values( List ) ),

        New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),

            Formula( Random Uniform() ) )

    );

    dt << sort(replace table, by(:Randomized));

    random_list = dt:Data << get values;

    close(dt, nosave);

    random_list;

);

list = {1,2,3,4,5};

randomList = randomizeList(list);

print(randomList); //Expected output something like {2,4,5,1,3}

Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011

Here's my implementation.

randomizeList =Function({List},{Default Local},

     //Code goes here

    List[rank(j(nitems(List), 1, random uniform()))]

);

list = {100,200,300,400,500};

randomList = randomizeList(list);

print(randomList);

In case it's not clear what I'm doing here:

  • The j() function creates a matrix of random numbers, e.g. [0.04757597646676, 0.738883488578722, 0.935981905087828, 0.0233044682536274, 0.493906983872876]
  • The rank() function returns a matrix of positions (e.g. for above: [4, 1, 5, 2, 3]), which is used to return the original list in the order given by sorting the random numbers.
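To make the two steps concrete, here is a small sketch that uses the five example numbers above as fixed keys instead of fresh random draws, so the result is reproducible. The variable names keys, order, items, and shuffled are only for illustration.

```jsl
Names Default To Here( 1 );

// Fixed keys copied from the example above (a real shuffle would draw
// these with Random Uniform()).
keys = [0.04757597646676, 0.738883488578722, 0.935981905087828,
	0.0233044682536274, 0.493906983872876];

// Rank() returns the positions of the keys from smallest to largest.
order = Rank( keys );        // [4, 1, 5, 2, 3], as in the example above

items = {100, 200, 300, 400, 500};

// Subscripting a list with a matrix of positions returns the items
// in that order.
shuffled = items[order];     // {400, 100, 500, 200, 300}

Show( order, shuffled );
```

Because the keys are independent random numbers in the real function, every ordering of the positions is equally likely, barring ties.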

-Jeff

Byron_JMP

Staff

Joined:

Apr 26, 2012

In JMP there are almost always at least three ways to do anything.  So here's the third.

Starting with alist, make rslist, a shuffled list of the index numbers from 1 to the number of items in alist. Then make blist with the values of alist in the order given by rslist.

alist={a,b,c,1,2,3};

rslist= random shuffle(1::nitems(alist));

blist=alist[rslist[1::nitems(alist)]];

It doesn't matter if you are using numbers or characters or a combination in the list.
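A slightly shorter variant of the same idea, as a sketch: since the shuffled index vector already contains every position exactly once, the second subscript [1::nitems(alist)] selects all of it and can be dropped. The names n and perm are made up here, and the list uses quoted strings rather than bare names just to keep the example self-contained.

```jsl
Names Default To Here( 1 );

alist = {"a", "b", "c", 1, 2, 3};
n = N Items( alist );

// Shuffle the index vector 1..n, then use it directly as the subscript.
perm = Random Shuffle( 1 :: n );
blist = alist[perm];

Show( perm, blist );
```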

I'm pretty sure I got this idea from the JSL companion book.

-Byron

pmroz

Super User

Joined:

Jun 23, 2011

Very interesting.  My table solution seemed kind of brute force, but it outperforms the other two functions once the list has more than 15,000 elements.  Table functions are apparently very efficient.

[Image: 12002_Randomize List Algorithms.png]

Here's the code I used:

// pmroz version

pmroz_randomizeList = Function({aList}, {Default Local},

    nr = nitems(aList);

    dt = New Table( "", invisible, Add Rows( nr ),

        New Column( "Data", Numeric, Continuous, Format( "Best", 12 ),

            Set Values( aList ) ),

        New Column( "Randomized", Numeric, Continuous, Format( "Best", 12 ),

            Formula( Random Uniform() ) )

    );

    dt << sort(replace table, by(:Randomized));

    random_list = dt:Data << get values;

    close(dt, nosave);

    random_list;

);

// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)

jeff_randomizeList =Function({aList},{Default Local},

    aList[rank(j(nitems(aList), 1, random uniform()))]

);

// Byron's function

byron_randomizeList = Function({aList},{Default Local},

    nr = nitems(alist);

    rslist = random shuffle(1::nr);

    blist  = alist[rslist[1::nr]];

);

nm = 4000;   // Set the number of list elements

my_list = (as list(1::nm))[1];

start = today();

randomList = pmroz_randomizeList(my_list);

elapsed = today() - start;

print("pmroz: " || char(elapsed));

wait(0);

my_list = (as list(1::nm))[1];

start = today();

randomList = jeff_randomizeList(my_list);

elapsed = today() - start;

print("Jeff: " || char(elapsed));

wait(0);

my_list = (as list(1::nm))[1];

start = today();

randomList = byron_randomizeList(my_list);

elapsed = today() - start;

print("Byron: " || char(elapsed));

wait(0);

Craige_Hales

Staff

Joined:

Mar 21, 2013

The results will change in JMP 13.

Fast List

With 400,000 list elements, the timings look like this in JMP 13:

"pmroz: 0.533333333332848"

"Jeff: 0.233333333333576"

"Byron: 0.216666666667152"

Craige
msharp

Super User

Joined:

Jul 28, 2015

So the JMP 13 Insert Into change will fix the same issue?  This is great news!

msharp

Super User

Joined:

Jul 28, 2015

Table functions are very efficient for large data sets.  One of the main reasons is that JMP takes advantage of multiple cores while JSL does not.  For small data sets, however, creating the table adds a lot of overhead.  For my application I actually had something similar to pmroz's answer, but since my lists all had fewer than 1,000 items, I noticed it was adding quite a bit of time.

Thanks Everyone!

Note: Today() is only accurate to the second, Tick Seconds() is accurate to 1/60 of a second, and HP Time() is accurate to the microsecond.  Of course, this all depends on your computer.  Also, to make the comparison fair, I changed your tables to private and removed the Close statement, which speeds up your script.

Names Default to Here(1);

// pmroz version

pmroz_randomizeList = Function({aList}, {Default Local},

    dt = New Table( "", private,

        New Column( "Data", Set Values( aList ) ),

        New Column( "Randomized", Numeric, Continuous, Formula( Random Uniform() ) )

    );

    dt << sort(replace table, by(:Randomized));

    random_list = dt:Data << get values;

);

// Jeff Perkinson Jul 13, 2016 2:00 PM (in response to Peter Mroz)

jeff_randomizeList =Function({aList},{Default Local},

    aList[rank(j(nitems(aList), 1, random uniform()))]

);

// Byron's function

byron_randomizeList = Function({aList},{Default Local},

    nr = nitems(alist);

    rslist = random shuffle(1::nr);

    blist = alist[rslist[1::nr]];

);

nm = 100;   // Set the number of list elements

imax = 100; // Set number of iterations

my_list = (as list(1::nm))[1];

start = HP TIME();

For(i=1, i<=imax, i++,

       randomList = pmroz_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("pmroz: " || char(elapsed));

wait(0);

start = HP TIME();

For(i=1, i<=imax, i++,

       randomList = jeff_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("Jeff: " || char(elapsed));

wait(0);

start = HP TIME();

For(i=1, i<=imax, i++,

       randomList = byron_randomizeList(my_list);

);

elapsed = HP TIME() - start;

print("Byron: " || char(elapsed));

wait(0);


//OUTPUT

"pmroz: 121646"

"Jeff: 3990"

"Byron: 3488"
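Since HP Time() reports microseconds and each total covers imax = 100 calls, the figures are easier to compare per call. A small sketch of the conversion, using only the numbers printed above (the names totals, labels, and per_call_ms are illustrative):

```jsl
Names Default To Here( 1 );

imax = 100;                        // iterations used in the benchmark above
labels = {"pmroz", "Jeff", "Byron"};
totals = {121646, 3990, 3488};     // total microseconds, as printed above

For( i = 1, i <= N Items( totals ), i++,
	// microseconds per run -> milliseconds per call
	per_call_ms = totals[i] / imax / 1000;
	Print( labels[i] || ": " || Char( per_call_ms ) || " ms per call" );
);
```

That works out to roughly 1.2 ms per call for the table version versus about 0.04 ms and 0.035 ms for the two matrix versions at 100 elements.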

pmroz

Super User

Joined:

Jun 23, 2011

The private keyword is interesting - the table doesn't even show up in the window list.  I imagine that it saves a bit of memory.  I can use this in a bunch of places.  Thanks msharp!

msharp

Super User

Joined:

Jul 28, 2015

Private tables are super useful, but be careful: while each table uses less memory, they are harder to track.  You can quickly take up a lot of memory by creating and opening private tables, and it's easy to miss since they don't have a physical window (like in my example above).

However, they can save you a lot of time since you don't need to close them, you can just reassign the variable that holds them (which is what I intended for above but didn't realize your function had a {Default Local} -- DOH!).

From the scripting guide:

Private Data Tables

Completely hide the data table from view by including the private argument:

dt = Open( "$SAMPLE_DATA/Big Class.jmp", "private" );

Making a table private can prevent memory problems when a script opens and closes many tables. Private data tables have no physical window, so less memory is required than with invisible tables.

As with invisible tables, analyses that are run on a private data table are linked to the table.

To avoid losing a private data table, you must assign it a reference as shown in the preceding example. Otherwise, JMP immediately removes the private data table from memory. Additional uses of the table later in the script generate errors.
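One way to act on the warning about private tables piling up, sketched under the assumption that the table is only needed inside the function: keep the table private, but still close it before returning so repeated calls do not accumulate hidden tables. The function name randomizeListViaTable and the table name shuffle_tmp are made up for illustration.

```jsl
Names Default To Here( 1 );

randomizeListViaTable = Function( {aList}, {Default Local},
	// Private table: no window is created.
	dt = New Table( "shuffle_tmp", private,
		New Column( "Data", Numeric, Continuous, Set Values( aList ) ),
		New Column( "Randomized", Numeric, Continuous, Formula( Random Uniform() ) )
	);
	dt << Sort( Replace Table, By( :Randomized ) );
	result = dt:Data << Get Values;
	// dt is local to the function, so close the table here; otherwise
	// nothing outside the function can reach it to clean it up.
	Close( dt, NoSave );
	result;
);

shuffled = randomizeListViaTable( {1, 2, 3, 4, 5} );
Show( shuffled );
```

Note that Get Values on a numeric column returns a matrix rather than a list, so the caller gets the shuffled values as a column vector.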