Is there a quick way to make a large list without a for loop?

msharp (Super User, joined Jul 28, 2015)
Many programming languages have some shorthand for this.

For example, to make a list from 1 to 100:

list = 1:100

or

list = range(1,100)

or

list = {1..100}

or

list = seq(1,100)

Let me know! I couldn't find anything in the scripting guide.

Accepted Solution

a = 3;
b = 9;
AsList( a::b )[1]

{3, 4, 5, 6, 7, 8, 9}

Several things are going on here. a::b makes a one-row matrix: [3 4 5 6 7 8 9]

AsList( ... ) converts the matrix to a list of lists (because it is really two-dimensional, with one of the dimensions being just 1).

[1] extracts the inner list.

Perhaps you just want the matrix rather than the list. If you are interested in fast code, use the built-in matrix functions when you can.

This is not a magical iterator from some other programming language; it will create the full matrix and list in memory, so you probably don't want to make a list of a billion items.
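As a minimal sketch of the points above, applied to the 1-to-100 case from the question: the variable names m, m2, and lst are illustrative only, and Index() is used here on the assumption that it is the function form of the :: operator.

```jsl
Names Default To Here( 1 );

m = 1 :: 100;           // one-row matrix holding 1..100
m2 = Index( 1, 100 );   // assumed equivalent: function form of the :: operator
lst = As List( m )[1];  // As List gives a list of row lists; [1] unwraps the single row

// Staying with the matrix lets the built-in matrix functions do the work.
Show( N Cols( m ), N Items( lst ), Sum( m ) );
```

If the matrix alone is enough for the rest of the script, the As List step can be dropped entirely.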

Craige
msharp

Thanks, this is helpful. I was hoping this would speed up my code, but the for loop seems to run at a similar speed, which surprises me. Any suggestions?

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

t1 = tick seconds();
numberarray = Function({list},
  temparray = Associative Array(list);
  tempnum = nitems(temparray << Get Keys);
  tempitems = temparray << Get Keys;
  templist = {};
  For(l=1, l<=tempnum, l++, insertinto(templist, l));
  temparray = Associative Array(tempitems, templist);
  //return(temparray)
  );

for(i=1, i<=100,i++,
  array = numberarray( Column(dt, 1) );
  array2 = numberarray( Column(dt, 2) );
  array3 = numberarray( Column(dt, 3) );
  array4 = numberarray( Column(dt, 4) );
  array5 = numberarray( Column(dt, 5) ););

t2 = tick seconds();
Print( Concat( Char( t2 - t1 ), " seconds." ) );


t1 = tick seconds();
numberarray = Function({list},
  temparray = Associative Array(list);
  tempnum = nitems(temparray << Get Keys);
  tempitems = temparray << Get Keys;
  temparray = Associative Array(tempitems,  1::tempnum);
  );

for(i=1, i<=100,i++,
  array = numberarray( Column(dt, 1) );
  array2 = numberarray( Column(dt, 2) );
  array3 = numberarray( Column(dt, 3) );
  array4 = numberarray( Column(dt, 4) );
  array5 = numberarray( Column(dt, 5) ););

t2 = tick seconds();
Print( Concat( Char( t2 - t1 ), " seconds." ) );

Craige_Hales (Staff, joined Mar 21, 2013)

Nice performance test.

JSL has a profiler that can help answer performance questions:

[Screenshot: 9885_profiler.PNG]

I ran it to get this:

[Screenshot: 9886_profiler2.PNG]

which tells me that setting the values into the associative array is a bottleneck in this code (line 10 vs. line 11). Line 23 probably includes opening the log the first time, so ignore that; line 42 is more representative.

Big Class only has 40 rows. If you had many more rows (10K?) you might see other bottlenecks. Don't call temparray << Get Keys twice.
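One way to follow the "call Get Keys once" advice might look like the sketch below. It is a hypothetical rewrite of numberarray, not a tested replacement: the local names aa and keys are made up for illustration, and it relies on Associative Array() accepting a 1::n matrix as the values argument, as the second timing script in this thread does.

```jsl
Names Default To Here( 1 );

// Hypothetical variant of numberarray that fetches the keys only once.
numberarray = Function( {list}, {aa, keys},
	aa = Associative Array( list );                   // unique values become the keys
	keys = aa << Get Keys;                            // single Get Keys call, reused below
	Associative Array( keys, 1 :: N Items( keys ) );  // last expression is the return value
);

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
array = numberarray( Column( dt, 1 ) );
Show( array );
```

Declaring aa and keys as locals also keeps the function from overwriting same-named variables elsewhere in the script.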

It looks like you are trying to create a lookup table of small consecutive integers for each unique value in the column.  There may be a clever way to do this, but I don't see it right now.  Maybe someone else will...maybe some database join operation?

Craige
msharp

This profiler tool will come in handy.  Thanks for sharing it.

It looks like you are right: the reason it doesn't appear to run faster is that there are larger bottlenecks in the code. It does seem that the "::" matrix code runs faster than the for loop, which is intuitive. I get that from comparing the percentages for lines 10 and 11 with the percentage for line 31.
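To check that comparison outside the profiler, the two list-building approaches could be timed in isolation, roughly as sketched here. The size n and the variable names are arbitrary choices for illustration; HP Time() reports microseconds, and the actual numbers will vary by machine and JMP version.

```jsl
Names Default To Here( 1 );

n = 10000;  // arbitrary test size, chosen only for illustration

// Approach 1: build the list one item at a time in a for loop.
t0 = HP Time();
loopList = {};
For( i = 1, i <= n, i++, Insert Into( loopList, i ) );
loopMicros = HP Time() - t0;

// Approach 2: build a matrix with :: and convert it to a list.
t0 = HP Time();
matrixList = As List( 1 :: n )[1];
matrixMicros = HP Time() - t0;

Show( loopMicros, matrixMicros );
```

Timing only the list construction keeps the associative array work, which the profiler flagged as the larger cost, out of the measurement.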

My current code won't ever compare large sets of data, but I was still curious because I foresee building similar scripts for larger amounts of data, and getting rid of for loops is generally a surefire way to make code faster.