<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Matrix vs Data Table in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57869#M32181</link>
    <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/4536"&gt;@David_Burnham&lt;/a&gt;,&amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; While I agree, I am generating multiple tables iteratively, so the reference gets overwritten, and hence making the data table private might result in loss of the table. But in general, I agree and follow the approach of making the data table private wherever possible.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 22 May 2018 13:07:44 GMT</pubDate>
    <dc:creator>uday_guntupalli</dc:creator>
    <dc:date>2018-05-22T13:07:44Z</dc:date>
    <item>
      <title>Matrix vs Data Table</title>
      <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57810#M32168</link>
      <description>&lt;P&gt;All,&amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; Wondering if someone has run similar trials to time matrices vs. data tables.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Clear Log(); Clear Globals(); Close All(DataTables,"No Save"); 

// Inputs 
a = 7; 

// Approach 1 
TimerStart_1 = Tick Seconds(); 
dt1 = New Table("Approach-1","Invisible"); 
dt1 &amp;lt;&amp;lt; New Column("Random-1",Numeric,Continuous,&amp;lt;&amp;lt; Set Values(Random Index(10^8,10^a)))
	&amp;lt;&amp;lt; New Column("Random-2",Numeric,Continuous,&amp;lt;&amp;lt; Set Values(Random Index(10^8,10^a)))
	&amp;lt;&amp;lt; New Column("Add",Numeric,Continuous,Formula(:Name("Random-1")+:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Subtract",Numeric,Continuous,Formula(:Name("Random-1")-:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Multiply",Numeric,Continuous,Formula(:Name("Random-1")*:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Mod",Numeric,Continuous,Formula(Mod(:Name("Random-1"),:Name("Random-2"))));
TimerEnd_1 = Tick Seconds(); 
Show(TimerEnd_1 - TimerStart_1); 
Close All(DataTables,"No Save"); 

// Approach 2 
TimerStart_2 = Tick Seconds(); 
Mat_1 = Random Index(10^8,10^a); 
Mat_2 = Random Index(10^8,10^a); 
Add = Mat_1 + Mat_2 ; 
Difference = Mat_1 - Mat_2 ; 
Prod = E Mult(Mat_1,Mat_2); 
Mod = Mod(Mat_1,Mat_2);
TimerEnd_2 = Tick Seconds(); 
Show(TimerEnd_2 - TimerStart_2); 

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 655px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/10795iBAC58BE38A878DB1/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 557px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/10796iFFA28A5F636E217F/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Making the data table private might shave off some more time, but in general,&amp;nbsp;is this fair, or does anybody favor one data container over another solely for speed?&lt;/P&gt;</description>
      <pubDate>Mon, 21 May 2018 21:59:22 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57810#M32168</guid>
      <dc:creator>uday_guntupalli</dc:creator>
      <dc:date>2018-05-21T21:59:22Z</dc:date>
    </item>
    <item>
      <title>Re: Matrix vs Data Table</title>
      <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57820#M32175</link>
      <description>&lt;P&gt;You have 2 questions: is this fair? and do you favor one over another?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regarding fair, memory management is all up to the language provider.&amp;nbsp; Keep in mind tables have more methods than matrices.&amp;nbsp; I would have phrased your first question as, "Is this change in performance between 1 million and 10 million expected?" Also, I am just guessing that after 1 million rows, JMP might be doing some storage compression, in other words saving memory, the trade-off being time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I favor depends upon the task. If I am doing a simulation, where a method is applied numerous times and I am getting summary performance, then I'll use a matrix.&amp;nbsp; Prior to JMP 13, which has better subtable referencing, if I had large tables, I would work with matrices and then set the values to the table specifically for performance, and I would &lt;U&gt;&lt;EM&gt;&lt;STRONG&gt;never&lt;/STRONG&gt;&lt;/EM&gt;&lt;/U&gt; use formulas for large tables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have attached a script that appends to your script a third method that uses the JMP 13 subtable referencing syntax. The snippet below is the syntax for the Add column. [This site would not allow me to post the attachment, so go to the end to see the full script.]&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt; dt1[0, "Add"]      = dt1[0,"Random-1"]+ dt1[0,"Random-2"] ;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Note: since you are interested in performance, try this:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;a=5;
tb1 = TickSeconds();
Mat_1 = Random Index(10^8,10^a); 
te1 = TickSeconds();


tb2 = TickSeconds();
Mat_2 = Round (J(10^a, 1, Random Uniform(10^8) )*10^8,0); 
te2 = TickSeconds();

show(te1-tb1, te2-tb2);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;For a&amp;lt;7, the second method is far superior to the first.&amp;nbsp;&lt;/P&gt;&lt;P&gt;a = 6:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;te1 - tb1 = 2.43333333334886;&amp;nbsp; &amp;nbsp;te2 - tb2 = 0.25;&amp;nbsp; &amp;nbsp; //second method superior&amp;nbsp;&lt;/P&gt;&lt;P&gt;a = 7:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;te1 - tb1 = 2.48333333339542;&amp;nbsp; &amp;nbsp;te2 - tb2 = 2.56666666665114;&amp;nbsp; //both methods the same&lt;/P&gt;&lt;P&gt;a = 8:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;te1 - tb1 = 3.03333333344199;&amp;nbsp; &amp;nbsp;te2 - tb2 = 25.8166666666511;&amp;nbsp; //second method much worse&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And, like you, when there is a big difference I send a note to JMP as an FYI.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For people running the script: beware that it closes all tables, etc.&amp;nbsp; Run it in a new session of JMP.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Clear Log(); Clear Globals(); Close All(DataTables,"No Save"); 

// Inputs 
a = 7; 

// Approach 1 
TimerStart_1 = Tick Seconds(); 
dt1 = New Table("Approach-1","Invisible"); 
dt1 &amp;lt;&amp;lt; New Column("Random-1",Numeric,Continuous,&amp;lt;&amp;lt; Set Values(Random Index(10^8,10^a)))
	&amp;lt;&amp;lt; New Column("Random-2",Numeric,Continuous,&amp;lt;&amp;lt; Set Values(Random Index(10^8,10^a)))
	&amp;lt;&amp;lt; New Column("Add",Numeric,Continuous,Formula(:Name("Random-1")+:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Subtract",Numeric,Continuous,Formula(:Name("Random-1")-:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Multiply",Numeric,Continuous,Formula(:Name("Random-1")*:Name("Random-2")))
	&amp;lt;&amp;lt; New Column("Mod",Numeric,Continuous,Formula(Mod(:Name("Random-1"),:Name("Random-2"))));
TimerEnd_1 = Tick Seconds(); 
Show(TimerEnd_1 - TimerStart_1); 
Close All(DataTables,"No Save"); 

// Approach 2 
TimerStart_2 = Tick Seconds(); 
Mat_1 = Random Index(10^8,10^a); 
Mat_2 = Random Index(10^8,10^a); 
Add = Mat_1 + Mat_2 ; 
Difference = Mat_1 - Mat_2 ; 
Prod = E Mult(Mat_1,Mat_2); 
Mod = Mod(Mat_1,Mat_2);
TimerEnd_2 = Tick Seconds(); 
Show(TimerEnd_2 - TimerStart_2); 


// Approach 3 
TimerStart_3 = Tick Seconds(); 
dt1 = New Table("Approach-3","Invisible", add rows(10^a),
 New Column("Random-1",Numeric,Continuous ),
 New Column("Random-2",Numeric,Continuous ),
 New Column("Add",Numeric,Continuous ),
 New Column("Subtract",Numeric,Continuous ),
 New Column("Multiply",Numeric,Continuous ),
 New Column("Mod",Numeric,Continuous) 
 );
// Column(dt1, "Random-1") &amp;lt;&amp;lt; Set Values( Random Index(10^8,10^a) );
// Column(dt1, "Random-2") &amp;lt;&amp;lt; Set Values( Random Index(10^8,10^a) );
// Column(dt1, "Add")      &amp;lt;&amp;lt; Set Values( dt1[0,"Random-1"]+ dt1[0,"Random-2"] );
// Column(dt1, "Subtract") &amp;lt;&amp;lt; Set Values( dt1[0,"Random-1"]- dt1[0,"Random-2"] );
// Column(dt1, "Multiply") &amp;lt;&amp;lt; Set Values( dt1[0,"Random-1"]:* dt1[0,"Random-2"] );
// Column(dt1, "Mod")      &amp;lt;&amp;lt; Set Values( Mod(dt1[0,"Random-1"], dt1[0,"Random-2"]) );
 dt1[0,"Random-1"]  = Random Index(10^8,10^a) ;
 dt1[0,"Random-2"]  = Random Index(10^8,10^a) ;
 dt1[0, "Add"]      = dt1[0,"Random-1"]+ dt1[0,"Random-2"] ;
 dt1[0, "Subtract"] = dt1[0,"Random-1"]- dt1[0,"Random-2"] ;
 dt1[0, "Multiply"] = dt1[0,"Random-1"]:* dt1[0,"Random-2"] ;
 dt1[0, "Mod"]      = Mod(dt1[0,"Random-1"], dt1[0,"Random-2"] );

TimerEnd_3 = Tick Seconds(); 
Show(TimerEnd_3 - TimerStart_3); 
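
// Approach 4 (a hedged sketch, not part of the posted script and not timed here):
// Approach 3 repeated on a private table, as suggested elsewhere in this thread.
// The names dt4, TimerStart_4 and TimerEnd_4 are illustrative assumptions.
// Assumption: a private table is reachable only through its reference, so dt4 is
// kept until the explicit Close() below rather than being overwritten.
TimerStart_4 = Tick Seconds(); 
dt4 = New Table("Approach-4", Private, Add Rows(10^a),
 New Column("Random-1",Numeric,Continuous ),
 New Column("Random-2",Numeric,Continuous ),
 New Column("Add",Numeric,Continuous ),
 New Column("Subtract",Numeric,Continuous ),
 New Column("Multiply",Numeric,Continuous ),
 New Column("Mod",Numeric,Continuous) 
 );
 dt4[0,"Random-1"]  = Random Index(10^8,10^a) ;
 dt4[0,"Random-2"]  = Random Index(10^8,10^a) ;
 dt4[0, "Add"]      = dt4[0,"Random-1"]+ dt4[0,"Random-2"] ;
 dt4[0, "Subtract"] = dt4[0,"Random-1"]- dt4[0,"Random-2"] ;
 dt4[0, "Multiply"] = dt4[0,"Random-1"]:* dt4[0,"Random-2"] ;
 dt4[0, "Mod"]      = Mod(dt4[0,"Random-1"], dt4[0,"Random-2"] );
TimerEnd_4 = Tick Seconds(); 
Show(TimerEnd_4 - TimerStart_4); 
Close(dt4, NoSave); // release the private table explicitly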
Close All(DataTables,"No Save"); &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 22 May 2018 01:10:11 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57820#M32175</guid>
      <dc:creator>gzmorgan0</dc:creator>
      <dc:date>2018-05-22T01:10:11Z</dc:date>
    </item>
    <item>
      <title>Re: Matrix vs Data Table</title>
      <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57868#M32180</link>
      <description>&lt;P&gt;"&lt;SPAN&gt;Making the data table private, might shave some more time"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;- actually, you might find that making the table private has a substantial impact on performance&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 May 2018 12:53:17 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57868#M32180</guid>
      <dc:creator>David_Burnham</dc:creator>
      <dc:date>2018-05-22T12:53:17Z</dc:date>
    </item>
    <item>
      <title>Re: Matrix vs Data Table</title>
      <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57869#M32181</link>
      <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/4536"&gt;@David_Burnham&lt;/a&gt;,&amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; While I agree, I am generating multiple tables iteratively, so the reference gets overwritten, and hence making the data table private might result in loss of the table. But in general, I agree and follow the approach of making the data table private wherever possible.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 May 2018 13:07:44 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57869#M32181</guid>
      <dc:creator>uday_guntupalli</dc:creator>
      <dc:date>2018-05-22T13:07:44Z</dc:date>
    </item>
    <item>
      <title>Re: Matrix vs Data Table</title>
      <link>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57877#M32185</link>
      <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/70"&gt;@gzmorgan0&lt;/a&gt;,&amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; Thank you for your detailed response. Your interpretation is mostly accurate, and I wish I had provided more clarity to begin with. I agree with and share your preferences between data tables and matrices; I would use data tables if I needed more built-in methods than matrices offer. However, the question I wanted to pick the community's brain on was the speed of handling large data, and whether and why the behavior changes as the data size increases. I would like to believe it is because of the storage compression that you are referring to.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; One interesting aspect is the amount of time that is saved via data table subscripting. At a = 7, it was shaving a good 7 seconds compared with the traditional column formula approach, with matrices still leading in terms of performance.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 May 2018 16:06:29 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Matrix-vs-Data-Table/m-p/57877#M32185</guid>
      <dc:creator>uday_guntupalli</dc:creator>
      <dc:date>2018-05-22T16:06:29Z</dc:date>
    </item>
  </channel>
</rss>

