<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Column formula to summarize column by multiple columns in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45924#M26178</link>
    <description>&lt;P&gt;Great, I'm glad that helps! I should have mentioned that in recent versions of JMP (I think since JMP 11) you can have multiple column parameters directly in ColMean(). So, rather than concatenating you could have:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Col Mean( :count, :softness, :temperature )&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm so used to&amp;nbsp;&lt;SPAN&gt;concatenating that I still do that.&amp;nbsp;&lt;/SPAN&gt;Old habits die hard :)&lt;/img&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Additionally, in JMP 12 and on, with Instant Formulas in the data table you can select those columns in the table, right-click one of their headers, select New Formula Column &amp;gt; Group By.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="groupby.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/7994iC927AD2AB10DB51C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="groupby.png" alt="groupby.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then, select the count column, right-click the header, select New Formula Column &amp;gt; Aggregate&amp;nbsp;&amp;gt; Mean&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="aggregate.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/7995iDFF6D705C1542750/image-size/medium?v=v2&amp;amp;px=400" role="button" title="aggregate.png" alt="aggregate.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This will design the formula for you automatically; very friendly for your new users.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 13 Oct 2017 17:01:45 GMT</pubDate>
    <dc:creator>jules</dc:creator>
    <dc:date>2017-10-13T17:01:45Z</dc:date>
    <item>
      <title>Summary statistics with multiple binning columns.</title>
      <link>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45918#M26174</link>
      <description>&lt;P&gt;I often summarize a column by some combination of other categorical columns using a formula column.&amp;nbsp;&amp;nbsp;For example in the Detergent sample data I might want to know the average count for each combination of softness and temperature. Below are three methods to do this, which one I use depends on the size of the data table and who my audience is.&amp;nbsp; Is there an easier way?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I shy away from an add-in containing a custom 'Col Mean by Multiple Columns()' function due to portability issues.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Tabluate&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Create summary statistics in a new table using tabulate and then join the results to the original data table.&lt;/P&gt;&lt;P&gt;Pro: Easy to make very complex summary statistics.&lt;/P&gt;&lt;P&gt;Con: Not live, need to refresh the data if a referenced column changes during an analysis, no way for users to follow the source of the data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Binning Formula&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Create&amp;nbsp;a binning formula and then use the second argument of the Col Mean() function to refrence that bin.&lt;/P&gt;&lt;P&gt;Pro: Easy to set up and relatively easy for users to follow.&lt;/P&gt;&lt;P&gt;Con:&amp;nbsp;Creates extra columns that clutter the data table and have no use except for this formula, especially when creating many summary columns.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = Open( "$sample_data/Detergent.jmp" ); 

//binning formula
dt &amp;lt;&amp;lt; New Column(
	"Bin by Temp and Softness",
	Character,
	"Nominal",
	Formula( :softness || "-" || :temperature )
);

//summary statistic
dt &amp;lt;&amp;lt; New Column(
	"Mean Count by Temp and Softness Binned",
	Numeric,
	"Continuous",
	Format( "Best", 12 ),
	Formula( Col Mean( :count, :Bin by Temp and Softness) )
);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Identify bins and calculate summary statistic using JSL inside a formula&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Identify similar rows in the data table then calculate the summary statistic on that data.&amp;nbsp; I suspect JMP does not cache results using this method so the whole table is re-evaluted for every row. In any case it is slow for large data tables.&lt;/P&gt;&lt;P&gt;Pro: Tons of flexibility&lt;/P&gt;&lt;P&gt;Cons: Slow calculations, very hard for users to follow.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;//summary statistic by bin
dt &amp;lt;&amp;lt; New Column(
	"Mean Count by Temp and Softness Direct",
	Numeric,
	"Continuous",
	Format( "Best", 12 ),
	Formula(
		matches = Function( {a, b},
			out = {};
			For( i = 1, i &amp;lt;= N Items( a ), i++,
				found = "no";
				For( j = 1, j &amp;lt;= N Items( b ), j++,
					If( a[i] == b[j],
						found = "yes"
					)
				);
				If( found == "yes",
					out = Insert( out, a[i] )
				);
			);
			out;
		);
		soft = Loc( :softness &amp;lt;&amp;lt; get values(), :softness );
		temp = Loc( :temperature &amp;lt;&amp;lt; get values(), :temperature );
		Mean( (:count &amp;lt;&amp;lt; get values())[matches( soft, temp )] );
	)
);&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 13 Oct 2017 16:46:15 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45918#M26174</guid>
      <dc:creator>ih</dc:creator>
      <dc:date>2017-10-13T16:46:15Z</dc:date>
    </item>
    <item>
      <title>Re: Column formula to summarize column by multiple columns</title>
      <link>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45920#M26176</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/6657"&gt;@ih&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I find myself doing just this thing pretty often. One way that works for me is to use a column formula like Col Mean(:column,&amp;nbsp;byvariable), and for the second parameter I use a concatenation of the two categorical columns directly (rather than creating any additional columns in the table). For the detergent example you mentioned this would be:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Col Mean( :count, :softness || :temperature )&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does this do what you need or is there something else you're looking for?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/2026"&gt;@jules&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 16:47:59 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45920#M26176</guid>
      <dc:creator>jules</dc:creator>
      <dc:date>2017-10-13T16:47:59Z</dc:date>
    </item>
    <item>
      <title>Re: Column formula to summarize column by multiple columns</title>
      <link>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45921#M26177</link>
      <description>&lt;P&gt;This is perfect, and so obvious once you point it out.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 16:55:42 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45921#M26177</guid>
      <dc:creator>ih</dc:creator>
      <dc:date>2017-10-13T16:55:42Z</dc:date>
    </item>
    <item>
      <title>Re: Column formula to summarize column by multiple columns</title>
      <link>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45924#M26178</link>
      <description>&lt;P&gt;Great, I'm glad that helps! I should have mentioned that in recent versions of JMP (I think since JMP 11) you can have multiple column parameters directly in ColMean(). So, rather than concatenating you could have:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Col Mean( :count, :softness, :temperature )&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm so used to&amp;nbsp;&lt;SPAN&gt;concatenating that I still do that.&amp;nbsp;&lt;/SPAN&gt;Old habits die hard :)&lt;/img&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Additionally, in JMP 12 and on, with Instant Formulas in the data table you can select those columns in the table, right-click one of their headers, select New Formula Column &amp;gt; Group By.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="groupby.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/7994iC927AD2AB10DB51C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="groupby.png" alt="groupby.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then, select the count column, right-click the header, select New Formula Column &amp;gt; Aggregate&amp;nbsp;&amp;gt; Mean&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="aggregate.png" style="width: 400px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/7995iDFF6D705C1542750/image-size/medium?v=v2&amp;amp;px=400" role="button" title="aggregate.png" alt="aggregate.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This will design the formula for you automatically; very friendly for your new users.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Oct 2017 17:01:45 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Summary-statistics-with-multiple-binning-columns/m-p/45924#M26178</guid>
      <dc:creator>jules</dc:creator>
      <dc:date>2017-10-13T17:01:45Z</dc:date>
    </item>
  </channel>
</rss>

