<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic "Weight" in "Fit Y by X" (and perhaps generally in JMP platforms) in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473368#M71796</link>
    <description>&lt;P&gt;I have data from several (22?) samples, which vary considerably in sample size. I understand that parameters are estimated better with larger samples, but that is not the issue I want to address here. It is that I would like to both graph distributions, and estimate parameters, as if the sample sizes didn't differ across the groups/samples. This means that large samples will need to be de-weighted and small samples up-weighted.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried to do this in "Fit Y by X," as follows: I computed a new variable, WEIGHT, as the sample size of the Group divided by the total sample size. So let's say that I have four groups:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: N1=100&lt;/P&gt;&lt;P&gt;Group 2: N1=500&lt;/P&gt;&lt;P&gt;Group 3: N1=200&lt;/P&gt;&lt;P&gt;Group 4: N1=400&lt;/P&gt;&lt;P&gt;Also Groups 1 and 2 are Type 1 and Groups 3 and 4 are Type 2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;WEIGHT will be:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: 100/1200&lt;/P&gt;&lt;P&gt;Group 2: 500/1200&lt;/P&gt;&lt;P&gt;Group 3: 200/1200&lt;/P&gt;&lt;P&gt;Group 4: 400/1200&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I do "Fit Y by X" where Y is the DV and X is "Type" and weight by WEIGHT, then use "Compare Densities" to get a plot of overlapping densities, it looks right. However, if I continue and get the standard deviations for the DV, these are much larger than the standard deviations for any of the groups. This makes me think that I am not understanding what I'm doing.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Help?&lt;/P&gt;</description>
    <pubDate>Sun, 11 Jun 2023 11:23:09 GMT</pubDate>
    <dc:creator>profjmb</dc:creator>
    <dc:date>2023-06-11T11:23:09Z</dc:date>
    <item>
      <title>"Weight" in "Fit Y by X" (and perhaps generally in JMP platforms)</title>
      <link>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473368#M71796</link>
      <description>&lt;P&gt;I have data from several (22?) samples, which vary considerably in sample size. I understand that parameters are estimated better with larger samples, but that is not the issue I want to address here. It is that I would like to both graph distributions, and estimate parameters, as if the sample sizes didn't differ across the groups/samples. This means that large samples will need to be de-weighted and small samples up-weighted.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried to do this in "Fit Y by X," as follows: I computed a new variable, WEIGHT, as the sample size of the Group divided by the total sample size. So let's say that I have four groups:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: N1=100&lt;/P&gt;&lt;P&gt;Group 2: N1=500&lt;/P&gt;&lt;P&gt;Group 3: N1=200&lt;/P&gt;&lt;P&gt;Group 4: N1=400&lt;/P&gt;&lt;P&gt;Also Groups 1 and 2 are Type 1 and Groups 3 and 4 are Type 2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;WEIGHT will be:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: 100/1200&lt;/P&gt;&lt;P&gt;Group 2: 500/1200&lt;/P&gt;&lt;P&gt;Group 3: 200/1200&lt;/P&gt;&lt;P&gt;Group 4: 400/1200&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I do "Fit Y by X" where Y is the DV and X is "Type" and weight by WEIGHT, then use "Compare Densities" to get a plot of overlapping densities, it looks right. However, if I continue and get the standard deviations for the DV, these are much larger than the standard deviations for any of the groups. This makes me think that I am not understanding what I'm doing.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Help?&lt;/P&gt;</description>
      <pubDate>Sun, 11 Jun 2023 11:23:09 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473368#M71796</guid>
      <dc:creator>profjmb</dc:creator>
      <dc:date>2023-06-11T11:23:09Z</dc:date>
    </item>
    <item>
      <title>Re: "Weight" in "Fit Y by X" (and perhaps generally in JMP platforms)</title>
      <link>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473387#M71797</link>
      <description>&lt;P&gt;I made a mistake in my original post, which I would delete if I knew how. The correct version is below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have data from several (22?) samples, which vary considerably in sample size. I understand that parameters are estimated better with larger samples, but that is not the issue I want to address here. It is that I would like to both graph distributions, and estimate parameters, as if the sample sizes didn't differ across the groups/samples. This means that large samples will need to be de-weighted and small samples up-weighted.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried to do this in "Fit Y by X," as follows: I computed a new variable, WEIGHT, as the total sample size divided by the sample size of the Group. So let's say that I have four groups:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: N1=100&lt;/P&gt;&lt;P&gt;Group 2: N1=500&lt;/P&gt;&lt;P&gt;Group 3: N1=200&lt;/P&gt;&lt;P&gt;Group 4: N1=400&lt;/P&gt;&lt;P&gt;Also Groups 1 and 2 are Type 1 and Groups 3 and 4 are Type 2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;WEIGHT will be:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group 1: 1200/100&lt;/P&gt;&lt;P&gt;Group 2: 1200/500&lt;/P&gt;&lt;P&gt;Group 3: 1200/200&lt;/P&gt;&lt;P&gt;Group 4: 1200/400&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I do "Fit Y by X" where Y is the DV and X is "Type" and weight by WEIGHT, then use "Compare Densities" to get a plot of overlapping densities, it looks right. However, if I continue and get the standard deviations for the DV, these are much larger than the standard deviations for any of the groups. This makes me think that I am not understanding what I'm doing.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Help?&lt;/P&gt;</description>
      <pubDate>Sat, 26 Mar 2022 18:18:52 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473387#M71797</guid>
      <dc:creator>profjmb</dc:creator>
      <dc:date>2022-03-26T18:18:52Z</dc:date>
    </item>
    <item>
      <title>Re: "Weight" in "Fit Y by X" (and perhaps generally in JMP platforms)</title>
      <link>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473672#M71834</link>
      <description>&lt;P&gt;I think this post may help you:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://community.jmp.com/t5/Discussions/Weighted-Standard-Deviation/m-p/376169" target="_blank"&gt;Solved: Weighted Standard Deviation - JMP User Community&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;The script below may also help to show what can happen.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In your approach the mean comes out correctly whichever role you give the column (Weight or Freq).&lt;/P&gt;
&lt;P&gt;I defined the weight differently from you: I wanted the weights to sum to 1200 (yours sum to 4*1200).&lt;/P&gt;
&lt;P&gt;That should not matter, and with the column in the Freq role it does not, but with the column in the Weight role it does. See the script.&lt;/P&gt;
&lt;P&gt;Unfortunately, I cannot explain exactly why the small group gets a very large standard deviation relative to the overall average. It's perhaps due to the squaring and the square root ...&lt;/P&gt;
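&lt;P&gt;One possible explanation, stated as an assumption rather than something verified in this thread: if the Weight role multiplies each squared deviation by its weight but still divides by the number of rows minus one, while the Freq role divides by the sum of the frequencies minus one, then rows sharing a constant weight w get a standard deviation inflated by roughly Sqrt( w ). The sketch below uses made-up data (100 rows weighted 12 and 500 rows weighted 2.4, like the two Type 1 groups) and hand-written formulas, not a JMP platform, to show the size of that effect.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here( 1 );
// Hypothetical toy data: both blocks share mean 0 and std dev 1
y = J( 100, 1, Random Normal( 0, 1 ) ) |/ J( 500, 1, Random Normal( 0, 1 ) );
// Weights of total N / group N, as in the corrected post
w = J( 100, 1, 12 ) |/ J( 500, 1, 2.4 );
n = N Rows( y );
ybar = Sum( w :* y ) / Sum( w ); // weighted mean
ss = Sum( w :* (y - ybar) :* (y - ybar) ); // weighted sum of squared deviations
sd_weight = Sqrt( ss / (n - 1) ); // ASSUMED Weight-role divisor: rows minus one
sd_freq = Sqrt( ss / (Sum( w ) - 1) ); // ASSUMED Freq-role divisor: sum of weights minus one
// The ratio equals Sqrt( (Sum( w ) - 1) / (n - 1) ), about 2 for these weights
Show( sd_weight, sd_freq, sd_weight / sd_freq );&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If that assumption holds, rescaling the weights so that they sum to the number of rows being analyzed would make the two results agree.&lt;/P&gt;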
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Personally, I would not use your approach, because it is not clear what is happening. I would compute the summary for each group, then average over the groups with each group weighted 1, and then add that result to your graph.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here( 1 );
// about the role of weight and frequency for calculation of mean and stddev
//
// web("https://www.jmp.com/support/help/en/16.1/?os=win&amp;amp;source=application&amp;amp;utm_source=helpmenu&amp;amp;utm_medium=application#page/jmp/summary-statistics.shtml");
//
nelem_lst = {100, 500, 200, 400};
table_lst = {};
For Each( {value, index}, nelem_lst,
	Eval(
		Eval Expr(
			table_lst[index] = New Table( "Table " || Char( index ),
				add rows( nelem_lst[index] ),
				New Column( "Group", "Character", set each value( "Group " || Char( index ) ) ),
				New Column( "Type", "Character", set each value( If( index &amp;lt;= 2, "Type 1", "Type 2" ) ) ),
				New Column( "DV", "Continuous", formula( Random Normal( Expr( Mod( index, 2 ) ), Expr( Mod( index, 2 ) + 1 ) ) ) )
			)
		)
	);
	Wait( 0.1 );
	table_lst[index]:DV &amp;lt;&amp;lt; delete formula;
);
Wait( 0 );
dt = table_lst[1] &amp;lt;&amp;lt; concatenate( Table Name( "All" ), table_lst[2 :: 4] );
For Each( {value}, table_lst, Close( value, "NoSave" ) );

Summarize( dt, group_lst = by( :group ) );
ngroups = N Items( group_lst );

dt &amp;lt;&amp;lt; New Column( "ColMean[Group]", formula( Col Mean( :DV, :group ) ) );
dt &amp;lt;&amp;lt; New Column( "ColStd[Group]", formula( Col Std Dev( :DV, :group ) ) );
Eval( Eval Expr( dt &amp;lt;&amp;lt; New Column( "weight[Group]", formula( Col Number( :DV ) / Expr( ngroups ) / Col Number( :DV, :group ) ) ) ) );

nw = New Window( "oneway comparison",
	H List Box(
		Panel Box( "w/o weight",
			dt &amp;lt;&amp;lt; Oneway( Y( :DV ), X( :Group ),  Means and Std Dev( 1 ), Mean Error Bars( 1 ), Std Dev Lines( 1 ) );
		),
		Panel Box( "weight in role frequency",
			dt &amp;lt;&amp;lt; Oneway(
				Y( :DV ),
				X( :Group ),
				Freq( :"weight[Group]"n ),
				Means and Std Dev( 1 ),
				Mean Error Bars( 1 ),
				Std Dev Lines( 1 )
			);

		),
		Panel Box( "weight in role weight",
			dt &amp;lt;&amp;lt; Oneway(
				Y( :DV ),
				X( :Group ),
				Weight( :"weight[Group]"n ),
				Means and Std Dev( 1 ),
				Mean Error Bars( 1 ),
				Std Dev Lines( 1 )
			);

		)
	)
);

nw = New Window( "Tabulate comparison",
	H List Box(
		Panel Box( "w/o weight",
			dt &amp;lt;&amp;lt; Tabulate(
				Change Item Label( Grouping Columns( :Type( "All" ), "All" ) ),
				Show Control Panel( 0 ),
				Add Table(
					Column Table( Statistics( N ) ),
					Column Table( Analysis Columns( :"weight[Group]"n ), Statistics( Sum ) ),
					Column Table( Analysis Columns( :DV ), Statistics( Mean ) ),
					Column Table( Statistics( Std Dev ), Analysis Columns( :DV ) ),
					Row Table( Grouping Columns( :Type, :Group ), Add Aggregate Statistics( :Type, :Group ) )
				)
			)
		),
		Panel Box( "weight in role frequency",
			dt &amp;lt;&amp;lt; Tabulate(
				Change Item Label( Grouping Columns( :Type( "All" ), "All" ) ),
				Freq( :"weight[Group]"n ),
				Show Control Panel( 0 ),
				Add Table(
					Column Table( Statistics( N ) ),
					Column Table( Analysis Columns( :"weight[Group]"n ), Statistics( Sum ) ),
					Column Table( Analysis Columns( :DV ), Statistics( Mean ) ),
					Column Table( Statistics( Std Dev ), Analysis Columns( :DV ) ),
					Row Table( Grouping Columns( :Type, :Group ), Add Aggregate Statistics( :Type, :Group ) )
				)
			)
		),
		Panel Box( "weight in role weight",
			dt &amp;lt;&amp;lt; Tabulate(
				Change Item Label( Grouping Columns( :Type( "All" ), "All" ) ),
				weight( :"weight[Group]"n ),
				Show Control Panel( 0 ),
				Add Table(
					Column Table( Statistics( N ) ),
					Column Table( Analysis Columns( :"weight[Group]"n ), Statistics( Sum ) ),
					Column Table( Analysis Columns( :DV ), Statistics( Mean ) ),
					Column Table( Statistics( Std Dev ), Analysis Columns( :DV ) ),
					Row Table( Grouping Columns( :Type, :Group ), Add Aggregate Statistics( :Type, :Group ) )
				)
			)
		)
	)
);

dt &amp;lt;&amp;lt; Summary( Group( :Group, :Type, :"ColMean[Group]"n, :"ColStd[Group]"n, :"weight[Group]"n ), Freq( "None" ), Weight( "None" ) );&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 28 Mar 2022 12:27:52 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473672#M71834</guid>
      <dc:creator>Georg</dc:creator>
      <dc:date>2022-03-28T12:27:52Z</dc:date>
    </item>
    <item>
      <title>Re: "Weight" in "Fit Y by X" (and perhaps generally in JMP platforms)</title>
      <link>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473675#M71835</link>
      <description>&lt;P&gt;I don't think parameter estimation requires equal sample sizes or normalization.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2022 12:37:26 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/quot-Weight-quot-in-quot-Fit-Y-by-X-quot-and-perhaps-generally/m-p/473675#M71835</guid>
      <dc:creator>Mark_Bailey</dc:creator>
      <dc:date>2022-03-28T12:37:26Z</dc:date>
    </item>
  </channel>
</rss>

