Jakedahsnake
Level II

How to plot Count of values of Nominal data vs. Continuous data

Hi there, I have a question on how to model some data I am looking at (JMP 11).

End goal: Display a Chart that shows Count of values of a Nominal Data Column (X) Vs. Continuous Data Column (Y).

Further in the weeds:

I have a Nominal Data Column: DieNumber.

And a Continuous Data Column: ResDelta.

I want the X axis to show the count of each DieNumber (e.g. 1, 2, 3) rather than each DieNumber value. In Graph Builder, however, when ResDelta (Continuous) is on the Y axis, switching the summary statistic to N summarizes Y instead of counting the occurrences of each value of X.

The only workaround I've found is the Overlay Plot platform, with ResDelta assigned to [Y] and DieNumber assigned to [By], but this splits the output into one chart per DieNumber when I want all the data in a single chart.

Really struggling with this so any help would be appreciated, thanks!
15 REPLIES
Victor_G
Super User

Re: How to plot Count of values of Nominal data vs. Continuous data

Hi @Jakedahsnake,

 

Other members have already suggested solutions to your problem; here is another one, based on the discussion so far.

To create the attached graph, follow these steps:

 

  1. Create a new column "Count" holding the number of rows for each Die ID:
    // New formula column: Count
    Data Table( "dieexample" ) << New Formula Column(
    	Operation( Category( "Aggregate" ), "Count" ),
    	Columns( :ResDelta ),
    	Group By( :Die )
    );
  2. Create a graph showing the distribution of ResDelta against the number of points per die, with Die ID in the "Overlay" role and box plots to ease comparison (the box plots are optional and may get messy with a large dataset):
    Graph Builder(
    	Size( 984, 474 ),
    	Show Control Panel( 0 ),
    	Variables( X( :Count ), Y( :ResDelta ), Overlay( :Die ) ),
    	Elements( Points( X, Y, Legend( 10 ) ), Box Plot( X, Y, Legend( 12 ) ) ),
    	SendToReport(
    		Dispatch(
    			{},
    			"400",
    			LegendBox,
    			{Legend Position(
    				{10, [0, 1, 2, 3, 4], 12, [-1, -1, -1, -1, -1, -3, -3, -3, -3, -3]}
    			)}
    		)
    	)
    );
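In case `New Formula Column` is not available in your JMP version (the original poster is on JMP 11), step 1 could perhaps be sketched as an ordinary formula column instead. This is only a sketch: it assumes the same `dieexample` table with `:Die` and `:ResDelta` columns, and that your version accepts a By argument in `Col Number()`.

```jsl
Names Default To Here( 1 );

// Assumption: the demonstration table "dieexample" is already open.
dt = Data Table( "dieexample" );

// Col Number() counts the nonmissing ResDelta values within each Die group,
// so every row of a given die receives that die's row count.
dt << New Column( "Count", Numeric, "Continuous",
	Formula( Col Number( :ResDelta, :Die ) )
);
```

Rows with a missing ResDelta are not counted, which is usually what you want here since they would not appear as points in the graph either.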

Once this is done, you should get the attached graph (I added box plots to the graph and script, but you can remove them if they are not useful for your objective):

Victor_G_1-1673341984973.png

 

I have also attached the dataset I used for the demonstration, with the graph script saved in it.

Hope this will help you,

Victor GUILLER
Scientific Expertise Engineer
L'Oréal - Data & Analytics
Jakedahsnake
Level II

Re: How to plot Count of values of Nominal data vs. Continuous data

Thanks for the help, everyone. To clarify things, I would like to present a similar problem that might be easier to understand.

 

Example Problem:

 

Tim, Alex, & Sara run around a track. After each lap, a measurement is taken of Calories burned.

 

Tim ran around the track 2 times with Calories Burned of 75 and 155, respectively.

Alex ran around 4 times with calorie measurements of: 60, 130, 195, & 240 cal.

Sara ran around 5 times with measurements of: 65, 125, 180, 235, & 280 cal.

 

How do I plot (# of times ran around) vs. (Calories Burned)?

 

To relate this back to my problem:

(Nominal) Names = ChipDieNum,

(Nominal) # of times ran around = Count of ChipDieNum,

(Continuous) Calories Burned = Delta Res.

 

For the record I have 8,000+ data entries that I would like to model.

ih
Super User (Alumni)

Re: How to plot Count of values of Nominal data vs. Continuous data

Aah, thank you, this example clarifies your use case! If I understand correctly, you want each individual lap plotted as a separate point, not just a single point representing both of Tim's laps. I would approach this by making a column that counts the number of times each person has run around the track so far, and then plotting that against Calories Burned.

 

The formula for the "Times around Track" column could look like this (note that the first argument is the constant 1 rather than a column reference, so the cumulative sum simply counts rows within each Name):

ih_2-1673367524792.png

 

ih_1-1673367482406.png

Script to recreate chart:

Names default to here(1);

dt = New Table( "Count Table",
	Add Rows( 11 ),
	New Column( "Name", Character, "Nominal",
		Set Values( {"Tim", "Tim", "Alex", "Alex", "Alex", "Alex", "Sara", 
			"Sara", "Sara", "Sara", "Sara"} ) ),
	New Column( "Calories Burned", Numeric, "Continuous", Format( "Best", 12 ),
		Set Values( [75, 155, 60, 130, 195, 240, 65, 125, 180, 235, 280] ) ),
	New Column( "Times around Track", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( Col Cumulative Sum( 1, :Name ) ) )
);

g = dt << Graph Builder(
	Size( 534, 456 ),
	Show Control Panel( 0 ),
	Variables( X( :Times around Track ), Y( :Calories Burned ), Overlay( :Name ) ),
	Elements( Points( X, Y, Legend( 13 ) ), Smoother( X, Y, Legend( 15 ) ) )
);
Jakedahsnake
Level II

Re: How to plot Count of values of Nominal data vs. Continuous data

This is super close, but I don't think JMP 11 has Col Cumulative Sum() as a function. That is exactly the type of graph and modeling I am trying to see, though!

txnelson
Super User

Re: How to plot Count of values of Nominal data vs. Continuous data

If you sort your data by "Name" (using @ih's example), the following formula will give you the cumulative count within each Name in JMP 11:

If(Lag(:Name) != :Name | Row() == 1, x = 0); 
x++; 
x
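
As a rough sketch of how the formula above might be wired into a script, the following adds it as a formula column to @ih's example table and plots it. The column name "Lap Number" is made up for this illustration, and the script assumes the "Count Table" from @ih's script is open with the rows for each Name already contiguous (as they are in that example).

```jsl
Names Default To Here( 1 );

// Assumption: @ih's "Count Table" is open and rows are grouped by Name.
dt = Data Table( "Count Table" );

// "Lap Number" is a hypothetical column name chosen for this sketch.
// The counter x resets on the first row and whenever Name changes,
// then increments once per row.
dt << New Column( "Lap Number", Numeric, "Continuous",
	Formula(
		If( Row() == 1 | Lag( :Name, 1 ) != :Name, x = 0 );
		x++;
		x
	)
);

// Same kind of chart as @ih's, with the new column on X.
dt << Graph Builder(
	Variables( X( :Lap Number ), Y( :Calories Burned ), Overlay( :Name ) ),
	Elements( Points( X, Y ) )
);
```

Because the formula compares each row with the one before it, the result depends on row order: if the table is re-sorted so that a Name's rows are no longer adjacent, the counter restarts in the middle of that group.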
Jim
Jakedahsnake
Level II

Re: How to plot Count of values of Nominal data vs. Continuous data

Thanks for your help! Here's what it ended up looking like. I'll clean it up a bit more, but that's it!

Jakedahsnake_0-1673382981199.png