Subscribe Bookmark RSS Feed

Complex tabulation in transactional data to count frequency based on different dates per CustomerID

Alexllorens

Occasional Contributor

Joined:

Oct 3, 2017

Hello, I want to convert the transactional data below into a matrix to count the frequency of DIFFERENT days the customer made orders.

 

I'm basically trying to figure out, based on transactional data for 2 years. How many different days a customerID has made purchases.

 

For example, Customer ID 17850 has frequency of 1 in the data set below (it purchased different items during that day as you can see in the transactional data).

 

I'm trying TABULATE, but not sure how to come up with rows of single customerIDs and the frequency (based on unique days).

 

Example of Tabulation outcome (based on unique dates, not on ROWS!):

 

 

Customer IDFrequency 
1785045
1304730

 

 

PS: I attach the excell with transactional data

 

InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53636585123AWHITE HANGING HEART T-LIGHT HOLDER61/12/2010 8:262.5517850United Kingdom
53636571053WHITE METAL LANTERN61/12/2010 8:263.3917850United Kingdom
53636584406BCREAM CUPID HEARTS COAT HANGER81/12/2010 8:262.7517850United Kingdom
53636584029GKNITTED UNION FLAG HOT WATER BOTTLE61/12/2010 8:263.3917850United Kingdom
53636584029ERED WOOLLY HOTTIE WHITE HEART.61/12/2010 8:263.3917850United Kingdom
53636522752SET 7 BABUSHKA NESTING BOXES21/12/2010 8:267.6517850United Kingdom
53636521730GLASS STAR FROSTED T-LIGHT HOLDER61/12/2010 8:264.2517850United Kingdom
53636622633HAND WARMER UNION JACK61/12/2010 8:281.8517850United Kingdom
53636622632HAND WARMER RED POLKA DOT61/12/2010 8:281.8517850United Kingdom
53636784879ASSORTED COLOUR BIRD ORNAMENT321/12/2010 8:341.6913047United Kingdom
53636722745POPPY'S PLAYHOUSE BEDROOM 61/12/2010 8:342.113047United Kingdom
53636722748POPPY'S PLAYHOUSE KITCHEN61/12/2010 8:342.113047United Kingdom
53636722749FELTCRAFT PRINCESS CHARLOTTE DOLL81/12/2010 8:343.7513047United Kingdom
53636722310IVORY KNITTED MUG COSY 61/12/2010 8:341.6513047United Kingdom
53636784969BOX OF 6 ASSORTED COLOUR TEASPOONS61/12/2010 8:344.2513047United Kingdom
53636722623BOX OF VINTAGE JIGSAW BLOCKS 31/12/2010 8:344.9513047United Kingdom
53636722622BOX OF VINTAGE ALPHABET BLOCKS21/12/2010 8:349.9513047United Kingdom
53636721754HOME BUILDING BLOCK WORD31/12/2010 8:345.9513047United Kingdom
53636721755LOVE BUILDING BLOCK WORD31/12/2010 8:345.9513047United Kingdom
53636721777RECIPE BOX WITH METAL HEART41/12/2010 8:347.9513047United Kingdom
53636748187DOORMAT NEW ENGLAND41/12/2010 8:347.9513047United Kingdom
53636822960JAM MAKING SET WITH JARS61/12/2010 8:344.2513047United Kingdom
53636822913RED COAT RACK PARIS FASHION31/12/2010 8:344.9513047United Kingdom
53636822912YELLOW COAT RACK PARIS FASHION31/12/2010 8:344.9513047United Kingdom
53636822914BLUE COAT RACK PARIS FASHION31/12/2010 8:344.9513047United Kingdom
53636921756BATH BUILDING BLOCK WORD31/12/2010 8:355.9513047United Kingdom
53637022728ALARM CLOCK BAKELIKE PINK241/12/2010 8:453.7512583France
53637022727ALARM CLOCK BAKELIKE RED 241/12/2010 8:453.7512583France
53637022726ALARM CLOCK BAKELIKE GREEN121/12/2010 8:453.7512583France
53637021724PANDA AND BUNNIES STICKER SHEET121/12/2010 8:450.8512583France
53637021883STARS GIFT TAPE 241/12/2010 8:450.6512583France
2 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

The solution to your question is very simple.  You can easily do this interactively or as a script. These steps are all found in the introductory JMP documentation, which I suggest you take the time to read.  It will give you a good overview on the different components available.

     Help==>Books==>Using JMP

Interactive:

1. Open the Excel file from JMP

     File==>Open

2. The CustomerID opens as a Numeric/Continuous column.  It needs to be an Ordinal value.

3. The InvoiceDate column is formatted to have the Time Part also displayed.  You want to deal with InvoiceDate as separate days, not times within days, so change open the Column Info for that column and change the format to "d/m/y"

4. To summarize the data to one row per CustomerID/InvoiceDate, run

     Tables==>Summary

Put both CustomerID and InvoiceDate into the Group selection and Click OK

5. Now to Summarize down to one row per CustomerID, run the Summary Platform again, against the newly created table

     Tables==>Summary

Placing the CustomerID column into the Group selection, and Clicking on OK

 

The resulting table will have your requested values

 

The script below, was basically generated by running the above steps interactively and then just saving the scripts for each step into a script window.  I wrote only a minor part of the code below.  JMP generated it.

names default to here(1);

// Open the Excel file
dt=Open(
	// Change the below to the path to the Excel File
	"<Path to Folder>\Online Retail.xlsx");

// The CustomerID should not be treated as a continuous,
// analytical column of data, so
// Change CustomerID to an Ordinal Modeling Type
dt:CustomerID<<modeling type("Ordinal");

// The InvoiceDate column is formatted to reflect time of day,
// and only the Date portion is wanted for this problem, so
// Change the display format to a Date format
dt:InvoiceDate<<format("d/m/y");

// Summarize the data to create a table that has only a single
// row in the new table for each CustomerID/InvoiceDate
dtSum1 = dt << Summary(
	Group( :CustomerID, :InvoiceDate ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

// Display the results using 
//     Analyze==>Tabulate
// which will allow for the Calculation of the
// counts of the number of different InvoiceDates
dtSum1 << Tabulate(
	Show Control Panel( 0 ),
	Include missing for grouping columns( 1 ),
	Add Table(
		Column Table( Analysis Columns( :InvoiceDate ), Statistics( N ) ),
		Row Table( Grouping Columns( :CustomerID ) )
	)
);

// Or

// Run the Summary Platform again, but this time run it on
// the current Summarized data table to count the number of
// rows for each CustomerID
dtSum2 = dtSum1 << Summary(
	Group( :CustomerID ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);

// Now Clean up the N Rows name to # of Invoiced Days
dtSum2:N Rows << set name("# of Invoiced Days");

 

Jim
dale_lehman

Community Trekker

Joined:

Jan 29, 2015

An alternative way to do this.  First, as Jim says, change the Customer ID (either to Nominal or to Ordinal).  Create a new date column by changing the current InvoiceDate to a Character Nominal variable.  Create a new column for date by using the Left (Invoice Date,10) - this strips away the time.  Then use Tabulate, with Customer ID followed by the new date column (both as rows).  At the red arrow, Make into Data Table.

 

Then, create a new column with the formula Column Number (date, by Customer ID).  This gives the number of distinct days each customer placed orders.