I attach a dataset: How to subtract duplicate invoices starting with a letter?
Oct 5, 2017 6:56 AM
Hello, I have a dataframe with a column named 'InvoiceNumber' (InvoiceNo, for example 541431) and another column named 'StockCode'.
When InvoiceNo starts with a C (for example C541433) and the row matches a previous StockCode, I have to cancel its 'Quantity' and/or 'Unitprice' from my analysis (it means a customer RETURNED the item). I'm trying to do an RFM analysis and a Market Basket Analysis, but I need to take this into consideration first.
How can I solve this problem? I have 500K+ rows of transactional data.
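To make the rule concrete, here is a rough sketch of one possible approach in pandas. It is not a definitive solution and rests on several assumptions: the columns are literally named `InvoiceNo`, `StockCode` and `Quantity`, a return can be matched to a purchase on `StockCode` alone (real data would probably also need a customer identifier), and the toy values are invented for illustration.

```python
import pandas as pd

# Toy data: column names follow the post, values are made up.
df = pd.DataFrame({
    "InvoiceNo": ["541431", "541432", "C541433", "541434"],
    "StockCode": ["A1", "B2", "A1", "B2"],
    "Quantity":  [10, 5, 4, 3],
})

# Cast to string first so purely numeric invoice numbers still work with .str
is_return = df["InvoiceNo"].astype(str).str.startswith("C")

# Assumption: returned quantities may be stored as positive or negative,
# so take the absolute value and apply the sign explicitly.
sign = is_return.map({True: -1, False: 1})
df["NetQuantity"] = df["Quantity"].abs() * sign

# Net quantity per item after subtracting returns (e.g. as input to RFM).
net = df.groupby("StockCode", as_index=False)["NetQuantity"].sum()

# Purchases only, with return rows dropped (e.g. as input to basket analysis).
purchases = df[~is_return]
```

With 500K+ rows this stays vectorized, so it should not need an explicit loop. Grouping on `StockCode` alone is the main simplification here; adding a customer column to the `groupby` keys would be the natural next step if one exists.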