How to Categorize Missing Values (MCAR, MAR, NMAR)
May 20, 2019 4:43 PM
I am working with petrophysical well log data (gamma ray, porosity, density, etc.) from an unconventional reservoir. The data are very sparse: there are only 14 wells in total, 2 of which have no logging information available to me. I am working on classifying the missing values in my dataset so that I can impute them.

I have identified about 5 different mechanisms for the missing values and have generated a categorical "dummy" variable that records them. For example, where data are present I labeled everything a 1, where data are missing due to the tool's staggered placement on the cable I labeled a 2, where data were not logged below the formation of interest I labeled a 3, where data were not available I labeled a 4, and outliers were labeled a 5.

I would like to figure out a way to categorize each of these mechanisms as missing completely at random (MCAR), missing at random (MAR), or not missing at random (NMAR). I was thinking about using the Fit Y by X platform and running some tests to compare the percentages of these missing-value categories across the wells. What are some options to do this? I am open to other ideas as well. Thanks in advance!
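To make the comparison across wells concrete, here is a minimal sketch in Python rather than JMP. The well names and records are made up for illustration, and the helper functions (`contingency`, `row_percentages`, `chi_square`) are hypothetical names, not part of any library. It builds a well-by-code table from the 1-5 coding described above, converts it to per-well percentages, and computes a Pearson chi-square statistic for independence between well and code.

```python
from collections import Counter

# Hypothetical (well, code) records using the 1-5 coding described above.
# These values are invented purely to illustrate the calculation.
records = [
    ("W1", 1), ("W1", 1), ("W1", 2), ("W1", 3),
    ("W2", 1), ("W2", 4), ("W2", 4), ("W2", 3),
    ("W3", 1), ("W3", 1), ("W3", 1), ("W3", 5),
]


def contingency(records):
    """Count samples in each (well, code) cell; rows are wells, columns are codes."""
    counts = Counter(records)
    wells = sorted({well for well, _ in records})
    codes = sorted({code for _, code in records})
    table = [[counts[(well, code)] for code in codes] for well in wells]
    return wells, codes, table


def row_percentages(table):
    """Express each well's counts as percentages of that well's total samples."""
    return [[100.0 * n / sum(row) for n in row] for row in table]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


wells, codes, table = contingency(records)
percentages = row_percentages(table)
stat, dof = chi_square(table)

for well, row in zip(wells, percentages):
    print(well, [round(p, 1) for p in row])
print("chi-square statistic:", round(stat, 3), "degrees of freedom:", dof)
```

A large statistic relative to its degrees of freedom suggests that the mix of codes depends on the well, which is evidence against MCAR for those codes. A test like this cannot by itself separate MAR from NMAR, since that distinction depends on the unobserved values, and with counts this small the chi-square approximation is unreliable.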