cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar

Is there a possibilty to distinguish between two types of missing data?

Hi,

 

I have an excel data table that looks like this (file attached):

TestSignal 1Signal 2Signal 3
1334
24not reached 
35 6
46 not reached

It includes a number of tests. Each test has a number of signal channels (Signal 1-3), but not all channels were used for all tests. In cases where the signal channel was not used for a given test the corresponding field is empty.

The numerical values for each channel mark the time when the signal reaches a certain threshold value. If the signal does not reach the threshold, this in indicated by the string "not reached".

For the evaluation it would be important to be able to distinguish between the two cases of missing data:

1) there is no data because the channel was not used.

2) there is no data because the threshold was not reached.

Currently the data are all of type "character" but of course I need to convert them into "numeric" for the evaluation. However, as soon as I convert the data type of a column to "numeric" the entries "not reached" are deleted and there is no distinction anymore between the two types of missing data.

I've tried to use the recode option to replace "not reached" by NaN or nan and then convert the columns to numeric but the result was the same (the NaN or nan were deleted).

Is there a way that would allow distinguishing between the two cases of missing data? I thought of having no value in one case and something like NaN in the other case.

(Either an interactive or a scripting solution would be fine.)

 

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: Is there a possibilty to distinguish between two types of missing data?

Couple of suggestions come quickly to my mind (these solutions depend on the analysis you want to do and might not work correctly in your case)

 

You could leave empty values as empty (will be replaced with . when changed to numeric) and replace not reached with 9999 or something similar. And then use one of options found from Column Properties (Value Labels, Missing Value Codes might work).

 

Or then you could stack the data. All columns which are to be stacked must have same data type, so change the not reached to 9999 before stacking. Then after stacking you could create a formula, which will tell what the issue was with the specific signal:

jthi_0-1628666442205.png

And then again use Column Properties if needed depending on what you are doing.

-Jarmo

View solution in original post

2 REPLIES 2
jthi
Super User

Re: Is there a possibilty to distinguish between two types of missing data?

Couple of suggestions come quickly to my mind (these solutions depend on the analysis you want to do and might not work correctly in your case)

 

You could leave empty values as empty (will be replaced with . when changed to numeric) and replace not reached with 9999 or something similar. And then use one of options found from Column Properties (Value Labels, Missing Value Codes might work).

 

Or then you could stack the data. All columns which are to be stacked must have same data type, so change the not reached to 9999 before stacking. Then after stacking you could create a formula, which will tell what the issue was with the specific signal:

jthi_0-1628666442205.png

And then again use Column Properties if needed depending on what you are doing.

-Jarmo

Re: Is there a possibilty to distinguish between two types of missing data?

Thanks. That seems to be doing quite nicely what I was looking for.