Discussions

caseylott · Jan 8, 2021 04:43 PM

Hi all. I've been struggling with something for a few days, so I thought it might be time to ask for the community's help. Note, this is a pretty long problem description since it leads step by step through 5 analysis scripts that are embedded in the attached data table. I haven't been able to figure out a more concise way to illustrate my analysis goal and the problems I'm having meeting it. If anyone has the time to look at this and find a solution, I'd be grateful. I've been stuck on it for a few days now.

The attached data table, called "LottSurveyDataForJMP", has results from an anonymous survey of forest management professionals. The RespondentID field is a unique ID for each participant, where the time stamp is used instead of their name to preserve anonymity. The remaining 12 columns have the multiple response data type, since each question allowed participants to select one or more check box responses. My survey software used semicolons to separate responses, so each column has either a single response (e.g., "forester") or a set of responses separated by semicolons (e.g., "forester; forest planner; researcher").

The first 2 multiple response columns, "Affiliation" and "Role", describe the demographic makeup of survey respondents. Note that many respondents had more than one affiliation and/or more than one role. The remaining 10 multiple response fields are answers to 10 different survey questions, numbered Q1 to Q10. My analysis goal is to summarize responses to each of these 10 questions by unique combinations of Affiliation and Role (e.g., State Agency * Forester or Non-profit organization * Forest Planner).

I have been able to figure out how to generate single factor summaries for all 10 questions by either Affiliation or Role using the categorical platform. The data table script called "All questions, 1 factor, "each individually" grouping" produces this result. However, I have not been able to figure out how to generate clean summaries of responses to each of the 10 questions by unique 2 factor interaction (e.g., Affiliation * Role). Instead, the first factor is parsed correctly, but the second factor is not. See my failed attempts by running the data table scripts "All questions, 2 factors, "combinations" grouping" and "All questions, 2 factors, "both" grouping". The only difference between these two scripts is the value I selected in the "grouping type" drop down menu. Interestingly, selecting either "combinations" or "both" resulted in exactly the same tables. In each case, a 2 factor table is created with Role levels nested within Affiliation levels, BUT the Role levels are no longer unique. For example, State Agency affiliations are split into 3 levels: "forester", "forest planner", and "forester; forest planner". They should be split into only 2 unique levels, "forester" and "forest planner", and the respondent who specified two roles as "forester; forest planner" should be counted once in each of the separate categories rather than retained as its own level.
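The difference between the two groupings described above can be illustrated outside JMP. The following is a minimal Python sketch, not JSL and not what the categorical platform does internally; the three Role values and the helper name `split_responses` are hypothetical and only mirror the State Agency example in the post.

```python
from collections import Counter

def split_responses(cell, sep=";"):
    """Split one multiple-response cell into its trimmed, non-empty levels."""
    return [part.strip() for part in cell.split(sep) if part.strip()]

# Hypothetical Role values for three State Agency respondents (not the real data).
roles = ["forester", "forest planner", "forester; forest planner"]

# Treating each cell as one string keeps the combined answer as its own level.
as_whole_strings = Counter(roles)

# Splitting first counts the two-role respondent once in each separate level.
as_unique_levels = Counter(
    level for cell in roles for level in split_responses(cell)
)

print(dict(as_whole_strings))
print(dict(as_unique_levels))
```

The first count has three levels, which matches the unwanted "combinations" output; the second has only the two unique levels the post asks for.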

Next, I used the "structured" tab of the categorical platform dialog to assign "role" to the "top" position and "affiliation" to the "side" position (run the "use of "structure" tab..." data table script to see this result). This produced exactly the type of two way table that I was looking for, summarizing all unique 2 way combinations of Role and Affiliation. In this case, both demographic factors were treated the same way that the "each individually" grouping dealt with single X grouping factors. This is exactly the table structure I am looking for in order to summarize results for each of the 10 survey questions. Unfortunately, when you run the next script, "attempt to get unique role * affiliation results for individual questions...", you'll see that I wasn't able to generate the desired 2 factor summaries for individual questions.

I am hoping to create an output that looks exactly like the one I got for the "use of "structure" tab..." script where there are 10 tables, one for each question, with results summarized by unique combinations of role and affiliation, but I can't for the life of me figure out how to get this to work.

KarenC · Jan 9, 2021 01:41 PM

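// Structured( top, side ): per the description below, Role and Affiliation
// go across the top and the question goes down the side.
Categorical(
	Structured( :Role * :Affiliation, :Q1 Spatial scales ),
	Share Chart( 0 ), // turn off the share chart
	Legend( 0 ) // turn off the legend
);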

Does this get to what you want? Using the structured tab, I put the question on the side and the role nested within the affiliation at the top. The result is a table of counts of the responses for each affiliation/role combination. What you have run into is that the multiple response column type is recognized in some places in JMP but not in all of them. The grouping role was not recognizing the multiple response column type, but the structured element in the categorical platform DOES recognize it. Categorical data is tricky!!! I hope this helps you in your quest to make sense of what you have.
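For readers who want to check the counts in such a table by hand, here is a rough Python sketch of the counting logic: each respondent contributes once to every affiliation/role/response cell they belong to. It is an illustration only, not JSL and not a description of how JMP computes the table. The two respondents, the key "Q1", the response values "stand" and "landscape", and the function names are all hypothetical.

```python
from collections import Counter
from itertools import product

def split_responses(cell, sep=";"):
    """Split one multiple-response cell into its trimmed, non-empty levels."""
    return [part.strip() for part in cell.split(sep) if part.strip()]

# Hypothetical respondents; the real table has 12 multiple response columns.
respondents = [
    {"Affiliation": "State Agency",
     "Role": "forester; forest planner",
     "Q1": "stand; landscape"},
    {"Affiliation": "State Agency; Non-profit organization",
     "Role": "forester",
     "Q1": "stand"},
]

def crosstab(rows, question, factors=("Affiliation", "Role")):
    """Count responses to one question for every unique factor combination."""
    counts = Counter()
    for row in rows:
        level_sets = [split_responses(row[factor]) for factor in factors]
        answers = split_responses(row[question])
        # Every (affiliation, role) pair of this respondent gets each answer once.
        for combo in product(*level_sets):
            for answer in answers:
                counts[combo + (answer,)] += 1
    return counts

q1_counts = crosstab(respondents, "Q1")
for key, n in sorted(q1_counts.items()):
    print(key, n)
```

Calling `crosstab` once per question column would give one table per question, which is the shape of output requested in the original post.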

Karen

dale_lehman · Jan 9, 2021 08:42 AM

I don't know if this helps at all, but I played with your data a bit. Caveat: I haven't used multiple response data, so I did not approach it that way at all (and after looking at this data, I would probably try to avoid multiple response data in the future!). In any case, I was inclined to recode the data. In the attached file (with the same name as the one you attached) there is a final script showing a treemap of one of the temporal responses, after recoding both the affiliation and role. My recoding may not make sense, as it should be guided by content expertise - but also by the frequency in the data (which is why I tried to isolate the most common responses to affiliation and role).

If this approach makes any sense to you, you can look at the second file I attached (example.jmp). I used the column utility to break the multiple responses for temporal into separate columns (with indicator values). The saved script shows one example: after stacking these new columns, a heatmap showing how common those 5 categories are for various roles and affiliations.
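The indicator-column and stacking steps described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the JMP column utility itself; the respondent IDs, the response values "short term" and "long term", and the function names are hypothetical.

```python
def split_responses(cell, sep=";"):
    """Split one multiple-response cell into its trimmed, non-empty levels."""
    return [part.strip() for part in cell.split(sep) if part.strip()]

def to_indicators(cells):
    """Turn multiple-response cells into one 0/1 indicator per unique level."""
    levels = sorted({lvl for cell in cells for lvl in split_responses(cell)})
    rows = []
    for cell in cells:
        chosen = set(split_responses(cell))
        rows.append({lvl: int(lvl in chosen) for lvl in levels})
    return levels, rows

def stack(ids, indicator_rows):
    """Stack wide indicator rows into long (id, level, flag) records."""
    return [
        (rid, level, flag)
        for rid, row in zip(ids, indicator_rows)
        for level, flag in row.items()
    ]

# Hypothetical IDs and temporal responses (not the real survey data).
ids = ["r1", "r2"]
temporal = ["short term; long term", "long term"]

levels, indicator_rows = to_indicators(temporal)
stacked = stack(ids, indicator_rows)
print(levels)
print(stacked)
```

The long format is what a heatmap or tabulation by role and affiliation would typically be built from, since each row then carries one respondent, one category, and a 0/1 flag.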

As I said, this may or may not be helpful to you. For me, visualizing the data makes more sense than these lengthy and sparse tables. In fact, the sparsity of the cells in your displays I think will be a problem no matter how you look at this. The combination of small sample size and multiple responses seems (to me, at least) to require extensive recoding of the data to make any sense of it. There are just too many sparse responses (combinations of your 2 demographic variables and particular multiple responses) for this small sample to say anything sensible. Recoding is the way I would approach this data - but since recoding data means losing the detail from the original data, it needs to be done carefully so as to not produce artificial patterns.

The other thought I have (in hindsight, of course) is that if you use multiple responses, it would really help if they had some natural order. Without any ordering, each particular set of responses needs to be equally weighted. I've seen this happen with other survey data. An example is student surveys of majors they are interested in - if they are not in order of preference, and each respondent can put 3 responses, then a particular major counts equally whether it is the first, second, or third choice. It makes analysis easier if the first choice represents the primary one.

caseylott · Jan 11, 2021 12:21 PM

Thank you, @KarenC. This worked perfectly. It's amazing how simple the answer was. I had not figured out that I could nest columns in the structured tab. I hope that other people having the same problem run into your solution, or that this detail gets added to the categorical platform > multiple response documentation, because it saves a TON of time. @dale_lehman, thanks for your work on this as well. I experimented with similar types of recoding, which led me to feel the same way about multiple response data! But once again, JMP has a simple solution that makes the analysis of multiple response data a LOT quicker.

About the sparse matrix problem that @dale_lehman mentioned... It would be awesome if there was some way to limit table outputs to combinations of factors that have a minimum sample size for cases. For example, if I was able to specify 10 as the minimum number of cases that I'd like to use in subsequent analyses, the table would be a lot smaller and would only include 2 way combinations of affiliation * role that have a decent enough sample size for presentation and discussion of results. I'm not sure how this would be done, but if anyone has any ideas, please tag me @caseylott with your response.

Thank you again for your help, JMP community. You guys are great!

KarenC · Jan 12, 2021 06:12 PM

Ok, I might have a way for you to get to what you are trying to get to:

1. Run the structured analysis as we figured out.
2. Save the contingency table.
3. Use Tabulate with your question on the left and the other roles nested at the top. Yes, you have another unwieldy table, but now...
4. Use the local data filter on frequency and filter out the low counts (you can do this interactively to get to a table size that is useful).
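The filtering step amounts to dropping every row of the saved table whose count falls below a chosen minimum. A minimal Python sketch of that logic follows, under stated assumptions: the saved table is represented as a list of rows, the count column is assumed to be named "Frequency", and the rows and the function name are hypothetical. The threshold of 10 is the example value mentioned earlier in the thread.

```python
def filter_min_count(table, min_count, count_key="Frequency"):
    """Keep only rows whose count is at least min_count."""
    return [row for row in table if row[count_key] >= min_count]

# Hypothetical rows of a saved contingency table (not the real survey counts).
saved_table = [
    {"Affiliation": "State Agency", "Role": "forester",
     "Response": "stand", "Frequency": 14},
    {"Affiliation": "State Agency", "Role": "forest planner",
     "Response": "stand", "Frequency": 3},
    {"Affiliation": "Non-profit organization", "Role": "forester",
     "Response": "landscape", "Frequency": 10},
]

kept = filter_min_count(saved_table, min_count=10)
for row in kept:
    print(row["Affiliation"], row["Role"], row["Response"], row["Frequency"])
```

Note that this filters on cell counts, as the local data filter on frequency does. Filtering on the number of respondents in each affiliation/role combination would need those totals computed first.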

Hope that helps!!

Karen

Summarizing multiple response survey data by unique combinations of 2 demographic factors
