Generalized Linear Mixed Model Add-in

This add-in lets you fit generalized linear mixed models (GLMMs) in JMP. The default parameter settings are mostly the same as those of the SAS %GLIMMIX macro, which fits the model using iteratively reweighted likelihoods (Wolfinger & O'Connell, 1993).

p0.png

The Select Columns panel shows all the variables in the current dataset.

 

The Pick Role Variables panel allows you to specify:

  • Y: the response variable.
  • Weight: a weighting variable for the analysis.
  • Offset: the offset variable. The default is no offset.
  • By: the by variable allows you to fit models for subgroups simultaneously.

The Construct Model Effects panel includes:

  • Fixed effects
  • Random effects

The Options panel includes:

  • Distribution: the response variable distribution. The default is Binomial.
  • Link: the link function to use to transform the response variable into pseudo scores before utilizing the mixed model. The canonical link functions are provided.
  • Max Iteration: the maximum number of iterations for the algorithm to converge. The default value is 30.
  • Convergence criterion: iteration stops when the RMSE falls below this value. The default value is 1E-8.
  • Correction factor: a constant added to the response variable at the initial transformation step in order to avoid singularities. The default value is 0.5.
  • By Group Report Limit: when specifying a By variable, if the number of subgroups is greater than this limit number, the report output will be suppressed. Instead, a summary table and a Graph Builder interface will be provided.
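To make the roles of Max Iteration, Convergence criterion, and Correction factor more concrete, here is a minimal Python sketch of an iteratively reweighted fit for the simplest possible case: an intercept-only Poisson model with a log link and no random effects. It is not the add-in's code. The function name `fit_intercept_poisson` is hypothetical, and the stopping rule (absolute change in the coefficient) is a simplification standing in for the RMSE-based criterion described above.

```python
import math

def fit_intercept_poisson(y, cf=0.5, max_iter=30, tol=1e-8):
    """Intercept-only Poisson fit (log link) via iteratively reweighted pseudo-responses.

    Hypothetical illustration only; the add-in's actual implementation may differ.
    """
    # Initial transformation: add the correction factor so log() is defined when y == 0.
    eta = [math.log(v + cf) for v in y]
    beta = None
    for iteration in range(1, max_iter + 1):
        mu = [math.exp(e) for e in eta]
        # Pseudo-response and weights for the log link (Poisson variance equals mu).
        z = [e + (v - m) / m for e, v, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares with only an intercept reduces to a weighted mean.
        new_beta = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
        # Simplified stopping rule (assumption): change in the coefficient below tol.
        if beta is not None and abs(new_beta - beta) < tol:
            return new_beta, iteration
        beta = new_beta
        eta = [beta] * len(y)
    return beta, max_iter
```

For counts such as `[0, 1, 2, 5]`, the fitted intercept approaches the log of the sample mean, and the loop stops well before the default limit of 30 iterations. If the limit is reached instead, the returned value has not converged.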

We illustrate the add-in using examples available online (%GLIMMIX macro results website: https://support.sas.com/techsup/notes/v8/25/030.html or the PROC GLIMMIX website: https://go.documentation.sas.com/?docsetId=statug&docsetVersion=15.1&docsetTarget=statug_glimmix_exa... ). To obtain comparable results with the SAS macro, set the ‘DDFM=KR’ option in the MODEL statement and the ‘NOBOUND’ option in the PROC MIXED statement.

 

Demo videos are also available online:

Example 6 -- Salamander data long format: https://youtu.be/6gCqufudFV4

Example 7 -- RNA-seq data: https://youtu.be/o4LeTmVzzVE

 

Example 1 – Poisson distribution (ship_demo.jmp):

The following is a glimpse of the ship data (originally from McCullagh and Nelder (1989), Section 6.3).

The response variable is “N”, which follows a Poisson distribution. The fixed effect in the Generalized Linear Mixed Model is “type”, and the random effects are “year, period, year*period”. The offset variable is “service”.

p2.png

The default link function for Poisson distribution is the ‘log’ link. Note that, if we want to use the canonical (default) links for a selected distribution, we can just leave the link box blank.

Interaction terms can be added by selecting two or more variables and clicking the “cross” button in the Fixed Effects or Random Effects panel.

p3.png

Click the “Run” button after specifying all the model parameters. After 28 iterations, the algorithm converged. The output uses JMP's Mixed Model report format. Note that, because the design matrix coding differs from that of the %GLIMMIX SAS macro, we suggest comparing the F-test results, LS means, or variance component estimates rather than the fixed effects parameter estimates.

p4.png

 

Example 2 – Binary distribution (sal1.jmp):

The following is a glimpse of the Salamander data (originally from McCullagh and Nelder (1989), Section 14.5). Note that before running the analysis, you should set the proper modeling type for the variables of interest. In the example below, the variables “fnum, mnum” are continuous numbers. To use them as categorical variables, right-click the variable name > Column Info, then change the modeling type to nominal.

p5.png

Model specification: Response variable: ybin. Distribution: Binomial. Link function: logit. 

Fixed effect: fpop, mpop, fpop*mpop. Random effect: fpop*fnum, mpop*mnum.

MeichenDong_0-1617129419024.png

 

The algorithm converges after 10 iterations.

p7.png

 

 

Example 2.1 Binary distribution (sal1_ynominal.jmp):

If the outcome is character/nominal, e.g. “ybin_char” in the following data (levels are Y and N), we offer the option ‘Reference level for Nominal Outcome (Y)’. You can specify the reference level, for example ‘N’ as the default. Another example: if outcome Y has levels ‘High’ and ‘Low’, you can specify ‘Low’ in the Reference level for Nominal Outcome (Y) box.

MeichenDong_2-1617129492875.pngMeichenDong_3-1617129501010.png

Example 3 – Binomial distribution (blotch.jmp)

p8.png

Data source: https://go.documentation.sas.com/?docsetId=statug&docsetTarget=statug_code_gmxex04.htm&docsetVersion...

 

The response variable “prop” indicates the incidence. “site” and “variety” are the two fixed effects. We select the “Binomial” distribution and the default link function is “Logit”. (We specify no random effect in this example.)

p9.png

The LSmeans of “variety” and “site” are consistent with those from the %GLIMMIX macro and PROC GLIMMIX. However, the F-test results showed discrepancies.

p10.png

 

Example 4 – Binomial distribution (rcb_binomial.jmp)

p12.png

 After 6 iterations, we derive the following result. This result is consistent with the result generated by the %GLIMMIX macro, but slightly different from the result from PROC GLIMMIX.

p13.png

To derive the predicted “events/trials” or “successes/total trials” rates, perform the transformation:

MeichenDong_0-1601061041583.png where the estimate of MeichenDong_1-1601061041585.png is the “_pred” column in our data table. This can be done by manually adding a column to the data table and setting a formula for it.

Specifically, after running the GLMM add-in, your data table looks like the following.

add1.png

We can create a new column and set a formula for it:

add2.png     add3.pngadd4.png

Then, we could eventually derive the new variable predicted_rate as shown below:

add5.png
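The transformation in this example is shown only as an image, so the following Python sketch rests on an assumption: that the “_pred” column holds the linear predictor on the logit scale (the default link for the Binomial distribution). Under that assumption, the predicted rate is the usual inverse-logit of “_pred”. The function name `inverse_logit` is chosen here for illustration.

```python
import math

def inverse_logit(eta):
    """Map a logit-scale value to a probability in (0, 1)."""
    # Two branches avoid overflow in exp() for large positive or negative eta.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

# Hypothetical "_pred" values; in practice these come from the data table.
pred = [-2.0, 0.0, 1.5]
predicted_rate = [inverse_logit(v) for v in pred]
```

A logit-scale value of 0 corresponds to a predicted rate of 0.5, and larger values map to rates closer to 1.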

Example 5 – Binomial distribution (hessianfly.jmp)

p14.png

Data source: https://go.documentation.sas.com/?docsetId=statug&docsetTarget=statug_code_gmxex01.htm&docsetVersion...

 

The response is “y/n”, which is of the form “events/trials” or “successes/total trials”. When specifying the response variables Y, we need to select the “events” variable first, in this case “y”, and then select the “trials” variable afterwards. Hence, “n” is selected after “y”. Set “entry” as fixed effect and “block” as random effect. Choose “Binomial” distribution and use the default “Logit” link function.

p15.png

After 6 iterations, we derive the results as follows:

p16.png

The results of the above examples can be verified by the SAS PROC GLIMMIX and the %GLIMMIX macro. The code SAS_Example_macro_proc_verification.sas is available upon request.

 

Example 6 – Binary distribution (sal1_long.jmp):

When the data consists of subsets and you wish to model each group/subset independently, you can specify the group/subset indicator variable in the “By” field.

In this example, the sal1_long.jmp is the stacked version of the Salamander data (originally from McCullagh and Nelder (1989), Section 14.5). Here, the response variables “y1, y2, y3” are stacked into the variable “ystack”. The variable “ycat” indicates the original response category.
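For readers preparing a similar table outside JMP, the stacking step can be sketched in Python as follows. The function `stack_columns` and the sample rows are hypothetical; only the column names “y1, y2, y3”, “ystack”, and “ycat” come from the example.

```python
def stack_columns(rows, value_cols, stacked_name="ystack", label_name="ycat"):
    """Turn one row with several response columns into one row per response."""
    stacked = []
    for row in rows:
        # Columns that are not being stacked are repeated on every output row.
        shared = {k: v for k, v in row.items() if k not in value_cols}
        for col in value_cols:
            stacked.append({**shared, label_name: col, stacked_name: row[col]})
    return stacked

# Made-up wide-format rows for illustration.
wide = [
    {"fnum": 1, "mnum": 1, "y1": 1, "y2": 0, "y3": 1},
    {"fnum": 1, "mnum": 2, "y1": 0, "y2": 0, "y3": 1},
]
long_rows = stack_columns(wide, ["y1", "y2", "y3"])
```

Each original row yields three stacked rows, and the label column records which response the value came from, so it can serve as the By variable.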

p17.png

Here, specify “ystack” as the response variable, and “ycat” as the By variable. As in Example 2, specify the distribution, fixed effects, and random effects.

p18.png

The GLMM procedure will be performed for each group/subset specified in the By field. The final report is as below. Results are listed for each group/subset – response y1_pseudo, response y2_pseudo, and response y3_pseudo.

p19.png

 

Example 7 – RNA-seq data

Genomics data are usually high-dimensional, with the number of genes or SNPs often exceeding 10k, so an integrated analysis can be computationally challenging. With a view to using this add-in for RNA-seq data, we test its performance as the number of genes specified in the By field increases. This section also serves as a stress test, because the By function is not multi-threaded.

The example data used here are publicly available online, from the variancePartition package (Hoffman & Roussos, 2020): https://bioconductor.org/packages/release/bioc/vignettes/variancePartition/inst/doc/dream.html

  • varPartDEdata_count.jmp is the original count matrix
  • varPartDEdata_metadata.jmp is the metadata for the 24 samples
  • varPartDEdata_stack10.jmp is the processed data, where gene expression from the first 10 genes is stacked, along with other metadata variables.
  • varPartDEdata_stack100.jmp is the processed data, where gene expression from the first 100 genes is stacked, along with other metadata variables.

p20.png

From the example online, for each gene, we can fit the model as:

Y ~ DiseaseSubtype + Sex + (1|Individual)

 

where “DiseaseSubtype” and “Sex” are fixed effects and “Individual” is the random effect. Here the response variable Y depends on the choice of distribution. One choice is the “Poisson” distribution, in which case we use the raw count “expression” as the response variable. An alternative way to model RNA-seq data is to log-transform the counts first and use “logexprs” as the response variable while specifying the “Normal” distribution.
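Written out, the per-gene model can be read as the following sketch. The notation is introduced here for illustration and does not come from the add-in: i indexes individuals, j indexes samples within an individual, x_ij collects the DiseaseSubtype and Sex terms, u_i is the random intercept for Individual, and g is the link function.

```latex
g\bigl(\mathrm{E}[Y_{ij} \mid u_i]\bigr) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i,
\qquad u_i \sim N(0, \sigma_u^2)
```

With the Poisson choice, Y is the raw count and g is the log link. With the Normal choice, Y is the log-transformed expression and g is the identity link.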

 

p21.png


p22.png

 

Using the example data varPartDEdata_stack10.jmp, we can derive the results for each of the ten genes separately:

p23.png

 

With “logexprs” as the response variable, the results from Analyze > Fit Model > Mixed Model and from our GLMM add-in are the same. With the raw counts “expression” as the response and the “Poisson” distribution specified, the run time is longer. If the maximum number of iterations is reached, the selected distribution or link function may not fit the data well.

 

Run times (HH:MM:SS) by number of subgroups (genes) in the By variable:

Genes     Poisson     Normal      Mixed Model using log(count)

10        00:01:15    00:00:06

100       00:06:00    00:00:43    00:00:01

1000      01:25:19    00:09:05    00:01:41

10000     11:11:05    01:31:58

……

When the number of genes/subgroups specified in the By field is greater than By Group Report Limit=10, for example when using varPartDEdata_stack100.jmp, the report is no longer shown. Instead, a “GLMM result” JMP table along with a volcano plot is provided. The “GLMM result” table includes part of the F-test results (DFDen_, F Ratio_, Prob > F_), the least squares means (LSMeans:), and part of the Parameter Estimates (Estimate:, -log10(Prob>|t|):).

By specifying “logexprs” as Y, and specifying “Normal” distribution, we can derive the dataset below. (Available as GLMM result.jmp)

 

p24.png

 

The default volcano plot is generated for the first fixed effect specified. You can switch columns to look at the volcano plot for other variables of interest.
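As a rough illustration of what a volcano plot displays, the sketch below pairs each gene's estimate with -log10 of its p-value, matching the “Estimate:” and “-log10(Prob>|t|):” columns of the “GLMM result” table. The function `volcano_points` and the sample values are hypothetical.

```python
import math

def volcano_points(results):
    """Convert (gene, estimate, p_value) records to (gene, x, y) plot coordinates."""
    points = []
    for gene, estimate, p_value in results:
        # -log10 is undefined for p <= 0, so such records are skipped.
        if p_value <= 0:
            continue
        points.append((gene, estimate, -math.log10(p_value)))
    return points

# Made-up results for three genes.
example = [("geneA", 1.2, 0.001), ("geneB", -0.3, 0.5), ("geneC", 0.0, 0.0)]
coords = volcano_points(example)
```

Genes with large absolute estimates and small p-values land in the upper left and upper right corners of the plot.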

 

p25.png

(100 genes, logexprs)

 

p26.png

(10000 genes, logexprs)

 

p27.png

(10000 genes, expression, Poisson)

 

References

Hoffman, G. E., & Roussos, P. (2020). dream: Powerful differential expression analysis for repeated measures designs. bioRxiv, 432567.

Wolfinger, R., & O'Connell, M. (1993). Generalized linear mixed models: a pseudo-likelihood approach. Journal of Statistical Computation and Simulation, 48(3-4), 233-243.

Comments
sbrlsi

This add-in will be of great value to JMP users.

PDunn

Hi

Yes, this looks very useful.  I tried the ship example and it works, but the last example with the RNA-seq data gives me a cryptic error, and it also gives me a somewhat similar error with my own RNA-seq data.  With the varPartDEdata_stack10.jmp data set, I followed the directions (and YouTube video) with Y=logexprs, By = Gene, Normal distribution, fixed effects of Sex and DiseaseSubtype and random effect of Individual.  I hit Run and it has an error:  Name Unresolved: Gene at row 1 in access or evaluation of 'Gene', Gene/*###*/.

Any suggestions?

Thanks, Peter

Hi @PDunn , I see your problem, and yes, it has happened to me sometimes; I'm trying to fix it. With the current version, can you try re-installing the add-in and see if the RNA-seq example works? Renaming the 'by' variable in the data table sometimes also works. 

PDunn
Hi
Thanks for getting back to me. I just "unregistered" the add-in, then re-installed it. Then I tried the add in again on the "varPartDEdata_stack10" file. The addin did not work even after I renamed the "by" variable to Gene1 (originally Gene) or copied it to a new column and renamed it. I also tried restarting the program (JMP 16EA) and it still had the problem.
So I'm not sure what could be wrong. The error says: Name Unresolved: Gene1 at row1 in access or evaluation of 'Gene1', Gene1/*###*/
If I leave out the "by" variable (Gene1), it will run though, but it is obviously valuable to know what genes are significant here.
Thanks, Peter

Hi Peter @PDunn , thanks for trying that! (Did you try re-installing it directly, without "unregistering" it first?) There's definitely something that needs to be fixed! The "By" function was added recently and it might need more debugging. I was using JMP 15.2 while writing the add-in, but I don't think the version matters. This is actually my summer internship project, and it might take me a while to fix the bug and update it since I now have limited access to JMP. I'll keep you updated once I have a solution! 

PDunn
Hi
Thanks for getting back to me again. Yes, I tried re-installing it and I still had the same error. I also tried it on JMP 14 on a mac. I was using Win10 before. Still the same issue. But after thinking about this more, and looking at the output (the online file GLMM result.jmp), I think I can still use this without the “By” option. For surveying a lot of genes at once, I actually want the per gene significance results which I do not get with the “by” command. On the other hand, you probably need the “by” command to get the volcano plots. At least for normally distributed response variables, I can keep using MixedModel. It would be nice to have this running though. I wonder if they have this running in the JMP Genomics package (I’ve never looked at it).
Thanks for your help again, and have a good rest of your summer. Peter

Hi Peter @PDunn , yes the volcano plot is a nice summary figure and the MixedModel does not directly generate that. Will update if this gets fixed! JMP Genomics has a nice RNA-seq analysis workflow and you can check on that if interested. -Meichen

Hi @PDunn , I've updated the add-in file here. Could you please download it again and try it now? I think this fix should solve the problem. Please let me know if you have more questions! -Meichen

PDunn
Hi Meichen
Thanks very much. It works great! Hopefully that was not too much extra work to fix.
Peter

When you say we have to install this Add-In, how does that work? My university has a site license for JMP, and we can either download a copy onto our main computer, or use it in what we call a Virtual Den via a Citrix environment. When my students use JMP in the classroom on their laptops, they mostly use it in the Virtual Den. In that case, would we need our network administrators to include the Add-In? Thanks.

Can fixed or random effects be nested? That's not clear from the construct model part of the interface shown in the videos. 

Hi @CraigSargent ,

For your first question, when you use JMP in a virtual lab/Den, as long as you've saved the add-in file to the local drive first, you should be able to install it there as well! I just tried it using my school's virtual lab and it works well. Please let me know if you or your students encounter any problems!

For the second question, our current addin version does not allow fitting the nested effects yet. We might add that function later!

Thank you. Where do I find the add-in file, to save it to my local drive?

On nested effects, I hope you add that feature soon. Most of my experimental designs in my research involve nested effects. I'll keep using GLIMMIX in SAS for that, until you add fitting nested effects to your JMP Add-In.

Thanks for tackling GLMMs in JMP. That's something I've been requesting for years.

Hi @CraigSargent , at the very top of this webpage, you could see our source files. Save the file "Generalized Linear Mixed Model.jmpaddin" to your local drive/folder, say Desktop. Then from the virtual den JMP, choose > File > Open > Desktop > Generalized Linear Mixed Model.jmpaddin. 

Findaddin.PNG

 

For the nested effects question, @sbrlsi  actually suggested that it's simple to avoid using it. We just need to create a column with unique ids within each level of the nesting variable.

Let's see a simple example here: https://www.jmp.com/support/help/en/15.2/index.shtml#page/jmp/two-factor-nested-random-effects-model...

You could access the "2 Factors Nested.jmp" example data by choosing Help> Sample Data > Measurement Systems > 2 Factors Nested.jmp. The data looks like this:2factornested.png

You could see that "Part" is nested in "Operator", and "Part" has unique ID within each level of "Operator".  

You could fit two models like the following, using either Part[Operator] or Part, and the results will be the same.

n1.png

 

n2.png

 

n3.png

 

Please let me know if you have questions about this!

Thanks, for clarifying how to get the add-in.

Thanks too for the workaround for nested effects.

Where I use nested effects are in designs like split plots, or long-term designs that involve experimental trials within and among years; e.g. Year is a random effect, and Trial nested within Year is a random effect.

 

 

Hi @CraigSargent , when Year is a random effect and Trial is nested within Year, we can do a similar thing. You could try the following two models:

Model 1: Y = fixedEffects + Year + Trial[Year]

Model 2: Y = fixedEffects + Year + new_variable

where "new_variable" is the unique IDs for each Year and Trial.

For example, suppose we have Year = 2018, 2019, and for each year we have Trial levels = A, B, C, D. The data may then look like:

Year     Trial     new_variable
2018     A         2018_A
2018     B         2018_B
2018     C         2018_C
2018     D         2018_D
2019     A         2019_A
2019     B         2019_B
2019     C         2019_C
2019     D         2019_D

Model 1 and Model 2 should give us the same result. Hence, by using Model 2, we don't need to specify "nested" effects anymore.
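If the data table is prepared outside JMP, the unique-ID column can be built as in this small Python sketch. The function `add_nested_id` is hypothetical; the column names follow the Year/Trial example above.

```python
def add_nested_id(rows, outer="Year", inner="Trial", new_name="new_variable"):
    """Add a column whose value is unique for each (outer, inner) combination."""
    return [{**row, new_name: f"{row[outer]}_{row[inner]}"} for row in rows]

# Two years with the same four trial labels reused in each year.
rows = [{"Year": year, "Trial": trial}
        for year in (2018, 2019)
        for trial in ("A", "B", "C", "D")]
with_ids = add_nested_id(rows)

# Trial alone has 4 levels; the new column has 8, one per Year-Trial pair.
n_trial_levels = len({r["Trial"] for r in with_ids})
n_nested_levels = len({r["new_variable"] for r in with_ids})
```

Because trial "A" in 2018 and trial "A" in 2019 now carry different labels, entering the new column as a plain random effect distinguishes them without nested-effect syntax.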

Please let me know if the above is not clear enough, I'd be happy to give more examples.

-Meichen

This is great, exactly what I was looking for, particularly with the workaround for nested effects.

I am having trouble running it though. The necessary options Distribution and Link are grayed out and cannot be clicked on to make a selection.

What am I doing wrong?

 

Thanks!

Is this maybe version-specific? I am still on Ver. 10 but I have access to 14 or 15 elsewhere, will it work on 14 or 15?

sbrlsi

Hi I_yampolsky,

 

It could be that you are in version 10. Yes, definitely it works in 15.

 

Thanks!

Sandeep123

Is there a way to run this for a categorical y variable (logistic regression)?



 

Hi @Sandeep123 , example 2 and example 6 are for the binary y variable. The default link function is logit. Would this be something you are looking for?

Sandeep123

I am interested in nominal outcome variable (mortality yes/no). I have data from multiple sites. Wanted to include sites as random variable in the model

 

JMP won’t allow random effects, so I am searching for options.

Hi @Sandeep123 , this is a good question! For the current version, we can do "recode" for the outcome variable, and perform the analysis (see figures below). We will update the addin to allow the nominal outcome soon.

MeichenDong_0-1617109983178.pngMeichenDong_1-1617110055105.png

 

Hi @Sandeep123 , I just updated the addin, please see the new added 'Example 2.1' part. This should allow you to work with nominal outcome without manually recoding the outcome to numeric format. Please let me know if you have questions! Thanks for your feedback!

juaccco123

I just downloaded the add-in and I am trying to cross variables for my fixed and random effects. However, when I select the variables and click on "cross", nothing happens. 

Could you help me?

 

Thank you

Hi @juaccco123 , did you try to select all variables of interest before clicking on cross? This works on my end. Are you using a mac or windows machine? What JMP version are you using?

I can also take a closer look if you have any toy dataset. 

-Meichen

juaccco123

Hi Meichen, thanks for the quick response. 

I am using JMP 15. I tried your suggestion on a PC (putting the variable before the model variables) and it is still not working. 

How can I share my dataset with you?

Hi @juaccco123 , my email is Meichen.Dong@jmp.com . Please feel free to send out detailed questions or toy datasets!

-Meichen

YAMAg

Hello, this is just what I've been looking for and it seems useful.
I would like to ask you a question as I am not sure how to perform the analysis.
I would like to do a GLMM in order to search for factors that are related to an outcome that is a binary variable. In doing so, my data is nested in individuals because it was collected multiple times from the same subject.
I added the outcome (binary variable, name) to Y, exploratory factor to Fixed effect, and ID (to identify the subject) to Random effect; I did not add anything to Weight, Offset, or By. However, I couldn't analyze it well because I got an error message of Columm/*##*/(*Cond Pred Fomula_pseudo*). I hope you can give me some good advice.

 

Hi @YAMAg , thanks for the question! I wonder if you have tried downloading the example data and see if the addin works?  Could you check if the variables' modeling types are correctly set? If the above steps do not help, please feel free to email your toy example/detailed questions to Meichen.Dong@jmp.com .

Best,

Meichen

MagaSganga

Hi @MeichenDong. I'm using your add-in, and I have a problem with By. I cannot make it work. When I select a column and put it in By, it throws this error:

 

Glimmix has 13 parameters (response, fixed_effects, random_effects, distribution, link, maxiter = 50, converge = 0.00000001, offset = 0, cf = 0.5, weight = 1, expon = 1, numder = 0.00001, ynom_reference) but 10 arguments were specified ("PL MIN", "Posicion unificada", "ID, Contrario", "Poisson", "", 30, 0.00000001, "", 0.5, "").
in access or evaluation of 'Function' , Function(
{response, fixed_effects, random_effects, distribution, link, maxiter = 50,
converge = 0.00000001, offset = 0, cf = 0.5, weight = 1, expon = 1, numder =
0.00001, ynom_reference},
{iter, rmse, prev, curr, err, dt},
dt = Current Data Table();

 

and the text continues. The column Cat that I put in By is numeric and ordinal. The program works when I leave this blank. 

I downloaded and installed this add-in 5 days ago, so I think I have the latest version. 

Thank you. 

Hi @MagaSganga ,

Have you tried with our example data and see if that works on your machine with your current JMP version? 

One thing I would suggest trying is to set the "By" variable to the character data type and nominal modeling type. If you still have problems running this, please feel free to send us more details or a toy example dataset. 

smmarsh2

I am beginning to learn GLIMMIX. However my mentor uses it frequently in SAS (not JMP). His comment for the plug in is:

 

In looking at their examples they have always used the Variance Components as the covariance matrix. That matrix for me has never given the lowest AICC. Hopefully there is a way to change the matrix used.

 

Are there other options for the covariance matrices?

 

Thank you!

Stephanie

 

Hi @smmarsh2 , thank you for the question! Currently this add-in does not provide options for the covariance matrices. However, there will be a new JMP GLMM platform in JMP 17 (the new release). The new platform will initially support G-side covariance matrices only, which in JMP means variance components and random coefficient models (with covariances). R-side covariance structures for repeated measures and/or spatial data are planned for a future release.

Please let me know if you have any questions.

Best,

Meichen

Hi @MagaSganga , I just figured out the problem you mentioned, and please try the updated addin! Thanks for your feedback!

smartxalex

Good evening,

 

I was hoping someone might be able to help me with an issue.  I cannot get this add-in to work for the life of me.  I can get it to work for the sample data provided above; however, I have some very simple data I would like to model.  I have a continuous dependent variable; one continuous independent variable, and one binary variable.  I want to test the effect the binary variable has on the continuous dependent variable but I keep getting an error message stating:

 

"all rows have data that is excluded or missing.  The column with the most missing data is (binary variable)"; however, there is no missing data.  All rows have data entered.

 

Upon clicking "Okay," I get another error message stating:

"Could not find column in access or evaluation of 'Column', Column/*###*/("Cond Pred Formula _pseudo")".

 

When looking at the generated table, both "_eta" and "_pseudo" have blank values.

 

I'm currently on JMP 15.  Any help would be greatly appreciated, as I would like to rely solely on JMP for my output rather than having to go back and forth between different software to meet my needs.  Thank you in advance!

 

Best regards,
Alex

Hi @smartxalex Alex, you mentioned that your outcome variable is continuous, so how did you specify the parameters in the addin? Specifically, what "Distribution" and "Link" were specified? If the outcome follows normal distribution, then you may try the MIXED model in JMP. Please feel free to email me about more details/toy examples of your question! My email is:   Meichen.Dong@jmp.com

 

Best,

Meichen

Ning214

Hello,

 

All the data samples run smoothly in my jmp pro 15. However, when I'm running my count data (Poisson distribution) here's what I am encountering:

 

Ning214_0-1645078662339.png

Hoping for your assistance.

 

Thank u very much.

 

 

Hi @Ning214 , I wonder what your variables' names are. It will be great if you can provide more details or send me a toy example. Also please feel free to email your details to Meichen.Dong@jmp.com

Hello Meichen,

 

I'm having troubles finding the report for the GLMM when I run it by a subgrouping factor. I get the graph builder and ANOVA table, but want to run post-hoc analyses. Typically I run those in the final report that is automatically created, is there a way for me to call that report? Or alternatively, run post-hoc analyses another way?

 

Thanks!

Hello @AvgBrontosaurus , thanks for your question! Did you mean that the GLMM report did not show up after specifying the parameters and variables? Could you run some of the examples we provided successfully? Would you like to share more details? Please feel free to send more questions to my email: Meichen.Dong@jmp.com ! 

Hey Meichen, I have tried one of the examples and I'm having the
same problem. After I hit run the only two windows that pop up are the
spreadsheet with all of the ANOVA table values and LSMs and the graph
builder. When I don't include a "by" factor the full report pops up
automatically.

Hi @AvgBrontosaurus ,

This might be related to the parameter “By Group Report Limit” – for example, if your BY variable has 20 levels and you want to see all the model details, you can set this value to 21 or greater. This parameter is designed to avoid outputting an overwhelming number of results: say when you have 10k levels, you probably don’t want to look at the results one by one.

MeichenDong_0-1654184496580.png 

Hello Meichen,

 

I am new to using the GLMM add-in . My questions are:

1. Is there a "correct" max number of iterations that can be specified for a particular model or is it a trial and error kind of approach? I tried running the same model with the default max number (30) and then with  100 iterations and that yielded drastically different results. At 30 iterations, R2 was close to zero and none of the variables had significant p-values (effect test section); at 100 iteration - model R2 was 0.7, and an explanatory variable of interest was significant at alpha 0.5.

2. When I choose a gamma distribution, the LS means and all the estimates seem to be on a some sort of transformed scale and not on the original data scale. Do I need to back-transform the LS means, and which transformation should I use?

Thank you!

Hi @MANCOVAChicken1 , Thanks for your questions!

For question 1, ideally, if the model fits the data well, convergence (1e-8) is achieved before the 30th iteration. If it is not, you can increase the number of iterations or loosen the convergence criterion (for example, change it to 1e-7) to see whether convergence can be achieved. In your case, 30 iterations seem not to be enough, so increasing the number of iterations is the right way to go. I suggest increasing it further, to a larger value like 200, and checking whether the results change compared with 100. Once convergence is achieved, further increasing the number of iterations will give you the same result.

For question 2, when you choose the Gamma distribution, the canonical 'reciprocal' (or 'inverse') link is used. That means your expected value E(Y_i) = mu_i is modeled as: 1/(mu_i) = b0+b1*x_1i + ... Therefore, if you want to derive the LS means on the original data scale, a back-transformation is needed.

To derive mu_i, take the reciprocal of the predicted values: 1/(b0+b1*x_1i + ... ).
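As a small worked illustration of that back-transformation, the Python sketch below takes a value on the reciprocal-link scale and returns the corresponding mean on the original data scale. The function name and the coefficient values are made up for the example.

```python
def gamma_mean_from_linear_predictor(eta):
    """Back-transform a reciprocal-link linear predictor to the mean scale."""
    if eta == 0:
        # The reciprocal link is undefined at zero.
        raise ValueError("linear predictor must be nonzero for the reciprocal link")
    return 1.0 / eta

# Hypothetical coefficients b0, b1 and covariate value x.
b0, b1, x = 0.05, 0.01, 20.0
eta = b0 + b1 * x                          # linear predictor: 0.25
mu = gamma_mean_from_linear_predictor(eta)  # mean on the original scale: 4.0
```

The same reciprocal applies to an LS mean, assuming it is reported on the link scale.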

Please let me know if you have further questions! I can always be reached at Meichen.Dong@jmp.com

Best,

Meichen

kandonov

Dear Meichen,

 

Thank you for this great add-in!

 

However, I have a question. Can one perform a multivariate GLMM using this add-in? Perhaps if I add a few dependent variables in the "Y" box it will work, won't it?

 

Cheers,

K/

Hi @kandonov ,

Thanks for the question. If you have multiple dependent variables, for example {y1, y2, y3}, specifying them all in the Y box would not work unfortunately. For this addin, the only case to put multiple variables in Y box currently is our example 5, which shows the binomial case where you are using a second variable to specify n. 

However, there are some workarounds I would suggest you to try.

First, you could stack the data (stack y1, y2, y3 into one variable y) and run the current add-in or 'Fit Model' with the 'Mixed Model' personality. Create a new variable that indicates which response each stacked value came from (for example, its values could be the variable names 'y1', 'y2', 'y3'), and specify it as a fixed effect along with all the others, perhaps with some interactions with it. Then specify 'subject' as a random effect. This sets up a compound symmetry covariance structure, which is not fully multivariate but is a simpler approximation to it.

Second, if the data are Poisson or Binomial it may also work to center and scale it, and use a normal approximation with a linear mixed model. Please see examples for 'Fit Model' with 'Mixed Model' personality.

Please let me know if you have more questions! I can always be reached at Meichen.Dong@jmp.com.

Best,

Meichen

You mentioned in a previous response that the option of additional covariance structures will be available in the upcoming JMP 17. Is this message true for JMP 17 or for JMP 17 Pro only?

Hi @winfriedkoch0 ,

I believe the GLMM platform will be available in JMP PRO 17.

pwrege

I study forest elephant ecology and behavior in central Africa, using acoustic monitoring. I'm interested in looking at the effect of logging activity on the probability that an elephant call occurs during the night rather than during the day, because other studies have suggested that increased night calling reflects a response to human disturbance (mostly poaching). The data are successes/trials, determined from recordings at 9 different sites at differing distance from active logging (= closer than a few kilometers). 

I wanted to use the GLMM add-in because of the output of marginal means (no random effects in this simple analysis), but the parameter estimates and significance as predictors are hugely different from the GLMM vs GLM and I am interested in why this should be.

Below are screenshots of the output from the two personalities:

GLMM - events/trials - binomial - logit

Summary of Fit

RSquare                       0.004847
RSquare Adj                   0.002292
Root Mean Square Error        0.939466
Mean of Response              -0.0881
Observations (or Sum Wgts)    194.2821

Analysis of Variance

Source      DF     Sum of Squares    Mean Square    F Ratio
Model       2      3.34890           1.67445        1.8972
Error       779    687.54205         0.88260        Prob > F
C. Total    781    690.89095                        0.1507

Parameter Estimates

Term               Estimate     Std Error    t Ratio    Prob>|t|
Intercept          0.0484165    0.097423     0.50       0.6193
fiscalYr[1]        -0.084323    0.078483     -1.07      0.2830
logexpos2[1yes]    0.1389546    0.091878     1.51       0.1308

GLM - events/trials - binomial - logit

Whole Model Test

Model         -LogLikelihood    L-R ChiSquare    DF    Prob>ChiSq
Difference    247.258921        494.5178         2     <.0001*
Full          9444.02167
Reduced       9691.28059

Goodness Of Fit Statistic    ChiSquare    DF     Prob>ChiSq
Pearson                      14001.87     779    <.0001*
Deviance                     18426.55     779    <.0001*

Effect Tests

Source       DF    L-R ChiSquare    Prob>ChiSq
fiscalYr     1     62.133979        <.0001*
logexpos2    1     413.68264        <.0001*

Parameter Estimates

Term               Estimate     Std Error    L-R ChiSquare    Prob>ChiSq    Lower CL     Upper CL
Intercept          0.291917     0.0233196    162.24916        <.0001*       0.2463549    0.3377743
fiscalYr[1]        -0.137987    0.0175207    62.133979        <.0001*       -0.172342    -0.103659
logexpos2[1yes]    0.4319415    0.0217689    413.68264        <.0001*       0.3894144    0.4747556

Thanks for any input