Hi all!
I've been looking into proportional hazards (PH) modelling and I have some questions on how to properly interpret the results.
To demonstrate, I modified the VA Lung Cancer dataset to help make my point.
My first question is a rather general one and concerns the PH model assumptions. Here is a nice paper reviewing the issues in published papers that use PH models; the authors also list some strategies for dealing with violations. There was an older thread on this. Is anyone aware of any developments since then? According to some tutorials I found, the quickest way to check for proportionality is to make sure the Kaplan-Meier curves do not cross, or to plot the data on a log-log scale and confirm visually that the lines are parallel. This is easily done via Life Distribution (the first script in the data table).
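To spell out what the log-log check computes, here is a minimal pure-Python sketch (not JSL, and not necessarily what the Life Distribution script does internally). It builds the Kaplan-Meier estimate per group and transforms it to log(-log S(t)) against log(t); under proportional hazards the two sets of points should be roughly a constant vertical distance apart. The toy numbers and the function names are my own assumptions, not values from the attached table.

```python
import math
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths and censorings both leave the risk set
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

def log_minus_log(curve):
    """Map KM points to (log t, log(-log S)); points with S of 0 or 1 are skipped."""
    return [(math.log(t), math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Hypothetical toy data: days on study, 1 = death, 0 = censored.
group_a = ([5, 8, 12, 12, 20, 25], [1, 1, 1, 0, 1, 0])
group_b = ([10, 15, 22, 30, 30, 41], [1, 1, 0, 1, 1, 0])

km_a = kaplan_meier(*group_a)
km_b = kaplan_meier(*group_b)
for name, km in (("A", km_a), ("B", km_b)):
    print(name, [(round(x, 2), round(y, 2)) for x, y in log_minus_log(km)])
```

With groups of 6-12 animals each curve has only a handful of steps, so "parallel" is a rough visual judgment here rather than a formal test.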
So, basically, for this dataset, this would be the place to stop. (Let's not mind the KPS variable for this; I kept it just in case somebody wants to use a continuous variable in an answer.) Can we add a time-dependent covariate in JMP to adjust for non-proportionality somehow?
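To make the time-dependent idea concrete: outside JMP, one common approach is to split each animal's follow-up at a chosen cut point into (start, stop] rows, so that a group-by-period term can carry a different hazard ratio before and after the cut. Below is a minimal Python sketch of the data restructuring only, with no model fitting. The records, the cut at day 14, and the column names are hypothetical, and I am not assuming that JMP accepts this layout.

```python
def split_at(records, cut):
    """Split (id, time, event, group) records at `cut` into (start, stop] rows.

    Each row gets a `late` flag (1 after the cut) and a `group_late`
    interaction column that a PH model could use as a step-function effect.
    """
    rows = []
    for sid, time, event, group in records:
        if time <= cut:
            pieces = [(0, time, event, 0)]
        else:
            # The early piece ends without an event; any event falls in the late piece.
            pieces = [(0, cut, 0, 0), (cut, time, event, 1)]
        for start, stop, ev, late in pieces:
            rows.append({"id": sid, "start": start, "stop": stop, "event": ev,
                         "group": group, "late": late, "group_late": group * late})
    return rows

# Hypothetical animals: (id, days on study, 1 = death / 0 = censored, group 0/1)
animals = [(1, 8, 1, 0), (2, 30, 0, 0), (3, 12, 1, 1), (4, 25, 1, 1)]
rows = split_at(animals, cut=14)
for r in rows:
    print(r)
```

Splitting changes neither the number of events nor the total follow-up time; it only lets the effect of `group` differ between the two periods.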
My second question, which comes up in the studies we run, concerns different treatment regimens (for instance, Drug A is given on dose schedules 1 and 2, and Drug B on schedules 3, 4 and 5). We want to compare both the drugs and the treatment schedules. Let's treat them as categories for now, even though we might also compare the dosings within a drug and would then treat the dose as a continuous variable nested within the drug (any concerns here?).
To illustrate, I separated :Cell Type into :Cancer type, just to get some level of categorical nesting. Would it be valid to run the model this way?

If the answer is yes, we still want to know what the differences between the sublevels are, because the numerical HRs we get for the categories do not make much sense if we have two drugs with different schedules. Sure, the schedules were selected based on, let's call it, "best practice", but that is completely arbitrary.
Can we, as a next step, run a PH model with a Local Data Filter that selects only the non-small cell lung cancer rows?

That way we would get HR values that are actually of interest (or the "per unit change in regressor" for continuous data).

I hope the assumptions make sense and that somebody can help with some insights!
Best wishes,
Konstantin
PS: Our survival data come from murine in vivo studies. Usual group sizes are 6-12 animals. We are using JMP 18.
Linking @Jonas_Rinne as well