Heuristic Perspectives on Parametric Survival Analysis (2020-US-30MP-567)

Thor Osborn, Principal Systems Research Analyst, Sandia National Laboratories

 

Parametric survival analysis is often used to characterize the probabilistic transitions of entities (people, plants, products, etc.) between clearly defined categorical states of being. Such analyses model duration-dependent processes as compact, continuous distributions, with corresponding transition probabilities for individual entities as functions of duration and effect variables. The most appropriate survival distribution for a data set is often unclear, however, because the underlying physical processes are poorly understood. In such cases a collection of common parametric survival distributions may be tried (e.g., the Lognormal, Weibull, Fréchet and Loglogistic distributions) to identify the one that best fits the data. Applying a diverse set of options improves the likelihood of finding a model of adequate quality for many practical purposes, but this approach offers little insight into the processes governing the transition of interest. Each of the commonly used survival distributions is founded on a differentiating structural theme that may offer valuable perspective in framing appropriate questions and hypotheses for deeper investigation. This paper clarifies the fundamental mechanisms behind each of the more commonly used survival distributions, considering the heuristic value of each mechanism in relation to process inquiry and comprehension.
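
As a minimal illustration of what a differentiating structural theme can look like, the sketch below contrasts two generating mechanisms: adding many small independent effects, which tends toward a normal distribution, and multiplying many small independent positive factors, which tends toward a lognormal distribution. This is not the presenter's JSL script. The record count, number of effects, fluctuation ranges, random seed, and the helper name `skewness` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: near 0 for a symmetric sample, > 0 for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# All settings below are illustrative assumptions, not values from the talk.
rng = random.Random(2020)
N_RECORDS = 5000   # simulated entities
N_EFFECTS = 50     # independent small effects acting on each entity

# Additive mechanism: each record is the sum of many small uncorrelated effects.
sums = [
    sum(rng.uniform(-1.0, 1.0) for _ in range(N_EFFECTS))
    for _ in range(N_RECORDS)
]

# Multiplicative mechanism: each record is the product of many small
# positive fluctuations around 1.
products = [
    math.prod(1.0 + rng.uniform(-0.1, 0.1) for _ in range(N_EFFECTS))
    for _ in range(N_RECORDS)
]
log_products = [math.log(p) for p in products]

print(f"skewness of sums:          {skewness(sums):+.3f}")
print(f"skewness of products:      {skewness(products):+.3f}")
print(f"skewness of log(products): {skewness(log_products):+.3f}")
```

Under these assumptions the sums and the logged products should come out close to symmetric, while the raw products are right-skewed. That contrast is the kind of signature that can help separate an additive process from a multiplicative one before any distribution is fitted.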

 

 

Auto-generated transcript...

 


Speaker

Transcript

  Hello, and welcome to my
00 14.633
  3
  over the past 25 years, I
  have performed many studies and
00 31.533
  7
  share with you a way of thinking
  about the distributions we
00 49.366
  11
  motivated by precedent, ease of
  use, or empirically demonstrated
00 05.666
  15
  about its processes. Further,
  when an excellent model fit is
00 20.666
  19
  genesis of the distributions
  commonly used in parametric
00 36.366
  23
  seen in the workplace as well as
  in the academic literature.
00 51.400
  27
  literature, including textbooks
  and web-based articles, as well
00 07.166
  31
  reexamination that may fail to
  glean full value from the work.
00 21.633
  35
  the exponential.
  Much is often made about the
00 39.066
  39
  because they model fundamentally
  different system archetypes. In
00 56.200
  43
  distribution does, in fact, fit
  the lognormal data very well.
  The quality of the fit may also
00 32.066
  48
  fits much better. And secondly,
  there's only a modest coincident
00 55.333
  52
  the core process mechanisms
  these distributions represent
00 11.600
  56
  analysis, but it provides a very
  familiar starting point for
00 27.133
  60
  uncorrelated effects. Let's see
  if that is true.
  In order to create a good
00 52.733
  65
  25,000. For the individual
  records, we'll use the random
00 29.400
  06.333
  70
  71
  see that we did indeed obtain
  the normal distribution.
  Now let's consider the
00 14.400
  76
  not able to imprint my brain
  with a sufficient knowledge of
00 34.300
  80
  lognormal distribution are also
  very simple. As you can see, the
00 50.233
  84
  this demonstration, we reuse the
  fluctuation data that were
00 05.766
  88
  JSL scripting because I find it
  much more convenient for
00 32.566
  92
  the number of records in each
  sample. Next, it extracts the
00 53.200
  96
  products.
  The outer loop tracks the
00 17.133
  101
  on the previous slide. The
  amplified product compensates
00 33.700
  105
  distributions may be considered
  as generated secondarily from
00 18.400
  110
  many similar internal processes
  is represented by its maximum
00 35.000
  114
  to be Fréchet distributed. The
  Weibull distribution represents
00 50.466
  118
  processes that complete when any
  of multiple elements have
00 08.766
  122
  using the Pareto distribution
  as the source. In this case, the
00 27.600
  126
  absolute value of the normal
  distribution as the source.
  Now let's have a quick look at
00 58.666
  131
  maximum is used.
  For the square root of the
00 50.766
  136
  is not available, you can also
  see that the other common
00 33.233
  46.033
  141
  value of the normal distribution
  quite well.
  Incidentally, Weibull
00 28.066
  146
  distribution when its core
  behavior is substantially
00 43.600
  150
  the four heme-containing
  subunits mechanically interact
00 59.166
  154
  up to now have all relied on
  independent samples. Professor
00 15.766
  158
  extended to produce
  autocorrelated data. Generation of
00 32.100
  162
  sequence autocorrelation is
  about 0.75, yet the
00 59.033
  02.300
  167
  the common survival
  distributions. You can see that
00 26.400
  171
  good example of the relationship
  between real-world analytical
00 42.000
  175
  commingle single-family
  residences with heavy industry.
00 55.266
  179
  have similar features. The
  landowner must apply to the
00 09.000
  183
  an opportunity to comment. Local
  officials then weigh the
00 22.433
  187
  parties. This example is not
  approached as a demonstration
00 36.633
  191
  processing time is 140 days. The
  fit is obviously imperfect, but
00 52.733
  195
  distributed data results from
  processes yielding the combined
00 08.400
  199
  ubiquitous, but the loglogistic
  is less frequently used. Without
00 24.466
  203
  multistep process may be
  insufficient to impart log
00 38.200
  207
  considered and the complexity
  of the underlying process should
00 53.166
  211
  whether a process is
  substantially impacted by
00 05.566
  215
  whether the cooperative element
  is connoted by positive terms such
00 22.733
  219
  often been said, I would
  sincerely appreciate your
00 35.033