Thor Osborn, Principal Systems Research Analyst, Sandia National Laboratories

Parametric survival analysis is often used to characterize the probabilistic transitions of entities (people, plants, products, etc.) between clearly defined categorical states of being. Such analyses model duration-dependent processes as compact, continuous distributions, with corresponding transition probabilities for individual entities as functions of duration and effect variables. The most appropriate survival distribution for a data set is often unclear, however, because the underlying physical processes are poorly understood. In such cases a collection of common parametric survival distributions may be tried (e.g., the Lognormal, Weibull, Fréchet and Loglogistic distributions) to identify the one that best fits the data. Applying a diverse set of options improves the likelihood of finding a model of adequate quality for many practical purposes, but this approach offers little insight into the processes governing the transition of interest. Each of the commonly used survival distributions is founded on a differentiating structural theme that may offer valuable perspective in framing appropriate questions and hypotheses for deeper investigation. This paper clarifies the fundamental mechanisms behind each of the more commonly used survival distributions, considering the heuristic value of each mechanism in relation to process inquiry and comprehension.
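The structural themes named in the talk (sums of uncorrelated effects for the normal, products for the lognormal, the minimum of competing elements for the Weibull, and the maximum of many similar elements for the Fréchet) can be explored with a small simulation. The sketch below is not the presenter's JSL script; it is a minimal Python illustration using only the standard library. The function names `simulate` and `skewness`, the record and element counts, the ±10% fluctuation range, and the Pareto shape of 2 are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate(n_records=2000, n_elements=50, seed=1):
    """Build one sample per mechanism; all sizes and sources are illustrative."""
    rng = random.Random(seed)
    sums, products, minima, maxima = [], [], [], []
    for _ in range(n_records):
        # Many small, independent fluctuations around 1 (assumed +/-10%).
        effects = [1.0 + rng.uniform(-0.1, 0.1) for _ in range(n_elements)]
        sums.append(sum(effects))            # additive effects -> near normal
        products.append(math.prod(effects))  # multiplicative effects -> near lognormal
        # Process completes when ANY element completes: take the minimum.
        # Source: absolute value of a standard normal -> Weibull-like.
        minima.append(min(abs(rng.gauss(0.0, 1.0)) for _ in range(n_elements)))
        # Process represented by its largest element: take the maximum.
        # Source: Pareto with an assumed shape of 2 -> Frechet-like.
        maxima.append(max(rng.paretovariate(2.0) for _ in range(n_elements)))
    return {"sum": sums, "product": products, "minimum": minima, "maximum": maxima}


if __name__ == "__main__":
    samples = simulate()
    for name, values in samples.items():
        print(f"{name:8s} skewness = {skewness(values):6.2f}")
    logs = [math.log(v) for v in samples["product"]]
    print(f"log(product) skewness = {skewness(logs):6.2f}")
```

Under these assumptions the sums should look roughly symmetric, the products right-skewed but roughly symmetric after taking logarithms, and the maxima strongly right-skewed. Fitting the candidate survival distributions to each sample would be the next step, and is left to the reader's statistics package of choice.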

Auto-generated transcript...


Hello, and welcome to my ...
... over the past 25 years, I have performed many studies and ...
... share with you a way of thinking about the distributions we ...
... motivated by precedent, ease of use, or empirically demonstrated ...
... about its processes. Further, when an excellent model fit is ...
... genesis of the distributions commonly used in parametric ...
... seen in the workplace as well as in the academic literature.
... literature, including textbooks and web-based articles, as well ...
... reexamination that may fail to glean full value from the work.
... the exponential. Much is often made about the ...
... because they model fundamentally different system archetypes. In ...
... distribution does, in fact, fit the lognormal data very well. The quality of the fit may also ...
... fits much better. And secondly, there's only a modest coincident ...
... the core process mechanisms these distributions represent ...
... analysis, but it provides a very familiar starting point for ...
... uncorrelated effects. Let's see if that is true. In order to create a good ...
... 25,000. For the individual records, we'll use the random ...
... see that we did indeed obtain the normal distribution. Now let's consider the ...
... not able to imprint my brain with a sufficient knowledge of ...
... lognormal distribution are also very simple. As you can see, the ...
... this demonstration, we reuse the fluctuation data that were ...
... JSL scripting because I find it much more convenient for ...
... the number of records in each sample. Next, it extracts the ...
... products. The outer loop tracks the ...
... on the previous slide. The amplified product compensates ...
... distributions may be considered as generated secondarily from ...
... many similar internal processes is represented by its maximum ...
... to be Fréchet distributed. The Weibull distribution represents ...
... processes that complete when any of multiple elements have ...
... using the Pareto distribution as the source. In this case, the ...
... absolute value of the normal distribution as the source. Now let's have a quick look at ...
... maximum is used. For the square root of the ...
... is not available, you can also see that the other common ...
... value of the normal distribution quite well. Incidentally, Weibull ...
... distribution when its core behavior is substantially ...
... the four heme-containing subunits mechanically interact ...
... up to now have all relied on independent samples. Professor ...
... extended to produce autocorrelated data. Generation of ...
... sequence autocorrelation is about 0.75, yet the ...
... the common survival distributions. You can see that ...
... good example of the relationship between real-world analytical ...
... commingle single-family residences with heavy industry.
... have similar features. The landowner must apply to the ...
... an opportunity to comment. Local officials then weigh the ...
... parties. This example is not approached as a demonstration ...
... processing time is 140 days. The fit is obviously imperfect, but ...
... distributed data results from processes yielding the combined ...
... ubiquitous, but the loglogistic is less frequently used. Without ...
... multistep process may be insufficient to impart log ...
... considered and the complexity of the underlying process should ...
... whether a process is substantially impacted by ...
... whether the cooperative element is connoted by positive terms such ...
... often been said, I would sincerely appreciate your ...
Published on 05-21-2024 05:33 PM | Updated on 07-07-2025 12:03 PM


