2019-11-21
dates of symptom onset
contact data: exposure (who infected you?) and contact tracing (who could you have infected?)
dates of exposure / infection
dates of outcome: death / recovery
metadata on patients: age, gender, location, occupation, etc.
data from past outbreaks
Disease-dependent, but generally includes:
Source: Ebola response epicell weekly presentation, Goma (DRC), 19 June 2019
Source: Ebola response epicell weekly presentation, Goma (DRC), 29 May 2019
Definition (incubation period): time interval between the date of infection and the date of symptom onset.
Definition (serial interval): time interval between onset of symptoms in primary and secondary cases.
Definition (generation time): time interval between dates of infection in primary and secondary cases.
From empirical distribution (data) to estimated distribution.
choose type of distribution (e.g. normal, Poisson, Gamma)
find the parameters \(\theta_x\) which maximise \(p(x | \theta_x)\), i.e. the likelihood
visually: best fit between bars (data) and curve (distribution)
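The fitting steps above can be sketched in a few lines. This is a minimal illustration, assuming a Gamma distribution and a crude grid search in place of a proper optimiser; the delay values and the function names (`gamma_loglik`, `fit_gamma_grid`) are hypothetical and not taken from the lecture.

```python
import math

def gamma_loglik(data, shape, scale):
    """Log-likelihood of i.i.d. positive data under a Gamma(shape, scale) density."""
    return sum(
        (shape - 1) * math.log(x) - x / scale
        - shape * math.log(scale) - math.lgamma(shape)
        for x in data
    )

def fit_gamma_grid(data, shapes, scales):
    """Return the (shape, scale) pair on the grid with the highest log-likelihood."""
    return max(
        ((k, s) for k in shapes for s in scales),
        key=lambda p: gamma_loglik(data, p[0], p[1]),
    )

# Hypothetical delays in days (illustrative values, not real outbreak data)
delays = [2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 8.0, 9.0, 12.0]

# Assumed search grid: 0.1, 0.2, ..., 10.0 for both parameters
grid = [0.1 * i for i in range(1, 101)]
shape_hat, scale_hat = fit_gamma_grid(delays, grid, grid)
print(shape_hat, scale_hat, shape_hat * scale_hat)  # last value: fitted mean
```

For a Gamma distribution the fitted mean (shape times scale) should land close to the sample mean, which gives a quick sanity check on the fit.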
A relative measure of fit between data and model
Using continuous distributions to model discrete variables:
flexible distribution (many possible shapes)
2 parameters: shape, scale
alternatively: mean, coefficient of variation
typical choice for delay distributions
needs to be discretised
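As an illustration of the two parametrisations and of discretisation, the sketch below converts a mean and coefficient of variation to shape and scale, then assigns to each whole day \(k\) the probability mass between \(k\) and \(k + 1\). That discretisation convention is an assumption (others exist), and the function names and parameter values are hypothetical.

```python
import math

def shape_scale_from_mean_cv(mean, cv):
    """Convert (mean, coefficient of variation) to Gamma (shape, scale)."""
    shape = 1.0 / cv ** 2
    scale = mean * cv ** 2
    return shape, scale

def gamma_cdf(x, shape, scale):
    """Gamma CDF, using the series expansion of the lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

def discretise_gamma(mean, cv, max_delay):
    """Probability of a delay of k days (k = 0, ..., max_delay - 1), renormalised."""
    shape, scale = shape_scale_from_mean_cv(mean, cv)
    masses = [
        gamma_cdf(k + 1, shape, scale) - gamma_cdf(k, shape, scale)
        for k in range(max_delay)
    ]
    total = sum(masses)  # renormalise to account for truncation at max_delay
    return [m / total for m in masses]

# Hypothetical delay distribution: mean 5 days, coefficient of variation 0.5
pmf = discretise_gamma(mean=5.0, cv=0.5, max_delay=30)
print([round(p, 3) for p in pmf[:10]])
```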
Definition: the proportion of cases who die of the infection.
where \(Q\) is a Normal quantile (e.g. \(1.96\) for \(\alpha=0.05\))
\[ CI_{95\%} = 0.6 \pm 1.96 \times 0.158 = [0.30 ; 0.90] \]
\[ CI_{95\%} = 0.6 \pm 1.96 \times 0.05 = [0.50 ; 0.70] \]
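The two intervals above show how the confidence interval narrows as the sample grows. A minimal sketch, assuming the usual normal approximation for a proportion (standard error \(\sqrt{p(1-p)/n}\)) and hypothetical counts of 6 deaths among 10 known outcomes and 60 among 100; with these assumed counts the first standard error comes out near 0.155.

```python
import math

def cfr_confidence_interval(deaths, known_outcomes, q=1.96):
    """Normal-approximation CI for a proportion: p +/- q * sqrt(p * (1 - p) / n)."""
    p = deaths / known_outcomes
    se = math.sqrt(p * (1 - p) / known_outcomes)
    return p, se, (p - q * se, p + q * se)

# Hypothetical counts chosen so that the estimated CFR is 0.6 in both cases
p_small, se_small, ci_small = cfr_confidence_interval(6, 10)
p_large, se_large, ci_large = cfr_confidence_interval(60, 100)

print(p_small, round(se_small, 3), [round(b, 2) for b in ci_small])
print(p_large, round(se_large, 3), [round(b, 2) for b in ci_large])
```

The normal approximation is rough for small samples; it is used here only because it matches the form of the intervals shown.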
"case fatality rate": this is a proportion, not a rate
computation using the wrong denominator, i.e. including cases with unknown outcome (\(D\): deaths, \(R\): recoveries, \(U\): unknown outcomes):
\[ \frac{D}{D + R + U} \]
(leads to underestimating the CFR)
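A short numerical illustration of the denominator problem, using hypothetical counts. The sketch assumes the appropriate estimate divides deaths by cases with a known outcome only (deaths plus recoveries); function names are illustrative.

```python
def cfr(deaths, recoveries):
    """CFR among cases with a known outcome."""
    return deaths / (deaths + recoveries)

def naive_cfr(deaths, recoveries, unknown):
    """CFR computed with the wrong denominator (includes unknown outcomes)."""
    return deaths / (deaths + recoveries + unknown)

# Hypothetical counts early in an outbreak, when many outcomes are still unknown
deaths, recoveries, unknown = 30, 20, 50

print(cfr(deaths, recoveries))                 # 0.6
print(naive_cfr(deaths, recoveries, unknown))  # 0.3
```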
Definition: the incidence is the number of new cases in a given time period.
relies on dates, typically of onset of symptoms
only daily incidence is non-ambiguous
other definitions (e.g. weekly) rely on a starting date
prone to reporting delays
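The points above can be illustrated with a small sketch that counts cases by date of onset. The onset dates are hypothetical, and the two functions are illustrative helpers rather than an existing package API. Note how the weekly counts change with the chosen starting date, while the daily counts do not.

```python
from collections import Counter
from datetime import date, timedelta

def daily_incidence(onsets):
    """Number of new cases per day, including days with zero cases."""
    counts = Counter(onsets)
    first, last = min(onsets), max(onsets)
    n_days = (last - first).days + 1
    days = [first + timedelta(days=i) for i in range(n_days)]
    return {d: counts.get(d, 0) for d in days}

def weekly_incidence(onsets, start):
    """Number of new cases per 7-day period, with periods counted from `start`."""
    counts = Counter((d - start).days // 7 for d in onsets)
    return {start + timedelta(weeks=w): n for w, n in sorted(counts.items())}

# Hypothetical dates of symptom onset
onsets = [
    date(2019, 6, 3), date(2019, 6, 3), date(2019, 6, 5), date(2019, 6, 8),
    date(2019, 6, 9), date(2019, 6, 10), date(2019, 6, 10), date(2019, 6, 12),
]

print(daily_incidence(onsets))
print(weekly_incidence(onsets, start=date(2019, 6, 3)))  # weeks starting on a Monday
print(weekly_incidence(onsets, start=date(2019, 6, 1)))  # weeks starting on a Saturday
```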
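\(\log(y) = r \times t + b + \epsilon\:\:\) so that \(\:\:\hat{y} = e^{r \times t + b}\)
with: \(y\): incidence; \(t\): time (in days); \(r\): growth rate; \(b\): intercept; \(\epsilon\): residual error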
Let \(T\) be the time taken by the incidence to double, given a daily growth rate \(r\).
\[ y_2 / y_1 = 2 \:\: \Leftrightarrow e^{rt_2 + b} / e^{rt_1 + b} = 2 \]
\[ \Leftrightarrow e^{r(t_2 - t_1)} = 2 \Leftrightarrow T = t_2 - t_1 = \log(2) / r \]
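The log-linear model and the doubling time can be sketched as an ordinary least-squares fit of log-incidence against time. The incidence values are hypothetical, and days with zero cases are dropped before taking logs, which is a simplifying assumption of this sketch.

```python
import math

def fit_log_linear(days, incidence):
    """Least-squares fit of log(y) = r * t + b; returns (r, b)."""
    pairs = [(t, math.log(y)) for t, y in zip(days, incidence) if y > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_log_y = sum(ly for _, ly in pairs) / n
    r = (
        sum((t - mean_t) * (ly - mean_log_y) for t, ly in pairs)
        / sum((t - mean_t) ** 2 for t, _ in pairs)
    )
    b = mean_log_y - r * mean_t
    return r, b

def doubling_time(r):
    """Time for the incidence to double, T = log(2) / r (requires r > 0)."""
    return math.log(2) / r

# Hypothetical daily incidence during the growth phase of an outbreak
days = list(range(10))
incidence = [1, 2, 2, 3, 4, 4, 6, 8, 10, 13]

r, b = fit_log_linear(days, incidence)
print(round(r, 3), round(doubling_time(r), 1))
```

The same formula with a negative \(r\) would give a halving time instead, up to the sign.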
Pros:
Cons:
Serial interval: time interval between onset of symptoms of primary and secondary cases.
\[ \lambda_t = R_0 \times \sum_i w(t - t_i) \]
with: \(\lambda_t\): global force of infection; \(w()\): serial interval distribution; \(t_i\): date of symptom onset of case \(i\)
Treat incidence \(y_t\) on day \(t\) as Poisson-distributed with rate \(\lambda_t\):
\[ p(y_t | R_0, y_1, \ldots, y_{t-1}) = e^{-\lambda_t} \frac{\lambda_t^{y_t}}{y_t!} \] with (slight rewriting): \(\lambda_t = R_0 \times \sum_{s = 1}^{t-1} y_s w(t - s)\)
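The Poisson likelihood above can be evaluated directly and maximised over \(R_0\). A minimal sketch, assuming a hypothetical discretised serial interval, hypothetical daily incidence, and a simple grid search over candidate values of \(R_0\); days are indexed from 0 in the code, and the first day is skipped because it has no earlier cases.

```python
import math

def force_of_infection(incidence, w, t, r0):
    """lambda_t = R0 * sum over earlier days s of y_s * w(t - s)."""
    return r0 * sum(incidence[s] * w.get(t - s, 0.0) for s in range(t))

def log_likelihood(r0, incidence, w):
    """Sum of Poisson log-probabilities of y_t given lambda_t."""
    ll = 0.0
    for t in range(1, len(incidence)):
        lam = force_of_infection(incidence, w, t, r0)
        if lam <= 0:
            if incidence[t] > 0:
                return -math.inf  # cases observed with no infectious pressure
            continue
        y = incidence[t]
        ll += -lam + y * math.log(lam) - math.lgamma(y + 1)
    return ll

# Hypothetical serial interval distribution: P(delay = 1, 2, 3 days)
w = {1: 0.2, 2: 0.5, 3: 0.3}
# Hypothetical daily incidence
incidence = [1, 0, 1, 2, 2, 3, 5, 6, 8, 11]

# Assumed grid of candidate R0 values: 0.01, 0.02, ..., 5.00
grid = [0.01 * i for i in range(1, 501)]
r0_hat = max(grid, key=lambda r0: log_likelihood(r0, incidence, w))
print(round(r0_hat, 2))
```

Because the log-likelihood is a smooth function of a single parameter, a grid search is enough for illustration; an optimiser or a Bayesian approach would be the more usual choice in practice.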
Pros:
Cons: