Extension: Unbounded Response

9. Extension: Unbounded Response

Basic IRT methods assume the response sits inside a bounded interval. Several AV safety metrics, however, are unbounded. Two adaptations show up in practice: the log-normal IRT and Poisson IRT models. Both sidestep metric normalization and produce a ranking directly from the raw driving scores.

9.1. Kilometers per Infraction: Log-normal IRT

One of the most common self-driving metrics is km/infraction: the distance a driving policy travels between infractions. This quantity is non-negative, unbounded above, and right-skewed. To accommodate these properties, we write \(X_{s,n}\) for the km/infraction of driving policy \(s\) on route \(n\) and place a Gaussian likelihood on its logarithm:

\[\log X_{s,n} \sim \mathcal{N}(\theta_s - \beta_n, \sigma^2)\]

where \(\theta_s\) is the driving policy’s ability, \(\beta_n\) is the route’s difficulty on the log scale, and \(\sigma^2\) captures the residual noise. This is the log-normal IRT model, developed for response-time modelling in educational testing [1].

The expected response is \(\mathbb{E}[X_{s,n}] = e^{\theta_s - \beta_n + \sigma^2/2}\), so ability and difficulty enter only through the difference \(\theta_s - \beta_n\) on the log scale, that is, as the ratio \(e^{\theta_s}/e^{\beta_n}\) on the kilometer scale. A driving policy with \(\theta_s = 2\) on a route with \(\beta_n = 1\) is expected to drive about \(e^1 \approx 2.7\times\) farther between infractions than a driving policy that matches the route difficulty exactly (\(\theta_s = \beta_n\)).
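The arithmetic behind that factor can be checked in a few lines. This is a minimal sketch: the function name `expected_km_per_infraction` and the value of `sigma` are assumptions made for illustration, and the \(\sigma^2/2\) term cancels in the ratio whatever value is chosen.

```python
import math

def expected_km_per_infraction(theta, beta, sigma):
    """Mean of a log-normal whose log has mean (theta - beta) and sd sigma."""
    return math.exp(theta - beta + 0.5 * sigma ** 2)

sigma = 0.5  # assumed residual noise; any value gives the same ratio

# Policy one unit above the route difficulty vs. a policy that matches it.
stronger = expected_km_per_infraction(theta=2.0, beta=1.0, sigma=sigma)
matched = expected_km_per_infraction(theta=1.0, beta=1.0, sigma=sigma)

ratio = stronger / matched  # equals exp(2 - 1) = e
print(f"expected distance ratio: {ratio:.3f}")
```

Because only the difference \(\theta_s - \beta_n\) enters the exponent, shifting both parameters by the same constant leaves the ratio unchanged.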

Fitting is straightforward: the log-likelihood is Gaussian, so SVI with Normal guides extends directly from 1PL. No bespoke likelihood is needed; the only change is that the model is fit to \(\log X_{s,n}\) instead of a bounded score. One limitation remains: log-normal IRT cannot represent a run with no infractions, where \(X_{s,n}\) is undefined or \(\infty\).
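To make the Gaussian structure concrete, the sketch below fits the model on simulated data. It is not the SVI fit described above: as a dependency-free stand-in it uses closed-form least squares (the Gaussian maximum-likelihood solution without priors), and it assumes a complete policy-by-route table with the identifiability constraint \(\sum_n \beta_n = 0\). The function names, the simulated abilities and difficulties, and the seed are all illustrative assumptions.

```python
import math
import random

def simulate_km(theta, beta, sigma, rng):
    """Draw X[s][n] with log X ~ Normal(theta[s] - beta[n], sigma^2)."""
    return [[math.exp(rng.gauss(t - b, sigma)) for b in beta] for t in theta]

def fit_lognormal_irt(km):
    """Least-squares fit on log(km) with sum(beta) fixed to 0 (complete table)."""
    for row in km:
        for x in row:
            # A run with no infractions has no finite km/infraction value.
            if not (math.isfinite(x) and x > 0):
                raise ValueError("km/infraction must be finite and positive")
    log_km = [[math.log(x) for x in row] for row in km]
    n_pol, n_route = len(log_km), len(log_km[0])
    # With sum(beta) = 0, each policy's mean log-distance estimates theta.
    theta_hat = [sum(row) / n_route for row in log_km]
    grand = sum(theta_hat) / n_pol
    col_mean = [sum(row[n] for row in log_km) / n_pol for n in range(n_route)]
    beta_hat = [grand - c for c in col_mean]
    sq = sum((log_km[s][n] - theta_hat[s] + beta_hat[n]) ** 2
             for s in range(n_pol) for n in range(n_route))
    # Residual sd with a degrees-of-freedom correction.
    sigma_hat = math.sqrt(sq / ((n_pol - 1) * (n_route - 1)))
    return theta_hat, beta_hat, sigma_hat

rng = random.Random(0)
true_theta = [2.0, 1.0, 0.0]                              # assumed abilities
true_beta = [-1.0 + 2.0 * i / 199 for i in range(200)]    # assumed difficulties
km = simulate_km(true_theta, true_beta, sigma=0.3, rng=rng)

theta_hat, beta_hat, sigma_hat = fit_lognormal_irt(km)
ranking = sorted(range(len(theta_hat)), key=lambda s: -theta_hat[s])
print(ranking, [round(t, 2) for t in theta_hat], round(sigma_hat, 2))
```

The guard at the top of `fit_lognormal_irt` makes the limitation explicit: an infinite or undefined km/infraction value has to be rejected or handled before the logarithm is taken.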

9.2. Infractions per Kilometer: Poisson IRT

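The complementary metric, the infraction count itself, sidesteps that limitation: zero is a valid observation rather than a divide-by-zero. Let \(Y_{s,n}\) be the number of infractions driving policy \(s\) incurs on route \(n\). Treating the route length \(L_n\) as exposure, the count follows a Poisson distribution with mean \(\mu_{s,n} = L_n \exp(\beta_n - \theta_s)\), where \(\exp(\beta_n - \theta_s)\) is the infraction rate per kilometer: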

\[P(Y_{s,n} = y) = \frac{\mu_{s,n}^{\,y} \, e^{-\mu_{s,n}}}{y!}, \qquad y = 0, 1, 2, \dots\]
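The probability mass function above translates directly into a log-likelihood term. The following is a minimal sketch, with the function name `poisson_irt_log_pmf` and the numeric values chosen only for illustration; it shows that a zero-infraction run receives a finite log-probability of \(-\mu_{s,n}\).

```python
import math

def poisson_irt_log_pmf(y, theta, beta, length_km):
    """log P(Y = y) for a Poisson count with mean mu = L * exp(beta - theta)."""
    log_mu = math.log(length_km) + beta - theta
    # y * log(mu) - mu - log(y!), using lgamma(y + 1) = log(y!)
    return y * log_mu - math.exp(log_mu) - math.lgamma(y + 1)

# Assumed example values: a 10 km route of difficulty 0, policy ability 2.
mu = 10.0 * math.exp(0.0 - 2.0)
log_p0 = poisson_irt_log_pmf(0, theta=2.0, beta=0.0, length_km=10.0)

# Sanity check: the probabilities over y = 0, 1, 2, ... sum to 1.
total = sum(math.exp(poisson_irt_log_pmf(y, 2.0, 0.0, 10.0)) for y in range(60))
print(f"log P(Y=0) = {log_p0:.4f}, -mu = {-mu:.4f}, total mass = {total:.6f}")
```

Working with `log_mu` rather than `mu` keeps the exposure as a simple additive offset, \(\log L_n\), next to \(\beta_n - \theta_s\).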

Variable route lengths fold in naturally: longer routes contribute more evidence without changing the model. Fitting reuses the SVI/Normal-guide machinery, with the integer count (and \(L_n\) when it varies) as the only additional observables.
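How exposure weights the evidence is easiest to see in a simplified case. The sketch below assumes the route difficulties \(\beta_n\) are already known, which is a simplification made here for illustration and not part of the SVI fit described above. Setting the derivative of the Poisson log-likelihood with respect to \(\theta_s\) to zero then gives a closed form, \(\hat\theta_s = \log \sum_n L_n e^{\beta_n} - \log \sum_n Y_{s,n}\). The function name and the data are hypothetical.

```python
import math

def ability_mle(counts, lengths_km, betas):
    """ML estimate of theta for one policy, given known route difficulties."""
    total = sum(counts)
    if total == 0:
        # The likelihood keeps increasing in theta, so no finite maximum exists.
        raise ValueError("no infractions observed: the ML estimate diverges")
    # Difficulty-weighted exposure: longer routes add more to this sum.
    exposure = sum(L * math.exp(b) for L, b in zip(lengths_km, betas))
    return math.log(exposure) - math.log(total)

counts = [0, 1, 3]           # hypothetical infraction counts, one per route
lengths = [2.0, 5.0, 20.0]   # hypothetical route lengths in km
betas = [0.0, 0.0, 0.0]      # assumed known difficulties

theta_hat = ability_mle(counts, lengths, betas)
print(f"theta_hat = {theta_hat:.3f}")
```

With all difficulties at zero the estimate reduces to the log of total kilometers over total infractions, so the zero-count route still contributes through its length. When a policy has no infractions on any route the maximum-likelihood estimate diverges, which is one reason a proper prior on \(\theta_s\) is useful.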

9.3. References

[1] W. J. van der Linden. A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31(2):181–204, 2006.
