9. Extension: Unbounded Response
Basic IRT methods assume the response lies in a bounded interval. Several AV safety metrics, however, are unbounded. Two adaptations show up in practice: the log-normal IRT and Poisson IRT models. Both sidestep the need for metric normalization and produce a ranking directly from raw driving scores.
9.1. Kilometers per Infraction: Log-normal IRT
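One of the most common self-driving metrics is km/infraction: the distance a driving policy travels between infractions. This quantity is non-negative, unbounded above, and right-skewed, so a bounded-response likelihood does not fit it. Writing \(X_{s,n}\) for the km/infraction of driving policy \(s\) on route \(n\), we instead model its logarithm as Gaussian:
\[
\log X_{s,n} \sim \mathcal{N}\left(\theta_s - \beta_n,\ \sigma^2\right),
\]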
where \(\theta_s\) is the driving policy’s ability, \(\beta_n\) is the route’s difficulty on the log scale, and \(\sigma^2\) captures the residual noise. This is the log-normal IRT model, developed for response-time modelling in educational testing [1].
The expected response is \(\mathbb{E}[X_{s,n}] = e^{\theta_s - \beta_n + \sigma^2/2}\), so ability and difficulty enter as a difference on the log scale, that is, as the ratio \(e^{\theta_s}/e^{\beta_n}\) on the raw scale. A driving policy with \(\theta_s = 2\) on a route with \(\beta_n = 1\) is expected to drive about \(e^1 \approx 2.7\times\) farther between infractions than a driving policy that matches the route difficulty exactly (\(\theta_s = \beta_n\)).
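The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `expected_km_per_infraction` and the value of `sigma` are assumptions made here, not part of the model description.

```python
import math

def expected_km_per_infraction(theta, beta, sigma):
    """Mean of a log-normal whose log has mean (theta - beta) and std sigma."""
    return math.exp(theta - beta + 0.5 * sigma ** 2)

sigma = 0.5  # hypothetical residual noise, chosen only for illustration

# Policy with theta = 2 on a route with beta = 1.
strong = expected_km_per_infraction(theta=2.0, beta=1.0, sigma=sigma)
# Policy whose ability exactly matches the route difficulty.
matched = expected_km_per_infraction(theta=1.0, beta=1.0, sigma=sigma)

# The sigma term cancels in the ratio, leaving exp(2 - 1) = e.
ratio = strong / matched
print(f"expected distance ratio: {ratio:.3f}")
```

Because the \(\sigma^2/2\) term is shared by both policies, the ratio depends only on the ability gap, whatever noise level is assumed.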
Fitting is straightforward: the log-likelihood is Gaussian in \(\log X_{s,n}\), so SVI with Normal guides extends directly from 1PL. No bespoke likelihood is needed, just a log transform of the input. There is one limitation: log-normal IRT has no answer for the no-infraction case, where \(X_{s,n}\) is undefined or \(\infty\) and its logarithm cannot be modelled.
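To illustrate why a Gaussian log-likelihood makes fitting easy, here is a minimal sketch. It is not the SVI procedure mentioned above. It swaps in a closed-form least-squares estimate, which coincides with the prior-free maximum-likelihood estimate of \(\theta_s\) and \(\beta_n\) when every policy has been run on every route. The function name `fit_lognormal_irt`, the zero-mean constraint on route difficulties, and the toy data are all assumptions made for illustration.

```python
import math

def fit_lognormal_irt(km_per_infraction):
    """Least-squares fit of log X[s][n] = theta[s] - beta[n] + noise.

    Assumes a complete policies-by-routes table and fixes the location
    by constraining the route difficulties to average zero.
    """
    for row in km_per_infraction:
        for x in row:
            if not (math.isfinite(x) and x > 0):
                # A run with no infraction has no finite km/infraction value.
                raise ValueError("log-normal IRT needs positive, finite responses")

    logs = [[math.log(x) for x in row] for row in km_per_infraction]
    n_policies, n_routes = len(logs), len(logs[0])

    # With mean(beta) = 0, each policy's ability is its mean log response.
    theta = [sum(row) / n_routes for row in logs]
    grand_mean = sum(theta) / n_policies
    col_means = [sum(logs[s][n] for s in range(n_policies)) / n_policies
                 for n in range(n_routes)]
    # Routes with below-average log responses are the harder ones.
    beta = [grand_mean - c for c in col_means]
    return theta, beta

# Noise-free toy data generated from known (hypothetical) parameters.
true_theta = [2.0, 1.0, 0.0]
true_beta = [0.5, -0.5]
table = [[math.exp(t - b) for b in true_beta] for t in true_theta]

theta_hat, beta_hat = fit_lognormal_irt(table)
print("theta:", [round(t, 3) for t in theta_hat])
print("beta: ", [round(b, 3) for b in beta_hat])
```

The explicit `ValueError` makes the limitation visible: an infraction-free run cannot enter this model without some additional convention.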
9.2. Infractions per Kilometer: Poisson IRT
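The complementary metric, the infraction count itself, sidesteps that limitation: zero is a valid observation rather than a divide-by-zero. Let \(X_{s,n}\) now denote the number of infractions of driving policy \(s\) on route \(n\). Treating the route length \(L_n\) as exposure, the count follows a Poisson distribution with mean \(\mu_{s,n} = L_n \exp(\beta_n - \theta_s)\):
\[
X_{s,n} \sim \mathrm{Poisson}\left(L_n \exp(\beta_n - \theta_s)\right).
\]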
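A minimal sketch of the per-observation Poisson log-likelihood with exposure, using the mean \(\mu_{s,n} = L_n \exp(\beta_n - \theta_s)\) given above. The function name `poisson_irt_log_lik` and the numeric values are hypothetical and chosen only for illustration.

```python
import math

def poisson_irt_log_lik(count, length_km, theta, beta):
    """Log-probability of `count` infractions on a route of `length_km`."""
    mu = length_km * math.exp(beta - theta)  # expected number of infractions
    # log Poisson pmf: count * log(mu) - mu - log(count!)
    return count * math.log(mu) - mu - math.lgamma(count + 1)

# Hypothetical policy and route.
theta, beta, length_km = 2.0, 1.0, 10.0
mu = length_km * math.exp(beta - theta)

# An infraction-free run is an ordinary observation with log-probability -mu.
ll_zero = poisson_irt_log_lik(0, length_km, theta, beta)
print(f"log P(0 infractions) = {ll_zero:.4f}")

# Sanity check: the probabilities over counts 0..99 sum to about 1.
total = sum(math.exp(poisson_irt_log_lik(k, length_km, theta, beta))
            for k in range(100))
```

Unlike the km/infraction response, the zero-count case needs no special handling here.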
Variable route lengths fold in naturally: longer routes contribute more evidence without changing the model. Fitting reuses the SVI/Normal-guide machinery, with the integer count (and \(L_n\) when it varies) as the only additional observables.
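One way to make the "more evidence" statement concrete is a short derivation under the Poisson model with mean \(\mu_{s,n} = L_n \exp(\beta_n - \theta_s)\). Here \(x\) stands for an observed count, a notation introduced only for this sketch. The log-likelihood of one observation, as a function of \(\theta_s\), and its negative curvature are

\[
\ell(\theta_s) = x \log L_n + x(\beta_n - \theta_s) - L_n e^{\beta_n - \theta_s} - \log x!,
\qquad
-\frac{\partial^2 \ell}{\partial \theta_s^2} = L_n e^{\beta_n - \theta_s} = \mu_{s,n}.
\]

The information an observation carries about \(\theta_s\) therefore equals the expected count and scales linearly with \(L_n\): at a fixed infraction rate, doubling the driven length doubles the information, with no change to the model itself.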
9.3. References
[1] W. J. van der Linden. A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31(2):181–204, 2006.