Averaging Log-Likelihood Values: Numerical Stability

1. Weighted log-likelihood values

$\bar{l} = \frac{1}{\sum_i \omega_i} \sum_i \omega_i \cdot l_i$

It is strongly advisable to work with log-likelihoods (and log-weights), since logarithms provide an extended dynamic range. This avoids the problem of very small weights being rounded to zero.

Then, if we have as inputs the log-weights ($l\omega_i$) and the individual log-likelihoods ($ll_i$), the problem becomes:

$\begin{aligned} \log \bar{l} &= \log \left( \frac{1}{\sum_i e^{l\omega_i}} \sum_i e^{l\omega_i} \cdot e^{ll_i} \right) \\ &= - \log \sum_i e^{l\omega_i} + \log \sum_i e^{l\omega_i+ll_i} \end{aligned}$

The problem: exponentiating very large positive numbers overflows, and exponentiating very large negative numbers underflows to zero.
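To make the failure mode concrete, here is a minimal sketch of the naive evaluation of $\log \sum_i e^{x_i}$. The function name `naiveLogSumExp` is hypothetical and used only for illustration. It assumes IEEE 754 double precision, where `std::exp` overflows for arguments above roughly 709 and underflows to zero for arguments below roughly -745.

```cpp
#include <cmath>
#include <vector>

// Naive log-sum-exp (illustrative only): exponentiates every term directly,
// so it is exposed to overflow and underflow of std::exp.
double naiveLogSumExp(const std::vector<double>& x)
{
    double sum = 0.0;
    for (double v : x) sum += std::exp(v);  // may become 0 or +inf
    return std::log(sum);
}
```

With the inputs `{-1000, -1001}` every exponential underflows to zero and the function returns minus infinity, although the exact result is about -999.69. With `{1000, 1001}` the exponentials overflow and the result is plus infinity.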

The solution: shift all the exponents by their maxima, then compensate for the shift outside the logarithms (the $l\omega_{max}$ terms of the two sums cancel each other):

$l\omega_{max} = \max_i l\omega_i, \quad ll_{max} = \max_i ll_i$

$\begin{aligned} \log \bar{l} &= - \log \sum_i e^{l\omega_i} + \log \sum_i e^{l\omega_i+ll_i} \\ &= - \log \sum_i e^{l\omega_i-l\omega_{max}} + \log \sum_i e^{l\omega_i+ll_i-ll_{max}-l\omega_{max}} + ll_{max} \end{aligned}$

This method is implemented in the C++ function mrpt::math::averageLogLikelihood.
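The shifted formula of this section can be sketched in a few lines of standalone C++. This is an independent illustration, not the MRPT source code. The function name `weightedAverageLogLik` and its signature are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the shifted formula above (hypothetical helper, not the MRPT API).
// Inputs: log-weights and per-sample log-likelihoods, same non-zero length.
// Returns: the log of the weighted average likelihood.
double weightedAverageLogLik(
    const std::vector<double>& logWeights,
    const std::vector<double>& logLiks)
{
    assert(!logWeights.empty() && logWeights.size() == logLiks.size());

    const double lwMax = *std::max_element(logWeights.begin(), logWeights.end());
    const double llMax = *std::max_element(logLiks.begin(), logLiks.end());

    double sumW = 0.0;   // sum_i exp(lw_i - lw_max)
    double sumWL = 0.0;  // sum_i exp(lw_i + ll_i - ll_max - lw_max)
    for (std::size_t i = 0; i < logWeights.size(); ++i)
    {
        // Both exponents are <= 0, so std::exp cannot overflow here.
        sumW += std::exp(logWeights[i] - lwMax);
        sumWL += std::exp(logWeights[i] + logLiks[i] - llMax - lwMax);
    }
    return -std::log(sumW) + std::log(sumWL) + llMax;
}
```

For example, with log-weights `{0, 0}` and log-likelihoods `{-1000, -1001}` the sketch returns approximately -1000.38, a case where direct exponentiation would underflow. One caveat of shifting by the two maxima separately: if the largest weight and the largest likelihood belong to different samples and the values are far apart, every term of the second sum can still underflow. Shifting that sum by $\max_i (l\omega_i + ll_i)$ instead would be one way to avoid this.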

2. Unweighted log-likelihood values (Arithmetic mean)

If all the samples have equal weights, the formula simplifies to:
$\log \bar{l} = - \log N + \log \sum_{i=1}^N e^{ll_i-ll_{max}} + ll_{max}$
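The unweighted case can be sketched in the same way. Again, `meanLogLik` is a hypothetical name chosen for this example and is not taken from any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of the equal-weights formula above (hypothetical helper).
// Returns the log of the arithmetic mean of the likelihoods exp(ll_i).
double meanLogLik(const std::vector<double>& logLiks)
{
    assert(!logLiks.empty());

    const double llMax = *std::max_element(logLiks.begin(), logLiks.end());

    double sum = 0.0;  // sum_i exp(ll_i - ll_max); the largest term is exactly 1
    for (double ll : logLiks) sum += std::exp(ll - llMax);

    return -std::log(static_cast<double>(logLiks.size())) + std::log(sum) + llMax;
}
```

Because the largest term of the sum equals 1, the sum can never underflow to zero, so the result stays finite for any finite inputs.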