Methodological and conceptual challenges in rare and severe event forecast verification



Rare and severe event (RSE) forecast verification, that is, the assessment of the quality of forecasts of rare but severe natural hazards such as avalanches, landslides or tornadoes, faces distinctive methodological and conceptual challenges. Although some of these challenges have been discussed since the inception of the discipline in the 1880s, there is still no consensus about how to assess RSE forecasts. This article offers a comprehensive and critical overview of the many different measures used to capture the quality of categorical, binary RSE forecasts (forecasts of occurrence and non-occurrence) and argues that, of the skill scores in the literature, only one is adequate for RSE forecasting. We first examine the relationship between accuracy and skill and show why skill is more important than accuracy in RSE forecast verification. We then motivate three adequacy constraints for a measure of skill in RSE forecasting and argue that, of the skill scores in the literature, only the Peirce skill score meets all three. Next, we outline the practical implications of our theoretical investigation for avalanche forecasting, basing our discussion on a study in avalanche forecast verification that uses the nearest-neighbour method (Heierli et al., 2004). Lastly, we raise what we call the “scope challenge”, which affects all forms of RSE forecasting. It highlights how and why working with the right measure of skill matters not only for local binary RSE forecasts but also for the assessment of the diagnostic tests widely used in avalanche risk management and related operations, including the design of methods to assess the quality of regional multi-categorical avalanche forecasts.
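The contrast between accuracy and skill can be illustrated with a minimal sketch. It assumes the standard 2x2 contingency table for binary forecasts and the usual definition of the Peirce skill score as hit rate minus false alarm rate. The counts below are hypothetical and chosen only for illustration; they are not taken from this article or from Heierli et al. (2004), and the function names are our own.

```python
# Hypothetical 2x2 contingency tables for a binary rare-event forecast.
# All counts are invented for illustration; they are not data from the article.

def accuracy(hits, false_alarms, misses, correct_negatives):
    """Proportion of all forecasts that were correct."""
    total = hits + false_alarms + misses + correct_negatives
    return (hits + correct_negatives) / total


def peirce_skill_score(hits, false_alarms, misses, correct_negatives):
    """Hit rate minus false alarm rate (standard Peirce skill score)."""
    events = hits + misses                      # days on which the event occurred
    non_events = false_alarms + correct_negatives  # days on which it did not
    if events == 0 or non_events == 0:
        raise ValueError("score is undefined without both events and non-events")
    return hits / events - false_alarms / non_events


# A forecaster who issues warnings on a sample of 1000 days with 20 events.
forecaster = dict(hits=10, false_alarms=30, misses=10, correct_negatives=950)

# A trivial strategy on the same sample: never forecast the event.
never_warn = dict(hits=0, false_alarms=0, misses=20, correct_negatives=980)

for name, table in [("forecaster", forecaster), ("never warn", never_warn)]:
    print(f"{name}: accuracy={accuracy(**table):.3f}, "
          f"PSS={peirce_skill_score(**table):.3f}")
```

On these hypothetical counts the never-warn strategy is more accurate (0.98 against 0.96) because the event is rare, yet its Peirce skill score is 0, whereas the forecaster scores roughly 0.47. This is the sense in which accuracy alone can reward a forecast that has no skill.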

