Working towards data-based definitions of avalanche danger levels ¶

Based on the five-level European avalanche danger scale, the avalanche warning services of all European countries assign one of the danger levels to the current avalanche situation. However, some of the key terms used in the danger scale, such as what counts as “some” or “many” danger points, lack precise definitions, which leaves room for interpretation in their application. A working group of the European Avalanche Warning Services (EAWS) has taken on the task of clarifying these terms. Until now, there has been no data-based foundation for this.

The five-point European avalanche danger scale was introduced in the winter of 1993/94. It contains a brief description of each danger level in relation to

snowpack stability (or probability of an avalanche release),
the frequency of avalanche prone locations and
avalanche size.

These are qualitative definitions based on the experience of the avalanche forecasters in the production team.

Almost 30 years later, the avalanche danger scale remains the primary point of reference for the European avalanche warning services. Avalanche forecasters assign one of the five avalanche danger levels to the prevailing avalanche situation. This leaves room for a certain amount of interpretation because, first, more than five avalanche situations exist in the real world, and second, avalanche danger is not scientifically measurable. Although based on an interpretation of a wealth of data relating to the current situation and its projected trend, the danger assessment always depends to some extent on the subjective appraisal of an expert.

Among the consequences is inconsistency in the issued danger levels across avalanche warning service boundaries and national borders (Techel et al., 2018), despite endeavours to harmonise their application as far as possible. One reason for this inconsistency is the considerable scope for interpretation left by the absence of definitions for some key words. Although avalanche magnitude is described according to European avalanche size categories, data-based definitions are largely non-existent, in particular for terms that describe the prevalence or frequency of avalanche prone locations. This inevitably raises questions such as: What do words describing the frequency of unstable locations, such as “a few”, “several”, or “many”, actually mean? It is practically impossible to give a definitive answer, since studies show that people assign different numbers to such qualitative terms. Two individuals can interpret the same number of unstable locations as constituting “several” and “many” respectively. This raises the question of whether the terms “a few”, “several” and “many” can be defined by reference to data.

How prevalent are the avalanche prone locations? ¶

There are two ways, described simply below, of arriving at an answer to this question:

Recording the distribution of snowpack stability in the field. Snowpack stability can be measured by testing, as with the rutschblock test, or with an instrument, such as the SnowMicroPen. Performing measurements in numerous places in the field yields a frequency distribution of snowpack stability values. Conducting such a campaign over a large area, however, is very time consuming. For this reason, investigations of this kind are generally restricted to small regions and a few days (Schweizer et al., 2003; Reuter et al., 2015).
Modelling the snowpack. The structure of the snowpack can be calculated on the basis of data delivered by weather stations or weather models. Conclusions about possible weak layers and stability can then be drawn from the modelled snowpack structure. As with stability tests, an individual model may not truly reflect actual local conditions. A collection of numerous simulations spanning a relatively large area, on the other hand, can paint a realistic picture of the snowpack conditions. Simulating a robust stability parameter, however, is no easy matter.

The recording of local stability at dispersed points in the field, on the basis of snow profiles recorded by observers and avalanche forecasters, is fundamental to the work of all avalanche warning services. However, because only a single test (or very few tests) can be performed within a region in a day, as described above, such profiles alone are not a reliable indicator of the stability distribution. How can these data nevertheless be used to obtain large numbers of typical stability distributions? This requires statistical methods. The approach adopted is outlined below and explained in greater detail towards the end of the article.

Distribution of snowpack stability with danger levels 1 (low) to 4 (high) ¶

Fig. 1: Distribution of stability classes by danger level based on rutschblock test results. There are no data for danger level 5.

In the Swiss Alps in the last twenty years, more than 4,000 rutschblock tests have been performed and, on the day of testing, observers have also assessed the danger level. Each of the rutschblock test results was assigned to one of four stability classes – very poor, poor, fair and good. The lower the stability at the profiling site, the greater the probability of an avalanche release on a slope with similar stability. In a place where stability is very poor, an avalanche can be triggered. Conversely, avalanche releases are unlikely (but not ruled out) where stability is good.

A chart depicting these data for danger levels 1 (low) to 4 (high) (Fig. 1) clearly shows that the number of locations where triggering is possible (very poor rutschblock test results) is much larger in places where danger levels 3 and 4 apply than where danger level 1 applies. It is also evident, however, that such unstable locations can exist even where danger level 1 applies. Conversely, as is to be expected, the proportion of good rutschblock test results decreases sharply as the danger level increases.
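The tabulation behind a chart like Fig. 1 can be sketched in a few lines of Python. This is only an illustration: the function name `stability_proportions` and the handful of records are made up for the example and are not the Swiss data set or the study's code.

```python
from collections import Counter

# The four stability classes named in the article.
STABILITY_CLASSES = ("very poor", "poor", "fair", "good")


def stability_proportions(records):
    """Return, per danger level, the share of each stability class.

    `records` is an iterable of (danger_level, stability_class) pairs,
    one pair per rutschblock test (hypothetical input format).
    """
    counts = {}
    for level, stability in records:
        counts.setdefault(level, Counter())[stability] += 1

    proportions = {}
    for level, counter in sorted(counts.items()):
        total = sum(counter.values())
        proportions[level] = {
            cls: counter[cls] / total for cls in STABILITY_CLASSES
        }
    return proportions


# Invented example records, NOT real observations.
records = [
    (1, "good"), (1, "good"), (1, "fair"), (1, "very poor"),
    (3, "very poor"), (3, "poor"), (3, "very poor"), (3, "good"),
]
proportions = stability_proportions(records)
for level, shares in proportions.items():
    print(level, shares)
```

Even in this toy input, a very poor result appears at danger level 1, mirroring the observation above that unstable locations can exist at the lowest danger level.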

Fig. 2: Statistically simulated stability distributions illustrating the proportions of very poor and good rutschblock test results. The colour of an individual data point corresponds to the danger level most frequently observed for that combination of the two stability classes. The plot shows, for example, that danger level 4 predominates where the proportion of very poor stability is 40% and the proportion of good stability is 25%, and that danger level 1 predominates where the proportion of very poor stability is 0% and the proportion of good stability is 75%. There are no data for danger level 5.

Statistical methods can be applied to calculate a large number of further stability distributions from the available data. Using several stability distributions, rather than one (as in Fig. 1), for each danger level, and comparing the proportions of very poor and good stability in a scatterplot (Fig. 2), the typical frequency ranges of very poor and good test results for the individual danger levels can be illustrated. It also becomes clear that the distributions differ between the danger levels and that a single typical distribution does not exist.

How many avalanche prone locations are indicated by “a few”, “several” and “many”? ¶

Using the statistical method described below, researchers assigned frequency ranges to the qualitative descriptors “none”, “a few”, “several” and “many”:

None: 0%
A few: >0% to <4%
Several: 4% to 20%
Many: >20%

It is particularly noteworthy that the frequency threshold for the descriptor “many” is as low as 20%. In other words, if an elevated probability of avalanche release were to exist on as few as one in five slopes, the number of avalanche prone slopes would be described as “many”.
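The thresholds listed above translate directly into a small lookup. The following Python sketch is illustrative only; the function name `frequency_descriptor` is hypothetical, and the handling of the exact boundary values follows the list as written (4% and 20% both count as “several”).

```python
def frequency_descriptor(proportion):
    """Map a proportion of very poor stability results to a descriptor.

    `proportion` is a fraction between 0 and 1 (e.g. 0.04 for 4%).
    Thresholds are those listed in the article:
    none = 0%, a few = >0% to <4%, several = 4% to 20%, many = >20%.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    if proportion == 0.0:
        return "none"
    if proportion < 0.04:
        return "a few"
    if proportion <= 0.20:
        return "several"
    return "many"


for value in (0.0, 0.02, 0.04, 0.20, 0.25):
    print(f"{value:.0%}: {frequency_descriptor(value)}")
```

With these boundaries, a proportion just above one in five already falls into the “many” class.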

What is the practical use of these findings? ¶

If a similar analysis is performed for the size of the observed avalanches (not described here), and the results are combined with the frequency of the avalanche prone locations, danger levels 1 to 4 can be described by reference to the most frequent combinations; this is not possible for danger level 5 because too few data are available.

Figure 3 is a data-based characterisation indicating the most typical number of places where avalanches are likely to be triggered, together with the largest observed avalanche size, for danger levels 1 to 4. Snowpack stability as such is not the focal point, but is used to describe the frequency of avalanche prone locations (“places where stability is very poor”). In this respect, the approach differs from that adopted for characterising the danger levels in the European avalanche danger scale, which describes the most typical (rather than the poorest) stability for each danger level. It is interesting to note that the frequency of avalanche prone locations increases from one danger level to the next, whereas the distribution of avalanche sizes changes comparatively little. This indicates that it is sufficient initially to describe snowpack stability and the frequency of avalanche prone locations solely by focusing on the prevalence of places where snowpack stability is very poor. In other words, when danger levels are being defined, the frequency of avalanche prone locations should be given more weight than it is in the current avalanche danger scale.

Fig. 3: Data-based characterisation of danger levels, in each case describing only the most typical combination for each danger level.

Conclusion and next steps ¶

This study illuminates for the first time how key terms used in the avalanche danger scale can be defined by reference to a large quantity of observation data. In addition, an application of the described approach (Fig. 3) illustrates a way of characterising the danger levels with the aid of data and statistical methodology.

The results constitute a significant starting point for a review of the danger level definitions within the EAWS, which is a topic that is high on the agenda of an international group of avalanche forecasters. At the same time, these findings and their impact on the everyday work of the avalanche warning services also raise new questions. One example is: How can the frequency of avalanche prone locations be estimated? As indicated above, in most cases only a few observations are available for this purpose. Against this background, the spatial modelling of snowpack stability is expected to be of great importance in the future. It would allow the frequency of avalanche prone locations to be estimated as objectively as possible.

Link to the study: Techel et al., 2020

Further information ¶

Fig. 4: Distribution of the proportion of very poor rutschblock test results. There is a high frequency of cases in which the proportion of very poor test results was very low, and a very low frequency of cases in which this proportion was high.

Statistical simulation of snowpack stability distribution

How did Techel et al. (2020) arrive at typical distributions for the frequency of avalanche prone locations in their study?

As a rule, only a single rutschblock test was available in a region on any given day, whereas the aim was to describe the frequency of avalanche prone locations. All test results recorded with an identical danger level were therefore placed into a bin (Fig. 1). A certain number (n) of rutschblock test results were then randomly drawn from the bin, and the proportion of results in the very poor stability class was calculated for the stability distribution represented by each draw. It makes sense to examine the poorest rutschblock test results in particular because they are especially good indicators of the frequency of avalanche prone locations, that is, places in which avalanche triggering is most probable. This procedure was repeated the same number of times for each danger level in order to obtain a large number of stability distributions, rather than just one, for each danger level.
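The resampling procedure described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `simulate_very_poor_proportions`, the pooled example results, and the values of `n` and `repetitions` are all invented, and drawing with replacement is an assumption, since the article does not say how the random draws were made.

```python
import random


def simulate_very_poor_proportions(results_by_level, n, repetitions, seed=0):
    """Draw `n` test results per repetition from each danger-level bin.

    `results_by_level` maps a danger level to the list of stability
    classes of all tests recorded at that level (the "bin").
    Returns, per danger level, the list of simulated proportions of
    "very poor" results, one proportion per repetition.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    simulated = {}
    for level, results in results_by_level.items():
        proportions = []
        for _ in range(repetitions):
            # Assumption: sampling WITH replacement (bootstrap style).
            sample = rng.choices(results, k=n)
            proportions.append(sample.count("very poor") / n)
        simulated[level] = proportions
    return simulated


# Invented bins, NOT real rutschblock data.
bins = {
    1: ["good"] * 15 + ["fair"] * 4 + ["very poor"] * 1,
    3: ["good"] * 5 + ["fair"] * 5 + ["poor"] * 5 + ["very poor"] * 5,
}
simulated = simulate_very_poor_proportions(bins, n=25, repetitions=1000)
for level, proportions in simulated.items():
    print(level, sum(proportions) / len(proportions))
```

Using the same number of repetitions for every danger level, as in the text, gives each level an equally large set of simulated stability distributions.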

If the proportions of the very poor rutschblock test results are then depicted in a histogram (Fig. 4), this time with all the danger levels pooled into a single bin, a typical distribution is revealed: there is a high frequency of cases in which the proportion of very poor test results is very low, and a low frequency of cases in which this proportion is high. This is a typical distribution for natural hazards: there are many weak earthquakes, for example, but only a few powerful ones. Small avalanches likewise occur much more frequently than very large ones, and the number of natural avalanche releases within a region is very small (or zero) on most days, and very large on a few days (Schweizer et al., 2020).

In a final step, the distribution of the proportion of very poor test results is automatically subdivided into four frequency classes. The algorithm sets the class boundaries so that each class is x times as wide as the preceding one. The qualitative descriptors “none or nearly none”, “a few”, “several” and “many” can then be assigned to these four classes. This yields the four aforementioned classes for the frequency of very poor test results and therefore of avalanche prone locations.
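One way to obtain class boundaries in which each class is x times as wide as the preceding one is to use geometrically growing widths. The Python sketch below illustrates that idea only; the article does not state the value of x or the exact algorithm, so the function name `geometric_class_boundaries`, the factor of 2 and the range from 0 to 1 are assumptions chosen for the example, and the result is not meant to reproduce the published thresholds.

```python
def geometric_class_boundaries(lower, upper, n_classes, factor):
    """Split [lower, upper] into classes with geometrically growing widths.

    Each class is `factor` times as wide as the one before it.
    Returns a list of n_classes + 1 boundaries.
    """
    if n_classes < 1 or factor <= 0 or upper <= lower:
        raise ValueError("need n_classes >= 1, factor > 0 and upper > lower")

    span = upper - lower
    if factor == 1:
        first_width = span / n_classes
    else:
        # Widths w, w*f, w*f**2, ... must add up to the full span:
        # w * (f**k - 1) / (f - 1) = span
        first_width = span * (factor - 1) / (factor ** n_classes - 1)

    boundaries = [lower]
    width = first_width
    for _ in range(n_classes):
        boundaries.append(boundaries[-1] + width)
        width *= factor
    boundaries[-1] = upper  # remove floating-point drift at the top end
    return boundaries


# Hypothetical example: four classes, each twice as wide as the last.
boundaries = geometric_class_boundaries(0.0, 1.0, n_classes=4, factor=2.0)
print([round(b, 3) for b in boundaries])
```

Because the widths grow with each class, the lowest classes are narrow, which suits a distribution in which most cases have a very low proportion of very poor results.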
