Application of the asymptotic theory of extreme values to risk simulation of the outbreak of large forest fires

The paper considers the application of the asymptotic theory of extreme values to the risk analysis of the largest-area yearly forest fires. As source material, the authors used data on the areas of forest fires that occurred in the south of Russia's Khabarovsk Territory from 1968 to 2017. For each year of this period, the forest fire with the largest area was selected, giving a sample of 50 fires (one per year). Analysis of this sample showed that the general population of these fire areas, from which the sample was drawn, follows the extreme value distribution of the first type. An analytical expression for the probability distribution function of this general population was obtained. On the basis of this distribution, a forecast was made of the risks of occurrence and the average recurrence periods of such fires for various values of the burned area. The analysis showed that in 87.5% of cases the largest-area yearly forest fire in the south of Khabarovsk Territory will have an area of 50 to 400 km², with a recurrence interval of 1.2 years, that is, almost every year, except for the rare years in which the largest fire has a different area. It is suggested that the extreme value distribution of the first type can be applied not only to the forests of Russia's Khabarovsk Territory but also to other regions of the world with large forest areas.


INTRODUCTION
Forest fires are among the most dangerous emergency situations. Forecasting and assessing the risk of occurrence and the recurrence periods of potential large forest fires are therefore of particular importance.
Research in this area is usually based on the analysis of meteorological data, including synoptic data (air temperature, precipitation, air humidity, wind velocity and direction, etc.), information about the forest (average height, stand density, floristic composition, area and shape of the forest tract) and the relief. On the basis of a comprehensive analysis of these data, various fire danger risk maps are built for an area (Dieu et al., 2018, pp. 104-116; Eugenio et al., 2016, pp. 65-71; Moayedi et al., 2020; Zhiwei et al., 2015, pp. 106-116), mathematical models of fire spread are developed, and only on this basis is the fire situation forecast (Adou et al., 2015, pp. 11-18; Ginzburg & Sokolova, 2014, pp. 68-475; Perminov, 2010; Sokolova & Makogonov, 2013, pp. 222-226).
Since a forest fire is random both in its outbreak and in its complicated spread, research based on probabilistic and statistical methods of processing such fire parameters as area, burning time, number of fires per year, damage, firefighting expenses, etc. is also of great importance in analyzing the fire situation of a region (Chang et al., 2016; Eugenio et al., 2019; Grishin & Filkov, 2011). This paper covers the application of extreme value statistics to forecasting and assessing the risk of the largest-area yearly forest fires, using the south of Russia's Khabarovsk Territory as an example.
The risk of an emergency situation, in particular a fire, includes two aspects: the probability of the event and the related material losses. Clearly, if all fires (small, medium and large) are considered, the frequency (probability) of small and medium fires is much higher than the frequency of large, very rare fires that cause great financial damage.
The reasons for such large fires are diverse. They are mainly related to the fire being detected too late (in most cases because of weather conditions), to manpower and resources being delivered to the fire area too late or not at all, or to the impossibility of fighting the fire in rugged relief. This increases the time needed to extinguish the fire and, as a consequence, leads to vast burned areas.
Traditional methods of probability theory and mathematical statistics used in forecasting and assessing fire risk are poorly suited to assessing the risk of large fires, since the probability of a large fire is always small compared with that of fires with a small or medium burned area.
The search for special methods of analyzing and forecasting the risk of extreme events led to the development of the so-called asymptotic theory of extreme values, which mainly deals with independent identically distributed random variables and with the properties of the distribution of their maximum (Dey et al., 2016; Mohammad, 2016; Leadbetter, 2016). This theory has been applied in many areas of science and technology: in the analysis of river floods, in material strength research, in seismology, in aviation, in the analysis of catastrophic weather phenomena and in many other areas (Leadbetter, 2016).
Until now, the application of this theory to multiyear data on the extreme-area yearly forest fires was hindered by the lack of sufficiently long (at least 50 years) data series.
To apply extreme value theory to forecasting and assessing the risk of large fires, this paper analyzes the largest-area yearly forest fires in the south of Khabarovsk Territory from 1968 to 2017.

MATERIALS AND METHODS
As source material, the authors used data on forest fire areas provided by the territory state institution "Far Eastern facilities of aircraft forest protection", the single information and coordination center for forest fire fighting in Russia's Khabarovsk Territory, where all data on forest fires in the territory are accumulated. These include ground observation data (observation from watch towers, patrolling along specially developed routes), aircraft monitoring data and satellite data (remote sensing of the Earth from space). At present, satellite data form the basis of the forest fire monitoring system in Russia; they have developed rapidly and have been applied in practice over the last twenty-five years (Loupian et al., 2017; Bondur & Gordo, 2018).
For the further research, the forest fire with the largest area was selected for each year of the period from 1968 to 2017. In all, 50 fires were selected (one for each year of the period). From now on, the fires selected in this way are called the largest-area yearly forest fires. Table 1 shows the data on the largest yearly forest fire areas, listed in chronological order in the third column. The fourth column contains the variational series built from these data: the yearly maximum fire areas $x_i$ ($i = 1, \dots, 50$) sorted in ascending order of area.
As the sample consists of the areas of the largest yearly forest fires, the general population of these fire areas (from which the sample was drawn) must have a so-called distribution of maximum values. One of the basic results of the theory of maximum values is that any limiting distribution of maxima must belong to one of only three possible types (Dey, 2016; Mohammad, 2016):
Type I, Gumbel distribution:
$$F(x) = \exp\left(-e^{-y}\right), \quad -\infty < y < \infty \tag{1}$$
Type II, Fréchet distribution:
$$F(x) = \begin{cases} 0, & y \le 0 \\ \exp\left(-y^{-k}\right), & y > 0 \end{cases} \tag{2}$$
Type III, Weibull distribution:
$$F(x) = \begin{cases} \exp\left(-y^{k}\right), & y \ge 0 \\ 1, & y < 0 \end{cases} \tag{3}$$
where $y = (x - \mu)/\lambda$ for the Type I and Type II distributions and $y = (\mu - x)/\lambda$ for the Type III distribution; $\mu$ is the location parameter, $\lambda$ is the scale parameter and $k$ is the dimensionless shape parameter, with $\mu > 0$, $\lambda > 0$, $k > 0$.
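The three limiting types can be illustrated with a minimal Python sketch. It assumes the standard textbook forms of the Gumbel, Fréchet and Weibull distribution functions for maxima; the function names and the parameter values used in the example call are hypothetical and are not taken from the paper.

```python
import math


def gumbel_cdf(x, mu, lam):
    """Type I: F(x) = exp(-exp(-y)) with y = (x - mu) / lam, for any real x."""
    y = (x - mu) / lam
    return math.exp(-math.exp(-y))


def frechet_cdf(x, mu, lam, k):
    """Type II: F(x) = exp(-y**(-k)) for y = (x - mu) / lam > 0, otherwise 0."""
    y = (x - mu) / lam
    if y <= 0:
        return 0.0
    return math.exp(-y ** (-k))


def weibull_max_cdf(x, mu, lam, k):
    """Type III: F(x) = exp(-y**k) for y = (mu - x) / lam > 0, otherwise 1.

    The variable is bounded from above by mu.
    """
    y = (mu - x) / lam
    if y <= 0:
        return 1.0
    return math.exp(-y ** k)


# Hypothetical parameters, chosen only to show the different tail behaviour.
MU_HYP, LAM_HYP, K_HYP = 150.0, 90.0, 2.0
for area in (100.0, 200.0, 400.0):
    print(area,
          round(gumbel_cdf(area, MU_HYP, LAM_HYP), 4),
          round(frechet_cdf(area, MU_HYP, LAM_HYP, K_HYP), 4),
          round(weibull_max_cdf(area, MU_HYP, LAM_HYP, K_HYP), 4))
```

In this sketch only the Type III function reaches the value 1 at a finite point, which is why it presupposes a variable bounded from above.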
Of these three types, the Type I distribution is the one most frequently used in applications. We therefore consider the Type I distribution first.
Let us suppose that the distribution of the maximum yearly forest fire areas in Khabarovsk Territory belongs to Type I and check whether this supposition is correct.
For the Type I distribution of maximum values, the probability distribution function $F(x)$ has the form
$$F(x) = \exp\left[-\exp\left(-\frac{x - \mu}{\lambda}\right)\right] \tag{4}$$
where the parameter $\mu$ is the mode and $\lambda$ is the scale parameter. Let us estimate the distribution parameters $\mu$ and $\lambda$. For this purpose, we first calculate from the sample data the sample mean $\bar{x}$ and the sample standard deviation $S$:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{5}$$
$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \tag{6}$$
where $x_i$ is the sample member with ordinal number $i$ and $n = 50$ is the sample size. The statistical estimates of the distribution parameters $\mu$ and $\lambda$, obtained by the method of L-moments, are then given by formula (7) (Dey, 2016; Gubareva & Gartsman, 2010, pp. 437-445), in which $\bar{x}$ and $S$ are defined by formulae (5) and (6).
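The parameter estimation step can be sketched in Python as follows. Since formula (7) is not reproduced here, both estimators below are assumptions about its possible form: the first uses the classical moment relations of the Type I distribution (mean $\mu + \gamma\lambda$, variance $\pi^2\lambda^2/6$) with the sample mean and the sample standard deviation (divisor $n - 1$, also an assumption), and the second uses the first two sample L-moments (for the Type I distribution $\lambda_1 = \mu + \gamma\lambda$, $\lambda_2 = \lambda \ln 2$). The function names and the sample values are hypothetical and are not the data of Table 1.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_fit_moments(sample):
    """Estimate (mu, lam) from the sample mean and standard deviation."""
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # divisor n - 1 (assumption)
    lam = s * math.sqrt(6.0) / math.pi    # from variance = pi^2 * lam^2 / 6
    mu = mean - EULER_GAMMA * lam         # from mean = mu + gamma * lam
    return mu, lam


def gumbel_fit_lmoments(sample):
    """Estimate (mu, lam) from the first two sample L-moments."""
    xs = sorted(sample)
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    b0 = sum(xs) / n
    # Probability-weighted moment b1 = (1/n) * sum((i - 1)/(n - 1) * x_(i)).
    b1 = sum(i * x for i, x in enumerate(xs)) / (n * (n - 1))
    l1 = b0
    l2 = 2.0 * b1 - b0
    lam = l2 / math.log(2.0)
    mu = l1 - EULER_GAMMA * lam
    return mu, lam


# Hypothetical yearly maximum fire areas in km^2 (not the values of Table 1).
areas_km2 = [14.0, 60.0, 95.0, 130.0, 180.0, 240.0, 310.0, 526.0]
print(gumbel_fit_moments(areas_km2))
print(gumbel_fit_lmoments(areas_km2))
```

For a sample that really follows the Type I distribution, the two estimators should give close values; a large difference between them may indicate a heavy tail or outliers.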
Substituting the estimates (7) of the parameters $\mu$ and $\lambda$ into formula (4) for $F(x)$, we finally obtain the supposed probability distribution law $F_0(x)$ of the general population of maximum values of the yearly forest fire areas:
$$F_0(x) = \exp\left[-\exp\left(-\frac{x - \hat{\mu}}{\hat{\lambda}}\right)\right] \tag{8}$$
where $\hat{\mu}$ and $\hat{\lambda}$ are the estimates (7).
As the main statistical hypothesis, we put forward the hypothesis $H$ that the general population of maximum values of the yearly forest fire areas in the south of Khabarovsk Territory really has the probability distribution $F_0(x)$ described by formula (8). To test this hypothesis, we choose Pearson's $\chi^2$ goodness-of-fit criterion (Ramachandran & Tsokos, 2020; Ross, 2017). The criterion statistic is
$$\chi^2 = \sum_{i=1}^{m} \frac{\left(n_i - n p_i\right)^2}{n p_i},$$
which, if the hypothesis $H$ is true, has approximately the $\chi^2$ distribution with $k = m - r - 1$ degrees of freedom. Here $m$ is the number of intervals into which the variational series is divided; $r$ is the number of estimated parameters of the theoretical distribution $F_0(x)$; $p_i$ are the theoretical probabilities determined by the formula $p_i = F_0(x_{i+1}) - F_0(x_i)$, where $x_i$ and $x_{i+1}$ are, respectively, the left and right boundaries of the $i$-th interval; $n_i$ is the number of sample values falling into the $i$-th interval; and $n$ is the sample size. As a one-sided criterion rejects the hypothesis $H$ more strictly than a two-sided criterion does, we construct a right-sided critical region defined by
$$\chi^2 > \chi^2_{cr}(\alpha, k),$$
where $\alpha$ is the significance level and $\chi^2_{cr}(\alpha, k)$ is the critical value determined from the $\chi^2$ distribution tables for the significance level $\alpha$ and $k$ degrees of freedom. The acceptance region of the hypothesis $H$ is then defined by the inequality
$$\chi^2 < \chi^2_{cr}(\alpha, k).$$
For practical application of the Pearson criterion, the following necessary conditions were observed: the sample size must be 50 or more, and each interval must contain 5 or more sample values. Intervals containing fewer than 5 values were merged (Kobzar', 2012).
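The calculation of the criterion statistic, including the merging of sparse intervals, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the interval boundaries, the first five counts and the distribution parameters are hypothetical (only the last two counts, 3 and 2, follow the values mentioned for the sixth and seventh intervals), and the merging rule shown (accumulate adjacent intervals from left to right until at least 5 values are collected) is one possible way of meeting the condition above.

```python
import math


def gumbel_cdf(x, mu, lam):
    """Type I distribution function, formula (4)."""
    return math.exp(-math.exp(-(x - mu) / lam))


def merge_sparse_intervals(edges, counts, min_count=5):
    """Merge adjacent intervals so that each holds at least min_count values."""
    merged_edges = [edges[0]]
    merged_counts = []
    running = 0
    for right_edge, count in zip(edges[1:], counts):
        running += count
        if running >= min_count:
            merged_edges.append(right_edge)
            merged_counts.append(running)
            running = 0
    if running > 0:
        if merged_counts:
            # A sparse tail is folded into the last accepted interval.
            merged_counts[-1] += running
            merged_edges[-1] = edges[-1]
        else:
            merged_counts.append(running)
            merged_edges.append(edges[-1])
    return merged_edges, merged_counts


def pearson_chi_square(edges, counts, cdf):
    """Sum of (n_i - n * p_i)^2 / (n * p_i) with p_i = F0(x_{i+1}) - F0(x_i)."""
    n = sum(counts)
    statistic = 0.0
    for left, right, n_i in zip(edges[:-1], edges[1:], counts):
        expected = n * (cdf(right) - cdf(left))
        statistic += (n_i - expected) ** 2 / expected
    return statistic


# Hypothetical intervals (km^2), counts and parameters, for illustration only.
MU_HYP, LAM_HYP = 150.0, 90.0
raw_edges = [0, 100, 200, 300, 400, 500, 600, 700]
raw_counts = [8, 12, 11, 9, 5, 3, 2]
edges, counts = merge_sparse_intervals(raw_edges, raw_counts)
chi2_observed = pearson_chi_square(
    edges, counts, lambda x: gumbel_cdf(x, MU_HYP, LAM_HYP))
print(len(counts), round(chi2_observed, 3))
```

The observed value is then compared with the tabulated critical value for $k = m - r - 1$ degrees of freedom, where $m$ is the number of intervals after merging.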
The stages of calculating the observed value of the criterion $\chi^2$ are presented in Table 2. The first column of the table gives the numbers of the intervals into which the variational series is divided, and the second column gives the boundaries of these intervals (in km²). The contents of the other columns are clear from the table. The optimal number $m$ of intervals for the sample size $n = 50$ was chosen according to the Sturges formula (Kobzar', 2012):
$$m = 1 + 3.322 \lg n = 1 + 3.322 \lg 50 \approx 7.$$
Since the numbers of members of the variational series falling into the sixth and seventh intervals ($n_6 = 3$, $n_7 = 2$) are less than 5, these two intervals were merged into one, so that the resulting interval contains 5 members of the variational series. In Table 2, the merging of the sixth and seventh intervals is marked by braces. As the new number of intervals (after merging the last two intervals) is $m = 6$ and the number $r$ of estimated distribution parameters $\lambda$ and $\mu$ is equal to 2 ($r = 2$), the number of degrees of freedom is $k = m - r - 1 = 6 - 2 - 1 = 3$. The critical value of the criterion is determined from the $\chi^2$ distribution tables for $k = 3$ degrees of freedom. Since the observed value of $\chi^2$ is less than its critical value, the divergence between the theoretical and empirical distributions can be regarded as insignificant, and the hypothesis $H$ that the general population of maximum values of the yearly forest fire areas in the south of Khabarovsk Territory has the probability distribution $F_0(x)$ described by formula (8) is consistent with the sample data (Table 1).
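The arithmetic of this step can be reproduced with a short Python sketch. It assumes the common form of the Sturges formula, $m = 1 + 3.322 \lg n$, rounded to the nearest integer. The critical values listed for 3 degrees of freedom are standard tabulated values given for several common significance levels, because the significance level used in the paper is not stated in this text; the function names are hypothetical.

```python
import math


def sturges_intervals(n):
    """Number of grouping intervals by the Sturges formula, rounded."""
    return round(1 + 3.322 * math.log10(n))


def degrees_of_freedom(m, r):
    """k = m - r - 1 for m intervals and r estimated parameters."""
    return m - r - 1


# Standard tabulated chi-square critical values for k = 3.
CHI2_CRITICAL_K3 = {0.10: 6.251, 0.05: 7.815, 0.01: 11.345}

m_initial = sturges_intervals(50)      # seven intervals for n = 50
m_merged = m_initial - 1               # sixth and seventh intervals merged
k = degrees_of_freedom(m_merged, r=2)  # two estimated parameters: mu and lam
print(m_initial, m_merged, k)
for alpha, critical in sorted(CHI2_CRITICAL_K3.items()):
    print(f"alpha = {alpha}: accept H if chi2_observed < {critical}")
```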
The same check of correspondence to the sample data was also carried out for the Type II extreme value distribution (2). It showed that in this case the divergence between the theoretical and empirical distributions is significant, so this distribution does not correspond to the sample data.
The Type III extreme value distribution (3) holds only for random variables whose values are bounded from above. It is not applicable here, since the largest area of yearly forest fires, as a random variable, can be arbitrarily large and is not bounded from above.
Thus, it can be considered established with high probability that the general population of the largest yearly fire areas, from which the sample (Table 1) was drawn, has the Type I probability distribution law described by formula (8).

RESULTS AND DISCUSSION
The basic result of the research is an analytical expression for the probability distribution function of the largest-area yearly forest fires for the south of Khabarovsk Territory. It is shown that the distribution belongs to the first type of extreme value distributions and carries complete information about the distribution of the largest-area yearly fires.
On the basis of this distribution, a forecast was made of the risks (probabilities) of occurrence and the average recurrence periods of such fires for various intervals of the burned area. The calculation results are given in Table 3 and Table 4.
Table 3. Probability distribution of the largest yearly fire areas
Here $P(a,b)$ is the probability that the area of the largest fire of the year falls within the interval $(a,b)$, and $T(a,b)$ is the average recurrence period of such an event. $P(a,b)$ and $T(a,b)$ were calculated by the formulae
$$P(a,b) = F_0(b) - F_0(a), \qquad T(a,b) = \frac{1}{P(a,b)},$$
where $F_0$ is the probability distribution function (8), $a$ is the lower limit of the interval and $b$ is the upper limit. Analysis of the results in Table 3 makes it possible to forecast the risks of the largest-area yearly forest fires and their recurrence periods in the south of Khabarovsk Territory.
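The calculation of the interval probability and the recurrence period can be sketched in Python as follows. The sketch assumes that the recurrence period is the reciprocal of the yearly probability, which agrees with the pair of values quoted in the text (a probability of 0.055 and a period of about 18 years). The parameter values and function names are hypothetical and are not the estimates obtained in the paper.

```python
import math


def gumbel_cdf(x, mu, lam):
    """Type I distribution function, formula (4)."""
    return math.exp(-math.exp(-(x - mu) / lam))


def interval_probability(a, b, mu, lam):
    """P(a, b) = F0(b) - F0(a): the yearly maximum area falls within (a, b)."""
    return gumbel_cdf(b, mu, lam) - gumbel_cdf(a, mu, lam)


def recurrence_period(probability):
    """Average recurrence period in years, T = 1 / P."""
    return 1.0 / probability


# Hypothetical parameters in km^2, for illustration only.
MU_HYP, LAM_HYP = 150.0, 90.0
for a, b in [(10.0, 50.0), (50.0, 400.0), (400.0, 650.0)]:
    p = interval_probability(a, b, MU_HYP, LAM_HYP)
    print(f"({a:.0f}, {b:.0f}) km^2: P = {p:.3f}, T = {recurrence_period(p):.1f} years")

# A probability of 0.055 corresponds to a period of about 18 years.
print(round(recurrence_period(0.055), 1))
```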
The analysis showed that in 87.5% of cases the largest-area yearly forest fire in the south of Khabarovsk Territory will have an area of 50 to 400 km², with an average recurrence interval of 1.2 years, in other words, almost every year, with the exception of the rare cases when fires with other areas occur.
For example, small fires with an area of 10 to 50 km² will be the largest of the year with a probability of 0.055, that is, in 5.5% of cases, with an average recurrence period of 18 years, while the occurrence of fires with an area of more than 650 km² is practically impossible.
More detailed information about the probability distribution of the largest yearly forest fire areas and about their recurrence periods is given in Table 3 and Table 4.
Although forest fires have occurred at all times, the application of extreme value theory to multiyear data on the extreme-area yearly forest fires was until now hindered by the lack of sufficiently long (at least 50 years) data series. However, with the introduction of satellite forest fire monitoring (Loupian et al., 2017, pp. 158-175), work on systematizing the data on forest fire areas has become much more active over the last 20-25 years. At the same time, data on forest fire areas from earlier years, when only ground observations and aircraft monitoring were used, remain hard to obtain because they were not systematized or were lost.
In Russia, the first and only attempt to apply extreme value theory to simulating the risks of the largest-area yearly forest fires was made in 2010 (Bykov, 2012, pp. 53-63). It was based on data on the largest-area yearly forest fires in Russia's Tver region for the 16-year observation period from 1990 to 2005. Because of the low forest cover of this region, the largest yearly forest fire areas there were relatively small, ranging from 0.1 to 0.8 km² depending on the year. Because of the small sample size, a simplified method of processing the statistical data on extreme fires, based on the graphical construction of quantile diagrams, was used. It was concluded that the probability distribution of the maximum yearly forest fire areas in Tver region belongs to Type II (the so-called Fréchet distribution). Naturally, such a simplified graphical method applied to a small sample yields very approximate and unstable results.
Both the forest cover percentage and the forested area of Khabarovsk Territory greatly exceed those of Tver region. Accordingly, the largest yearly forest fire area there varies from 14 to 526 km² depending on the year, which is much higher than in Tver region. In addition, the check performed in this paper of the correspondence of the sample data (Table 1) to the Type II extreme value distribution (2) showed that the divergence between the theoretical and empirical distributions is significant in this case and that this distribution does not correspond to the sample data. The divergence becomes especially large for fire areas of 100 km² and more.
The divergence between the results obtained in this paper and those of the research carried out in Tver region (Bykov, 2012, pp. 53-63) is related to the low representativeness of the sample used there (16 years), to the use of the simplified graphical method of processing the statistical data on extreme fires, and to the fact that for small fire areas the divergence of the Fréchet distribution from the experimental data is weakly expressed.

CONCLUSIONS
The paper shows that the asymptotic theory of extreme values can be applied to simulating the risks of the largest-area yearly forest fires.
The basic result of the research is an analytical expression for the probability distribution function of the largest-area yearly forest fires for the south of Russia's Khabarovsk Territory. It is shown that this distribution belongs to the first type of extreme value distributions and carries complete information about the distribution of the largest-area yearly forest fires.
On the basis of the distribution obtained, a forecast was made of the risks of occurrence and the average recurrence periods of such fires for various intervals of the burned area.
Since the extreme value distribution of the first type is used in the majority of applications dealing with extreme phenomena, it can be supposed that this distribution is applicable not only to the forests of Russia's Khabarovsk Territory but also to other regions of the world with large forest areas.
The results of the statistical analysis of the largest-area yearly forest fires can be of use in planning and developing fire-prevention measures to minimize the negative consequences of such fires.