Summary
The exponential distribution is widely used in hydrology: it has few parameters and is easy to implement. Two methods are commonly used to estimate its parameter: the maximum likelihood method and the method of moments, which yield the same estimate. Alongside these two methods there is the least squares method, which is very rarely used for this distribution. In this article, we compare the asymptotic behaviour of the least squares estimator with that of the maximum likelihood estimator, starting from a one-parameter exponential distribution with a known parameter, then generalising the results obtained by deriving analytical expressions. Since the historical sample available in practice is unique, and generally short relative to the information one wishes to extract from it, the statistical properties of the estimators can only be studied from samples of random variables representing virtual realisations of the hydrological phenomenon concerned, obtained by Monte Carlo simulation. The Monte Carlo simulation study shows that, for small samples, the mathematical expectation of both estimators tends towards the true parameter, and that the variance of the least squares estimator is greater than that of the maximum likelihood estimator.
Keywords:
- Exponential distribution,
- Numerical simulation,
- Monte Carlo,
- Least squares method,
- Maximum likelihood method,
- Mean,
- Variance
Abstract
Exponential distributions are frequently applied in hydrology, for example: frequency analysis of the duration and severity of water flow conditions (MATHIEU et al., 1991); regional frequency of storm intensities (ARNAUD and LAVABRE, 1999); partial duration series of hydrological droughts (KJELDSEN et al., 1999); and daily rainfall modelling (CHAPMAN, 1997; KABAILI, 1983). This distribution has only one parameter and is easy to use. Its parameter is mainly estimated using the maximum likelihood estimator (MLE) or the method of moments estimator (MOME), but the least squares estimator (LSE) can also be applied. For the one-parameter exponential distribution, MOME and MLE give the same expression for the parameter:
$$\hat{x}_0 \;=\; \frac{1}{E_x}\sum_{k=1}^{E_x} x_k \qquad (1)$$
Using the LSE requires a sample of Ex exponential variables and involves the following steps:
1. Sorting the Ex variables in the sample in ascending order
2. Associating with each quantile x_k of rank k in the sorted sample an empirical frequency
$$\hat{F}_k = \frac{k - 0.5}{E_x}$$
3. Plotting x_k against ln(1 − F̂_k) and using the LSE to calculate x̂0 by:
$$\hat{x}_0 \;=\; -\,\frac{\sum_{k=1}^{E_x} x_k \,\ln\!\left(1-\hat{F}_k\right)}{\sum_{k=1}^{E_x} \left[\ln\!\left(1-\hat{F}_k\right)\right]^{2}} \qquad (2)$$
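The two estimators above can be sketched in Python; this is a minimal illustration of formulae (1) and (2), with function and variable names of our own choosing, not the paper's:

```python
import numpy as np

def mle_estimate(x):
    """MLE/MOME for the one-parameter exponential: the sample mean (formula 1)."""
    return np.mean(x)

def lse_estimate(x):
    """LSE following the three steps above (formula 2)."""
    x_sorted = np.sort(x)                      # step 1: sort in ascending order
    n = len(x_sorted)
    k = np.arange(1, n + 1)
    f_hat = (k - 0.5) / n                      # step 2: empirical frequencies
    z = np.log(1.0 - f_hat)                    # step 3: regress x_k on ln(1 - F_k)
    # slope of a regression through the origin, with sign reversed
    return -np.sum(x_sorted * z) / np.sum(z ** 2)
```

Both functions return an estimate of the exponential parameter from a single sample; on large samples they should agree closely, consistent with the asymptotic results discussed below.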
In this paper we compare the asymptotic behaviour of the statistical properties (mean and variance) of the MLE and the LSE. These comparisons must be made using a large number of sample parameter estimates. In practice, only one historical sample of variables drawn from a known exponential distribution is available, from which only one parameter estimate can be calculated. To overcome this difficulty, samples of variables whose theoretical exponential distribution is known are generated using the Monte Carlo numerical method. Samples of estimated parameters (using the MLE or the LSE) are then built from these samples of random variables, and the statistical properties of the two estimators are calculated. The successive steps are summarised below:
1. Generate a sample of finite size Ex of known exponential variables
2. Use this sample to compute one parameter estimate with the MLE or the LSE
3. Repeat steps 1 and 2 Np times to collect a sample of Np parameter estimates
4. Use this sample to calculate the statistical properties (mean and variance) of the two estimators.
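The four steps above can be sketched as a single helper function; the name and signature are ours, and any estimator (MLE or LSE) can be passed in:

```python
import numpy as np

def monte_carlo_properties(x0, Ex, Np, estimator, seed=0):
    """Steps 1-4: generate Np virtual samples of size Ex from an exponential
    distribution with parameter x0, estimate the parameter on each sample,
    and return the mean and variance of the Np estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(Np)
    for i in range(Np):
        sample = rng.exponential(scale=x0, size=Ex)   # step 1
        estimates[i] = estimator(sample)              # step 2
    # step 3 is the loop above; step 4:
    return estimates.mean(), estimates.var()
```

For example, `monte_carlo_properties(1.0, 100, 1000, np.mean)` uses the MLE (the sample mean) and should return a mean close to 1 and a variance close to 1/Ex = 0.01, matching the behaviour reported below.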
According to this approach, the sizes Ex and Np should influence the statistical properties of the two estimators. We verified this with a one-parameter exponential distribution with known theoretical parameter x0 = 1. Samples of estimated parameters of size Np were generated from virtual samples of size Ex drawn from a population following this distribution. During this operation, one of the two sizes, Ex or Np, was held constant while the other varied with a constant step. The statistical properties of the estimators were then calculated for each of the two cases.
Let VarEx(x̂0(Np)) and EEx(x̂0(Np)) be the statistical properties (variance and mean) of the two estimators for fixed values of Ex, and VarNp(x̂0(Ex)) and ENp(x̂0(Ex)) the same statistical properties for fixed values of Np.
Plotting VarEx(x̂0(Np)) for Ex = 10 and Ex = 100 shows that for large values of Np (1000 to 5000) the variance tends towards a constant value, close to 0.1 for Ex = 10 (Fig. 1a) and to 0.01 for Ex = 100 (Fig. 1b), both equal to 1/Ex, when the MLE is used. When the parameters are estimated with the LSE, the variance tends towards a constant value greater than the preceding ones (Fig. 1a and Fig. 1b). When VarNp(x̂0(Ex)) is plotted for constant Np = 1000, the variance decreases as Ex grows whichever estimator is used, but for a given value of Ex the variance is always greater when the LSE is used (Fig. 3). These two calculations show that the asymptotic variance depends only on the size Ex of the samples of known exponential variables.
Plotting EEx(x̂0(Np)) for Ex = 10 shows that, for large values of Np, the mean is close to the true parameter for the MLE and greater than the true parameter for the LSE (Fig. 2a). For Ex = 100, the mean is close to the true parameter for both estimators (Fig. 2b). From these calculations we note that the asymptotic mean depends only on the size Ex of the samples of known exponential variables and on the estimator used. The MLE appears to be unbiased, while the LSE presents a bias for small values of Ex that disappears as Ex increases. To quantify this dependence, we plotted ENp(x̂0(Ex)) for Np = 1000 (Fig. 4). For both estimators the mean presents an initial bias when Ex is low, which disappears as Ex becomes larger; the initial bias is larger with the LSE.
In summary, the asymptotic statistical properties (mean and variance) of the two estimators depend only on the size Ex of the samples of known exponential variables.
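For the MLE, the observed variance plateaus agree with the classical result for the mean of Ex i.i.d. exponential variables (a textbook fact, not derived in this summary):

$$\operatorname{Var}\!\left(\hat{x}_0^{\mathrm{MLE}}\right) \;=\; \operatorname{Var}\!\left(\frac{1}{E_x}\sum_{k=1}^{E_x} X_k\right) \;=\; \frac{x_0^{2}}{E_x},$$

which, for x0 = 1, gives 1/Ex = 0.1 for Ex = 10 and 0.01 for Ex = 100, matching Fig. 1a and Fig. 1b.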
Empirical plots are unstable for small sample sizes, sensitive to sampling, and very difficult to interpret. Analytical expressions for the asymptotic statistical properties of the two estimators are needed for a realistic comparison. According to formulae (1) and (2), these statistical properties depend on E∞(Xk) and E∞(Xk²), the asymptotic means of Xk and Xk². E∞(Xk) and E∞(Xk²) have been derived using the probability density of Xk obtained through order statistics. The asymptotic statistical properties of the two estimators have then been evaluated from these expressions.
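For reference, the standard order-statistic results that such a derivation relies on are as follows (classical formulas, not quoted from the paper). The density of the rank-k order statistic in a sample of size Ex is

$$f_{(k)}(x) \;=\; \frac{E_x!}{(k-1)!\,(E_x-k)!}\; F(x)^{k-1}\,\bigl[1-F(x)\bigr]^{E_x-k}\, f(x),$$

and for the exponential distribution with parameter x0 this yields the closed forms

$$E\bigl(X_{(k)}\bigr) \;=\; x_0 \sum_{i=1}^{k} \frac{1}{E_x - i + 1}, \qquad \operatorname{Var}\bigl(X_{(k)}\bigr) \;=\; x_0^{2} \sum_{i=1}^{k} \frac{1}{(E_x - i + 1)^{2}},$$

from which E(X²₍k₎) = Var(X₍k₎) + [E(X₍k₎)]².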
Let E∞[x̂0(Ex)] be the asymptotic mean of an estimator, Var∞[x̂0(Ex)] its asymptotic variance, and x0 the theoretical parameter of the exponential distribution. Plotting E∞[x̂0(Ex)] for x0 = 1 shows that this expression is constant and equal to unity when the MLE is applied, and that it decreases quickly towards unity when the LSE is applied (Fig. 6). Plotting Var∞[x̂0(Ex)] for x0 = 1 shows that the theoretical asymptotic variance decreases as Ex grows, but is greater when the LSE is applied (Fig. 5). Comparison with the empirical plots for x0 = 1 shows the same trends.
Theoretical derivations of the asymptotic statistical properties confirm the empirical experiments:
· The MLE of the one-parameter exponential distribution is unbiased
· The LSE is a consistent estimator of the one-parameter exponential parameter.
Keywords:
- One-parameter exponential distribution,
- Monte Carlo numerical simulation,
- Maximum likelihood estimator,
- Least squares estimator,
- Mean,
- Variance