Summary
The exponential distribution is widely used in hydrology: it has few parameters and is easy to implement. Two methods are commonly used to estimate its parameter: the maximum likelihood method and the method of moments, which yield the same estimate. Alongside these two methods there is the least squares method, which is very rarely used for this distribution. In this article, we compare the asymptotic behaviour of the least squares estimator with that of the maximum likelihood estimator, starting from a one-parameter exponential distribution with a known parameter, then generalising the results obtained by deriving analytical expressions. Since the historical sample available in practice is unique, and generally short relative to the information one wishes to extract from it, the statistical properties of the estimators can only be studied from samples of random variables representing virtual realisations of the hydrological phenomenon concerned, obtained by Monte Carlo simulation. The Monte Carlo simulation study shows that, for small samples, the mathematical expectation of both estimators tends towards the true parameter, and that the variance of the least squares estimator is greater than that of the maximum likelihood estimator.
Keywords:
- Exponential distribution,
- Numerical simulation,
- Monte Carlo,
- Least squares method,
- Maximum likelihood method,
- Mean,
- Variance
Abstract
Exponential distributions are frequently applied in hydrology, for example: frequency analysis of the duration and severity of water flow conditions (MATHIEU et al., 1991); regional frequency of storm intensities (ARNAUD and LAVABRE, 1999); partial duration series of hydrological droughts (KJELDSEN et al., 1999); and daily rainfall modelling (CHAPMAN, 1997; KABAILI, 1983). This distribution has only one parameter and is easy to use. Its parameter is mainly estimated using the maximum likelihood estimator (MLE) or the method of moments estimator (MOME), but the least squares estimator (LSE) can also be applied. For the one-parameter exponential distribution, MOME and MLE give the same expression for the parameter:
$$\hat{x}_0 \;=\; \frac{1}{E_x}\sum_{k=1}^{E_x} x_k \qquad (1)$$
Using the LSE requires a sample of Ex exponential variables and involves the following steps:
1. Sorting the Ex variables in the sample in ascending order
2. Associating with each quantile x_k of rank k in the sorted sample an empirical frequency
$$\hat{F}_k = \frac{k - 0.5}{E_x}$$
3. Plotting x_k against ln(1 − F̂_k) and using the LSE to calculate x̂0 by:
$$\hat{x}_0 \;=\; -\,\frac{\sum_{k=1}^{E_x} x_k \,\ln\!\left(1-\hat{F}_k\right)}{\sum_{k=1}^{E_x} \left[\ln\!\left(1-\hat{F}_k\right)\right]^{2}} \qquad (2)$$
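The two estimators above can be sketched in Python; this is a minimal illustration of formulae (1) and (2), with function and variable names of our own choosing, not the paper's:

```python
import numpy as np

def mle_estimate(x):
    """MLE/MOME for the one-parameter exponential: the sample mean (formula 1)."""
    return np.mean(x)

def lse_estimate(x):
    """LSE following the three steps above (formula 2)."""
    x_sorted = np.sort(x)                      # step 1: sort in ascending order
    n = len(x_sorted)
    k = np.arange(1, n + 1)
    f_hat = (k - 0.5) / n                      # step 2: empirical frequencies
    z = np.log(1.0 - f_hat)                    # step 3: regress x_k on ln(1 - F_k)
    # slope of a regression through the origin, with sign reversed
    return -np.sum(x_sorted * z) / np.sum(z ** 2)
```

Both functions return an estimate of the exponential parameter from a single sample; on large samples they should agree closely, consistent with the asymptotic results discussed below.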
In this paper we compare the asymptotic behaviour of the statistical properties (mean and variance) of the MLE and the LSE. These comparisons must be made using a large number of sample parameter estimates. In practice, only one historical sample of variables drawn from a known exponential distribution is available, from which only one parameter estimate can be calculated. To overcome this difficulty, samples of variables whose theoretical exponential distribution is known are generated using the Monte Carlo numerical method. Samples of estimated parameters (using the MLE or the LSE) are then built from these samples of random variables, and the statistical properties of the two estimators are calculated. The successive steps are summarised below:
1. Generate a sample of finite size Ex of known exponential variables
2. Use this sample to compute one parameter estimate with the MLE or the LSE
3. Repeat steps 1 and 2 Np times to collect a sample of Np parameter estimates
4. Use this sample to calculate the statistical properties (mean and variance) of the two estimators.
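The four steps above can be sketched as a single helper function; the name and signature are ours, and any estimator (MLE or LSE) can be passed in:

```python
import numpy as np

def monte_carlo_properties(x0, Ex, Np, estimator, seed=0):
    """Steps 1-4: generate Np virtual samples of size Ex from an exponential
    distribution with parameter x0, estimate the parameter on each sample,
    and return the mean and variance of the Np estimates."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(Np)
    for i in range(Np):
        sample = rng.exponential(scale=x0, size=Ex)   # step 1
        estimates[i] = estimator(sample)              # step 2
    # step 3 is the loop above; step 4:
    return estimates.mean(), estimates.var()
```

For example, `monte_carlo_properties(1.0, 100, 1000, np.mean)` uses the MLE (the sample mean) and should return a mean close to 1 and a variance close to 1/Ex = 0.01, matching the behaviour reported below.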
According to this approach, the sizes Ex and Np should influence the statistical properties of the two estimators. We verified this with a one-parameter exponential distribution with known theoretical parameter x0 = 1. Samples of estimated parameters of size Np were generated from virtual samples of size Ex drawn from a population following this distribution. During this operation, one of the two sizes, Ex or Np, was held constant while the other varied with a constant step. The statistical properties of the estimators were then calculated for each of the two cases.
Let VarEx(x̂0(Np)) and EEx(x̂0(Np)) be the statistical properties (variance and mean) of the two estimators for fixed values of Ex, and VarNp(x̂0(Ex)) and ENp(x̂0(Ex)) the same statistical properties for fixed values of Np.
Plotting VarEx(x̂0(Np)) for Ex = 10 and Ex = 100 shows that for large values of Np (1000 to 5000) the variance tends towards a constant value, close to 0.1 for Ex = 10 (Fig. 1a) and to 0.01 for Ex = 100 (Fig. 1b), both equal to 1/Ex, when the MLE is used. When the parameters are estimated with the LSE, the variance tends towards a constant value greater than the preceding ones (Fig. 1a and Fig. 1b). When VarNp(x̂0(Ex)) is plotted for constant Np = 1000, the variance decreases as Ex grows whichever estimator is used, but for a given value of Ex the variance is always greater when the LSE is used (Fig. 3). These two calculations show that the asymptotic variance depends only on the size Ex of the samples of known exponential variables.
Plotting EEx(x̂0(Np)) for Ex = 10 shows that, for large values of Np, the mean is close to the true parameter for the MLE and greater than the true parameter for the LSE (Fig. 2a). For Ex = 100, the mean is close to the true parameter for both estimators (Fig. 2b). From these calculations we note that the asymptotic mean depends only on the size Ex of the samples of known exponential variables and on the estimator used. The MLE appears to be unbiased, while the LSE presents a bias for small values of Ex that disappears as Ex increases. To quantify this dependence, we plotted ENp(x̂0(Ex)) for Np = 1000 (Fig. 4). For both estimators the mean presents an initial bias when Ex is low, which disappears as Ex becomes larger; the initial bias is larger with the LSE.
In summary, the asymptotic statistical properties (mean and variance) of the two estimators depend only on the size Ex of the samples of known exponential variables.
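For the MLE, the observed variance plateaus agree with the classical result for the mean of Ex i.i.d. exponential variables (a textbook fact, not derived in this summary):

$$\operatorname{Var}\!\left(\hat{x}_0^{\mathrm{MLE}}\right) \;=\; \operatorname{Var}\!\left(\frac{1}{E_x}\sum_{k=1}^{E_x} X_k\right) \;=\; \frac{x_0^{2}}{E_x},$$

which, for x0 = 1, gives 1/Ex = 0.1 for Ex = 10 and 0.01 for Ex = 100, matching Fig. 1a and Fig. 1b.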
Empirical plots are unstable for small sample sizes, sensitive to sampling, and very difficult to interpret. Analytical expressions for the asymptotic statistical properties of the two estimators are needed for a realistic comparison. According to formulae (1) and (2), these statistical properties depend on E∞(Xk) and E∞(Xk²), the asymptotic means of Xk and Xk². E∞(Xk) and E∞(Xk²) have been derived using the probability density of Xk obtained through order statistics. The asymptotic statistical properties of the two estimators have then been evaluated from these expressions.
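For reference, the standard order-statistic results that such a derivation relies on are as follows (classical formulas, not quoted from the paper). The density of the rank-k order statistic in a sample of size Ex is

$$f_{(k)}(x) \;=\; \frac{E_x!}{(k-1)!\,(E_x-k)!}\; F(x)^{k-1}\,\bigl[1-F(x)\bigr]^{E_x-k}\, f(x),$$

and for the exponential distribution with parameter x0 this yields the closed forms

$$E\bigl(X_{(k)}\bigr) \;=\; x_0 \sum_{i=1}^{k} \frac{1}{E_x - i + 1}, \qquad \operatorname{Var}\bigl(X_{(k)}\bigr) \;=\; x_0^{2} \sum_{i=1}^{k} \frac{1}{(E_x - i + 1)^{2}},$$

from which E(X²₍k₎) = Var(X₍k₎) + [E(X₍k₎)]².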
Let E∞[x̂0(Ex)] be the asymptotic mean of an estimator, Var∞[x̂0(Ex)] its asymptotic variance, and x0 the theoretical parameter of the exponential distribution. Plotting E∞[x̂0(Ex)] for x0 = 1 shows that this expression is constant and equal to unity when the MLE is applied, and that it decreases quickly towards unity when the LSE is applied (Fig. 6). Plotting Var∞[x̂0(Ex)] for x0 = 1 shows that the theoretical asymptotic variance decreases as Ex grows, but is greater when the LSE is applied (Fig. 5). Comparison with the empirical plots for x0 = 1 shows the same trends.
Theoretical derivations of the asymptotic statistical properties confirm the empirical experiments:
· The MLE of the one-parameter exponential distribution is unbiased
· The LSE is a consistent estimator of the one-parameter exponential parameter.
Keywords:
- One-parameter exponential distribution,
- Monte Carlo numerical simulation,
- Maximum likelihood estimator,
- Least squares estimator,
- Mean,
- Variance