International Development Research Centre (IDRC) Canada     
ID: 104014
Added: 2006-09-28 17:45
Modified: 2006-09-28 21:32
Refreshed: 2010-03-14 07:38


17. Statistical inference in practice

Assessing statistically the extent of poverty and equity in a distribution, or checking for distributive differences, usually involves three steps. First, one formulates hypotheses of interest, such as that the poverty headcount is less than 20%, or that tax equity has increased over time, or that inequality is greater in one country than in another. Second, one computes distributive statistics, weighting observations by their sampling weights and (when appropriate) by a size variable. Third, one uses these statistics to test the hypotheses of interest. This last step can involve testing the hypotheses directly, or building confidence intervals within which the true population values of interest can be located with a chosen level of confidence. This third step may allow for the effects of survey design on the sampling distributions of distributive indices and test statistics, and may also involve performing numerical simulations of such sampling distributions, if the circumstances make it desirable to do so.

17.1 Asymptotic distributions

Under the null hypothesis that μ = μ0, and under some generally mild regularity conditions, all of the estimators $\hat{\mu}$ and associated test statistics considered in this book and programmed in DAD can be shown to be asymptotically normally distributed with mean μ0 and asymptotic sampling variance $\sigma^2_{\hat{\mu}}$. This can be simply stated as:

$$\hat{\mu} \sim N(\mu_0, \sigma^2_{\hat{\mu}}). \tag{17.1}$$

The parameter $\sigma^2_{\hat{\mu}}$ is unknown, but we can typically estimate it consistently by $\hat{\sigma}^2_{\hat{\mu}}$; this is indeed usually readily provided by DAD. Asymptotically, we can then also write that:

$$\hat{\mu} \sim N(\mu_0, \hat{\sigma}^2_{\hat{\mu}}), \tag{17.2}$$

which also implies that

$$\frac{\hat{\mu} - \mu_0}{\hat{\sigma}_{\hat{\mu}}} \sim N(0, 1), \tag{17.3}$$

a statistic that does not depend on unknown (or "nuisance") parameters, and that is therefore typically called "pivotal". Many of the results that follow rely implicitly on this result.

In the simplest cases, the estimators of interest can be expressed as a straightforward sum of variable values across observations. Take for instance the case of an estimator $\hat{\alpha}_1$ computed using a sample $\{y_{1,i}\}_{i=1}^{n}$ of n observations of y1,i:

$$\hat{\alpha}_1 = n^{-1} \sum_{i=1}^{n} y_{1,i}. \tag{17.4}$$

This is of course just the sample mean of the y1,i's. As is well known, the asymptotic sampling distribution of $\hat{\alpha}_1$ is given by

$$\sqrt{n}\,(\hat{\alpha}_1 - \alpha_1) \sim N(0, \sigma^2_{y_1}), \tag{17.5}$$

where α1 and $\sigma^2_{y_1}$ are respectively the population mean and the population variance of y1. That variance can be estimated consistently by the sample variance of the y1,i's.
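As an illustration of the sample-mean case, the following minimal sketch computes the sample mean, its estimated standard error and the pivotal statistic for a hypothesized value. It assumes simple random sampling without weights, which is a simplification of what DAD does; the function name `mean_with_se` and the income values are hypothetical and are not part of DAD.

```python
import math
from statistics import fmean, variance

def mean_with_se(y):
    """Return the sample mean of y and its estimated standard error.

    Assumes independent, equally weighted observations (no survey design).
    """
    n = len(y)
    alpha_hat = fmean(y)                 # sample mean
    sigma2_hat = variance(y)             # sample variance (n - 1 denominator)
    return alpha_hat, math.sqrt(sigma2_hat / n)

# Hypothetical incomes, for illustration only.
incomes = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 18.0, 13.0]
alpha_hat, se = mean_with_se(incomes)

# Pivotal statistic for the hypothesized value mu0 = 12.
mu0 = 12.0
z0 = (alpha_hat - mu0) / se
print(alpha_hat, se, z0)
```

Under the null hypothesis, and in a large enough sample, `z0` would be compared to quantiles of the standard normal distribution.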

Unfortunately, most of the distributive estimators do not take the simple form of (17.4). Instead, they often take the following general form:

$$\hat{\theta} = g(\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_K), \tag{17.6}$$

where

  • $\hat{\alpha}_k$ is expressible as a sum of the n observations of yk,i: $\hat{\alpha}_k = n^{-1} \sum_{i=1}^{n} y_{k,i}$;

  • θ can be expressed as a continuous function g of the α's;

  • and yk,i is usually some k-specific transform of the income of observation i.

The sampling distribution of $\hat{\theta}$ will depend on the function g and on the joint sampling distribution of the estimators $\hat{\alpha}_k$, k = 1,..., K. This joint sampling distribution is usually easily estimated by considering the joint distribution of the yk,i (recall (17.4)).

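DAD then generally uses Rao (1973)'s linearization approach to derive the standard error of indices such as $\hat{\theta}$. Define α = (α1, α2,..., αK)' and let G be the gradient of g with respect to the α's:

$$G = \left( \frac{\partial g}{\partial \alpha_1}, \frac{\partial g}{\partial \alpha_2}, \ldots, \frac{\partial g}{\partial \alpha_K} \right)'. \tag{17.7}$$

A linearization of $\hat{\theta}$ around α, with $\hat{\alpha} = (\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_K)'$, then yields

$$\hat{\theta} \cong \theta + G'(\hat{\alpha} - \alpha). \tag{17.8}$$

The sampling variance of $\hat{\theta}$ can then be shown to be asymptotically equal to

$$\operatorname{var}(\hat{\theta}) \cong G' V G, \tag{17.9}$$

where V is the asymptotic covariance matrix of the $\hat{\alpha}_k$ and is given by

$$V = \begin{pmatrix} \operatorname{var}(\hat{\alpha}_1) & \operatorname{cov}(\hat{\alpha}_1, \hat{\alpha}_2) & \cdots & \operatorname{cov}(\hat{\alpha}_1, \hat{\alpha}_K) \\ \operatorname{cov}(\hat{\alpha}_2, \hat{\alpha}_1) & \operatorname{var}(\hat{\alpha}_2) & \cdots & \operatorname{cov}(\hat{\alpha}_2, \hat{\alpha}_K) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(\hat{\alpha}_K, \hat{\alpha}_1) & \operatorname{cov}(\hat{\alpha}_K, \hat{\alpha}_2) & \cdots & \operatorname{var}(\hat{\alpha}_K) \end{pmatrix}. \tag{17.10}$$

The gradient elements $\partial g / \partial \alpha_1$,..., $\partial g / \partial \alpha_K$ can be estimated consistently using the estimates $\partial g / \partial \hat{\alpha}_1$,..., $\partial g / \partial \hat{\alpha}_K$ of the true derivatives. The elements of the covariance matrix can also be estimated consistently using the sample data, replacing for instance $\operatorname{var}(\hat{\alpha}_1)$ by its sample estimate $\widehat{\operatorname{var}}(\hat{\alpha}_1)$. Note that it is at the level of the estimation of these covariance elements that the full sampling design structure is taken into account (see Section 16.6).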
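To make the linearization approach concrete, here is a minimal sketch for the case K = 2 with g(α1, α2) = α1/α2, for instance an average tax rate computed as mean taxes over mean gross incomes. The choice of g, the function names and the data are assumptions made for illustration; the covariance matrix is estimated as under simple random sampling, so the survey design adjustments that DAD applies are ignored here.

```python
import math
from statistics import fmean

def sample_cov(a, b):
    """Sample covariance of two equal-length lists (n - 1 denominator)."""
    ma, mb = fmean(a), fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def ratio_with_se(y1, y2):
    """Estimate theta = alpha1 / alpha2 and its linearized standard error."""
    n = len(y1)
    a1, a2 = fmean(y1), fmean(y2)
    theta_hat = a1 / a2
    # Gradient of g(alpha1, alpha2) = alpha1 / alpha2, evaluated at the estimates.
    G = (1.0 / a2, -a1 / a2 ** 2)
    # Estimated covariance matrix of (alpha1_hat, alpha2_hat) under simple random sampling.
    V = [[sample_cov(y1, y1) / n, sample_cov(y1, y2) / n],
         [sample_cov(y2, y1) / n, sample_cov(y2, y2) / n]]
    # Quadratic form G' V G.
    var_theta = sum(G[j] * V[j][k] * G[k] for j in range(2) for k in range(2))
    # Guard against tiny negative values caused by floating-point rounding.
    return theta_hat, math.sqrt(max(var_theta, 0.0))

# Hypothetical taxes and gross incomes, for illustration only.
taxes = [1.0, 2.0, 3.0, 4.0]
gross = [10.0, 20.0, 25.0, 45.0]
theta_hat, se_theta = ratio_with_se(taxes, gross)
print(theta_hat, se_theta)
```

When taxes are exactly proportional to incomes, the two terms of the linearization cancel and the estimated standard error is zero, which is a useful check on the gradient.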

17.2 Hypothesis testing

The outcome of an hypothesis test is a statistical decision: the conclusion of the test will be either to reject a null hypothesis, H0, in favor of an alternative, H1, or to fail to reject it. Most hypothesis tests involving an unknown true population parameter μ fall into three special cases:

  1. H0: μ = μ0 against H1: μ ≠ μ0;

  2. H0: μ ≤ μ0 against H1: μ > μ0;

  3. H0: μ ≥ μ0 against H1: μ < μ0.

The ultimate statistical decision may be correct or incorrect. Two types of error can occur:

  1. The first one, a Type I error, occurs when we reject H0 when it is in fact true;

  2. The second one, a Type II error, occurs when we fail to reject H0 when H0 is in fact false.

The power of the test of an hypothesis H0 versus H1 is the probability of rejecting H0 in favor of H1 when H1 is true.

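Let α be the level of statistical significance in which we are interested; α is often referred to as the size of an hypothesis test. It is the probability of making a Type I error, namely, the probability that we may wrongly reject a null hypothesis. Typical values of α are 0.01, 0.025, 0.05 and 0.1. Let z(p) be the p-quantile of the standardized normal distribution. That is, if F is the standard normal distribution function, then F(z(p)) ≡ p. Let $\hat{\mu}_0$ be the sample estimate of μ, that is, $\hat{\mu}_0$ is the value of $\hat{\mu}$ computed from the sample at hand, and define z0 as $z_0 = (\hat{\mu}_0 - \mu_0)/\hat{\sigma}_{\hat{\mu}}$. The rules of rejection and non-rejection of the usual types of hypothesis tests are then as follows¹:

1 (Two-sided H1) Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if:

$$z_0 < z(\alpha/2) \quad \text{or} \quad z_0 > z(1 - \alpha/2). \tag{17.11}$$

Note that (17.11) is equivalent to:

$$\mu_0 > \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\, z(\alpha/2) \quad \text{or} \quad \mu_0 < \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\, z(1 - \alpha/2). \tag{17.12}$$

Note also that the size of such a test is α since, under the null hypothesis, we have that

$$P\big(z_0 < z(\alpha/2) \ \text{or} \ z_0 > z(1 - \alpha/2)\big) = \alpha/2 + \alpha/2 = \alpha. \tag{17.13}$$

2 (Lower-bounded H1) Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:

$$z_0 > z(1 - \alpha). \tag{17.14}$$

Again, (17.14) is equivalent to

$$\mu_0 < \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\, z(1 - \alpha). \tag{17.15}$$

3 (Upper-bounded H1) Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:

$$z_0 < z(\alpha), \tag{17.16}$$

which is equivalent to

$$\mu_0 > \hat{\mu}_0 - \hat{\sigma}_{\hat{\mu}}\, z(\alpha). \tag{17.17}$$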
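The three rejection rules can be sketched in a few lines of code. This is only an illustration based on the asymptotic normality of the standardized statistic z0; the function name `reject_h0`, the labels used for the three alternatives and the numerical values are hypothetical and do not correspond to DAD's interface.

```python
from statistics import NormalDist

def reject_h0(mu_hat, se, mu0, alpha=0.05, alternative="two-sided"):
    """Return True if H0 is rejected at significance level alpha.

    alternative: "two-sided" (H1: mu != mu0), "greater" (H1: mu > mu0)
    or "less" (H1: mu < mu0).
    """
    z = NormalDist().inv_cdf          # z(p): p-quantile of the standard normal
    z0 = (mu_hat - mu0) / se          # standardized test statistic
    if alternative == "two-sided":
        return z0 < z(alpha / 2) or z0 > z(1 - alpha / 2)
    if alternative == "greater":
        return z0 > z(1 - alpha)
    if alternative == "less":
        return z0 < z(alpha)
    raise ValueError("unknown alternative: " + alternative)

# Hypothetical example: estimated headcount of 0.23 with standard error 0.015,
# tested against mu0 = 0.20 (so that z0 is about 2).
print(reject_h0(0.23, 0.015, 0.20, alpha=0.05, alternative="two-sided"))
print(reject_h0(0.23, 0.015, 0.20, alpha=0.05, alternative="greater"))
print(reject_h0(0.23, 0.015, 0.20, alpha=0.05, alternative="less"))
```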


¹ DAD: Distribution | Confidence Interval.

17.3 p-values and confidence intervals

Table 17.1 sums up the confidence intervals and p-values for each of the three usual types of hypothesis tests considered above.

The p-value of an hypothesis test is the smallest significance level for which H0 would be rejected in favor of some H1. Roughly speaking, a p-value thus indicates the maximum probability that an error is made when one rejects a null hypothesis in favor of the alternative hypothesis. It therefore gives us the "risk" involved in rejecting a null hypothesis. The larger the p-value, the more imprudent it is to reject H0 in favor of H1.

A p-value is typically compared to some subjective error probability threshold such as 1%, 5% or 10%. If the p-value exceeds the threshold, we do not reject the null hypothesis; if the p-value lies beneath the threshold, we reject the null hypothesis in favor of the alternative hypothesis.

A confidence interval (or, more generally, a confidence set) u(1 - α) is a range of values that is constructed using the sample data and that has a specified probability (1 - α) of containing the true parameter of interest μ. The probability value 1 - α associated with a confidence interval is known as the confidence level. More formally, consider the "parameter space" of μ, that is, the range of all of the values that μ could possibly take. A confidence interval u(1 - α) is then an estimate of μ in the sense that there should be a high probability 1 - α that μ is in that interval u(1 - α).

More precisely, a confidence level (1 - α) is the probability that μ is in u(1 - α):

$$P\big(\mu \in u(1 - \alpha)\big) = 1 - \alpha. \tag{17.18}$$

Typical confidence levels are 0.9, 0.95 and 0.99. Note that u(1 - α) is a random variable since it depends on the particular sample drawn from the population. Roughly speaking, a 1 - α confidence level is then the proportion of the times that a confidence interval u(1 - α) will include the unknown parameter when independent samples are taken repeatedly from the same population and a 1 - α confidence interval is calculated for each sample. As for hypothesis tests, confidence intervals can be two sided, lower bounded or upper bounded.

The width of a confidence interval thus gives us some idea about how uncertain we are about the true unknown parameter. In fact, building confidence intervals provides more information than carrying out simple hypothesis tests of the types described above. This is because confidence intervals provide a range of plausible values for the unknown parameter. Looking at Table 17.1, it can also be seen that there is a nice symmetry between the results of hypothesis tests and the confidence intervals that correspond to those tests. Indeed, the confidence intervals of Table 17.1 include all of the hypothesized H0 values that cannot be rejected in favor of the corresponding two-sided, lower-bounded or upper-bounded H1 hypotheses. Said differently, choosing any μ0 value inside these confidence intervals will not lead to the rejection of H0, but choosing any value of μ0 outside these intervals will lead to the rejection of H0 in favor of H1.²
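The correspondence between tests, p-values and confidence intervals can be checked numerically. The sketch below uses the standard asymptotic-normal formulas for the three usual cases; these formulas are our assumption about what such a summary contains, and the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def p_value(z0, alternative="two-sided"):
    """Asymptotic p-value of the test statistic z0 for the given alternative."""
    F = STD_NORMAL.cdf
    if alternative == "two-sided":
        return 2.0 * (1.0 - F(abs(z0)))
    if alternative == "greater":      # H1: mu > mu0
        return 1.0 - F(z0)
    if alternative == "less":         # H1: mu < mu0
        return F(z0)
    raise ValueError("unknown alternative: " + alternative)

def confidence_interval(mu_hat, se, alpha=0.05, alternative="two-sided"):
    """Set of mu0 values that would not be rejected at level alpha."""
    z = STD_NORMAL.inv_cdf
    if alternative == "two-sided":
        half = se * z(1 - alpha / 2)
        return mu_hat - half, mu_hat + half
    if alternative == "greater":      # lower-bounded interval
        return mu_hat - se * z(1 - alpha), math.inf
    if alternative == "less":         # upper-bounded interval
        return -math.inf, mu_hat + se * z(1 - alpha)
    raise ValueError("unknown alternative: " + alternative)

# Hypothetical example: estimate 0.23, standard error 0.015, mu0 = 0.20.
mu_hat, se, mu0 = 0.23, 0.015, 0.20
z0 = (mu_hat - mu0) / se
p = p_value(z0)
lower, upper = confidence_interval(mu_hat, se, alpha=0.05)
print(p, lower, upper)
```

In this example the two-sided p-value lies below 0.05 and, consistently, μ0 = 0.20 falls outside the 95% interval; at α = 0.01 the p-value exceeds the threshold and μ0 falls inside the wider 99% interval.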

Table 17.1: Confidence intervals and p-values associated with the usual hypothesis tests

Image

17.4 Statistical inference using a non-pivotal bootstrap

The technique of the bootstrap (BTS), inspired in large part by Efron (1979), is being applied with increasing frequency in the applied economics literature. BTS is a method for estimating the sampling distribution of an estimator that proceeds by repeatedly resampling one's initial data. For each simulated sample, one recalculates the value of this estimator and then uses the generated BTS distribution to carry out statistical inference. In finite samples, neither the asymptotic nor the BTS sampling distribution is necessarily superior to the other. In infinite samples, they are usually equivalent. When combined together, they usually outperform either approach used individually.

The following steps summarize a typical BTS procedure:

  • Draw n observations with replacement from the initial sample by taking into account the precise way in which the original sample was drawn (replicating, for instance, as closely as possible the survey design);

  • Repeat the previous step B - 1 independent times;

  • Assess the sampling distribution of the estimator (for instance, its sampling variance) using the distribution of B simulated values.

² DAD: Distribution | Confidence Interval.

Let the vector V be made of B estimates of μ, each one computed from one of B simulated (or bootstrap) samples. The vector V is the main tool for capturing the sampling distribution of the estimator $\hat{\mu}$. Thus, we have:

$$V = (\hat{\mu}_1, \hat{\mu}_2, \ldots, \hat{\mu}_B)', \tag{17.19}$$

where $\hat{\mu}_i$ is the estimate of μ computed from the ith bootstrap sample. For two-sided tests and confidence intervals with significance level α or confidence level 1 - α, the number of simulations should be chosen so that α(B + 1)/2 is an integer (to facilitate the computation of critical test values). Let Image be the p-quantile of the vector V: we then have that Image. The rules of rejection and non-rejection are then:

1 Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if:

Image

2 Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:

Image

3 Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:

Image

Table 17.2 summarizes the confidence intervals and p-values for each of the three usual types of hypothesis tests, using non-pivotal bootstrap statistics. The interpretation and the use of these statistics are analogous to what we saw in Section 17.3.
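A non-pivotal bootstrap of the kind described in this section can be sketched as follows. The sketch resamples as under simple random sampling (it does not replicate a survey design), takes the p-quantile of V to be its p(B + 1)-th smallest element, and rejects a two-sided null when μ0 falls outside the interval between the α/2 and 1 - α/2 quantiles. These conventions are assumptions made for illustration, as are the function names and the data.

```python
import random
from statistics import fmean

def bootstrap_estimates(sample, estimator, B=399, seed=0):
    """Return the vector V of B estimates, one per bootstrap sample."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap sample draws n observations with replacement.
    return [estimator(rng.choices(sample, k=n)) for _ in range(B)]

def quantile(V, p):
    """p-quantile of V, taken as its p(B + 1)-th smallest element (assumed convention)."""
    k = round(p * (len(V) + 1))
    return sorted(V)[k - 1]

def percentile_interval(V, alpha=0.05):
    """Two-sided interval between the alpha/2 and 1 - alpha/2 quantiles of V."""
    return quantile(V, alpha / 2), quantile(V, 1 - alpha / 2)

def reject_two_sided(V, mu0, alpha=0.05):
    """Reject H0: mu = mu0 when mu0 lies outside the percentile interval."""
    lower, upper = percentile_interval(V, alpha)
    return mu0 < lower or mu0 > upper

# Hypothetical incomes; B = 399 makes alpha (B + 1) / 2 = 10 an integer for alpha = 0.05.
incomes = [10 + (7 * i) % 23 for i in range(40)]
V = bootstrap_estimates(incomes, fmean, B=399)
print(percentile_interval(V, alpha=0.05))
```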

17.5 Hypothesis testing and confidence intervals using pivotal bootstrap statistics

Let $\bar{\mu}$ and $\hat{\sigma}_{\hat{\mu}_i}$ be respectively the average of the $\hat{\mu}_i$ in V and the estimate of the asymptotic standard deviation of $\hat{\mu}$ computed from the ith bootstrap sample. Let ti be the following asymptotically pivotal statistic:

$$t_i = \frac{\hat{\mu}_i - \bar{\mu}}{\hat{\sigma}_{\hat{\mu}_i}}. \tag{17.23}$$

ti is asymptotically pivotal since it follows asymptotically a standardized N(0, 1) normal distribution which is free of nuisance parameters, i.e., parameters that are unknown.

Let the vector $T$ then be defined as:

$$T = (t_1, t_2, \ldots, t_B)', \tag{17.24}$$

and let t*(p) be the p-quantile of the vector $T$. The rules of rejection and non-rejection of the usual null hypotheses are then as follows:

Table 17.2: Confidence intervals and p-values for the usual hypothesis tests, using non-pivotal bootstrap statistics

Image

Table 17.3: Confidence intervals and p-values for the usual hypothesis tests, using pivotal bootstrap statistics

Image

1 Reject H0: μ = μ0 in favor of H1: μ ≠ μ0 if and only if³:

Image

2 Reject H0: μ ≤ μ0 in favor of H1: μ > μ0 if and only if:

Image

3 Reject H0: μ ≥ μ0 in favor of H1: μ < μ0 if and only if:

Image

Table 17.3 summarizes the confidence intervals and p-values associated with each of the three usual types of hypothesis tests, using pivotal bootstrap statistics. Again, these statistics can be interpreted and used as in Section 17.3.
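The pivotal approach can be sketched in the same spirit. Each bootstrap sample yields both an estimate and its own standard error, and the statistics ti are centered on the average of the bootstrap estimates, as described above. The way the quantiles t*(p) are then turned into a confidence interval (the usual "bootstrap-t" construction), the p(B + 1)-th order statistic convention, the simple random resampling and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import fmean, stdev

def mean_and_se(y):
    """Sample mean and its estimated standard error (simple random sampling)."""
    return fmean(y), stdev(y) / math.sqrt(len(y))

def pivotal_statistics(sample, B=399, seed=0):
    """Return the sorted vector of t_i = (mu_i - mu_bar) / sigma_i.

    A bootstrap sample with zero variance would make sigma_i zero; this is
    not handled here and is very unlikely with varied data.
    """
    rng = random.Random(seed)
    n = len(sample)
    draws = [mean_and_se(rng.choices(sample, k=n)) for _ in range(B)]
    mu_bar = fmean([m for m, _ in draws])
    return sorted((m - mu_bar) / s for m, s in draws)

def quantile(sorted_values, p):
    """p-quantile as the p(B + 1)-th smallest element (assumed convention)."""
    k = round(p * (len(sorted_values) + 1))
    return sorted_values[k - 1]

def bootstrap_t_interval(sample, alpha=0.05, B=399, seed=0):
    """Two-sided interval using bootstrap quantiles in place of normal ones."""
    mu_hat, se = mean_and_se(sample)
    t = pivotal_statistics(sample, B, seed)
    return (mu_hat - se * quantile(t, 1 - alpha / 2),
            mu_hat - se * quantile(t, alpha / 2))

# Hypothetical incomes, for illustration only.
incomes = [10 + (7 * i) % 23 for i in range(40)]
lower, upper = bootstrap_t_interval(incomes, alpha=0.05)
print(lower, upper)
```

Compared with the non-pivotal version, the only change is that the quantiles come from the standardized statistics rather than from the raw bootstrap estimates.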

17.6 References

Much of the statistical inference literature for distributive analysis has focused on deriving the sampling distribution of inequality and poverty indices. See Cowell (1999) and Davies, Green, and Paarsch (1998) for overall reviews, as well as Aaberge (2001b) for cross-country evidence of the role of sampling variability, Barrett and Pendakur (1995) for generalized Gini indices, Beach, Chow, Formby, and Slotsve (1994) for decile means, Bishop, Chakraborti, and Thistle (1990) for Sen's welfare index, Bishop, Chakraborti, and Thistle (1991a) for Gini-based relative deprivation indices, Bishop, Chow, and Zheng (1995b) for decomposable poverty indices, Bishop, Formby, and Zheng (1997) for Sen's poverty index, Bishop, Formby, and Zheng (1998) for Gini-based progressivity indices, Chotikapanich and Griffiths (2001) for approximating S-Gini indices using grouped data, Davidson and Duclos (2000) for various classes of poverty indices with deterministic and estimated poverty lines, Duclos (1997a) for linear progressivity and vertical equity indices, Kakwani (1993) for additive poverty indices, Ogwang (2000) for the Gini index, Preston (1995) for poverty indices with estimated poverty lines, Rongve (1997) for poverty indices with known poverty lines, Rongve and Beach (1997) for the use of approximations to inequality indices, Thistle (1990) for two classes of inequality indices, Van de gaer, Funnell, and McCarthy (1999) and Zheng and Cushing (2001) for comparing inequality across statistically dependent incomes, Xu (1998) for the P(z; ρ = 2) poverty index, and Zheng (2001b) for poverty indices with estimated poverty lines.


³ DAD: Distribution | Confidence Interval.

The second major area of statistical inference research in distributive analysis has dealt with the sampling distribution of tools for stochastic dominance. This includes Anderson (1996) for integrals of distribution functions, Bahadur (1966) for quantiles, Beach and Davidson (1983) for the Lorenz curve, Bishop, Chakraborti, and Thistle (1989) for Generalized Lorenz curves, Bishop and Formby (1999) for a review, Dardanoni and Forcina (1999) for different inference approaches to ordering Lorenz curves, Davidson and Duclos (1997) for Lorenz and concentration curves, Davidson and Duclos (2000) for primal and dual dominance curves, Klavus (2001) for an application to health care financing in Finland, Maasoumi and Heshmati (2000) for an application to Swedish distributions, Xu (1997) for Generalized Lorenz curves, Xu and Osberg (1998) for "deprivation curves", Zheng, Formby, Smith, and Chow (2000) for mean-normalized dominance curves, and Bishop, Chow, and Formby (1994b) and Zheng (1999b) for marginal dominance analysis using Lorenz and quantile curves.

Issues, methods and applications dealing with the multiple hypothesis tests associated with inferring stochastic dominance orderings can be found inter alia in Barrett and Donald (2003) for simulations of the distribution of statistics needed for complete sets of hypothesis tests, Beach and Richmond (1985) for the joint sampling distribution of some of these statistics, Bishop, Formby, and Thistle (1992) and Bishop, Chakraborti, and Thistle (1994a) for applications of the union-intersection approach, Kaur, Prakasa Rao, and Singh (1994) for testing second-order dominance, Kodde and Palm (1986) for Wald criteria for the joint testing of equality and inequality hypotheses, and Wolak (1989) for testing multivariate inequality constraints.

For general references to the bootstrap, see Efron and Tibshirani (1993) and MacKinnon (2002). Specific applications of the bootstrap and other resampling simulation methods to distributive analysis can be found inter alia in Biewen (2000) (for inequality indices), Biewen (2002a) (for a demonstration of the consistency of bootstrapping inequality, poverty and mobility indices), Mills and Zandvakili (1997) (for inequality indices), Palmitesta, Provasi, and Spera (2000) (for the Gini family of inequality indices), Xu (2000) (for iterated bootstrapping of the S-Gini indices), and Karagiannis and Kovačević (2000) and Yitzhaki (1991) (for jackknife calculations of the variance of the Gini).

For the use of the "influence function" in protecting against the possible presence of contaminated data, see Cowell and Victoria-Feser (1996b) (for inequality indices), Cowell and Victoria-Feser (1996a) (for poverty indices), and Cowell and Victoria-Feser (2002) (for social welfare rankings).

Other statistically relevant works can be found (among others) in Elbers, Lanjouw, and Lanjouw (2003) and Hentschel, Lanjouw, Lanjouw, and Poggi (2000) for "poverty mapping" (the estimation of small-area statistics on poverty and inequality using various data sources); Breunig (2001) for a bias correction to the estimation of the coefficient of variation; and Lerman and Yitzhaki (1989) for the impact of using aggregated data in the estimation of inequality indices and in making social welfare rankings.

To generate estimation and statistical inference results using DAD, the analyst does not need to specify the functional form of the distribution of the population of interest. Said differently, to estimate poverty and equity indices, or to generate the standard errors of such indices, we do not need to tell DAD that the incomes we are studying are distributed according to, for instance, a normal, a Pareto, or a beta distribution. In that sense, all of DAD's results are "distribution free".

In some circumstances, it may however be useful to do distributive analysis conditional on some distributional assumption. Examples of such analysis can be found in Chotikapanich and Griffiths (2002) (estimation of Lorenz curves), Cowell, Ferreira, and Litchfield (1998) (density estimation in Brazil), Cheong (2002) (estimation of US Lorenz curves), Horrace, Schmidt, and Witte (1995) (sampling variability of order statistics using parametric and non-parametric approaches), Ogwang and Rao (2000) (parametric models of Lorenz curves), Ryu and Slottje (1999) (parametric approximations of Lorenz curves), Sarabia, Castillo, and Slottje (1999) and Sarabia, Castillo, and Slottje (2001) (general methods for building parametric models of Lorenz curves), and Schluter and Trede (2002b) (parametric estimation of tails of Lorenz curves).






