Academic dissertation
for the degree of Doctor of Philosophy
at Stockholm University,
to be publicly defended in
Lecture Hall 5, Building B, Södra huset, Frescati,
on Monday, 6 October 1997, at 10:00


Sixten Lundström
Licentiate of Philosophy

Department of Statistics
Stockholm University


    In a survey we usually control the sample selection process, so the design weights can be calculated. However, no matter how carefully a survey is designed and conducted, some of the desired data will be missing. If the response probabilities were known, an unbiased two-phase estimator could be constructed, but in practice they are almost never known. Conventional techniques therefore determine proxies for the response probabilities by modelling the response distribution. There will always be some difference between this model and the true response distribution, and consequently the estimator will suffer from nonresponse bias.
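To make the two-phase idea concrete, the following minimal Python sketch estimates a population total from respondents only, under the hypothetical assumption that every response probability is known. The function name `two_phase_total` and its arguments are illustrative and do not come from the thesis: each respondent's value is weighted by the design weight divided by the response probability.

```python
def two_phase_total(y, design_weights, response_probs):
    """Estimate a population total from the respondents only.

    Assumed notation (not from the thesis): for respondent k, y[k] is
    the study variable, design_weights[k] is the inverse inclusion
    probability, and response_probs[k] is the response probability,
    which is treated here as known.
    """
    if not (len(y) == len(design_weights) == len(response_probs)):
        raise ValueError("all inputs must have the same length")
    if any(not (0.0 < theta <= 1.0) for theta in response_probs):
        raise ValueError("response probabilities must lie in (0, 1]")
    # Each respondent represents d_k / theta_k population units.
    return sum(
        d * yk / theta
        for yk, d, theta in zip(y, design_weights, response_probs)
    )


# Three respondents, each with design weight 5.
estimate = two_phase_total(
    y=[4.0, 6.0, 10.0],
    design_weights=[5.0, 5.0, 5.0],
    response_probs=[0.5, 0.5, 1.0],
)
```

In practice the response probabilities passed in here would have to be replaced by modelled proxies, which is where the nonresponse bias described above enters.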
    A standard feature of the derivation of these proxies is the use of auxiliary information. It is also well known that auxiliary information, used for example in a generalized regression estimator, can reduce the sampling error considerably. As a result, the final weights in conventional estimators are usually formed as the product of three weights: the design weight, the nonresponse adjustment weight and the regression weight. It seems clear that simpler approaches have to be developed in order to make effective use of the wealth of auxiliary information that is often available in modern society. It is also necessary to have access to computer software that can handle arbitrary sampling designs and arbitrary specifications of the auxiliary information.
    This thesis suggests an approach that requires neither a response model nor a regression model but nevertheless guarantees effective use of auxiliary information. The proposed point estimator and the proposed variance estimator are general with respect to both the sampling design and the use of auxiliary information. Consequently, it is possible to construct general-purpose computer software for our approach.
    The calibration procedure used in this approach generates final weights that are as close as possible to the design weights while respecting the calibration equation. This equation makes the estimators consistent in the sense that the weights give perfect estimates, that is, reproduce the known totals, when applied to each of the auxiliary variables.
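The calibration step can be illustrated with a minimal Python sketch. The abstract does not say how closeness to the design weights is measured, so the sketch assumes a chi-square distance, which leads to the familiar linear calibration weights; the thesis may use a different distance. The function names, the small linear solver and the example data are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("auxiliary variables are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def linear_calibration_weights(design_weights, aux, totals):
    """Return weights w_k = d_k * (1 + lambda' x_k) for the respondents.

    Assumption (not from the thesis): chi-square distance to the design
    weights. aux[k] is the auxiliary vector x_k of respondent k, and
    totals holds the known population totals of those variables.
    """
    p = len(totals)
    pairs = list(zip(design_weights, aux))
    # Design-weighted totals of the auxiliary variables over respondents.
    est = [sum(d * x[j] for d, x in pairs) for j in range(p)]
    # Design-weighted cross-product matrix: sum of d_k * x_k * x_k'.
    cross = [
        [sum(d * x[i] * x[j] for d, x in pairs) for j in range(p)]
        for i in range(p)
    ]
    lam = solve_linear_system(cross, [t - e for t, e in zip(totals, est)])
    return [d * (1.0 + sum(l * xj for l, xj in zip(lam, x))) for d, x in pairs]


# Four respondents, two per group; the known group sizes are 30 and 50.
d = [10.0, 10.0, 10.0, 10.0]
x = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
w = linear_calibration_weights(d, x, totals=[30.0, 50.0])
# w is [15.0, 15.0, 25.0, 25.0]: each group's weights sum to its known size.
```

With group indicators as the auxiliary variables, as in this example, the weights reduce to ordinary poststratification weights. With continuous auxiliary variables the same function applies, although linear calibration weights can then turn out negative.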
    In the approach that we suggest it is essential to specify the auxiliary information in such a way that it explains both the variation in the response probabilities and the variation in the (main) study variables. In this way both of the aims mentioned above are achieved: the auxiliary information is used to reduce the sampling error as well as the nonresponse bias. In addition, the approach is considerably simpler to apply.

ISBN 91-7153-641-8 

