Academic dissertation
for the degree of Doctor of Philosophy
at Stockholm University,
to be publicly defended in
Lecture Hall 5, Building B, Södra huset,
Frescati,
on Monday, 6 October 1997, at 10.00
by
Sixten Lundström
Licentiate of Philosophy (fil. lic.)
Department of Statistics
Stockholm University
ABSTRACT
In a survey we usually
control the sample selection process, and thus the design weights can be
calculated. However, no matter how carefully a survey is designed and conducted,
we must accept the fact that some of the desired data will be missing.
If the response probabilities were known, an unbiased two-phase estimator
could be constructed. However, the response probabilities are practically
never known. Conventional techniques determine proxies for the response
probabilities by modelling the response distribution. There will always
be some difference between this model and the true response distribution,
and consequently the estimator will suffer from nonresponse bias.
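
As a rough illustration of the two-phase estimator mentioned above, the sketch below weights each respondent by the design weight divided by the response probability. The function name and the toy numbers are assumptions made for illustration; they are not taken from the thesis.

```python
def two_phase_estimate(design_weights, response_probs, y_values):
    """Estimate a population total from respondents only.

    Each respondent k contributes d_k * y_k / theta_k, where d_k is the
    design weight and theta_k the (assumed known) response probability.
    """
    if not (len(design_weights) == len(response_probs) == len(y_values)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for d, theta, y in zip(design_weights, response_probs, y_values):
        if not 0.0 < theta <= 1.0:
            raise ValueError("response probabilities must lie in (0, 1]")
        total += d * y / theta
    return total


# Hypothetical respondent data (illustrative values only).
d = [10.0, 10.0, 20.0]      # design weights
theta = [0.5, 0.8, 1.0]     # response probabilities, assumed known here
y = [4.0, 2.0, 1.0]         # observed study variable

estimate = two_phase_estimate(d, theta, y)
```

In practice the values in `theta` would have to be replaced by model-based proxies, which is where the nonresponse bias described above enters.
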
A standard feature
of the derivation of these proxies is the use of auxiliary information.
It is also well known that auxiliary information, used for example in a
generalized regression estimator, can reduce the sampling error considerably.
This implies that the final weights in conventional estimators are usually
formed as the product of three weights, namely the design weight, the nonresponse
adjustment weight and the regression weight. It seems clear that simpler
approaches have to be developed in order to realize an effective use of
the wealth of auxiliary information that is often available in modern society.
It is also necessary to have access to computer software which can handle
arbitrary sampling designs and arbitrary auxiliary information specifications.
This thesis suggests
an approach which requires neither a response model nor a regression model
but which nevertheless guarantees effective use of auxiliary information.
The proposed point estimator and the proposed variance estimator are general
both in the sampling design and in the use of auxiliary information. Consequently,
it is possible to construct general-purpose computer software for our approach.
The calibration
procedure used in this approach generates final weights which are as close
as possible to the design weights, while respecting the calibration equation.
This equation makes the estimators consistent in the sense that the weights
give perfect estimates when applied to each of the auxiliary variables.
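
The calibration step described above can be sketched as follows. The abstract does not say which distance measure is used, so this sketch assumes the chi-square distance, sum of (w_k - d_k)^2 / d_k, which gives the closed form w_k = d_k (1 + x_k'λ). The function names, the toy data and the auxiliary totals are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: auxiliary variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def calibrate(design_weights, aux, targets):
    """Return weights w_k = d_k * (1 + x_k . lam) with sum_k w_k x_k = targets."""
    p = len(targets)
    # Weighted cross-product matrix T = sum_k d_k x_k x_k'.
    t = [[sum(d * x[i] * x[j] for d, x in zip(design_weights, aux))
          for j in range(p)] for i in range(p)]
    # Gap between the targets and the design-weighted respondent totals.
    gap = [targets[i] - sum(d * x[i] for d, x in zip(design_weights, aux))
           for i in range(p)]
    lam = solve_linear_system(t, gap)
    return [d * (1.0 + sum(l * xi for l, xi in zip(lam, x)))
            for d, x in zip(design_weights, aux)]


# Hypothetical respondent set: x_k = (1, indicator of belonging to group 1).
d = [10.0, 10.0, 10.0, 10.0]
x = [(1.0, 0.0), (1.0, 0.0), (1.0, 1.0), (1.0, 1.0)]
targets = [100.0, 60.0]   # assumed totals: population size, size of group 1

w = calibrate(d, x, targets)
calibrated_totals = [sum(wk * xk[i] for wk, xk in zip(w, x)) for i in range(2)]
```

With these toy inputs the respondents outside group 1 receive weight 20 and those in group 1 receive weight 30, and `calibrated_totals` matches `targets`. Because the respondents' design weights sum to only 40, the same step that enforces the calibration equation also compensates for the missing units.
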
In the approach
that we suggest it is essential to specify the auxiliary information to
be used in such a way that it explains both the variation in the response
probabilities and the variation in the (main) study variables. In this
way we realize both of the aims mentioned above, that is, that the auxiliary
information is used both to reduce the sampling error and to reduce the
nonresponse bias. In addition, the approach is notably simple.
ISBN 91-7153-641-8