Abstract
An important measure of identification risk associated with the release of
microdata or large complex tables is the number or proportion of population
units that can be uniquely identified by some set of characterizing
attributes that partition the population into subpopulations or cells.
Various methods for estimating this quantity from sample data by means of
superpopulation models have been proposed in the literature. In the present
paper the Poisson-inverse Gaussian (PiG) distribution is proposed as a
possible approach within this context. Disclosure risk measures are
discussed and derived under the proposed model, as are various methods of
estimation. An example using real data is given, and the results indicate
that the PiG model may be a useful alternative to other models.
Keywords: statistical disclosure; uniqueness; inverse-Gaussian;
Poisson-mixture; superpopulation.
Michael Carlson, Department of Statistics, Stockholm University,
SE-106 91 Stockholm, Sweden. E-mail: Michael.Carlson@stat.su.se