Assessing Microdata Disclosure Risk Using the Poisson-Inverse
Gaussian Distribution (to appear in Statistics in Transition)


Michael Carlson




Abstract

An important measure of the identification risk associated with the
release of microdata or large complex tables is the number or proportion of
population units that can be uniquely identified by some set of
characterizing attributes, which partition the population into
subpopulations or cells. Various methods based on superpopulation models
have been proposed in the literature for estimating this quantity from
sample data. In the present paper, the Poisson-inverse Gaussian
(PiG) distribution is proposed as a possible approach within this context.
Disclosure risk measures are derived and discussed under the proposed model,
as are various methods of estimation. An example using real data is given,
and the results indicate that the PiG model may be a useful alternative to
other models.

Keywords: statistical disclosure; uniqueness; inverse-Gaussian;
Poisson-mixture; superpopulation.

Michael Carlson, Department of Statistics, Stockholm University,
SE-106 91 Stockholm, Sweden. E-mail: Michael.Carlson@stat.su.se

