
Michael Carlson
Department of Statistics
Stockholm University
SE-106 91 Stockholm, Sweden
A Data-Swapping Technique Using Ranks - A Method for Disclosure Control

A data-swapping technique based on ranks is described and suggested as an
approach to statistical disclosure control. The proposed method utilizes
the rank structure of disjoint subsets of an original data set. The values
of one subset, called the reference set, are exchanged for the values of
the other subsets, called auxiliary sets. The method is mainly intended to
be applied to quantitative data measured on a continuous scale. The resulting
data set will comprise valid samples on an intravariate level but the
association between pairs of variables is weakened in the typical case. The
expected performance is investigated theoretically, and the cross-product
moments and the conditional distribution of the displacement error are
derived. The properties of the resulting swapped set are illustrated by
means of simulation studies, using normally distributed variables. The
results of the simulation studies indicate that the proposed method
performs reasonably well in the bivariate normal case when the correlation
coefficient is used as the measure of association between pairs of
variables. The benefits and possible problems with regard to disclosure
risks are discussed.
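
The rank-based exchange described above can be sketched in a few lines of
Python. This is only one possible reading of the abstract, not the paper's
exact procedure: it assumes a single auxiliary set of the same size as the
reference set, that each variable is swapped independently, and that each
auxiliary value is replaced by the reference value holding the same rank.
The names `ranks`, `rank_swap` and `swap_records` are hypothetical.

```python
def ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def rank_swap(reference, auxiliary):
    """Replace each auxiliary value by the reference value of equal rank.

    Assumption: both lists hold one variable and have equal length.
    """
    if len(reference) != len(auxiliary):
        raise ValueError("reference and auxiliary sets must be the same size")
    ref_sorted = sorted(reference)
    return [ref_sorted[r] for r in ranks(auxiliary)]


def swap_records(reference_rows, auxiliary_rows):
    """Apply rank_swap to each variable (column) separately."""
    n_vars = len(reference_rows[0])
    columns = [
        rank_swap([row[j] for row in reference_rows],
                  [row[j] for row in auxiliary_rows])
        for j in range(n_vars)
    ]
    return [tuple(col[i] for col in columns)
            for i in range(len(auxiliary_rows))]


# Toy data: three records with two variables each.
reference_rows = [(10, 1.0), (30, 3.0), (20, 2.0)]
auxiliary_rows = [(5, 0.9), (1, 0.1), (9, 0.5)]
swapped_rows = swap_records(reference_rows, auxiliary_rows)
print(swapped_rows)  # [(20, 3.0), (10, 1.0), (30, 2.0)]
```

In this toy run every swapped column contains exactly the reference values,
so each variable keeps a valid marginal sample, while the record (20, 3.0)
matches no original reference record, which illustrates how the association
between variables can be weakened.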

Keywords: Concomitants, Data-swapping, Data dissemination, Disclosure
control, Order statistics, Ranks, Synthetic data.
