by Martin Karlberg
Research Report 1997:4
Department of Statistics, Stockholm University, S-106 91 Stockholm, Sweden
Abstract
To estimate the induced triad counts of a network, or a network measure derived from these counts, one may use information from a random sample of vertices drawn from a graph or digraph of known order. Various kinds of information about the graph may be observable from the sample; four are considered here: unlabeled local sets, unlabeled local nets, labeled local sets, and labeled local nets. Unbiased triad count estimators are defined for these observation schemes, and the variances of these estimators (as well as unbiased estimators of these variances) are derived. A main result is that under the unlabeled local nets scheme, the estimator can be written as a sum of vertex attributes; standard estimation formulas for various sampling designs, such as stratified sampling, can therefore be applied directly. The properties of the estimators are compared in a simulation study. The estimators based on labeled local nets are found to be more efficient than the others in estimating the counts of triads without null dyads, while the estimators based on labeled local sets are often better for estimating the counts of triads with one or more null dyads. The estimators based on unlabeled local nets also perform well; this observation scheme provides the most reliable estimates of the estimator variance-covariance matrix.
Key words: Triad Count, Local Set, Local Net, Network Sampling.
Last update: 1997-12-16 / KH