TRIAD COUNT ESTIMATION AND TRANSITIVITY TESTING
IN GRAPHS AND DIGRAPHS

som för avläggande av filosofie doktorsexamen
vid Stockholms universitet
offentligen försvaras i
hörsal 4, hus B, Södra huset, Frescati
tisdagen den 27 maj 1997 kl 10.00

av

Martin Karlberg
fil lic

Statistiska institutionen
Stockholms universitet

ABSTRACT

Triads and transitivity are two concepts within the field of social network analysis that are closely related to each other. We study some estimation and testing problems related to those concepts, using the tools of graph theory; the results obtained could be applied to graphs representing other kinds of data than social relations.
Throughout this thesis, we focus on the role of local networks. A local network for an individual in a friendship network may be regarded as the network between the friends of this person. Two local network attributes that we pay special attention to are the size and the density. The local size of a vertex is defined so that it counts the number of transitive relationships in which that vertex is involved; this definition is not undisputable in the digraph situation, since not all edges in the local network are counted using that definition. We define the local density of a vertex in such a way, that its expected value is equal to the expected overall density of the network under some commonly used simple random graph and random digraph models.
When dealing with triad count estimation, we consider the situation when we have observed information about a probability sample of vertices in a graph or digraph; we let the amount of information observed for each vertex range from the vertex degree to the entire local network of that vertex. Horvitz-Thompson estimators (and variance estimators for those estimators) for the triad counts are given. A main result is that when local networks without information on the identities of the vertices in that network are observed, the triad counts may be expressed as sums of vertex attributes; this greatly facilitates the estimation based on vertex sampling designs more complex than simple random sampling.
Transitivity testing is considered for graphs and digraphs that are observed in their entirety. We study two different kinds of transitivity tests; tests based on the counts of transitive and intransitive triads and triples, and tests based on the mean local density over all vertices. The null hypothesis used is that the graphs and digraphs observed have been generated according to some conditional uniform random graph model that does not imply a high degree of transitivity. The powers of the tests against random graph distributions that generate highly transitive graphs are examined in simulation studies. In other simulation studies, the tests are applied to a large set of school class sociograms. When (undirected) graphs are considered, the test based on the proportion of transitive triads out of the non-vacuously transitive ones is found to be best at detecting transitivity; for digraphs, the test based on the difference between the mean local density and the overall network density is the best transitivity detector.

ISBN 91-7153-595-0

Last update: 990916/CE