Ingegerd Jansson PhD thesis

ON STATISTICAL MODELING OF SOCIAL NETWORKS

Academic dissertation
which, for the degree of Doctor of Philosophy
at Stockholm University,
will be publicly defended in
Lecture Hall 3, Building B, Södra huset, Frescati,
on Friday, 31 October 1997, at 13:00

Ingegerd Jansson
Licentiate of Philosophy

Department of Statistics
Stockholm University

ABSTRACT

    A social network is formed by a collection of individuals and the contacts or relations between them. The focus here is on relations defined by sociometric choices, meaning that the individuals in the network have been asked to choose among the other individuals in the group according to some criteria.
    Some statistical models of the sociometric choice structure in a social network are presented. A recurrent assumption in the models is that a network contains latent, or underlying, structures that cannot be directly observed. The models also share the assumption of independence or conditional independence between entities of the network, which is not uncommon in statistical network models. Such assumptions are often crucial for finding simple ways of estimating the models. Here an attempt is made to find probabilistic models in which independence is an ingredient but which also allow for structure.
    The models are illustrated and evaluated by application to data where the sociometric choices are based on friendship or cooperation. The large number of networks available makes it possible to find and evaluate empirical distributions of the test statistics.
    The simplest model presented assumes dyad independence. Possible simplifications by choice or edge independence are also discussed. It appears that dyad independence cannot be ruled out, but that choice independence is inappropriate.
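
    As an illustration of the quantities a dyad-based model works with, the sketch below counts the mutual, asymmetric, and null dyads in a directed choice matrix. It is not taken from the thesis: the function name dyad_census, the variable names, and the three-person example data are assumptions made for illustration only, and the abstract does not say which statistics the thesis actually uses.

```python
from itertools import combinations


def dyad_census(adj):
    """Count (mutual, asymmetric, null) dyads in a 0/1 choice matrix.

    adj[i][j] == 1 means individual i chooses individual j.
    Hypothetical helper for illustration; not the author's code.
    """
    n = len(adj)
    mutual = asymmetric = null = 0
    # Each unordered pair {i, j} is one dyad.
    for i, j in combinations(range(n), 2):
        ties = adj[i][j] + adj[j][i]
        if ties == 2:
            mutual += 1
        elif ties == 1:
            asymmetric += 1
        else:
            null += 1
    return mutual, asymmetric, null


# Made-up example: 0 and 1 choose each other, 0 also chooses 2.
choices = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
census = dyad_census(choices)
```

    Under dyad independence the pairs are treated as independent units, so counts of this kind summarize the observed network; under the stronger choice independence each single choice adj[i][j] would be treated as independent of adj[j][i] as well.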
    In order to model the transitive structure of a network, a model is introduced that is based on a random choice structure and an unknown underlying clique structure. Two different approaches for estimating the clique structure are discussed: the maximum likelihood approach and the Bayesian approach. The empirical results show no large deviation between model and observations.
    A model of popularity structure is introduced where popularity is viewed as a latent attribute of the individuals in the network. The group of individuals is assumed to have a latent popularity structure, composed of individuals from three popularity groups. It is shown how the popularity structure can be estimated and how latent popularity can be considered in combination with manifest individual attributes.
    A model is also presented for the situation when two sociometric relations are measured on one set of individuals. The model assumes that there exists a latent network structure that cannot be observed directly. When the relations are measured on the network, deviations from the underlying structure might occur with some probability. Approximate estimates of model parameters are given. Empirical findings suggest that the birelational latent model is sufficiently accurate for the data, but that there appear to be convergence problems, possibly due to the relatively large number of parameters in the model.
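
    The idea of two measured relations deviating from one latent structure can be sketched as a small simulation. This is a simplified assumption, not the parameterization used in the thesis: here every off-diagonal entry of a latent 0/1 matrix is flipped independently with a single probability flip_prob, and the names observe, latent, and flip_prob are hypothetical.

```python
import random


def observe(latent, flip_prob, rng):
    """Return one noisy measurement of a latent 0/1 choice matrix.

    Each off-diagonal entry deviates from the latent value with
    probability flip_prob (an assumed, simplified error mechanism).
    """
    n = len(latent)
    observed = [row[:] for row in latent]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < flip_prob:
                observed[i][j] = 1 - latent[i][j]
    return observed


# Made-up latent structure on three individuals.
latent = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
rng = random.Random(1997)
# Two relations (for example friendship and cooperation) measured
# on the same individuals, each a noisy view of the latent network.
relation_a = observe(latent, 0.1, rng)
relation_b = observe(latent, 0.1, rng)
```

    In such a setup the latent matrix and the deviation probabilities are the unknowns to be estimated from the two observed relations, which gives some intuition for why the number of parameters can grow quickly.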

ISBN 91-7153-665-5

Last update: 990916/CE