Using Temporal Network Analysis to Uncover Bias in Collections

The Social Network and Archival Context project (SNAC) presents a documentary-social network of over 3.7 million identities described in finding aids and resource descriptions from archives around the globe. It is a graph in which individuals, corporate bodies, and families are nodes, connected by edges based on their co-occurrence in printed or collected works. SNAC’s network can be treated as an evolving network in which identities are connected only during their lifespans.

I developed an analysis tool and applied three social network analysis metrics (betweenness, harmonic, and degree centrality) temporally across the SNAC network’s 400-year lifespan to understand how the network changes over time. This analysis uncovered an irregularity in overall centrality in the mid-1700s, i.e., a highly dynamic point in time. I expected the cause to include collections of prominent figures in US history, since the majority of holdings referenced by SNAC are in the United States. Instead, it was caused by the connections of English authors James Boswell and Samuel Johnson, whose records from Harvard’s Theatre collection are over-described compared with other identities from that time. I explore the use of these metrics to uncover accession and collection bias.
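
The temporal approach described above can be illustrated with a minimal sketch. This is not the analysis tool from the talk: the identities, lifespans, and edges are invented toy data, the function names are hypothetical, and only degree and harmonic centrality are shown (betweenness is omitted for brevity). It assumes one snapshot per year in which an edge is active only while both identities are alive, then averages each metric over the snapshot.

```python
from collections import deque

# Hypothetical toy data: identity -> (birth_year, death_year).
lifespans = {
    "A": (1700, 1760),
    "B": (1720, 1790),
    "C": (1740, 1800),
    "D": (1770, 1830),
}
# Hypothetical co-occurrence edges between identities.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]


def snapshot(year):
    """Adjacency sets for identities alive in `year`.

    An edge is kept only if both endpoints are alive that year.
    """
    alive = {n for n, (born, died) in lifespans.items() if born <= year <= died}
    adj = {n: set() for n in alive}
    for u, v in edges:
        if u in alive and v in alive:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def harmonic_centrality(adj):
    """Sum of reciprocal shortest-path distances to all reachable nodes."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search on the unweighted snapshot
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[source] = sum(1.0 / d for node, d in dist.items() if node != source)
    return result


def mean_centrality_by_year(years, metric):
    """Average a centrality metric over each yearly snapshot."""
    series = {}
    for year in years:
        scores = metric(snapshot(year))
        series[year] = sum(scores.values()) / len(scores) if scores else 0.0
    return series
```

Plotting `mean_centrality_by_year(range(1700, 1831), degree_centrality)` for a real network would give the kind of time series in which a sudden jump or drop marks a period worth inspecting more closely.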


4:00 PM
20 minutes