Geoffrey Rockwell (Univ of Alberta), Laura Mandell (Texas A&M Univ), Stefan Sinclair (McGill Univ), Matthew Wilkens (Notre Dame), Susan Brown (Univ of Guelph)

Can we find and track theory, especially literary theory, in texts using computers? This project uses subsetting of the HathiTrust (HT) corpus and text mining to track theory through its textual traces, and to develop tools and computational methods for tracking the concept of “theory.”


The project takes a two-step approach:


1. Subsetting: We propose to experiment with two methods for identifying “theoretical” subsets of texts from large collections like the Google-digitized dataset (GDD) of the HathiTrust. The goal would be to identify subsets of the full GDD that are theoretical in different ways.


2. Mining: We would then experiment with large-scale text-mining and clustering methods on these subsets. In particular we propose to try topic modelling and other forms of clustering.
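
As a rough illustration of how the two steps fit together, here is a minimal Python sketch. It is not the project's method: the proposal does not name its two subsetting methods, so the seed-word filter below, its SEED_TERMS list, the 0.05 threshold, and the toy corpus are all assumptions made for the example. Likewise, the mining step is reduced to pairwise cosine similarity over word counts, a simple stand-in for the kind of document comparison that underlies clustering, not topic modelling itself.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical seed vocabulary; the proposal does not specify one.
SEED_TERMS = {"theory", "theoretical", "critique", "discourse", "hermeneutics"}

TOKEN_RE = re.compile(r"[a-z]+")

# Invented toy corpus standing in for volumes drawn from a large collection.
TOY_CORPUS = {
    "a": "Literary theory offers a critique of discourse and theory itself",
    "b": "The recipe calls for two cups of flour and one egg",
    "c": "A theoretical critique of hermeneutics and discourse in theory",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def theory_score(text, seed_terms=SEED_TERMS):
    """Return the fraction of tokens that appear in the seed vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in seed_terms)
    return hits / len(tokens)


def select_subset(corpus, threshold=0.05):
    """Step 1 (subsetting): keep documents scoring at or above the threshold."""
    return {
        doc_id: text
        for doc_id, text in corpus.items()
        if theory_score(text) >= threshold
    }


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarity(subset):
    """Step 2 (mining, simplified): rank document pairs by similarity."""
    vectors = {doc_id: Counter(tokenize(text)) for doc_id, text in subset.items()}
    pairs = [
        (cosine(vectors[x], vectors[y]), x, y)
        for x, y in combinations(sorted(vectors), 2)
    ]
    return sorted(pairs, reverse=True)


if __name__ == "__main__":
    subset = select_subset(TOY_CORPUS)
    for score, left, right in pairwise_similarity(subset):
        print(f"{left} ~ {right}: {score:.3f}")
```

In practice, a different subsetting rule would yield a different subset, which is the sense in which subsets can be "theoretical in different ways"; the similarity step would then be replaced by topic modelling or a proper clustering algorithm run on each subset.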