The Trace of Theory (TracT) project looked at the question “Can we find and track theory, especially literary theory, in texts using computers?” We proposed to do this on the large collections of the HathiTrust using a variety of techniques with the support of the HathiTrust Research Centre. This project brought together researchers who are part of the Text Mining the Novel project (http://novel-tm.ca/) led by Dr. Andrew Piper at McGill University.
The final project report can be found at https://docs.google.com/document/d/1BwWd_tR6TtA7kp6QYQuAQte88Ri4Vvcx9Bho7NTKQ6o/edit?ts=5665d43e#. Please refer to the report for project background, technical details, and community impact.
Geoffrey Rockwell (Univ of Alberta), Laura Mandell (Texas A&M Univ), Stéfan Sinclair (McGill Univ), Matthew Wilkens (Notre Dame), Susan Brown (Univ of Guelph)
Boris Capitanu (HTRC), Kahyun Choi (HTRC)
- Using keyword lists to identify philosophical and literary critical texts
Figure. Galaxy Viewer Exploring Literary Criticism Subset.
Figure. HathiTrust Reader with the Robert Browning Text Seen in Galaxy Viewer.
Some of the insights derived from this work include the following, some of which are represented in the graph below (the Y axis represents annual means of philosophical classification scores; each text is scored from -1 to 1).
Judging by eye, using the literary critical keywords seemed to improve the results. For more information about these tests, please see: https://github.com/htrc/ACS-TT/blob/master/tools/notebooks/ClassifyingLitCrit.ipynb
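As a rough illustration of the keyword-list idea and of the annual means of per-text scores mentioned above, the following Python sketch scores each text between -1 and 1 and averages the scores by year. It is not the method used in the linked notebook: the keyword lists, the scoring formula, and the function names `keyword_score` and `annual_means` are all assumptions made for this example.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical keyword lists, for illustration only (not the project's lists).
PHILOSOPHY_TERMS = {"metaphysics", "epistemology", "ontology", "reason"}
OTHER_TERMS = {"harvest", "railway", "recipe", "cattle"}


def keyword_score(text):
    """Score a text in [-1, 1] from keyword hits.

    Assumed formula: (philosophy hits - other hits) / (all hits).
    A text with no hits from either list scores 0.0.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(token in PHILOSOPHY_TERMS for token in tokens)
    neg = sum(token in OTHER_TERMS for token in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def annual_means(records):
    """Average per-text scores by year.

    records: iterable of (year, text) pairs.
    Returns a dict mapping each year to its mean score, in year order.
    """
    by_year = defaultdict(list)
    for year, text in records:
        by_year[year].append(keyword_score(text))
    return {year: mean(scores) for year, scores in sorted(by_year.items())}
```

A call such as `annual_means([(1850, "metaphysics"), (1851, "the railway")])` yields one mean per year, which is the kind of series a graph of annual means would plot. A simple ratio like this ignores text length and context, so it should be read as a starting point for experimentation rather than a classifier.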
Two conference panels were submitted, and one has already been accepted:
The second potential impact is adapting a visual exploration environment like the Galaxy Viewer (GV) to the exploration of large subsets. Even with successful subsetting techniques, users get too many results to skim. The GV gives humanists a viable way to explore results using topic modelling. It also allows humanists to drill down to the actual texts as a way of checking results against the texts themselves. The challenge now is to refine this interface, compare it to alternatives, and test a robust version with a larger set of users. We believe that the GV could become part of a research interface to the open (and closed) HathiTrust collections that would make them accessible to a broad research audience.
Geoffrey Rockwell, Stéfan Sinclair, Laura Mandell, Susan Brown, and Matthew Wilkens. Project final report. https://docs.google.com/document/d/1BwWd_tR6TtA7kp6QYQuAQte88Ri4Vvcx9Bho7NTKQ6o/edit?ts=5665d43e#