Overview

The Trace of Theory (TracT) project looked at the question “Can we find and track theory, especially literary theory, in texts using computers?” We proposed to do this on the large collections of the HathiTrust using a variety of techniques with the support of the HathiTrust Research Centre. This project brought together researchers who are part of the Text Mining the Novel project (http://novel-tm.ca/) led by Dr. Andrew Piper at McGill University.

...

The final project report can be found at https://docs.google.com/document/d/1BwWd_tR6TtA7kp6QYQuAQte88Ri4Vvcx9Bho7NTKQ6o/edit?ts=5665d43e# . Please refer to the report for project background, technical details, and community impact.

Personnel

Geoffrey Rockwell (Univ of Alberta), Laura Mandell (Texas A&M Univ), Stéfan Sinclair (McGill Univ), Matthew Wilkens (Notre Dame), Susan Brown (Univ of Guelph)

Boris Capitanu (HTRC), Kahyun Choi (HTRC)

Workflow

  1. Using keyword lists to identify philosophical and literary critical texts

...

  Figure. Galaxy Viewer Exploring Literary Criticism Subset.

  Figure. HathiTrust Reader with the Robert Browning Text Seen in Galaxy Viewer.

Findings

Some of the insights derived from this work include the following, some of which are represented in the graph below. The Y axis of the graph represents annual means of philosophical classification scores (each text is scored from -1 to 1).

...

To the naked eye, using the literary critical keywords seemed to improve the results. For more information about these tests, please see: https://github.com/htrc/ACS-TT/blob/master/tools/notebooks/ClassifyingLitCrit.ipynb
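The annual means of classification scores mentioned in this section can be illustrated with a short sketch. This is not the project's own code (see the linked notebooks for that); it only assumes that each text carries a publication year and a score between -1 and 1. The function name, record layout, and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def annual_means(records):
    """Average classification scores per publication year.

    `records` is an iterable of (year, score) pairs; this layout is an
    assumption made for illustration, not the project's data format.
    """
    by_year = defaultdict(list)
    for year, score in records:
        # Scores are described as ranging from -1 to 1; reject anything else.
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside [-1, 1]")
        by_year[year].append(score)
    # Return the means ordered by year, ready for plotting on a Y axis.
    return {year: mean(scores) for year, scores in sorted(by_year.items())}


# Hypothetical sample data, not taken from the project.
sample = [(1850, 0.5), (1850, -0.5), (1851, 1.0), (1851, 0.5)]
print(annual_means(sample))
```

Grouping by year before averaging keeps years with many texts from dominating the curve; each year contributes a single mean value.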

Community Impact

Two conference panels were submitted, and one has already been accepted:

...

The second potential impact is adapting a visual exploration environment like the Galaxy Viewer (GV) to the exploration of large subsets. Even with successful subsetting techniques, users get too many results to skim. The GV gives humanists a viable way to explore results using topic modelling. It also allows humanists to drill down to the actual texts as a way of checking results against the texts themselves. The challenge now is to refine this interface, compare it to alternatives, and test a robust version with a larger set of users. We believe that the GV could become part of a research interface to the open (and closed) HathiTrust collections that would make them accessible to a broad research audience.

Resources

Geoffrey Rockwell, Stéfan Sinclair, Laura Mandell, Susan Brown, and Matthew Wilkens. Project final report. https://docs.google.com/document/d/1BwWd_tR6TtA7kp6QYQuAQte88Ri4Vvcx9Bho7NTKQ6o/edit?ts=5665d43e#

https://github.com/htrc/ACS-TT/blob/master/tools/notebooks/ClassifyingPhilosophicalText.ipynb (Supervised learning: Classifying philosophical texts)

...