...
Figure. Galaxy Viewer Exploring Literary Criticism Subset.
Figure. HathiTrust Reader with the Robert Browning Text Seen in Galaxy Viewer
Findings
Insights derived from this work include the following, some of which are represented in the graph below (the Y axis shows annual means of the philosophical classification scores; each text is scored from -1 to 1):
- the HTRC Genre corpus contains many duplicate texts
- the number of volumes per year in the HTRC Genre corpus increases over time
- there is an issue with the HTRC Genre corpus around the end of the 19th century
- philosophical variation seems to increase over time
- drama is the least philosophical genre
- fiction and poetry seem to get less philosophical over time
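The annual means plotted on the Y axis can be illustrated with a minimal sketch. It assumes each text is reduced to a (year, genre, score) record with a score between -1 and 1. The function name `annual_means`, the record layout, and the sample values are hypothetical and are not taken from the project notebook or the HTRC Genre corpus.

```python
from collections import defaultdict
from statistics import mean


def annual_means(records):
    """Return {genre: {year: mean score}} for (year, genre, score) records."""
    grouped = defaultdict(lambda: defaultdict(list))
    for year, genre, score in records:
        # Scores are expected to lie in the range -1 to 1.
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside [-1, 1]")
        grouped[genre][year].append(score)
    # Average the scores of each genre within each year, in year order.
    return {
        genre: {year: mean(scores) for year, scores in sorted(by_year.items())}
        for genre, by_year in grouped.items()
    }


# Made-up illustrative records (assumption), not data from the corpus.
sample = [
    (1850, "poetry", 0.5),
    (1850, "poetry", 0.0),
    (1850, "drama", -0.5),
    (1851, "drama", -0.25),
    (1851, "drama", -0.75),
]
result = annual_means(sample)
```

Because the corpus contains duplicate texts, a real analysis would probably deduplicate the records before averaging so that repeated volumes do not weight a year's mean; this sketch leaves that step out.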
Judging by eye, using the literary critical keywords seemed to improve the results. For more information about these tests, please see: https://github.com/htrc/ACS-TT/blob/master/tools/notebooks/ClassifyingLitCrit.ipynb
Community Impact
Two conference panels were submitted, and one has already been accepted:
- CSDH (Canadian Society for Digital Humanities: Calgary May-June 2016) Panel proposal on “On the Track of Literary Theory and Philosophy: Explorations of the HathiTrust Collections”. (accepted)
- DH 2016 (Digital Humanities: Krakow July 2016) Panel proposal on “The Trace of Theory: Extracting Subsets from Large Collections” which includes presentations by HTRC staff. (pending)
...