


An approach for visualizing thematic trends within a book.


A Topic Model of Fiction, Jonathan Goodwin

A topic model of fiction built on the genre-classified dataset, currently limited to 1920-22; it may be extended once extracted features are available for volumes after 1922.



HTRC Feature Reader

A Python library that scaffolds working with Extracted Features (EF) data in Pandas. Includes example scripts.


Mimno, David. 2014. "Word counting, squared." David Mimno. Blog.

Forster, Chris. 2015. "A Walk Through the Metadata: Gender in the HathiTrust Dataset." (Based on genre-classified subsets.)