...

An approach for visualizing thematic trends within a book.

A Topic Model of Fiction, Jonathan Goodwin

A topic model of fiction built from the genre-classified dataset, covering 1920-22 only; it may be extended once extracted features become available for volumes published after 1922.

Tools

HTRC Feature Reader

A Python library that scaffolds the use of Extracted Features (EF) data in Pandas, with example scripts.
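
To give a rough sense of the kind of reshaping such a library takes care of, here is a minimal standard-library sketch. It does not use the HTRC Feature Reader API. The sample volume is hand-made, and its field names ("features", "pages", "body", "tokenPosCount") are assumptions about the EF file layout. The helpers token_rows and volume_token_counts are hypothetical names introduced only for illustration.

```python
import json
from collections import Counter

# Hypothetical, hand-made sample shaped like an Extracted Features volume file.
# The field names used here are assumptions, not a guaranteed schema.
SAMPLE_VOLUME = json.loads("""
{
  "id": "example.vol1",
  "features": {
    "pages": [
      {"seq": "00000001",
       "body": {"tokenPosCount": {"whale": {"NN": 2}, "sea": {"NN": 1}}}},
      {"seq": "00000002",
       "body": {"tokenPosCount": {"whale": {"NN": 1}, "sail": {"VB": 1, "NN": 1}}}}
    ]
  }
}
""")


def token_rows(volume, section="body"):
    """Yield (page_seq, token, pos, count) tuples, one per token/POS pair."""
    for page in volume["features"]["pages"]:
        # A page may lack the requested section; treat that as no tokens.
        counts = (page.get(section) or {}).get("tokenPosCount", {})
        for token, by_pos in counts.items():
            for pos, count in by_pos.items():
                yield page["seq"], token, pos, count


def volume_token_counts(volume, section="body"):
    """Sum counts over all pages and POS tags, lowercasing tokens."""
    totals = Counter()
    for _seq, token, _pos, count in token_rows(volume, section):
        totals[token.lower()] += count
    return totals


rows = list(token_rows(SAMPLE_VOLUME))
totals = volume_token_counts(SAMPLE_VOLUME)
```

The flat rows are the natural input for a table: with Pandas installed, pandas.DataFrame(rows, columns=["page", "token", "pos", "count"]) would give one row per page, token, and part-of-speech tag, ready for grouping and plotting.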

...

Mimno, David. 2014. "Word counting, squared." David Mimno. Blog. http://www.mimno.org/articles/wordsim/

Forster, Chris. 2015. "A Walk Through the Metadata: Gender in the HathiTrust Dataset." (Based on genre-classified subsets.) http://cforster.com/2015/09/gender-in-hathitrust-dataset/