HTRC Data Capsule Tutorial

Detailed tutorial for using the HTRC Data Capsule System  

Use Cases

For convenience, your capsule has been pre-loaded with the packages required to follow these examples.

Because these examples run inside the capsule's virtual machine, it helps to open a browser in the capsule (e.g. Firefox) and go to http://wiki.htrc.illinois.edu/pages/viewpage.action?pageId=22085965 or http://bit.ly/1whzT6H. You can then copy and paste the hyperlinks and commands directly from the Wiki.

Use Case: Use Solr API to Retrieve Volume IDs

HTRC provides a search API, the Solr API, that lets scholars find volumes of interest. Searches can run against the full text or against MARC catalog fields. An example query is http://chinkapin.pti.indiana.edu:9994/solr/meta/select/?q=title:war which returns all volumes whose titles contain "war".
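The query above can also be built and parsed from Python. The following is a minimal sketch, not the tutorial's own code. It reuses the base URL from the example query and relies on the standard Solr parameters q, wt, rows and fl. The field name "id" for the volume ID, the helper names, and the sample response with its placeholder IDs are assumptions made for illustration; check an actual response from the server before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example query in this tutorial.
SOLR_SELECT_URL = "http://chinkapin.pti.indiana.edu:9994/solr/meta/select/"


def build_query_url(field, term, rows=10):
    """Build a Solr select URL for a single field:term query."""
    params = {
        "q": f"{field}:{term}",
        "wt": "json",  # standard Solr parameter: request a JSON response
        "rows": rows,  # standard Solr parameter: maximum number of results
        "fl": "id",    # ASSUMPTION: volume IDs are stored in a field named "id"
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


def extract_volume_ids(response_text):
    """Pull the ID of each matching document out of a Solr JSON response."""
    data = json.loads(response_text)
    return [doc["id"] for doc in data["response"]["docs"]]


def fetch_volume_ids(field, term, rows=10):
    """Run the query over HTTP (requires network access to the Solr server)."""
    with urlopen(build_query_url(field, term, rows)) as response:
        return extract_volume_ids(response.read().decode("utf-8"))


# Hypothetical response body with placeholder IDs, used to exercise the parser offline.
SAMPLE_RESPONSE = (
    '{"response": {"numFound": 2, "start": 0, '
    '"docs": [{"id": "example.vol1"}, {"id": "example.vol2"}]}}'
)

print(build_query_url("title", "war"))
print(extract_volume_ids(SAMPLE_RESPONSE))
```

Calling fetch_volume_ids("title", "war") would send the same query as the example URL, assuming the server is reachable from where the code runs.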

Use Case: Perform Text Analytics Using IPython

Use the IPython interactive interface to fetch volume content, then apply a vector space model and topic modeling to the volumes' OCR text.
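To make the vector space model step concrete, here is a small sketch that can be pasted into an IPython session. It is an illustration only, not the code used in this tutorial: it uses just the Python standard library, a simple TF-IDF weighting with a smoothed inverse document frequency, and cosine similarity. The three short strings stand in for OCR text and are made up; all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Represent each document as a sparse {term: weight} TF-IDF vector."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed IDF keeps weights positive even for very common terms.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up stand-ins for the OCR text of three volumes.
docs = [
    "war and peace",
    "the art of war",
    "garden flowers in spring",
]
vectors = tfidf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Documents that share terms score above zero, while documents with no terms in common score exactly zero. For real volumes, a dedicated text-analysis library would usually replace these hand-written helpers.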