This page details the operations and specifications of HTRC Data Capsules. See the HTRC Data Capsule Tutorial for a more detailed, step-by-step guide to using your capsule.


The HTRC Data Capsule is a secure computing environment developed to facilitate non-consumptive text analysis research. Each capsule is a virtual machine (VM) that provides researchers with a desktop they can use to investigate volumes in the HathiTrust Digital Library.

The capsules are configured with special security settings that allow you to interact with them in two modes: maintenance mode and secure mode.

Use the HTRC Portal interface to set up and interact with your capsule VM.

Once a capsule is started via the HTRC Portal interface, you can access your capsule desktop using a VNC client. To run analysis using data from the HTRC corpus repository, you'll need to switch the capsule VM to secure mode. Here is a typical workflow a new user may follow:

  1. Create and start a capsule in the HTRC Portal

  2. Log into the capsule using a VNC client

  3. Configure the software environment of the capsule as needed. Download the scripts or programs you plan to use in your analysis

  4. Switch the capsule to secure mode through the HTRC Portal

  5. Run your analysis against the secure HTRC corpus repository

  6. Move your results to the secure volume storage on the capsule

  7. Switch the capsule back to maintenance mode to regain normal network access
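
The workflow above can be read as a small state machine over the two capsule modes. The Python sketch below is only an illustration of that ordering and does not call any HTRC service. The class and method names (CapsuleModel, download_tools, run_analysis, and so on) are hypothetical. It also assumes, as the steps imply, that a newly started capsule is in maintenance mode, that normal network access is available only in maintenance mode, and that the corpus repository is reachable only in secure mode. VNC login and moving results to secure volume storage are left out.

```python
from enum import Enum


class Mode(Enum):
    MAINTENANCE = "maintenance"
    SECURE = "secure"


class CapsuleModel:
    """Toy model of a capsule's state; hypothetical, not an HTRC API."""

    def __init__(self):
        self.running = False
        # Assumption: a new capsule begins in maintenance mode.
        self.mode = Mode.MAINTENANCE
        self.steps = []

    def start(self):
        self.running = True
        self.steps.append("start")

    def switch_mode(self, mode):
        if not self.running:
            raise RuntimeError("start the capsule before switching modes")
        self.mode = mode
        self.steps.append(f"switch to {mode.value}")

    def has_network_access(self):
        # Assumption: normal network access exists only in maintenance mode.
        return self.running and self.mode is Mode.MAINTENANCE

    def has_corpus_access(self):
        # Assumption: the corpus repository is reachable only in secure mode.
        return self.running and self.mode is Mode.SECURE

    def download_tools(self):
        if not self.has_network_access():
            raise RuntimeError("downloads need maintenance mode")
        self.steps.append("download tools")

    def run_analysis(self):
        if not self.has_corpus_access():
            raise RuntimeError("corpus access needs secure mode")
        self.steps.append("run analysis")


def typical_workflow():
    """Walk the modeled capsule through the workflow order listed above."""
    capsule = CapsuleModel()
    capsule.start()
    capsule.download_tools()
    capsule.switch_mode(Mode.SECURE)
    capsule.run_analysis()
    capsule.switch_mode(Mode.MAINTENANCE)
    return capsule


if __name__ == "__main__":
    for step in typical_workflow().steps:
        print(step)
```

In this model, calling run_analysis() before switching to secure mode raises an error, which mirrors why the software environment is configured first and the mode switch comes just before the analysis.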

HTRC Data Capsule Configurations

Each capsule comes pre-loaded with the following libraries, packages, and data. For more details about installed packages, consult the ReadMe file on the desktop of your capsule.

Python Libraries 

System-level Packages

Sample data and programs

HTRC Data Capsule Operations

Please note: You are required to log in to the HTRC Portal before you can perform these operations.

Create a capsule virtual machine (VM)

Show capsule VM status


Start a capsule VM

Log into a capsule VM

Switch modes of a capsule VM

Stop a capsule VM

Restart a capsule VM

Delete a capsule VM