
The HTRC Data Capsule environment provides individual, secure computing environments to analyze content in the HathiTrust Digital Library. Researchers can create virtual machines (called Capsules) to which they can import and then analyze HathiTrust text data. Researchers can only perform computational analysis within the secure Data Capsule environment and then export the results of their analysis. Volume text may not be exported outside the HTRC Data Capsule, and data products leaving a Capsule must undergo results review prior to release to ensure they meet the HTRC's policy for non-consumptive data exports.

Use a Capsule: https://analytics.hathitrust.org/staticcapsules
Read the guide: HTRC Data Capsule Specifications and Usage Guide
Follow a tutorial: HTRC Data Capsule Step-by-Step Guides


Capsule specifications

What's in a Capsule?

Out of the box, Capsules are Ubuntu virtual machines with increased security settings. Researchers can set certain parameters for their Capsule when they create it. Capsules come with standard data analysis tools pre-installed, ranging from Anaconda and R to Voyant Tools, and can be configured with sample public domain data already loaded for testing. Any other data or tools the researcher plans to use must be brought into the Capsule by the researcher. A Capsule is an almost blank slate that can be customized for each researcher's needs.

Kinds of Capsules

There are two kinds of Capsules: Demo Capsules and Research Capsules. Researchers can request full-corpus access for their Research Capsules; approval is limited to researchers from HathiTrust member institutions.

Read the guide: HTRC Data Capsule Specifications and Usage Guide

Using a Capsule

Creating a Capsule

Capsules are run from the HTRC Analytics website, and you need an HTRC account to log in.

Create an HTRC Analytics account: https://analytics.hathitrust.org/signuppage
Follow a tutorial: HTRC Analytics step-by-step tutorial

You'll use the site to create and administer your Capsule. 

 

Create a Capsule: https://analytics.hathitrust.org/staticcapsules
Follow a tutorial: HTRC Data Capsule Step-by-Step Guides, "Create or convert a Capsule"

Research in a Capsule

In HTRC Analytics, you'll have the option to work with your Capsule either via a remote desktop viewer (to see your Capsule's desktop) or a terminal viewer (to interact with your Capsule via a command-line interface).

Capsules are intended for researchers who want access to HathiTrust text data in a flexible, individually driven environment. Researchers looking for a point-and-click option should explore HTRC Algorithms.
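
As a rough illustration of the kind of analysis a researcher might run inside a Capsule, the sketch below counts word frequencies across a folder of plain-text files and writes only the aggregate counts to a results file. It is a minimal sketch using the Python standard library, not an HTRC-provided tool: the function names, the folder layout, and the file paths are all hypothetical, and it assumes volume text has already been imported into the Capsule as `.txt` files.

```python
from collections import Counter
from pathlib import Path
import csv
import re

# Very simple tokenizer: runs of ASCII letters only (an assumption for brevity).
TOKEN_RE = re.compile(r"[a-z]+")


def count_tokens(volume_dir):
    """Return aggregate token counts across all .txt files in volume_dir."""
    counts = Counter()
    for path in sorted(Path(volume_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        counts.update(TOKEN_RE.findall(text))
    return counts


def write_top_tokens(counts, out_path, n=100):
    """Write the n most frequent tokens and their counts to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["token", "count"])
        writer.writerows(counts.most_common(n))


if __name__ == "__main__":
    # Hypothetical paths; substitute wherever your imported volumes
    # and your results folder actually live inside the Capsule.
    totals = count_tokens("volumes")
    write_top_tokens(totals, "token_counts.csv")
```

The output here is a table of derived counts rather than volume text, which is the general shape of result the export policy is concerned with. Whether a particular file can be released is still decided by HTRC's results review.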

We offer several step-by-step guides for using a Capsule. 

Follow a tutorial: HTRC Data Capsule Step-by-Step Guides
Read the guide: HTRC Data Capsule Specifications and Usage Guide

Development details


The HTRC Data Capsule system was prototyped through funding from the Alfred P. Sloan Foundation (2011-2015). The final report is available at http://hdl.handle.net/2022/19277.

Extension of the HTRC Data Capsule project to larger compute resources and better integration with HTRC worksets was funded by a grant from the Andrew W. Mellon Foundation (2016-2018).

Borders, K., Vander Weele, E., Lau, B., & Prakash, A. (2009, August). Protecting Confidential Data on Personal Computers with Storage Capsules. In Proceedings of the 18th USENIX Security Symposium.

Zeng, J., Ruan, G., Crowell, A., Prakash, A., & Plale, B. (2014, June). Cloud Computing Data Capsules for Non-Consumptive Use of Texts. In Proceedings of the 5th ACM workshop on Scientific cloud computing (pp. 9-16). ACM.

Plale, B., Prakash, A., & McDonald, R. (2015). The Data Capsule for Non-Consumptive Research: Final Report. Available from http://hdl.handle.net/2022/19277