Introduction
The HTRC Extracted Features (EF) dataset contains informative page-level characteristics of the text of public domain volumes in the HathiTrust Digital Library (HTDL). It covers slightly more than 5 million volumes, representing about 38% of the total digital content of the HTDL.
...