HTRC Docs
HathiTrust Research Center
HTRC Derived Datasets
Extracted Features Dataset [v.1.5]
Page Information
Title: Extracted Features Dataset [v.1.5]
Author: Alex Kinnaman (Oct 26, 2016)
Last Changed by: Eleanor Dickson Koehl (Jun 02, 2020)
Tiny Link: https://wiki.htrc.illinois.edu/x/GoA5Ag
Incoming Links
Documentation (3)
Page: HathiTrust+Bookworm
Page: HTRC Derived Datasets
Page: About the Collection
Hierarchy
Parent Page: HTRC Derived Datasets
Labels
There are no labels assigned to this page.
Recent Changes
Time - Editor - Comment
Jun 02, 2020 06:16 - Eleanor Dickson Koehl
Jun 01, 2020 10:28 - Eleanor Dickson Koehl
May 27, 2020 09:21 - Boris Capitanu
May 27, 2020 09:19 - Boris Capitanu - Updated samples and download links
May 22, 2020 10:25 - Boris Capitanu - fixed sample counts
Outgoing Links
External Links (14)
creativecommons.org/licenses/by/4.0/
www.hathitrust.org/hathifiles
en.wikipedia.org/wiki/Paratext
https://www.loc.gov/marc/bibliographic/bd008.html
https://oclc.org/en-US/home.html
https://code.google.com/p/language-detection/wiki/LanguageL…
dx.doi.org/10.13012/J8TD9V7M
www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_…
https://code.google.com/p/language-detection/
dx.doi.org/10.13012/J8X63JT3
mailto:htrc-help@hathitrust.org
opennlp.apache.org/
www.hathitrust.org/bib_rights_determination
https://www.loc.gov/marc/bibliographic/bd260.html
Documentation (3)
Page: Extracted Features [v.2.0]
Page: Downloading Extracted Features
Page: Extracted Features Dataset [v.1.5]