Version          Status          Maturity            Comments
1.0              Released        Released
1.0.1-SNAPSHOT   Under testing   Release candidate   Added token count feature


...

The HTRC Data API is a RESTful web service for retrieving multiple volumes, pages of volumes, and METS metadata documents.  To support efficient bulk retrieval of volumes and pages, the Data API deviates from typical RESTful API design: resources are not identified in the URL path but are sent as request parameters.

Version 1.0.1 of the Data API adds a feature that allows clients to request token counts for volumes.

API

Retrieve Volumes

Description: Returns requested volumes
URL: /volumes
Supported Response Types: application/zip (normal response), text/plain (error response)
Method: POST
Request Types: application/x-www-form-urlencoded
Request Headers: Content-Type: application/x-www-form-urlencoded
Request Body: Request parameters as body content.  See Parameters below.

Parameters

Note: All parameter values must be URL encoded.

Name: volumeIDs
Description: The list of volume IDs to be retrieved.
Type: string
Default value: N/A
Required: yes
Note: Volume IDs are separated by the pipe character '|'.

Name: concat
Description: The flag that enables the concatenation option.
Type: boolean
Default value: false
Required: no
Note: See the section on response format for details on its impact on the returned data.

Name: mets
Description: The flag that indicates whether METS documents should be returned.
Type: boolean
Default value: false
Required: no

Name: version
Description: A specific version of the Data API to use.
Type: string
Default value: N/A
Required: no
Note: Not implemented.  Placeholder only.

Responses

HTTP Status Code: 200 (ok)
Response Body: A binary Zip stream
Response Type: application/zip
Description: Page content and metadata of the requested volumes aggregated as a Zip stream.

HTTP Status Code: 400 (bad request)
Response Body: Missing required parameter volumeIDs
Response Type: text/plain
Description: The required parameter volumeIDs is missing in the request.

HTTP Status Code: 400 (bad request)
Response Body: Malformed Volume ID list. Offending token: ${token}
Response Type: text/plain
Description: The value for volumeIDs is malformed and the Data API cannot parse it.  ${token} will be the token that caused the error.

Example

Description: Request for volumes inu.3011012 and uc2.ark:/13960/t2qxv15, with the concatenation option enabled so that each volume is a single text file in the returned Zip stream.
Raw volumeIDs: inu.3011012|uc2.ark:/13960/t2qxv15
URL encoded request body: volumeIDs=inu.3011012%7Cuc2.ark%3A%2F13960%2Ft2qxv15&concat=true
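
The encoding step in the example above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the helper name build_volumes_body is hypothetical and is not part of the Data API.

```python
from urllib.parse import urlencode


def build_volumes_body(volume_ids, concat=False, mets=False):
    """Return the URL-encoded form body for a POST to /volumes."""
    if not volume_ids:
        raise ValueError("volumeIDs is required")
    # Volume IDs are joined with the pipe character before encoding.
    params = {"volumeIDs": "|".join(volume_ids)}
    # Both flags default to false, so they are sent only when enabled.
    if concat:
        params["concat"] = "true"
    if mets:
        params["mets"] = "true"
    return urlencode(params)


body = build_volumes_body(["inu.3011012", "uc2.ark:/13960/t2qxv15"], concat=True)
print(body)
```

The printed string should match the URL encoded request body shown in the example, and it is sent as the POST body with the Content-Type header set to application/x-www-form-urlencoded.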

...

Description: Returns requested pages
URL: /pages
Supported Response Types: application/zip (normal response), text/plain (error response)
Method: POST
Request Types: application/x-www-form-urlencoded
Request Headers: Content-Type: application/x-www-form-urlencoded
Request Body: Request parameters as body content.  See Parameters below.

Parameters

Note: All parameter values must be URL encoded.

Name: pageIDs
Description: The list of page IDs to be retrieved.
Type: string
Default value: N/A
Required: yes
Note: Page IDs are separated by the pipe character '|'.

Name: concat
Description: The flag that enables the concatenation option.
Type: boolean
Default value: false
Required: no
Note: See the section on response format for details on its impact on the returned data.  "concat" and "mets" cannot both be set.

Name: mets
Description: The flag that indicates whether METS documents should be returned.
Type: boolean
Default value: false
Required: no
Note: "concat" and "mets" cannot both be set.

Name: version
Description: A specific version of the Data API to use.
Type: string
Default value: N/A
Required: no
Note: Not implemented.  Placeholder only.

Responses

HTTP Status Code: 200 (ok)
Response Body: A binary Zip stream
Response Type: application/zip
Description: Page content and metadata of the requested pages aggregated as a Zip stream.

HTTP Status Code: 400 (bad request)
Response Body: Missing required parameter pageIDs
Response Type: text/plain
Description: The required parameter pageIDs is missing in the request.

HTTP Status Code: 400 (bad request)
Response Body: Malformed Page ID list. Offending token: ${token}
Response Type: text/plain
Description: The value for pageIDs is malformed and the Data API cannot parse it.  ${token} will be the token that caused the error.

HTTP Status Code: 400 (bad request)
Response Body: Conflicting parameters in page retrieval. Offending Parameters: ${param1}, ${param2}
Response Type: text/plain
Description: Some request parameters conflict.  ${param1} and ${param2} will be the names of the parameters that caused the conflict.  In the current version of the Data API, this is most likely caused by setting both "mets" and "concat" for page retrieval.

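
A client may want to turn the plain-text 400 responses listed above into structured values. The following is a sketch only: it assumes the server emits the message templates exactly as documented, and the function name classify_error and the category labels are hypothetical.

```python
import re

# One pattern per documented error template for page retrieval.
_ERROR_PATTERNS = [
    ("missing_parameter",
     re.compile(r"^Missing required parameter (?P<param>\S+)$")),
    ("malformed_page_ids",
     re.compile(r"^Malformed Page ID list\. Offending token: (?P<token>.*)$")),
    ("conflicting_parameters",
     re.compile(r"^Conflicting parameters in page retrieval\. "
                r"Offending Parameters: (?P<param1>[^,]+), (?P<param2>.+)$")),
]


def classify_error(body):
    """Map a text/plain error body to (category, extracted fields)."""
    text = body.strip()
    for kind, pattern in _ERROR_PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.groupdict()
    # Anything that does not match a documented template is left unclassified.
    return "unknown", {}


print(classify_error(
    "Conflicting parameters in page retrieval. Offending Parameters: mets, concat"
))
```

If the actual wording or punctuation differs from the templates, the function falls back to "unknown", so the raw body should still be logged.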
Example

Description: Request for the 1st, 2nd, 20th, and 30th pages of the volume inu.3011012, and the 11th, 17th, 22nd, 30th, 45th, and 55th pages of the volume uc2.ark:/13960/t2qxv15.  Each page is returned as a separate text file, along with the corresponding METS document of each volume, in the returned Zip stream.
Raw pageIDs: inu.3011012[1,2,20,30]|uc2.ark:/13960/t2qxv15[11,45,30,17,22,55]
URL encoded request body: pageIDs=inu.3011012%5B1%2C2%2C20%2C30%5D%7Cuc2.ark%3A%2F13960%2Ft2qxv15%5B11%2C45%2C30%2C17%2C22%2C55%5D&mets=true
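
The pageIDs syntax in the example above (a volume ID followed by a bracketed, comma-separated page list, with volumes joined by the pipe character) can be generated programmatically. This is a minimal sketch; format_page_ids and build_pages_body are hypothetical helper names, and the client-side check for "concat" and "mets" simply mirrors the documented conflict rule.

```python
from urllib.parse import urlencode


def format_page_ids(pages_by_volume):
    """Build the raw pageIDs value from {volume ID: [page numbers]}."""
    parts = []
    for volume_id, pages in pages_by_volume.items():
        page_list = ",".join(str(page) for page in pages)
        parts.append(f"{volume_id}[{page_list}]")
    return "|".join(parts)


def build_pages_body(pages_by_volume, concat=False, mets=False):
    """Return the URL-encoded form body for a POST to /pages."""
    if concat and mets:
        # The API rejects this combination with a 400 response.
        raise ValueError('"concat" and "mets" cannot both be set')
    params = {"pageIDs": format_page_ids(pages_by_volume)}
    if concat:
        params["concat"] = "true"
    if mets:
        params["mets"] = "true"
    return urlencode(params)


requested = {
    "inu.3011012": [1, 2, 20, 30],
    "uc2.ark:/13960/t2qxv15": [11, 45, 30, 17, 22, 55],
}
raw_page_ids = format_page_ids(requested)
pages_body = build_pages_body(requested, mets=True)
print(raw_page_ids)
print(pages_body)
```

The two printed lines should match the raw pageIDs value and the URL encoded request body from the example.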

Token Count

Description: Returns token counts of requested volumes
URL: /tokencount
Supported Response Types: application/zip (normal response), text/plain (error response)
Method: POST
Request Types: application/x-www-form-urlencoded
Request Headers: Content-Type: application/x-www-form-urlencoded
Request Body: Request parameters as body content.  See Parameters below.

Parameters

Name: volumeIDs
Description: The list of volumes whose tokens are to be counted.
Type: string
Default value: N/A
Required: yes
Note: Volume IDs are separated by the pipe character '|'.

Name: level
Description: Specifies whether token counts are aggregated at the volume level or the page level.  Use "volume" for volume level and "page" for page level.
Type: string
Default value: volume
Required: no

Name: sortBy
Description: Specifies the field on which the token count output is sorted.  Use "token" to sort by the token's UTF-8 order and "count" to sort by token count.  If left unspecified, no ordering of the results is guaranteed.
Type: string
Default value: N/A
Required: no
Note: Token ordering is based on UTF-8 character values, so the character "Z" comes before the character "a" in ascending order.  When sorting by count, tokens with the same count are ordered by their UTF-8 values.

Name: sortOrder
Description: Specifies whether the output uses ascending or descending ordering.  Use "asc" for ascending and "desc" for descending.
Type: string
Default value: asc
Required: no
Note: This parameter only has an effect when used together with sortBy; otherwise it is ignored.

Name: version
Description: A specific version of the Data API to use.
Type: string
Default value: N/A
Required: no
Note: Not implemented.  Placeholder only.
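
The token count parameters and the ordering rules described above can be illustrated as follows. This is a sketch under stated assumptions: build_tokencount_body, sort_by_token, and sort_by_count are hypothetical helpers, the sample counts are made up for illustration, and the sorting functions only mimic the documented ascending order locally rather than reproduce the server's implementation.

```python
from urllib.parse import urlencode


def build_tokencount_body(volume_ids, level=None, sort_by=None, sort_order=None):
    """Return the URL-encoded form body for a POST to /tokencount."""
    if level not in (None, "volume", "page"):
        raise ValueError('level must be "volume" or "page"')
    if sort_by not in (None, "token", "count"):
        raise ValueError('sortBy must be "token" or "count"')
    if sort_order not in (None, "asc", "desc"):
        raise ValueError('sortOrder must be "asc" or "desc"')
    params = {"volumeIDs": "|".join(volume_ids)}
    if level:
        params["level"] = level
    if sort_by:
        params["sortBy"] = sort_by
        # sortOrder is ignored without sortBy, so it is only sent alongside it.
        if sort_order:
            params["sortOrder"] = sort_order
    return urlencode(params)


def sort_by_token(counts):
    """Ascending order by the token's UTF-8 bytes ("Z" sorts before "a")."""
    return sorted(counts.items(), key=lambda item: item[0].encode("utf-8"))


def sort_by_count(counts):
    """Ascending order by count; ties are broken by the token's UTF-8 bytes."""
    return sorted(counts.items(), key=lambda item: (item[1], item[0].encode("utf-8")))


tokencount_body = build_tokencount_body(
    ["inu.3011012"], level="page", sort_by="count", sort_order="desc"
)
sample_counts = {"apple": 1, "Zebra": 3, "the": 1}  # made-up data
print(tokencount_body)
print(sort_by_token(sample_counts))
print(sort_by_count(sample_counts))
```

With the sample data, sorting by token places "Zebra" first because uppercase letters have lower UTF-8 values than lowercase ones, while sorting by count places "apple" and "the" ahead of "Zebra" and orders that tie by token.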

 

Zip Structure Layout

The directory structure layout of the Zip stream returned from the Data API may be one of the following patterns depending on the optional parameters:

...
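
Whichever layout applies, a client can enumerate the entries of the returned Zip stream without knowing the directory structure in advance. The sketch below uses a stand-in archive built in memory; the entry names are placeholders and do not reflect the Data API's actual layout, and decoding entries as UTF-8 text is an assumption.

```python
import io
import zipfile


def read_zip_entries(zip_bytes):
    """Return {entry name: decoded text} for every file entry in the archive."""
    entries = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep files only
            # Assumption: entries are UTF-8 encoded text.
            entries[info.filename] = archive.read(info).decode("utf-8")
    return entries


# Stand-in for a response body; placeholder names, not the real layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("placeholder-a.txt", "first entry")
    archive.writestr("placeholder-dir/placeholder-b.txt", "second entry")

sample_entries = read_zip_entries(buffer.getvalue())
for name, text in sample_entries.items():
    print(name, len(text))
```

For large responses it may be preferable to write the stream to a temporary file and open it from disk instead of holding it in memory.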

While the Data API does not itself enforce any authentication or authorization mechanism, it is typically deployed behind an OAuth2 Servlet Filter.  A client making a request to the Data API through the OAuth2 Servlet Filter must first obtain a valid OAuth2 token from the token service and then present the token as an additional HTTP request header.  Please refer to "Using WSO2 Identity Server as the OAuth2 Provider for HTRC" for usage details.
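
As an illustration of presenting the token in a request header, the sketch below prepares (but does not send) a POST request. Several things here are assumptions rather than documented behavior: the header form "Authorization: Bearer <token>" is the common OAuth2 convention and is not confirmed by this page, the base URL and token value are placeholders, and build_authorized_request is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_authorized_request(base_url, path, params, access_token):
    """Prepare a form-encoded POST request that carries an OAuth2 token."""
    body = urlencode(params).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Assumption: bearer-token header; the source does not name the header.
        "Authorization": f"Bearer {access_token}",
    }
    return Request(base_url.rstrip("/") + path, data=body, headers=headers, method="POST")


request = build_authorized_request(
    "https://data-api.example.org",  # placeholder base URL
    "/volumes",
    {"volumeIDs": "inu.3011012|uc2.ark:/13960/t2qxv15", "concat": "true"},
    "PLACEHOLDER_TOKEN",  # obtained from the token service in practice
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the prepared request to urllib.request.urlopen would send it; the referenced WSO2 Identity Server page remains the authority on how the token is obtained and which header the filter expects.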