...

volumeIDs=<volumeID_list>[&concat=true]

<volumeID_list> := <volumeID> [“|” <volumeID> [“|”...]]
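
As an illustration, the parameter string above could be assembled as in the following Python sketch. The function name build_volume_params is hypothetical and not part of the Data API, and the sketch assumes the value is encoded as application/x-www-form-urlencoded.

```python
from urllib.parse import urlencode


def build_volume_params(volume_ids, concat=False):
    """Join volume IDs with "|" and URL-encode the resulting parameter string."""
    params = {"volumeIDs": "|".join(volume_ids)}
    if concat:
        # "concat" is optional; it is only sent when concatenation is wanted.
        params["concat"] = "true"
    return urlencode(params)


print(build_volume_params(["inu.3011012", "uc2.ark:/13960/t2qxv15"], concat=True))
```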

...

A correct request to the Data API receives a 200_OK status, and the body of the response is a binary ZIP stream, with the MIME type of “application/zip”.

If the request was sent with concat=false, the returned ZIP file has the following structure:

volumes.zip
  <cleaned_volumeID_1>/
      00000001.txt
      00000002.txt
      …
      000nnnnn.txt
  <cleaned_volumeID_2>/
      00000001.txt
      00000002.txt
      …
      000xxxxx.txt
  …
  <cleaned_volumeID_n>/
      00000001.txt
      …
      000zzzzz.txt
  ERROR.err

In the ZIP file, each requested volume is in its own directory, and the name of the directory is a “cleaned” mutation of the original volumeID in a filesystem safe format.  Each page of the volume is a text file named by its page sequence as an eight-digit fixed-length zero-padded number.

<cleaned_volumeID> := <prefix>“.”<cleaned_ID_string>

A “cleaned” volumeID consists of the institution-identifying prefix, followed by the dot character ".", followed by the “cleaned” version of the original ID string.  This “cleaning” procedure is defined in the pairtree specification (https://wiki.ucop.edu/display/Curation/PairTree) so the directory name is filesystem safe.  However, it is worth pointing out that the prefix does not undergo the cleaning procedure.  This is because the prefix is not a part of the pairtree structure.
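
A client that needs to predict the directory name for a given volumeID could apply the cleaning itself. The following Python sketch follows the identifier-cleaning rules as commonly described for pairtree (hex-encode a small set of characters as ^hh, then swap "/", ":" and "." for "=", "+" and ","). The function names are hypothetical, and the sketch assumes the prefix ends at the first dot; it is not the Data API's own implementation.

```python
# Characters that pairtree hex-encodes as "^hh" (in addition to any byte
# outside the visible ASCII range 0x21-0x7e).
HEX_ENCODE = set('"*+,<=>?\\^|')

# Single-character substitutions applied after hex-encoding.
SWAP = {"/": "=", ":": "+", ".": ","}


def clean_id_string(id_string):
    """Return the filesystem-safe form of the ID string (without prefix)."""
    cleaned = []
    for byte in id_string.encode("utf-8"):
        char = chr(byte)
        if byte < 0x21 or byte > 0x7E or char in HEX_ENCODE:
            cleaned.append("^%02x" % byte)
        else:
            cleaned.append(SWAP.get(char, char))
    return "".join(cleaned)


def clean_volume_id(volume_id):
    """Clean a volumeID, leaving the institution prefix untouched."""
    # Assumption: the prefix is everything before the first dot.
    prefix, _, id_string = volume_id.partition(".")
    return prefix + "." + clean_id_string(id_string)


print(clean_volume_id("uc2.ark:/13960/t2qxv15"))
```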

If an error occurs after the Data API has started sending the stream, the Data API injects a special ERROR.err file into the ZIP stream and then properly terminates the stream to prevent corruption.  ERROR.err contains information on the error.  If this file is present, the ZIP file may not contain all requested volumes/pages.   
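
The structure above can be consumed along the following lines. This is a minimal Python sketch, assuming the response body has already been downloaded into memory and that page files are UTF-8 text; read_volume_pages is a hypothetical helper name, not part of the Data API.

```python
import io
import zipfile


def read_volume_pages(zip_bytes):
    """Return ({cleaned_volume_id: {page_sequence: text}}, error_text_or_None)."""
    volumes = {}
    error = None
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name == "ERROR.err":
                # Presence of this entry means some volumes/pages may be missing.
                error = archive.read(name).decode("utf-8", errors="replace")
                continue
            if not name.endswith(".txt"):
                continue  # skip directory entries and anything unexpected
            directory, _, filename = name.rpartition("/")
            sequence = int(filename[: -len(".txt")])
            # Assumption: page text is UTF-8 encoded.
            text = archive.read(name).decode("utf-8")
            volumes.setdefault(directory, {})[sequence] = text
    return volumes, error
```

A caller would typically check the returned error value before treating the result as complete.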

On the other hand, if the request was sent with concat=true, the returned ZIP file has the following structure:

volumes.zip
   <cleaned_volumeID_1>.txt
   <cleaned_volumeID_2>.txt
   …
   <cleaned_volumeID_n>.txt
   ERROR.err

In this case, the pages belonging to each volume are no longer individual text files.  Instead, they are concatenated into a single text file, and the name of the text file is the cleaned volume ID with the .txt extension.

If the entry ERROR.err is present, there was an issue retrieving some resources, and the ZIP stream was terminated prematurely to prevent corruption.  Some requested resources may be missing.
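
For the concatenated layout, each entry maps directly to one volume. The Python sketch below shows one way to load such a response, assuming the ZIP bytes are already in memory and the text is UTF-8 encoded; read_concatenated_volumes is a hypothetical helper name.

```python
import io
import zipfile


def read_concatenated_volumes(zip_bytes):
    """Return ({cleaned_volume_id: full_text}, error_text_or_None)."""
    texts = {}
    error = None
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name == "ERROR.err":
                # Some requested resources may be missing from the archive.
                error = archive.read(name).decode("utf-8", errors="replace")
            elif name.endswith(".txt"):
                # The entry name is the cleaned volume ID plus ".txt".
                texts[name[: -len(".txt")]] = archive.read(name).decode("utf-8")
    return texts, error
```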

1.2.4 Requesting pages

To request specific pages from a volume instead of retrieving the entire volume, set the request URL to:

https://<server>:<port>/data-api/pages?

and the request body to a URL encoded "pageIDs" parameter string:

pageIDs=<pageID_list>[&concat=true]

where

<pageID_list> := <pageID> [“|” <pageID> [“|”...]]

<pageID> := <volumeID>“[”<pageID_1> [“,”<pageID_2> [“,”...]]“]”

 

Example:
https://silvermaple.pti.indiana.edu:25443/data-api/pages?pageIDs=inu.3011012%5B1%2C2%2C20%2C30%5D%7Cuc2.ark%3A%2F13960%2Ft2qxv15%5B11%2C45%2C30%2C17%2C22%2C55%5D
A pageID list consists of one or more pageIDs separated by the pipe character “|”.  Each pageID is a volumeID followed by a comma-separated list of page sequence numbers enclosed in square brackets.  The client must perform URL encoding on the pageID list so that it conforms to the application/x-www-form-urlencoded content type.

The optional parameter “concat” controls how these pages should be returned.  If omitted, the default value for concat is false.
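
The pageID list and its encoding can be sketched in Python as follows. The helper name build_page_params and the dictionary layout of the input are assumptions made for illustration; only the "pageIDs" and "concat" parameter names come from the description above.

```python
from urllib.parse import urlencode


def build_page_params(pages_by_volume, concat=False):
    """Build the URL-encoded "pageIDs" parameter string.

    pages_by_volume maps each volumeID to a list of page sequence numbers.
    """
    page_ids = [
        "{}[{}]".format(volume_id, ",".join(str(seq) for seq in sequences))
        for volume_id, sequences in pages_by_volume.items()
    ]
    params = {"pageIDs": "|".join(page_ids)}
    if concat:
        # Omitting "concat" leaves the default (false) in effect.
        params["concat"] = "true"
    return urlencode(params)


requested = {
    "inu.3011012": [1, 2, 20, 30],
    "uc2.ark:/13960/t2qxv15": [11, 45, 30, 17, 22, 55],
}
print(build_page_params(requested))
```

Passing concat=True appends &concat=true to the encoded string.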

...