HTRC Workset Builder 2.0 (Beta)

The HTRC Workset Builder 2.0 Beta is a new tool and interface built over the HTRC Extracted Features Dataset to enable both volume-level metadata search and volume- and page-level unigram (single word) text search in order to build worksets.

As indicated, this interface is currently in beta, and may change.

Quick Guide: How to Build a Workset

To build a workset using HTRC Workset Builder, follow these general steps:

  • Step 0: Decide what to search

  • Step 1: Perform a unigram (single-term) full text or metadata (using field tags) search at the desired level (page or volume)

  • Step 2: Filter through the results and select desired items to add to your workset

  • Step 3: Repeat steps 1 and 2 as necessary until your workset is ready

  • Step 4: "Export as workset" to HTRC Analytics or download workset metadata.

More information about each step in the workset building process is included in the sections below.

Searching

You can search for unigrams (single terms) in either volume metadata or the text of each volume, by page. Because this search is built on the Extracted Features Dataset, bigram and larger n-gram (phrase) searches are not possible in the conventional sense. Instead, you can enter multiple terms in quotation marks (e.g. "snow ski"), which will return volumes and pages where all terms in the query co-occur. In this way, a search for "snow ski" is equivalent to a Solr syntax search of "snow" AND "ski". See more details about Solr syntax, and a link to a guide, in the section below.
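The equivalence described above can be sketched in a few lines of Python. This is only an illustration of how a quoted multi-term query maps onto an AND of its terms; the function name `phrase_to_solr` is hypothetical and is not part of the Workset Builder.

```python
def phrase_to_solr(query: str) -> str:
    """Rewrite a quoted multi-term query as the AND of its individual terms."""
    # Drop surrounding whitespace and quotation marks, then split on spaces
    terms = query.strip().strip('"').split()
    # Quote each term and join with the Solr AND operator
    return " AND ".join(f'"{term}"' for term in terms)


print(phrase_to_solr('"snow ski"'))  # "snow" AND "ski"
```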

Searches are not case sensitive and, by default, are conducted on pages recognized as English. Click “Search all Languages” if you prefer to search everything. You can also limit your search to specific languages by choosing from those that appear under “Show other languages.” Limit your search to a specific part of speech by using the checkboxes under the language, though be aware that part-of-speech search is not available for every language. Wildcard matching is possible using '?' for a single character and '*' for multiple characters, for example 'canad?' and '*land'.
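To get a feel for how the two wildcards behave, the sketch below imitates them locally with Python's standard `fnmatch` module, which happens to use the same '?' and '*' symbols. It is an approximation for illustration only; the token list is made up, and the search service itself may treat edge cases differently.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, token: str) -> bool:
    """Check a token against a '?' / '*' wildcard pattern, ignoring case."""
    # Lowercase both sides because searches are not case sensitive
    return fnmatchcase(token.lower(), pattern.lower())


# Hypothetical tokens, chosen only to exercise the two example patterns
tokens = ["Canada", "canal", "England", "island", "landing"]

print([t for t in tokens if matches("canad?", t)])  # ['Canada']
print([t for t in tokens if matches("*land", t)])   # ['England', 'island']
```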

There are four options for searching: text, metadata, combined and advanced. Text search will search the full text of the volume, at the page level, for a unigram or unigrams (e.g. searching all volumes for the word "rose"). Results returned are volume-level metadata, along with page-level metadata and bag-of-words tokens. Since this is a page-based search, you will receive one result for each page that matches your query. To see results grouped by volume (multiple page results under one volume heading and one result), check the box marked "Sort & Group by Volume" under the search bar. Metadata search will search volume-level metadata fields for given unigrams and return volumes in which the terms appear in the field selected in the drop-down menu, or in any field (e.g. searching for all volumes with a publication place of "bl", the MARC code for Brazil). A combined search allows both text and metadata search in a single query (e.g. a search to return all volumes published in Brazil in which the term "rose" appears on a page). Advanced search allows users familiar with Solr syntax (see below for more information) to construct and execute their own queries.

Search Text

Text search allows users to search volume full text, by page, for unigrams (single terms). A version of phrase searching can be achieved using the method described under Searching: using quotation marks to initiate a search for multiple terms on the same page of text. By default, text searches will search English-language volumes. If you'd like to search all languages, check the "Search pages in all languages" button underneath the search bar. Currently, part-of-speech information is only available for volumes in English, German, Portuguese, Danish, Dutch and Swedish. Other languages are coded in volume metadata and thus can be retrieved, but no part-of-speech data is available for those volumes.

Text searches will retrieve volume-level metadata, but the main unit of search and retrieval is the page. Since many pages in a single volume may contain a given unigram, users may wish to check the "Sort & Group by Volume" button directly beneath the search bar, which will present results by volume, each with a list of pages on which the term appears, rather than as multiple volume entries each with a single associated page.
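The difference between the two result views can be pictured with a small Python sketch. The volume IDs and page sequences below are invented for illustration, and `group_by_volume` is a hypothetical helper, not something the interface exposes.

```python
from collections import defaultdict

# Hypothetical page-level hits: one (volume ID, page sequence) pair per result
hits = [("vol.001", 12), ("vol.002", 3), ("vol.001", 47), ("vol.001", 5)]


def group_by_volume(page_hits):
    """Collapse page-level hits into one entry per volume with a page list."""
    grouped = defaultdict(list)
    for volume_id, page_seq in page_hits:
        grouped[volume_id].append(page_seq)
    # Sort volumes and pages so the grouped view is stable and easy to scan
    return {vol: sorted(pages) for vol, pages in sorted(grouped.items())}


print(group_by_volume(hits))  # {'vol.001': [5, 12, 47], 'vol.002': [3]}
```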

Search Metadata

Metadata search is similar to text search, supporting unigram queries across all metadata fields. Search for a single term in all fields by choosing “All Fields” from the drop-down (this is the default metadata search), or search a specific field by selecting it from the menu.

To search multiple metadata fields, enter your search query in a format called Solr syntax. For example, a search for “title_t:hamlet AND names_t:shakespeare” will return all volumes with “hamlet” in the title field and “shakespeare” in the names field. The same search can be used with an “OR” operator to return volumes that satisfy either condition. Note that there is no space between the colon and the search term. For more information, see this Solr query syntax guide.
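If you assemble such queries often, a small helper can keep the formatting consistent. The following Python sketch is an assumption-laden illustration: `field_query` is a hypothetical name, and it only produces the query string to paste into the search box; it does not contact any service.

```python
def field_query(fields: dict, operator: str = "AND") -> str:
    """Join field:term clauses with AND or OR, with no space after the colon."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    clauses = []
    for field, term in fields.items():
        # Wrap multi-word terms in quotation marks so they stay in one clause
        value = f'"{term}"' if " " in term else term
        clauses.append(f"{field}:{value}")
    return f" {operator} ".join(clauses)


print(field_query({"title_t": "hamlet", "names_t": "shakespeare"}))
# title_t:hamlet AND names_t:shakespeare
```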

For information on the volume metadata fields, including possible values for fields with controlled vocabularies, see the Metadata field values section below.

Searching dates

Metadata date fields can be searched using 4-digit years, e.g. "1984". To search a date range, you must use Solr syntax (more information in the next section) to string multiple date queries together using the OR operator. For example, a search for volumes published in 1880-1882 would be input as “date_t:1880 OR date_t:1881 OR date_t:1882” in the Advanced search tab. An asterisk (*) can be substituted for a digit in the date to execute a wildcard search. For example, a search of “date_t:188*” will return all volumes with a publication date of 1880 through 1889, inclusive.
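Typing out a long range by hand is error-prone, so the OR chain can be generated instead. The Python sketch below is a minimal illustration; `year_range_query` is a hypothetical helper, and it uses the `date_t` field name from the example above as its default.

```python
def year_range_query(start: int, end: int, field: str = "date_t") -> str:
    """Build an OR chain with one field:year clause per year, inclusive."""
    if start > end:
        raise ValueError("start year must not be later than end year")
    return " OR ".join(f"{field}:{year}" for year in range(start, end + 1))


print(year_range_query(1880, 1882))
# date_t:1880 OR date_t:1881 OR date_t:1882
```

The resulting string can be pasted into the Advanced search tab.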

Volume Metadata

Volume-level metadata searching is not case-sensitive. Searching is built upon Solr syntax. Wildcard matching uses '?' for a single character and '*' for multiple characters, for example 'canad?' and '*land'. This interface is designed for unigram (single-term) searching; however, a phrase-based search is possible using quotation marks: "term1 term2". This is equivalent to a Solr syntax search of "term1" OR "term2" and will return pages with either term present.

You can also search for specific volume metadata fields using the field:term form, such as names_t:Austen or names_t:"Austen, Jane".

In alphabetical order, searchable fields are:

Field name | Field name in Solr syntax | Field description
Access Profile | accessProfile_t: | The code that indicates full-text access level.
Bibliographic Format | bibliographicFormat_t: | The code for the format of a volume (e.g. book, serial, etc.).
Classification DDC | classification_ddc_t: | The Dewey Decimal Classification call number supplied by the originating library.
Classification LCC | classification_lcc_t: | The Library of Congress Classification call number supplied by the originating library.
Date Created | dateCreated_t: | The time this metadata object was processed.
Genre | genre_t: | The genre of the volume.
Handle URL | handleUrl_t: | The persistent identifier for the given volume.
HathiTrust Record Number | hathitrustRecordNumber_t: | The unique record number for the volume in the HathiTrust Digital Library.
HathiTrust Bib URL | htBibUrl_t: | The HathiTrust Bibliographic API call for the volume.
Imprint | imprint_t: | The place of publication, publisher, and publication date of the given volume.
ISBN | isbn_t: | The International Standard Book Number for a volume.
ISSN | issn_t: | The International Standard Serial Number for a volume.
Issuance | issuance_t: | The bibliographic level of a volume.
Language | language_t: | The primary language of the volume in MARC language code format.
Last Update Date | lastUpdateDate_t: | The date this page was last updated.
LCCN | lccn_t: | The Library of Congress Control Number for a volume.
Names | names_t: | The personal and corporate names associated with a volume.
OCLC | oclc_t: | The control number(s) assigned to each bibliographic record by the Online Computer Library Center (OCLC).
Publication Date | pubDate_t: | The publication year.
Publication Place | pubPlace_t: | The publication location code in MARC country code format.
Rights Attributes | rightsAttributes_t: | The rights attributes for a volume.
Schema Version | schemaVersion_t: | A version identifier for the format and structure of this metadata object.
Source Institution Record Number | sourceInstitutionRecordNumber_t: | The unique record number for the volume from its original institution.
Source Institution | sourceInstitution_t: | The institution code of the original institution that contributed the volume.
Title | title_t: | Title of the volume.
Type of Resource | typeOfResource_t: | The format type of a volume.
Volume Identifier | volumeIdentifier_t: | A unique identifier for the current volume. This is the same identifier used in the HathiTrust and HathiTrust Research Center corpora.

Results

On the results page, you will find the title and unique HathiTrust ID for each volume that matches your search. Hover over the title of a volume to trigger a pop-up with brief metadata information. If a text search is part of your query, the page sequence on which your search term appears is also listed. A link to download the Extracted Features data for each volume in your results is generated as well. Following the link to a page sequence shows the Extracted Features data for that page (tokens in a variety of views, parts of speech, and token frequencies), links to download the Extracted Features files for the page or volume, and a thumbnail of the page image with a link back to HathiTrust to view the page directly. Below the volume title, the full metadata record is available in human-readable form if you click the "Show metadata" link to expand the section.

Filtering

On your results page, you will see seven different fields that can be used to filter search results. These fields are derived from the same metadata fields listed above: genre, language, copyright status, author, place of publication, original bibliographic format and classification. To apply facets to filter results, check the boxes next to the desired facets under a given heading (e.g. "author"), and then click the "Apply Filter" button that appears next to the section heading. To filter by values in more than one field/heading, you must apply the filter in one field before doing so in another.

Exporting results

Once you have a desired set of results on the search page, you may work with or save them in a number of ways. For result sets of fewer than 40 million pages, you may export the entire set of search results as a list of volume or page IDs or as a metadata manifest with one row per volume, or you may download the Extracted Features files for each volume in your result set. When downloading Extracted Features files, be mindful that a result set with many volumes will be a large download, which can take minutes to complete.

To create a workset from more than one search, add the volumes you'd like to include to your shopping cart, either by checking the boxes next to each volume and pressing the yellow "Add" button or by checking the boxes and dragging and dropping the volumes onto the shopping cart icon at the top right of the results page. To change the checkboxes for every item at once, use "Select All On This Page" or "Deselect All" to check or uncheck all results. Similarly, the "Invert Selection" button changes all checked items to unchecked, and vice versa.

Once your shopping cart contains the volumes you're interested in, click on the cart icon to view your workset. From this page, you can directly import your shopping cart as a workset in HTRC Analytics by clicking the "Export as Workset" button at the top right of the shopping cart page. You'll be taken to HTRC Analytics, prompted to sign in if you aren't already signed in, and asked to provide a name and description for your new workset. Once a workset is imported into Analytics, you can get metadata information, share the workset, and run algorithms over its contents.

Saving worksets

Since worksets created using the Workset Builder are tied to a web browser session, once you exit your browser, your workset will not be saved unless you export it in one of the above ways. For the same reason, worksets cannot be shared via URL unless they are imported into HTRC Analytics.

Metadata field values

The Bibliographic Format field is coded; for example, BK represents a book. The possible codes used for this field are:

BK:Books | CF:Computer Files | CR:Continuing Resources | MP:Maps
MU:Music | MX:Mixed Materials | SE:Serials | VM:Visual Materials
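When working with exported metadata, it can help to translate these codes back into readable labels. The Python sketch below copies the eight codes listed above into a lookup table; `describe_format` is a hypothetical helper, and the fallback text for unknown codes is an assumption.

```python
# Bibliographic Format codes, copied from the list above
BIB_FORMATS = {
    "BK": "Books",
    "CF": "Computer Files",
    "CR": "Continuing Resources",
    "MP": "Maps",
    "MU": "Music",
    "MX": "Mixed Materials",
    "SE": "Serials",
    "VM": "Visual Materials",
}


def describe_format(code: str) -> str:
    """Return the label for a format code, tolerating case and whitespace."""
    return BIB_FORMATS.get(code.strip().upper(), "Unknown format code")


print(describe_format("bk"))  # Books
print(describe_format("SE"))  # Serials
```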


The Place of Publication field pubPlace_t uses MARC country codes, with the following possible values:

aa:Albania | abc:Alberta | aca:Australian Capital Territory | ae:Algeria
af:Afghanistan | ag:Argentina | ai:Armenia (Republic) | aj:Azerbaijan
aku:Alaska | alu:Alabama | am:Anguilla | an:Andorra
ao:Angola | aq:Antigua and Barbuda | aru:Arkansas | as:American Samoa
at:Australia | au:Austria | aw:Aruba | ay:Antarctica
azu:Arizona | ba:Bahrain | bb:Barbados | bcc:British Columbia
bd:Burundi | be:Belgium | bf:Bahamas | bg:Bangladesh
bh:Belize | bi:British Indian Ocean Territory | bl:Brazil | bm:Bermuda Islands
bn:Bosnia and Herzegovina | bo:Bolivia | bp:Solomon Islands | br:Burma
bs:Botswana | bt:Bhutan | bu:Bulgaria | bv:Bouvet Island
bw:Belarus | bx:Brunei | ca:Caribbean Netherlands | cau:California
cb:Cambodia | cc:China | cd:Chad | ce:Sri Lanka
cf:Congo (Brazzaville) | cg:Congo (Democratic Republic) | ch:China (Republic : 1949- ) | ci:Croatia
cj:Cayman Islands | ck:Colombia | cl:Chile | cm:Cameroon
co:Curaçao | cou:Colorado | cq:Comoros | cr:Costa Rica
ctu:Connecticut | cu:Cuba | cv:Cabo Verde | cw:Cook Islands
cx:Central African Republic | cy:Cyprus | dcu:District of Columbia | deu:Delaware
dk:Denmark | dm:Benin | dq:Dominica | dr:Dominican Republic
ea:Eritrea | ec:Ecuador | eg:Equatorial Guinea | em:Timor-Leste
enk:England | er:Estonia | es:El Salvador | et:Ethiopia
fa:Faroe Islands | fg:French Guiana | fi:Finland | fj:Fiji
fk:Falkland Islands | flu:Florida | fm:Micronesia (Federated States) | fp:French Polynesia
fr:France | fs:Terres australes et antarctiques françaises | ft:Djibouti | gau:Georgia
gb:Kiribati | gd:Grenada | gh:Ghana | gi:Gibraltar
gl:Greenland | gm:Gambia | go:Gabon | gp:Guadeloupe
gr:Greece | gs:Georgia (Republic) | gt:Guatemala | gu:Guam
gv:Guinea | gw:Germany | gy:Guyana | gz:Gaza Strip
hiu:Hawaii | hm:Heard and McDonald Islands | ho:Honduras | ht:Haiti
hu:Hungary | iau:Iowa | ic:Iceland | idu:Idaho
ie:Ireland | ii:India | ilu:Illinois | inu:Indiana
io:Indonesia | iq:Iraq | ir:Iran | is:Israel
it:Italy | iv:Côte d'Ivoire | iy:Iraq-Saudi Arabia Neutral Zone | ja:Japan
ji:Johnston Atoll | jm:Jamaica | jo:Jordan | ke:Kenya
kg:Kyrgyzstan | kn:Korea (North) | ko:Korea (South) | ksu:Kansas
ku:Kuwait | kv:Kosovo | kyu:Kentucky | kz:Kazakhstan
lau:Louisiana | lb:Liberia | le:Lebanon | lh:Liechtenstein
li:Lithuania | lo:Lesotho | ls:Laos | lu:Luxembourg
lv:Latvia | ly:Libya | mau:Massachusetts | mbc:Manitoba
mc:Monaco | mdu:Maryland | meu:Maine | mf:Mauritius
mg:Madagascar | miu:Michigan | mj:Montserrat | mk:Oman
ml:Mali | mm:Malta | mnu:Minnesota | mo:Montenegro
mou:Missouri | mp:Mongolia | mq:Martinique | mr:Morocco
msu:Mississippi | mtu:Montana | mu:Mauritania | mv:Moldova
mw:Malawi | mx:Mexico | my:Malaysia | mz:Mozambique
nbu:Nebraska | ncu:North Carolina | ndu:North Dakota | ne:Netherlands
nfc:Newfoundland and Labrador | ng:Niger | nhu:New Hampshire | nik:Northern Ireland
nju:New Jersey | nkc:New Brunswick | nl:New Caledonia | nmu:New Mexico
nn:Vanuatu | no:Norway | np:Nepal | nq:Nicaragua
nr:Nigeria | nsc:Nova Scotia | ntc:Northwest Territories | nu:Nauru
nuc:Nunavut | nvu:Nevada | nw:Northern Mariana Islands | nx:Norfolk Island
nyu:New York (State) | nz:New Zealand | ohu:Ohio | oku:Oklahoma
onc:Ontario | oru:Oregon | ot:Mayotte | pau:Pennsylvania
pc:Pitcairn Island | pe:Peru | pf:Paracel Islands | pg:Guinea-Bissau
ph:Philippines | pic:Prince Edward Island | pk:Pakistan | pl:Poland
pn:Panama | po:Portugal | pp:Papua New Guinea | pr:Puerto Rico
pw:Palau | py:Paraguay | qa:Qatar | qea:Queensland
quc:Québec (Province) | rb:Serbia | re:Réunion | rh:Zimbabwe
riu:Rhode Island | rm:Romania | ru:Russia (Federation) | rw:Rwanda
sa:South Africa | sc:Saint-Barthélemy | scu:South Carolina | sd:South Sudan
sdu:South Dakota | se:Seychelles | sf:Sao Tome and Principe | sg:Senegal
sh:Spanish North Africa | si:Singapore | sj:Sudan | sl:Sierra Leone
sm:San Marino | sn:Sint Maarten | snc:Saskatchewan | so:Somalia
sp:Spain | sq:Swaziland | sr:Surinam | ss:Western Sahara
st:Saint-Martin | stk:Scotland | su:Saudi Arabia | sw:Sweden
sx:Namibia | sy:Syria | sz:Switzerland | ta:Tajikistan
tc:Turks and Caicos Islands | tg:Togo | th:Thailand | ti:Tunisia
tk:Turkmenistan | tl:Tokelau | tma:Tasmania | tnu:Tennessee
to:Tonga | tr:Trinidad and Tobago | ts:United Arab Emirates | tu:Turkey
tv:Tuvalu | txu:Texas | tz:Tanzania | ua:Egypt
uc:United States Misc. Caribbean Islands | ug:Uganda | uik:United Kingdom Misc. Islands | un:Ukraine
up:United States Misc. Pacific Islands | utu:Utah | uv:Burkina Faso | uy:Uruguay
uz:Uzbekistan | vau:Virginia | vb:British Virgin Islands | vc:Vatican City
ve:Venezuela | vi:Virgin Islands of the United States | vm:Vietnam | vp:Various places
vra:Victoria | vtu:Vermont | wau:Washington (State) | wea:Western Australia
wf:Wallis and Futuna | wiu:Wisconsin | wj:West Bank of the Jordan River | wk:Wake Island
wlk:Wales | ws:Samoa | wvu:West Virginia | wyu:Wyoming
xa:Christmas Island (Indian Ocean) | xb:Cocos (Keeling) Islands | xc:Maldives | xd:Saint Kitts-Nevis
xe:Marshall Islands | xf:Midway Islands | xga:Coral Sea Islands Territory | xh:Niue
xj:Saint Helena | xk:Saint Lucia | xl:Saint Pierre and Miquelon | xm:Saint Vincent and the Grenadines
xn:Macedonia | xna:New South Wales | xo:Slovakia | xoa:Northern Territory
xp:Spratly Island | xr:Czech Republic | xra:South Australia | xs:South Georgia and the South Sandwich Islands
xv:Slovenia | xx:No place | xxc:Canada | xxk:United Kingdom
xxu:United States | ye:Yemen | ykc:Yukon Territory | za:Zambia


Deprecated MARC Place of Publication codes, which are no longer being actively used, but may still appear in data, are:

-ac:Ashmore and Cartier Islands | -ai:Anguilla | -air:Armenian S.S.R. | -ajr:Azerbaijan S.S.R.
-bwr:Byelorussian S.S.R. | -cn:Canada | -cp:Canton and Enderbury Islands | -cs:Czechoslovakia
-cz:Canal Zone | -err:Estonia | -ge:Germany (East) | -gn:Gilbert and Ellice Islands
-gsr:Georgian S.S.R. | -hk:Hong Kong | -iu:Israel-Syria Demilitarized Zones | -iw:Israel-Jordan Demilitarized Zones
-jn:Jan Mayen | -kgr:Kirghiz S.S.R. | -kzr:Kazakh S.S.R. | -lir:Lithuania
-ln:Central and Southern Line Islands | -lvr:Latvia | -mh:Macao | -mvr:Moldavian S.S.R.
-na:Netherlands Antilles | -nm:Northern Mariana Islands | -pt:Portuguese Timor | -rur:Russian S.F.S.R.
-ry:Ryukyu Islands, Southern | -sb:Svalbard | -sk:Sikkim | -sv:Swan Islands
-tar:Tajik S.S.R. | -tkr:Turkmen S.S.R. | -tt:Trust Territory of the Pacific Islands | -ui:United Kingdom Misc. Islands
-uk:United Kingdom | -unr:Ukraine | -ur:Soviet Union | -us:United States
-uzr:Uzbek S.S.R. | -vn:Vietnam, North | -vs:Vietnam, South | -wb:West Berlin
-xi:Saint Kitts-Nevis-Anguilla | -xxr:Soviet Union | -ys:Yemen (People's Democratic Republic) | -yu:Serbia and Montenegro


The codes used for the Language field language_t are also derived from MARC language codes, with the following possible values:

aar:Afar | abk:Abkhaz | ace:Achinese | ach:Acoli
ada:Adangme | ady:Adygei | afa:Afroasiatic (Other) | afh:Afrihili (Artificial language)
afr:Afrikaans | ain:Ainu | aka:Akan | akk:Akkadian
alb:Albanian | ale:Aleut | alg:Algonquian (Other) | alt:Altai
amh:Amharic | ang:English, Old (ca. 450-1100) | anp:Angika | apa:Apache languages
ara:Arabic | arc:Aramaic | arg:Aragonese | arm:Armenian
arn:Mapuche | arp:Arapaho | art:Artificial (Other) | arw:Arawak
asm:Assamese | ast:Bable | ath:Athapascan (Other) | aus:Australian languages
ava:Avaric | ave:Avestan | awa:Awadhi | aym:Aymara
aze:Azerbaijani | bad:Banda languages | bai:Bamileke languages | bak:Bashkir
bal:Baluchi | bam:Bambara | ban:Balinese | baq:Basque
bas:Basa | bat:Baltic (Other) | bej:Beja | bel:Belarusian
bem:Bemba | ben:Bengali | ber:Berber (Other) | bho:Bhojpuri
bih:Bihari (Other) | bik:Bikol | bin:Edo | bis:Bislama
bla:Siksika | bnt:Bantu (Other) | bos:Bosnian | bra:Braj
bre:Breton | btk:Batak | bua:Buriat | bug:Bugis
bul:Bulgarian | bur:Burmese | byn:Bilin | cad:Caddo
cai:Central American Indian (Other) | car:Carib | cat:Catalan | cau:Caucasian (Other)
ceb:Cebuano | cel:Celtic (Other) | cha:Chamorro | chb:Chibcha
che:Chechen | chg:Chagatai | chi:Chinese | chk:Chuukese
chm:Mari | chn:Chinook jargon | cho:Choctaw | chp:Chipewyan
chr:Cherokee | chu:Church Slavic | chv:Chuvash | chy:Cheyenne
cmc:Chamic languages | cop:Coptic | cor:Cornish | cos:Corsican
cpe:Creoles and Pidgins, English-based (Other) | cpf:Creoles and Pidgins, French-based (Other) | cpp:Creoles and Pidgins, Portuguese-based (Other) | cre:Cree
crh:Crimean Tatar | crp:Creoles and Pidgins (Other) | csb:Kashubian | cus:Cushitic (Other)
cze:Czech | dak:Dakota | dan:Danish | dar:Dargwa
day:Dayak | del:Delaware | den:Slavey | dgr:Dogrib
din:Dinka | div:Divehi | doi:Dogri | dra:Dravidian (Other)
dsb:Lower Sorbian | dua:Duala | dum:Dutch, Middle (ca. 1050-1350) | dut:Dutch
dyu:Dyula | dzo:Dzongkha | efi:Efik | egy:Egyptian
eka:Ekajuk | elx:Elamite | eng:English | enm:English, Middle (1100-1500)
epo:Esperanto | est:Estonian | ewe:Ewe | ewo:Ewondo
fan:Fang | fao:Faroese | fat:Fanti | fij:Fijian
fil:Filipino | fin:Finnish | fiu:Finno-Ugrian (Other) | fon:Fon
fre:French | frm:French, Middle (ca. 1300-1600) | fro:French, Old (ca. 842-1300) | frr:North Frisian
frs:East Frisian | fry:Frisian | ful:Fula | fur:Friulian
gaa:Gã | gay:Gayo | gba:Gbaya | gem:Germanic (Other)
geo:Georgian | ger:German | gez:Ethiopic | gil:Gilbertese
gla:Scottish Gaelic | gle:Irish | glg:Galician | glv:Manx
gmh:German, Middle High (ca. 1050-1500) | goh:German, Old High (ca. 750-1050) | gon:Gondi | gor:Gorontalo
got:Gothic | grb:Grebo | grc:Greek, Ancient (to 1453) | gre:Greek, Modern (1453-)
grn:Guarani | gsw:Swiss German | guj:Gujarati | gwi:Gwich'in
hai:Haida | hat:Haitian French Creole | hau:Hausa | haw:Hawaiian
heb:Hebrew | her:Herero | hil:Hiligaynon | him:Western Pahari languages
hin:Hindi | hit:Hittite | hmn:Hmong | hmo:Hiri Motu
hrv:Croatian | hsb:Upper Sorbian | hun:Hungarian | hup:Hupa
iba:Iban | ibo:Igbo | ice:Icelandic | ido:Ido
iii:Sichuan Yi | ijo:Ijo | iku:Inuktitut | ile:Interlingue
ilo:Iloko | ina:Interlingua (International Auxiliary Language Association) | inc:Indic (Other) | ind:Indonesian
ine:Indo-European (Other) | inh:Ingush | ipk:Inupiaq | ira:Iranian (Other)
iro:Iroquoian (Other) | ita:Italian | jav:Javanese | jbo:Lojban (Artificial language)
jpn:Japanese | jpr:Judeo-Persian | jrb:Judeo-Arabic | kaa:Kara-Kalpak
kab:Kabyle | kac:Kachin | kal:Kalâtdlisut | kam:Kamba
kan:Kannada | kar:Karen languages | kas:Kashmiri | kau:Kanuri
kaw:Kawi | kaz:Kazakh | kbd:Kabardian | kha:Khasi
khi:Khoisan (Other) | khm:Khmer | kho:Khotanese | kik:Kikuyu
kin:Kinyarwanda | kir:Kyrgyz | kmb:Kimbundu | kok:Konkani
kom:Komi | kon:Kongo | kor:Korean | kos:Kosraean
kpe:Kpelle | krc:Karachay-Balkar | krl:Karelian | kro:Kru (Other)
kru:Kurukh | kua:Kuanyama | kum:Kumyk | kur:Kurdish
kut:Kootenai | lad:Ladino | lah:Lahndā | lam:Lamba (Zambia and Congo)
lao:Lao | lat:Latin | lav:Latvian | lez:Lezgian
lim:Limburgish | lin:Lingala | lit:Lithuanian | lol:Mongo-Nkundu
loz:Lozi | ltz:Luxembourgish | lua:Luba-Lulua | lub:Luba-Katanga
lug:Ganda | lui:Luiseño | lun:Lunda | luo:Luo (Kenya and Tanzania)
lus:Lushai | mac:Macedonian | mad:Madurese | mag:Magahi
mah:Marshallese | mai:Maithili | mak:Makasar | mal:Malayalam
man:Mandingo | mao:Maori | map:Austronesian (Other) | mar:Marathi
mas:Maasai | may:Malay | mdf:Moksha | mdr:Mandar
men:Mende | mga:Irish, Middle (ca. 1100-1550) | mic:Micmac | min:Minangkabau
mis:Miscellaneous languages | mkh:Mon-Khmer (Other) | mlg:Malagasy | mlt:Maltese
mnc:Manchu | mni:Manipuri | mno:Manobo languages | moh:Mohawk
mon:Mongolian | mos:Mooré | mul:Multiple languages | mun:Munda (Other)
mus:Creek | mwl:Mirandese | mwr:Marwari | myn:Mayan languages
myv:Erzya | nah:Nahuatl | nai:North American Indian (Other) | nap:Neapolitan Italian
nau:Nauru | nav:Navajo | nbl:Ndebele (South Africa) | nde:Ndebele (Zimbabwe)
ndo:Ndonga | nds:Low German | nep:Nepali | new:Newari
nia:Nias | nic:Niger-Kordofanian (Other) | niu:Niuean | nno:Norwegian (Nynorsk)
nob:Norwegian (Bokmål) | nog:Nogai | non:Old Norse | nor:Norwegian
nqo:N'Ko | nso:Northern Sotho | nub:Nubian languages | nwc:Newari, Old
nya:Nyanja | nym:Nyamwezi | nyn:Nyankole | nyo:Nyoro
nzi:Nzima | oci:Occitan (post-1500) | oji:Ojibwa | ori:Oriya
orm:Oromo | osa:Osage | oss:Ossetic | ota:Turkish, Ottoman
oto:Otomian languages | paa:Papuan (Other) | pag:Pangasinan | pal:Pahlavi
pam:Pampanga | pan:Panjabi | pap:Papiamento | pau:Palauan
peo:Old Persian (ca. 600-400 B.C.) | per:Persian | phi:Philippine (Other) | phn:Phoenician
pli:Pali | pol:Polish | pon:Pohnpeian | por:Portuguese
pra:Prakrit languages | pro:Provençal (to 1500) | pus:Pushto | que:Quechua
raj:Rajasthani | rap:Rapanui | rar:Rarotongan | roa:Romance (Other)
roh:Raeto-Romance | rom:Romani | rum:Romanian | run:Rundi
rup:Aromanian | rus:Russian | sad:Sandawe | sag:Sango (Ubangi Creole)
sah:Yakut | sai:South American Indian (Other) | sal:Salishan languages | sam:Samaritan Aramaic
san:Sanskrit | sas:Sasak | sat:Santali | scn:Sicilian Italian
sco:Scots | sel:Selkup | sem:Semitic (Other) | sga:Irish, Old (to 1100)
sgn:Sign languages | shn:Shan | sid:Sidamo | sin:Sinhalese
sio:Siouan (Other) | sit:Sino-Tibetan (Other) | sla:Slavic (Other) | slo:Slovak
slv:Slovenian | sma:Southern Sami | sme:Northern Sami | smi:Sami
smj:Lule Sami | smn:Inari Sami | smo:Samoan | sms:Skolt Sami
sna:Shona | snd:Sindhi | snk:Soninke | sog:Sogdian
som:Somali | son:Songhai | sot:Sotho | spa:Spanish
srd:Sardinian | srn:Sranan | srp:Serbian | srr:Serer
ssa:Nilo-Saharan (Other) | ssw:Swazi | suk:Sukuma | sun:Sundanese
sus:Susu | sux:Sumerian | swa:Swahili | swe:Swedish
syc:Syriac | syr:Syriac, Modern | tah:Tahitian | tai:Tai (Other)
tam:Tamil | tat:Tatar | tel:Telugu | tem:Temne
ter:Terena | tet:Tetum | tgk:Tajik | tgl:Tagalog
tha:Thai | tib:Tibetan | tig:Tigré | tir:Tigrinya
tiv:Tiv | tkl:Tokelauan | tlh:Klingon (Artificial language) | tli:Tlingit
tmh:Tamashek | tog:Tonga (Nyasa) | ton:Tongan | tpi:Tok Pisin
tsi:Tsimshian | tsn:Tswana | tso:Tsonga | tuk:Turkmen
tum:Tumbuka | tup:Tupi languages | tur:Turkish | tut:Altaic (Other)
tvl:Tuvaluan | twi:Twi | tyv:Tuvinian | udm:Udmurt
uga:Ugaritic | uig:Uighur | ukr:Ukrainian | umb:Umbundu
und:Undetermined | urd:Urdu | uzb:Uzbek | vai:Vai
ven:Venda | vie:Vietnamese | vol:Volapük | vot:Votic
wak:Wakashan languages | wal:Wolayta | war:Waray | was:Washoe
wel:Welsh | wen:Sorbian (Other) | wln:Walloon | wol:Wolof
xal:Oirat | xho:Xhosa | yao:Yao (Africa) | yap:Yapese
yid:Yiddish | yor:Yoruba | ypk:Yupik languages | zap:Zapotec
zbl:Blissymbolics | zen:Zenaga | zha:Zhuang | znd:Zande languages
zul:Zulu | zun:Zuni | zxx:No linguistic content | zza:Zaza


Deprecated language codes, which may still appear in metadata records, are:

-ajm:Aljamía | -cam:Khmer | -esk:Eskimo languages | -esp:Esperanto
-eth:Ethiopic | -far:Faroese | -fri:Frisian | -gae:Scottish Gaelic
-gag:Galician | -gal:Oromo | -gua:Guarani | -int:Interlingua (International Auxiliary Language Association)
-iri:Irish | -kus:Kusaie | -lan:Occitan (post 1500) | -lap:Sami
-max:Manx | -mla:Malagasy | -mol:Moldavian | -sao:Samoan
-scc:Serbian | -scr:Croatian | -sho:Shona | -snh:Sinhalese
-sso:Sotho | -swz:Swazi | -tag:Tagalog | -taj:Tajik
-tar:Tatar | -tru:Truk | -tsw:Tswana


The codes used for the Copyright field rights_t are:

cc-by-3.0:CC BY 3.0 | cc-by-4.0:CC BY 4.0 | cc-by-nc-3.0:CC BY-NC 3.0
cc-by-nc-4.0:CC BY-NC 4.0 | cc-by-nc-nd-3.0:CC BY-NC-ND 3.0 | cc-by-nc-nd-4.0:CC BY-NC-ND 4.0
cc-by-nc-sa-3.0:CC BY-NC-SA 3.0 | cc-by-nc-sa-4.0:CC BY-NC-SA 4.0 | cc-by-nd-3.0:CC BY-ND 3.0
cc-by-nd-4.0:CC BY-ND 4.0 | cc-by-sa-3.0:CC BY-SA 3.0 | cc-by-sa-4.0:CC BY-SA 4.0
cc-zero:CC Zero | ic:In-copyright | ic-world:In-copyright (world viewable)
icus:US copyright | nobody:Blocked | op:Out-of-print
orph:Copyright-orphaned | orphcand:Orphan | pd:Public domain
pd-pvt:Access limited | pdus:Public domain in US only | supp:Suppressed from view
und:Undetermined | und-world:Undetermined






