When the view is restricted to the HathiTrust public domain volumes, the makeup of the collection changes drastically. US copyright law makes it difficult to ascertain the rights status of works published in 1923 or later, which introduces representation biases. Proportionally fewer works are in the public domain as publication dates approach the present day, but the drop-off is much steeper in some categories than in others. For a valid understanding of the collection, one should not compare the pre-1923 and 1923-and-later subsets of the public domain collection without additional context.


Figure: Proportion (in %) of collection represented by each LC Class, by Year. 1923 is marked with a vertical line.


Table: Library of Congress class distributions for Public Domain data, in % of Collection

| Library of Congress Class | pre-1923 | 1923- | Difference |
|:--|--:|--:|--:|
| Language and Literature | 9.58% | 1.84% | -7.74% |
| General and Old World History | 6.02% | 1.63% | -4.39% |
| Philosophy, Psychology, and Religion | 4.01% | 1.06% | -2.95% |
| Social Sciences | 3.53% | 12.01% | 8.47% |
| General Works | 2.74% | 0.25% | -2.49% |
| History of the United States and British, Dutch, French, and Latin America | 2.53% | 1.06% | -1.47% |
| History of America | 1.75% | 1.30% | -0.45% |
| Political Science | 1.70% | 2.45% | 0.75% |
| Bibliography, Library Science, and General Information Resources | 1.01% | 2.63% | 1.62% |
| Fine Arts | 0.97% | 0.65% | -0.33% |
| Geography, Anthropology, and Recreation | 0.77% | 1.38% | 0.61% |
| Auxiliary Sciences of History | 0.54% | 0.81% | 0.27% |
| Military Science | 0.31% | 1.49% | 1.18% |
| Naval Science | 0.16% | 0.59% | 0.42% |

(HT PD-only, Spring 2015)
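
The last column of the table appears to be the 1923- share minus the pre-1923 share, in percentage points. The following is a minimal sketch of that arithmetic for a few of the classes, not the method used to produce the table: the names `shares` and `point_change` are illustrative assumptions, and the input values are the rounded percentages copied from the table. Because the inputs are already rounded, a recomputed value can differ from the published one by 0.01 (Social Sciences comes out as 8.48 here rather than 8.47).

```python
# Shares (in % of collection) copied from the table for a few classes.
# Keys are LC class names; values are (pre-1923 share, 1923- share).
# The structure and names here are illustrative, not from the source.
shares = {
    "Language and Literature": (9.58, 1.84),
    "Social Sciences": (3.53, 12.01),
    "General Works": (2.74, 0.25),
    "Military Science": (0.31, 1.49),
}


def point_change(pre, post):
    """Percentage-point change from the pre-1923 share to the 1923- share."""
    return round(post - pre, 2)


changes = {name: point_change(pre, post) for name, (pre, post) in shares.items()}

# List classes from the largest loss to the largest gain.
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name}: {delta:+.2f}")
```

Sorting by the signed change makes the uneven drop-off visible at a glance: classes with large negative values lose share after 1923, while classes with positive values gain share within the public domain subset.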