Monday, August 09, 2010

Metadata and Google Books

This article over at Ars Technica discusses metadata in Google Books. As a by-product, it is also rather complimentary to the information professions:
In the end, most of the "metadata problems" that Google's engineers are trying to solve are very, very old. Distinguishing between different editions of a work, dealing with mistitled and misattributed works, and sorting out dates of publication—these are all tasks that have historically been carried out by human historians, codicologists, paleographers, library scientists, museum curators, textual critics, and learned lovers of books and scrolls since the dawn of writing. In trying to count the world's books by identifying which copies of books (or records of books, or copies of records of books, or records of copies of books) signify the "same" printed and bound volume, Google has found itself on the horns of a very ancient dilemma.

Google may not (or, rather, certainly will not) be able to solve this problem to the satisfaction of scholars who have spent their lives wrestling with these very issues in one corner or another of the humanities. But that's fine, because no one outside of Google really expects them to. The best the search giant can do is acknowledge and embrace the fact that it's now the newest, most junior member of an ancient and august guild of humanists, and let its new colleagues participate in the process of fixing and maintaining its metadata archive. After all, why should Google's engineers be attempting to do art history? Why not just focus on giving new tools to actual historians, and let them do their thing? The results of a more open, inclusive metadata curation process might never reveal how many books there really are in the world, but they would do a vastly better job of enabling scholars to work with the library that Google is building.

1 comment:

Bernadette Callery said...

While appreciating the differences between the descriptions of the "ideal copy" in OCLC and the bold experiment of RLIN, which allowed copy-specific notes to be added to the collection- bibliographic records, is certainly important in this debate, it is also true that the bibliographic record contributed to a collaborative catalog is not the most effective vehicle for communicating the fine points of bibliographic distinction. Esoteric articles in professional journals studded with collation formulae are. I always wanted to add footnotes to my catalog records to create those links.