IFLA Newspaper Conference: Morgan Cundiff
Morgan Cundiff, from the Library of Congress, is the final presenter for the Technical Panel. I’m friends with Morgan; we work at the Library together. He works in the NetDev office on many of the standards that LOC produces.
This is an XML game now. Echoing some comments from George earlier, standards are definitely the most important thing in creating interoperable digital libraries, and XML is the de facto standard on the web. The lack of standards has been the biggest barrier to creating joint libraries.
(Morgan is doing a technical overview of METS and MODS right now, which I’m skipping over. My fingers need a rest. Plus I already know it.)
A METS Profile describes a class of METS documents, giving programmers and authors the guidance they need to create and process METS documents that conform to that profile. A sufficiently detailed METS Profile can be considered a standard, and there is a schema for creating profiles. However, a profile is human-readable and not machine-actionable. (The NetDev office really needs to get some system in place for doing that, like the earlier Schematron-based profile descriptor and validator I wrote.)
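To make "machine-actionable" a bit more concrete, here is a minimal sketch in Python of what checking a document against profile rules could look like. The two rules, the `check_profile` function, and the sample documents are all hypothetical and invented for illustration; they are not taken from the newspaper profile or from any real validator, and a real system would more likely use Schematron.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical rules for illustration only; not from any real METS Profile.
# Each rule is (requirement description, path that must match at least once).
RULES = [
    ("has a file section", ".//mets:fileSec"),
    ("has a logical structMap", ".//mets:structMap[@TYPE='logical']"),
]

def check_profile(mets_xml, rules=RULES):
    """Return the descriptions of the requirements the document fails."""
    root = ET.fromstring(mets_xml)
    return [desc for desc, path in rules if root.find(path, NS) is None]

# Two tiny made-up documents: one that meets both rules, one that meets neither.
GOOD = (
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/">'
    '<mets:fileSec/>'
    '<mets:structMap TYPE="logical"><mets:div/></mets:structMap>'
    '</mets:mets>'
)
BAD = (
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/">'
    '<mets:structMap TYPE="physical"><mets:div/></mets:structMap>'
    '</mets:mets>'
)

print(check_profile(GOOD))
print(check_profile(BAD))
```

An empty list means the document satisfied every rule; anything else names the requirements it missed.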
Morgan has created a draft profile for digital newspapers, found here.
(He’s going through the parts of the profile document now, which can just be viewed from the link above. As much as I like the newspaper profile and the related work, I think spending too much time on a particular standard like this is not too useful. People will always have different requirements, and the standard will not always work for all needs. A more valuable approach is to define semantic correlations between different models and then use semantic tools to translate between them as appropriate, which is far more along the lines of what Alison Stevenson & Elizabeth Styron are doing.)
(One really clever thing done with the mets:fptr is the use of mets:area to point to a region of an image held in the mets:fileSec. Cool!)
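For anyone who hasn't seen the mets:area trick, here is a rough sketch in Python. The sample document, its IDs, file name, label, and coordinates are all made up for illustration and do not come from Morgan's profile; the `article_regions` helper is hypothetical too. The only things taken from METS itself are the element names and the FILEID, SHAPE, and COORDS attributes on mets:area.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Made-up METS fragment: one page image and one article that occupies
# a rectangular region of that image.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="master">
      <mets:file ID="IMG0001" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="page0001.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="logical">
    <mets:div TYPE="article" LABEL="Example article">
      <mets:fptr>
        <mets:area FILEID="IMG0001" SHAPE="RECT" COORDS="120,340,980,1510"/>
      </mets:fptr>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

def article_regions(mets_xml):
    """Resolve each mets:area to (div label, image href, shape, coords)."""
    root = ET.fromstring(mets_xml)

    # Map file IDs in the fileSec to the image locations they name.
    files = {}
    for f in root.iterfind(".//mets:file", NS):
        loc = f.find("mets:FLocat", NS)
        files[f.get("ID")] = loc.get(XLINK_HREF) if loc is not None else None

    # Follow each area's FILEID back to its image and parse the coordinates.
    regions = []
    for div in root.iterfind(".//mets:div", NS):
        for area in div.iterfind("mets:fptr/mets:area", NS):
            coords = tuple(int(c) for c in area.get("COORDS").split(","))
            regions.append((div.get("LABEL"), files.get(area.get("FILEID")),
                            area.get("SHAPE"), coords))
    return regions

print(article_regions(SAMPLE))
```

The point is that the structMap never duplicates the image; it just says "this article lives in this rectangle of that file."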
The newspaper community has a major “quality vs. quantity” problem. People talk in terms of millions of pages, and so that requires machine processing. That means lower quality, which is just a law of nature.