IU Technology Architecture Lodge
Random and not so random thoughts from Raymond Yee, primarily on the scholarly and educational use of the Web, libraries, educational technology, and information management

 

Friday, February 11, 2005

Slouching towards a conclusion about MARCXML->OpenURL
I am working right now to finish documenting my work on crosswalking MARC XML to OpenURL. It's not that I've figured out everything there is to know or discover about translating MARC XML to OpenURLs. Rather, I've learned enough and want to move on to other problems.

Let me just start with some conclusions and then work backwards to justify them:

  • There is no perfect mapping of MARC XML to the elements in OpenURL, certainly in practice and probably in theory. That's really not surprising: the specifications were not developed at the same time or for the same purposes. Since when have kindred specifications ever been totally interoperable? So right away, there are theoretical bounds on how interoperable the specifications can be. On top of that, there are the funny, real-life practical things people do with specifications. In MARC in particular, it would seem that 773$g is a big challenge. As Walt Crawford wrote me in email (which he kindly permitted me to quote here):

      The big and, I believe, somewhat insoluble problem in MARC=>OpenURL mapping is the 773$g. Because the syntax for that field, which combines year, volume, issue, and pagination, is either undefined or ill-defined (MARC21 rarely specifies internal syntax for a textual subfield!), all mappings are inherently pragmatic. We've refined our algorithms somewhat as we discover nuances of unusual databases, but I've accepted that we will never get the mapping right in 100% of article-level records. What we can do and have done is encourage database providers to follow data entry practices that make extraction feasible. (Here, again, most database producers may not have this problem: They probably store the data in separate elements, where we store in MARC21.)
  • Tom Schirmer is going to implement essentially the mapping given on the RLG site (http://www.rlg.org/openurl.html#pointers). I will want to document how we might do things differently. (David Walker has provided some good insight into the matter, which I will write up here.)

  • I've decided that I will do a very simple scraping of the MARCXML metadata to get a working OpenURL out for now -- but I'm putting my hope in working with others to get Ex Libris to give us an OpenURL.

  • Converting MARCXML to MODS first provides a more human-friendly view of the data -- and also lets us take advantage of the huge amount of work that the Library of Congress has already done on interpreting MARC in terms of MODS. But it doesn't solve the 773$g problem.
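
To make the 773$g problem concrete, here is a minimal sketch of the kind of pragmatic extraction described above. It is not the RLG mapping or anyone's actual implementation. The function names (parse_773g, build_openurl), the regular expressions, the sample strings, and the resolver address are all hypothetical, and the query keys (genre, title, issn, volume, issue, date, spage, epage) assume OpenURL 0.1-style naming.

```python
import re
from urllib.parse import urlencode

# Illustrative patterns for ONE common 773$g style, such as
# "Vol. 12, no. 3 (Mar. 2004), p. 45-67". These are assumptions,
# not a published mapping; real databases vary widely.
VOLUME = re.compile(r"\bv(?:ol)?\.?\s*(\d+)", re.IGNORECASE)
ISSUE = re.compile(r"\b(?:no|iss(?:ue)?)\.?\s*(\d+)", re.IGNORECASE)
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")
PAGES = re.compile(r"\bp{1,2}\.?\s*(\d+)(?:\s*-\s*(\d+))?", re.IGNORECASE)


def parse_773g(text):
    """Best-effort split of a 773$g string; returns only what it finds."""
    fields = {}
    for key, pattern in (("volume", VOLUME), ("issue", ISSUE), ("date", YEAR)):
        match = pattern.search(text)
        if match:
            fields[key] = match.group(1)
    match = PAGES.search(text)
    if match:
        fields["spage"] = match.group(1)
        if match.group(2):
            fields["epage"] = match.group(2)
    return fields


def build_openurl(resolver_base, journal_title, issn, g_text):
    """Assemble a query string from values already pulled out of MARC."""
    # Key names assume OpenURL 0.1-style conventions.
    params = {"genre": "article", "title": journal_title, "issn": issn}
    params.update(parse_773g(g_text))
    return resolver_base + "?" + urlencode(params)


# A labeled, well-behaved string yields every field.
print(parse_773g("Vol. 12, no. 3 (Mar. 2004), p. 45-67"))
# An unlabeled variant yields only the year: the patterns have
# nothing to tell volume, issue, and pages apart.
print(parse_773g("12:3 (2004) 45-67"))
# resolver.example.org is a placeholder, not a real resolver.
print(build_openurl("http://resolver.example.org/openurl",
                    "Journal of Examples", "1234-5678",
                    "Vol. 12, no. 3 (Mar. 2004), p. 45-67"))
```

The second call is the point of the exercise: with the labels gone, the same information comes back as a lone date, and every new data-entry style needs another pattern or another guess.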

I still want to provide the details behind these conclusions.


 
Posted by Raymond Yee on 2/11/05; 10:31:16 AM
from the Unclassified dept.





Last update: Friday, February 11, 2005 at 10:31:16 AM.

The opinions or statements expressed herein should not be taken as a position of or endorsement by the University of California, Berkeley. Nor should the opinions or statements expressed herein be taken as a position of or endorsement of the University of California, Berkeley. Links on these pages to commercial sites do not represent endorsement by the University of California or its affiliates.