IU Technology Architecture Lodge
Webnet talk
Author: Raymond Yee Posted: 6/17/2003; 10:04:38 AM Topic: Webnet talk
Rewrite in process [last rev: July 24, 2003]
I used a barebones version of the following page to support my Webnet talk from June 17. I think that the page was comprehensible once I talked through it. Now, I'm filling in a written narrative to help my readers comprehend what this page is about.
First, the blurb on my section of the 3-part talk:
RSS Is Only The Beginning
RSS is perhaps the most actively used XML format for content syndication. However, there is so much interesting digital content to syndicate, aggregate and reuse that is not currently in RSS format. This content resides in our libraries, museums, scholarly repositories, government archives. I will talk about using other key XML formats in the educational, library, and technology worlds that can be used to allow content to be reused and recontextualized. I will look at the question of how digital cultural heritage materials and scholarly works can be "ripped, mixed, and burned" in recombinatorial, bricolage authoring.
Because Kalle Nemvalts and Scot Hacker spoke ahead of me about how they have used RSS, I tried to build off what they had talked about. (Thanks, guys, for saving me the trouble of having to explain RSS!) However, where I wanted to get to is all the other non-RSS content out there that we'd like folks to be able to use in the way that RSS is currently being used. In other words, I wanted to show how content can be syndicated, aggregated, and recontextualized.
Specifically, I talked about content (and how XML is used to package that content) coming from the libraries and museums. The examples I use here are californiadigitallibrary.org and amazon.com. Amazon.com? Why amazon? Well, two reasons: 1) it has a well-developed web services interface and 2) it is a quasi-library with, say, tons of excellent and even engaging book-related information. Californiadigitallibrary.org is a portal to publicly available resources (images, texts, statistics, electronic books) that come from the University of California system.
Examples from Amazon.com
Let's first take a look at amazon.com. Many of you will be familiar with amazon -- but hop over there and do a search on "Bach" (my favorite search term). Take a look at the type of results you get. Now use the following form to do the same search -- but this time you will be getting XML in return. You can either type "Bach" into the Keyword box and leave the other terms as is (e.g., "xml" in the XSLT box) or just click here.
You might wonder what the big deal is -- getting XML instead of HTML. The big deal is that the XML one now gets back is much more malleable, easier to transform into other formats for various contexts. Indeed, amazon.com provides an XSLT engine for making such transformations. If instead of setting the XSLT parameter to "xml" (which tells amazon to return XML), one entered the URL of an XSLT stylesheet, amazon then applies the given transformation and returns the result.
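To make the role of the XSLT parameter concrete, here is a minimal Python sketch of how such a request URL might be put together. The endpoint and the parameter names `v`, `t`, `dev-t`, `type`, `f`, `mode`, and `page` are the ones visible in the Amazon URL inside the viewRssBox call near the end of this page. The `KeywordSearch` parameter name, the `books` and `lite` values, the developer-token placeholder, and the example.org stylesheet address are my assumptions for illustration, so check them against Amazon's own documentation.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the Amazon URL quoted later on this page.
AMAZON_XML_ENDPOINT = "http://xml.amazon.com/onca/xml"


def amazon_search_url(keyword, output="xml", dev_token="YOUR-DEV-TOKEN"):
    """Build a keyword-search URL.

    `output` is either the literal string "xml" (return the raw XML) or the
    URL of an XSLT stylesheet to be applied before the result is returned.
    """
    params = {
        "v": "2.0",
        "t": "webservices-20",
        "dev-t": dev_token,        # placeholder, not a real token
        "KeywordSearch": keyword,  # assumed parameter name
        "mode": "books",           # assumed value
        "type": "lite",            # assumed value
        "page": "1",
        "f": output,               # "xml" or a stylesheet URL
    }
    return AMAZON_XML_ENDPOINT + "?" + urlencode(params)


raw_xml_url = amazon_search_url("Bach")
styled_url = amazon_search_url("Bach", output="http://example.org/my-stylesheet.xsl")
print(raw_xml_url)
print(styled_url)
```

The only difference between the two URLs is the value of `f`, which is the whole trick: the same query yields either raw XML or whatever the stylesheet produces.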
Let's consider a number of examples. I've written a number of XSLT stylesheets to transform Amazon results into a variety of formats.
Amazon to HTML
To convince ourselves that the XML we get from amazon has some correspondence to the interface we're used to seeing from amazon, let's convert the results to HTML. Using the amazon search form above, you can type in a search term (say, "Bach") and copy the following URL into the XSLT box:
http://interactiveu.berkeley.edu/gems/creatingcontent/amazonlitetohtml.xsl
(Alternatively, click here to get some HTML from a search on Bach.)
Obviously, there is an unlimited number of ways to generate HTML from the amazon results -- that's the point of the amazon architecture, which allows users to pass in their own XSLT. The XSLT provided should serve as a skeleton for elaborations.
Amazon to RSS 0.92
Since the panel was focused on RSS, it wouldn't have been right for me not to come back to RSS. Of course, there are more reasons to come back to RSS than the mere formality of the talk -- RSS is arguably the most widely used and visible instantiation of XML on the Web (ok, maybe XHTML is the real champion -- but RSS is important nevertheless.) Moreover, RSS is exciting because of how it's being used to syndicate content. Witness Ben Hammersley's Content Syndication with RSS for example.
So, using the same ol' trick, drop
http://interactiveu.berkeley.edu/gems/creatingcontent/amazonlitetorss092.xsl
into the XSLT box and do your search.
Before I move on to the next format, I should note that the XSLT I wrote converts amazon XML to RSS 0.92. The latest versions of RSS are of two types: RSS 2.0 (upward compatible with RSS 0.92) and RSS 1.0 (an RDF-based format). I plan to extend this work to those other formats.
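For readers who would rather see the mapping than read a stylesheet, here is a rough Python equivalent of what an Amazon-to-RSS 0.92 transformation does. It is a sketch and not the XSLT linked above. The sample input (the element names `ProductInfo`, `Details`, `ProductName`, `Authors`/`Author` and the `url` attribute) is a simplified, hypothetical stand-in for Amazon's XML, and the channel title, link, and description are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an Amazon search result.
SAMPLE_AMAZON_XML = """
<ProductInfo>
  <Details url="http://www.amazon.com/exec/obidos/ASIN/0000000001">
    <ProductName>Bach: A Sample Title</ProductName>
    <Authors><Author>A. Author</Author></Authors>
  </Details>
  <Details url="http://www.amazon.com/exec/obidos/ASIN/0000000002">
    <ProductName>Another Bach Title</ProductName>
    <Authors><Author>B. Author</Author><Author>C. Author</Author></Authors>
  </Details>
</ProductInfo>
"""


def amazon_to_rss092(amazon_xml, channel_title, channel_link):
    """Map each <Details> element to an RSS 0.92 <item>."""
    source = ET.fromstring(amazon_xml)
    rss = ET.Element("rss", version="0.92")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Search results as RSS"

    for details in source.findall("Details"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = details.findtext("ProductName", default="")
        ET.SubElement(item, "link").text = details.get("url", "")
        # Join all author names into the item description.
        authors = [a.text for a in details.findall("Authors/Author") if a.text]
        ET.SubElement(item, "description").text = ", ".join(authors)
    return rss


rss = amazon_to_rss092(SAMPLE_AMAZON_XML, "Amazon search: Bach", "http://www.amazon.com/")
print(ET.tostring(rss, encoding="unicode"))
```

Each product becomes one item with a title, a link, and a description, which is about all that RSS 0.92 asks for.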
(See related work on Amazon and RSS)
Slight Detour: Viewing RSS
In a demonstration of concepts, just getting XML is probably not terribly interesting -- it's nice to be able to see the XML interpreted in some context. Here I offer two methods.
The first is the Infinite Penguins RSS Viewer, which takes the URL of an RSS document and returns HTML. Because the Viewer's backend is MagpieRSS, the viewer probably supports RSS 0.9 through RSS 1.0, along with the RSS 1.0 modules (with a few exceptions). There are plenty of ways to tweak the display of the HTML coming from the viewer; if you are interested in using those options, check out the full viewer page. If you want just the default display options, you can use the following form:
(For example, take a look at the viewer's display of the New York Times Arts section RSS channel.)
The second way of viewing RSS files works in the context of Manila, the software used to host this weblog: the viewRssBox macro, which can be fed a URL to an RSS file and display options. For example, the New York Times Arts section RSS channel can be displayed in this way using a viewRssBox macro.
Amazon to METS
The METS schema "is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress, and is being developed as an initiative of the Digital Library Federation."
METS is of particular interest because it is being widely adopted to encode digital materials in the library (and museum) worlds. In this context, we can dynamically create METS objects from an amazon search by entering
http://interactiveu.berkeley.edu/gems/creatingcontent/amazonlitetoMETSv1.1.xsl
into the XSLT box. (Click here to get a METS object of an amazon search of "Bach".)
Note that the XSLT I have written to translate Amazon to METS is not definitive by any means -- it can be improved to capture more of the information in the amazon XML, for example. Moreover, there is no perfect translation (or crosswalk) for all circumstances, because different circumstances often require different translations.
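To give a feel for what such a crosswalk has to decide, here is a small Python sketch that wraps simple title/creator records in a skeletal METS document: one `dmdSec` per record carrying Dublin Core, plus a `structMap` that points back at those sections. It is not the stylesheet linked above. The input records, the `DMD1`-style identifiers, and the choice of Dublin Core as the descriptive metadata are assumptions made for the example, and a different context could reasonably choose otherwise.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return "{%s}%s" % (METS_NS, tag)


def d(tag):
    """Qualify a tag name with the Dublin Core namespace."""
    return "{%s}%s" % (DC_NS, tag)


def records_to_mets(records):
    """Wrap simple title/creator records in a skeletal METS document."""
    mets = ET.Element(m("mets"))
    struct_map = ET.Element(m("structMap"))
    top_div = ET.SubElement(struct_map, m("div"), TYPE="search results")

    for n, record in enumerate(records, start=1):
        dmd_id = "DMD%d" % n
        dmd = ET.SubElement(mets, m("dmdSec"), ID=dmd_id)
        wrap = ET.SubElement(dmd, m("mdWrap"), MDTYPE="DC")
        data = ET.SubElement(wrap, m("xmlData"))
        ET.SubElement(data, d("title")).text = record["title"]
        ET.SubElement(data, d("creator")).text = record["creator"]
        # One structural division per record, linked to its metadata section.
        ET.SubElement(top_div, m("div"), LABEL=record["title"], DMDID=dmd_id)

    mets.append(struct_map)  # the structMap follows the dmdSec elements
    return mets


# Hypothetical records standing in for two search results.
mets_doc = records_to_mets([
    {"title": "Sample Title One", "creator": "A. Author"},
    {"title": "Sample Title Two", "creator": "B. Author"},
])
print(ET.tostring(mets_doc, encoding="unicode"))
```

Everything the sketch leaves out (prices, cover images, reviews) is exactly the kind of information a fuller crosswalk would have to find a home for.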
To view the METS document, we can make use of the METS viewer that the UC Berkeley Library has created. (Note that it is a work under development.)
For example, click here for the METSified Amazon search results for Bach.
Amazon to MOA2
Before working with METS, I started working with the MOA2 format (the predecessor to METS). The viewer that UC Berkeley Library has developed to render METS objects began as a MOA2 viewer that takes the URL of an arbitrary MOA2 document. Here is some exquisite XSLT for converting Amazon to MOA2:
http://interactiveu.berkeley.edu/gems/creatingcontent/amazonlitetomoa2v2.xsl
which you can use in conjunction with the Amazon search to create a MOA2 document (e.g., a MOA2 version of the Bach search can be had through clicking here.)
Once you have the URL of the MOA2 document in hand, drop it into the following MOA2 viewer form. (For comparison, you might want to take a look at how the Berkeley Library uses the MOA2 format to encode a complex document such as the Uchida Scrapbook.)
(e.g., click here to see the Bach search results in the MOA2 viewer)
Amazon to IMS-Content Package (IMS-CP)
Another set of standards of significance in higher education is that of the IMS. We are envisioning a time when there are many educational technology systems (e.g., learning management systems) for which faculty, students, and staff will want to generate content. The IMS Content Packaging specification is emerging as a transport mechanism in standards-compliant educational technology.
To demonstrate how amazon materials can be incorporated into a learning management system, use the following XSLT to create an imsmanifest.xml:
http://interactiveu.berkeley.edu/gems/creatingcontent/amazonlitetoIMSv1.xsl
As far as I know, there is no publicly available web-based reader of IMS manifests to which one can then feed the results. (Anyone out there know of one?)
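In the absence of a viewer, it may help to see the shape of what ends up in an imsmanifest.xml. The following Python sketch builds a skeletal manifest with one organization that lists the items and one resource per item. It is not the stylesheet linked above: the identifiers (`MANIFEST1`, `ORG1`, `ITEM1`, `RES1`, and so on), the example.org links, and the decision to treat each result as an external `webcontent` resource are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

IMSCP_NS = "http://www.imsglobal.org/xsd/imscp_v1p1"
ET.register_namespace("imscp", IMSCP_NS)


def q(tag):
    """Qualify a tag name with the IMS Content Packaging namespace."""
    return "{%s}%s" % (IMSCP_NS, tag)


def links_to_manifest(links, manifest_id="MANIFEST1"):
    """Build a skeletal manifest from (title, url) pairs."""
    manifest = ET.Element(q("manifest"), identifier=manifest_id)
    orgs = ET.SubElement(manifest, q("organizations"), default="ORG1")
    org = ET.SubElement(orgs, q("organization"), identifier="ORG1")
    ET.SubElement(org, q("title")).text = "Search results"
    resources = ET.SubElement(manifest, q("resources"))

    for n, (title, url) in enumerate(links, start=1):
        res_id = "RES%d" % n
        # The item in the organization points at its resource by identifier.
        item = ET.SubElement(org, q("item"), identifier="ITEM%d" % n, identifierref=res_id)
        ET.SubElement(item, q("title")).text = title
        ET.SubElement(resources, q("resource"), identifier=res_id, type="webcontent", href=url)
    return manifest


# Hypothetical links standing in for two search results.
manifest = links_to_manifest([
    ("Sample Title One", "http://example.org/one"),
    ("Sample Title Two", "http://example.org/two"),
])
print(ET.tostring(manifest, encoding="unicode"))
```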
Other Amazon functionality
Before I move from using amazon.com as an example, I just want to highlight that the Amazon APIs allow for much more functionality than just keyword searches. Here, for example, is a form for displaying the XML corresponding to a given Amazon wishlist (given an Amazon wishlist ID). (Don't be taken in by my own wishlist -- although I would be flattered if someone actually wanted to buy me a book, ask me before buying one from my wishlist. Many of the books on that list are not ones I actually want to own!).
As we did above, we can make various versions of the wishlist: HTML, RSS 0.92 [in an RSS viewer], METS, MOA2 [in a MOA2 Viewer], IMS Content Package.
Examples from The California Digital Library (CDL)
I've spent a lot of time with examples from Amazon.com because many people are incredibly familiar with its contents and because the web services infrastructure for amazon.com is one of the most well-developed examples out there. Hence, there's a lot to learn from looking at how amazon.com data are being used. For instance, allconsuming.net is a fascinating service that aggregates book discussion on the Web, leveraging data about books coming from amazon.com (book covers, "people who bought this book bought these other books", etc.)
I want to turn now to an example of a major academic library: the California Digital Library. The CDL provides many services to both the University of California community and to the public at large. Here, we focus on publicly available materials from the CDL gathered through the californiadigitallibrary.org portal, a "first-step towards bringing together publicly accessible digital collections that are created and managed by the UC." [source]
In addition to returning HTML, the search engine of the CDL portal can return XML -- hence allowing a user to play the same tricks as for Amazon.com. Since the CDL does not currently allow arbitrary XSLT to be passed in, we will make use of the W3C XSLT Service to actually do the XSLT transformation.
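Calling the W3C XSLT Service amounts to building one more URL, this time with both the stylesheet and the XML source passed as parameters. Here is a minimal Python sketch. The service address and the parameter names `xslfile`, `xmlfile`, and `transform` are the ones visible in the viewRssBox call further down this page; the two example.org URLs are placeholders, not real CDL or stylesheet addresses.

```python
from urllib.parse import urlencode

# Service address as it appears in the viewRssBox call on this page.
W3C_XSLT_SERVICE = "http://www.w3.org/2000/06/webdata/xslt"


def w3c_transform_url(xsl_url, xml_url):
    """Return a URL asking the W3C XSLT Service to apply xsl_url to xml_url."""
    query = urlencode({
        "xslfile": xsl_url,
        "xmlfile": xml_url,
        "transform": "Submit",
    })
    return W3C_XSLT_SERVICE + "?" + query


# Both URLs below are placeholders for illustration only.
url = w3c_transform_url(
    "http://example.org/cdl-to-rss2.xsl",
    "http://example.org/search?q=horse&format=xml",
)
print(url)
```

Note that the XML source URL has its own query string, so it must be percent-encoded when it is embedded as a parameter value; `urlencode` takes care of that.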
Let's walk through an example. Type "horse" into the following search form and select "HTML" as the output format to do a traditional search.
Now if you select "XML" as the output format, you will get XML (naturally). Grab the URL for the search (e.g., the XML for the search on horse) and drop it into the box for "URI for xml resource". At this moment, we have only XSLT for converting the CDL results to RSS 2.0. Here are the results, rendered as RSS 2.0, read into this blog:
viewRssBox ("http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Finteractiveu.berkeley.edu%2Fgems%2Fcreatingcontent%2Famazonlitetorss092.xsl&xmlfile=http%3A%2F%2Fxml.amazon.com%2Fonca%2Fxml%3Fv%3D2.0%26t%3Dwebservices-20%26dev-t%3DD2L0SJ0N2ZTRXE%26type%3Dheavy%26f%3Dxml%26WishlistSearch%3D1U5EXVPVS3WP5%26mode%3Dnull%26page%3D1&transform=Submit", align:"center", width: "500", frameColor:"#000000", titleBarTextColor:"#000000", titleBarColor:"#CCCCCC", boxFillColor:"#FFFFFF", timeZone:"PST", hspace:15, vspace:0)} >
CDL Results 11/13/2009
Terrified Horse. Notes Terrified Horse, Napa County, CA 1956
Toquerville, Horse & Barn, Irrigation & Ditch. Notes [A.C. Man & Woman], Horse & Barn, Irrigation & Ditch
Horse and Buggy [graphic]. Accession number: P1042
Guy Field hauling fertilizer in horse-drawn wagon in McPherson [now part of Orange], California, 1916. Guy Field hauling fertilizer in horse-drawn wagon in McPherson [now part of Orange], California in 1916. Image shows side view of horse and wagon on dirt road, with orange grove and pine trees in background. Field's citrus ranch was located on Pearl Road in the eastern area of Orange. The 1916 city directory listing: "Field, Guy (Elsie) rancher, r McPherson, P O Orange, R D 3."
Horse-Drawn Wagons at Anaheim Southern Pacific Railway Depot, Anaheim [graphic]. Accession number: P3920
Terrified Horse. Notes Terrified Horse, Napa County, CA, 1956
Horse-Drawn Wagon at Anaheim Southern Pacific Railway Depot, Anaheim [graphic]. Accession number: P3562
Santa Clara County Fair, 1948. On front of program: "Santa Clara County Fair, 1948. Sept. 13 thru 19." Bearded old man with pipe carrying suitcase which says, "Hi! Neighbor. Come to the Fair Sept. 13 thru 19." At bottom: "Stay and play all day. San Jose." On middle pages: "Hi Neighbor! Sept. 13 thru 19. It's here again! Bigger and better than ever!" Images of "Womans Division, Horse Racing Events (photo by C. A. Taddo), Gigantic Horse Show, Agricultural Exhibit, Poultry Exhibit and Floral Display." On back: "Diaper Derby -- Come to the million dollar show: your Santa Clara County Fair! Come for fun! See the best that is being produced agriculturally and industrially in Santa Clara County. Come for the quarter horse races that are a new feature at this year's Fair. Come for the entertainment features -- Dancing under the stars, Fireworks every evening, the Carnival, the Grandstand Show featuring acts of all kinds, the Championship Horseshoe Matches, the kids' day Special Events, Pie Eating and Freckle Face Contests, the Horse Show. Come, this is your thrilling show. See what your neighbors have been raising in their gardens in flowers and vegetables: what the F. F. A. and 4-H youngsters have been doing in raising fine cattle and agricultural products. Come to the fourth annual Santa Clara County Fair! September 13 thru 19."
Five Horse-Drawn Wagons, Anaheim [graphic]. Accession number: P1009
Toquerville, Horse & Barn, Irrigation & Ditch. Notes [A.C. Man & Woman], Horse & Barn, Irrigation & Ditch
Terrified Horse. Notes Terrified Horse, Napa County, CA 1956
Nesbitt Home and Horse Team. Nesbitt home and horse and wagon team. The house was located on old Highway 65 and Henderson Avenue. At this time the road came out at Sunnyside Avenue and wound around the hill. Charles Nesbitt is the man driving the horse team.
Two automobiles and a horse on an unpaved mountain road with a man sitting by the side of the road. Negative envelope states: Mountain Springs[,] "Devil Cannon" [Canyon, probably] 2 cars on dirt road[,] horse, man.
Rafael Navarro, "Zanjero" with horse, Tom, and cart [graphic]. Accession number: P68
Center Street, Anaheim [graphic]. Accession number: P259
Toquerville, Horse & Barn, Irrigation & Ditch. Notes [A.C. Man & Woman], Horse & Barn, Irrigation & Ditch
Horse and Buggy. Woman sitting in buggy hitched to a horse, and child standing by the buggy. The buggy is in front of the Moore Opera House.
Day at the Races: Photographs from Santa Anita and Hollywood Park, 1937-38. Images of horse races and the Santa Anita and Hollywood Park racetracks, taken by Will Connell in the late 1930s. From the California Museum of Photography, University of California, Riverside.
[Title not known] Date not indicated. KU71496. Inscription A horse, A horse, my kingdom for a horse, Worlds Columbian Exposition.
Silkwood, Famous Pacer Horse on Sulky Circuit in 1890. 11456008
METS to IMS-CP
METS to IMS-CP (no metadata): http://interactiveu.berkeley.edu/gems/creatingcontent/metstoimscpv1.xsl
METS to IMS-CP (simple GDM to LOM metadata crosswalk): http://interactiveu.berkeley.edu/gems/creatingcontent/metstoimscpv2.xsl
The opinions or statements expressed herein should not be taken as a position or endorsement of the University of California, Berkeley. Links on these pages to commercial sites do not represent endorsement by the University of California or its affiliates.