IU Technology Architecture Lodge
Random and not so random thoughts from Raymond Yee, primarily on the scholarly and educational use of the Web, libraries, educational technology, and information management

 

Wednesday, April 2, 2003

Figuring out the simple OpenOffice XML presentation

I just posted a question on the OpenOffice XML developer's mailing list.  Before I posted the question, I wrote Steve Parker with a similar question.  Steve Parker, who has been playing around with the OpenOffice document formats and who has been kind enough to share his wisdom, quickly responded!  Thanks, Steve!

Here's what I posted on the developer's list:

Hello,

Although I'm new to using OpenOffice and developing for OpenOffice, I'm quite excited about its potential, especially its use of XML document formats.  I have a number of projects in mind, but let me talk about one in this email.

*The Goal*

Write a Python or Java library that will generate an OpenOffice presentation from scratch that represents a simple photo-album slide show.  That is, I want to create the appropriate XML documents (content.xml, styles.xml, etc.) and package them up in a zip file with the appropriate binaries (e.g., image files).  I do NOT want to have to access the OO APIs.

*Why am I doing this?*

I want to let my users gather images from various sources (their own pictures, digital repositories, etc.), sequence and annotate the images, and keep track of some metadata about the images -- and then package up the images, annotations, and metadata into an OO presentation file that the user can then customize further.  Users often assemble image-based presentations by hand.  I'm building a tool to help automate the process.

*What I'm using*

OpenOffice 1.1 Beta on a Win2K Professional SP3 machine

*What I've been trying:*

I've been trying to figure out the "hello world" of an OpenOffice presentation file by generating the simplest possible XML documents that will be useful.

The first thing I did was to create an empty presentation, save the file (empty.sxi), and then unzip empty.sxi to a directory.  What I found are files that are mentioned in such tutorials as Uche Ogbuji's helpful tutorial on the OO XML format:  http://www-106.ibm.com/developerworks/xml/library/x-think15/?dwzone=xml
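The unzip step above can also be done from Python, which is handy when comparing many saved files. What follows is only a minimal sketch using the standard zipfile module; the function names are hypothetical, and it assumes nothing about the package beyond it being an ordinary zip archive, as described above.

```python
import zipfile


def list_package_members(package_path):
    """Return the names of all files stored in a zip-based package."""
    with zipfile.ZipFile(package_path) as package:
        return package.namelist()


def unpack_package(package_path, target_dir):
    """Extract every member of the package into target_dir."""
    with zipfile.ZipFile(package_path) as package:
        package.extractall(target_dir)
```

For example, list_package_members("empty.sxi") would return the list of file names inside the saved presentation, and unpack_package("empty.sxi", "empty") would write them to a directory named empty.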

I looked through the various files and found them to be relatively simple (for what the document format has to do) -- but more complicated than I would like for what I'm trying to accomplish.  That is, if I want to generate XML to feed into OO, do I actually need to include all the files and all the information found in empty.sxi?

To answer that question, I've taken two approaches:

1) Incrementally add new pages and elements to the OO presentation, save the presentation, and look at the changes in the XML files.  Figure out what is constant and account for what changes.  I've made some progress working this way, but it's a bit laborious.  The problem is that it's not obvious which data are necessary and which are not.

2) Download the OpenOffice DTD and use a tool like XMLSpy (a validating XML authoring tool) to generate and author the simplest valid OO documents, zip them up, and then feed them to OO to see what happens.
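The first approach (save, change something, save again, compare) could be partly automated. The sketch below is one possible way to do it, not a tested workflow: it reads the same member (content.xml by default) from two saved packages and returns a unified diff. The function names and the two package file names are hypothetical. It assumes the saved XML may contain very long lines, so it breaks the text between adjacent tags before diffing to keep the output readable.

```python
import difflib
import zipfile


def read_member(package_path, member="content.xml"):
    """Read one file out of a zip-based package and decode it as UTF-8."""
    with zipfile.ZipFile(package_path) as package:
        return package.read(member).decode("utf-8")


def split_tags(xml_text):
    """Put each tag on its own line so a line-based diff is useful."""
    return xml_text.replace("><", ">\n<").splitlines()


def diff_member(old_package, new_package, member="content.xml"):
    """Return unified-diff lines for one member of two packages."""
    old_lines = split_tags(read_member(old_package, member))
    new_lines = split_tags(read_member(new_package, member))
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=old_package,
            tofile=new_package,
            lineterm="",
        )
    )
```

Calling diff_member("one_slide.sxi", "two_slides.sxi") and printing each returned line would show which elements were added or removed between the two saves. This is a purely textual comparison; it does not understand the XML structure.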

What I've discovered so far is that I can generate documents that are valid according to the DTD but which OO doesn't seem to handle properly.  For example, I wrote the following content.xml (which XMLSpy says is valid):

<?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSPY v5 rel. 3 U (http://www.xmlspy.com)-->
<!DOCTYPE office:document-content PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "office.dtd">
<office:document-content office:class="presentation">
<office:body>
  <draw:page draw:name="Title Page" draw:master-page-name="Default">
   <draw:text-box>
    <text:p>Hello world!</text:p>
   </draw:text-box>
  </draw:page>
</office:body>
</office:document-content>

compressed it and fed the package (which has no other files) into OO.  No complaints from OO -- but also no indication that OO recognized my naming the first slide "Title Page" or my attempt to put the words "Hello world!" into the slide.
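For reference, the packaging step itself (zipping generated XML into a single file) is easy to script. This is a minimal sketch with the standard zipfile module; write_package and the output name hello.sxi are hypothetical, and the sketch says nothing about which files or elements OO actually requires, which is the open question here.

```python
import zipfile


def write_package(package_path, members):
    """Write a zip package from a dict mapping member names to XML text."""
    with zipfile.ZipFile(package_path, "w", zipfile.ZIP_DEFLATED) as package:
        for name, text in members.items():
            # Encode explicitly so the bytes match the declared UTF-8 encoding.
            package.writestr(name, text.encode("utf-8"))
```

With the content.xml text above held in a string named content_xml, write_package("hello.sxi", {"content.xml": content_xml}) would produce a package containing only that one file.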

My tentative conclusion is that just because a document is VALID according to the OO DTD doesn't mean that OO will do anything useful with it.

NOTE:  I have been avoiding reading the OpenOffice.org XML File Format document (http://xml.openoffice.org/xml_specification.pdf) because I've been hoping for a shorter, see-what-OO-does-without-reading-too-much approach.  Perhaps reading it is what I need to do!

Questions:

1) Am I on the right track in my approach to write simple XML that OO will do something useful with?

2) Am I right to conclude that generating valid OO documents is not sufficient to generate a file that would be useful for OO?

3) What is the simplest set of XML documents (content.xml, styles.xml, etc.) that will display "Hello world!" in an OO presentation?  That is, what elements make up the absolute minimal OO document?  For instance, Steve Parker (http://steve-parker.org/c/openoffice.org/xml.shtml) worked out the simplest table-generating XML.  I find that very useful.  IMHO, we need more of these simplest-possible XML examples documented.

4) Are there any software tools to test whether an OO document is "useful" as input for OO?  I think that I'm generating "valid" documents -- but valid docs aren't necessarily useful, it seems.

Thank you very much in advance.  Sorry if these are totally newbie questions.

-Raymond


 
Posted by Raymond Yee on 4/2/03; 11:17:07 AM
from the Web Technology dept.


Notelets for 2003/04/02

I'm happy as a clam when I get to spend (almost) the whole day programming at home!  (I'm at work right now parsing the OpenOffice XML formats.  I'd like to be able to programmatically generate office suite documents by writing out simple documents.  More later.)

*

I share Chris' disappointment with Bill Clinton's relative silence (in substantive forums) these days.  There were some interesting articles in the March issue of The Atlantic on what Clinton will do for the next thirty years of his life, including a piece by James Fallows. It boggles the mind what good the man has the potential for doing with his experience, talent, and energy.  Let's hope that he is able to leave this world with a legacy worthy of the promise.


 
Posted by Raymond Yee on 4/2/03; 8:16:09 AM
from the Personal Notes dept.


 




Last update: Wednesday, April 2, 2003 at 11:17:07 AM.

The opinions or statements expressed herein should not be taken as a position of or endorsement by the University of California, Berkeley. Nor should the opinions or statements expressed herein be taken as a position of or endorsement of the University of California, Berkeley. Links on these pages to commercial sites do not represent endorsement by the University of California or its affiliates.