MOST FREQUENTLY ASKED QUESTIONS: QUESTIONS
- Which OpenOffice.org application uses XML-based file formats?
- What are the default file extensions for XML-based documents?
- What is all that binary goo I see in your files?
- What package format do you use, and what do I find inside?
- How can I put additional information into an XML file?
- But I really, really want plain XML. No compression, no binary data, no nothing. Just plain XML. Can I have that, please?
- Why are so many styles written out?
- How are embedded images and binary data handled?
- Why didn't you use XHTML, XSL-FO, SVG, etc. ?
- Can I write an XML translation from or into ...?
- I found a bug. What do I do?
- Hey, I like your XML format. How can I help?
- But what about ...? And why isn't my favorite question in here?
MOST FREQUENTLY ASKED QUESTIONS: ANSWERS
Which OpenOffice.org application uses XML-based file formats?
All OpenOffice.org applications use XML-based file formats. All applications (except Math) use the same format as defined in our specification. The Math component uses our package structure and format (see below), but uses MathML inside the package.
What are the default file extensions for XML-based documents?
- Writer: sxw
- Calc: sxc
- Draw: sxd
- Impress: sxi
- Math: sxm
XML is also used in other OpenOffice.org files (e.g. configuration) which are not covered in the xmloff project.
What is all that binary goo I see in your files?
Our documents use packages that contain the XML data alongside binary data such as images. The packages use the well known ZIP format. Just open an sxw/sxc/... file with a ZIP-tool of your choice, and you get access to the unadulterated XML.
The document meta data (in the meta.xml stream) is not compressed. This allows for easy searching and extraction of the meta data.
For more information on our packages, see the next question.
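As a rough illustration of the point above, the following Python sketch reads one XML stream out of a package with the standard zipfile module and checks whether a stream is stored without compression. The helper names (read_stream, is_stored_uncompressed, build_demo_package) are made up for this example, and the demo package is a synthetic stand-in built in memory, not a real sxw file.

```python
import io
import zipfile


def read_stream(package, name):
    """Return the decoded text of one XML stream inside a package."""
    with zipfile.ZipFile(package) as zf:
        return zf.read(name).decode("utf-8")


def is_stored_uncompressed(package, name):
    """True if the named stream is stored in the ZIP without compression."""
    with zipfile.ZipFile(package) as zf:
        return zf.getinfo(name).compress_type == zipfile.ZIP_STORED


def build_demo_package():
    """Build a tiny in-memory stand-in for a document package (assumption:
    a real package has the same layout, with far richer XML inside)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Meta data is written uncompressed, as described above.
        zf.writestr("meta.xml", "<meta>demo</meta>",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", "<content>demo</content>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf


if __name__ == "__main__":
    pkg = build_demo_package()
    print(read_stream(pkg, "content.xml"))
    print(is_stored_uncompressed(pkg, "meta.xml"))
```

With a real document, pass the file name (for example the path to an sxw file) instead of the in-memory buffer.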
What package format do you use, and what do I find inside?
We use the well-known ZIP file format as a package format. In addition, we store an XML-based manifest file that describes the package content and may supply additional information about the included files (e.g. encryption method). Since we use ZIP, most archive programs can already handle our files.
Inside the package, you generally find several streams that make up the full office document. These are:
- meta.xml: information about the document (author, time of last save, ...)
- styles.xml: styles that are used in the document
- content.xml: main document content (text, tables, graphical elements)
- settings.xml: document and view settings (such as magnification level and selected printer); these are usually application specific
- META-INF/manifest.xml: provides additional information about the other files (such as MIME type or encryption method)
- Pictures/: directory containing images (in their native, binary formats)
- Dialogs/: directory containing dialogs used by document macros
- Basic/: directory containing StarBasic macros
- Obj.../: directories containing embedded objects, such as charts; each directory contains one such object, stored in its native format. For OpenOffice.org objects that is its XML representation; for other objects it's usually a binary format.
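To see what the manifest says about the other streams, a sketch along the following lines may help. It assumes that each manifest entry carries attributes whose local names are full-path and media-type; that assumption, the placeholder namespace urn:example:manifest, and the helper names are illustrative and should be checked against a real package and the specification.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(qualified):
    """Strip the '{namespace}' part ElementTree puts in front of names."""
    return qualified.rsplit("}", 1)[-1]


def manifest_entries(package):
    """Map each path listed in META-INF/manifest.xml to its media type.

    Assumption: entries use attributes with the local names 'full-path'
    and 'media-type'. Namespaces are ignored on purpose.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    entries = {}
    for element in root.iter():
        attrs = {local_name(k): v for k, v in element.attrib.items()}
        if "full-path" in attrs:
            entries[attrs["full-path"]] = attrs.get("media-type", "")
    return entries


# Synthetic manifest; the namespace is a placeholder, not the real one.
DEMO_MANIFEST = (
    '<m:manifest xmlns:m="urn:example:manifest">'
    '<m:file-entry m:full-path="content.xml" m:media-type="text/xml"/>'
    '<m:file-entry m:full-path="Pictures/logo.png" m:media-type="image/png"/>'
    '</m:manifest>'
)


def build_demo_package():
    """Build an in-memory package that contains only a manifest."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/manifest.xml", DEMO_MANIFEST)
    return buf


if __name__ == "__main__":
    for path, media_type in manifest_entries(build_demo_package()).items():
        print(path, media_type)
```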
How can I put additional information into an XML file?
Alien attributes, i.e. attributes not defined in the OpenOffice.org DTD, will be preserved if they are attached to <style:properties> elements in style definitions. All other alien content will be discarded by the OpenOffice.org import filters.
Since you can attach styles to arbitrary text ranges, you can use this mechanism to attach your information to arbitrary text ranges, too.
Note: The above mechanism seems to only work in Writer. The issue is under investigation.
It is planned that you can also put additional files with your own content into the packages. However, this doesn't work yet.
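The alien-attribute mechanism could be exercised with a small script such as the sketch below, which attaches an attribute from a private namespace to every properties element in a given style namespace. Both namespace URIs in the demo are placeholders, and the function name add_alien_attribute is hypothetical; for a real document you would pass the style namespace declared in that document. Note that ElementTree may rename namespace prefixes when it writes the XML back out.

```python
import xml.etree.ElementTree as ET


def add_alien_attribute(xml_text, style_ns, alien_ns, name, value):
    """Attach alien_ns:name="value" to every <properties> element that
    lives in the namespace style_ns. Returns (new_xml, elements_changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for element in root.iter("{%s}properties" % style_ns):
        element.set("{%s}%s" % (alien_ns, name), value)
        changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Synthetic input; "urn:example:style" stands in for the real style namespace.
DEMO_STYLES = (
    '<doc xmlns:style="urn:example:style">'
    '<style:style style:name="P1"><style:properties/></style:style>'
    '</doc>'
)

if __name__ == "__main__":
    new_xml, count = add_alien_attribute(
        DEMO_STYLES, "urn:example:style", "urn:example:my-tool",
        "note", "reviewed")
    print(count)
    print(new_xml)
```

Whether the attribute survives a load and save cycle still depends on the application, as noted above.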
But I really, really want plain XML. No compression, no binary data, no nothing. Just plain XML. Can I have that, please?
It is planned to allow plain XML files (without packages) to be read and written. However, this is not implemented yet.
Why are so many styles written out?
In general, styles that are used in the document or that have been modified by the user are written to disk. The former is necessary to render the document correctly. The latter is preserved because if a user edited those styles, he/she is likely to use them later on. Therefore, those styles should not be discarded, even though they do not contribute to the document in its current form and shape.
If styles that meet neither of these criteria are written, this may be considered a bug. The Draw, Impress, and Calc applications currently show this behavior.
How are embedded images and binary data handled?
Images and embedded objects are stored in their native formats inside the ZIP-based package.
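Because images keep their native formats, they can be pulled out of a package without any XML processing. The sketch below lists the entries under Pictures/ with their sizes; the function name list_pictures and the demo data are invented for this example.

```python
import io
import zipfile


def list_pictures(package):
    """Map each file under Pictures/ to its uncompressed size in bytes."""
    with zipfile.ZipFile(package) as zf:
        return {
            info.filename: info.file_size
            for info in zf.infolist()
            if info.filename.startswith("Pictures/")
            and not info.filename.endswith("/")  # skip directory entries
        }


def build_demo_package():
    """In-memory stand-in for a document that embeds one image."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<content/>")
        zf.writestr("Pictures/demo.bin", b"\x00\x01\x02\x03")  # fake image bytes
    return buf


if __name__ == "__main__":
    print(list_pictures(build_demo_package()))
```

To save an image to disk, zipfile's ZipFile.extract can be called with the same entry name.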
Why didn't you use XHTML, XSL-FO, SVG, etc. ?
Those formats are not used because they do not have a suitable level of presentation for office documents. When we found that an established format (like the ones mentioned above) contains concepts that are also used in OpenOffice.org, we generally adopted its representation of those concepts in our XML format. We hope this will ease transformation between the formats.
Can I write an XML translation from or into ...?
You are absolutely welcome to write transformations between our XML-based file format and anything you see fit.
I found a bug. What do I do?
Report it using IssueZilla. Try to give a detailed description of what went wrong. Don't forget to include the document in which the error occurred. (After submission of the bug, choose "create attachment".)
DON'T BE SHY ABOUT REPORTING BUGS! All of us are interested in stable and bug-free applications, and bug reports from our users are a very important means towards that end. Bug reports help all of us. If you don't report your findings, we can't fix them, and so they will cause problems for other users as well.
Hey, I like your XML format. How can I help?
There are many things you can do to help.
- You can spread the word, e.g. by telling your friends and co-workers about OpenOffice.org.
- You can use the OpenOffice.org applications and report any bugs you find.
- You can write transformations from our format into others (and vice versa).
- You can implement one of the suggestions on the todo list on our homepage.
But what about ...? And why isn't my favorite question in here?
FAQ maintained by dvo.