Note: This FAQ covers the OpenOffice.org XML file format only. FAQs for OASIS OpenDocument (ISO/IEC 26300) are available at OASIS and opendocument.xml.org.

  MOST FREQUENTLY ASKED QUESTIONS: QUESTIONS  

  1. Which OpenOffice.org application uses XML-based file formats?
  2. What are the default file extensions for XML-based documents?
  3. What is all that binary goo I see in your files?
  4. What package format do you use, and what do I find inside?
  5. How can I put additional information into an XML file?
  6. But I really, really want plain XML. No compression, no binary data, no nothing. Just plain XML. Can I have that, please?
  7. Why are so many styles written out?
  8. How are embedded images and binary data handled?
  9. Why didn't you use XHTML, XSL-FO, SVG, etc.?
  10. Can I write an XML translation from or into ...?
  11. I found a bug. What do I do?
  12. Hey, I like your XML format. How can I help?
  13. But what about ...? And why isn't my favorite question in here?

  MOST FREQUENTLY ASKED QUESTIONS: ANSWERS  

  1. Which OpenOffice.org application uses XML-based file formats?

    All OpenOffice.org applications use XML-based file formats. All applications (except Math) use the same format as defined in our specification. The Math component uses our package structure and format (see below), but uses MathML inside the package.


  2. What are the default file extensions for XML-based documents?

    Application             Extension
    Writer                  sxw
    Calc                    sxc
    Draw                    sxd
    Impress                 sxi
    Math                    sxm
    Writer global document  sxg

    XML is also used in other OpenOffice.org files (e.g. configuration files), which are not covered by the xmloff project.


  3. What is all that binary goo I see in your files?

    Our documents are packages that contain the XML data alongside binary data such as images. The packages use the well-known ZIP format. Open an sxw, sxc, or similar file with a ZIP tool of your choice, and you get access to the unadulterated XML.

    The document metadata (in the meta.xml stream) is not compressed. This allows for easy searching and extraction of the metadata.

    For more information on our packages, see the next question.
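
    As a rough illustration of the point above, the following Python sketch opens a package with the standard zipfile module, reports which streams are stored uncompressed, and reads one stream as text. The helper names and the tiny in-memory demo package are hypothetical stand-ins for a real sxw file, not something the applications produce.

```python
import io
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny stand-in for an .sxw package (hypothetical content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # meta.xml is stored without compression, as the FAQ describes.
        zf.writestr("meta.xml", "<meta>demo</meta>",
                    compress_type=zipfile.ZIP_STORED)
        # The main content stream is compressed like an ordinary ZIP entry.
        zf.writestr("content.xml", "<content>Hello</content>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_streams(package: bytes) -> dict:
    """Map each stream name to True if it is stored uncompressed."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {info.filename: info.compress_type == zipfile.ZIP_STORED
                for info in zf.infolist()}


def read_stream(package: bytes, name: str) -> str:
    """Return one stream of the package decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read(name).decode("utf-8")


if __name__ == "__main__":
    demo = build_demo_package()
    print(list_streams(demo))
    print(read_stream(demo, "content.xml"))
```

    For a real document, the bytes of the file on disk can be passed in place of the demo package.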


  4. What package format do you use, and what do I find inside?

    We use the well-known ZIP file format as a package format. In addition, we store an XML-based manifest file that describes the package content and may supply additional information about the included files (e.g. encryption method). Since we use ZIP, most archive programs can already handle our files.

    Inside the package, you generally find several streams that make up the full office document. These are:

    meta.xml               information about the document (author, time of last save, ...)
    styles.xml             styles that are used in the document
    content.xml            main document content (text, tables, graphical elements)
    settings.xml           document and view settings (such as magnification level and selected printer); these are usually application specific
    META-INF/manifest.xml  additional information about the other files (such as MIME type or encryption method)
    Pictures/              directory containing images (in their native, binary formats)
    Dialogs/               directory containing dialogs used by document macros
    Basic/                 directory containing StarBasic macros
    Obj.../                directories containing embedded objects, such as charts; each directory contains one such object, stored in its native format. For OpenOffice.org objects this is their XML representation; for other objects it is usually a binary format.

    For more information on why we chose ZIP, please read package.html. For more information on the ZIP format itself, please look here.
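
    To make the role of the manifest more concrete, here is a minimal Python sketch that reads META-INF/manifest.xml from a package and maps each listed path to its MIME type. It assumes the entries are elements named file-entry carrying full-path and media-type attributes; it matches on local names only, so it makes no claim about the namespace URI. The demo manifest and its placeholder namespace are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical manifest for illustration; the namespace URI is a placeholder.
DEMO_MANIFEST = (
    '<m:manifest xmlns:m="urn:example:manifest">'
    '<m:file-entry m:media-type="text/xml" m:full-path="content.xml"/>'
    '<m:file-entry m:media-type="image/png" m:full-path="Pictures/logo.png"/>'
    '</m:manifest>'
)


def build_demo_package() -> bytes:
    """Build a tiny in-memory package that contains only a manifest."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/manifest.xml", DEMO_MANIFEST)
    return buf.getvalue()


def local_name(qualified: str) -> str:
    """Strip an ElementTree '{namespace}' prefix, if present."""
    return qualified.rsplit("}", 1)[-1]


def read_manifest(package: bytes) -> dict:
    """Return {full-path: media-type} for every file-entry in the manifest."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    entries = {}
    for element in root.iter():
        if local_name(element.tag) != "file-entry":
            continue
        # Attribute names are assumed; compare by local name only.
        attrs = {local_name(key): value for key, value in element.attrib.items()}
        entries[attrs.get("full-path", "")] = attrs.get("media-type", "")
    return entries


if __name__ == "__main__":
    for path, mime in read_manifest(build_demo_package()).items():
        print(path, mime)
```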


  5. How can I put additional information into an XML file?

    Alien attributes, i.e. attributes not defined in the OpenOffice.org DTD, will be preserved if they are attached to <style:properties> elements in style definitions. All other alien content will be discarded by the OpenOffice.org import filters.

    Since you can attach styles to arbitrary text ranges, you can use this mechanism to attach your information to arbitrary text ranges, too.

    Note: The above mechanism seems to only work in Writer. The issue is under investigation.

    We also plan to let you put additional files with your own content into the packages. However, this does not work yet.
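
    The alien-attribute mechanism could be exercised along the following lines. This Python sketch attaches an attribute from a private namespace to every <style:properties> element in a style definition. The style namespace URI is an assumption about the format, and the private namespace, attribute name, and sample style are hypothetical; whether the attribute survives a round trip depends on the import filters as described above.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI used for the "style" prefix in the file format.
STYLE_NS = "http://openoffice.org/2000/style"
# Hypothetical private namespace for the alien attribute.
ALIEN_NS = "urn:example:my-metadata"

# Hypothetical style definition, reduced to the bare minimum.
SAMPLE = (
    f'<style:style xmlns:style="{STYLE_NS}" style:name="P1">'
    '<style:properties/>'
    '</style:style>'
)


def tag_style_properties(xml_text: str, name: str, value: str) -> str:
    """Attach an alien attribute to every style:properties element."""
    # Registering prefixes keeps the serialized output readable.
    ET.register_namespace("style", STYLE_NS)
    ET.register_namespace("my", ALIEN_NS)
    root = ET.fromstring(xml_text)
    for props in root.iter(f"{{{STYLE_NS}}}properties"):
        props.set(f"{{{ALIEN_NS}}}{name}", value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(tag_style_properties(SAMPLE, "review-id", "42"))
```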


  6. But I really, really want plain XML. No compression, no binary data, no nothing. Just plain XML. Can I have that, please?

    For import and export, we provide UNO-based services that allow you to import or export XML data through the SAX interface. Documentation of this technique is available here.

    We also plan to support reading and writing plain XML files (without packages). However, this is not implemented yet.
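
    The UNO services themselves are not shown here. As a loosely related sketch only, the following Python code pulls one XML stream out of a package and feeds it to a SAX handler from Python's own xml.sax module, which gives a flavor of event-based processing of the plain XML. The handler, helper names, and demo content are hypothetical and do not represent the UNO API.

```python
import io
import zipfile
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """SAX handler that counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def build_demo_package() -> bytes:
    """Build a tiny in-memory package with hypothetical content."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<doc><p>one</p><p>two</p></doc>")
    return buf.getvalue()


def count_elements(package: bytes, stream: str = "content.xml") -> dict:
    """Run one XML stream of the package through the SAX handler."""
    handler = ElementCounter()
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        xml.sax.parseString(zf.read(stream), handler)
    return handler.counts


if __name__ == "__main__":
    print(count_elements(build_demo_package()))
```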


  7. Why are so many styles written out?

    In general, styles that are used in the document or that have been modified by the user are written to disk. The former are necessary to render the document correctly. The latter are preserved because a user who edited those styles is likely to use them later on. Therefore, those styles should not be discarded, even though they do not contribute to the document in its current form.

    If styles that meet neither of these criteria are written, this may be considered a bug. The Draw, Impress, and Calc applications currently show this behavior.


  8. How are embedded images and binary data handled?

    Images and embedded objects are stored in their native formats inside the ZIP-based package.
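
    Because the images sit in the package as ordinary ZIP entries, they can be pulled out byte for byte. The Python sketch below collects everything under the Pictures/ directory; the helper names and the demo entry (just a PNG file signature, not a full image) are hypothetical.

```python
import io
import zipfile

# The eight-byte PNG file signature, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def build_demo_package() -> bytes:
    """Build a tiny in-memory package with one hypothetical picture."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<content/>")
        zf.writestr("Pictures/logo.png", PNG_SIGNATURE)
    return buf.getvalue()


def list_pictures(package: bytes) -> dict:
    """Return {name: raw bytes} for every file under Pictures/."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {
            name: zf.read(name)
            for name in zf.namelist()
            # Skip directory entries, which end with a slash.
            if name.startswith("Pictures/") and not name.endswith("/")
        }


if __name__ == "__main__":
    for name, data in list_pictures(build_demo_package()).items():
        print(name, len(data), "bytes")
```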


  9. Why didn't you use XHTML, XSL-FO, SVG, etc.?

    Those formats are not used because they do not have a suitable level of presentation for office documents. Where we found that an established format (like the ones mentioned above) contains concepts that are also used in OpenOffice.org, we generally adopted its representation of those concepts in our XML format. We hope this will ease transformation between the formats.


  10. Can I write an XML translation from or into ...?

    You are absolutely welcome to write transformations between our XML-based file format and anything you see fit, in either direction.


  11. I found a bug. What do I do?

    Report it using IssueZilla. Try to give a detailed description of what went wrong. Don't forget to include the document in which the error occurred. (After submission of the bug, choose "create attachment".)

    DON'T BE SHY ABOUT REPORTING BUGS! All of us are interested in stable and bug-free applications, and bug reports from our users are a very important means towards that end. Bug reports help all of us. If you don't report your findings, we can't fix them, and so they will cause problems for other users as well.


  12. Hey, I like your XML format. How can I help?

    There are many things you can do to help.

    1. You can spread the word, e.g. by telling your friends and co-workers about OpenOffice.org.
    2. You can use the OpenOffice.org applications and report any bugs you find.
    3. You can program transformations from our format into others (and vice versa).
    4. You can implement one of the suggestions on the todo list on our homepage.


  13. But what about ...? And why isn't my favorite question in here?

    If your question is not answered here, ask it on our mailing list. You can view the archives here. Instructions for joining the list are on our project homepage.


FAQ maintained by dvo.
