Apache OpenOffice (AOO) Bugzilla – Issue 16322
text:span inserts line break in XML export
Last modified: 2013-02-07 22:12:34 UTC
When using the XSLT style sheets to export to DocBook, a text:span element inserts additional white space after it. E.g. the sequence "123" in the OOo document, with only the "2" printed in italics, creates a line break between the "2" and the "3" in the exported XML. Using FlatXML export + external XSLT processor avoids this problem.
taking over
Set the target to 'OOo Later'.
Status to "NEW"
This issue is much more severe: it affects not only the DocBook export filter, or even all external filters, but the XML engine itself! At least, it seems so.

Try this: create a new document, write "hello" and mark "ll" as bold text. Then export it through DocBook, Word 2003 XML or any (!) other filter and have a look at the resulting file. You will notice an unwanted line break after the "ll" and before the "o".

You can see that it is not just a bug in one filter by trying the following in one of your own filters:

    <xsl:template match="text:span">
      <xsl:text>[SPAN]</xsl:text>
    </xsl:template>

This should create something like "he[SPAN]o" from a document containing only the word "hello" with "ll" marked bold, underlined, italic or whatever. But the result is "he[SPAN]", then a line break, and then "o". The additional line break is totally wrong here.

I would suggest changing the Target Milestone to something earlier than "OOo Later", since this is really annoying: it prevents XML programmers from creating export filters that really work! The affected Subcomponent may be "code" rather than "external filters", but I'm not really sure about that.
I'm not sure whether I completely understand your problem here. I think what you mean is the following:

    <text:p>normal<text:span text:style-name="T1">bold</text:span>normal</text:p>

should not be serialized as

    <text:p>normal <text:span text:style-name="T1">bold</text:span> normal</text:p>

However, to my understanding, whitespace is totally OK in this situation, since the two XML fragments are equivalent. If there had been a real break in the document, a <text:line-break/> element would have been present.

@michael: do you see any real problem here?
Oh no, perish the thought, they are *not* equivalent, at least not in DocBook! And unfortunately it is exported 1:1, which inserts false whitespace. I don't know how OOo handles whitespace in this case. I used to know that, but I've forgotten. It may be true that for OOo both are equivalent. However, this would be very strange, if you think of HTML's "<em>non</em>-vanishing values".
I'm not sure about this at the moment, but the two XML fragments may not be equivalent when using other settings for <xsl:strip-space>, <xsl:preserve-space> and others in your XSL style sheet file. Also, it is a feature of XSLT to be able to produce any kind of text output, not just XML. E.g., one could easily write an XSL style sheet to output plain text. Thus, newlines should never be added automatically, and the XSL style sheet should have the opportunity to define clearly where to put a newline in the output and where not.
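To make the plain-text case concrete, here is a minimal sketch of such a style sheet. It is only an illustration, not one of the filters shipped with OOo. The namespace URI bound to the text prefix is an assumption and has to match whatever the exported document actually declares, and a real filter would restrict processing to the document body instead of relying on the built-in rules for everything else.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the text namespace URI below is an assumption;
     use the URI declared in the document being transformed. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">

  <!-- Text output method: the indent setting plays no role here -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Drop whitespace-only text nodes of the source tree,
       except inside paragraphs and spans, where they are content -->
  <xsl:strip-space elements="*"/>
  <xsl:preserve-space elements="text:p text:span"/>

  <!-- A paragraph ends with exactly one newline, written explicitly -->
  <xsl:template match="text:p">
    <xsl:apply-templates/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- An explicit line break in the document becomes a newline -->
  <xsl:template match="text:line-break">
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- text:span needs no template: the built-in rules copy its text -->
</xsl:stylesheet>
```

In this sketch every newline in the output comes from an explicit <xsl:text> instruction, which is the kind of control the comment above asks for.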
OK, I tested this a bit, and they are not equivalent. The additional line break would produce a space if the file were opened in Writer: whitespace is collapsed, but (naturally) not ignored.

What I also found out is that the whitespace isn't introduced by the XML exporter itself but by the XSLT engine when indenting is enabled. I used a copy transformation:

    <?xml version='1.0' encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="no" omit-xml-declaration="no" version="1.0" encoding="UTF-8"/>
      <xsl:template match="*|@*|comment()|processing-instruction()|text()">
        <xsl:copy>
          <xsl:apply-templates select="*|@*|comment()|processing-instruction()|text()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

With xsl:output@indent="no" everything is OK. With indent="yes", which allows the engine to introduce whitespace at its own discretion, the described behaviour can occur (depending on line length etc.). The DocBook transformation has indent set to "yes"; setting it to "no" helps in my test case.

Does this help your case too?
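For an export filter that produces mixed content, the same setting could look roughly like the sketch below. This is not the DocBook style sheet shipped with OOo. Mapping every text:span to <emphasis> is a simplification chosen for illustration, and the text namespace URI is an assumption that must match the one declared in the exported document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace URI and element mapping are assumptions -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    exclude-result-prefixes="text">

  <!-- indent="no": the serializer must not add whitespace of its own -->
  <xsl:output method="xml" indent="no" encoding="UTF-8"/>

  <!-- Paragraph: mixed content, so children are emitted back to back -->
  <xsl:template match="text:p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

  <!-- Formatted run inside a paragraph (simplified mapping) -->
  <xsl:template match="text:span">
    <emphasis><xsl:apply-templates/></emphasis>
  </xsl:template>
</xsl:stylesheet>
```

Applied to <text:p>he<text:span text:style-name="T1">ll</text:span>o</text:p>, a conforming processor should serialize the paragraph as <para>he<emphasis>ll</emphasis>o</para>, with no whitespace between the three parts. The cost is that the whole output is no longer pretty-printed.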
xmlfilter for you