Issue 16322 - text:span inserts line break in XML export
Summary: text:span inserts line break in XML export
Status: CONFIRMED
Alias: None
Product: xml
Classification: Code
Component: external filters
Version: OOo 1.1 Beta
Hardware: PC Linux, all
Importance: P3 Trivial with 2 votes
Target Milestone: AOO Later
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-07-02 09:43 UTC by bronger
Modified: 2013-02-07 22:12 UTC
CC List: 2 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description bronger 2003-07-02 09:43:29 UTC
When using the XSLT style sheets to export to DocBook, a text:span element
inserts additional white space after it.  E.g. the sequence "123" in the OOo
document, with only the "2" printed in italics, creates a line break between the
"2" and the "3" in the exported XML.  Using FlatXML export + external XSLT
processor avoids this problem.
Comment 1 lo 2003-09-11 08:21:12 UTC
taking over
Comment 2 jogi 2003-10-01 08:57:01 UTC
Set the target to 'OOo Later'.
Comment 3 jogi 2003-10-01 14:04:17 UTC
Status to "NEW"
Comment 4 tnd 2003-10-08 18:31:45 UTC
This issue is much more severe: It doesn't only affect the DocBook
export filter or just all external filters but the XML engine itself!
At least, it seems so.

Try this: Create a new document, write "hello" and mark "ll" as bold
text. Then export it through DocBook, Word 2003 XML or any (!) other
filter and have a look at the resulting file. You will notice an
unwanted line break after the "ll" and before the "o".

You can see that it is not just a bug in one filter by trying the
following in one of your own filters:

<xsl:template match="text:span">
  <xsl:text>[SPAN]</xsl:text>
</xsl:template>

This should create something like "he[SPAN]o" from a document only
containing the word "hello" with "ll" marked bold, underline, italics
or whatsoever. But the result is:

  "he[SPAN]" -- then a line break and then -- "o"

The additional line break is totally wrong here.

I would suggest to change the Target Milestone to something earlier
than "OOo Later" since it's really annoying 'cause it prevents XML
programmers from creating export filters that really work! The
affected Subcomponent may be rather "code" than "external filters",
but I'm not really sure about that.
Comment 5 lo 2004-05-27 15:08:59 UTC
I'm not sure whether I understand your problem completely here. I think what you
mean is the following:
<text:p>normal<text:span text:style-name="T1">bold</text:span>normal</text:p>
should not be serialized as
<text:p>normal
<text:span text:style-name="T1">bold</text:span>
normal</text:p>

However, to my understanding, whitespace is totally OK in this situation,
since the two XML fragments are equivalent. If there had been a real break in
the document, a <text:line-break/> element would have been present.

@michael: do you see any real problem here?
Comment 6 bronger 2004-05-27 16:01:28 UTC
Oh no, perish the thought, they are *not* equivalent, at least not in DocBook! 
And unfortunately it is exported 1:1, which inserts false whitespace.

I don't know how OOo handles whitespace in this case.  I used to know that, but
I've forgotten.  It may be true that for OOo both are equivalent.  However, this
would be very strange, if you think of HTML's "<em>non</em>-vanishing values".
Comment 7 tnd 2004-05-27 16:42:36 UTC
I'm not sure about this at the moment, but the two XML fragments may not be
equivalent when using other settings for <xsl:strip-space>, <xsl:preserve-space>
and others in your XSL style sheet file.

And it is a feature of XSLT to be able to produce any kind of text output, not
just XML. E.g., one could easily write an XSL style sheet to output plain text.
Thus, newlines should never be added automatically, and the XSL style sheet
should have the opportunity to clearly define where to put a newline in the
output and where not.
Comment 8 lo 2004-05-27 17:06:57 UTC
OK, I tested this a bit and they are not equivalent. The additional line break
would produce a space if opened in Writer. Whitespace is collapsed, but not
ignored (naturally).
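
The difference can be checked outside of OOo. The following is a minimal sketch in Python (standard library only), assuming simplified stand-in element names `p` and `span` instead of text:p and text:span. The `collapse` helper is hypothetical and only approximates how collapsed whitespace would be read back:

```python
import xml.etree.ElementTree as ET

# Stand-in fragments: "p" and "span" replace text:p and text:span for brevity.
INLINE = '<p>normal<span>bold</span>normal</p>'
BROKEN = '<p>normal\n<span>bold</span>\nnormal</p>'


def text_of(xml):
    """Concatenate every text node of the fragment in document order."""
    return "".join(ET.fromstring(xml).itertext())


def collapse(text):
    """Collapse each run of whitespace to one space (an assumed reading rule)."""
    return " ".join(text.split())


print(repr(collapse(text_of(INLINE))))  # 'normalboldnormal'
print(repr(collapse(text_of(BROKEN))))  # 'normal bold normal'
```

Under these assumptions the second fragment reads back with two spaces that were never typed, which matches the "123" case from the description.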

What I also found out is that the whitespace isn't introduced by the XML
exporter itself but by the XSLT engine when indenting is enabled.
I used a copy transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" omit-xml-declaration="no" version="1.0"
              encoding="UTF-8"/>
  <!-- Identity transformation: copy every node unchanged. -->
  <xsl:template match="*|@*|comment()|processing-instruction()|text()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|comment()|processing-instruction()|text()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

With xsl:output@indent="no" everything is OK. With indent="yes", which allows
the engine to introduce whitespace at its own discretion, the described
behaviour can occur (depending on line length etc.).
The DocBook transformation has indent set to "yes"; setting it to "no" helps in
my test case.
Does this help your case too?
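
A serializer that indents without regard to mixed content changes the text of the document. As a rough analogy only, the sketch below uses Python's xml.dom.minidom, which is not the XSLT engine OOo uses; the element names `p` and `span` are simplified stand-ins, and `text_content` is a helper introduced here for illustration:

```python
from xml.dom import minidom

# Stand-in fragment: "p" and "span" replace text:p and text:span for brevity.
SOURCE = '<p>normal<span>bold</span>normal</p>'


def text_content(node):
    """Concatenate all text nodes below the given DOM node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_content(child) for child in node.childNodes)


root = minidom.parseString(SOURCE).documentElement

compact = root.toxml()                   # comparable to indent="no"
pretty = root.toprettyxml(indent="  ")   # comparable to indent="yes"

# Re-read both serializations and compare the text a consumer would see.
compact_text = text_content(minidom.parseString(compact).documentElement)
pretty_text = text_content(minidom.parseString(pretty).documentElement)

print(repr(compact_text))
print(repr(pretty_text))
```

The compact form round-trips unchanged, while the indented form comes back with newlines and spaces around the span, so turning indenting off is the safer setting for documents with inline markup.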
Comment 9 lo 2005-12-13 13:42:21 UTC
xmlfilter for you