Issue 13857 - text box in wrong location when importing .doc or .rtf file from Word
Summary: text box in wrong location when importing .doc or .rtf file from Word
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: ui
Version: OOo 1.1 Beta
Hardware: PC All
Importance: P3 Trivial
Target Milestone: ---
Assignee: michael.ruess
QA Contact: issues@sw
URL: http://www.ieee.org/organizations/pub...
Keywords: ms_interoperability, oooqa
Depends on: 27349
Blocks:
Reported: 2003-04-27 00:54 UTC by dankegel
Modified: 2013-08-07 14:41 UTC
CC List: 3 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Minimal test case. "line2" should be below "LINE1", but is superimposed on top of it instead. (1.31 KB, text/rtf)
2003-04-28 08:29 UTC, dankegel
An RTF file that is not recognizable in OOo (4.75 KB, application/rtf)
2003-10-21 21:30 UTC, ivaroo

Description dankegel 2003-04-27 00:54:47 UTC
This problem was also seen in OpenOffice 1.0.x, so it is not new.
It is present both on Windows and on Linux.

In the document TRANS-JOUR.DOC, 
the text "First A. Author, Second B. Author, Jr. and Third C. Author"
is in a textbox that should appear below the headline
"Preparations of Papers for IEEE Transactions and Journals".
Unfortunately, it appears superimposed over the top of the
headline, which looks awful and is hard to read.
Comment 1 dankegel 2003-04-27 01:01:03 UTC
See
http://a957.g.akamai.net/f/957/3680/1h/www.ieee.org/organizations/pubs/transactions/TRANS-JOUR.PDF
for a pdf showing how it should look.
Comment 2 dankegel 2003-04-28 08:29:51 UTC
Created attachment 5922
Minimal test case.  "line2" should be below "LINE1", but is superimposed on top of it instead.
Comment 3 dankegel 2003-04-28 08:34:50 UTC
OK, I attached a minimal test case (produced with Word, but trimmed
down by hand a bit).
The word "line2" should appear *below* the word "LINE1",
but in OpenOffice, it appears on top of it.
Abiword gets this right, by the way.
The problem is the same whether the file is saved in .rtf or .doc.
Comment 4 dankegel 2003-05-08 06:13:09 UTC
I hereby confirm my own bug, as directed by the "how to help" page.
Comment 5 h.ilter 2003-05-08 14:56:25 UTC
Reassigned to MRU
Comment 6 michael.ruess 2003-05-20 10:03:25 UTC
MRU->CMC: The problem is that these old WW6 frames are positioned
above their anchor, although the properties tell me "0,00 cm vertical
position"...
Do you think there's a possibility to "fix" anything here?
Comment 7 caolanm 2003-06-09 09:32:32 UTC
can't be done correctly before 2.0 until I get the other placement
options we need.
Comment 8 dankegel 2003-06-09 15:11:53 UTC
OK, adding ms_interoperability keyword to indicate
that it's an ms-word interoperability problem that
can't be solved until 2.0.
Comment 9 caolanm 2003-08-19 15:18:07 UTC
cmc->od: This is as far as we can go in the filter at the moment. I'll
reassign this to you, as you're looking into placement options and so on.

This is an unusual case as it involves "old style" frames. These
frames can be created manually by inserting a text box in Word and
choosing format->text box->text box->convert to frame... It's good to
consider the placement options (and layout behaviour) that are
available with these frames in addition to the "normal" ones.
Comment 10 Oliver-Rainer Wittmann 2003-08-20 09:15:35 UTC
OD->Dan Kegel (20.08.2003):
Hello Dan,
I took a closer look at your example and I wonder what MS Word does
with the given vertical position of the frames:
Both frames are anchored at the second paragraph and vertically
positioned 0 cm from this paragraph, with a distance to the text of
0,33 cm. Thus, both frames are supposed to have the same vertical
position directly in front of the second paragraph, correct? But MS
Word positions these frames before the first paragraph, with different
vertical positions.
When I enter some text in the first paragraph (more than one line),
every line of the first paragraph except the last one is positioned
before the frames. Looking at the given PDF, it doesn't seem that this
was your intention, right? I think you should anchor the frames at the
first paragraph.
I figured out that MS Word behaves like this because of the given
distance to the text. I changed the value for both frames to 0 cm. Then
MS Word behaves as I expected: both frames are positioned directly
before the second paragraph and behind the first paragraph. If I
change the value of frame 'LINE1' to 0,1 cm, it is positioned before
the first paragraph, keeping the given distance at the bottom, but
with an unexpected distance of about 0,3 cm at the top - I don't know
why.
Then I changed the value of frame 'line2' to 0,1 cm as well. Now it is
also positioned before the first paragraph, keeping the expected
distance at the bottom, but an unexpected distance of 0,5 cm is kept
between the frames - I don't know why.
As you see, I'm a little bit confused about what MS Word does with
the given positioning values. Such behaviour would be quite
complicated to implement in Writer.
I propose a workaround for your positioning problem:
Anchor the frames to the first paragraph and position them vertically
relative to the page with appropriate values, e.g. 1,8 cm for frame
'LINE1' and 3,1 cm for frame 'line2'.
Please give me feedback on whether this works for you.
Comment 11 caolanm 2003-08-20 10:17:20 UTC
cmc->od: These old style frames are a little bit interesting in word,
this type of frame is not actually an object like a drawing object,
but instead these frames are properties of the paragraphs inside the
frame.

So an old style frame is a series of paragraphs which all have the
same absolute positioning properties. 

This might go some way towards explaining why the distance from text
matters for where the frames are positioned in Word, as the frames
are actually paragraphs and so are "text". Perhaps the distance from
text value of one frame considers text inside other old style frames
when deciding where to go.
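
To illustrate the description above (an old style frame is a run of paragraphs that share the same absolute positioning properties), here is a minimal sketch. The `FramePos` and `Paragraph` types and their field names are hypothetical and are not taken from the Word file format or from the OOo filter code; the sketch only shows how an importer might group consecutive paragraphs into frames.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Optional


@dataclass(frozen=True)
class FramePos:
    """Hypothetical absolute-positioning properties carried by a paragraph."""
    anchor: str              # e.g. "paragraph" or "page"
    x_cm: float
    y_cm: float
    dist_to_text_cm: float


@dataclass
class Paragraph:
    text: str
    frame: Optional[FramePos] = None   # None means ordinary body text


def group_into_frames(paragraphs):
    """Collect consecutive paragraphs with equal FramePos into one frame."""
    result = []
    for pos, run in groupby(paragraphs, key=lambda p: p.frame):
        texts = [p.text for p in run]
        if pos is None:
            # Body paragraphs stay separate entries.
            result.extend(("body", None, [t]) for t in texts)
        else:
            # One frame per run of identical positioning properties.
            result.append(("frame", pos, texts))
    return result


pos = FramePos(anchor="paragraph", x_cm=0.0, y_cm=0.0, dist_to_text_cm=0.33)
doc = [
    Paragraph("Title line", pos),
    Paragraph("Subtitle line", pos),
    Paragraph("Body text"),
]
grouped = group_into_frames(doc)
print(grouped)
```

In this model a frame has no identity of its own: two adjacent paragraphs end up in the same frame only because their positioning properties compare equal.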
Comment 12 Oliver-Rainer Wittmann 2003-08-20 10:53:46 UTC
OD->CMC (20.08.2003):
Thanks for your comments.
As I figured out with Andreas (AMA), the distance between the two
frames doesn't seem to depend on the 'distance to text' values. It
seems to depend on how the MS Word layout engine works:
It seems that first the body text is formatted without any frame.
Next, frame 'LINE1' seems to be positioned considering the current
paragraph position. Afterwards the body text is formatted again and
now wraps around frame 'LINE1', but frame 'LINE1' isn't notified that
its anchor paragraph has moved. Next, frame 'line2' seems to be
positioned considering the new paragraph position. Afterwards, the
body text is formatted again and now wraps around frame 'line2', but
no frame is notified of this movement.
To prove this 'theory', set the following vertical position values:
-0,5 cm for frame 'LINE1' and -1,0 cm for frame 'line2'. As you see,
the frames now overlap and the positions seem to be calculated
following the given 'theory'.
What is your opinion about this 'theory'?

For 'new style' frames (text boxes) a similar 'theory' about the
layout engine holds, but with the difference that the body text
isn't formatted until the last text box is positioned. Thus, the
positions of both text boxes are determined by the paragraph position
before the text boxes are inserted.

For a mixture of 'old style' and 'new style' frames, we didn't find a
consistent 'theory'.
Can you confirm this?
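
The two 'theories' above can be compared in a toy one-dimensional model. This is only a sketch under strong assumptions: paragraphs are reduced to heights in cm, wrapping is simplified to "a paragraph that would overlap a frame is pushed below it", and the names `Frame`, `layout_body`, `position_old_style` and `position_new_style` are hypothetical. It is not the algorithm of MS Word or of Writer, and it does not try to reproduce the distances measured in the comments above.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    anchor: int      # index of the anchor paragraph
    offset: float    # vertical offset from the anchor's top, in cm
    height: float    # frame height, in cm


def layout_body(heights, placed):
    """Flow paragraphs downwards; push each one below any frame it overlaps."""
    tops, y = [], 0.0
    for h in heights:
        top = y
        moved = True
        while moved:
            moved = False
            for f_top, f_height in placed:
                if top < f_top + f_height and top + h > f_top:
                    top = f_top + f_height
                    moved = True
        tops.append(top)
        y = top + h
    return tops


def position_old_style(heights, frames):
    """Position frames one by one, re-formatting the body after each frame."""
    placed, result = [], {}
    tops = layout_body(heights, placed)
    for f in frames:
        y = tops[f.anchor] + f.offset
        result[f.name] = y
        placed.append((y, f.height))
        # Body text is formatted again; already placed frames are not moved.
        tops = layout_body(heights, placed)
    return result


def position_new_style(heights, frames):
    """Position all frames against the body layout computed without frames."""
    tops = layout_body(heights, [])
    return {f.name: tops[f.anchor] + f.offset for f in frames}


paragraphs = [1.0, 1.0]
frames = [Frame("LINE1", 1, 0.0, 1.0), Frame("line2", 1, 0.0, 1.0)]
old_result = position_old_style(paragraphs, frames)
new_result = position_new_style(paragraphs, frames)
print(old_result)   # {'LINE1': 1.0, 'line2': 2.0}
print(new_result)   # {'LINE1': 1.0, 'line2': 1.0}
```

With identical offsets, the sequential model places 'line2' below 'LINE1', while the model that positions every frame before re-formatting the body puts both frames at the same vertical position.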
Comment 13 dankegel 2003-08-20 17:15:06 UTC
dk->od: the document I found this in was a very important .doc
file (at least according to google, which dredged it up for me)
that happened to have a .pdf version as well.  I have no control
over it.  No workaround is therefore possible on the source side --
OpenOffice really does have to render this the same way as Microsoft
Office.
Comment 14 Oliver-Rainer Wittmann 2003-08-21 12:31:34 UTC
OD->Dan (21.08.2003):
Thanks for your comment, but your statement "OpenOffice really does
have to render this the same way as Microsoft Office." doesn't help
very much.
OpenOffice is *not* a clone of Microsoft Office. We are providing a
filter for Microsoft Office word processing documents and we really
try to be as close as possible. And we try to adjust our layout engine
for such documents. But we can't be perfect, because during the
import we have only the layout information stored in the document and
we can't look into the Microsoft Office code, because it isn't open
source. We also have to consider already existing OpenOffice
documents, which are rendered with our current layout engine. These
documents have to be rendered as they are in the current state. Thus,
each adjustment of the layout engine has to consider this.
In the given case, we have got the positioning values for the frames,
but as you can see in the discussion, we have to try to understand
how Microsoft Office uses these positioning values to find the
corresponding position in the document. And I think you can agree
that the algorithm which is used isn't very intuitive and doesn't
correspond directly to the given positioning values.
But help to improve our filter is very welcome, and if you have a
concrete proposal on how the layout engine of Microsoft Office works
and how we can implement this in OpenOffice, we would appreciate it
very much.
Comment 15 dankegel 2003-08-21 14:35:12 UTC
Sorry, I'm just the messenger.  Users are going to expect OOo
to be able to load that file, that's all.  I wish I had time
to help implement the fix, but I am off saving the world in
other ways...
Comment 16 Oliver-Rainer Wittmann 2003-09-23 13:01:09 UTC
OD (23.09.2003): accepted.
With an adjusted formatting of frames with wrapping, this 'defect' can
be solved. We will do our best.
Comment 17 Oliver-Rainer Wittmann 2003-09-24 12:55:50 UTC
OD (24.09.2003):
We accept the challenge - once we fully understand the layout
algorithm of MS Word, we will try to implement it.
Comment 18 dankegel 2003-09-24 15:08:12 UTC
OK, good luck!  This might take care of one of the few
remaining stumbling blocks for certain large automotive
manufacturers.  If you make progress on this, and need
more test cases, I can send you some tough ones from
a real live huge manufacturer who is waiting for this kind
of fix before moving to OpenOffice/StarOffice.
Comment 19 ivaroo 2003-10-21 21:30:03 UTC
Created attachment 10526
An RTF file that is not recognizable in OOo
Comment 20 ivaroo 2003-10-21 21:33:39 UTC
Just added an RTF file that makes the trouble even worse. This document
looks nice in Word (some overprinting, but this is by design). The image
is probably black because I messed it up a bit (sorry, but I had to).
I'll upload a picture of how Word 2000 (and I personally) like it.
Comment 21 dankegel 2004-03-05 06:18:57 UTC
Amusing factoid: OOo 1.1.1rc crashes when you load and then save the
document, TRANS-JOUR.DOC, that caused me to file this report
in the first place.
Comment 22 dankegel 2004-03-05 16:33:39 UTC
Amusing factoid, part 2: the crash I just mentioned has been 
in issuezilla for a while (with a different document)
as issue 24978.
Comment 23 Oliver-Rainer Wittmann 2004-07-05 13:32:16 UTC
Added dependency on issue #27349

OD->ivaroo: Please submit a separate issue for the RTF-import of document
'Thermat.rtf'. It isn't handled by this issue. 
Comment 24 Oliver-Rainer Wittmann 2004-07-12 15:16:40 UTC
Fixed in cws swobjpos04 by issue #27349
Comment 25 dankegel 2004-07-13 07:14:29 UTC
Which milestone will the fix appear in, do you think?
I just tested with 680m45, and while there has been
a lot of improvement in rendering of TRANS-JOUR.DOC 
in the 14 months since I filed the bug, this issue
persists (as does one other: not as much vertical space
is used above the footnote on the first page,
which is probably partly responsible for the strange 
positioning of Fig 1 on page 2 instead of page 3
as in the PDF).
Comment 26 Oliver-Rainer Wittmann 2004-07-13 08:21:15 UTC
OD->dankegel:
Thank you for the compliments about our success in improving our Microsoft
interoperability. It was a 'long way' for this issue to be solved. Several
features have been implemented:
- 'Negative positions for Writer fly frames', specification found at
http://specs.openoffice.org/writer/compatibility/negative_positions_for_Writer_fly_frames.sxw
- 'Follow text flow vs. leaving layout environment for Writer fly frames',
specification found at
http://specs.openoffice.org/writer/compatibility/follow_text_flow_vs_leaving_environment.sxw
- 'Vertical alignments at page areas for Writer fly frames', specification found
at
http://specs.openoffice.org/writer/compatibility/vertical_alignment_at_page_areas.sxw
- 'Adjust positioning of floating screen objects', specification found at
http://specs.openoffice.org/writer/compatibility/adjust-object-positioning.sxw
- 'Adjust text wrapping', specification found at
http://specs.openoffice.org/writer/compatibility/adjust-text-wrapping.sxw
- 'Unification of object positioning', specification found at
http://specs.openoffice.org/writer/compatibility/unification_of_object_positioning.sxw
- and finally 'Positioning of floating screen objects with considering its
wrapping mode', implemented in cws swobjpos04, specification found at
http://specs.openoffice.org/writer/compatibility/obj-pos-without-wrapping.sxw
You see, we weren't on holiday during the last 14 months.

The cws swobjpos04 is currently being synchronised to SRC680m47. Afterwards an
internal installation set will be built. This installation set is checked by the
quality assurance team. If everything is ok, the cws is nominated for
integration. I think it will take at least 3 weeks to nominate cws swobjpos04.
Then release engineering will integrate the cws into the master. I don't
know exactly which milestone that will be.

BTW, in my local environment of cws swobjpos04, document 'TRANS-JOUR.DOC' looks
nearly the same in Microsoft Word and Writer - the positions of the page breaks
differ by about one line.
Comment 27 dankegel 2004-07-14 04:09:53 UTC
OK, I'll check again in a month or so.

On your local copy, does the text flow around
Fig. 1 properly?  On the July snapshot, the text runs
behind the picture.  (Interestingly, it's in the
right spot on the page; looks like it's anchored
relative to the page, but the text doesn't notice
the image in the way!)

The Q concept was quite aggressive on its MS compatibility goals,
and it looks like you folks are on your way to really delivering.
Comment 28 Oliver-Rainer Wittmann 2004-07-14 15:58:08 UTC
OD->dankegel:
Yes, in my local copy the text wraps around figure 1

Reopened to assign to QA
Comment 29 Oliver-Rainer Wittmann 2004-07-14 15:59:09 UTC
OD->MRU: Checked in internal installation set of cws swobjpos04 - please verify.
Comment 30 Oliver-Rainer Wittmann 2004-07-14 15:59:43 UTC
set status back to FIXED
Comment 31 michael.ruess 2004-07-16 11:16:03 UTC
Checked fix in CWS swdrawpos04.
Comment 32 michael.ruess 2004-09-08 09:40:06 UTC
Checked in 680m52. Closed.