Issue 62295 - very slow (37 minutes) to load small (77KB) .sxw file
Summary: very slow (37 minutes) to load small (77KB) .sxw file
Status: ACCEPTED
Alias: None
Product: Writer
Classification: Application
Component: open-import
Version: OOo 2.3.1
Hardware: PC All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: oooqa, performance
Depends on:
Blocks:
 
Reported: 2006-02-20 05:30 UTC by noise_e_piranha
Modified: 2017-05-20 11:18 UTC
CC List: 5 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
file exhibiting problem (76.13 KB, application/vnd.sun.xml.writer)
2006-02-20 05:31 UTC, noise_e_piranha

Description noise_e_piranha 2006-02-20 05:30:26 UTC
When I attempt to load the attached 77KB .sxw document into OOo 1.1.4 on a 
1.5GHz P4 with XP SP2 and 256MB RAM using a non-administrator account, the 
load begins to slow down after the 12th gray progress indicator segment 
during "Loading Document..." status.  It slows down even more after the 13th 
segment, and a lot more after the 15th.  The CPU pegs at 100%, mostly 
soffice.exe.  After approximately 37 minutes, the document is loaded 
successfully and OOo seems not to have any problems with it.

I am able to open other, much larger (401KB, for example) documents in a 
matter of seconds.  I am also able to save the file in seconds, although 
longer than normal for this size file.

If I save the file as Microsoft Word 97/2000/XP .doc, loading also hangs a bit 
but still completes in under 1 minute.  If I re-save the .doc as .sxw then 
load again, the new .sxw file has no problems loading (although the formatting 
has changed a bit due to Word/OOo compatibility).

Unable to upgrade to 2.x at this time.



// slow slowly long time forever load loading half hour start startup text 
writer sxw native file document multiple tables font fonts symbol symbols gray 
screen window indicator hang hangs freeze freezes maximum pegged cpu percent
Comment 1 noise_e_piranha 2006-02-20 05:31:28 UTC
Created attachment 34291
file exhibiting problem
Comment 2 michael.ruess 2006-02-20 07:32:25 UTC
Reassigned to ES.
Comment 3 eric.savary 2006-02-20 14:55:23 UTC
ES->FLR: the content.xml is 11 (eleven) MB in size and consists mainly of nested
"<text:span text:style-name="AMA3 ss - Blank Line">" tags.
Comment 4 flr 2006-02-20 15:15:00 UTC
flr:
It seems that the style "AMA3 ss - Blank Line" is applied multiple times to a
single paragraph. This can only happen via the API.
My question is: how was the document created originally? Was it a special kind
of filter or an XSL(T) script?
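
The observations in comments 3 and 4 can be reproduced without opening the file in Writer, because an .sxw (or .odt) file is a ZIP package whose body text lives in content.xml. The following is a minimal sketch, not a tool used by anyone in this thread: `span_stats` and `build_demo_package` are hypothetical helper names, and the demo package is synthetic rather than the attached bugdoc. It reports the compressed and uncompressed size of content.xml and the deepest nesting of span elements.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def span_stats(package):
    """Return (compressed_bytes, uncompressed_bytes, max_span_depth) for content.xml.

    `package` is a path or a binary file object for an .sxw/.odt file (a ZIP archive).
    """
    with zipfile.ZipFile(package) as zf:
        info = zf.getinfo("content.xml")
        depth = 0
        max_depth = 0
        with zf.open(info) as stream:
            # Stream the XML so nesting can be tracked through start/end events.
            for event, elem in ET.iterparse(stream, events=("start", "end")):
                # Compare the local name only, so the namespace URI does not matter.
                if elem.tag.rsplit("}", 1)[-1] != "span":
                    continue
                if event == "start":
                    depth += 1
                    max_depth = max(max_depth, depth)
                else:
                    depth -= 1
    return info.compress_size, info.file_size, max_depth


def build_demo_package(depth):
    """Build a synthetic in-memory package with `depth` nested spans (illustration only)."""
    ns = "http://openoffice.org/2000/text"
    open_tag = '<text:span text:style-name="AMA3 ss - Blank Line">'
    xml = (
        '<text:p xmlns:text="' + ns + '">'
        + open_tag * depth
        + "x"
        + "</text:span>" * depth
        + "</text:p>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    compressed, uncompressed, depth = span_stats(build_demo_package(50))
    print(compressed, uncompressed, depth)
```

Highly repetitive markup compresses very well, which would be consistent with a 77KB package expanding to an 11 MB content.xml.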
Comment 5 Mathias_Bauer 2006-03-09 14:46:52 UTC
As we don't know anything about the origin of the document we will stop working
on this issue. The fact that resaving the document fixes the problem encourages
me to think that this document is an artefact.

Question to the submitter: can you please explain how this document was created?
Comment 6 noise_e_piranha 2006-07-14 00:30:08 UTC
The document was created by, and has only been processed using, the OpenOffice 
end-user GUI, never a filter or external program.  I did a lot of copy-and-
paste, though, so perhaps it's a bug related to cut-and-paste.

I am now using OpenOffice 2.0.2 and the problem continues.  I resaved in ODT 
format, and when I reload the ODT, the load still takes "forever".  When I 
save then load as a DOC file using OpenOffice, the load is almost immediate.
Comment 7 utomo99 2008-01-04 13:13:31 UTC
77KB in 37 minutes? I think something is wrong.
Comment 8 hdu@apache.org 2008-02-05 15:06:05 UTC
While analyzing a performance problem on Aqua (i85798) I noticed the root cause of this problem:
The loop in SwXTextRange::_CreateNewBookmark() doesn't scale well with the number of bookmarks. The 
algorithmic complexity could be about O(n), but it is actually about O(n^2)!
Comment 9 Oliver Specht 2008-02-07 10:09:43 UTC
The function _CreateNewBookmark() iterates over all bookmark names to make sure
that the to-be-inserted new bookmark has a unique name. This is so slow because
Writer's bookmark array is not sorted by the names and all elements have to be
checked. 
I can't see the O(n^2) because the iteration over the names never finds a match
while loading the bugdoc. It would only fail if the sal_Int32 bookmark index
overflows or bookmarks with the name SwXTextPosition + <index> were inserted
otherwise. That doesn't happen here. 
Comment 10 Oliver Specht 2008-02-07 10:46:03 UTC
->AMA: I checked what happens if the string compare iterations are completely
removed. Then it takes only two minutes to load the doc.
Changing the bookmark array of SwDoc to some hashing container could do miracles
here. 
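
A minimal sketch of the hashing idea, assuming a simplified model in which a document only needs to track bookmark names. `BookmarkRegistry` is a hypothetical class for illustration and does not reflect the actual SwDoc data structures; only the "SwXTextPosition" prefix is taken from the comments above. Keeping the names in a hash set makes each uniqueness check roughly constant time instead of a scan over all existing bookmarks.

```python
class BookmarkRegistry:
    """Hypothetical stand-in for a document's bookmark names (not the real SwDoc API)."""

    def __init__(self):
        self._names = set()      # hash set: average O(1) membership test
        self._next_index = 0     # next numeric suffix to try

    def add(self, name):
        """Register an explicitly named bookmark; reject duplicates."""
        if name in self._names:
            raise ValueError("duplicate bookmark name: " + name)
        self._names.add(name)

    def create_unique(self, prefix="SwXTextPosition"):
        """Generate, register, and return a name that is not yet in use."""
        while True:
            candidate = prefix + str(self._next_index)
            self._next_index += 1
            if candidate not in self._names:
                self._names.add(candidate)
                return candidate

    def __len__(self):
        return len(self._names)


if __name__ == "__main__":
    registry = BookmarkRegistry()
    registry.add("SwXTextPosition1")   # a name that was inserted some other way
    print(registry.create_unique())    # skips nothing yet: index 0 is free
    print(registry.create_unique())    # index 1 is taken, so index 2 is used
```

The trade-off in this sketch is extra memory for the set and the need to keep it in sync whenever a bookmark is renamed or deleted.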

Target changed to 3.0
Reassigned
Comment 11 hdu@apache.org 2008-02-07 11:01:42 UTC
Update to my comment above: The method SwXTextRange::_CreateNewBookmark() itself is "only" O(n^1). 
Adding n bookmarks using this method is O(n^2) though.
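
The arithmetic behind this can be sketched by counting string comparisons in a simplified model: each new bookmark name is compared against every existing name in an unsorted list and, as in the bugdoc, never finds a match. The function name `comparisons_for` is hypothetical. With k names already present the check costs k comparisons, so n insertions cost 0 + 1 + ... + (n - 1) = n(n - 1)/2 in total.

```python
def comparisons_for(n):
    """Count name comparisons when inserting n bookmarks with a linear uniqueness scan."""
    names = []
    comparisons = 0
    for i in range(n):
        candidate = "SwXTextPosition" + str(i)
        # Linear scan over every existing name: O(k) for k existing bookmarks.
        for existing in names:
            comparisons += 1
            if existing == candidate:
                break  # never taken here, because all generated names are distinct
        names.append(candidate)
    return comparisons


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The count matches the closed form n * (n - 1) / 2.
        print(n, comparisons_for(n), n * (n - 1) // 2)
```

Growing the number of bookmarks tenfold multiplies the work by roughly a hundred in this model.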
Comment 12 andreas.martens 2008-04-28 12:48:12 UTC
Set target OOo3.x
Comment 13 malte_timmermann 2009-02-05 14:31:55 UTC
keyword: performance
Comment 14 Mathias_Bauer 2009-05-06 16:50:00 UTC
Bjoern, please have a look at the bookmark performance problems.
Comment 15 bjoern.michaelsen 2009-08-28 17:57:48 UTC
retargeting 3.3
Comment 16 bjoern.michaelsen 2010-07-22 21:18:19 UTC
confirmed in OOO320m18 (released OOo 3.2.1)
Comment 17 bjoern.michaelsen 2010-08-18 16:49:45 UTC
error is in old binfilter code only, closing as WONTFIX. Please reopen if the
performance is also as bad using modern file formats (e.g. odt).
Comment 18 Mathias_Bauer 2010-08-19 17:27:34 UTC
sxw was confused with sdw - this is not a binfilter issue.
Reopened.
Comment 19 bjoern.michaelsen 2010-08-20 15:32:11 UTC
starting work on cws swbookmarkfixes01 => STARTED
Comment 20 bjoern.michaelsen 2010-08-20 18:32:19 UTC
retargeting 3.4
Comment 21 bjoern.michaelsen 2010-10-15 15:44:56 UTC
I fixed the issue with the bookmark creation, however loading this doc is still
awfully slow. I see a lot of hint array operations -- no wonder judging by the
description of the content.xml by es. Given that the document is rather
pathological I am tempted to close the issue as WONTFIX.
Comment 22 hans_werner67 2011-02-03 12:08:08 UTC
pls. reassign or close issues.
Thx.
Comment 23 Marcus 2017-05-20 11:18:18 UTC
Reset assignee to the default "issues@openoffice.apache.org".