Issue 3103 - failed to open large size document
Summary: failed to open large size document
Status: CLOSED DUPLICATE of issue 5000
Alias: None
Product: Writer
Classification: Application
Component: code
Version: 641
Hardware: PC Windows 2000
Importance: P3 Trivial
Target Milestone: ---
Assignee: openoffice
QA Contact: issues@sw
URL: http://briefcase.yahoo.com/sf_wei
Keywords:
Depends on:
Blocks:
 
Reported: 2002-02-14 19:01 UTC by rachel
Modified: 2013-08-07 14:43 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
a large size doc file can't be opened (zipped from an xml file) (350.74 KB, text/plain)
2002-02-14 19:05 UTC, rachel

Description rachel 2002-02-14 19:01:53 UTC
Every time I try to open a large document, it fails with an soffice.exe 
application error message, such as: The instruction at "0x00231f5a" referenced 
memory at "0x06061d07". The memory could not be read...

PS: My document is a zipped XML file that is generated by a report 
program. At the URL link you will find a big file that can't be opened, but you 
can see the XML file if you unzip it. The small file works fine.
Comment 1 rachel 2002-02-14 19:05:34 UTC
Created attachment 1047
a large size doc file can't be opened (zipped from an xml file)
Comment 2 stefan.baltzer 2002-02-15 16:40:54 UTC
Reassigned to Éric.
Comment 3 eric.savary 2002-02-19 16:21:44 UTC
ES->DVO: Well, I also get a GPF in the 641. In the current build, no
GPF, just an empty page. That looks normal because the "sxw" file only
contains a huge content.xml, without manifest, style or meta...
Should we open an empty page or display an error message "Unknown format!"?
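
A quick way to confirm which streams a package actually contains is to list its zip entries. The following is only a minimal sketch: the helper name `missing_streams` is hypothetical, and the list of expected stream names is an assumption based on the usual layout of an .sxw package, not something taken from the attachment.

```python
import io
import zipfile

# Assumed stream names of a complete .sxw package (illustrative list).
EXPECTED_STREAMS = (
    "content.xml",
    "styles.xml",
    "meta.xml",
    "settings.xml",
    "META-INF/manifest.xml",
)


def missing_streams(package):
    """Return the expected stream names that are absent from the zip package.

    `package` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        present = set(zf.namelist())
    return [name for name in EXPECTED_STREAMS if name not in present]


if __name__ == "__main__":
    # Build an in-memory package that holds only content.xml.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("content.xml", "<placeholder/>")
    buffer.seek(0)
    print(missing_streams(buffer))
```

For a package like the one described here, every expected stream except content.xml would be reported as missing.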
Comment 4 rachel 2002-02-20 17:31:03 UTC
Hello Eric,

Regarding your response, I have two questions:
1. I am not aware that my file contains any unknown format. It 
does work with a slightly smaller file. All the formatting appears to 
follow the OpenOffice XML file format. So it should open neither as 
an empty page nor as an unknown format...
2. I've included all styles in one content.xml file. I think I don't 
need the remaining files such as manifest, meta, settings.... Is it 
required to separate all the XML files and then zip them together?

Best regards,

Rachel
Comment 5 openoffice 2002-02-21 15:56:55 UTC
dvo->rachel, es: We support a single content.xml without any other 
streams, but it's not an 'official' feature. Either way, any GPF is a 
bug, and so is an empty page (provided the document itself is OK). 
Both of those will likely occur with a multi-stream package, too.
Comment 6 openoffice 2002-02-21 17:22:26 UTC
This is indeed a problem with large documents: the item reference 
counter overflows, due to too much 'hard' formatting. 
[details in: SfxPoolItem::AddRef(), svtools/poolitem.hxx, line 347]

Fixing this will probably require a substantial rework of the code 
base, so I don't think we can do that anytime soon.
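
To illustrate the failure mode, here is a toy model of a fixed-width reference counter wrapping around. It is a sketch only: the class `PoolItem` is hypothetical, and the 16-bit width is an assumption made for illustration, since the comment above does not state the actual counter width.

```python
class PoolItem:
    """Toy stand-in for a pooled, reference-counted formatting item."""

    # Assumption: the counter is 16 bits wide. The real width is not
    # stated in this issue.
    MAX_REF = 0xFFFF

    def __init__(self):
        self.ref_count = 0

    def add_ref(self):
        # Mimic fixed-width unsigned arithmetic: one increment past
        # MAX_REF wraps the counter back to 0.
        self.ref_count = (self.ref_count + 1) & self.MAX_REF
        return self.ref_count


if __name__ == "__main__":
    item = PoolItem()
    # Each hard-formatted paragraph that shares the item adds a reference.
    for _ in range(PoolItem.MAX_REF + 1):
        item.add_ref()
    print(item.ref_count)
```

Once the counter wraps, an item that is still in use looks unreferenced, which is the kind of state that can lead to a crash like the GPF reported above.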


There's a fairly easy work-around though, namely to use style sheets. 
If I move all paragraph styles (all those called P<number>) from the 
<office:automatic-styles> into the <office:styles> element, they 
become templates. No 'hard' paragraph formatting -> no refcount 
overflow -> no problem. (Actually, using style sheets is better 
anyway...)
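
The work-around can be scripted. This is a minimal sketch under stated assumptions: the function name `promote_paragraph_styles` is hypothetical, and the namespace URIs are assumed to be those of the OpenOffice.org 1.0 XML format, so check them against the document before relying on it.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URIs of the OpenOffice.org 1.0 XML format.
OFFICE_NS = "http://openoffice.org/2000/office"
STYLE_NS = "http://openoffice.org/2000/style"

# Keep readable prefixes for these two namespaces when serializing.
ET.register_namespace("office", OFFICE_NS)
ET.register_namespace("style", STYLE_NS)


def promote_paragraph_styles(content_xml):
    """Move styles named P<number> from automatic-styles into styles."""
    root = ET.fromstring(content_xml)
    auto = root.find(f"{{{OFFICE_NS}}}automatic-styles")
    if auto is None:
        return content_xml  # nothing to move

    named = root.find(f"{{{OFFICE_NS}}}styles")
    if named is None:
        # Create <office:styles> directly before <office:automatic-styles>.
        named = ET.Element(f"{{{OFFICE_NS}}}styles")
        root.insert(list(root).index(auto), named)

    # Iterate over a copy because the loop removes children from `auto`.
    for style in list(auto):
        name = style.get(f"{{{STYLE_NS}}}name", "")
        if style.tag == f"{{{STYLE_NS}}}style" and re.fullmatch(r"P\d+", name):
            auto.remove(style)
            named.append(style)

    return ET.tostring(root, encoding="unicode")
```

Other namespaces used in the document (text, table and so on) are still serialized correctly, but with generated prefixes unless they are registered in the same way.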

dvo->rachel: Please comment on whether you can load the file with the 
work-around, and whether this is good enough for you.

dvo->rachel: Btw, the document has some syntax errors: All tables 
have the same name, which the format doesn't allow. This doesn't 
cause any bugs or so, though. Just thought you might want to know.
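
Duplicate table names can be found with a short check like the one below. This is a sketch only: the helper name `duplicate_table_names` is hypothetical, and the table namespace URI is assumed to be that of the OpenOffice.org 1.0 XML format.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Assumed namespace URI of the OpenOffice.org 1.0 XML format.
TABLE_NS = "http://openoffice.org/2000/table"


def duplicate_table_names(content_xml):
    """Return the sorted table names that occur more than once."""
    root = ET.fromstring(content_xml)
    names = (
        table.get(f"{{{TABLE_NS}}}name")
        for table in root.iter(f"{{{TABLE_NS}}}table")
    )
    counts = Counter(name for name in names if name is not None)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means every named table in the document has a unique name.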
Comment 7 rachel 2002-02-22 16:14:02 UTC
Hello Daniel,

Thank you so much for helping me out. I'm able to open a big file 
with the work-around now :-)) I also corrected my syntax errors. (I 
thought that was allowed...)
However, it seems to take a very long time to fully load a big file. 
For example, I tried to open a 600-page file. It took 1 minute to 
show up in the window, and another 4 minutes to be fully loaded. I 
say "fully loaded" because when the document first shows a few pages, 
it also shows a much larger total page count (such as 2270, some kind 
of estimate?) which is not correct, while the CPU is working very hard 
(full CPU usage) to load the remaining pages. I can't scroll down to 
the last page until it shows the right total page count, which takes 
those other 4 minutes. Is this normal? How long should I expect it to 
take to open such a big file?

once again, thanks a lot,

Rachel
Comment 8 openoffice 2002-03-01 10:57:36 UTC
dvo->rachel: Two comments:

1) performance: Indeed, performance with (large) tables can be 
improved in Writer. The loading isn't that fast, and I've also heard 
about problems in Writer. Alas, this is hard to fix.

2) 2000 pages: When loading a file, the text document is being 
formatted. The layout initially distributes the contents to the pages 
according to a simple heuristic. In this case, it apparently 
overestimates the number of required pages; that's where the 2000 
pages come from. Then the 'real' formatting of the document starts; 
that's where the high CPU load comes from. In the end, the document 
is properly formatted, and the number of pages should be correct 
again. 
I think you can already start working on the formatted parts of the 
document.
This behaviour is OK (except for performance).

I'm not sure what to do with this issue. I think I'll split it into 
two (one for performance, one for the limit on hard formatting), and mark 
both as 'resolution: later'. This way, the issues won't get lost, and 
we can fix them later on.
Comment 9 openoffice 2003-01-17 17:35:47 UTC
dvo: The limit on 'hard' formatting will apparently get
removed as part of issue 5000. Work is in progress.
Comment 10 openoffice 2003-01-27 13:25:45 UTC
dvo: I'm marking it 'duplicate' because the limit on hard formatting
is being resolved as part of issue i5000. 
Performance has improved over time; I'm not sure if
the original problem still persists.

*** This issue has been marked as a duplicate of 5000 ***
Comment 11 michael.bemmer 2003-03-11 17:21:19 UTC
As mentioned on the qa dev list on March 5th I will close all resolved duplicate
issues. Please see this posting for details. First step in IssueZilla is
unfortunately to set them to verified.
Comment 14 michael.bemmer 2003-03-11 17:42:38 UTC
As mentioned on the qa dev list on March 5th I will close all resolved duplicate
issues. Please see this posting for details.