Issue 6991 - Application error core dump importing Word doc
Summary: Application error core dump importing Word doc
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: code
Version: OOo 1.1 Beta
Hardware: Sun SunOS
Importance: P2 Trivial
Target Milestone: ---
Assignee: michael.ruess
QA Contact: issues@sw
URL: http://www.inrs-eau.uquebec.ca/activi...
Keywords: oooqa
Depends on:
Blocks:
 
Reported: 2002-08-14 18:05 UTC by whitegn3
Modified: 2013-08-07 14:41 UTC
CC List: 4 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
stack trace from core file (10.31 KB, text/plain)
2002-08-30 11:43 UTC, whitegn3
truss -t mmap,brk -p NNN for swriter opening english.doc (21.20 KB, text/plain)
2002-08-30 17:23 UTC, whitegn3
File from the mentioned URL. (227.50 KB, application/msword)
2003-06-19 11:33 UTC, michael.ruess
objects only example (149.50 KB, application/octet-stream)
2003-06-20 11:52 UTC, caolanm

Description whitegn3 2002-08-14 18:05:20 UTC
The problem files also produced a core dump in ver. 1.0.0.  They are the
documentation for a widely used Matlab package, in the file Kriging51.zip 
(english.doc or francais.doc).

Sometimes the program displays a popup "out of memory" message just before it
crashes. The files use lots of maths. I have not had problems with any other
Word documents, but none of them have used maths.
Comment 1 whitegn3 2002-08-15 00:36:56 UTC
Both files open in OO-1.0.1 running on Intel linux, with some font issues:

1.  some text that appears to be Helvetica/Arial Bold-Italic displays
much larger than it should, and the bold attribute is off.  Resetting
the font to Arial or Helvetica and attributes B+I and size (10pt)
results in something close to correct.

2.  as expected, many math symbols are missing
Comment 2 jkeil 2002-08-28 15:22:28 UTC
I can't reproduce the crash here.  I've tried to open english.doc and
francais.doc from the Kriging51.zip available from the above URL.  I've
tried

- OOo 1.0.1 Solaris x86
- OOo 1.0.1 Solaris SPARC
- SO 6.0 PP1 x86
- SO 6.0 PP1 SPARC

(All running on Solaris 8 with MU7 installed)

All four versions could open the two MS .DOC files just fine.  So there
probably is some other condition in your environment that triggers the
OOo crash for you. (Things like installed patches, installed custom
fonts, available free virtual memory, available temporary disk space,
...).



Since you mentioned an "out of memory" popup:  did you make sure that
your system has enough free swap space for running a huge application
like OOo?  (you may have to use "mkfile(1M)" and "swap(1M)" to 
increase swap)

Comment 3 whitegn3 2002-08-28 18:27:10 UTC
The system should have plenty of swap space; it is a large system used
to run Matlab and numerical calculations.  Patches or fonts are the
likely problem. Did you observe the font problems I reported for these
docs using Linux x86? I should also note that I'm using an SGI Indigo2
for the X-server, but configured to get fonts from Sun's fs.  We won't
be installing any patches that aren't required for Matlab.   

The best approach may be to track down the font issues with the Linux
version -- understanding that might suggest a solution, or at least a
way to get more informative diagnostics on the Sun.
Comment 4 jkeil 2002-08-28 19:42:57 UTC
> Did you observe the font problems I reported for these docs using 
> Linux x86?

The documents didn't look 100% correct,  for example there were some
"plane" icons appearing in some of the formulas.  This could be some
glyph mapping problem for one of the symbol fonts.  (I think I saw a
similar symbol font glyph mapping problem in another OOo issue.)

But it didn't crash.

> I should also note that I'm using an SGI Indigo2
> for the X-server, but configured to get fonts from Sun's fs.

Aha, this could be important.

Does your Sun system include a frame buffer, that is, is it possible
for you to re-test this document displaying to the local X11 server
on the Sun?  Does that work?
Comment 5 jkeil 2002-08-29 18:30:43 UTC
I made a few more experiments with OOo running on solaris and displaying
to remote X11 servers using the two documents, but still no crashes so
far.

Since you mention a "core dump" in the summary and the problem
description,  can you please try to extract a stack backtrace for the
crash using for example the command "pstack core" and attach it here?
Comment 6 whitegn3 2002-08-30 11:43:07 UTC
Created attachment 2635
stack trace from core file
Comment 7 whitegn3 2002-08-30 12:00:32 UTC
I have requested that the SUNWxwsrv package be installed so I can use
Xvfb, but meanwhile I have attached a stack trace (note that Solaris 7
pstack doesn't do traces from core files, so I used dbx).

I tried the Cygwin XFree86 Xserver and also got a core dump, so the
problem looks like something with our Sun configuration.
Comment 8 jkeil 2002-08-30 16:40:47 UTC
Similar problems have been reported under issue 3360 and issue 4233.

But I'm not sure if this is a duplicate, because I was able to reproduce
the crashes from 3360 and 4233 on solaris 8 sparc (including the faulty
"operator new" calls in the stack backtrace, requesting huge amounts of
memory).  But your file refuses to crash my OOo.


Your backtrace includes an "operator new[]" call, but it's just
trying to allocate a harmless amount of 0xff00 bytes (<64KByte)
(if we can believe the argument values shown in the stack backtrace).


It looks like this allocation for 64KBytes fails.  

You said your system has plenty of swap space:  I hope it's not all
in use by other applications running on that system  :-)
"swap -l"  lists enough free swap, right?


Can you please give us some details about the Solaris system you are
running?  (model, cpu type, amount of memory, amount of swap, ...)

Are there any special limit settings in effect for Matlab
or your numerical computation processes,  like a non-standard
stack size limit?  (ulimit -a (sh) or limit (csh))



If you start openoffice,

then run " truss -t mmap,brk -p {pid-of-soffice.bin} "
in a terminal window

and then open one of the problematic word .doc files,  do you see
failed mmap or brk system calls?


Comment 9 whitegn3 2002-08-30 17:23:03 UTC
Created attachment 2640
truss -t mmap,brk -p NNN for swriter opening english.doc
Comment 10 whitegn3 2002-08-30 17:36:28 UTC
The system is an Ultra Enterprise 3000 with 2 processors, and is not
running any big jobs this week (end of August -- sensible people are
on holiday, and the summer students have gone).

$ uname -i
SUNW,Ultra-Enterprise

$ cat /etc/release                            
                    Solaris 7 5/99 s998s_u2SunServer_09 SPARC
           Copyright 1999 Sun Microsystems, Inc.  All Rights Reserved.
                             Assembled 19 April 1999
                     Solaris 7 Maintenance Update 4 applied
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192
coredump(blocks)     unlimited
nofiles(descriptors) 64
vmemory(kbytes)      unlimited
$ swap -l
swapfile             dev  swaplo blocks   free
/dev/dsk/c0t0d0s1   32,121     16 524864  30080
/sys1/SWAP1           -       16 1023984 428928
/sys2/SWAP2           -       16 1023984 436112

I tried "ulimit -s 16000" (the maximum allowed) and
still get the core dump loading english.doc. 
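
For reference, the free column of the "swap -l" output above can be totalled and compared with the 0xff00-byte request seen in the stack trace. This is a minimal sketch; the only assumption is that "swap -l" reports sizes in 512-byte blocks, as the Solaris swap(1M) tool does. The names FREE_BLOCKS and free_swap_bytes are illustrative.

```python
# Free column of the `swap -l` output above, per swap area, in blocks.
FREE_BLOCKS = {
    "/dev/dsk/c0t0d0s1": 30080,
    "/sys1/SWAP1": 428928,
    "/sys2/SWAP2": 436112,
}
BLOCK_SIZE = 512  # assumption: swap -l counts 512-byte blocks


def free_swap_bytes(free_blocks, block_size=BLOCK_SIZE):
    """Total free swap in bytes across all listed swap areas."""
    return sum(free_blocks.values()) * block_size


total = free_swap_bytes(FREE_BLOCKS)
request = 0xFF00  # size of the failing "operator new[]" call in the backtrace

print(f"free swap:       {total} bytes ({total / 2**20:.1f} MiB)")
print(f"failing request: {request} bytes")
```

Roughly 437 MiB of swap is free, so a request of under 64 KBytes should not fail for lack of swap alone.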
Comment 11 eric.savary 2003-04-13 01:04:27 UTC
You have a lot of OLE objects in your docs.
Try tuning the memory options in OOo a little under
"Tools - Options - OpenOffice - Memory".
Comment 12 dankegel 2003-05-15 07:25:31 UTC
Since the similar issue 3360 has been fixed, 
the original poster should try to reproduce this bug
with ooo1.1beta.  If nothing else, he'll see that
the formulae look much better :-)

I'd be happy with resolving the bug WORKSFORME, but don't want
to step on anyone's toes...
Comment 13 whitegn3 2003-05-15 13:25:21 UTC
Using OO1.1beta I reliably get the popup: "Main memory shortage. Please
quit other applications...", followed immediately by a core dump.  I 
tried changing the memory settings without success. If I set
"ulimit -s unlimited" OO1.1beta crashes without the popup.

I have opened the file with 1.0.3 and/or 1.1beta on other platforms.
There are still many problems with the fonts.
Comment 14 dankegel 2003-05-19 04:58:23 UTC
Aah, now we're getting somewhere.  I forgot that on Solaris,
the default ulimits are rather restrictive.

Can you crank up ulimit again so you don't get the dialog
box, run OpenOffice1.1beta under the debugger (dbx is ok,
but I wonder if gdb might not demangle the c++ identifiers better),
and get a stack trace of the crash?

I've updated the "version" field to reflect the fact that you've
seen the problem with 1.1 beta.

Thanks so much!
Comment 15 whitegn3 2003-05-23 14:26:11 UTC
I tried to get a stack trace.  We don't have gdb on the Sun, so I used
dbx, which took 3 days to load and then dumped core:

$ ulimit -s unlimited
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 64
vmemory(kbytes)      unlimited


Reading soffice.bin
Reading ld.so.1

[... days pass ...]

Reading libucppkg1.so
t@1 (l@8) signal SEGV (no mapping at the fault address) in
__align_cpy_1 at 0x7e7b06d0
0x7e7b06d0: __align_cpy_1+0x0030:       stb     %o4, [%o0 - 0x1]
(/opt/SUNWspro/bin/../WS5.0/bin/sparcv9/dbx) 
dbx: internal error: signal SIGSEGV (no mapping at the fault address)
dbx's coredump will appear in /tmp
$ ls -l /tmp/core
2337824 -rw-------   1 gwhite   bod      1196965888 May 23 08:18 /tmp/core
$ file /tmp/core 
/tmp/core:      ELF 64-bit MSB core file SPARCV9 Version 1, from 'dbx'
Comment 16 Joost Andrae 2003-05-28 11:51:06 UTC
JA->CP: please have a look at this. Do you have an idea?
Comment 17 christof.pintaske 2003-05-28 12:14:47 UTC
I guess with a hard limit of 64 file descriptors you don't get far
with complex documents. 
Comment 18 jkeil 2003-05-28 13:08:16 UTC
Yes, the file descriptor limit of 64 could be the problem.

Note the system call trace attachment for this issue:  the
file descriptor which is used to allocate anonymous memory from
swap space slowly increases and the last couple of mmaps used fd 63,
before the application crashed.


As a test I've reduced my descriptor limit to 64 and opened
english.doc. OOo 1.0.3 is using 50 out of the 64 descriptors.  It
does not yet crash, though.

When I try to open the second file francais.doc, OOo 1.0.3 crashes.
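
The effect described here can be sketched outside OOo. The example below is a rough model, not the OOo allocator: it assumes memory is obtained by opening /dev/zero and mapping it (which is what the fd numbers in the truss attachment suggest), and the function names map_from_dev_zero and allocate_with_leak are made up for illustration. Once leaked descriptors fill the table, a small "allocation" fails with EMFILE even though memory is plentiful.

```python
import errno
import mmap
import os
import resource


def map_from_dev_zero(length=0xFF00):
    """Get zero-filled memory by mapping /dev/zero; this needs a free descriptor."""
    fd = os.open("/dev/zero", os.O_RDWR)
    try:
        return mmap.mmap(fd, length, mmap.MAP_PRIVATE)
    finally:
        os.close(fd)


def allocate_with_leak(limit=64):
    """Leak descriptors under a lowered soft limit, then try to allocate.

    Returns the errno of the failed allocation, or None if it succeeded.
    The original soft limit is restored before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    leaked = []
    try:
        try:
            while True:  # stand-in for descriptors that are never closed
                leaked.append(os.open("/dev/zero", os.O_RDONLY))
        except OSError:
            pass  # descriptor table is full
        try:
            map_from_dev_zero().close()
            return None
        except OSError as exc:
            return exc.errno
    finally:
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    err = allocate_with_leak(64)
    print("allocation failed with", errno.errorcode.get(err, err))
```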
Comment 19 whitegn3 2003-05-28 13:10:59 UTC
I set "ulimit -n 512" and "ulimit -s unlimited" and was able to open
the document and export to PDF.  On superficial examination the 
maths looks right!
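
The same workaround can be applied from a launcher script instead of the shell. A minimal sketch, assuming a wrapper that raises the soft limit and then starts the office binary as a child process (children inherit the limit); raise_nofile_limit is a hypothetical helper, and 512 is simply the value used above.

```python
import resource


def raise_nofile_limit(wanted=512):
    """Raise the soft descriptor limit to `wanted`, capped at the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough, leave it alone
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # an unprivileged process cannot exceed hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return wanted


if __name__ == "__main__":
    print("nofiles soft limit now:", raise_nofile_limit(512))
```

A wrapper would call raise_nofile_limit() first and then launch the application, for example with subprocess.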
Comment 20 whitegn3 2003-05-28 13:33:49 UTC
francais.doc opens with 1.1beta and the same ulimit settings I used
to open english.doc

There is a problem with the font used for labels in the figure on the
"filresp" page, in both versions of the document.

I can't install 1.1beta2, so I opened a new issue with setup.  

Comment 21 whitegn3 2003-05-28 13:51:36 UTC
Both english.doc and francais.doc open with 1.0.3 on SGI, but
the figures are missing and many of the math symbols are also
missing or incorrect.  On Sun, the files open in 1.0.1, with
missing figures and maths symbols.
Comment 22 dankegel 2003-05-28 16:20:46 UTC
I was able to reproduce this crash easily on Linux
by running
  ulimit -n 64
before opening the document.

This kind of smacks of a file descriptor leak.  Why
don't those file descriptors get closed?
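
One way to confirm a leak like this is to count open descriptors before and after an operation. The sketch below is generic and not tied to OOo: it assumes a system that exposes the process's descriptors under /dev/fd (Linux and Solaris do), and read_leaky / read_clean are illustrative stand-ins for code that forgets or remembers to close its files.

```python
import os


def open_fd_count():
    """Number of descriptors currently open in this process."""
    # Listing /dev/fd briefly opens one descriptor itself; that offset is
    # the same on every call, so differences between calls are exact.
    return len(os.listdir("/dev/fd"))


def read_leaky(paths):
    """Open each path and never close it: one descriptor leaked per path."""
    return [os.open(p, os.O_RDONLY) for p in paths]


def read_clean(paths):
    """Open each path, read a byte, and always close the descriptor."""
    for p in paths:
        fd = os.open(p, os.O_RDONLY)
        try:
            os.read(fd, 1)
        finally:
            os.close(fd)


if __name__ == "__main__":
    before = open_fd_count()
    leaked = read_leaky(["/dev/null"] * 10)
    print("leaked descriptors:", open_fd_count() - before)
    for fd in leaked:
        os.close(fd)
```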

If this bug doesn't get fixed for 1.1, the 1.1
release notes should at least have a description
of the problem and its workaround.
Comment 23 stefan.baltzer 2003-06-13 15:38:11 UTC
Reassigned to Michael.
Comment 24 michael.ruess 2003-06-19 11:32:45 UTC
MRU->CMC: I can also reproduce a crash on Windows with srx645m6.
For this, the MathType conversion has to be enabled.
Comment 25 michael.ruess 2003-06-19 11:33:53 UTC
Created attachment 6994
File from the mentioned URL.
Comment 26 caolanm 2003-06-20 11:52:23 UTC
Created attachment 7009
objects only example
Comment 27 caolanm 2003-06-20 15:32:00 UTC
When the number of objects exceeds the cache size, they get swapped
out, but the swap-out fails because the reference count on the object
is not 1. Someone still has a handle to the object; this must be
somewhere in the MathType-to-StarMath conversion code, because an
object that was not converted has a reference count of 1. So
SvxMSDffManager::CheckForConvertToSOObj in svx/msfilter/msdffimp.cxx
is my leading contender at the moment; I will build that subproject with
debugging and explore this.
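
The swap-out rule described in this comment can be illustrated with a toy model. This is not the OOo code: CachedObject and ObjectCache are hypothetical classes, and the only behaviour modelled is that an object may be unloaded only while its reference count is exactly 1, so a stray extra handle keeps every object loaded.

```python
class CachedObject:
    """An embedded object as the cache sees it (hypothetical model)."""

    def __init__(self, name, extra_handles=0):
        self.name = name
        self.refcount = 1 + extra_handles  # 1 = held by the cache only
        self.loaded = True


class ObjectCache:
    """Keeps at most `max_loaded` objects loaded, unloading the oldest first."""

    def __init__(self, max_loaded):
        self.max_loaded = max_loaded
        self.objects = []

    def add(self, obj):
        self.objects.append(obj)
        self._swap_out_excess()

    def _swap_out_excess(self):
        loaded = [o for o in self.objects if o.loaded]
        excess = len(loaded) - self.max_loaded
        for obj in loaded:  # oldest first
            if excess <= 0:
                break
            if obj.refcount == 1:  # nobody else holds it: safe to unload
                obj.loaded = False
                excess -= 1
            # otherwise the swap-out is refused and the object stays loaded

    def loaded_count(self):
        return sum(1 for o in self.objects if o.loaded)


if __name__ == "__main__":
    ok, stuck = ObjectCache(2), ObjectCache(2)
    for i in range(5):
        ok.add(CachedObject(f"obj{i}"))
        stuck.add(CachedObject(f"obj{i}", extra_handles=1))
    print(ok.loaded_count(), stuck.loaded_count())
```

With clean reference counts the cache stays at its size limit; with one extra handle per object nothing can be swapped out, and every loaded object keeps its resources.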
Comment 28 dankegel 2003-06-20 16:59:40 UTC
Hey, sounds like it might get fixed before 2.0!  Good luck!
Comment 29 caolanm 2003-06-23 17:16:45 UTC
Starmath objects seem to always have a reference count of 2 :-(,
so that was something of a red herring.
Comment 30 caolanm 2003-07-08 17:31:51 UTC
*** Issue 16306 has been marked as a duplicate of this issue. ***
Comment 31 caolanm 2003-07-09 13:19:16 UTC
cmc->cmc: Ancient bugtrack id #82897# "Wrong RefCount for unload" bug
seems suspiciously similar.
Comment 32 caolanm 2003-07-14 17:43:27 UTC
Hmm, issue 16015 shows a similar problem. With some modifications my
crash under Windows goes away, but I run out of file descriptors under
Solaris very quickly and still crash.
Comment 33 dankegel 2003-07-16 16:05:06 UTC
Issue 16015 is fixed now -- does that mean this issue is fixed, too?
Comment 34 caolanm 2003-07-16 17:59:37 UTC
The fix for issue 16015 may fix this for the original reporter under
Solaris, because the unix stacktraces and symptoms are those of the
CopyTo leak problem.

But there is a remaining general ww8 import problem with large amounts
of objects and refcounting/unloading and the like. I have a solution
which I have just checked into a 2.0 workspace that I think will work,
but it's somewhat extensive and given that this code is basically
untouched for years I'd be very unhappy with the idea of making this a
1.1 bug and throwing it in at the last minute.
Comment 35 caolanm 2003-08-15 16:59:41 UTC
reopen to reassign
Comment 36 caolanm 2003-08-15 17:00:27 UTC
cmc->mru: Working in limerickfilterteam08, may also be working in 1.1
seeing as core parts for a similar problem were backported to that
release.
Comment 37 michael.ruess 2003-08-21 16:18:22 UTC
Checked fix with internal CWS filterteam08.
Comment 38 michael.ruess 2003-08-21 16:18:59 UTC
This will work at least with OO 2.0
Comment 39 michael.ruess 2003-11-27 09:26:33 UTC
Fix o.k. in OO 2.0 snapshot src680m13.