Issue 2800 - Loading .sxc file takes ~10x longer than .xls file.
Summary: Loading .sxc file takes ~10x longer than .xls file.
Status: CLOSED NOT_AN_OOO_ISSUE
Alias: None
Product: Calc
Classification: Application
Component: code (show other issues)
Version: OOo 1.0.1
Hardware: PC Linux, all
: P3 Trivial (vote)
Target Milestone: ---
Assignee: oc
QA Contact: issues@sc
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-01-10 18:19 UTC by jlapham
Modified: 2013-08-07 15:14 UTC (History)
1 user (show)

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Excel 256x256 block of "if statements" example (2.03 MB, application/octet-stream)
2002-01-10 18:21 UTC, jlapham
no flags Details
Excel 256x256 block of "if statements" example (2.03 MB, application/octet-stream)
2002-01-10 18:22 UTC, jlapham
no flags Details
256x256 block of "if statements" in sxc format (47.75 KB, application/octet-stream)
2002-01-10 18:25 UTC, jlapham
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description jlapham 2002-01-10 18:19:31 UTC
I have a large complex spreadsheet that am I am converting from Excel to OOo
(using v641C under linux).  I can import this document in its original .xls
format into OOo in approx. 30s.  After saving the document in the native .sxc
format, the file now requires >300s to load.

I have tracked the problem down to the many repeated "if" statements on on sheet
of the original document.  Thus, I have been able to make a simple demonstration
file for this problem.  

To reproduce:
Simply create an Excel file with 256x256 block of "if" statements (or use my
oneif_256x256.xls file).  Now, open that document in OOo and save it in .sxc
format (or use my oneif_256x256.sxc file).  Close OOo.  Start OOo fresh, and
open the .sxc format file.  Load time will be approx. 10x that of opening the
.xls file.

Note: this is *not* a Excel conversion issue.  You can also create the document
fresh in OOo and the .sxc file will still require a very long time to open.  I
mention the "10x load time to the .xls file" to point out that this must be an
issue with the reading of the .sxc file format itself, and not some OOo internal
issue.
Comment 1 jlapham 2002-01-10 18:21:32 UTC
Created attachment 919 [details]
Excel 256x256 block of "if statements" example
Comment 2 jlapham 2002-01-10 18:22:51 UTC
Created attachment 920 [details]
Excel 256x256 block of "if statements" example
Comment 3 jlapham 2002-01-10 18:25:33 UTC
Created attachment 921 [details]
256x256 block of "if statements" in sxc format
Comment 4 jlapham 2002-01-10 18:26:42 UTC
oops, send the Excel file twice.  I know the second Excel attachment
is okay, I'm not sure about the first one.
Comment 5 frank 2002-01-11 10:41:17 UTC
Hi Eike,

it seems to be yours.

Loading the sxc by SAB takes about 1:00 compiling the formulas takes
about 1:43 and calculating the cellheights 0:05.

Frank
Comment 6 jlapham 2002-01-21 13:23:57 UTC
gnumeric has a (somewhat) similar problem with file like this, which
has supposedly been solved recently by Jody:
http://bugzilla.gnome.org/show_bug.cgi?id=68723
Comment 7 ooo 2002-01-21 16:52:05 UTC
Gee, this is something that could benefit from "shared formulas",
reusable formula code. Also the "all formulas are listening to one and
the same cell" mentioned in gnumeric could be an issue here in compile
time.
Comment 8 jlapham 2002-05-01 20:04:58 UTC
I've just confirmed that this problem still exists in OOo v1.0.0. 
Load time for the .xls file is ~7s while the .sxc file is ~55s on my
600MHz laptop running RH Linux v7.2.

Comment 9 jlapham 2002-07-16 18:32:20 UTC
i just confirmed this problem still exists in v1.0.1.  Load time of
the .xls file is ~8s and of the .sxc file is ~65s on a 600MHz linux
machine.
Comment 10 ooo 2002-11-21 17:53:17 UTC
First of all: reading in an XML .sxc file is always slower than any
binary file format, no matter if Excel .xls or SO5 .sdc, since we have
to compile and interpret all formulas instead of reading in readily
compiled token arrays and results.

I optimized the process at various places, and the attached example
file on my machine now loads in 18s instead of the 29s it previously
needed. This is far from optimal, but at the moment it couldn't be
much improved without reimplementation of some core parts, if at all.
Note also that the IF() is not the problem, same behavior if you
exchange all IF() with SUM(), for example.

However, I did the changes on the SRX644 build line, no backport to
OOO_STABLE_1 planned, therefor I set the issue to resolved later.
Comment 11 ooo 2002-11-22 12:48:24 UTC
back-reference to iBIS task: #99677#
Comment 12 jlapham 2002-11-22 13:14:49 UTC
Hmmmm, I posted a comment a few minutes ago and it didn't register. 
Weird, I'll try again.

ER-

Thanks, I'll take a 40% improvement in load times anyday!

You mention that XML format files will always be slower than binary
file formats, and that makes perfect sense to me.  But, then I wonder
how gnumeric (which uses a XML file format) is able to load this same
file (once converted to gnumeric format) in 3s?  Maybe there is some
interesting stuff in the changes Jody made which would be of interest
of OOo?

http://bugzilla.gnome.org/show_bug.cgi?id=68723

Regardless, thanks again Eike.
-Jon
Comment 13 ooo 2002-11-25 11:17:48 UTC
Without having investigated it any further I'd say that the file
converted to Gnumeric format uses shared formula expressions, where a
range of identical formulas is represented by one formula expression
referenced by all subsequent cells of that range, so parsing has to be
done only once. This is something not implemented yet in OOo Calc.
Comment 14 jlapham 2002-11-25 12:06:40 UTC
Okay, "shared formula expressions" sound like a good idea.  Should we
keep this bug open to track that new feature?

My original document (the one that takes 300s to open) is one of many,
many spreadsheets the financial people at my company use.  Even with
this improvement in load time you have done, I'm sure they will not
switch to OOo if i takes them (300s minus ~40%) to load their
documents.  My point here is that we are not talking about waiting 18s
in these real-world cases, but rather, many minutes.

PS: Just to give you an idea:  These spreadsheet they use have 365
columns, one for each day of the year, with a row for each item they
are tracking, sometimes there are thousands of items.  Each cell has
many "if" statements, sometimes in ugly nested formats.  Ugly?  Yes. 
But, these are financial people, there is no talking to them about
elegant spreadsheet design.  :)

Comment 15 ooo 2002-12-09 15:53:55 UTC
Have to reopen to reassign, rediculous.
Comment 16 ooo 2002-12-09 15:55:24 UTC
reassign to QA
Comment 17 ooo 2002-12-09 15:56:19 UTC
restore status
Comment 18 jlapham 2003-06-15 22:18:12 UTC
For what it is worth, the .sxc file loads ~25% faster in OOo 1.1b2 as
compared to OOo 1.02.  Great job!

It still takes 35s on my home machine to load the .sxc, versus 9s to
load the .xls file... but now these numbers are within the limit of
being tolerable.
Comment 19 oc 2004-06-04 11:42:14 UTC
closed because remaining problem depends on implementation of shared formulars
which is still planned for the near future. (see also PCD issue 20489)