Apache OpenOffice (AOO) Bugzilla – Issue 76606
Performance problem - "pasting" 65000x20 table from Calc to Base takes hours on 2.4GHz Pentium4
Last modified: 2009-07-20 20:57:39 UTC
Subject says it all. Repro steps:
1. Select data in Calc, Edit - Copy.
2. Start Base, Create new db, click Tables, right-click - Paste.
3. In Copy table wizard click "Definition and data", "Create primary key", Next, add all columns to the right, Next, Create.
I can supply an .ods file (2.7MB, confidential), which contains the data to reproduce the problem. It takes 3 hours on a 2.4GHz Pentium4 (1GB of RAM) to import all of the data into the database. What makes the problem even worse is that Office is unusable for any other task during the import and the machine becomes sluggish (apparently some thread is running at high priority).
P.S. Import is completely CPU-bound - I see 100% CPU utilization during import.
Tested on test PC p1800 (512MB RAM) => two crashes (one of them after more than an hour).
clu->fs: what is an appropriate time for 65000 datasets?
You might want to separate two things here - the clipboard and importing to Base. I did the following:
1. Created a Calc file with 65,515 rows of 21 columns. Mixed data: decimal, date, strings.
2. Created a Base file that connected to this Calc file.
3. Created a new embedded database Base file.
4. Dragged the table Sheet1 from the first Base file to the second Base file.
The import is not speedy but it finished in just under 18 minutes. This is on an HP 810n ( AMD 3300, 640 MB RAM ), WinXP SP2 and a good mix of other applications running at the time: Firefox ( 4 windows, 5 tabs in one - gmail, Issue tracker ( 2 tabs ), OOoForum, local file ), 2 cmd windows, HSQLdb running as server, MySQL 5 running, iTunes playing music from an internet feed. Total of 11 open documents ( 4 Base files, 2 forms, 5 query data views ) in OOo. The PC was sluggish but I was still able to work in Firefox, entering 2 posts at the forum without too much discomfort at all...and the music never stopped playing :>)
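For anyone who wants to repeat this test without the confidential .ods file, a comparable data set can be generated. The following is a minimal Python sketch, assuming a repeating decimal/date/string column pattern. The function name write_test_csv, the column names, the value ranges and the string length are all made up for illustration and do not reflect the original data.

```python
import csv
import datetime
import random


def write_test_csv(path, rows=65515, cols=21, seed=0):
    """Write a CSV with a header row and `rows` data rows.

    Columns cycle through decimal, date and string values
    (an assumed mix; the real data layout is not known).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    start = datetime.date(2000, 1, 1)
    letters = "abcdefghijklmnopqrstuvwxyz"
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"col{i + 1}" for i in range(cols)])
        for _ in range(rows):
            row = []
            for c in range(cols):
                kind = c % 3
                if kind == 0:
                    # decimal with two fractional digits
                    row.append(f"{rng.uniform(0, 10000):.2f}")
                elif kind == 1:
                    # ISO date within roughly ten years of the start date
                    day = start + datetime.timedelta(days=rng.randrange(3650))
                    row.append(day.isoformat())
                else:
                    # 8-character lowercase string
                    row.append("".join(rng.choice(letters) for _ in range(8)))
            writer.writerow(row)
    return rows
```

Calling write_test_csv("test.csv") should produce a file that can be opened in Calc and then copied or dragged into Base as described above. Timings from such synthetic data are not directly comparable to the numbers reported in this issue.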
This deserves an update. I performed the same steps tonight as I did the other night, bringing the Calc data into an embedded Base database. My machine was running just about the same mix as the other night also, but using 2.3m_210. Time to import data dropped to 10 min 2 seconds. Another BIG difference - system responsiveness. The CPU is still pegged during the transfer BUT OOo is offering up clock cycles much, much more often now. I could work with the other applications almost as if the process was not screaming along. Someone put the word 'Cooperative' into Base on Windows it seems :>) It also appeared that total memory used by the soffice.bin process was less than on 2.2, but I will need to run a few more tests before I can say for sure.
Created attachment 44723 [details] Data as csv file
I thought others might want to check out the performance on their machines, so I saved the spreadsheet to a csv file and zipped it for attachment here.
I have tried m210 and it is roughly the same. I have discovered that if I do not enable primary key creation then import finishes almost 4 times faster:
2.2 no PK - 1m40s
2.2 with PK - 6m
(competitive analysis - Office 2003 takes 45s)
atjensen, please make sure that you have timed the copy with PK creation.
Yes, all my test runs had included creating a primary key field. Looking at your times does seem to show that removing the system clipboard from the equation makes quite a difference. But it is hardly the only factor to be considered.
Here is another test run - on the same machine, but a new software configuration. Two differences only: first, I changed OOo to use JRE 1.5.0_11 instead of 1.6.0 as the other night; second, in getting ready for some contract work I shut down the MySQL server and fired up an Oracle 10g server on this machine. I performed the same record import test as before using 2.3 M_210, and with no primary key being added. For this run my time was horrendous: 54 Minutes 20 Seconds.
OK - how many people are really going to have an Oracle server running on their workstation? Not many. So we can probably just toss this number out, but to be sure:
Next I left Oracle running and changed OOo to use JRE 1.6 again. Started the transfer again. ( Just to be clear, each time the HSQL embedded database is empty. ) I killed the process after 20 minutes.
So I shut down every Oracle service ( server, recovery, agent, listener ). I left OOo using JRE 1.6 and I performed the transfer one more time, again no primary key. Run time: 4 minutes 06 seconds.
Finally, for the last test I changed OOo back to use JRE 1.5.0_11. Run time: 5 Minutes 28 Seconds.
All the test runs were with 2.3m_210.
I have done some more testing using http://www.openoffice.org/nonav/issues/showattachment.cgi/44736/Praha4.zip (make sure you set text delimiter empty, otherwise file would not import correctly) and version m210:
1. Using different version of JRE (1.5.11 vs 1.6.0) did not make a second of difference on WinXP.
2. Pasting that data on different OSes:
Suse 10.2 - 3 minutes
WinXP - 5m 50s
Vista - 35 minutes (yes, 35 minutes is correct).
Time spent in kernel (as seen in Task Manager) on Windows is around 65% on XP or around 90% on Vista.
Another competitive analysis: using the data in the Praha 4.csv file Kpalagin posted to populate a Calc sheet in OOo and then doing a copy/paste of the 13,000 rows into Kexi 2007 for Windows > 9 SECONDS. That is with adding a primary key field - HOWEVER, the date fields are converted to text. Importing into Kexi directly from the CSV file took longer - 11 seconds.
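To put the Praha 4 timings on a common scale, they can be converted to rows per second. This is a rough Python sketch that takes the row count as the approximate 13,000 quoted above and reads the Kexi paste time as about 9 seconds. The runs were done on different machines, so the figures are only indicative.

```python
# Approximate row count quoted for the Praha 4 data set.
ROWS = 13_000

# Wall-clock times reported in this issue, converted to seconds.
timings_seconds = {
    "Suse 10.2 (OOo m210 paste)": 3 * 60,
    "WinXP (OOo m210 paste)": 5 * 60 + 50,
    "Vista (OOo m210 paste)": 35 * 60,
    "Kexi 2007 (paste, read as about 9 s)": 9,
    "Kexi 2007 (direct CSV import)": 11,
}


def rows_per_second(rows, seconds):
    """Average throughput for an import of `rows` rows."""
    return rows / seconds


for label, seconds in timings_seconds.items():
    print(f"{label}: {rows_per_second(ROWS, seconds):.0f} rows/s")
```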
fs->oj: One for your performance list ...
Fixed in cws dbaperf1. Have a look at the wiki page http://wiki.services.openoffice.org/wiki/Base/Performance#Row_Fetching
Please verify. Thanks.
Verified in CWS dbaperf. Find more information about this CWS, such as when it is available in the master builds, in EIS, the Environment Information System: http://eis.services.openoffice.org/EIS2/cws.ShowCWS?Path=DEV300%2Fdbaperf
This issue was closed automatically and wasn't rechecked in a current version of OOo. The fix should have been integrated into OOo more than half a year ago. If you think this issue isn't fixed in a current version (OOo 3.1), please reopen it and change the field 'Target Milestone' accordingly. If you want to download a current version of OOo => http://download.openoffice.org/index.html If you want to know more about the handling of fixed/verified issues => http://wiki.services.openoffice.org/wiki/Handle_fixed_verified_issues
Sorry, this issue was wrongly closed. It will be reopened automatically and then set back to fixed/verified.
Set to state 'fixed'.
Set back to state 'verified/fixed'. Again. Sorry for the mass of mails.
Using m52 on P4 2.12GHz - jan29_2006-feb4_2006cf.zip gets copied a hundred times faster. Thanks a lot! Closing.