Apache OpenOffice (AOO) Bugzilla – Issue 40827
MailMerge: Performance of creating individual documents is very slow
Last modified: 2013-08-07 14:44:00 UTC
Performance of creating individual documents is far too slow (2 minutes 5 seconds for 60 short documents, tested on WinXP; P4; 1.8 GHz; 512 MB RAM).
reassigned to HI.
HI->OS: We already discussed this the first time around, and there was a performance issue which I sent to you and TRA by mail.
Mailmerge uses one save and 60 load operations in this situation. We're talking about 2 seconds per document. Besides some performance issues on load/save (see issue 20155) there's no way to improve the speed for OOo 2.0. Target changed to OOo later
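As a rough check of the "2 seconds per document" figure, the reported total can simply be divided by the number of documents. This is only arithmetic on the numbers from the original report; the function name is made up for illustration.

```python
def seconds_per_document(total_seconds: float, documents: int) -> float:
    """Average wall-clock time spent on each merged document."""
    return total_seconds / documents


# Figures from the original report: 2 minutes 5 seconds for 60 documents.
reported_total = 2 * 60 + 5
per_document = seconds_per_document(reported_total, 60)
print(f"{per_document:.2f} s per document")
```

The average comes out slightly above 2 seconds, which matches the estimate of one load operation per generated document.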
HI->CJ: FYI.
cj->hi: I can't fix this problem, please send this task back to the developer(s).
HI->CJ: Why do you need me to pass it back to development?
Hi, Any progress on this issue? A real-world test with 1.9.122 (a full A4 with some 20 fields) takes about 30 seconds a page to generate 186 pages (P4M/2.2GHZ/1GBRAM). And the database is not on the same PC as OOo2. Would really like it to be better, because the rest of OOo2 is really nice. Any info? Regards Wim
This is still true with OOo 2.0.3. This is a show stopper for many big companies wishing to switch from MSO to StarOffice and OOo! It is a pity that they do not know OOo and IssueZilla well enough to vote, but many people think this slowness is a real problem that should be solved as soon as possible.
A big problem for my company is that mail merge is not only slow, but that it becomes slower and slower while merging documents. For example, a simple document begins the merge at a rate of 250/minute. After 10,000 merges the rate has dropped to 100/minute. (By the way, we use a simple macro for the mail merge, not the standard wizard)
Our users are also complaining that mail merge is much slower with OOo 2.2 than before we switched from Word 97 (yes, we were still using MS Office 97 before migrating to OOo). We typically create form letters and labels with just the address changed for about 5000 addresses. Would be nice to see the target changed from "OOo Later" to something more specific.
This issue has now been open for more than 3 years. And the target is still "OOo Later"... Is it planned to fix this performance problem in OOo 3.0? I hope so.
More information: with 1300 records MSO takes only 1min30 and OOo takes 1h30 to do the mail merge processing. Is there any plan to fix this performance issue? Thank you.
I reposted this bug for DEV300_m3 [1] with some more documentation, but it seems to me that this problem will be a problem for another 1-100 years. [1] http://www.openoffice.org/issues/show_bug.cgi?id=87104
In more detail: it occurs when OOo creates a new page style for each record. Indeed, you end up with 1300 page styles! So you have to use a page style without any header or footer; then OOo will use the same page style for each record. With the same page style it is OK (about 4 min 30 s for 1300 records). Why does OOo create a new page style for each record (for example Default1, Default2 ...)?
->aladdin2k7: The page styles are created to make sure that merged data that might influence the header/footer content is correct for each document part.
I forgot about fields :) So, there will be a performance issue if you have to enable header/footer with some data.
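The page style behaviour discussed in the last three comments can be pictured with a toy model. This is not the Writer implementation; the function name and the simple "header or footer present" switch are assumptions made purely to illustrate why the style count grows with the record count.

```python
def page_styles_for_merge(record_count: int, has_header_footer: bool,
                          base: str = "Default") -> list[str]:
    """Toy model: names of the page styles left in the merged document.

    Assumption for illustration only: a header or footer may contain merged
    data, so each record gets its own numbered copy of the page style.
    Without header or footer, one shared style is enough.
    """
    if has_header_footer:
        return [f"{base}{i}" for i in range(1, record_count + 1)]
    return [base]


with_header = page_styles_for_merge(1300, has_header_footer=True)
without_header = page_styles_for_merge(1300, has_header_footer=False)
print(len(with_header), with_header[:2], without_header)
```

In this model a 1300 record merge with a header ends up with 1300 styles (Default1, Default2, ...), while the same merge without header or footer keeps a single style.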
Target changed to 3.x
*** Issue 54531 has been marked as a duplicate of this issue. ***
Some discussion about this issue : http://sw.openoffice.org/servlets/BrowseList?list=dev&by=thread&from=2057086
I ran into this issue some days ago. With OOo 2.4.1 (ubuntu) it took about 10 hours to export a 4 page document into individual PDFs for the first 650 records. Then the mail merge wizard became slower and slower, formatting some thousand pages for every single export. Then I used the File/Print option. This way I got all 1100 exported documents within about 5 minutes (which is reasonable).
Created attachment 56994 [details] Improves speed of MailMerge operation : it's around 7 times faster.
Hi *, I've made a small patch to dramatically reduce the time needed to perform a MailMerge. I profiled it with callgrind and found that MergeDocuments was waiting forever on a system method. After looking into it, I realised that the master document was reopened with an "exec" style call for every single document generated. That is to say, a MailMerge action for 1000 documents was like opening OOo 1000 times on the master document. I did some tests and found a tricky way to work around this. If I restore the master document with "Undo()" calls, the MailMerge operation runs as well as the original one and produces the same result, but is seven times faster. Our tests were on a master document without header & footer, but the performance gain should be the same for any kind of document. -> os : Would you please review the patch & tell us if it can be integrated into mainline ? Thanks,
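The idea behind the patch (modify the master document, take the result, then undo back to the original instead of reopening it) can be sketched with a toy document class. Everything here is hypothetical: ToyDocument, its undo stack and the <name> placeholder syntax are stand-ins for illustration, not the Writer classes touched by the attachment.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDocument:
    """Hypothetical master document that records every change for undo."""
    text: str
    undo_stack: list = field(default_factory=list)

    def replace(self, old: str, new: str) -> None:
        self.undo_stack.append(self.text)  # remember the state before the change
        self.text = self.text.replace(old, new)

    def undo_all(self) -> None:
        # Pop states in reverse order; the last one popped is the original text.
        while self.undo_stack:
            self.text = self.undo_stack.pop()


def merge_with_undo(master: ToyDocument, records: list[dict]) -> list[str]:
    """Fill the master once per record, then restore it via undo (no reload)."""
    outputs = []
    for record in records:
        for name, value in record.items():
            master.replace(f"<{name}>", value)
        outputs.append(master.text)
        master.undo_all()
    return outputs


master = ToyDocument("Dear <name>, greetings to <city>.")
letters = merge_with_undo(master, [{"name": "Ann", "city": "Oslo"},
                                   {"name": "Bob", "city": "Rome"}])
print(letters)
```

The sketch only works because every change goes through the undo stack. Any modification that bypasses it would leave the master document altered for the following records.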
->mloiseleur: you have to switch to PATCH as well
Hi, I hope this patch works and is built into a developer build or a special test build if it's not possible to add it to the 3.0 release. I don't think that it's a real solution for the actual problem, that it's not possible to clone documents in the memory, but it's a temporary solution that is imho totally needed. Today I had to merge some labels, approx. 480, 8 labels per document, for our new students. With 3 pictures and 2 fields per label, it took almost an hour to create the documents on the old machine at university. Tomorrow I need to make another 200 and there is no way to get around this, because MS Office 2007 doesn't run on Linux, and that's the only OS installed on the PCs for the students council.
Hi mloiseleur, Great that you prepared this patch! Although I have no say in this area, I can tell that 3.0 is too far along to include it. But I vote for trying to include it in 3.0.1 ... Cor
Adding me to cc. cornouws : Thanks for your help. It's been integrated in go-oo (http://go-oo.org). You'll be able to see it in the next debian/ubuntu release. For windows, you'll have to wait for the official release of 3.0 Go-oo, but it will be in it.
-> mfn: I forgot to mention that unstable builds for SUSE are built on a weekly basis. It's in the next batch. You'll be able to find them here : http://download.opensuse.org/repositories/OpenOffice.org:/UNSTABLE/ And you can add the whole thing without hassle with the 1click-install : http://software.opensuse.org/ymp/OpenOffice.org:UNSTABLE/openSUSE_10.3/OpenSUSE_org.ymp
->mloiseleur: Thank you for your patch. It will work in most cases but not in all. First, some operations are not supported by undo. This applies to EmbedAllLinks() at least. Another, much bigger problem occurs if macros are executed while merging. Those macros can change the document in a lot of ways.
I wasn't aware of the macro possibility. It's so slow that I actually didn't try to look around for advanced features. -> os : Thanks for your comment. Since the case without macros is the common one, what do you think of: 1) detecting whether macros are planned, 2) using one of the two methods according to that test ? According to comments on this issue, it seems that the first way to get a faster mail merge is to do it with a macro, without the wizard. For EmbedAllLinks() and other, as yet unidentified, methods, I am wondering whether they have any effect at all on the merged document. They are applied before the merge and it seems that the result will be the same for each document. Any field mapping/editing method should be undo-able, shouldn't it ? Is this not the case ? If not, does the problem really lie with the patch and with this issue ? BTW, as mfn said : "I don't think that it's a real solution for the actual problem, that it's not possible to clone documents in the memory, but it's a temporary solution that is imho totally needed." -> os : Whatever the answers to the two previous questions are, what do you think about a switch (checkbox) in the wizard ? One which activates/disables this trick. Regards,
Created attachment 57607 [details] New patch, fix a bug and a little bit faster
-> os : You were right about the undo thing, it cannot be trusted completely. I had to move the FieldToText translation into the destination document, because the Undo implementation of this operation cannot be trusted, for whatever reason. Anyway, I still think it's worthwhile, and the main point of the patch is still to use Undo as the most pragmatic way, for now, to get rid of the continuous system reopening. Many macros are undo-able, as far as I know (correct me if I am wrong). So maybe it's not a regression but a necessary evil for attaining decent performance ? Regards,
->mloiseleur: I haven't built your patch yet and will not be able to try this until the end of next week. But it looks promising and I will comment on this then.
*** Issue 87104 has been marked as a duplicate of this issue. ***
Target changed to 3.2
Created attachment 60818 [details] Updated version, which takes better care of numbered lists
I fixed the task now in cws os131 by copying the document in memory instead of saving it. The advantage depends on the type of the document and on the speed of the save/load operations. In my case on some pretty fast machines it was about 20 to 25%.
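The difference between the two approaches mentioned here (cloning the master through a save/load round trip versus copying it in memory) can be sketched as follows. This is a generic illustration, not the code in cws os131: the dictionary standing in for a document and both function names are assumptions.

```python
import copy
import os
import pickle
import tempfile


def clone_via_file(doc):
    """Old approach in spirit: write the master to disk, then load it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "master.bin")
        with open(path, "wb") as handle:
            pickle.dump(doc, handle)
        with open(path, "rb") as handle:
            return pickle.load(handle)


def clone_in_memory(doc):
    """New approach in spirit: duplicate the object graph without touching disk."""
    return copy.deepcopy(doc)


# Hypothetical stand-in for a master document.
master = {"body": ["Dear <name>,"], "styles": {"Default": {"header": False}}}
from_file = clone_via_file(master)
from_memory = clone_in_memory(master)

# Both clones are independent of the master, so a merge can modify them freely.
from_memory["body"].append("filled in for one record")
print(master["body"], from_memory["body"])
```

Both functions yield an independent copy, so macros or other changes applied to a clone cannot leak back into the master. How much time the in-memory copy saves depends on how expensive the save/load step is, which fits the remark that the gain varies with the document and the machine.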
@ os: thanks for working on this. Do I understand it right that you did not use the patch of mloiseleur? Because he was talking about 7 times faster (in an earlier version of the patch). And you talk about 20-25% on a fast machine. So that might be 15% for me, while currently the performance is, well ... Not that 15% improvement is not good, but 2, 5 times faster, would be better :-)
->cornouws: This first patch was using one source document and it used Undo() to restore the document after invisible content has been removed. This doesn't work in that case. Additionally, changes that macros make to the document while merging might get lost.
Does MailMerge still create tons of styles when creating the personalized Document with header and footer? If it does, it would be a good idea to eliminate this behavior too.
->mfn: The styles are necessary to have headers/footers with different content (database content, page count etc).
Reassigned for verification
Verified with cws os131 = OK, visibly improved.
Test with a one-page document with 300 recipients. Here are my results for the number of individual merged documents saved:

OOo3.1      after 1 min   after 2 min   after 3 min   total time
Windows     92 docs       186 docs      283 docs      3 min 10 sec
Linux       196 docs      ---           ---           1 min 30 sec

cws os131   after 1 min   after 2 min   ---           total time
Windows     122 docs      245 docs      ---           2 min 10 sec
Linux       276 docs      ---           ---           1 min 5 sec
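For comparison with the "20 to 25%" estimate given earlier, the total times reported above can be turned into a relative saving. This is only arithmetic on the verification figures; the helper name and the dictionary layout are chosen for illustration.

```python
def time_saved_percent(before_seconds: float, after_seconds: float) -> float:
    """Share of the original run time that the new build saves, in percent."""
    return 100 * (before_seconds - after_seconds) / before_seconds


# Total times for 300 recipients, in seconds, from the verification comment.
totals = {
    "Windows": {"OOo3.1": 3 * 60 + 10, "cws os131": 2 * 60 + 10},
    "Linux": {"OOo3.1": 1 * 60 + 30, "cws os131": 1 * 60 + 5},
}

savings = {
    platform: round(time_saved_percent(run["OOo3.1"], run["cws os131"]), 1)
    for platform, run in totals.items()
}
print(savings)
```

On these figures the total run time drops by roughly 32% on Windows and 28% on Linux.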
Hopefully, I don't drop a clanger. When will it be integrated in a snapshot build?
Unfortunately I have to send this issue back because of a crash when copying text:
- New Writer doc
- Type some text
- Select text
- Menu Edit - Copy
-> crash
NOTE: Occurs only on Windows.
And reopened
Fixed in cws os131 in sw/source/core/doc/swserv.cxx
Verified with new cws os131 = OK. The crash fix does not change the performance.
closing, integrated DEV300_m54