Issue 40827 - MailMerge: Performance of creating individual documents is very slow
Summary: MailMerge: Performance of creating individual documents is very slow
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: save-export
Version: 680m70
Hardware: All All
Importance: P3 Trivial with 18 votes
Target Milestone: ---
Assignee: Oliver Specht
QA Contact: issues@sw
URL:
Keywords: oooqa, performance
Duplicates: 54531 87104
Depends on:
Blocks:
 
Reported: 2005-01-17 17:04 UTC by christian.jansen
Modified: 2013-08-07 14:44 UTC
CC List: 16 users

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Improves speed of MailMerge operation : it's around 7 times faster. (7.72 KB, patch)
2008-10-03 13:43 UTC, mloiseleur
New patch, fix a bug and a little bit faster (8.81 KB, patch)
2008-10-31 10:50 UTC, mloiseleur
Updated version, which takes a better care of numbering list (8.90 KB, patch)
2009-03-09 08:54 UTC, mloiseleur

Description christian.jansen 2005-01-17 17:04:07 UTC
Creating individual documents is far too slow: 2 minutes 5 seconds for 60 short
documents, tested on WinXP, P4, 1.8 GHz, 512 MB RAM.
Comment 1 michael.ruess 2005-01-18 07:55:43 UTC
reassigned to HI.
Comment 2 h.ilter 2005-01-18 08:59:22 UTC
HI->OS: We already discussed this the first time around. There was also a
performance issue which I sent to you and TRA by mail.
Comment 3 Oliver Specht 2005-01-20 10:53:20 UTC
Mailmerge uses one save and 60 load operations in this situation. We're talking
about 2 seconds per document. 
Besides some performance issues on load/save (see issue 20155) there's no way to
improve the speed for OOo 2.0. Target changed to OOo Later.
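
Editorial note: the "about 2 seconds per document" figure can be checked against the timing in the description. A minimal sketch, assuming the 2 min 5 s and 60 documents reported there are the relevant numbers:

```python
# Figures taken from the description: 60 short documents in 2 min 5 s.
total_seconds = 2 * 60 + 5
documents = 60

# Average cost of one merged document, dominated by one load per document.
per_document = total_seconds / documents
print(f"{per_document:.2f} s per document")
```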
Comment 4 h.ilter 2005-01-20 11:19:45 UTC
HI->CJ: FYI.
Comment 5 christian.jansen 2005-01-20 11:59:49 UTC
cj->hi: I can't fix this problem, please send this task back to the developer(s).
Comment 6 h.ilter 2005-01-20 12:10:39 UTC
HI->CJ: Why do you need me to pass it back to development?
Comment 7 wva 2005-08-24 12:05:19 UTC
Hi,
Any progress on this issue?
A real-world test with 1.9.122 (a full A4 with some 20 fields) takes about 30
seconds a page to generate 186 pages (P4M/2.2GHZ/1GBRAM). And the database is
not on the same PC as OOo2. Would really like it to be better, because the rest
of OOo2 is really nice. Any info?
Regards
Wim
Comment 8 pagalmes.lists 2006-05-23 15:18:32 UTC
This is still true with OOo 2.0.3. This is a show stopper for many big companies
wishing to switch from MSO to StarOffice and OOo! It is a pity that they do not
know OOo and IssueZilla well enough to vote, but many people think this slowness
is a real problem that should be solved as soon as possible.
Comment 9 hark 2006-10-16 12:52:55 UTC
A big problem for my company is that mail merge is not only slow, but that it
becomes slower and slower while merging documents. For example, a simple
document begins the merge at a rate of 250/minute. After 10,000 merges the rate
has dropped to 100/minute. (By the way, we use a simple macro for the mail
merge, not the standard wizard)
Comment 10 robbk 2007-09-10 11:01:31 UTC
Our users are also complaining that mail merge is much slower with OOo 2.2 than
before we switched from Word 97 (yes, we were still using MS Office 97 before
migrating to OOo). We typically create form letters and labels with just the
address changed for about 5000 addresses. Would be nice to see the target
changed from "OOo Later" to something more specific.
Comment 11 mfn 2008-02-21 15:21:57 UTC
This issue has now been open for more than 3 years, and the target is still "OOo
Later"... Is it planned to fix this performance problem in OOo 3.0? I hope so.
Comment 12 aladdin2k7 2008-03-28 15:13:29 UTC
More information:

With 1300 records MSO takes only 1 min 30 s and OOo takes 1 h 30 min to do the
mail merge processing.

Is there any plan to fix this performance issue?

Thank you.
Comment 13 mfn 2008-03-28 16:03:07 UTC
I reposted this bug for DEV300_m3 [1] with some more documentation, but it seems
to me that this problem will remain a problem for another 1-100 years.

[1] http://www.openoffice.org/issues/show_bug.cgi?id=87104
Comment 14 aladdin2k7 2008-04-07 10:53:37 UTC
In more detail, it occurs when OOo creates a new page style for each record.
Indeed, you end up with 1300 page styles!

So if you use a page style without any header or footer, OOo will use the
same page style for each record.

With the same page style it is OK (about 4 min 30 s for 1300 records).

Why does OOo create a new page style for each record (for example Default1,
Default2 ...)?
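
Editorial note: the timings quoted in comments 12 and 14 can be put on a per-record basis. This is only a sketch. It assumes the 1 h 30 min run used a page style with header/footer (one page style per record) and that all three runs used the same 1300 records, which the comments imply but do not state outright.

```python
RECORDS = 1300

# Total run times in seconds, as reported in comments 12 and 14.
timings_seconds = {
    "OOo, one page style per record": 90 * 60,   # 1 h 30 min
    "OOo, shared page style": 4 * 60 + 30,       # 4 min 30 s
    "MSO": 1 * 60 + 30,                          # 1 min 30 s
}

def seconds_per_record(total_seconds, records=RECORDS):
    """Average time spent on one record."""
    return total_seconds / records

for label, total in timings_seconds.items():
    print(f"{label}: {seconds_per_record(total):.3f} s per record")

# How much faster the shared-page-style run is than the per-record-style run.
speedup = (timings_seconds["OOo, one page style per record"]
           / timings_seconds["OOo, shared page style"])
print(f"shared page style speedup: {speedup:.0f}x")
```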
Comment 15 Oliver Specht 2008-04-07 12:24:56 UTC
->aladdin2k7: The page styles are created to make sure that merged data that
might influence the header/footer content is correct for each document part.
Comment 16 aladdin2k7 2008-04-08 08:52:52 UTC
I forgot about fields :)

So there will be a performance issue whenever you have to enable a header/footer
containing some data.
Comment 17 Oliver Specht 2008-05-27 06:40:59 UTC
Target changed to 3.x
Comment 18 Oliver Specht 2008-06-12 07:45:39 UTC
*** Issue 54531 has been marked as a duplicate of this issue. ***
Comment 19 aladdin2k7 2008-06-12 10:07:51 UTC
Some discussion about this issue:
http://sw.openoffice.org/servlets/BrowseList?list=dev&by=thread&from=2057086
Comment 20 wilhelmpflueger 2008-09-21 17:45:55 UTC
I ran into this issue some days ago. With OOo 2.4.1 (Ubuntu) it took about
10 hours to export a 4 page document into individual PDFs for the first 650
records. Then the mail merge wizard became slower and slower, formatting some
thousand pages for every single export.
Then I used the File/Print option. This way I got all 1100 exported
documents within about 5 minutes (which is reasonable).
Comment 21 mloiseleur 2008-10-03 13:43:45 UTC
Created attachment 56994 [details]
Improves speed of MailMerge operation : it's around 7 times faster.
Comment 22 mloiseleur 2008-10-03 14:06:47 UTC
Hi *,

  I've made a small patch that dramatically reduces the time needed to perform a
MailMerge.
  I profiled it with callgrind and found that MergeDocuments was waiting forever
on a system method. After looking into it, I realised that the master document was
reopened with an "exec" style call for every single document generated.
  That is to say, a MailMerge action for 1000 documents was like opening OOo 1000
times on the master document.

  I did some tests and found a tricky way to work around this. If I restore
the master document with "Undo()" calls, the MailMerge operation runs as well as
the original one, produces the same result, but is seven times faster.
  Our tests were on a master document without header & footer, but the
performance gain should be the same for any kind of document.

-> os : Would you please review the patch & tell us if it can be integrated into
mainline ?

Thanks,
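
Editorial note: the idea described above (load the master document once and roll it back with undo, instead of reloading it for every record) can be illustrated with a toy model. This is not the patch and not the OOo API. `ToyDocument`, `CountingLoader`, and the `<name>` placeholder are hypothetical names invented for this sketch.

```python
class ToyDocument:
    """Hypothetical stand-in for a master document with an undo stack."""

    def __init__(self, text):
        self.text = text
        self._undo = []

    def replace(self, old, new):
        # Remember the previous state so the edit can be rolled back.
        self._undo.append(self.text)
        self.text = self.text.replace(old, new)

    def undo_all(self):
        # Restore the state from before the first recorded edit.
        if self._undo:
            self.text = self._undo[0]
            self._undo.clear()


class CountingLoader:
    """Counts how often the (expensive) load step is performed."""

    def __init__(self):
        self.calls = 0

    def __call__(self, template):
        self.calls += 1
        return ToyDocument(template)


def merge_with_reload(template, records, load):
    """One load per record, as in the behaviour described in the profile."""
    merged = []
    for record in records:
        doc = load(template)
        doc.replace("<name>", record)
        merged.append(doc.text)
    return merged


def merge_with_undo(template, records, load):
    """A single load; the master is restored with undo after each record."""
    doc = load(template)
    merged = []
    for record in records:
        doc.replace("<name>", record)
        merged.append(doc.text)
        doc.undo_all()
    return merged


reload_loader, undo_loader = CountingLoader(), CountingLoader()
names = ["Ann", "Bob", "Cy"]
a = merge_with_reload("Dear <name>,", names, reload_loader)
b = merge_with_undo("Dear <name>,", names, undo_loader)
print(a == b, reload_loader.calls, undo_loader.calls)
```

The sketch only holds as long as every change made during a merge is recorded on the undo stack; any edit that bypasses it would leak into the next record.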
Comment 23 max.odendahl 2008-10-03 14:19:33 UTC
->mloiseleur: you have to switch to PATCH as well
Comment 24 mfn 2008-10-03 20:39:43 UTC
Hi,

I hope this patch works and is built into a developer build or a special test
build if it's not possible to add it to the 3.0 release.

I don't think that it's a real solution for the actual problem, that it's not
possible to clone documents in the memory, but it's a temporary solution that is
imho totally needed.

Today I had to merge some labels, approx. 480, 8 labels per document for our new
students. With 3 pictures and 2 fields per label, it took almost an hour to
create the documents on the old machine at university. Tomorrow I need to make
another 200 and there is no way to get around this, because MS Office 2007
doesn't run on Linux, and that's the only OS installed on the PCs for the
students council.
Comment 25 cno 2008-10-03 22:39:37 UTC
Hi mloiseleur,
Great that you prepared this patch!
Although I have no say in this field, I can tell that 3.0 is too far along
to include it. But I vote to try for including it with 3.0.1 ...
Cor
Comment 26 mloiseleur 2008-10-04 12:56:06 UTC
Adding me to cc.
cornouws : Thanks for your help. 
  It's been integrated in go-oo (http://go-oo.org). You'll be able to see it in
the next Debian/Ubuntu release. For Windows, you'll have to wait for the official
release of Go-oo 3.0, but it will be in it.
  

Comment 27 mloiseleur 2008-10-04 13:07:14 UTC
-> mfn: I forgot to mention that unstable builds for SUSE are built on a weekly
basis. It's in the next batch. You'll be able to find them here:
http://download.opensuse.org/repositories/OpenOffice.org:/UNSTABLE/
  And you can add the whole thing without hassle with the 1click-install :
http://software.opensuse.org/ymp/OpenOffice.org:UNSTABLE/openSUSE_10.3/OpenSUSE_org.ymp

Comment 28 Oliver Specht 2008-10-06 07:15:39 UTC
->mloiseleur: Thank you for your patch. It will work in most cases but not in
all. First, some operations are not supported by undo. This applies to
EmbedAllLinks() at least. Another, much bigger problem occurs if macros are
executed while merging. Those macros can change the document in a lot of ways.

Comment 29 mloiseleur 2008-10-06 08:38:48 UTC
I wasn't aware of the macro possibility. It's so slow that I actually didn't try
to look around for advanced features.
-> os : Thanks for your comment. Since it's the common case, what do you think of :
   1) detect if macros are planned
   2) use one of the two methods according to the previous test ?

According to comments on this issue, it seems that the first way to have faster
mail merge is to make it with macro, without the wizard. 

For EmbedAllLinks() & other unidentified methods, I am wondering if they
have any effect at all on the merged document. They are applied before the merge
and it seems that the result will be the same for each document.
  Any field mapping/editing method should be undo-able, shouldn't it? Is this not
the case? If not, does the problem really lie in the patch & in this issue?

BTW, as mfn said : "I don't think that it's a real solution for the actual
problem, that it's not possible to clone documents in the memory, but it's a
temporary solution that is imho totally needed."

-> os : Whatever the answers to the two previous questions are, what do you
think about a switch (checkbox) in the wizard? One which activates/disables
this trick.

Regards,
Comment 30 mloiseleur 2008-10-31 10:50:51 UTC
Created attachment 57607 [details]
New patch, fix a bug and a little bit faster
Comment 31 mloiseleur 2008-10-31 10:59:16 UTC
-> os : You were right on the undo thing, it cannot be trusted completely. I had
to move the FieldToText translation into the destination document, because the
Undo implementation of this operation cannot be trusted, for whatever reason.

  Anyway, I still think it's worthwhile, and the main point of the patch is still
to use Undo as the most pragmatic way currently available to get rid of the
continuous system reopening.
  Many macros are undo-able, as far as I know (correct me if I am wrong). So
maybe it's not a regression but a necessary evil for attaining decent performance?
  

Regards,
Comment 32 Oliver Specht 2008-10-31 14:35:21 UTC
->mloiseleur: I didn't build your patch yet and will not be able to try this
until the end of next week. But it looks promising and I will comment on this then.
Comment 33 Oliver Specht 2008-12-09 14:07:16 UTC
*** Issue 87104 has been marked as a duplicate of this issue. ***
Comment 34 Oliver Specht 2009-01-30 10:42:25 UTC
Target changed to 3.2
Comment 35 mloiseleur 2009-03-09 08:54:10 UTC
Created attachment 60818 [details]
Updated version, which takes a better care of numbering list
Comment 36 Oliver Specht 2009-06-17 11:02:20 UTC
I fixed the task now in cws os131 by copying the document in memory instead of
saving it. The gain depends on the type of the document and on the speed of
the save/load operations. In my case, on some pretty fast machines, it was about
20 to 25%.
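
Editorial note: the difference between cloning a document through a save/load round trip and copying it in memory can be sketched with a toy example. This is not the cws os131 code. The dictionary standing in for a document and the helper names `clone_via_file` and `clone_in_memory` are assumptions made for illustration only.

```python
import copy
import os
import pickle
import tempfile


def clone_via_file(doc):
    """Clone by writing the document to disk and reading it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(doc, handle)
        with open(path, "rb") as handle:
            return pickle.load(handle)
    finally:
        os.remove(path)


def clone_in_memory(doc):
    """Clone without touching the file system."""
    return copy.deepcopy(doc)


# Hypothetical toy "master document".
master = {"body": "Dear <name>,", "styles": ["Default"]}

from_file = clone_via_file(master)
from_memory = clone_in_memory(master)

# Both clones are independent of the master, so a merge can modify them freely.
from_memory["styles"].append("Default1")
print(master["styles"], from_file == master)
```

Both routes yield an independent copy; the in-memory route simply skips the serialisation and file I/O, which is where the gain would come from if save/load dominates.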
Comment 37 cno 2009-06-17 12:27:10 UTC
@ os: thanks for working on this.
Do I understand it right that you did not use the patch of mloiseleur?
Because he was talking about 7 times faster (in an earlier version of the patch).
And you talk about 20-25% on a fast machine. So that might be 15% for me, while
currently the performance is, well ...
Not that 15% improvement is not good, but 2, 5 times faster, would be better :-)
Comment 38 Oliver Specht 2009-06-17 13:00:36 UTC
->cornouws: This first patch was using one source document and it used Undo() to
restore the document after invisible content has been removed. This doesn't work
in that case. Additionally, changes that macros make to the document while merging
might get lost.
Comment 39 mfn 2009-06-17 14:13:54 UTC
Does MailMerge still create tons of styles when creating the personalized
Document with header and footer?
If it does, it would be a good idea to eliminate this behavior too.
Comment 40 Oliver Specht 2009-06-17 14:21:36 UTC
->mfn: The styles are necessary to have headers/footers with different content
(database content, page count etc). 
Comment 41 Oliver Specht 2009-06-18 10:01:44 UTC
Reassigned for verification
Comment 42 h.ilter 2009-07-10 13:30:04 UTC
Verified with cws os131 = OK, visibly improved.
Tested with a one-page document with 300 recipients.
Here are my results for the number of individual merged documents saved:
OOo 3.1      after 1 min   after 2 min   after 3 min   total time
Windows      92 docs       186 docs      283 docs      3 min 10 sec
Linux        196 docs      ---           ---           1 min 30 sec

cws os131    after 1 min   after 2 min   after 3 min   total time
Windows      122 docs      245 docs      ---           2 min 10 sec
Linux        276 docs      ---           ---           1 min 5 sec
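
Editorial note: the total times in the table above can be turned into a relative saving per platform. A minimal sketch, using only the totals reported in this comment:

```python
# Total times from the verification table (300 recipients, one-page document).
totals_seconds = {
    "Windows": {"OOo 3.1": 3 * 60 + 10, "cws os131": 2 * 60 + 10},
    "Linux": {"OOo 3.1": 1 * 60 + 30, "cws os131": 1 * 60 + 5},
}


def time_saved_percent(before, after):
    """Share of the original run time that the change removes."""
    return 100.0 * (before - after) / before


for platform, t in totals_seconds.items():
    saved = time_saved_percent(t["OOo 3.1"], t["cws os131"])
    print(f"{platform}: {saved:.1f}% less time")
```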
Comment 43 mfn 2009-07-12 23:44:41 UTC
Hopefully, I don't drop a clanger.
When will it be integrated in a snapshot build?
Comment 44 h.ilter 2009-07-13 09:09:19 UTC
Unfortunately I have to send this issue back because of a crash when copying text:
- New Writer doc
- Type some text
- Select text
- Menu Edit - Copy
-> crash
NOTE: Occurs only on Windows.
Comment 45 h.ilter 2009-07-13 09:10:58 UTC
And reopened
Comment 46 Oliver Specht 2009-07-22 13:07:05 UTC
Fixed in cws os131 in 
sw/source/core/doc/swserv.cxx
Comment 47 h.ilter 2009-07-24 10:59:50 UTC
Verified with new cws os131 = ok
The crash fix does not change the performance.
Comment 48 caolanm 2009-09-11 09:08:29 UTC
closing, integrated DEV300_m54