Apache OpenOffice (AOO) Bugzilla – Issue 55650
Conversion wizard is too slow if log enabled
Last modified: 2013-07-30 02:20:11 UTC
I've started converting all the documents of my office from MsOffice format to OpenDocument, using the wizard. I started the conversion in a directory tree with about 3,500 files. With the "log" enabled, the conversion is 10x slower than without the log and, even worse, it becomes slower and slower as it goes. After 7 hours (sigh!) about 1,600 files had been converted, and I had to abort the process. I restarted without the log enabled and completed the conversion, as I said, at a documents-per-minute rate 10x faster, from start to end.

Even when converting 680 Excel files, the "log" flag made it a painful experience (but I did not retry without it, so I don't know exactly what improvement you could get in a "pure" Excel conversion).

I expect a lot of people to use the wizard on their archives, so improvements in this area are much more important than before. Please improve the "log" option algorithm, or remove it.

I've written a Python script that I run after the wizard. It checks whether each M$ Office file has a converted counterpart and, if not, prints its path. This works only if the "source" files and the converted ones are in the same location, of course. This is very beta software (I don't assume responsibility for it), but it seems to work fine, so you could include it in the wizard and run it at the end of the process. It's probably more useful than the report (btw, it does not yet consider the templates).

#!/usr/bin/env python
"""
Shows filenames of files that have not been converted by the OpenOffice conversion wizard
(original and converted files are supposed to be in the same directory)

Copyright (c) by Marco Menardi mmenaz@mail.com
with the help of Paolo Veronelli paolo.veronelli@gmail.com

This program is Free Software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
"""
__version__ = "0.1"

##### hack this dict to add support for more formats
exts = {'.doc': '.odt',  # Word, Write
        '.xls': '.ods',  # Excel, Calc
        '.ppt': '.odp',  # PowerPoint, Impress
        '.pps': '.odp',  # PowerPoint, Impress (different extension)
        '.sxw': '.odt',  # Write 1.0
        '.sxc': '.ods',  # Calc 1.0
        '.sxd': '.odp',  # Impress 1.0
        '.sdw': '.odt',  # StarWrite 1.0
        '.sdc': '.ods',  # StarCalc 1.0
        '.sdd': '.odp',  # StarImpress 1.0
        '.sdp': '.odp',  # StarImpress 1.0 (different extension)
        }

import os, sys, optparse

usage = '''
python notooo.py <directory>
\tLists Microsoft Office files that have no OpenDocument equivalent in the same directory
\tDefault is current directory.
'''

parser = optparse.OptionParser(usage=usage)
# Note: options.nonrecurse is True by default and becomes False when -n is given.
parser.add_option("-n", "--nonrecurse", dest="nonrecurse",
                  action="store_false", default=True,
                  help="don't recurse down target directory")
options, args = parser.parse_args()

if not len(args):  # default to '.'
    basedir = '.'
elif len(args) == 1:  # the target directory
    basedir = args[0]
else:
    parser.print_help()
    sys.exit(1)

for root, dirs, files in os.walk(basedir):
    for name in files:
        base, ext = os.path.splitext(name)
        if ext in exts:
            target = os.path.join(root, base + exts[ext])
            if not os.path.exists(target):
                print(os.path.join(root, name))
    if not options.nonrecurse:
        # -n was given: stop after the top-level directory
        break
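For anyone who wants to call the same check from other code (for example at the end of a batch run), the idea can be sketched as a function that returns the list instead of printing it. This is only a minimal sketch, not part of the wizard or of the script above: the name find_unconverted, its parameters, the shortened extension table, and the case-insensitive extension match are all assumptions made for illustration.

```python
import os

# Source extension -> expected OpenDocument extension.
# Shortened, illustrative subset of the table in the script above.
EXTS = {
    ".doc": ".odt",
    ".xls": ".ods",
    ".ppt": ".odp",
    ".pps": ".odp",
}


def find_unconverted(basedir=".", recurse=True, exts=EXTS):
    """Return sorted paths of source files with no converted sibling.

    Hypothetical helper: a file counts as converted when a file with the
    same base name and the mapped extension exists in the same directory.
    """
    missing = []
    for root, dirs, files in os.walk(basedir):
        for name in files:
            base, ext = os.path.splitext(name)
            # Lower-casing the extension is an assumption so that
            # REPORT.DOC is also checked; the original script compares
            # extensions case-sensitively.
            target_ext = exts.get(ext.lower())
            if target_ext is None:
                continue
            if not os.path.exists(os.path.join(root, base + target_ext)):
                missing.append(os.path.join(root, name))
        if not recurse:
            # Emptying dirs in place stops os.walk from descending.
            del dirs[:]
    return sorted(missing)
```

Calling find_unconverted("/path/to/archive") would return the paths still lacking an OpenDocument counterpart, and passing recurse=False limits the check to the top-level directory, mirroring the -n option.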
of @ oc: Please have a look.
Hi Ilko, this seems to be your area. 178 files in some subfolders converted without log function 25 % faster. Frank
Frank: "178 files in some subfolders converted without log function 25 % faster" As I clearly stated, the log makes things go slower and slower. If you convert 178 files, maybe. Try with 1780, or 6478 as I did, then let me know. The average office has thousands of files to convert, in hundreds of folders; you can't ask them to convert them in batches of 100 files. Do we want to ease their migration as much as possible or not? Thanks
Reset assignee on issues not touched by assignee in more than 2000 days.