Apache OpenOffice (AOO) Bugzilla – Issue 92384
New document language setting ignored for the HSQLDB in a newly created .odb
Last modified: 2017-05-20 10:47:32 UTC
When creating a new .odb database, the "SET DATABASE COLLATION" statement in the .odb document's embedded HSQLDB database (in the database/script file) uses a language corresponding to the user's current locale, not the OOo setting for default language for new documents.
Please describe more exactly where you look for the settings.
Unzip the .odb file and look in the database/script file in the unpacked contents. It contains a line like: SET DATABASE COLLATION "Swedish" Unfortunately I can no longer recall what the exact customer complaint was, but it was a real customer, from Quebec I think, who complained about this. I guess it is related to the fact that the HSQLDB code supports only a very limited set of "collations".
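The manual check described above (unzip the .odb, read database/script) can also be done programmatically. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the script file is plain text that can be read as UTF-8 and that the statement appears on a line of its own, as in the example quoted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of OOo or HSQLDB.
final class OdbCollation {
    // Matches a line such as: SET DATABASE COLLATION "Swedish"
    private static final Pattern COLLATION =
            Pattern.compile("^SET DATABASE COLLATION \"([^\"]+)\"");

    // Returns the collation name if the given script line sets one.
    static Optional<String> parseCollation(String scriptLine) {
        Matcher m = COLLATION.matcher(scriptLine.trim());
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Opens the .odb as a zip archive and scans database/script line by line.
    static Optional<String> readCollation(String odbPath) throws IOException {
        try (ZipFile zip = new ZipFile(odbPath)) {
            ZipEntry entry = zip.getEntry("database/script");
            if (entry == null) {
                return Optional.empty(); // no embedded HSQLDB script in this file
            }
            // Assumption: the script can be decoded as UTF-8.
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    zip.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    Optional<String> collation = parseCollation(line);
                    if (collation.isPresent()) {
                        return collation;
                    }
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java OdbCollation <file.odb>");
            return;
        }
        System.out.println(readCollation(args[0]).orElse("(no collation statement found)"));
    }
}
```

Run against a freshly created .odb, this prints the collation name that ended up in the file, which makes it easy to compare against the OOo default-language setting and the system locale.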
HSQLdb uses the system locale setting for newly created databases. The Base implementation of HSQLdb does not change this - tested again with OOo 3.0 today. This collation language setting can only be changed IF there are no objects in the database: no tables, sequences or views. Once any of these have been created, the collation order is fixed. It can, however, be changed permanently before any of them are created, by issuing SET DATABASE COLLATION in the SQL window of a newly created Base file. I also checked this with OOo 3.0 and it still works as expected. As for which locales are supported - over 90 last I checked, but then FS will know this for sure. I would think, however, that using the OOo default language setting is the more appropriate behavior.
Here is what I wrote in the corresponding Novell bug: The problem here is that the HSQLDB database format seems to support only a fixed list of collation languages. If you look into the attached .odb file (just unzip it), in the file database/script you find the line: SET DATABASE COLLATION "French"
The documentation at http://hsqldb.sourceforge.net/web/hsqlDocsFrame.html just says of this statement: "Each database can have its own collation. Sets the collation from the set of collations in the source for org.hsqldb.Collation." So apparently the HSQLDB source code is also the format specification for this part of an HSQLDB database. The source file in question has a table of collation names and their mapping to Java locale names (ISO639-ISO3166 code pairs). For French, it only contains: nameToJavaName.put("French", "fr-FR");
If we modified this to also include a mapping between, say, "French (Canada)" and "fr-CA", then .odb databases produced by our version of OOo would presumably be unreadable by vanilla OOo versions, as the HSQLDB code seems to throw an exception if an unrecognized collation locale is read from the script file of the database. (I haven't tried; this is just from reading the code.)
Btw, the collation name table in HSQLDB contains such oddities as "Danish_Norwegian" for "nb-NO"... Calling the Bokmål variant of Norwegian "Danish_Norwegian" is quite insulting to at least some of its users, I think. As you might know, the relation between the nb_NO (Bokmål) and nn_NO (Nynorsk) communities can be quite "hot"... (No, despite my first name I am not Norwegian myself.)
I really don't know what to do at this point. Should we add support for fr-CA to our build of OOo, and then risk that databases produced by our users in Québec aren't usable by users of vanilla OOo? (I will have to test whether that actually happens.) Will it be enough to add "fr-CA" to the table, and does Java already support proper fr-CA collation? I will have to test that, too.
Btw, OOo's setting Tools:Options:Language Settings:Languages:Default languages for documents:Western does not affect the "SET DATABASE COLLATION" statement stored in a newly created .odb file's HSQLDB database. It uses the user's locale! (On Windows, from the Control Panel's "Regional and Language Options" applet.)
Agree to change the behavior so that the collation for newly created databases matches the respective OOo option, not the system locale. As for patching the HSQLDB code to accept more locales: I really do not know why we initially decided to have those prose names for the languages; this looks completely weird from today's point of view. IMO, the SET COLLATION behavior should be extended so that it accepts the current prose names for compatibility only, and otherwise is simply based on locales, e.g. SET DATABASE COLLATION fr-CA. Also, the driver on the OOo side should generate such a locale-based statement. Then the patches should be contributed to both the HSQLDB project and OOo, for inclusion in upcoming releases of both products. Probably, when micro updates of older versions of OOo appear (such as 2.4.x), we should also include those patches there, to ensure those older versions can read .odbs created with newer versions.
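To make the proposal concrete, here is a rough sketch of how a lookup could accept the existing prose names for compatibility and otherwise treat the argument as a locale tag. This is not the HSQLDB implementation: the class and method names are hypothetical, the legacy table holds only the two mappings quoted in this issue, and it assumes Java 8 or later (Locale.forLanguageTag did not exist in the Java versions HSQLDB 1.8 targets).

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch, not the org.hsqldb.Collation class.
final class CollationResolver {
    // Legacy prose names kept for compatibility only.
    // Illustrative subset: just the two entries quoted in this issue.
    private static final Map<String, String> LEGACY_NAMES = new HashMap<>();
    static {
        LEGACY_NAMES.put("French", "fr-FR");
        LEGACY_NAMES.put("Danish_Norwegian", "nb-NO");
    }

    // Two-letter ISO 639 codes known to the running JVM, used as a sanity check.
    private static final Set<String> ISO_LANGUAGES =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));

    // Resolves either a legacy prose name or a locale tag such as "fr-CA".
    // Returns empty for anything that is neither.
    static Optional<Locale> resolve(String name) {
        String tag = LEGACY_NAMES.getOrDefault(name, name);
        Locale locale = Locale.forLanguageTag(tag);
        return ISO_LANGUAGES.contains(locale.getLanguage())
                ? Optional.of(locale)
                : Optional.empty();
    }

    // Returns a collator for the name, or empty if the name is not recognized.
    // Whether the JVM has dedicated rules for a given locale is a separate question.
    static Optional<Collator> collatorFor(String name) {
        return resolve(name).map(Collator::getInstance);
    }
}
```

With a lookup like this, SET DATABASE COLLATION fr-CA would need no new table entry, while files containing "French" would still load. Older builds would still reject the new form, which is why the patch would have to reach the 2.4.x micro updates as well.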
@fs - You are talking about the current HSQLdb 1.8.x code, I believe. If so, then I think this would be something I could handle. I looked at the source for that functionality once before (a long way back, however) and will do so again now. Assuming it looks like a task I could take on quickly, I'll let you know here, unless you or OJ would prefer to do so. Specifically: 1) Change to use the OO.o locale setting over the system one on create. (I'm figuring here that this will be some change to C++ - I expect I could handle that, but would ask for help with finding where in the source.) 2) Accept ISO abbreviations, 'fr-CA' for instance, *along* with the verbose language names. (Figuring this is all Java in the HSQLdb source - should be able to just jump in on this one.)
Drew, your help is much appreciated. However, there's a small "but": I talked about this issue with Fred Toussi, the HSQLDB maintainer, a while ago. First thing I stumbled upon in the current implementation is this strange and weird usage of prose names for the collations, instead of simply specifying an ISO locale. This is something I'd like to be changed in the course of fixing the issue - the old prose names would be kept for compatibility only. Second, Fred told me about ongoing implementations in the 1.9 branch, regarding collations. One thing I promised to do when time permits (which wasn't the case so far, unfortunately) is research about collation features in other DBMS, so we don't re-invent collation in HSQL. So, I don't think a short fix would be the best solution here. If you're interested in helping, then I would like to forward you the small conversation I had with Fred, in the hope you can participate. Especially the collection of collation-usages in other DBs is something which I think would perfectly fit your competence ...
@fs - yes, why don't you pass on the comments from Fred. I could at a minimum help put together a reference of how collation is handled in MySQL, PostgreSQL, SQL Server and Oracle EX (11g).
Reset assignee to the default "issues@openoffice.apache.org".