Apache OpenOffice (AOO) Bugzilla – Issue 17056
WW8: Word doesn't have "title case" character property
Last modified: 2013-08-07 14:41:36 UTC
Words which have had their case changed, either by autocorrection or by use of my case-changing macro (in the URL above), lose this change of case when exported to Word format. This makes the resulting documents look very sloppy and unprofessional. Steps to repeat: start a paragraph without a capital letter (see above). Correct it to title case, using my macro. Export to Word. Reopen in Word. The letter will have come back down to lower case. It is still in lower case if the document is reopened in OOo.
Reassigned to JSK
Bob Long on the user list checked, and I have confirmed, that title case is not exported to Word even when applied via format->character->title case. Parenthetically, this is the fourth bug I have found in the implementation of title case. It really must have been written on a Monday morning.
Some more on this case business. It seems that the Font Effects|Effects settings change only the *appearance* of text within OOo. For example, if you enter ABC (in capitals, using the shift key) then you will see ABC. And if you use font effects to change it to, say, lowercase or title case, then you see lowercase or title case, as appropriate. But the underlying characters are *not* changed. Conversely, entering abc in lower case, then using the effect of capitals, shows capitals but does not change the "real" characters. This can be seen by saving as HTML, or by pasting as unformatted text. I discovered this when trying to use a regular expression to replace words that were all capitals with italicised lowercase. As I'm pretty raw with REs, I thought I was doing it incorrectly. But it turns out the document (originally a .doc made by someone else) was using the capitals font effect. Without the underlying characters being changed, I can't think of how to successfully change case and have the expected case available for something like HTML output or an unformatted paste.
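To make the behaviour described above concrete, here is a minimal sketch of the model, not OOo code. The names `Run`, `displayed` and `plain_text` and the effect values are hypothetical, and the title-case rule (capitalise the first character of each word, leave the rest alone) is an assumption for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Run:
    """A stretch of text plus a display-only case effect (hypothetical model)."""
    chars: str               # characters actually stored in the document
    case_map: str = "none"   # assumed values: "none", "upper", "lower", "title"


def displayed(run: Run) -> str:
    """What the user sees on screen: the effect is applied at render time."""
    if run.case_map == "upper":
        return run.chars.upper()
    if run.case_map == "lower":
        return run.chars.lower()
    if run.case_map == "title":
        # Assumption: capitalise the first character of each word only.
        return re.sub(r"\S+", lambda m: m.group()[:1].upper() + m.group()[1:], run.chars)
    return run.chars


def plain_text(run: Run) -> str:
    """An export or unformatted paste that ignores character properties."""
    return run.chars


run = Run("abc", case_map="upper")
print(displayed(run))   # ABC
print(plain_text(run))  # abc
```

In this model a search over the stored characters would never match "ABC", which would be consistent with the regular-expression surprise described above.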
There are two problems here. One is the problem of export to other formats -- HTML and plain text as well as Word: presumably all export formats are affected. Here it seems to me incontrovertible that the export filters should change the case of the underlying characters. The second problem comes, as Bob has noticed, with cutting and pasting text. That's a little more complicated, because I presume that the justification for having a charcasemap at all is something to do with non-ASCII charsets. But presumably the unformatted text that is placed on a clipboard should be changed just as when it's exported to a proper file format.
I think this has absolutely nothing to do with the macro recorder. Interpreting the additional comments, I think this is a flaw in the filters. Back to hi.
Quite. It has nothing whatever to do with the macro recorder, and macros only come into it because it was by using a (hand-written) macro that I discovered this flaw. The changed case is not exported however the change is made.
HI->CMC: The font attributes seem not to be exported correctly.
Word doesn't have a "title" character format. Our title format caps the first letter in a paragraph; Word doesn't have a feature like that, so we'd have to turn it into an actual "caps" format on the first character.
Created attachment 8463 [details] Example .sxw file
OK, seeing as Word doesn't have the feature, we'll make the filter turn the affected characters into their uppercase equivalents when exporting. Done in limerickfilterteam08 for OOo 2.0.
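As a rough illustration of that approach (a sketch only, not the actual filter code): when the target format has no equivalent property, the effect is applied to the characters themselves and the property is dropped. The function names, the `(text, effect)` tuple layout and the `supported` set are all hypothetical, and the title-case rule is assumed to capitalise the first character of each word.

```python
import re


def apply_case(text: str, effect: str) -> str:
    """Return text with the case effect baked into the characters."""
    if effect == "upper":
        return text.upper()
    if effect == "lower":
        return text.lower()
    if effect == "title":
        # Assumption: capitalise the first character of each word only.
        return re.sub(r"\S+", lambda m: m.group()[:1].upper() + m.group()[1:], text)
    return text


def bake_unsupported(runs, supported):
    """Keep effects the target format understands; bake the others into the text."""
    out = []
    for text, effect in runs:
        if effect == "none" or effect in supported:
            out.append((text, effect))                     # property survives export
        else:
            out.append((apply_case(text, effect), "none"))  # real characters change
    return out


runs = [("once", "title"), (" upon a time ", "none"), ("the end", "upper")]
# Assume a target that has an all-caps property but no title-case property.
exported = bake_unsupported(runs, supported={"upper"})
print(exported)
```

A target with no character properties at all would correspond to `supported=set()`, in which case every effect is baked into the text.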
retitling
cmc: thanks for that. The next question is whether you can modify the upper/lower case settings in OOo to export real upper case/lower case letters both to Word and to ASCII text. This is not quite the same as Word's deficient title case. It crops up in other contexts, such as when Writer autocorrects sentences that start with a lower-case letter. The correction is not preserved on export to ASCII, for example. Again, this makes some journalists look like sloppy fools when they submit autocorrected ASCII files in which half the sentences start with a lower-case letter. Anyone would think we were idle or ignorant. That is a terrible thing. Perhaps this is someone else's job. But I don't consider the issue fixed until all the charcasemap formatting changes are preserved on export to formats that don't support a charcasemap type property, because the root of this problem is a failure for that to happen.
When Writer autocorrects, it does replace the character with an uppercase equivalent; it doesn't toggle the title or other char property, so I don't see this behaviour with text autocorrected to have uppercase. So the normal autocorrect case works fine for me saving to text and .doc and so on. It only arises when there is a char property on the character. Word has the caps property, so that's OK. Granted, the export to .txt should probably do the same thing as was done for title case here. But that's surprisingly tricky, as the ASCII filter doesn't work with any properties at all. Nevertheless it's a good point; write a new bug for export of all caps/title case/small caps to .txt. A quick check of the competitor shows that that works for them.
reopen to reassign
Created attachment 8484 [details] example for qa
cmc->mru: Change title case to real uppercase implemented for .doc in limerickfilterteam08
Checked with internal CWS filterteam08.
Verified. The "Fix" will be available in OO 2.0. Andrew, be careful: an import of the exported title characters will not be possible. As CMC pointed out, Word does not have such a feature. This means the characters will be loaded as normal upper case chars into Writer, not as chars with the "title" property.
I think this is what I asked for: that a word in Title case (i.e. with the first letter upper case and the rest lower) is exported to Word in that way. Will this also be true with export to ASCII/HTML? The difference, as I understand it, is that in real title case, if I type "word" then "Word" appears; and if I later add an "s" letter at the start, "Sword" appears (the S is capped up, and the w lowered). Obviously that can't be preserved on export to formats that don't know about title case. In those formats, after export "Word" should appear, with the initial cap; and the expected behaviour if I add an initial lower-case s is to get "sWord". Not perfect, but no longer the fault of OOo, which at present exports "word". Anyway, thank you for spending the time on this.
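The "Sword" versus "sWord" difference can be worked through in a few lines. This is a sketch under the assumption that the title property capitalises the first character of each word at display time and leaves the stored characters alone; `title_display` is a hypothetical helper, not an OOo function.

```python
import re


def title_display(stored: str) -> str:
    """Render stored text as a live title-case property would (assumed rule)."""
    return re.sub(r"\S+", lambda m: m.group()[:1].upper() + m.group()[1:], stored)


stored = "word"

# Live property: the stored text stays "word" and is re-rendered after each edit.
live_before = title_display(stored)         # "Word"
live_after = title_display("s" + stored)    # "Sword"

# After export to a format without the property: "Word" is now the real text,
# so typing a lower-case s in front of it is not re-rendered.
baked = title_display(stored)               # "Word"
baked_after = "s" + baked                   # "sWord"

print(live_before, live_after, baked, baked_after)
```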
Checked integration in 680m32. Andrew, for a similar feature in ASCII/HTML export, please file new "enhancement" issues. These (though both seem to be export issues) are different places in the code to work on, so it will be necessary to file new issues. Thanks for your patience!