Apache OpenOffice (AOO) Bugzilla – Issue 31776
PDF export bug for Asian language Sinhala
Last modified: 2017-05-20 11:29:54 UTC
While Asian characters for the Sinhala (Sri Lanka) language appear (usually properly) in the OO word processor, exporting the document to PDF does not display them properly. It seems that character distances are not respected. Note: Some characters in this language are "built" from 2 characters, where the 2nd character "jumps back" onto the 1st to add something to it. Another strange bug seems to be that not all characters are displayed properly in the OO word processor, but if you change them to bold, then they are ... !?!
Created attachment 16619 [details] buggy PDF
Created attachment 16620 [details] SXW is OK ...
Created attachment 16621 [details] TTF of Sinhala and "normal" roman font
reassigned to HI.
Verified with 680m48 = ok
closed.
Congratulations on v1.9.62! I tested this bug and the bold bug is now fixed, but unfortunately the PDF export is still buggy, so I needed to reopen this issue.
HI->iwpoo: Could you please send a screenshot of how it should look? This would make the evaluation easier because of our different locale and language systems. Thanks in advance.
Created attachment 21244 [details] screenshot (zoomed) SXW in v1.9.62
Created attachment 21245 [details] PDF Acrobat v7 view
HI->HDU: I'm not able to load the bugdoc so that it displays as it should (see screenshot). Please take a look at this.
I have a problem with the sinhala.ttf font provided. Even when the font is installed the "Character Map" application does not display any Sinhala characters. Does it work on your system? Can you also provide the PDF created by a recent SRC680 version, e.g. m62?
Created attachment 21535 [details] character menu with some Sinhala characters (1.9 m69)
Created attachment 21536 [details] 1.9 m69: PDF, under Acrobat 7 it now no longer displays Sinhala properly, but shows special ASCII characters instead ...
It looks like there are already some problems with the encoding in the sample document. The *.sxw provided requests the Unicode code points U+00B8 U+00BC U+00AB U+00BC U+00D4, which is consistent with what the PDF output shows. Was the sample document created using Copy+Paste? What was the encoding of the source of the Copy+Paste operation? Can you recreate the sample document by reentering the non-ASCII characters using the "Insert->SpecialCharacter" dialog? I'm also wondering why the wrongly encoded text looks good on your display.
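For anyone following along, a minimal Python sketch of how the code points listed above can be inspected. The helper name `describe` is made up for illustration; it only uses the standard `unicodedata` module and is not part of OOo or of this issue's attachments.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# The five code points reported as stored in the sample document.
sample = "\u00b8\u00bc\u00ab\u00bc\u00d4"

for code, name in describe(sample):
    print(code, name)
```

All five names come from the Latin-1 Supplement block, not from the Sinhala block, which matches the observation that the PDF shows Latin-1 symbols.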
Created attachment 21621 [details] v1.9 m69, reentered via insert, special characters
Even the latest document has the same problem with the wrong codes stored in it. I guess it happened when saving the file, so we have to reproduce it manually. From looking at the screenshot, it seems that the text originally entered had the Unicode code points U+0DB9 U+0DA5 U+0DA9 U+0DA5 U+0DC9. Correct?
Thanks for your patience ... ! No, those are not the Unicode values we entered. You need to use these in Latin-1: U+00B8 (184), U+00BC (188), U+00AB (171), U+00BC (188), U+00D4 (212).
Ouch! So the Unicode values you requested are exactly the ones you get in the PDF output :-) There is an anomaly in the Sinhala font in that it suggests both Mac-Roman and Unicode Sinhala encodings, which contradict each other. OOo will always prefer the Unicode information in the font. Please use the Sinhala Unicode values U+0D80-U+0DFF...
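As a rough illustration of the point about U+0D80-U+0DFF, here is a small Python sketch that checks whether a string's characters fall inside the Sinhala Unicode block. The function names are hypothetical and the check is only a range test; it does not verify that a code point is actually assigned, nor does it say anything about how OOo selects a font encoding.

```python
# Bounds of the Sinhala Unicode block, as given in the comment above.
SINHALA_START, SINHALA_END = 0x0D80, 0x0DFF

def in_sinhala_block(ch):
    """True if the character's code point lies in U+0D80-U+0DFF."""
    return SINHALA_START <= ord(ch) <= SINHALA_END

def all_sinhala(text):
    """True if every non-whitespace character is in the Sinhala block."""
    return all(in_sinhala_block(ch) for ch in text if not ch.isspace())

# Latin-1 values entered by the reporter vs. the values guessed from the screenshot.
latin1_text = "\u00b8\u00bc\u00ab\u00bc\u00d4"
guessed_text = "\u0db9\u0da5\u0da9\u0da5\u0dc9"

print(all_sinhala(latin1_text))
print(all_sinhala(guessed_text))
```

Text typed as Latin-1 values fails this check, so it can only look like Sinhala when the font's non-Unicode mapping happens to be used for display.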
Created attachment 21758 [details] but I get only '=''s there ...
Hi! Thanks for OO 2.0! I hope the community will soon find time to work on this issue. I have also run into a further problem, already present in v1.14 (and still in the new v2.0 release): the letter for a small (lowercase) 't' with a dot below does not show up any more ... there is only a grey rectangular field instead ...
I changed the font, so I'm (currently) having much less trouble now ...
Reset assignee to the default "issues@openoffice.apache.org".