Apache OpenOffice (AOO) Bugzilla – Issue 11579
Japanese chars' wrong appearance/PDF export in vertical text
Last modified: 2013-08-07 15:00:19 UTC
Thanks for developing this valuable office suite.

In vertically-directed text (Format->Page->Page->Text direction->Right-to-left(vertical)), certain Japanese characters
* become garbage on screen/PDF export
* disappear on screen/PDF export
* are displayed in the wrong direction on screen/PDF export

The characters in question are special characters that are expected to be rotated by 90 degrees on a vertically-directed page. Please note that I haven't checked all characters of this kind, so this issue may involve others as well.

The following behavior is observed.

                      [Windows2000/XP]      [Linux]
                      Screen   PDF          Screen   PDF
double-byte dots      OK       Dir NG       OK       OK
double-byte wave      OK       Dir NG       OK       OK
double-byte bracket   *        garbage      OK       OK
double-byte minus     +        Dir NG       #        Disappear

*: Should be rotated 180 degrees on my XP machine. Reportedly OK on some Win2000 machines, but disappears/becomes garbage on other Win2000 machines.
+: Direction is OK, but the char width seems negative (the cursor goes back) on my XP machine. Should be rotated by 90 degrees on Win2000 machines.
Dir NG: Should be rotated by 90 degrees.
#: When Writer on Linux opens an sxw file created on an XP machine, the double-byte minus appears in the wrong direction (should be rotated by 90 degrees).

An additional issue is that PDF export on my XP machine sometimes places the characters in the wrong place.

Windows2000/XP are Japanese editions. Linux is RedHat 8.0 and Vine 2.6 (a popular distribution used in Japan). Other platforms (Win9x and others) have not been tested for this issue.

This issue might involve multiple factors, but I can't isolate them into separate issues. I will attach sxw and PDF files containing vertical text.
Created attachment 4778 [details] archive of sxw and PDF containing vertical text, created on a XP machine
This issue still seems to be present in 644m4. Maybe 'l10n' would be a more appropriate component for this issue?
Since I have had no response from the wordprocessor team and this issue relates to Japanese punctuation characters, maybe 'l10n' is a more appropriate component for it. This issue relates to display and PDF export in Writer.
DL->CP: Would you please take over?
cp->pl: is it you or hdu ?
pl->hdu: The onscreen issue you should probably have a look at; it is possible that the PDF problem is the same (e.g. SalLayout), but if it persists in PDF after you did your part i will have a look at the PDF code.
> *: Should be rotated 180 degrees my XP machine.

Are you sure about the 180 degrees? Shouldn't it be 90?

The (+) case mentioned is fixed in VCL06.

The (#) case on Linux needs to be analyzed. Do you know exactly what font was used to display it? Maybe its GSUB table doesn't cover the double-byte U+FF0D case.
Thank you for picking up this item. Vertical text flow is crucial in literary publishing, educational, and governmental documents in Japan.

I checked these issues more comprehensively using 644m4. I attach screenshots and PDF output from MS-Word for reference. I refer to specific letters by their coordinates on the attached screenshots.

>> *: Should be rotated 180 degrees my XP machine.
>Are you sure about the 180 degrees? Shouldn't it be 90?

With 644m4 on my 2K and XP machines, the letters at B-7 don't show up.

>The (+) case mentioned is fixed in VCL06.

When I raised this issue, I didn't know that MS-Word doesn't rotate the letter at D-3. If compatibility with MS-Word is of value, it might be worth rechecking this letter.

>The (#) case on linux needs to analyzed. Do you know what font
>exactly was used to display it? Maybe it's GSUB table doesn't cover
>the double byte - U+FF0D case.

My guess is that the letter captured when a double-byte minus is typed through the input method from the keyboard on Linux is actually not the double-byte minus letter (D-3) of the Shift-JIS code. Please look at the difference at the 4th and 5th letters in:
* double-byte-minus-from-keyboard.sxw
* double-byte-minus-from-shiftjis-file.sxw

[List of Issues]

Both vertical and horizontal flow, both on-screen and PDF-export issue:
Linux: Letters at B-5 and C-0 become garbage.

Both vertical and horizontal flow, PDF-export issue:
Linux: Letter at D-1 in Times font (the second group on writer_rh_vertical.png) appears in a different, short shape.

Vertical-flow on-screen issues:
Win2K/XP: Letters at A-7, B-0,1,4,6,7, D-4, G-1,7, and two other letters (the 3rd and 4th letters in the last line) become garbage, and letters at H-0 and H-1 appear in a slightly different shape.
Linux: Single-byte letters in Times font don't show up at all. (Lucida Sans font -- the 1st group -- shows up.)

Vertical-flow PDF-export issues:
Win2K/XP: Letters at A-7, B-0,1,4,6,7, D-4, G-1,7, H-0,1, and two other letters in the last line become a space or garbage. Letters at B-3 and F-1,2 appear in the wrong direction. The last line goes to the left margin on WinXP.
Linux: Letter in Times at D-4 becomes garbage. Two letters in the middle of the last line disappear.

[Note]
Some letters in the last line appear in different directions, but I think this issue is less serious.

[Archive Contents]
punct_flow.txt: Shift-JIS-code text file
punct_flow_xp.sxw: made on XP
punct_flow_rh.sxw: made on RedHat8.0
writer_{xp/2k/rh}_{vertical/horizontal}.png: screenshots of Writer
msword_{vertical/horizontal}.png: screenshots of MS-Word on Win2K
writer_{xp/2k/rh}.pdf: PDF export from Writer
msword.pdf: PDF output from MS-Word via Acrobat Distiller

XP and 2K use Lucida Sans Unicode for European letters and MS-Mincho for Japanese letters. RH uses Lucida Sans and Times for European letters and Kochi-Mincho for Japanese letters.

double-byte-minus.txt: created with NotePad on Win2K; the 4th and 5th letters are double-byte minus.
double-byte-minus-from-shiftjis-file.sxw: double-byte-minus.txt read on Linux and saved as this file.
double-byte-minus-from-keyboard.sxw: double-byte minus captured from the keyboard on Linux and saved as this file.

I hope this information helps. I would appreciate your effort.
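The guess above, that the same "double-byte minus" can end up as different characters depending on how it entered the document, can be illustrated with a small sketch. It assumes that the double-byte minus and wave in question are the Shift-JIS byte pairs 0x817C and 0x8160, and it uses Python's `shift_jis` and `cp932` codecs only as stand-ins for a JIS-based and a Microsoft-style conversion table; which tables OOo or the Linux input method actually used is not stated in this issue.

```python
# Assumption: these byte pairs are the "double-byte minus" and
# "double-byte wave" discussed in this issue.
SAMPLES = {
    "double-byte minus": b"\x81\x7c",
    "double-byte wave": b"\x81\x60",
}


def decode_both(raw):
    """Decode the same bytes with the JIS-based and the Microsoft mapping."""
    return raw.decode("shift_jis"), raw.decode("cp932")


def report():
    """Return one line per sample showing the code point from each codec."""
    lines = []
    for label, raw in SAMPLES.items():
        jis, ms = decode_both(raw)
        lines.append(
            f"{label}: shift_jis -> U+{ord(jis):04X}, cp932 -> U+{ord(ms):04X}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If the two conversions yield different code points, a font whose vertical substitution covers only one of them would rotate the character in one document and not in the other, which would be consistent with the difference seen between the two .sxw files. This is only one possible explanation.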
Created attachment 5095 [details] Archive of relevant files
Still ongoing..., but have to change the target.
I reviewed this issue with OOo1.1beta2. Please refer to the attached screenshots and PDFs.

[Vertical text on screen]
With Win2K/XP, it has really improved. The only problem I found that still survives is: the Latin comma and period (H-0,1) in vertical flow appear as Japanese punctuation.
With Linux, it has become much worse! Many chars are displaced by some offset. Many chars disappear. (Some chars in the A-H array and the whole list of chars to the left of the A-H array.)

[Vertical text on PDF export]
With Win2K/XP, it has improved. However, brackets, parentheses, and dashes (A-7, B-0,1,3,4,6,7, F-1,2) should be rotated. The Latin comma and period (H-0,1) become garbage. The chars in the list to the left of the A-H array are rotated, but they appear unrotated on screen. Shouldn't they appear the same way on screen and in the PDF? FYI, MS-Word doesn't rotate them.
With Linux, the only problem I noticed is: B-5, C-0, and some chars on the list disappear or become garbage.

[Horizontal text]
On screen with Win2K/XP, I don't see any issue.
For PDF export on Win2K/XP, I noticed several issues, which I reported as issue #15444.
On screen with Linux, it has become worse. On the list of chars below the A-H array, the characters overlap. B-5 and C-0 still become garbage.
In the PDF with Linux, B-5 and C-0 still become garbage.

If these issues are fixed in OOo1.1, I, and probably most Japanese users, would be very happy.

Thanks,
Created attachment 6759 [details] Screenshots and PDF generated with OOo1.1beta2
Issue 15535 fixed a lot of the PDF problems, so retargeting this because of time constraints is reasonable.
Hi HDU,

I would like to mention first that issue #15444 (it could be a duplicate of other issues, I hope) is really serious and must be fixed by OO.o 1.1.

That said, I also wish the milestone for this issue #11579 could be reconsidered. I am really surprised to hear that this important issue (raised 4 months ago) was given lower priority than another, younger issue and may not be fixed by OO.o 1.1.

I don't know how to announce this bad news to the people who have joined the bug-tracking project in Japan. A lot of them will surely be disappointed. I am afraid they may just quit evaluating OO.o. These folks have reported and discussed dozens of issues locally, and I have reported the confirmed issues to IssueZilla. I will be unable to continue reporting issues without their help.

Thanks
Hi, there is good news. This message came from one of the ja members.

----
I confirmed that the bugs also existed in Japanese StarSuite 6.1 beta and RC1, and they have been fixed at the request of Sun Japan. So I think there is no difficulty in integrating the existing StarSuite fixes for the Japanese PDF output problems into OpenOffice.org.
----

Would you please contact the StarOffice development team about this?
Showing off value-added proprietary product over open-source version? I didn't expect IssueZilla is a place for Sun-insider to sell SO/SS.
I compared "OpenOffice.org1.1beta2" with "1.1rc (vcl13 applied)". The problems are the incorrect position of Japanese commas and periods and the 90-degree spinning of brackets and macrons in PDF output.

http://www.transwift.net/ooo/11rc_pdf_vcl13_e.html

Summary:

Problems                  Commas-and-Periods   Brackets     Macrons
1.1beta2 Horizontal       Incorrect            Spinning     Spinning
1.1beta2 Vertical         Correct              Spinning     Correct
1.1rc(vcl13) Horizontal   Correct(+)           Correct(+)   Correct(+)
1.1rc(vcl13) Vertical     Incorrect(-)         Correct(+)   Spinning(-)

(+) Improved  (-) Regression
The above 1.1rc with vcl13 applied uses:
gsl/vcl/source/gdi/outdev3.cxx
gsl/psprint/source/printergfx/glyphset.cxx
gsl/psprint/source/fontmanager/fontmanager.cxx
gsl/psprint/inc/psprint/fontmanager.hxx
Above files from cws_srx645_vcl13 Thanks
I have now found that 1.1rc behaves the same as 1.1rc with vcl13 applied in terms of PDF output. Let me adjust my summary above.

Summary:

Problems            Commas-and-Periods   Brackets     Macrons
1.1beta2 Horizontal Incorrect            Spinning     Spinning
1.1beta2 Vertical   Correct              Spinning     Correct
1.1rc Horizontal    Correct(+)           Correct(+)   Correct(+)
1.1rc Vertical      Incorrect(-)         Correct(+)   Spinning(-)

(+) Improved  (-) Regression
I have been looking into the modules, files, and code concerning the "vert" feature and PDF output on the UNX platform, and I found the mechanism behind those features that I wanted to understand.

OOo1.1beta2 and 1.1rc for LinuxIntel work fine so far in terms of PDF output with the "vert" feature. The problem with PDF output and the "vert" feature only occurs on Win32 platforms.

Regarding Win32 platforms, Herbert Duerr@gsl said:
"On W32 platforms it is assumed that the OS's GDI layer handles these transformations when the vertical writing mode is set for a font."

As far as I have seen, 1.1beta2 and 1.1rc behave differently for PDF output on the same Windows98SE machine of mine.
http://www.transwift.net/ooo/11rc_pdf_e_utf8.html

So I am compelled to think that something changed between 1.1beta2 and 1.1rc. I would like to know what code changed on OOo's side, and how. What mechanism does OOo use to perform PDF export? Is this feature implemented as a printing function, or as an Export Filter ( http://xml.openoffice.org/filter/ )?

Should "component" be changed from l10n to something else? And can this issue be linked or merged to:
http://www.openoffice.org/issues/show_bug.cgi?id=15444

I really want this issue to get fixed in 1.1RC2 for Windows.

Thanks
I propose changing the OS and Version fields of this issue #11579: OS from "All" to "Windows", and Version from "644" to "1.1rc".

Some analysis of these issues has now been provided by an OOo Japanese user who is a Unicode expert. I hope this information helps solve the issues.

PDF Export in Vertical Writing
------------------------------
There are special characters in Japanese that are commonly used in everyday sentences. These characters should be displayed and printed (including PDF export) differently in vertical writing than in horizontal writing. Some of them are listed below:

u3001 - IDEOGRAPHIC COMMA
u3002 - IDEOGRAPHIC FULL STOP
u300C - LEFT CORNER BRACKET
u300D - RIGHT CORNER BRACKET
u30FC - KATAKANA-HIRAGANA PROLONGED SOUND MARK

Displaying and printing these characters in Writer both work without problems, but problems appear when exporting them to PDF.

Current behavior with 1.1rc, locally built July 1, on Windows98SE.
English font: BitstreamVeraSans
Japanese font: MS Mincho

The PDF output shows an unsatisfactory image:
http://www.transwift.net/ooo/11rc_pdf_v.bmp

The same sentences displayed in Writer show the correct image:
http://www.transwift.net/ooo/11rc_Writer_v.bmp

Bugdoc:
http://www.transwift.net/ooo/ja_openoffice_org.sxw

Font file of "MS Mincho":
http://www.transwift.net/ooo/MSMINCHO.TTC

Quick Investigations:
---------------------
1. u30FC (Katakana-Hiragana prolonged sound mark) in the PDF is drawn with a horizontal glyph rotated by 180 (not 90) degrees:
http://www.transwift.net/ooo/u30FC_rotation.bmp
while the one on the display is rendered with a vertical glyph. A vertical glyph should be used for u30FC in vertical writing for PDF export. The horizontal glyph as shown on the display can be found in this snapshot:
http://www.transwift.net/ooo/11rc_Writer_h.bmp

2. Both u300C (Left corner bracket) and u300D (Right corner bracket) in the PDF look good at a glance, but their position is too low. Does this wrong position come from the glyph metrics information?

3. It seems that both u3001 (Ideographic comma) and u3002 (Ideographic full stop) are drawn with horizontal glyphs even though the sentence is in vertical writing. They should be drawn with vertical glyphs, or with metrics for vertical writing.

4. In 1.1beta2, u3001 (Ideographic comma), u3002 (Ideographic full stop), and u30FC (Katakana-Hiragana prolonged sound mark) were drawn properly even in vertical writing. Both u300C (Left corner bracket) and u300D (Right corner bracket) were wrong. In contrast, 1.1rc has a problem with these characters. Here is a snapshot of PDF output with 1.1beta2:
http://www.transwift.net/ooo/11beta2_Writer_v.bmp
Please compare it to the one with 1.1rc:
http://www.transwift.net/ooo/11rc_pdf_v.bmp

5. The Linux version of 1.1rc does not seem to have any problem with PDF export.

Vertical Glyph
---------------
What shall we do if a vertical glyph is not included in a font file? Is it acceptable to do nothing in such a case? I think so for 1.1rc. We can assume that most recent Japanese vector fonts have vertical glyphs. Bitmap fonts don't have such information, but these fonts would rarely be used for publishing. Therefore, PDF export can be implemented simply for 1.1rc.

PDF Export in Horizontal Writing
--------------------------------
http://www.openoffice.org/project/www/issues/show_bug.cgi?id=15444
It seems that these problems have been fixed in 1.1rc.

References
-----------
* CJK Symbols and Punctuation (3000-303F)
http://www.unicode.org/charts/PDF/U3000.pdf
* Katakana (30A0-30FF)
http://www.unicode.org/charts/PDF/U30A0.pdf

Thanks
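As a rough illustration of why the five characters listed above behave differently, the following Python sketch looks up which of them have a dedicated vertical presentation form in the Unicode character database. This is not how OOo or the PDF exporter selects glyphs: fonts normally supply vertical glyphs through their own substitution tables (such as the GSUB table mentioned earlier in this issue). The function name `vertical_presentation_forms` and the scanned range are assumptions made for this example only.

```python
import unicodedata

# The characters named in the analysis above.
CHARS = ["\u3001", "\u3002", "\u300C", "\u300D", "\u30FC"]


def vertical_presentation_forms():
    """Map base characters to their Unicode vertical presentation forms.

    Scans the Vertical Forms and CJK Compatibility Forms blocks and keeps
    entries whose compatibility decomposition is tagged <vertical>.
    """
    table = {}
    for cp in range(0xFE10, 0xFE50):
        decomp = unicodedata.decomposition(chr(cp))
        if not decomp.startswith("<vertical> "):
            continue
        parts = decomp.split()[1:]
        if len(parts) == 1:
            base = chr(int(parts[0], 16))
            # Keep the first form if several map to the same base character.
            table.setdefault(base, chr(cp))
    return table


if __name__ == "__main__":
    forms = vertical_presentation_forms()
    for ch in CHARS:
        name = unicodedata.name(ch)
        if ch in forms:
            print(f"U+{ord(ch):04X} {name}: vertical form U+{ord(forms[ch]):04X}")
        else:
            print(f"U+{ord(ch):04X} {name}: no separate vertical code point")
```

A character with no separate vertical code point, such as the prolonged sound mark, can only be drawn correctly in vertical text if the font provides a vertical glyph or the renderer rotates the horizontal one, which is the case investigation 1 describes.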
*** Issue 15444 has been marked as a duplicate of this issue. ***
HDU->US: similar to task 110440, please use the great bugdocs here for additional testing in CWS RC3VCL.
Changing status to fixed.
HDU->HI: Related to #110440#..., please verify in CWS rc3vcl.
Verified in 645m12_8655 (CWS: rc3vcl) = ok
.
This is great!! A developer in OOo ja project built vcl645mi.dll with vcl/source/gdi/winlayout.cxx under cws_srx645_rc3vcl. And we have tried to use this vcl645mi.dll in 1.1rc2 Japanese snapshot build. Yes, it works!! Is it possible to change the target from 1.1.1 to 1.1rc or 1.1 and merge the winlayout.cxx into cws_srx645_ooo11rc3 if there will be such a cvs tree? Thanks
Hi, developers, I really appreciate your excellent work to address this important issue. Please indicate how to help enabling this fix to come with 1.1 release. I hope Japanese folks are willing to test a snapshot build to help ensure this fix is free from any unwanted side effect. I also thank Hirano-san and other Japanese folks for your proactive commitment to this issue.
Hi, the tree "cws_srx645_ooo11rc3" is open now. Is "CWS rc3vcl" going to be merged into the ooo11rc3 tree? Thanks
should be all ok in the next install version.
Thank you very much, Herbert Duerr! I keep watching Bonsai and waiting for THIS to be merged to cws_srx645_ooo11rc3. Best Wishes
PDF output generated by the recently released 1.1rc2 for Windows on Windows98SE shows problems different from those in previous versions.

Vertical text in Writer is OK:
http://www.transwift.net/ooo/11rc2_sxw_v.bmp

PDF output of the above on Windows98SE shows problems:
http://www.transwift.net/ooo/11rc2_pdf_v.bmp

Thanks
I'm waiting for this to be merged to cws_srx645_ooo11rc3. So please.