Apache OpenOffice (AOO) Bugzilla – Issue 9391
DXF import filter doesn't correctly display Japanese text.
Last modified: 2004-02-27 13:37:35 UTC
Problem: Japanese characters are not displayed correctly when importing a DXF file.
Reproduction:
1. Prepare OOo 1.0.1 Japanese.
2. Open the bugdoc OO-Draw.dxf through the DXF import filter.
3. The converted document looks like OO-Draw-DXF.png, whereas the original image is OO-Draw.png.
This problem was originally reported by Mr. Miyazaki, a member of OpenOffice.org Users Group Japan http://blow-away.net/openoffice/
The bugdoc and snapshots will be attached soon.
Created attachment 3690 [details] A bugdoc prepared with JW_CAD http://www.jwcad.net/ ; MIME type is image/vnd.dxf
Created attachment 3691 [details] An original image on JW_CAD
Created attachment 3692 [details] Converted image on OpenOffice.org
Tora->sba: The files have been attached. The DXF file is encoded in Shift_JIS.
Regarding the HEADER section of the DXF file format: http://www.autodesk.com/techpubs/autocad/acad2000/dxf/header_section_group_codes_dxf_02.htm
$DWGCODEPAGE can specify the encoding in which the DXF file is written. In this case, the bugdoc does not contain such a definition. However, adding either of the following to the HEADER section of the bugdoc does not help.
---
$DWGCODEPAGE
3
ANSI_932
9
----
or
----
$DWGCODEPAGE
3
DOC932
9
----
SBA: DXF is a draw file format. Changed Component accordingly. Reassigned to Wolfram.
SBA: Prio changed to 3 (no crash).
Set to new.
Reproducible in 1.0.1 and in a current internal version. Reassigned to Sven. Please have a look.
accepted
changed target
.
Tora: How can we, members of ja.ooo, help you?
Could you provide some diffs or something similar that we could test ourselves?
Hi, the bugfix itself should not be so difficult, but I currently do not have enough time for it. It would be cool if anybody could provide a patch; then it will make it into the next version. The dxf source code is located at: graphics/goodies/source/filter.vcl/idxf/* Every diff is appreciated, Sven
That is a good idea!
Could you show us a direction? Should we replace "char" with a Unicode-related type such as "OUString" in dxfentrd.hxx, or just apply rtl_convertTextToUnicode() in the method EvaluateGroup(DXFGroupReader & rDGR)?
---
class DXFTextEntity : public DXFBasicEntity {
public:
    char sText[DXF_MAX_STRING_LEN+1]; // 1
---
Replacing the encoding, as below, has given us a quick solution for a Japanese DXF file encoded in Shift_JIS. In goodies/source/filter.vcl/idxf/dxf2mtf.cxx, replace
String aUString( aStr, RTL_TEXTENCODING_IBM_437 );
with
String aUString( aStr, RTL_TEXTENCODING_SHIFT_JIS );
For a general implementation, we should recognize the keyword "$DWGCODEPAGE" in the DXF file and choose an appropriate encoding. There may be another issue: which actual font is used, in an individual language version, for the font family "FAMILY_SWISS" that is specified in the source code? StarSuite 7 Solaris (ja) plus the quick patch can solve this, whereas OOo 1.1 RC5 Solaris (en) cannot.
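To illustrate why the choice of encoding matters here, the following is a minimal sketch (not code from the filter) that counts characters in a Shift_JIS byte string. The function names are hypothetical. It relies on the Shift_JIS rule that bytes in the ranges 0x81 to 0x9F and 0xE0 to 0xFC start a two-byte character; a single-byte decoding such as IBM 437 would instead turn every byte into its own glyph.

```cpp
#include <cstddef>
#include <string>

// True if b is the first byte of a two-byte Shift_JIS character.
bool isShiftJisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Counts characters in a Shift_JIS byte string (hypothetical helper).
// A lead byte and the byte that follows it are counted as one character.
std::size_t countShiftJisChars(const std::string& bytes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (isShiftJisLeadByte(b) && i + 1 < bytes.size()) {
            ++i;  // skip the trail byte of the two-byte character
        }
        ++count;
    }
    return count;
}
```

For example, the four bytes 0x93 0xFA 0x96 0x7B are two Japanese characters in Shift_JIS, but a single-byte code page would render them as four unrelated glyphs, which matches the kind of garbling reported in this issue.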
Sorry for the long delay, I was too busy. Looking for "$DWGCODEPAGE" would be the solution, but there is another problem: we need a conversion table from DXF codepage names to our RTL_TEXTENCODING values (rtl/textenc.h). Currently I know that there might exist an "ANSI_932", which is most likely to be mapped to RTL_TEXTENCODING_MS_932, but I need the complete set of text encodings used within DXF so that we can fix this bug for the other languages as well. I have not been able to find the complete DXF text encoding table yet. I see that each font in our DXF filter gets FAMILY_SWISS and the FontName is not used; I think the default font depends on the platform in use. I don't know whether DXF supports font name and font family; if it does, then perhaps our filter needs to be improved.
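The conversion table described above could be sketched roughly as follows. This is only an illustration: the enum is a hypothetical stand-in for the rtl_TextEncoding constants in rtl/textenc.h, the function name is invented, and the table contains only the two codepage names that come up in this issue, since the complete DXF list is not known here.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the RTL_TEXTENCODING_* constants (rtl/textenc.h).
enum class TextEncoding { IBM_437, MS_932 };

// Maps a $DWGCODEPAGE value to an encoding. Unknown names yield the fallback,
// which defaults to the encoding the filter used before (IBM 437).
TextEncoding encodingForCodePage(const std::string& codePage,
                                 TextEncoding fallback = TextEncoding::IBM_437) {
    // Deliberately incomplete: only the names mentioned in this issue.
    static const std::unordered_map<std::string, TextEncoding> table = {
        {"ANSI_932", TextEncoding::MS_932},
        {"DOS932", TextEncoding::MS_932},
    };
    const auto it = table.find(codePage);
    return it != table.end() ? it->second : fallback;
}
```

Keeping the table as data rather than a chain of comparisons would make it easy to add further codepage names once a complete list is available.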
Thank you for your response. [dev] How to obtain locale information? http://www.openoffice.org/servlets/BrowseList?listName=dev&by=thread&from=44513 For FAMILY_SWISS, there might be nothing to do, since the Japanese community has confirmed that this quick patch works with OOo 1.1 (ja) on both Windows and Linux.
"According to the OpenOffice.org roadmap (http://tools.openoffice.org/releases) this issue was retargeted to OOo Later."
Created attachment 10625 [details] a patch to solve this problem (partially)
A patch for this problem is supplied. --maho
This patch solves this issue for Windows.
Hi, I have attached a very, very untested and tentative patch to handle this by parsing $DWGCODEPAGE and just special-casing the instance of ANSI_932 or DOC932. Will someone who has access to many dxf files, and who can manually add $DWGCODEPAGE 3 ANSI_932 9 to the header section of a file, please test the attached patch to make sure it works? (All I have tested is that it builds.) This is probably close to working (I hope). Kevin
Created attachment 10659 [details] untested patch that attempts to parse $DWGCODEPAGE
Hi, Please ignore my last patch, it had a bug in it. I have now fixed that I think. Please test out dxf_fix_rev2.patch (note: it has some debug output to stderr that should be removed after verifying it works). Kevin
Created attachment 10661 [details] a revised patch that seems to work on my machine
Adding myself to CC on this
Created attachment 10663 [details] A perl script to insert $DWGCODEPAGE in DXF file.
Created attachment 10664 [details] A DXF sample file that includes $DWGCODEPAGE = DOS932, prepared by the tool above.
Hi Kevin, Thank you for the patch. Could you modify it a little?
1. Replace "DOC932" with "DOS932".
2. Substitute "_SHIFT_JIS" with "_MS_932", like:
if (strcmp(rDGR.GetS(),"ANSI_932")==0 || strcmp(rDGR.GetS(),"DOS932")==0) {
    setTextEncoding(RTL_TEXTENCODING_MS_932);
Tora
I was mistaken in the earlier part of this issue. The correct one is:
---
9
$DWGCODEPAGE
3
ANSI_932
----
or
----
9
$DWGCODEPAGE
3
DOS932
----
Further information: http://www.autodesk.com/techpubs/autocad/acad2000/dxf/header_group_codes_in_dxf_files_dxf_aa.htm
In addition, ANSI_932 is used for AutoCAD R14 or later; DOS932 is for R13 or earlier.
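Reading the variable in the corrected layout (group code 9 with the variable name, then group code 3 with its value) can be sketched as below. This is a simplified, standalone illustration, not the code of the attached patches: the function names are hypothetical, and it assumes the file is a plain sequence of "group code line, value line" pairs with HEADER as the first section.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Strips leading and trailing spaces, tabs, CR and LF.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Returns the value of $DWGCODEPAGE, or an empty string if it is not found
// before the end of the first section (hypothetical helper).
std::string findDwgCodePage(std::istream& in) {
    std::string code, value;
    bool sawVariable = false;
    while (std::getline(in, code) && std::getline(in, value)) {
        code = trim(code);
        value = trim(value);
        if (sawVariable) {
            // The pair right after the variable name should be group code 3.
            return code == "3" ? value : "";
        }
        if (code == "0" && value == "ENDSEC") break;  // first section ended
        if (code == "9" && value == "$DWGCODEPAGE") sawVariable = true;
    }
    return "";
}

// Convenience wrapper for text that is already in memory.
std::string findDwgCodePageInText(const std::string& text) {
    std::istringstream in(text);
    return findDwgCodePage(in);
}
```

An empty result corresponds to the situation of the original bugdoc, which carries no $DWGCODEPAGE at all, so a caller would still need some default encoding.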
Hi Tora, Thanks for testing it. Please create an improved patch with your indicated changes and attach it to this issue. Then after it has been tested and verified by others we can try to get this retargeted to OOo 1.1.1 fix1. Kevin
Created attachment 10696 [details] Patch updated, verified with SS6.1 Beta2 Solaris (ja)
The patch idxf_fix_rev3.patch.txt works for a Japanese DXF file that contains the variable $DWGCODEPAGE. For a file without it, a tool that inserts the variable should be applied to the DXF file before importing it.
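The kind of preprocessing mentioned above can be sketched as follows. This is not the Perl script attached to this issue, only a rough illustration with hypothetical function names. It assumes LF line endings, an existing HEADER section, and a file that does not already define $DWGCODEPAGE; if no HEADER section is found, the text is returned without the variable.

```cpp
#include <sstream>
#include <string>

// Strips leading and trailing spaces and tabs.
static std::string strip(const std::string& s) {
    const std::string::size_type b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Inserts "9 / $DWGCODEPAGE / 3 / <codePage>" right after the pair
// "2 / HEADER" that names the HEADER section (hypothetical helper).
std::string insertDwgCodePage(const std::string& dxf, const std::string& codePage) {
    std::istringstream in(dxf);
    std::ostringstream out;
    std::string line, prev;
    bool inserted = false;
    while (std::getline(in, line)) {
        out << line << '\n';
        if (!inserted && strip(prev) == "2" && strip(line) == "HEADER") {
            out << "  9\n$DWGCODEPAGE\n  3\n" << codePage << '\n';
            inserted = true;
        }
        prev = line;
    }
    return out.str();
}
```

Calling insertDwgCodePage(text, "ANSI_932") on a Shift_JIS file would give the patched filter the hint it needs; the bytes of the drawing itself are left untouched.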
reset target milestone, reassign
Hi, The cleaned-up patch (white space changes and debug print statements removed) has been committed to cws_src645_ooo111fix1, so I am resolving this as fixed. Please verify and close once 1.1.1 becomes available. Thanks, Kevin
Please revert or fix immediately. On Windows:
guw.pl /cygdrive/c/Progra~1/Micros~3/VC98/Bin/cl.exe @/tmp/mkb01152
Command: /cygdrive/c/Progra~1/Micros~3/VC98/Bin/cl.exe
dxf2mtf.cxx
c:\OOo\pavel\BuildDir\ooo_1.1.0_src\goodies\source\filter.vcl\idxf\dxf2mtf.cxx(436) : error C2662: 'getTextEncoding' : cannot convert 'this' pointer from 'const class DXFRepresentation' to 'class DXFRepresentation &' Conversion loses qualifiers
c:\OOo\pavel\BuildDir\ooo_1.1.0_src\goodies\source\filter.vcl\idxf\dxf2mtf.cxx(495) : error C2662: 'getTextEncoding' : cannot convert 'this' pointer from 'const class DXFRepresentation' to 'class DXFRepresentation &' Conversion loses qualifiers
dmake: Error code 2, while making '../../../wntmsci9.pro/slo/dxf2mtf.obj'
Hi Pavel, That is my change, so the fault is mine. The problem is that I do not understand why this causes a problem under Windows at all. The class DXF2GDIMetaFile has a public member declared as follows:
const DXFRepresentation * pDXF;
We use that to access the DXFRepresentation object's getTextEncoding method as follows:
{ String aUString( aStr, pDXF->getTextEncoding() );
So we are using a pointer to access the object's method. This seems to work fine under Linux. Why does the Windows compiler try to convert that pointer at all?
error C2662: 'getTextEncoding' : cannot convert 'this' pointer from 'const class DXFRepresentation' to 'class DXFRepresentation &' Conversion loses qualifiers
What am I missing here? Does the "const" somehow prevent the object pointer from being used the way it can normally be used? Any fix would be a complete guess, since I do not understand why the error is occurring. Would you please try removing the const from the definition of that member to see if that matters? Some hint at what the compiler is trying to do here, and why it can't do it, would be nice. Thanks, Kevin
Created attachment 10837 [details] Fixed by following neighbor classes. Verified on Solaris.
Hi Pavel, Using -Wall under Linux, I got a message about dropping const when passed as "this", which may be related to what you are seeing. I changed the dxfreprd.hxx header file to define getTextEncoding outside of the class declaration and added const to it (since all getTextEncoding does is read and never write). This removed the dropping "const" message from -Wall under Linux. I hope it also fixes or prevents the funny conversion that Windows is trying to do here. I have committed that change to fix1. Please give it a try and let me know if it does the trick. Kevin
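The underlying C++ rule is that, when a member function is called through a pointer to a const object, the function's implicit "this" is also const, so only member functions declared const may be called. The sketch below shows the shape of the fix with simplified, hypothetical stand-ins for the classes in this issue (the real ones are DXFRepresentation and rtl_TextEncoding); it is not the committed code.

```cpp
// Hypothetical stand-in for rtl_TextEncoding.
typedef short TextEncodingId;

// Simplified stand-in for DXFRepresentation.
class Representation {
public:
    Representation() : mEncoding(0) {}
    void setTextEncoding(TextEncodingId e) { mEncoding = e; }
    // Declared const: it only reads, so it may be called on const objects.
    // Without this "const", the call in encodingOf() below does not compile.
    TextEncodingId getTextEncoding() const;
private:
    TextEncodingId mEncoding;
};

// Defined outside the class declaration, again with const.
inline TextEncodingId Representation::getTextEncoding() const {
    return mEncoding;
}

// Mirrors the call site: the object is reached through a pointer to const.
TextEncodingId encodingOf(const Representation* pDXF) {
    return pDXF->getTextEncoding();
}
```

Removing const from the pointer would also silence the error, but marking the read-only accessor const keeps the pointer's promise not to modify the object.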
Hi, Are the neighbor classes actually needed to make this compile under Windows? They should not be, as long as we add const to the getTextEncoding method and move its definition outside of the class itself. Will you please check out and try what I just committed? If it does not build under Windows, we will go with your neighbor class solution. Thanks, Kevin
I would like to recommend adopting the new patch idxf_fix_rev4.patch.txt, since it follows the existing programming style, which has caused no problems for a long time. In addition, it does not include unnecessary lines and has correct indentation. I have verified this new patch with SS6.1 Beta2 on Solaris.
Hi, Thanks for your new patch. The problem has already been fixed in OOo 1.1.1 fix1, and that fix should appear in anonymous cvs soon. The original patch (rev3) was modified before commit to remove all white space changes and all of the debug print statements, and in general to clean it up. As it turns out, the error under Windows was just related to the missing "const" in the definition of getTextEncoding, and neighbor classes should not be needed here (especially since rtl_TextEncoding is in reality just a "short int"!). Please give the version now in cws_srx645_ooo111fix1 a look. If you are still unhappy with that change, please submit a new patch to the owner of that issue to improve it. Thanks, I am re-marking this as resolved-fixed. Please verify when OOo 1.1.1 is released. Kevin
Thanks Kevin, build on Windows fixed.
Thanks Kevin, you are on the right track. Tora
close issue