Issue 93503 - Calc export of non-gregorian date to Excel loses format
Summary: Calc export of non-gregorian date to Excel loses format
Status: ACCEPTED
Alias: None
Product: Calc
Classification: Application
Component: save-export (show other issues)
Version: OOo 3.2.1
Hardware: All All
: P3 Trivial with 4 votes (vote)
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: ms_interoperability
Depends on:
Blocks: 41707 92549
  Show dependency tree
 
Reported: 2008-09-05 08:36 UTC by leknarm
Modified: 2017-05-20 11:11 UTC (History)
12 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Before save as Microsoft Excel 97/2000/XP (.xls) (296.72 KB, image/bmp)
2008-09-05 08:51 UTC, tantai
no flags Details
After open in Excel (256.77 KB, image/bmp)
2008-09-05 08:53 UTC, tantai
no flags Details
the bug document (73.50 KB, text/plain)
2008-09-05 08:54 UTC, tantai
no flags Details
fix date format export to excel (1.09 KB, patch)
2010-07-16 11:56 UTC, tantai
no flags Details | Diff
fix buddhist calendar export in excel format (2.52 KB, patch)
2010-08-03 04:09 UTC, tantai
no flags Details | Diff
fix excel import and export (3.47 KB, patch)
2010-09-22 07:40 UTC, tantai
no flags Details | Diff
fixed the case that excel export date with locale id [$-xxx0000] (3.57 KB, patch)
2010-10-21 09:53 UTC, tantai
no flags Details | Diff
contain date with locale id 0 ( [$-1070000] ) (23.00 KB, application/vnd.ms-excel)
2010-11-18 08:02 UTC, tantai
no flags Details
updated version (3.82 KB, patch)
2010-12-02 04:26 UTC, tantai
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this issue.
Description leknarm 2008-09-05 08:36:41 UTC
Steps to reproduce:

1. Start OpenOffice.org Calc
2. Tool -> Option -> Language Setting -> Languages and set Locale setting to Thai
3. Type a date (ex. 15/05/2008) in a cell
4. Click the cell and choose Format -> Cells -> Numbers, then changes the 
following 
	3.1) Change Category to Date
	3.2) Change Language to Thai
	        3.3) Change Fomat to 31 ธันวาคม 2542
5. Click OK
6. Save as Microsoft Excel 97/2000/XP (.xls)
7. Exit OpenOffice.org Calc
8. Open the xls file in Microsoft Excel
    Excel will show error dialog box "Some number formats may have been lost."
9. Observe the Date will be formatted as an integer number
Comment 1 tantai 2008-09-05 08:51:47 UTC
Created attachment 56252 [details]
Before save as Microsoft Excel 97/2000/XP (.xls)
Comment 2 tantai 2008-09-05 08:53:10 UTC
Created attachment 56253 [details]
After open in Excel
Comment 3 tantai 2008-09-05 08:54:56 UTC
Created attachment 56254 [details]
the bug document
Comment 4 stefan.baltzer 2009-07-22 14:41:28 UTC
Problem still exists in DEV300_m52 (OOo 3.2 code line). Confirming issue.
Adjusting summary. Old summary "Calc export of Buddhist date to Excel is broken"
could be misinterpreted as "this is a regression"...
Comment 5 tantai 2010-07-16 11:56:21 UTC
Created attachment 70648 [details]
fix date format export to excel
Comment 6 tantai 2010-07-16 11:59:00 UTC
This the patch to fix this issue.

diff -rup OOO320_m19_ori/sc/source/filter/excel/xestyle.cxx
OOO320/sc/source/filter/excel/xestyle.cxx
--- OOO320_m19_ori/sc/source/filter/excel/xestyle.cxx	2010-05-26
23:30:44.000000000 +0700
+++ OOO320/sc/source/filter/excel/xestyle.cxx	2010-07-16 17:50:08.188256700 +0700
@@ -1327,8 +1327,15 @@ String XclExpNumFmtBuffer::GetFormatCode
             }
 
             aFormatStr = pEntry->GetMappedFormatstring( *mpKeywordTable,
*mxFormatter->GetLocaleData() );
+            xub_StrLen pos;
             if( aFormatStr.EqualsAscii( "Standard" ) )
                 aFormatStr.AssignAscii( "General" );
+            else if( ( pos = aFormatStr.SearchAscii("[~buddhist]", 0) ) !=
STRING_NOTFOUND )
+                aFormatStr.ReplaceAscii( pos, 11, "[$-107041E]", 11 );
+            else if( ( pos = aFormatStr.SearchAscii("[~hijri]", 0) ) !=
STRING_NOTFOUND )
+                aFormatStr.ReplaceAscii( pos, 8, "[$-1060401]", 11);
+            else if( ( pos = aFormatStr.SearchAscii("[~persian]", 0) ) !=
STRING_NOTFOUND )
+                aFormatStr.ReplaceAscii( pos, 10, "[$-3060429]", 11 );
         }
     }
     else
Comment 7 niklas.nebel 2010-07-16 13:25:57 UTC
reassigning for Daniel to evaluate the patch
Comment 8 daniel.rentz 2010-07-16 13:32:50 UTC
The patch has a little drawback, it does not respect quoted string inside the
number format, e.g. the format

  "[~buddhist]" 0

gives numbers like

  [~buddhist] 1234

instead of

  1234

in the cell. But I think we can live with that, as correct handling would
require a complete format code parser...
Comment 9 ooo 2010-07-16 14:17:54 UTC
This should be done at the number formatter instead, as also Writer/Word use
those format codes. SvNumberformat::GetMappedFormatstring() in
svl/source/numbers/zformat.cxx would be the right place, and act upon
NF_SYMBOLTYPE_CALENDAR in the switch(pType[j]).

Simply replacing the calendar string with a fixed "[$-...]" notation may not
always work as desired, as the value in that notation depends on the actual
locale selected, not sure what happens in Excel if those don't match. Better
would be to use the locale set at the number format and construct the proper
string. For MS-LCID documentation in this context see for example
http://office.microsoft.com/en-us/excel-help/creating-international-number-formats-HA001034635.aspx#BMcalendartype
However, we may live with fixed values for a first attempt if that sounds too
complicated.
Comment 10 samphan 2010-07-17 08:32:26 UTC
Thank er, we will find a better solution by looking at 
SvNumberformat::GetMappedFormatstring() next.
Comment 11 samphan 2010-07-19 13:53:02 UTC
Accept the issue. Trying to find the best solution for this problem.
Comment 12 tantai 2010-08-03 04:09:49 UTC
Created attachment 70926 [details]
fix buddhist calendar export in excel format
Comment 13 tantai 2010-08-03 04:11:08 UTC
patch to fix export buddhist calendar

diff -rup OOO320m19-ori/svtools/source/numbers/zformat.cxx
OOO320m19/svtools/source/numbers/zformat.cxx
--- OOO320m19-ori/svtools/source/numbers/zformat.cxx	2010-05-26
23:30:44.000000000 +0700
+++ OOO320m19/svtools/source/numbers/zformat.cxx	2010-08-02 17:35:47.000000000 +0700
@@ -4128,14 +4128,6 @@ String SvNumberformat::GetMappedFormatst
         }
 
         const SvNumberNatNum& rNum = NumFor[n].GetNatNum();
-        // The Thai T NatNum modifier during Xcl export.
-        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
-                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
-                MsLangId::getRealLanguage( rNum.GetLang()) ==
-                LANGUAGE_THAI)
-        {
-            aPrefix += 't';     // must be lowercase, otherwise taken as literal
-        }
 
         USHORT nAnz = NumFor[n].GetnAnz();
         if ( nSem && (nAnz || aPrefix.Len()) )
@@ -4195,6 +4187,34 @@ String SvNumberformat::GetMappedFormatst
                                 aStr += '"';
                             }
                             break;
+                        case NF_SYMBOLTYPE_CALDEL :
+                            if ( pStr[j].EqualsAscii("[~") )
+                            {
+                                if ( pStr[j+1].EqualsAscii("buddhist") )
+                                {
+                                    aStr.InsertAscii( "[$-", aStr.Len() );
+                                    if ( rNum.IsSet() && rNum.GetNatNum() == 1 &&
+                                            MsLangId::getRealLanguage(
rNum.GetLang() ) ==
+                                            LANGUAGE_THAI )
+                                    {
+                                        aStr.InsertAscii( "0D07041E", aStr.Len() );
+                                    }
+                                    else
+                                    {
+                                        aStr.InsertAscii( "0107041E", aStr.Len() );
+                                    }
+                                }
+                                else  // ignore other calendar
+                                {
+                                    j++;  // step to skip "]"
+                                }
+                                j++;
+                            }
+                            else
+                            {
+                                aStr += pStr[j];
+                            }
+                        break;
                         default:
                             aStr += pStr[j];
                     }
Comment 14 tantai 2010-08-03 04:11:53 UTC
I fixed this issue, Excel export with buddhist calendar, with this patch.  
 
This patch check "[~buddhist]" and [NatNum1] and convert it into appropriate
MS-LCID eg. "[$-0107041E]". 
 
It works successfully when open the xls file in Excel. Please take a look at the
patch to see whether we're on the right track. 
 
However, when I open the exported xls file with OpenOffice.org, the number
format will be converted to "[$-41E]" by the import filter. However, it will not
convert any native number or calendar options.  
I will fix this import filter issue in another issue report. BTW, can you guide
me where the code is.
Comment 15 ooo 2010-08-03 11:51:28 UTC
@tantai_thanakanok:

Thanks, some nitpicks though:

0. Please always attach diffs as files of type text/patch, pasting diffs to
   the comment field may introduce unintended line breaks and other quirks.

1. The patch removes the Thai T NatNum modifier exported as t to Excel, that
   is unacceptable. This modifier is a specialty of the Thai Excel version we
   import and export.
   Any specific reason you removed that?

2. When encountering NF_SYMBOLTYPE_CALDEL there is no need to additionally
   check for "[~", NF_SYMBOLTYPE_CALDEL is only set for that context.

For import the number format scanner/parser is used, see
svl/source/numbers/zforscan.cxx with the strategic entry point
ImpSvNumberformatScan::ScanFormat()
Be warned, that is hard stuff ;-)

Btw, I suggest you use a recent DEV300 milestone for development, things are
quite changing between releases and the number formatter isn't even in module
svtools anymore but in svl instead.
Comment 16 tantai 2010-09-22 07:40:06 UTC
Created attachment 71799 [details]
fix excel import and export
Comment 17 tantai 2010-09-22 07:41:07 UTC
In the export filter, I remove the code that "check Thai T NatNum modifier and 
export as "t" to Excel" because in Excel, the "t" format will format Thai number 
using native digits correctly in Thai locale only. So I check NatNum specifier 
and export it as LCID to format Thai number shape correctly in all locale. 

In the import filter, I add the check for LCID and convert it into an 
appropriate calendarID and NatNum before checking for symboltype.

I try to add support for other calendarID in the export and import filter. 
However, I have the following problems in the export filter :- 
- for [~hijri], in GetMappedFormatString(), I always get LANGUAGE_ENGLISH_US in 
the cell while actually it should be LANGUAGE_ARABIC_XXXX. So I cannot add the 
right numeral shape code into the LCID.
- for [~persian], in GetMappedFormatString(), the celendarID [~-persion] 
disappear from the format code. I don't know when and why.
Comment 18 daniel.rentz 2010-09-22 08:52:37 UTC
DR->ER: patch changes number formatter, please have a look
Comment 19 tantai 2010-10-21 09:53:03 UTC
Created attachment 72126 [details]
fixed the case that excel export date with locale id [$-xxx0000]
Comment 20 samphan 2010-11-08 08:13:55 UTC
We have built Windows binary with the above patch and tested both internally and
by a few volunteers. The patched binary work correctly.

ER: can you please review the code?
Comment 21 ooo 2010-11-16 00:06:26 UTC
@tantai_thanakanok:

1. What exactly does this change in SvNumberformat::ImpGetLanguageType()
   attempt to fix?

-    return (nNum && (cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
+    return ((cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :


2. I don't like the change made to SvNumberformat::ImpNextSymbol()

+    xub_StrLen pos;
+    if ( ( pos = rString.SearchAscii( "[$-", 0 ) ) != STRING_NOTFOUND )
+    {
+        if ( rString.GetChar(pos+10) == ']' )
+        {
+            if ( rString.GetChar(4) == '0' && rString.GetChar(5) == '7' )
+            {
+                rString.InsertAscii( "[~buddhist]", pos+11 );
+            }
+            if ( rString.GetChar(3) == 'D' )
+            {
+                rString.InsertAscii( "[NatNum1]", pos+11 );
+            }
+        }
+    }

because it tries to search for the "[$-" substring upon each and every call of
the method, completely ignoring the state machine used otherwise in the
method. The case of "[$-" is already detected with case SsGetBracketed and
cToken case '$' and rString.GetChar(nPos) == '-' where eSymbolType
= BRACKET_SYMBOLTYPE_LOCALE is set.

Furthermore, since it always searches from position 0, it detects the
condition multiple times and could insert the new strings each time the
subconditions are met, which currently is not the case because the
LanguageType value is only 16 bit and not a full LCID, hence the result the
calling method reinserts (the scanned value in proper case) is shorter than
the scanned D07xxxx (might in fact be considered a bug) and the problem
doesn't occur.

Anyhow, using the number formatter dialog and setting the Language to Thai and
trying to use the code
[$-D07041E]0
and pressing the green check mark button does not accept the code. If
I understood correctly that should result in
[$-41E][~buddhist][NatNum1]0


3. I still don't agree to the removal or conversion of

-        // The Thai T NatNum modifier during Xcl export.
-        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
-                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
-                MsLangId::getRealLanguage( rNum.GetLang()) ==
-                LANGUAGE_THAI)
-        {
-            aPrefix += 't';     // must be lowercase, otherwise taken as literal
-        }

This condition happens only if either the user did input such a format or,
more likely, the document was imported from a file originating from a Thai
Excel version. For the latter case in round trip scenarios the 't' has to be
preserved for older Excel versions. This was a requirement of the Thai user
community.

A conversion of NatNum==1 in LANGUAGE_THAI _without_ NF_KEY_THAI_T being set
to "[$-D00041E]" may be fine though.
Comment 22 tantai 2010-11-18 07:56:58 UTC
thank you. I'll take a good look at your comments
Comment 23 tantai 2010-11-18 08:02:19 UTC
Created attachment 75035 [details]
contain date with locale id 0 ( [$-1070000] )
Comment 24 tantai 2010-11-18 08:51:00 UTC
Step to create date with locale id 0:
1.) open Microsoft excel
2.) type date eg. 18/11/2010
3.) right click > format cells > Number > Date , choose 14/03/2544
4.) check in category custom, the LCID is [$-1070000]
5.) click ok
6.) save in xls format
7.) open the xls file in OpenOffice.org Calc, the format code have a locale id 0
([$-0])

Comment 25 samphan 2010-11-18 09:18:51 UTC
@er

> 1. What exactly does this change in SvNumberformat::ImpGetLanguageType()
>   attempt to fix?
> -    return (nNum && (cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
> +    return ((cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :

The reason for this fix is to allow nNum == 0, i.e. LocaleId == 0 which mean
"system locale". The above attached file/use-case shows that it is possible.

We've made the fix while also make sure that there's no other side-effect, e.g.
cases from invalid localeid still return the same.
Comment 26 tantai 2010-12-02 04:26:19 UTC
Created attachment 75211 [details]
updated version
Comment 27 samphan 2010-12-02 04:29:39 UTC
> 2. I don't like the change made to SvNumberformat::ImpNextSymbol()
Corrected in the updated version of the patch.

> Anyhow, using the number formatter dialog and setting the Language to Thai and
> trying to use the code
> [$-D07041E]0
> and pressing the green check mark button does not accept the code. If
> I understood correctly that should result in
> [$-41E][~buddhist][NatNum1]0

I guess the example should actually be written as 
[$-D07041E]dd/mm/yy
because the number format '0' doesn't make sense for date.
And the result, after applying the patch, will be
[$-41E][NatNum1][~buddhist]dd/mm/yy

Is this OK?

Since in the current implementation, you can't use the format
[$-41E][~buddhist][NatNum1]dd/mm/yy
at all. The number format dialog will accept the format code
if you enter it manually but after you click the green checkmark 
but it will do nothing.

The sequence "[$-41E][~buddhist][NatNum1]dd/mm/yy" happens
to be illegal in the current implementation. Using the sequence
"[$-41E][NatNum1][~buddhist]dd/mm/yy" will produce the desired
effect. We can fix this but it require a lot more modifications.

> 3. I still don't agree to the removal or conversion of
>-        // The Thai T NatNum modifier during Xcl export.
>-        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
>-                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
>-                MsLangId::getRealLanguage( rNum.GetLang()) ==
>-                LANGUAGE_THAI)
>-        {
>-            aPrefix += 't';
>-        }
 
 The above code is modified as

+        // The Thai T NatNum modifier during Xcl export.
+        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
+                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
+                MsLangId::getRealLanguage( rNum.GetLang()) ==
+                LANGUAGE_THAI && !LCIDInserted )
+        {
+            
+            aStr.InsertAscii( "[$-D00041E]", 0 );
+        }

Which means that the filter will accept the number format "t0" (processed 
elsewhere) from Excel and convert to "[$41E][NatNum1]0" in Calc. 
When convert back to excel it will become "[$-D00041E]0" (not "t0") 
which when import into Calc will become "[$41E][NatNum1]0" again.

The difference is that when export to Excel, the code will no longer generate
"t0" but the more correct (right?) "[$-D00041E]0".
Comment 28 samphan 2011-01-05 04:12:41 UTC
@er could you please review the patch.
Comment 29 Rob Weir 2013-03-11 15:01:22 UTC
I'm adding this comment to all open issues with Issue Type == PATCH.  We have 220 such issues, many of them quite old.  I apologize for that.  

We need your help in prioritizing which patches should be integrated into our next release, Apache OpenOffice 4.0.

If you have submitted a patch and think it is applicable for AOO 4.0, please respond with a comment to let us know.

On the other hand, if the patch is no longer relevant, please let us know that as well.

If you have any general questions or want to discuss this further, please send a note to our dev mailing list:  dev@openoffice.apache.org

Thanks!

-Rob
Comment 30 Marcus 2017-05-20 11:11:12 UTC
Reset assigne to the default "issues@openoffice.apache.org".