Issue 69088 - MML Import of multiple subscripts fails.
Summary: MML Import of multiple subscripts fails.
Status: CONFIRMED
Alias: None
Product: Math
Classification: Application
Component: code
Version: OOo 2.0.3
Hardware: All All
Importance: P3 Trivial with 3 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-30 09:40 UTC by keinstein
Modified: 2013-08-07 14:54 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Same excerpt as file (365 bytes, application/octet-stream)
2006-08-30 12:25 UTC, michael.ruess
Correct MathML code in odt, which has wrong representation in OO (1.38 KB, text/xml)
2009-10-30 22:43 UTC, eugene_b
Formula with double subscripts, corrected in OOMath (1.74 KB, text/plain)
2009-11-13 11:59 UTC, eugene_b
The same file as http://www.openoffice.org/nonav/issues/showattachment.cgi/66094/content_corrected.mml, but with OOMath internal format string deleted (1.59 KB, text/plain)
2009-11-13 12:01 UTC, eugene_b
Small bugdoc to reproduce the import problem (10.70 KB, text/plain)
2009-11-16 14:11 UTC, thomas.lange
MathML file after cleaning up for ODF validation, "annotation" tag is removed (1008 bytes, text/plain)
2009-11-20 07:40 UTC, eugene_b

Description keinstein 2006-08-30 09:40:10 UTC
While translating a LaTeX document to OOo I found that multiple (nested)
indices are not grouped correctly when imported from MML. This can be
reproduced in oomath by deleting the SO5 annotation.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE math:math PUBLIC "-//OpenOffice.org//DTD Modified W3C MathML 1.01//EN"
"math.dtd">
<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
<math:semantics>
<math:msub>
<math:mi>x</math:mi>
<math:msub>
<math:mi>s</math:mi>
<math:mi>j</math:mi>
</math:msub>
</math:msub>
</math:semantics>
</math:math>

should give "x_{s_j}", but produces "x_s_j" which is not displayed as expected.
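
The grouping the report asks for can be sketched outside OOo. The following is
only an illustration in Python, not the OOo importer: the helper names
to_starmath and group are hypothetical, and only mi, mn and msub are handled.
It shows that wrapping any non-token subscript in braces yields the expected
string for the example above.

```python
import xml.etree.ElementTree as ET


def local_name(elem):
    """Return the tag name without its namespace part."""
    return elem.tag.rsplit("}", 1)[-1]


def to_starmath(elem):
    """Convert a tiny MathML subset (mi, mn, msub, containers) to a StarMath-like string."""
    name = local_name(elem)
    if name in ("mi", "mn"):
        return (elem.text or "").strip()
    if name == "msub":
        base, sub = list(elem)
        return to_starmath(base) + "_" + group(sub)
    # Containers such as math, semantics, mrow: join the converted children.
    return " ".join(to_starmath(child) for child in elem)


def group(elem):
    """Wrap anything that is not a single token in braces so nested scripts stay grouped."""
    text = to_starmath(elem)
    return text if local_name(elem) in ("mi", "mn") else "{" + text + "}"


# The example from the description, without the XML declaration and DOCTYPE.
SAMPLE = """<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
<math:semantics>
<math:msub>
<math:mi>x</math:mi>
<math:msub>
<math:mi>s</math:mi>
<math:mi>j</math:mi>
</math:msub>
</math:msub>
</math:semantics>
</math:math>"""

print(to_starmath(ET.fromstring(SAMPLE)))  # x_{s_j}
```

Without the braces added in group, the same input would come out as x_s_j,
which is the string reported above.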
Comment 1 michael.ruess 2006-08-30 12:25:58 UTC
Created attachment 38847 [details]
Same excerpt as file
Comment 2 michael.ruess 2006-08-30 12:38:47 UTC
MRU->TL: the attached file is displayed as expected by the Firefox browser, but
OO Math displays it as an error because it can't interpret something like
x_y_z. Thus it should be imported as x_{y_z}.
Comment 3 eugene_b 2009-10-30 22:43:33 UTC
Created attachment 65809 [details]
Correct MathML code in odt, which has wrong representation in OO
Comment 4 eugene_b 2009-10-30 22:44:46 UTC
I have tested this with OO-3.1.1. Nothing has changed since 2006. This bug
still exists.
I made a simple LaTeX test file:

\documentclass{article}
\begin{document}
\[
        R_{k} = \int_{t_{k}}^{t_{k+1}} y(t)\cdot s_{ref}(t) dt
\]
\end{document}

Having opened the odt file produced by tex4ht, I saw the wrong OOMath formula:
R_k ="∫"_t_k^t_{k + 1} y { \(  t  \)}  cdot s_{r e f} { \(  t  \)} d t

Then I unpacked the odt file and opened test-m2/content.xml, which holds the
MathML of the given formula, in the Opera browser, which displayed it correctly.
So I believe this is an OO bug.

See attachment http://www.openoffice.org/nonav/issues/showattachment.cgi/65809/content.xml
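
For anyone repeating the unpacking step, here is a minimal Python sketch. It
assumes only that an odt file is a zip package in which each embedded formula
has its own folder containing a content.xml (test-m2/content.xml above). The
helper names are hypothetical, and the package is built in memory as a
stand-in for a real document.

```python
import io
import zipfile


def embedded_content_files(odt):
    """List content.xml members that belong to embedded objects such as formulas."""
    with zipfile.ZipFile(odt) as package:
        return [name for name in package.namelist() if name.endswith("/content.xml")]


def read_member(odt, member):
    """Return one package member decoded as UTF-8 text."""
    with zipfile.ZipFile(odt) as package:
        return package.read(member).decode("utf-8")


# Stand-in package built in memory so the sketch runs without a real document.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as package:
    package.writestr("content.xml", "<office:document-content/>")
    package.writestr("test-m2/content.xml", "<math:math/>")

for name in embedded_content_files(demo):
    print(name, "->", read_member(demo, name))
```

With a real file, pass its path instead of the in-memory buffer.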
Comment 5 thomas.lange 2009-11-03 09:22:40 UTC
tl->eugene_b: have a look at the target of the issue; it means the issue is not
even targeted for a planned release. This is usually because of the large number
of open issues and limited developer resources. And compared to others, this one
seemed to be of lesser importance when it was last evaluated.

In general: to get a better target you need to either raise the issue by
discussing it with QA or product management or, if you are capable, provide a
patch for the problem. Patches always have high priority and will be
integrated quickly if they are correct and don't introduce other problems.
Comment 6 eugene_b 2009-11-03 19:04:51 UTC
tl, I'm working on this problem; it would be faster to solve it myself. If 
I find a solution, I'll send a patch. But I am new to OpenOffice. It is a 
quite large program and it will take a lot of time to understand the sources. 
For now I think the cause is somewhere in the build/starmath/mathml.cxx file.
Comment 7 t3 2009-11-04 11:43:03 UTC
Eugene_b, I hope you will manage to fix this long standing bug, because it 
makes OO useless for interchange of mathematical texts. Too bad that OO 
developers don't see that as a higher priority problem. I suspect that fixing 
it wouldn't take more than an hour for someone familiar with the OO code base.

Note also that starmath/mathml.cxx file is no longer in the trunk. The one to 
look at now (I think) is starmath/source/mathmlimport.cxx.
Comment 8 eugene_b 2009-11-04 21:56:54 UTC
t3, thank you for the notice. I didn't use trunk; I investigated the stable 
sources which my Gentoo system downloaded through its portage system. I want to 
tackle this problem seriously, so I need to get the latest sources from SVN. 
Unfortunately I'll have no time to dig into the sources for the next several 
days.

The bug definitely seems easy to fix, but unfortunately I'm unfamiliar 
with the OO sources and exploring them will also take some time. It seems 
nobody cares about this bug...
Comment 9 eugene_b 2009-11-12 16:33:42 UTC
I have to conclude that the SUN odf plugin 3.1 from
http://www.sun.com/software/star/odf_plugin/get.jsp has exactly the same bug.
Evidently it shares the same buggy code with OOo.
Comment 10 thomas.lange 2009-11-13 08:02:09 UTC
About the SUN odf plugin: more likely it just calls the OOo code to get things done.
Comment 11 eugene_b 2009-11-13 10:53:52 UTC
I installed the Odf plugin on a system where no OpenOffice was installed. So
this code is included in the Odf plugin itself.
Comment 12 eugene_b 2009-11-13 11:59:44 UTC
Created attachment 66094 [details]
Formula with double subscripts, corrected in OOMath
Comment 13 eugene_b 2009-11-13 12:01:41 UTC
Created attachment 66095 [details]
The same file as http://www.openoffice.org/nonav/issues/showattachment.cgi/66094/content_corrected.mml, but with OOMath internal format string deleted
Comment 14 eugene_b 2009-11-13 12:18:52 UTC
Some observations on how OOMath handles math formulas.

Two things stand out:
1) OOMath incorrectly interprets MathML code with double subscripts.
2) A formula with double subscripts created in OOMath can be saved as MathML 
and reopened correctly.
How can both be true?

The answer: OOMath additionally saves the formula in its internal format! 

Look at attachment 66094
(http://www.openoffice.org/nonav/issues/showattachment.cgi/66094/content_corrected.mml).
It is the formula with double subscripts, corrected and saved in OOMath. You
can open it in OOMath and it is displayed correctly. I checked the MathML in
this file (and reindented it for readability): the MathML is correct. But the
file has an additional string:
<math:annotation math:encoding="StarMath 5.0">R_k =&quot;∫&quot;_{t_k}^{t_{k + 1}} y { \(  t  \)}  cdot s_{r e f} { \(  t  \)} d t</math:annotation>
This is the formula in the OOMath format.
After deleting this string I obtained the file in attachment 66095
(http://www.openoffice.org/nonav/issues/showattachment.cgi/66095/content_wo_oomath.mml).
OOMath can't open it correctly!

So, some conclusions about how OOMath handles formulas:
1) OOMath saves correct MathML code.
2) Along with the MathML, it saves an additional string with the formula in 
its own format.
3) When opening a document, OOMath looks for the string in its own format. If 
it finds one, OOMath uses it and completely ignores the MathML code.
4) If the file has no string in OOMath's format, OOMath has to use the MathML 
code, which it can't interpret correctly.
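
To force OOMath onto the MathML path, the StarMath annotation has to be removed
from the file. A minimal Python sketch of that step follows. The function name
is hypothetical, and it assumes the annotation is marked with an encoding
attribute starting with "StarMath", written with or without the math: prefix,
as in the attachment quoted above.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def strip_starmath_annotation(root):
    """Remove StarMath annotation elements in place; return how many were removed."""
    removed = 0
    # Take a snapshot of the elements first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "{%s}annotation" % MATHML_NS:
                continue
            # The encoding attribute may be written with or without the math: prefix.
            encoding = child.get("{%s}encoding" % MATHML_NS) or child.get("encoding") or ""
            if encoding.startswith("StarMath"):
                parent.remove(child)
                removed += 1
    return removed


SAMPLE = """<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
<math:semantics>
<math:msub><math:mi>R</math:mi><math:mi>k</math:mi></math:msub>
<math:annotation math:encoding="StarMath 5.0">R_k</math:annotation>
</math:semantics>
</math:math>"""

root = ET.fromstring(SAMPLE)
print(strip_starmath_annotation(root), "annotation(s) removed")
```

A file treated this way makes the importer read the MathML itself, which is
the situation in which the double-subscript problem shows up.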


Comment 15 thomas.lange 2009-11-16 13:49:35 UTC
tl->eugene_b: Please try again once your content_wo_oomath.mml is a valid MathML
file. 

I created a text document with a single formula and replaced the text in the
formula's content.xml file with the one from content_wo_oomath.mml attached
here. Then I created a new odt from that by zipping all the files and folders.
After that I ran it through the ODF validator; it failed and I got this output:

This file is NOT valid

Result details:

upload:///BBBneu.odt/Object 1//Object
1/content.xml[15,48]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[38,60]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[53,52]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[59,52]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[63,48]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[84,48]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[90,48]:Error:cvc-complex-type.3.2.2: Attribute 'math:stretchy' is
not allowed to appear in element 'math:mo'.
upload:///BBBneu.odt/Object 1//Object
1/content.xml[101,22]:Error:cvc-complex-type.2.4.b: The content of element
'math:semantics' is not complete. One of
'{"http://www.w3.org/1998/Math/MathML":annotation,
"http://www.w3.org/1998/Math/MathML":annotation-xml}' is expected.
upload:///BBBneu.odt:Info:validation errors found
upload:///BBBneu.odt:Info:Generator: StarOffice/9$Win32
OpenOffice.org_project/300m64$Build-9446$CWS-tl76

That is, there are two types of errors: the one with the stretchy attribute in
'mo' tags and the one with the 'semantics' tag. Thus you need to fix those
errors first before we can see if there is a problem with the MathML import.
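
One way to clear the first class of error in a test document is to drop the
stretchy attribute from every mo element. The Python sketch below does that.
It is only an illustration with a hypothetical function name, and it assumes
that losing the attribute is acceptable for the purpose of reproducing the
import problem.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def drop_stretchy(root):
    """Delete stretchy attributes from all mo elements; return how many were deleted."""
    deleted = 0
    for mo in root.iter("{%s}mo" % MATHML_NS):
        # The attribute may be written with or without the math: prefix.
        for key in ("{%s}stretchy" % MATHML_NS, "stretchy"):
            if key in mo.attrib:
                del mo.attrib[key]
                deleted += 1
    return deleted


SAMPLE = """<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
<math:mrow>
<math:mi>R</math:mi>
<math:mo math:stretchy="false">=</math:mo>
<math:mi>y</math:mi>
</math:mrow>
</math:math>"""

root = ET.fromstring(SAMPLE)
print(drop_stretchy(root), "attribute(s) deleted")
print(ET.tostring(root, encoding="unicode"))
```

The second error type (the incomplete 'semantics' element) is not touched by
this sketch.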

The ODF validator can be found here
http://tools.services.openoffice.org/odfvalidator/

Comment 16 thomas.lange 2009-11-16 14:11:30 UTC
Created attachment 66139 [details]
Small bugdoc to reproduce the import problem
Comment 17 thomas.lange 2009-11-16 14:14:39 UTC
tl->eugene_b: to make things a bit easier for you I created a small bugdoc
without the StarMath annotation tag to reproduce the problem.
(Don't worry about the wrong replacement image; it will get fixed once you
activate the formula.)

If your fix solves the import problem then you should be able to open the
document, double-click the formula and have it displayed correctly.
Comment 18 eugene_b 2009-11-20 07:40:03 UTC
Created attachment 66205 [details]
MathML file after cleaning up for ODF validation, "annotation" tag is removed
Comment 19 thomas.lange 2009-11-20 08:08:18 UTC
tl->eugene_b: OK, as far as I can see that one can be imported now. However, in
my unfixed version it is of course imported incorrectly.
But is that one now working with your fix?
Comment 20 eugene_b 2009-11-20 08:23:22 UTC
This file was taken from OOo (I corrected and resaved the file prepared with 
Tex4ht). OK, I manually cleaned up all errors found by the ODF validator and the 
file passed the check.

After that, I removed the "annotation" tag from the MathML file (attachment 
66205, http://www.openoffice.org/nonav/issues/showattachment.cgi/66205/content_valid_wo_annotate.mml).
The ODF validator reported this as an error:

upload:///test2.odt/test-m2//test-m2/content.xml[2,953]:Error:cvc-complex-
type.2.4.b: The content of element 'math:semantics' is not complete. One of 
'{"http://www.w3.org/1998/Math/MathML":annotation, "http://www.w3.org/1998/Math/
MathML":annotation-xml}' is expected.

So, the ODF validator requires an "annotation" tag. Next, look at the MathML
page in Wikipedia (http://en.wikipedia.org/wiki/MathML). There is an example of
MathML code with two "annotation" tags, one in TeX format, the other in
"StarMath" format:
  <annotation encoding="TeX">
     x=\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
  </annotation>
  <annotation encoding="StarMath 5.0">
     x={-b plusminus sqrt {b^2 - 4 ac}} over {2 a}
  </annotation>
Software should distinguish the annotation formats by the "encoding" attribute. 
But the ODF validator claims the "encoding" attribute is incorrect!

upload:///test1.odt/test-m2//test-m2/content.xml[2,1136]:Error:cvc-complex-
type.3.2.2: Attribute 'math:encoding' is not allowed to appear in element 
'math:annotation'.

Next, I placed the annotation in TeX format into the MathML file and removed the 
"encoding" attribute (which is ignored by the ODF validator and OOo). The ODF
validator detects this:

upload:///test2.odt/test-m2//test-m2/content.xml[2,948]:Error:cvc-complex-
type.2.4.a: Invalid content was found starting with element 'annotation'. One 
of '{"http://www.w3.org/1998/Math/MathML":annotation, "http://www.w3.org/1998/
Math/MathML":annotation-xml}' is expected.

The result: the ODF validator requires MathML embedded in ODF to have an
"annotation" tag, and it must be exactly in StarMath format. Otherwise the 
file is reported as incorrect. But the Open Document Format itself is a 
standard independent of any particular software such as StarMath. Unfortunately 
its current version lacks a description of the formula format. Maybe the root 
of this problem lies here. I haven't found any mention of "StarMath 5.0" in the 
ODF standard or in http://www.w3.org/1998/Math/MathML.

So for now, the ODF validator http://tools.services.openoffice.org/odfvalidator/ 
checks files not for compliance with the Open Document Format standard, but for 
compatibility with OpenOffice. It's the OOo validator.

By the way, OOMath incorrectly displays the MathML formulas from official 
MathML testsuite http://www.w3.org/Math/testsuite/
Comment 21 eugene_b 2009-11-20 08:53:36 UTC
I tried another ODF validator: http://opendocumentfellowship.com/validator
It found other errors! But it did not find the errors the Sun ODF validator 
found previously. In particular, it allows the "annotation" tag to have an
"encoding" attribute and it allows the MathML to have no "annotation" tag at
all.

I haven't found a mandatory requirement for MathML to have an "annotation" tag 
either in the Open Document Format standard (ISO/IEC 26300) or in the MathML 2.0 
description (W3C Recommendation, 21 October 2003,
http://www.w3.org/TR/2003/REC-MathML2-20031021/).
Comment 22 thomas.lange 2009-11-20 09:42:50 UTC
tl->eugene_b:
I agree that according to 
  http://www.w3.org/TR/MathML2/chapter4.html#contm.semantics
an annotation tag should not be required and therefore our ODF validator might
have a slight problem here. (I have forwarded a note about this to someone else.)

However, since the corrected MathML can still be imported despite that, let's
drop this particular item. (My attached sample doc also does not have that tag.)

And no, w3.org not listing a "StarMath 5.0" annotation is not a problem at all. 
w3.org is just listing some examples of known existing tags. But that does not
mean there is a predefined and fixed set from which annotations must be chosen.

But I'm still curious about the effect of your fix, did it solve the problem?
If yes it would seem we have a patch. If that is the case please add it to this 
issue, change the type to 'PATCH' and assign the issue to me.