Issue 126720 - no text imported from XLSX (xl/SharedStrings.xml instead of xl/sharedStrings.xml)
Status: RESOLVED FIXED
Alias: None
Product: Calc
Classification: Application
Component: open-import
Version: 4.1.2
Hardware: All All
Importance: P3 Normal
Target Milestone: 4.1.14
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: ms_interoperability
Duplicates: 127086
Depends on:
Blocks:
 
Reported: 2015-12-03 20:06 UTC by tananser
Modified: 2023-01-07 17:23 UTC
CC List: 4 users

See Also:
Issue Type: DEFECT
Latest Confirmation in: 4.2.0-dev
Developer Difficulty: ---


Attachments
two files with different results of import (25.10 KB, application/zip)
2015-12-03 20:06 UTC, tananser

Description tananser 2015-12-03 20:06:11 UTC
Created attachment 85192
two files with different results of import

I have a lot of files created in MS Excel 2007 and saved as .xlsx. When I open them with OpenOffice 4.1.2 Calc, the cells contain no text, although the numbers are there. I opened one such file with MS Excel and saved it as .xls, and Calc then opened it correctly. Both the xlsx and the xls files are attached.
Comment 1 Keith N. McKenna 2015-12-04 02:11:39 UTC
I have confirmed that AOO 4.1.2 does not import the text values from the xlsx file. The xls file opens correctly in AOO 4.1.2. The xlsx file opens in the Excel Viewer with all the text entries shown.

System Configuration:
Processor: Intel Core i5 CPU M560 @2.67GHz
Installed Memory: 2.00 GB (1.6 usable)
Operating System: Windows 7 Home Premium 64 bit

Apache Open Office:
AOO412m3(Build:9782)  -  Rev. 1709696
2015-10-21 09:53:29 (Mi, 21 Okt 2015)
Language: en_US
Additional Language Packs: None
Comment 2 damjan 2023-01-03 17:02:36 UTC
Same issue as 127086, one of the files is named:
xl/SharedStrings.xml
instead of:
xl/sharedStrings.xml

If you rename the xlsx to a zip file, unzip it, change that filename, zip it back up, and rename it back to xlsx, it opens perfectly, with all the text visible.

We should treat OOXML filenames case-insensitively, like Excel and LibreOffice do.
Comment 3 damjan 2023-01-03 17:03:41 UTC
*** Issue 127086 has been marked as a duplicate of this issue. ***
Comment 4 damjan 2023-01-06 09:06:11 UTC
Where in the code does this problem occur, and how can we fix it?

main/oox/source/xls/workbookfragment.cxx does this:

---snip---
    // read the shared string table substream (requires finalized styles buffer)
    OUString aSstFragmentPath = getFragmentPathFromFirstType( CREATE_OFFICEDOC_RELATION_TYPE( "sharedStrings" ) );
    if( aSstFragmentPath.getLength() > 0 )
        importOoxFragment( new SharedStringsFragment( *this, aSstFragmentPath ) );
---snip---

Debugging that code:

Thread 1 hit Breakpoint 1, oox::xls::WorkbookFragment::finalizeImport (this=0x80dc9ed00) at source/xls/workbookfragment.cxx:208
208	    if( aSstFragmentPath.getLength() > 0 )
(gdb) print dbg_dump(aSstFragmentPath)
$1 = (const sal_Char *) 0x80a0ef168 "xl/sharedStrings.xml"


Eventually we get as far as this, trying to open that xl/sharedStrings.xml:


#0  OStorage::OpenStreamElement_Impl(rtl::OUString const&, int, unsigned char) (this=this@entry=0x80dca4bc0, aStreamName=..., nOpenMode=nOpenMode@entry=1, bEncr=bEncr@entry=0 '\000') at source/xstor/xstorage.cxx:2204
#1  0x000000080e0761d5 in OStorage::openStreamElement(rtl::OUString const&, int) (this=0x80dca4bc0, aStreamName=..., nOpenMode=1) at source/xstor/xstorage.cxx:2507
#2  0x000000080e076ab2 in non-virtual thunk to OStorage::openStreamElement(rtl::OUString const&, int) ()
    at instsetoo_native/unxfbsdx/Apache_OpenOffice/installed/install/en-US/openoffice4/program/../program/libxstor.so
#3  0x000000080e60c795 in oox::ZipStorage::implOpenInputStream(rtl::OUString const&) (this=<optimized out>, rElementName=...) at source/helper/zipstorage.cxx:171
#4  0x000000080e609cb9 in oox::StorageBase::openInputStream(rtl::OUString const&) (this=0x80dc4b030, rStreamName=...) at source/helper/storagebase.cxx:164
#5  0x000000080e609c70 in oox::StorageBase::openInputStream(rtl::OUString const&) (this=0x80db770f0, rStreamName=...) at source/helper/storagebase.cxx:160
#6  0x000000080e4f9889 in oox::core::FilterBase::openInputStream(rtl::OUString const&) const (this=<optimized out>, rStreamName=...) at source/core/filterbase.cxx:370
#7  0x000000080e50340f in oox::core::FragmentHandler::openFragmentStream() const (this=0x80dc73c00) at source/core/fragmenthandler.cxx:123
#8  0x000000080e5096c2 in oox::core::XmlFilterBase::importFragment(rtl::Reference<oox::core::FragmentHandler> const&) (this=0x80daff000, rxHandler=...) at source/core/xmlfilterbase.cxx:208
#9  0x000000080e70ecf3 in oox::xls::WorkbookFragment::finalizeImport() (this=0x80dca4b20) at source/xls/workbookfragment.cxx:209


Then an exception is thrown, because no stream with that exact name is found (the file stores it as xl/SharedStrings.xml).

Now where best to scan the zip file for names with different casing?
Comment 5 damjan 2023-01-06 10:14:15 UTC
OStorage::openStreamElement() is in main/package, which isn't just used by OOXML but also by ODF (a breakpoint there gets hit many times while loading an ODF too), so I don't like making changes there for an OOXML-specific bug.

oox::ZipStorage::implOpenInputStream() seems like a better place.
Comment 6 damjan 2023-01-06 10:22:22 UTC
I've now patched oox::ZipStorage::implOpenInputStream() to fall back to case-insensitive filename matching when the case-sensitive lookup fails. With that change this file opens successfully and all the text shows.
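
For readers who want to see the idea in isolation, here is a minimal standalone sketch of "try the exact name first, then fall back to a case-insensitive match". It is not the code from the commit and does not use the oox or package APIs: the function names resolveEntryName and equalsIgnoreAsciiCase are hypothetical, the list of zip entry names is assumed to be available as a plain vector of strings, and folding only ASCII letters is an assumption made for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Fold a single ASCII upper-case letter to lower case; leave everything else alone.
// (Assumption for this sketch: only ASCII case differences matter.)
static char foldAscii(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: true if both names are equal ignoring ASCII case.
static bool equalsIgnoreAsciiCase(const std::string& a, const std::string& b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](char x, char y) { return foldAscii(x) == foldAscii(y); });
}

// Hypothetical helper: returns the name of the entry that should be opened,
// or an empty string if nothing matches.
static std::string resolveEntryName(const std::vector<std::string>& entries,
                                    const std::string& wanted)
{
    // 1. Case-sensitive lookup first, so well-formed files behave as before.
    if (std::find(entries.begin(), entries.end(), wanted) != entries.end())
        return wanted;

    // 2. Fallback: first entry that differs from the wanted name only by case.
    for (const std::string& entry : entries)
        if (equalsIgnoreAsciiCase(entry, wanted))
            return entry;

    return std::string();
}
```

With entries "xl/workbook.xml" and "xl/SharedStrings.xml", asking for "xl/sharedStrings.xml" resolves to "xl/SharedStrings.xml". Trying the exact match first keeps the fallback from changing anything for files whose names are already correct.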

Fixed by commit 0f42b9a04e21324973f03349bb2929327cf84a20.

Resolving FIXED :).

Thank you for your bug report and sample file!
Comment 7 Matthias Seidel 2023-01-06 21:11:59 UTC
Cherry-picked for AOO42X with:
https://github.com/apache/openoffice/commit/bd3f92fa7151c22b06c065512cbefd13960d9f7c
Comment 8 Matthias Seidel 2023-01-07 17:23:18 UTC
Cherry-picked for AOO41X with:
https://github.com/apache/openoffice/commit/25c6f4b735608c9ccf2d582718536ff7c9470ddd