Apache OpenOffice (AOO) Bugzilla – Issue 834
csv data format issues
Last modified: 2013-08-07 15:15:02 UTC
The import filter for csv files improperly interprets a newline within a text field as the end of the current record. The behaviour should be to keep on reading that field until the text field is closed, preserving any newlines along the way. Excel properly imports the same csv file, keeping any newlines encountered within a text field as part of that field. We really need to match this behaviour. Even kspreadsheet from kde matches this behaviour. (Sadly, gnumeric does not; they received a similar bug report this evening.)
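To make the expected behaviour concrete, here is a small illustration using Python's standard `csv` module, which keeps a newline inside a quoted text field as part of that field. The sample data is made up for this example, and it says nothing about how the OOo filter is or should be implemented.

```python
import csv
import io

# Made-up sample: the second field of the first data record contains a newline.
data = 'id,comment\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

# newline='' disables newline translation so the csv module sees the raw line ends.
rows = list(csv.reader(io.StringIO(data, newline='')))

for row in rows:
    print(row)
```

The file has four physical lines but only three records; the embedded newline ends up inside the second field of the second record.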
The 605 is a quite 'old' version. But I'll try to reproduce it in a 625.
dummy text
Reproduced in OOo627 on W2K. OOo doesn't recognise linefeed within a record. If you create a file with linebreaks in a cell they are also not stored in CSV format.
yep
reassigned to correct account
reaccepted..
changing QA contact from bugs@ to issues@
Setting target to OOo 2.0
Eike, this problem hits many people. Is it possible to change target to 1.0.x or 1.1?
@Pavel: Only if I had some spare time, which I doubt, and which is the reason I set this issue to OOo2.0. Fixing this would mean changing the parser and the preview so that they no longer read in the data line by line, which isn't that complicated but also not a trivial fix. In fact the preview would give more headache, I guess.
Preview should replace \n in a string with something reasonable, as currently done with TAB characters. I think trying to show strings in multiple lines would break nearly everything in the preview.
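A minimal sketch of that idea, for illustration only: the function name `preview_text` and the choice of placeholder character are assumptions made for this example, not what the preview actually uses.

```python
def preview_text(field, placeholder="\u21b5"):
    """Return a single-line rendering of a field, for display only."""
    # Normalise CRLF and lone CR first so every line break yields one placeholder.
    normalised = field.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.replace("\n", placeholder)


print(preview_text("first line\r\nsecond line"))
```

Only the displayed text changes; the imported cell content would still carry the real line break.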
*** Issue 7370 has been marked as a duplicate of this issue. ***
*** Issue 19556 has been marked as a duplicate of this issue. ***
*** Issue 21625 has been marked as a duplicate of this issue. ***
Hmmm .... would this also be interesting for database access?
Hi Frank, Makes also sense for the dba CSV import. I'll try to generalize the approach if it can be untangled from the Calc import and make it available as a kind of SvStream::ReadField() method, if possible.
great!
Hi, a while ago I reported http://www.openoffice.org/issues/show_bug.cgi?id=21625 (a duplicate, sorry). I just wanted to show how I dealt with the problem; here's some sample code: http://www.pinkjuice.com/howto/vimxml/tasks.xml#markinguptables (=> "Complex CSV"). Not sure if it's of any help, since you probably don't use regexen to parse the CSV data, and since my code is probably not general enough. Anyway, good luck with fixing this bug. Tobi
*** Issue 25814 has been marked as a duplicate of this issue. ***
Is there anything I can do to help get this issue fixed quickly? It's been around for a long time, and it's causing a lot of extra work here. We know that the spreadsheet can handle multi-line data in a single cell (hit Ctrl-Enter), so it can't be a fundamental problem with the spreadsheet. Is it a problem with the import filter, the routines that support imports, the import API, or something completely different?
Lbc, this is not something for a quick fix: the line-oriented parser has to be changed, and as mentioned above, ideally a stream-reading field extractor should be created, which isn't much more work than doing nearly the same for Calc only. I have it on my ToDo list for OOo2.0 together with some other CSV-related issues. If you want to help with it you could create a SvStream::ReadField() method that reads fields into an OUStringBuffer, taking the field separators into account and whether consecutive separators should be combined into one. If you're interested please submit a JCA form (see http://www.openoffice.org/contributing.html) before submitting code. Thanks, Eike
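For anyone wanting a feel for what such a field extractor involves, here is a rough sketch in Python rather than C++. Everything in it is an assumption made for illustration: the name `read_field`, its parameters, and the return value are hypothetical, and a plain list stands in for the OUStringBuffer. It is not the proposed SvStream::ReadField() itself.

```python
import io


def _peek(stream):
    """Look at the next character without consuming it ('' at end of input)."""
    pos = stream.tell()
    ch = stream.read(1)
    stream.seek(pos)
    return ch


def read_field(stream, separators=",", quote='"', merge_separators=False):
    """Read one field; return (text, terminator).

    terminator is the separator that ended the field, '\n' at the end of a
    record, or '' at the end of input.
    """
    buf = []            # stands in for the string buffer
    in_quotes = False
    while True:
        ch = stream.read(1)
        if ch == "":
            return "".join(buf), ""
        if in_quotes:
            if ch != quote:
                buf.append(ch)              # newlines inside quotes are data
            elif _peek(stream) == quote:
                stream.read(1)              # doubled quote means a literal quote
                buf.append(quote)
            else:
                in_quotes = False
        elif ch == quote:
            in_quotes = True
        elif ch in separators:
            if merge_separators:
                # Swallow any further separators directly following this one.
                while _peek(stream) != "" and _peek(stream) in separators:
                    stream.read(1)
            return "".join(buf), ch
        elif ch in "\r\n":
            if ch == "\r" and _peek(stream) == "\n":
                stream.read(1)              # treat CRLF as one record end
            return "".join(buf), "\n"
        else:
            buf.append(ch)


stream = io.StringIO('a;;b,"x\ny"\nz')
fields = []
while True:
    text, term = read_field(stream, separators=";,", merge_separators=True)
    fields.append((text, term))
    if term == "":
        break
print(fields)
```

Because the reader works character by character on the stream, a quoted newline never looks like a record boundary, which is the point of moving away from a line-oriented parser.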
er, can we up the priority on this one? It's been around a while and it's causing a lot of grief.
It's targeted for 2.0. Even if you raise the priority, it will still only be fixed for 2.0 ...
*** Issue 32966 has been marked as a duplicate of this issue. ***
So the actual data import part of this is fixed here: http://bugzilla.ximian.com/show_bug.cgi?id=62446 The preview needs some re-architecting love, but the code in there is ugly. It can, of course, stay line-based, but the line data needs refreshing if the text delimiter changes.
Michael, thanks for the pointer. Just a status update: this issue is assigned to CWS csvio, see http://eis.services.openoffice.org/EIS2/servlet/cws.ShowCWS?Id=1037&Path=SRC680%2Fcsvio Eike
So - I completed our GUI re-factor to make this pleasant, and powerful. Our patches are at: http://ooo.ximian.com/ooo-build/patches/OOO_1_1/sc-csv-newline.diff http://ooo.ximian.com/ooo-build/patches/OOO_1_1/sc-csv-gui.diff HTH.
On branch cws_src680_csvio: tools/inc/stream.hxx 1.6.36.1 tools/source/stream/stream.cxx 1.16.34.1 sc/source/ui/dbgui/asciiopt.cxx 1.21.12.1 sc/source/ui/dbgui/scuiasciiopt.cxx 1.5.12.1 sc/source/ui/docshell/docsh.cxx 1.72.22.1 sc/source/ui/docshell/impex.cxx 1.29.12.1 sc/source/ui/inc/asciiopt.hxx 1.8.146.1 sc/source/ui/inc/scuiasciiopt.hxx 1.3.132.1 Note that there are quite some differences to the patches mentioned above. @Frank: new method SvStream::ReadCsvLine()
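For readers unfamiliar with the approach, one common way to build a "CSV line" reader on top of a line-based stream is to keep appending physical lines until the text delimiters are balanced. The Python sketch below illustrates that general technique only; it is an assumption for explanation and is not taken from the SvStream::ReadCsvLine() code on the branch. The name `read_csv_line` and its parameters are hypothetical.

```python
import io


def read_csv_line(stream, quote='"'):
    """Return one logical record (possibly spanning several physical lines),
    or '' at the end of input."""
    record = stream.readline()
    # An odd number of quote characters means a quoted field is still open;
    # doubled quotes inside a field always add an even count.
    while record.count(quote) % 2 == 1:
        more = stream.readline()
        if more == "":
            break       # unterminated quote at end of input: return what we have
        record += more
    return record


stream = io.StringIO('1,"first\nsecond"\n2,plain\n')
records = []
while True:
    rec = read_csv_line(stream)
    if rec == "":
        break
    records.append(rec)
print(records)
```

Since the result depends on which character counts as the text delimiter, a preview built this way has to re-read its lines whenever that setting changes.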
Reopen to reassign.
Reassign to QA.
Restore status.
Found fixed on Solaris, Linux and Windows using CWS csvio
*** Issue 35612 has been marked as a duplicate of this issue. ***
*** Issue 36599 has been marked as a duplicate of this issue. ***
Found fixed on Master src680m62 using Linux, Solaris and Windows Build
*** Issue 41888 has been marked as a duplicate of this issue. ***
*** Issue 44488 has been marked as a duplicate of this issue. ***
*** Issue 48271 has been marked as a duplicate of this issue. ***
*** Issue 13878 has been marked as a duplicate of this issue. ***