Issue 19826 - Postscript output uses ambiguous font encoding values. Unparsable.
Status: CLOSED FIXED
Alias: None
Product: gsl
Classification: Code
Component: code
Version: OOo 1.1 RC3
Hardware: PC Linux, all
Importance: P4 Trivial
Target Milestone: OOo 2.0
Assignee: Joost Andrae
QA Contact: issues@gsl
URL:
Keywords:
Duplicates: 42983
Depends on:
Blocks:
 
Reported: 2003-09-20 05:23 UTC by dcinege
Modified: 2005-02-17 12:08 UTC

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---


Description dcinege 2003-09-20 05:23:41 UTC
Font encodings in PostScript output have changed in OpenOffice 1.1RCx 
and now use arbitrary, ambiguous values instead of the ASCII value of 
the referenced character. This is bad and should be corrected. 
 
OpenOffice 1.0.3 PostScript output: 
 
/FontName /GnuMICRNormalHSet1 def 
/XUID [103 0 0 16#3CA9857D 13 16#7F837AB1 16#181136D6] def 
/FontMatrix [.001 0 0 .001 0 0] def 
/FontBBox [0 0 648 701] def 
/Encoding 256 array def 
    0 1 255 {Encoding exch /.notdef put} for 
    Encoding 32 /glyph0 put 
    Encoding 48 /glyph1 put 
    Encoding 49 /glyph2 put 
    Encoding 50 /glyph3 put 
    Encoding 51 /glyph4 put 
    Encoding 52 /glyph5 put 
    Encoding 53 /glyph6 put 
    Encoding 54 /glyph7 put 
    Encoding 55 /glyph8 put 
    Encoding 56 /glyph9 put 
    Encoding 57 /glyph10 put 
    Encoding 65 /glyph11 put 
    Encoding 67 /glyph12 put 
 
OpenOffice 1.1RC3 PostScript output: 
/FontName (GnuMICRNormalHGSet2) cvn def 
/XUID [103 0 0 16#3CA9857D 14 16#501FB36A 16#87C0B0A7] def 
/FontMatrix [.001 0 0 .001 0 0] def 
/FontBBox [0 0 648 701] def 
/Encoding 256 array def 
    0 1 255 {Encoding exch /.notdef put} for 
    Encoding 0 /glyph0 put 
    Encoding 8 /glyph1 put 
    Encoding 7 /glyph2 put 
    Encoding 6 /glyph3 put 
    Encoding 5 /glyph4 put 
    Encoding 4 /glyph5 put 
    Encoding 3 /glyph6 put 
    Encoding 2 /glyph7 put 
    Encoding 10 /glyph8 put 
    Encoding 11 /glyph9 put 
    Encoding 12 /glyph10 put 
    Encoding 13 /glyph11 put 
    Encoding 9 /glyph12 put 
    Encoding 1 /glyph13 put 
 
The above font is a MICR check printing font. The characters utilized are 
0-9, A, C, and ' ' (space). 1.0.3's output properly references these 
characters by their ASCII values. For example: 
	Encoding 32 /glyph0 put 
references ASCII value 32, aka ' ' (space). 
However, the same character in 1.1RC3: 
	Encoding 8 /glyph1 put 
Here the value 8 is used arbitrarily. 
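
To illustrate the difference, a post-processing script could read the Encoding entries roughly as follows. This is a minimal Python sketch and not part of OOo; the names parse_encoding and codes_are_printable_ascii are made up for the example, and the sample tables are abbreviated from the output above.

```python
import re

# Matches entries such as "Encoding 32 /glyph0 put"; the ".notdef" fill loop
# and the "/Encoding 256 array def" line do not match this pattern.
ENCODING_ENTRY = re.compile(r"Encoding\s+(\d+)\s+/(\S+)\s+put")


def parse_encoding(ps_text):
    """Return {character code: glyph name} for every explicit Encoding entry."""
    return {int(code): name for code, name in ENCODING_ENTRY.findall(ps_text)}


def codes_are_printable_ascii(encoding):
    """True if every assigned code lies in the printable ASCII range."""
    return all(0x20 <= code <= 0x7E for code in encoding)


OLD_TABLE = """
/Encoding 256 array def
    0 1 255 {Encoding exch /.notdef put} for
    Encoding 32 /glyph0 put
    Encoding 48 /glyph1 put
    Encoding 65 /glyph11 put
"""

NEW_TABLE = """
/Encoding 256 array def
    0 1 255 {Encoding exch /.notdef put} for
    Encoding 0 /glyph0 put
    Encoding 8 /glyph1 put
    Encoding 7 /glyph2 put
"""

print(codes_are_printable_ascii(parse_encoding(OLD_TABLE)))  # True
print(codes_are_printable_ascii(parse_encoding(NEW_TABLE)))  # False
```

With the 1.0.3 table the character code itself identifies the character. With the 1.1RC3 table only the opaque glyphN name is left, and that name says nothing about which character it draws.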
 
This means that references to the actual text further on in the PS 
output use the value 8 instead of 32 in the 'show' output. 
 
For example: 
1.0.3 
/GnuMICRNormalHSet1 findfont 50 -50 matrix scale makefont setfont 
<43353433323131432041313233343132333441203536373839353637383943> 
[37 38 37 38 37 38 37 38 37 38 37 38 37 38 37 38 37 38 37 38 37 38 37 
38 37 38 37 38 37 38 0] 
xshow 
 
1.1RC3: 
(GnuMICRNormalHGSet2) cvn findfont 50 -50 matrix scale makefont 
setfont 
<0102030405060701080906050403060504030908020A0B0C0D020A0B0C0D01> 
[38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 
38 38 38 38 38 38 38 0] 
xshow 
 
The data in question lies between the <>'s. They should be identical. 
In 1.0.3 the values are the ASCII codes in hexadecimal. In 1.1RC3 the 
values are the arbitrary values assigned in the font encoding.  
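
A small Python sketch of what a simple post-processor sees in the two hex strings quoted above. The helper names decode_hex_string and is_printable_ascii are hypothetical, and the sketch assumes a well-formed literal with an even number of hex digits.

```python
def decode_hex_string(literal):
    """Return the raw bytes of a PostScript hex string literal such as <4142>."""
    body = literal.strip()
    if not (body.startswith("<") and body.endswith(">")):
        raise ValueError("not a hex string literal")
    return bytes.fromhex(body[1:-1])


def is_printable_ascii(data):
    """True if every byte is a printable ASCII character (space through ~)."""
    return all(0x20 <= b <= 0x7E for b in data)


OLD = "<43353433323131432041313233343132333441203536373839353637383943>"
NEW = "<0102030405060701080906050403060504030908020A0B0C0D020A0B0C0D01>"

old_bytes = decode_hex_string(OLD)
new_bytes = decode_hex_string(NEW)

# The 1.0.3 bytes are the text itself.
print(old_bytes.decode("ascii"))       # C543211C A12341234A 5678956789C

# The 1.1RC3 bytes are slot numbers in the subset font's Encoding array.
print(is_printable_ascii(new_bytes))   # False
```

Recovering text from the second string would require knowing which character each glyphN draws, which the Encoding table alone does not say.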
 
This PostScript output change makes it impossible to post-process 
an OO PostScript file with anything but a full PostScript engine. I found 
this bug because I am using OO as the template system for a sequential 
check printing program. OO 1.1's PostScript output can no longer be 
parsed in any reasonable way!
Comment 1 christof.pintaske 2003-09-22 11:43:03 UTC
cp->dcinege: There is no warranty on the way we generate PostScript
output. It is completely unspecified and you must not rely on any
implementation detail. It is subject to change without further notice.

cp->pl: In 1.0 we generated an ASCII subset (HSet1), but in 1.1 we
never seem to do so and instead start with a subset (HGSet2) for the
same characters. Any ideas about that?
Comment 2 dcinege 2003-09-22 18:26:13 UTC
dcinege->cp: I should note that the remainder of the 1.1RC3 
output still DOES encode based on the ASCII char values for 
other fonts (e.g. Arial). But for this font, for some reason, 
it does not (though in 1.0.3 it did). So its output is inconsistent 
with itself, and to me that's a bug.  
 
As for the output standard, I understand it's subject to change, 
but in this area in particular, I've never seen PostScript output 
from a word-processing type of program that didn't refer to chars 
according to their underlying ASCII values in SOME way. (Over 
the years I've used 3 other programs for check templating 
before moving to OO. This is the first time I have found an 
unparsable condition.) 
 
Comment 3 philipp.lohmann 2003-10-06 15:38:46 UTC
You'll find encoded characters used for Type 1 and printer built-in
fonts (Times, Helvetica and the like) since they are addressed via
their encoding. All TrueType fonts are subsetted, that is, only the
used glyphs will be put into the new downloaded font - this was so in
1.0.3 also. The difference is that the printing code is not driven
with characters anymore but with glyph ids, that is, the original
Unicode code point is not known at the point the character is output.
This is due to complex text layout, which makes it possible to print
languages like Arabic, Thai and the like which do not have a simple
character <-> glyph correlation. 

That being said, it would be possible to make an exception for ASCII,
or even better ISO8859-15, but it would require some rework. I'll see
if I can do something in the 2.0 timeframe.
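
The kind of exception described above could look roughly like the following Python sketch. It is purely illustrative and not OOo's implementation: the function assign_codes, the (glyph id, character) input pairs and the choice of slot 1 as the first arbitrary code are all assumptions made for the example.

```python
def assign_codes(glyphs, charset="iso8859-15"):
    """Map glyph ids to single-byte codes for one subset font.

    glyphs is a list of (glyph_id, char) pairs; char is None when the
    original character is not known (e.g. after complex text layout).
    """
    codes = {}     # glyph id -> assigned code
    used = set()   # codes already taken
    pending = []   # glyph ids that need an arbitrary slot

    for glyph_id, char in glyphs:
        byte = None
        if char is not None:
            try:
                encoded = char.encode(charset)
            except UnicodeEncodeError:
                encoded = b""
            # Keep the character's own byte only if it is a printable code.
            if len(encoded) == 1 and encoded[0] >= 0x20:
                byte = encoded[0]
        if byte is not None and byte not in used:
            codes[glyph_id] = byte
            used.add(byte)
        else:
            pending.append(glyph_id)

    # Everything else gets the lowest free slots, as in the 1.1RC3 output.
    free = [code for code in range(1, 256) if code not in used]
    for glyph_id, code in zip(pending, free):
        codes[glyph_id] = code
    return codes


# Glyph ids here are invented; the last glyph has no known character.
print(assign_codes([(3, " "), (17, "0"), (42, "A"), (99, None)]))
# {3: 32, 17: 48, 42: 65, 99: 1}
```

A real implementation would also have to start a new subset font once all 255 slots are taken, which this sketch does not attempt.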
Comment 4 philipp.lohmann 2003-10-06 15:41:46 UTC
adjusting component and type
Comment 5 Martin Hollmichel 2004-05-28 17:50:04 UTC
According to the announcement on releases
(http://www.openoffice.org/servlets/ReadMsg?list=releases&msgNo=7503), this issue
will be re-targeted to OOo Later.
Comment 6 philipp.lohmann 2004-06-02 17:05:08 UTC
target
Comment 7 philipp.lohmann 2004-06-09 12:46:03 UTC
fixed in CWS vcl23; this will only work with Ansi1252 characters of course as
all other characters have to be mapped into an arbitrary single byte glyph map.
Comment 8 philipp.lohmann 2004-06-24 14:39:09 UTC
reopen
Comment 9 philipp.lohmann 2004-06-24 14:40:08 UTC
ja->pl: please verify in CWS vcl23; output is now ansi encoded for ansi characters
Comment 10 philipp.lohmann 2004-06-24 14:40:30 UTC
fixed
Comment 11 Joost Andrae 2004-06-25 14:14:25 UTC
JA: verified within cws vcl23

äääöööüüüßßáéç is now written with its Unicode values in the PostScript output:
<E4E4E4F6F6F6FCFCFCDFDFE1E9E7>
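
The hex string above can be checked with a few lines of Python. This is only a sketch; the helper name decode_ansi is made up, and "cp1252" is chosen because the fix is described as covering Ansi1252 characters.

```python
def decode_ansi(hex_digits):
    """Decode hex digits from a show string as Windows-1252 text."""
    return bytes.fromhex(hex_digits).decode("cp1252")


data = bytes.fromhex("E4E4E4F6F6F6FCFCFCDFDFE1E9E7")
text = decode_ansi("E4E4E4F6F6F6FCFCFCDFDFE1E9E7")
print(text)                                  # äääöööüüüßßáéç

# For these characters the single-byte code equals the Unicode code point.
print([ord(ch) for ch in text] == list(data))  # True
```

That equality holds for ASCII and for the 0xA0-0xFF range; Windows-1252 bytes in 0x80-0x9F map to other code points.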
Comment 12 Joost Andrae 2004-10-08 15:53:15 UTC
JA: closing
Comment 13 philipp.lohmann 2005-02-17 12:08:59 UTC
*** Issue 42983 has been marked as a duplicate of this issue. ***