Issue 40213 - title and first link use the same string -> optimize translation
Summary: title and first link use the same string -> optimize translation
Status: CONFIRMED
Alias: None
Product: documentation
Classification: Unclassified
Component: Online help
Version: current
Hardware: All All
Importance: P3 Trivial
Target Milestone: AOO PleaseHelp
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-01-08 22:31 UTC by pavel
Modified: 2017-05-20 11:31 UTC
CC List: 4 users

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: ---
Developer Difficulty: ---


Description pavel 2005-01-08 22:31:13 UTC
Hi,

we use PO files for translation of help. There are too many pieces like this:

#: 08020000.xhp#tit.help.text
msgid "Current Size"
msgstr "Aktuální velikost"

#: 08020000.xhp#hd_id3154011.1.help.text
msgid "\\<link href=\\\"text/simpress/02/08020000.xhp\\\" name=\\\"Current Size\\\"\\>Current Size\\</link\\>"
msgstr "\\<link href=\\\"text/simpress/02/08020000.xhp\\\" name=\\\"Aktuální velikost\\\"\\>Aktuální velikost\\</link\\>"

These strings come directly from the GSI file (1:1 mapping between GSI and PO file).

Can we define some mechanism or automated process to make this easier and more
consistent to translate? If the string in the title of the page is already
translated, why do we have to translate it again?

Can't we re-use it somehow? What about e.g. (meta-diff):

 <title id="tit" xml-lang="en-US">Current Size</title>
-<link href="text/simpress/02/08020000.xhp" name="Current Size">Current Size</link></paragraph>
+<link href="text/simpress/02/08020000.xhp" name="$title">$title</link></paragraph>
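
A minimal sketch of how the $title placeholder from the meta-diff could be expanded when a help file is processed. The function name expand_title is hypothetical, and using regular expressions instead of a real XML parser is an assumption made to keep the example short; nothing like this exists in the help build today.

```python
import re


def expand_title(xhp_text: str) -> str:
    """Replace every $title placeholder with the text of the <title> element."""
    match = re.search(r"<title\b[^>]*>(.*?)</title>", xhp_text, re.DOTALL)
    if match is None:
        return xhp_text  # no title found, nothing to expand
    title = match.group(1).strip()
    # A function is used as the replacement so that backslashes in the
    # title are inserted literally instead of being read as escapes.
    return re.sub(r"\$title\b", lambda _m: title, xhp_text)


sample = (
    '<title id="tit" xml-lang="en-US">Current Size</title>\n'
    '<link href="text/simpress/02/08020000.xhp" name="$title">$title</link>'
)
print(expand_title(sample))
```

With something like this in place, a translator would only translate the title; the link name and link text would follow automatically.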

As you know, I like clear evidence, so:

pavel@linux:~/.ooo/cs> grep helpcontent en-US.sdf | awk -F'\t' '{if ($2 != PREVHELPFILE) { print $11; getline; print $11 ; PREVHELPFILE=$2} }'|grep -B1 "link href"|grep "link href"|wc -l
829
pavel@linux:~/.ooo/cs> 

In English, we have 829 occurrences of this (the title and the string directly
following it, the link, contain the same string).
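
For anyone who prefers not to read the awk one-liner, here is a rough Python sketch of the same kind of count. It assumes, as the awk command does, that the file is tab-separated, that column 2 is the help file name, that column 11 is the string, and that the first row of each file is the title. The function name count_title_links and the sample rows are made up for illustration. Unlike the pipeline above, which only counts rows containing "link href", this sketch also checks that the link really repeats the title, so the numbers it produces on the real file may differ from the ones quoted here.

```python
from itertools import groupby

FILE_COL = 1   # awk's $2: help file name
TEXT_COL = 10  # awk's $11: the string to translate


def count_title_links(lines, offset=1):
    """Count files whose row `offset` rows after the title is a link repeating it."""
    rows = [line.rstrip("\n").split("\t") for line in lines if "helpcontent" in line]
    rows = [row for row in rows if len(row) > TEXT_COL]
    count = 0
    # Rows of one help file are assumed to be adjacent, as in the awk version.
    for _name, group in groupby(rows, key=lambda row: row[FILE_COL]):
        texts = [row[TEXT_COL] for row in group]
        if len(texts) <= offset:
            continue
        title, candidate = texts[0], texts[offset]
        if "link href" in candidate and title in candidate:
            count += 1
    return count


def make_row(help_file, text):
    """Build a synthetic 11-column row (illustration only, not real SDF data)."""
    return "\t".join(["helpcontent2", help_file] + [""] * 8 + [text])


sample = [
    make_row("08020000.xhp", "Current Size"),
    make_row(
        "08020000.xhp",
        r'\<link href=\"text/simpress/02/08020000.xhp\" name=\"Current Size\"\>Current Size\</link\>',
    ),
    make_row("other.xhp", "Some Title"),
    make_row("other.xhp", "Plain paragraph"),
]
print(count_title_links(sample))  # 1
```

Passing offset=2 would look at the third row of each file instead of the second.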

pavel@linux:~/.ooo/cs> grep helpcontent en-US.sdf | awk -F'\t' '{if ($2 != PREVHELPFILE) { print $11; getline; print $11 ; PREVHELPFILE=$2} }'|grep -B1 "link href"|grep "link href"|grep name|wc -l
739
pavel@linux:~/.ooo/cs> 

And as the same string is also used 739 times in the "name" attribute of the
link, we can save 1568 strings from translation. Right now, we have 69559
strings in the en-US GSI file. This means that by implementing this, we can
save up to 2.25% of the translations.

The above applies to a <link ...> that comes directly after the <title> in the GSI.
Sometimes the link is on the third line, so we have to add those too:

pavel@linux:~/.ooo/cs> grep helpcontent en-US.sdf | awk -F'\t' '{if ($2 != PREVHELPFILE) { print $11; getline; getline; print $11 ; PREVHELPFILE=$2} }'|grep -B1 "link href"|grep "link href"|grep name|wc -l
991
pavel@linux:~/.ooo/cs> grep helpcontent en-US.sdf | awk -F'\t' '{if ($2 != PREVHELPFILE) { print $11; getline; getline; print $11 ; PREVHELPFILE=$2} }'|grep -B1 "link href"|grep "link href"|wc -l
1118
pavel@linux:~/.ooo/cs> 

-> another 2109 strings saved.

So the total numbers for this improvement would be: 3677 strings, 5.28% of the
complete translations (8.5% of the complete help). This is a huge improvement!
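
The arithmetic behind these totals can be re-checked with a few lines of Python. Only the counts quoted in this report are used; the variable names are made up. Rounded rather than truncated, the total share comes out at 5.29%.

```python
total_strings = 69559        # strings in the en-US GSI file
second_line = 829 + 739      # link text + "name" attribute, link right after the title
third_line = 1118 + 991      # same, with the link on the third line
saved = second_line + third_line

print(second_line, third_line, saved)               # 1568 2109 3677
print(f"{100 * second_line / total_strings:.2f}%")  # 2.25%
print(f"{100 * saved / total_strings:.2f}%")        # 5.29%
```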

Of course I can implement some mechanism in our tools for generating PO files,
but I'd like to solve this directly in the source, also for teams who do not use
PO files. How does this affect Sun's translation mechanisms?
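
To make the tool-side idea concrete, here is a rough sketch of what such a mechanism could look like. The function name pretranslate_link is hypothetical, and the strings are assumed to carry one level of backslash escaping, as in the GSI file; this is not how any existing tool works.

```python
from typing import Optional


def pretranslate_link(title_id: str, title_str: str, link_id: str) -> Optional[str]:
    """Derive the translation of a link string from an already translated title.

    Returns None when the link text does not repeat the title, so that the
    entry is left for a human translator.
    """
    link_text = "\\>" + title_id + "\\</link\\>"
    if link_text not in link_id:
        return None
    result = link_id.replace(link_text, "\\>" + title_str + "\\</link\\>")
    # The "name" attribute is translated only if it repeats the title as well.
    name_attr = 'name=\\"' + title_id + '\\"'
    return result.replace(name_attr, 'name=\\"' + title_str + '\\"')


link_id = r'\<link href=\"text/simpress/02/08020000.xhp\" name=\"Current Size\"\>Current Size\</link\>'
print(pretranslate_link("Current Size", "Aktuální velikost", link_id))
```

Only the link text and the "name" attribute are touched, so a title that happens to occur inside the href is not rewritten.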

Giving this Prio 1 because it could save *a lot of time/money* for translators.
Comment 1 frank.thomas.peters 2005-01-09 20:15:21 UTC
First of all, this is by no means a P1.

Secondly, although this happens most of the time, title and first heading do not
*need* to use the same string. We can discuss changing the help DTD in a way
that this can be resolved.

Thirdly, this is yet another localization issue that could be resolved by using
TM or pretranslation. It may be theoretically possible to get rid of any
redundancy but you would end up with files full of cross references. Believe me,
we were at that stage before with StarOffice help where we did the weirdest
things to avoid duplication. I leave this to the localization tools because they
can handle that most efficiently without forcing the tech writer to find out
whether a phrase was used before in the help or not.

I am open to suggestions, though, on how we can optimize this particular case
for >2.0.
Comment 2 Marcus 2017-05-20 11:31:03 UTC
Reset assignee to the default "issues@openoffice.apache.org".