Issue 23541

Summary: Index from concordance file to ignore optional hyphens in doc
Product: Writer Reporter: othr <jnk1>
Component: code Assignee: AOO issues mailing list <issues>
Status: CONFIRMED --- QA Contact:
Severity: Trivial    
Priority: P3 CC: issues
Version: OOo 1.1 RC5   
Target Milestone: ---   
Hardware: Other   
OS: Windows XP   
Issue Type: ENHANCEMENT Latest Confirmation in: ---
Developer Difficulty: ---
Issue Depends on:    
Issue Blocks: 128492    
Attachments:
Description Flags
Test Case with Generated Index none

Description othr 2003-12-14 07:58:10 UTC
Action: generating alphabetical indexes from a concordance file.

Enhancement: Would be nice if optional hyphens in words were ignored when 
matching words with the concordance file.
Comment 1 h.ilter 2003-12-15 11:12:30 UTC
Reassigned to BH
Comment 2 pesala 2006-11-04 13:14:17 UTC
If one uses optional hyphens [-] in index entries they are also treated 
differently. For example the following would generate four entries in the index:

concordance
con[-]cordance
concor[-]dance
con[-]cor[-]dance

It would be better if all optional hyphens [-] could be ignored when comparing index 
entries. 

Perhaps ordinary hyphens should be ignored too, though this is debatable. 
Perhaps the following should be treated as a single entry? (anti-semitism)

anti-semitism
antisemitism
anti[-]semitism
antisemit[-]ism
anti-semit[-]ism
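
The comparison requested above could be sketched roughly as follows. This is only an illustration in Python, not Writer's actual indexing code: the function name index_key and the flag ignore_hard_hyphens are hypothetical, and it assumes that the optional hyphen written as [-] above is stored as the Unicode soft hyphen U+00AD.

```python
# Assumption: an optional hyphen ([-] in the examples above) is stored
# in the text as the Unicode soft hyphen, U+00AD.
SOFT_HYPHEN = "\u00ad"


def index_key(term: str, ignore_hard_hyphens: bool = False) -> str:
    """Return the key used to compare index entries (hypothetical helper).

    Optional hyphens are always dropped. Ordinary hyphens are dropped
    only on request, since merging them is the debatable part.
    """
    key = term.replace(SOFT_HYPHEN, "")
    if ignore_hard_hyphens:
        key = key.replace("-", "")
    return key


variants = [
    "concordance",
    "con\u00adcordance",
    "concor\u00addance",
    "con\u00adcor\u00addance",
]
# All four spellings collapse to one comparison key.
keys = {index_key(v) for v in variants}
print(len(keys))  # 1
```

With ignore_hard_hyphens=True the five anti-semitism spellings listed above would also share one key; with the default they would still split into a hyphenated and an unhyphenated entry.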
Comment 3 pesala 2008-05-16 07:35:05 UTC
Created attachment 53695 [details]
Test Case with Generated Index
Comment 4 pesala 2008-05-16 07:41:40 UTC
This issue still affects release 2.4.

Optional hyphens should be ignored for indexing purposes, though it would be useful to 
include them in the index so that long words still break in the desired place in the index.

Non-breaking hyphens and regular hyphens should be treated as the same.

Optionally, hyphenated and unhyphenated terms that are otherwise identical could be 
combined under a single index entry, i.e. for indexing purposes anti-semitism = 
antisemitism = Anti-semitism. Whichever spelling was used first would take precedence 
as the index entry. 
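
A minimal sketch of the merging rule described in this comment, written in Python purely for illustration. The names merge_key and build_index are hypothetical, and the code assumes that optional hyphens are stored as U+00AD and non-breaking hyphens as U+2011; it is not how Writer builds its index.

```python
SOFT_HYPHEN = "\u00ad"          # assumption: optional hyphen
NON_BREAKING_HYPHEN = "\u2011"  # assumption: non-breaking hyphen


def merge_key(term: str, merge_hyphenated: bool = True) -> str:
    """Build the comparison key for one term (hypothetical helper)."""
    key = term.replace(SOFT_HYPHEN, "")          # always ignored
    key = key.replace(NON_BREAKING_HYPHEN, "-")  # same as a regular hyphen
    if merge_hyphenated:
        key = key.replace("-", "")               # optional merging
    return key.casefold()                        # Anti-semitism = anti-semitism


def build_index(occurrences, merge_hyphenated: bool = True):
    """Group (term, page) pairs, given in document order, into entries.

    The first spelling seen for a key becomes the entry heading, so any
    optional hyphens in that spelling remain available for line breaks.
    """
    headings = {}  # key -> heading shown in the index
    pages = {}     # key -> pages where the term occurs
    for term, page in occurrences:
        key = merge_key(term, merge_hyphenated)
        headings.setdefault(key, term)
        pages.setdefault(key, []).append(page)
    return {headings[k]: pages[k] for k in headings}


occurrences = [
    ("anti-semitism", 3),
    ("antisemitism", 7),
    ("Anti-semitism", 12),
    ("anti\u00adsemitism", 15),
]
print(build_index(occurrences))  # {'anti-semitism': [3, 7, 12, 15]}
```

Passing merge_hyphenated=False keeps hyphenated and unhyphenated spellings as separate entries while still ignoring optional hyphens and case.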
Comment 5 bettina.haberer 2010-05-21 14:50:27 UTC
To make these issues easier to grep via "requirements", I have reassigned the issues 
currently owned by me to the owner "requirements".