The Free and Open Productivity Suite
Released: Apache OpenOffice 4.1.15


:: com :: sun :: star :: i18n ::

interface XBreakIterator
Description
Contains the base routines for iterating over a Unicode string: by character, word, sentence, and line break.

Assumption: StartPos is inclusive and EndPos is exclusive.
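
The half-open convention can be illustrated with a short Python sketch. The `Boundary` class and the `extract` helper below are hypothetical stand-ins written for this example, not the UNO binding itself. Python indexes strings by code point, which may differ from how the UNO API counts positions for characters outside the Basic Multilingual Plane.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    """Hypothetical stand-in for the UNO Boundary struct."""
    startPos: int  # inclusive
    endPos: int    # exclusive


def extract(text: str, b: Boundary) -> str:
    # Python slices are half-open as well, so the positions map directly.
    return text[b.startPos:b.endPos]


text = "Hello world"
first = Boundary(0, 5)    # covers "Hello"
second = Boundary(6, 11)  # covers "world"
```

With an exclusive EndPos, the length of a range is simply `endPos - startPos`, and two adjacent ranges can share a position without overlapping.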

Developers Guide
OfficeDev - Implementing a New Locale - XBreakIterator
OfficeDev - Overview and Using the API - XBreakIterator

Methods' Summary
nextCharacters Traverses the specified number of characters/cells in Text from nStartPos forwards. CharacterIteratorMode can be cell based or character based. A cell can consist of more than one character.  
previousCharacters Traverses the specified number of characters/cells in Text from nStartPos backwards. CharacterIteratorMode can be cell based or character based. A cell can consist of more than one character.  
nextWord Traverses one word in Text from nStartPos forwards.  
previousWord Traverses one word in Text from nStartPos backwards.  
getWordBoundary Identifies StartPos and EndPos of current word.  
getWordType [ DEPRECATED ]  
isBeginWord Checks whether a word starts at position nPos.  
isEndWord Checks whether a word ends at position nPos.  
beginOfSentence Traverses in Text from nStartPos to the start of a sentence.  
endOfSentence Traverses in Text from nStartPos to the end of a sentence.  
getLineBreak Calculate the line break position in the Text from the specified nStartPos.  
beginOfScript Traverses in Text from nStartPos to the beginning of the specified script type.  
endOfScript Traverses in Text from nStartPos to the end of the specified script type.  
nextScript Traverses in Text from nStartPos to the next start of the specified script type.  
previousScript Traverses in Text from nStartPos to the previous start of the specified script type.  
getScriptType Get the script type of the character at position nPos.  
beginOfCharBlock Traverses in Text from nStartPos to the beginning of the specified character type.  
endOfCharBlock Traverses in Text from nStartPos to the end of the specified character type.  
nextCharBlock Traverses in Text from nStartPos to the next start of the specified character type.  
previousCharBlock Traverses in Text from nStartPos to the previous start of the specified character type.  
Methods' Details
nextCharacters
long
nextCharacters( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharacterIteratorMode,
[in] long  nCount,
[out] long  nDone );

Description
Traverses the specified number of characters/cells in Text from nStartPos forwards. CharacterIteratorMode can be cell based or character based. A cell can consist of more than one character.
Parameter nCount
Number of characters to traverse. It must not be less than 0. To traverse in the opposite direction, use XBreakIterator::previousCharacters() instead.
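
The difference between character-based and cell-based traversal can be sketched in Python. This is a simplified illustration, not the UNO implementation: it assumes a cell is a base character plus any following combining marks, whereas the real rules are locale-specific. The function name `next_characters` and the `cell_mode` flag are hypothetical, and the `nDone` out parameter is modelled as a second return value.

```python
import unicodedata


def next_characters(text, start, count, cell_mode):
    """Return (new_pos, done): the position after moving forward and the steps taken."""
    pos = start
    done = 0
    while done < count and pos < len(text):
        pos += 1
        if cell_mode:
            # Absorb combining marks so they stay with their base character.
            while pos < len(text) and unicodedata.combining(text[pos]):
                pos += 1
        done += 1
    return pos, done


# 'e' followed by COMBINING ACUTE ACCENT, then 'a'
text = "e\u0301a"
by_char = next_characters(text, 0, 1, cell_mode=False)
by_cell = next_characters(text, 0, 1, cell_mode=True)
```

In this sketch one character-based step stops between the base letter and its accent, while one cell-based step moves past both. The second return value reports fewer steps than requested when the end of the text is reached.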
previousCharacters
long
previousCharacters( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharacterIteratorMode,
[in] long  nCount,
[out] long  nDone );

Description
Traverses the specified number of characters/cells in Text from nStartPos backwards. CharacterIteratorMode can be cell based or character based. A cell can consist of more than one character.
Parameter nCount
Number of characters to traverse. It must not be less than 0. To traverse in the opposite direction, use XBreakIterator::nextCharacters() instead.
nextWord
Boundary
nextWord( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nWordType );

Description
Traverses one word in Text from nStartPos forwards.
Parameter nWordType
One of WordType; specifies the type of traversal.
Returns
The Boundary of the found word. Normally used for CTRL-Right.
previousWord
Boundary
previousWord( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nWordType );

Description
Traverses one word in Text from nStartPos backwards.
Parameter aLocale
The locale of the character preceding nStartPos.

If the previous character is a space character, nWordType indicates that spaces should be skipped, and the first non-space character before it is an Asian character, the method returns -1 in Boundary::endPos and the position after the Asian character (that is, the position of the space character) in Boundary::startPos. It does so because Asian word breaking needs language-specific word-break dictionaries. The caller then has to call this method again with a correct aLocale referring to the Asian character, which is then the character preceding the space character that nStartPos points to.

Note that the OpenOffice.org 1.0 / StarOffice 6.0 / StarSuite 6.0 i18n framework doesn't behave like this and mixed Western/CJK text may lead to wrong word iteration. This is fixed in later versions.
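
The retry step described above could be wrapped by a caller roughly as follows. This is a sketch of one reading of that description, not code from the office suite: `previous_word_with_retry` and the `locale_at` callback are hypothetical names, `break_iter` is any object exposing a `previousWord` method with the four parameters listed above, and `Boundary` is a local stand-in for the UNO struct.

```python
from collections import namedtuple

# Local stand-in for the UNO Boundary struct (assumption for this sketch).
Boundary = namedtuple("Boundary", "startPos endPos")


def previous_word_with_retry(break_iter, text, pos, locale, word_type, locale_at):
    """Call previousWord and retry once if endPos == -1 signals a locale change.

    locale_at is a hypothetical callback that returns the locale of text[index].
    """
    b = break_iter.previousWord(text, pos, locale, word_type)
    if b.endPos == -1 and b.startPos > 0:
        # startPos lies just after the Asian character, so look up the
        # locale of the character immediately before it and try again.
        retry_locale = locale_at(b.startPos - 1)
        b = break_iter.previousWord(text, b.startPos, retry_locale, word_type)
    return b
```

The wrapper retries only once, on the assumption that the second call is made with a locale that fits the Asian character and therefore returns a complete Boundary.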

Parameter nWordType
One of WordType; specifies the type of traversal.
Returns
The Boundary of the found word. Normally used for CTRL-Left.
getWordBoundary
Boundary
getWordBoundary( [in] string  aText,
[in] long  nPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nWordType,
[in] boolean  bPreferForward );

Description
Identifies StartPos and EndPos of current word.

If nPos is on a word boundary, it is both the StartPos of one word and the EndPos of the previous word, so the result would be ambiguous. The bPreferForward flag resolves this: if bPreferForward == false, nPos is treated as the end of a word and the search goes backwards for the beginning of that word; otherwise nPos is treated as the start of the next word and the search goes forwards for its end.
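
The effect of the flag can be shown with a small Python sketch. The segmentation rule used here (maximal runs of letters, digits, or other characters) is an assumption chosen only to produce two adjacent "words"; the real break rules depend on the locale and on nWordType. The names `get_word_boundary` and `prefer_forward` are hypothetical.

```python
def _char_class(ch):
    # Illustrative classes only; not the classes used by the office suite.
    if ch.isalpha():
        return "alpha"
    if ch.isdigit():
        return "digit"
    return "other"


def _segments(text):
    """Split text into half-open (start, end) runs of one character class."""
    segs, start = [], 0
    for i in range(1, len(text) + 1):
        if i == len(text) or _char_class(text[i]) != _char_class(text[start]):
            segs.append((start, i))
            start = i
    return segs


def get_word_boundary(text, pos, prefer_forward):
    for start, end in _segments(text):
        if start < pos < end:
            return (start, end)  # pos is strictly inside this run
        if pos == start and (prefer_forward or start == 0):
            return (start, end)  # pos treated as the start of a run
        if pos == end and (not prefer_forward or end == len(text)):
            return (start, end)  # pos treated as the end of a run
    raise ValueError("pos out of range")


text = "abc123"
forward = get_word_boundary(text, 3, prefer_forward=True)
backward = get_word_boundary(text, 3, prefer_forward=False)
```

Position 3 is the end of `abc` and the start of `123`, so the flag alone decides which run is returned. Inside a run the flag has no effect.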

Parameter nWordType
One of WordType.
Returns
The Boundary of the current word.
getWordType
short
getWordType( [in] string  aText,
[in] long  nPos,
[in] ::com::sun::star::lang::Locale  aLocale );

Usage Restrictions
deprecated
Deprecation Info
Get the WordType of the word that starts at position nPos.

This method is ill-defined, because WordType is not an attribute of a word but a way of breaking words, such as including or excluding trailing spaces for spell checking or cursor travelling. It always returns 0.

isBeginWord
boolean
isBeginWord( [in] string  aText,
[in] long  nPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nWordType );

Description
Checks whether a word starts at position nPos.

This method and isEndWord can both return true for the same position, because the StartPos of a word is inclusive while the EndPos of a word is exclusive.
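
A minimal Python sketch of why one position can satisfy both checks. The word boundaries are hard-coded for illustration and assume that `abc` and `123` in the text `abc123` count as two adjacent words. The function names are hypothetical and take only a position, unlike the UNO methods.

```python
# Half-open (start, end) boundaries for "abc" and "123" in "abc123".
words = [(0, 3), (3, 6)]


def is_begin_word(pos):
    return any(start == pos for start, _ in words)


def is_end_word(pos):
    return any(end == pos for _, end in words)


# Position 3 is the exclusive end of "abc" and the inclusive start of "123".
both_at_three = is_begin_word(3) and is_end_word(3)
```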

isEndWord
boolean
isEndWord( [in] string  aText,
[in] long  nPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nWordType );

Description
Checks whether a word ends at position nPos.
beginOfSentence
long
beginOfSentence( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale );

Description
Traverses in Text from nStartPos to the start of a sentence.
Returns
The position where the sentence starts.
endOfSentence
long
endOfSentence( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale );

Description
Traverses in Text from nStartPos to the end of a sentence.
Returns
The position where the sentence ends.
getLineBreak
LineBreakResults
getLineBreak( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] long  nMinBreakPos,
[in] LineBreakHyphenationOptions  aHyphOptions,
[in] LineBreakUserOptions  aUserOptions );

Description
Calculate the line break position in the Text from the specified nStartPos.
Parameter nMinBreakPos
Defines a minimum break position for a hyphenated line break. When the position for the hyphenated line break is less than nMinBreakPos, the break position in LineBreakResults is set to -1.
Parameter aHyphOptions
Defines if the hyphenator is to be used.
Parameter aUserOptions
Defines how to handle hanging punctuation and forbidden characters at the start/end of a line.
Returns
The LineBreakResults contain the break position of the line, the BreakType, and the ::com::sun::star::linguistic2::XHyphenatedWord.
beginOfScript
long
beginOfScript( [in] string  aText,
[in] long  nStartPos,
[in] short  nScriptType );

Description
Traverses in Text from nStartPos to the beginning of the specified script type.
Parameter nScriptType
One of ScriptType.
Returns
The position where the script type starts.
endOfScript
long
endOfScript( [in] string  aText,
[in] long  nStartPos,
[in] short  nScriptType );

Description
Traverses in Text from nStartPos to the end of the specified script type.
Parameter nScriptType
One of ScriptType.
Returns
The position where the script type ends.
nextScript
long
nextScript( [in] string  aText,
[in] long  nStartPos,
[in] short  nScriptType );

Description
Traverses in Text from nStartPos to the next start of the specified script type.
Parameter nScriptType
One of ScriptType.
Returns
The position where the next script type starts.
previousScript
long
previousScript( [in] string  aText,
[in] long  nStartPos,
[in] short  nScriptType );

Description
Traverses in Text from nStartPos to the previous start of the specified script type.
Parameter nScriptType
One of ScriptType.
Returns
The position where the previous script type starts.
getScriptType
short
getScriptType( [in] string  aText,
[in] long  nPos );

Description
Get the script type of the character at position nPos.
Returns
One of ScriptType.
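
The script-run methods above can be pictured with a Python sketch. Everything in it is an assumption made for illustration: the constants are local to the sketch, the classifier covers only a few Unicode ranges, weak characters get no special treatment, and returning -1 when the character at the start position is not of the requested type is a choice made here, not documented behaviour.

```python
LATIN, ASIAN, WEAK = 1, 2, 4  # local constants for this sketch only


def script_type(ch):
    """Very rough classifier; an assumption for illustration only."""
    if "\u4e00" <= ch <= "\u9fff" or "\u3040" <= ch <= "\u30ff":
        return ASIAN
    if ch.isalpha():
        return LATIN
    return WEAK


def begin_of_script(text, pos, wanted):
    """Start of the run of `wanted` script containing pos, or -1 if none."""
    if not (0 <= pos < len(text)) or script_type(text[pos]) != wanted:
        return -1
    while pos > 0 and script_type(text[pos - 1]) == wanted:
        pos -= 1
    return pos


def end_of_script(text, pos, wanted):
    """Exclusive end of the run of `wanted` script containing pos, or -1 if none."""
    if not (0 <= pos < len(text)) or script_type(text[pos]) != wanted:
        return -1
    while pos < len(text) and script_type(text[pos]) == wanted:
        pos += 1
    return pos


# "abc", three CJK ideographs, then "def"
text = "abc\u65e5\u672c\u8a9edef"
run = (begin_of_script(text, 4, ASIAN), end_of_script(text, 4, ASIAN))
```

The pair of results forms a half-open range, in line with the StartPos and EndPos assumption stated at the top of this page.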
beginOfCharBlock
long
beginOfCharBlock( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharType );

Description
Traverses in Text from nStartPos to the beginning of the specified character type.
Parameter nCharType
One of CharType.
Returns
The position where the character type starts.
endOfCharBlock
long
endOfCharBlock( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharType );

Description
Traverses in Text from nStartPos to the end of the specified character type.
Parameter nCharType
One of CharType.
Returns
The position where the character type ends.
nextCharBlock
long
nextCharBlock( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharType );

Description
Traverses in Text from nStartPos to the next start of the specified character type.
Parameter nCharType
One of CharType.
Returns
The position where the next character type starts.
previousCharBlock
long
previousCharBlock( [in] string  aText,
[in] long  nStartPos,
[in] ::com::sun::star::lang::Locale  aLocale,
[in] short  nCharType );

Description
Traverses in Text from nStartPos to the previous start of the specified character type.
Parameter nCharType
One of CharType.
Returns
The position where the previous character type starts.
Apache Software Foundation

Apache, OpenOffice, OpenOffice.org and the seagull logo are registered trademarks of The Apache Software Foundation. The Apache feather logo is a trademark of The Apache Software Foundation. Other names appearing on the site may be trademarks of their respective owners.