Difference between revisions of "LoadICUBreakIterator"

From Apache OpenOffice Wiki
(Breaking encapsulation of ICU BreakIterator: Link to Bug 88411, which is a better implementation than the one that patches ICU 3.6)
 

Latest revision as of 20:20, 14 March 2010

Breaking encapsulation of ICU BreakIterator

Because of Issue 84467 (a duplicate of Issue 81519), we create the iterator with the RuleBasedBreakIterator() constructor and then need to call setBreakType() on it.

There is a fix that makes the patch to ICU unnecessary: it creates a subclass of RuleBasedBreakIterator, which can access the protected setBreakType() member. The fix is tracked in Issue 88411.
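The subclassing idea can be illustrated without ICU itself. The following is only a minimal sketch of the access pattern, not the code attached to Issue 88411: StandInBreakIterator is a hypothetical stand-in for RuleBasedBreakIterator, and ExposedBreakIterator, setBreakTypePublic() and breakType() are names invented for this illustration.

```cpp
// Hypothetical stand-in for ICU's RuleBasedBreakIterator. Only the
// access level of setBreakType() matters for this sketch.
class StandInBreakIterator {
public:
    virtual ~StandInBreakIterator() = default;

    // Invented accessor so the effect of setBreakType() can be observed.
    int breakType() const { return fBreakType; }

protected:
    // Protected, as the text above describes for the real class.
    void setBreakType(int type) { fBreakType = type; }

private:
    int fBreakType = -1;  // -1 means "not set yet" in this sketch
};

// Hypothetical subclass: because it derives from the stand-in, it may call
// the protected member and can offer a public wrapper around it.
class ExposedBreakIterator : public StandInBreakIterator {
public:
    void setBreakTypePublic(int type) { setBreakType(type); }
};
```

With such a subclass, the calling code can set the break type through the public wrapper and the library itself stays unpatched.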

ICU code:

OpenOffice.org code:

Mailing list discussions:

Example reasons to use custom rules:

Use cases of loadICUBreakIterator

Questions:

  • Why does wordRule need to be static and preserved across calls?
  • Is the rule string word used at all? Are there other WordTypes?
Public methods, the loadICU call each group makes, and the resulting rule text:

Public method: nextCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
Public method: prevCharacters(Text, nStartPos, rLocale, SKIPCELL, sal_Int32 nCount, nDone)
loadICU call: loadICUBreakIterator(rLocale, LOAD_CHARACTER_BREAKITERATOR, 0, "char", Text)
Resulting rule text: char

Public method: nextWord(const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
Public method: previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, ANYWORD_IGNOREWHITESPACES)
Public method: getWordBoundary(const OUString& Text, sal_Int32 nPos, rLocale, ANYWORD_IGNOREWHITESPACES, sal_Bool bDirection)
loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, ANYWORD_IGNOREWHITESPACES, NULL, Text)
Resulting rule text: edit_word

Public method: nextWord(const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
Public method: previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, DICTIONARY_WORD)
Public method: getWordBoundary(const OUString& Text, sal_Int32 nPos, rLocale, DICTIONARY_WORD, sal_Bool bDirection)
loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, DICTIONARY_WORD, NULL, Text)
Resulting rule text: dict_word

Public method: nextWord(const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
Public method: previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, WORD_COUNT)
Public method: getWordBoundary(const OUString& Text, sal_Int32 nPos, rLocale, WORD_COUNT, sal_Bool bDirection)
loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, WORD_COUNT, NULL, Text)
Resulting rule text: count_word

Public method: nextWord(const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
Public method: previousWord(const OUString& Text, sal_Int32 nStartPos, rLocale, another_word_type)
Public method: getWordBoundary(const OUString& Text, sal_Int32 nPos, rLocale, another_word_type, sal_Bool bDirection)
loadICU call: loadICUBreakIterator(rLocale, LOAD_WORD_BREAKITERATOR, another_word_type, NULL, Text)
Resulting rule text: word (???)

Public method: beginOfSentence(const OUString& Text, sal_Int32 nStartPos, rLocale)
Public method: endOfSentence(const OUString& Text, sal_Int32 nStartPos, rLocale)
loadICU call: loadICUBreakIterator(rLocale, LOAD_SENTENCE_BREAKITERATOR, 0, NULL, Text)
Resulting rule text: NULL

Public method: getLineBreak(const OUString& Text, sal_Int32 nStartPos, const lang::Locale& rLocale, sal_Int32 nMinBreakPos, const LineBreakHyphenationOptions& hOptions, const LineBreakUserOptions& /*rOptions*/)
loadICU call: loadICUBreakIterator(rLocale, LOAD_LINE_BREAKITERATOR, 0, "line", Text)
Resulting rule text: line
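For the word-break calls listed above, the rule text depends only on the word type passed in. The sketch below restates that mapping as a small function. It is an illustration, not the OpenOffice.org source: the enum is a local stand-in for the WordType constants named above, its numeric values are assumptions, and wordRuleName() is a hypothetical helper name.

```cpp
#include <string>

// Local stand-ins for the word type constants used above.
// The numeric values are assumptions made for this sketch.
enum WordTypeSketch {
    ANY_WORD = 0,
    ANYWORD_IGNOREWHITESPACES = 1,
    DICTIONARY_WORD = 2,
    WORD_COUNT = 3
};

// Hypothetical helper: returns the rule text listed above for a word type.
inline std::string wordRuleName(int wordType) {
    switch (wordType) {
        case ANYWORD_IGNOREWHITESPACES: return "edit_word";
        case DICTIONARY_WORD:           return "dict_word";
        case WORD_COUNT:                return "count_word";
        default:                        return "word";  // the "(???)" case above
    }
}
```

Whether the default branch is ever reached in practice is exactly the open question raised about the rule string word.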
  1. Check whether the locale's BreakIteratorRules ({edit_word, dict_word, count_word, char, line}) provide a rule for the requested locale.
  2. If not, try to load the rule named rule + _ + lang anyway.
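The two lookup steps above can be sketched as a small function. This is a hedged illustration only: LocaleRulesSketch is a hypothetical in-memory stand-in for the locale data, resolveRuleName() is an invented name, and the real code loads rule data rather than just computing a name.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for locale data:
// language -> (rule kind such as "edit_word" or "line" -> rule name).
using LocaleRulesSketch =
    std::map<std::string, std::map<std::string, std::string>>;

// Step 1: use the rule name the locale data provides, if any.
// Step 2: otherwise fall back to rule + "_" + lang, to be tried anyway.
inline std::string resolveRuleName(const LocaleRulesSketch& data,
                                   const std::string& lang,
                                   const std::string& rule) {
    auto langIt = data.find(lang);
    if (langIt != data.end()) {
        auto ruleIt = langIt->second.find(rule);
        if (ruleIt != langIt->second.end() && !ruleIt->second.empty()) {
            return ruleIt->second;
        }
    }
    return rule + "_" + lang;
}
```

In this sketch a locale with no entry for dict_word and language hu would yield the name dict_word_hu, which the caller would then attempt to load.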