Qizx/open API
java.lang.Object
  net.axyana.qizxopen.util.DefaultWordSifter
A default word extractor suitable for European languages compatible with ISO-8859-1.
By default, a word starts with a letter and may contain letters and digits. Unless configured otherwise, characters are folded to lowercase and accented letters are converted to the corresponding unaccented letters.
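The folding described above can be illustrated with a small stand-alone sketch. It does not use Qizx/open and is not the library's implementation: the class `FoldSketch` and its `fold` method are hypothetical names, and the use of `java.text.Normalizer` to strip diacritics is an assumption chosen for illustration. The two flags mirror the `caseSensitive` and `accentSensitive` constructor parameters.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Qizx/open: mimics the documented folding.
class FoldSketch {

    // Folds one character according to the two sensitivity flags.
    static char fold(char c, boolean caseSensitive, boolean accentSensitive) {
        if (!accentSensitive) {
            // NFD splits an accented letter into its base letter followed by
            // combining marks; keeping the first char drops the marks.
            String decomposed =
                Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
            c = decomposed.charAt(0);
        }
        if (!caseSensitive) {
            c = Character.toLowerCase(c);
        }
        return c;
    }

    public static void main(String[] args) {
        // '\u00C9' is LATIN CAPITAL LETTER E WITH ACUTE.
        System.out.println(fold('\u00C9', false, false)); // prints "e"
        System.out.println(fold('\u00C9', true, false));  // prints "E"
    }
}
```

With both flags set to false, the accented capital folds to a plain lowercase 'e'. Setting `caseSensitive` to true keeps the capital while still dropping the accent.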
Constructor Summary
DefaultWordSifter()
    Builds a case-insensitive and accent-insensitive sifter.
DefaultWordSifter(boolean caseSensitive, boolean accentSensitive)
    Builds a sifter with the specified case and accent sensitivity.
Method Summary
char charAt(int ahead)
    Returns the character at the current position + ahead, or 0 if past the end.
boolean isWordPart(char c)
    Returns true if the character can be part of a word.
boolean isWordStart(char c)
    Returns true if the character can start a word.
char mapChar(char c)
    Normalizes a character belonging to a word.
char nextChar()
    Moves to the next character and returns it, or returns 0 if at the end.
char[] nextWord()
    Gets the next normalized word, or null if there are no more words.
void start(char[] text, int length)
    Starts the analysis of a new text chunk.
char wildcardSeveral()
    Returns the wildcard character that matches several characters.
char wildcardSingle()
    Returns the wildcard character that matches a single character.
int wordLength()
    Returns the original length of the last word returned by nextWord.
int wordOffset()
    Returns the offset of the last word returned by nextWord.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
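The start / nextWord / wordOffset / wordLength protocol summarized above can be sketched as a loop. The class below is a hypothetical stand-in written for this illustration so that the example runs without Qizx/open on the classpath. It copies only the four method signatures from the summary. Its word rules (start on a letter, continue on letters or digits, fold to lowercase) are a simplified assumption and it omits accent folding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a word sifter; not the Qizx/open class.
class SifterLoopSketch {
    private char[] text;
    private int length;
    private int pos;
    private int wordOffset;
    private int wordLength;

    // Starts the analysis of a new text chunk.
    void start(char[] text, int length) {
        this.text = text;
        this.length = length;
        this.pos = 0;
    }

    // Returns the next lowercased word, or null if no more words.
    char[] nextWord() {
        // Skip characters that cannot start a word.
        while (pos < length && !Character.isLetter(text[pos])) pos++;
        if (pos >= length) return null;
        wordOffset = pos;
        // Consume letters and digits.
        while (pos < length && Character.isLetterOrDigit(text[pos])) pos++;
        wordLength = pos - wordOffset;
        char[] word = new char[wordLength];
        for (int i = 0; i < wordLength; i++) {
            word[i] = Character.toLowerCase(text[wordOffset + i]);
        }
        return word;
    }

    int wordOffset() { return wordOffset; }
    int wordLength() { return wordLength; }

    // Typical calling pattern: start once, then call nextWord until null.
    static List<String> describe(String input) {
        SifterLoopSketch sifter = new SifterLoopSketch();
        char[] chars = input.toCharArray();
        sifter.start(chars, chars.length);
        List<String> out = new ArrayList<>();
        for (char[] w = sifter.nextWord(); w != null; w = sifter.nextWord()) {
            out.add(new String(w) + "@" + sifter.wordOffset() + "+" + sifter.wordLength());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [hello@0+5, world@7+5, x9@16+2]
        System.out.println(describe("Hello, World 42 x9!"));
    }
}
```

Note that "42" is skipped because, under the rule assumed here, a word cannot start on a digit, whereas "x9" is kept because digits are accepted inside a word.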
Constructor Detail
public DefaultWordSifter()
public DefaultWordSifter(boolean caseSensitive, boolean accentSensitive)
Parameters:
caseSensitive - if false, uppercase and lowercase characters are equivalent.
accentSensitive - if false, a letter with diacritics is equivalent to the same letter without them; for example, 'é' is equivalent to 'e'.
Method Detail
public void start(char[] text, int length)
Specified by: start in interface WordSifter

public boolean isWordStart(char c)
Specified by: isWordStart in interface WordSifter

public boolean isWordPart(char c)
Specified by: isWordPart in interface WordSifter

public char wildcardSeveral()
Specified by: wildcardSeveral in interface WordSifter

public char wildcardSingle()
Specified by: wildcardSingle in interface WordSifter

public char mapChar(char c)
Specified by: mapChar in interface WordSifter

public char[] nextWord()
Specified by: nextWord in interface WordSifter

public char charAt(int ahead)
Specified by: charAt in interface WordSifter

public char nextChar()
Specified by: nextChar in interface WordSifter

public int wordOffset()
Specified by: wordOffset in interface WordSifter

public int wordLength()
Specified by: wordLength in interface WordSifter
© 2005 Axyana Software