Enum Class TdbSegmentationType
All Implemented Interfaces:
Serializable, Comparable<TdbSegmentationType>, Constable
Types of token segmentation that can be performed on Chinese text.
-
Nested Class Summary
Nested classes/interfaces inherited from class java.lang.Enum
Enum.EnumDesc<E extends Enum<E>>
-
Enum Constant Summary
Enum Constants
AllTokens
The segmentation algorithm will emit all possible segments for a String of Chinese characters.
MaxLengthOnly
The segmentation algorithm will attempt to match only the longest possible segment (String) of characters.
None
No special word segmentation is performed; each character is indexed as a word in its own right. This is the default behavior and reflects historical TRIP behavior.
Word
This segmentation algorithm is similar to MaxLengthOnly and adds re-segmentation of all words longer than three Chinese characters.
-
Method Summary
getName()
Return the TRIPxpi protocol name for the current identifier.
static TdbSegmentationType getTypeof(name)
Retrieve the type ID that matches the provided name.
static TdbSegmentationType valueOf(name)
Returns the enum constant of this class with the specified name.
static TdbSegmentationType[] values()
Returns an array containing the constants of this enum class, in the order they are declared.
Methods inherited from class java.lang.Enum
compareTo, describeConstable, equals, getDeclaringClass, hashCode, name, ordinal, toString, valueOf
-
Enum Constant Details
-
None
No special word segmentation is performed; each character is indexed as a word in its own right. This is the default behavior and reflects historical TRIP behavior.
MaxLengthOnly
The segmentation algorithm will attempt to match only the longest possible segment (String) of characters. This will result in a smaller index and faster searching, but also has the potential to miss or incorrectly index certain terms.
Word
This segmentation algorithm is similar to MaxLengthOnly and adds re-segmentation of all words longer than three Chinese characters.
AllTokens
The segmentation algorithm will emit all possible segments for a String of Chinese characters. This will result in a larger index and possibly slower searches than using MaxLengthOnly, but has much lower potential for missing terms.
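The trade-off between None, MaxLengthOnly and AllTokens can be illustrated with a toy dictionary-based segmenter. This is only a sketch under stated assumptions, not TRIP's actual algorithm: the class SegmentationSketch, its Mode enum, the sample text and the dictionary are all hypothetical, and the AllTokens branch assumes that single characters are emitted alongside dictionary words. Word is left out because its re-segmentation rule is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy illustration only: this is NOT TRIP's segmenter, and Mode is a
// hypothetical stand-in for three of the documented constants.
class SegmentationSketch {

    enum Mode { None, MaxLengthOnly, AllTokens }

    // "Zhonghua Renmin Gongheguo" (People's Republic of China), written with
    // Unicode escapes so the file compiles under any source encoding.
    static final String TEXT = "\u4e2d\u534e\u4eba\u6c11\u5171\u548c\u56fd";

    // Made-up dictionary: "Zhonghua", "renmin", "gongheguo" and the full name.
    static final Set<String> DICT = Set.of(
            "\u4e2d\u534e", "\u4eba\u6c11", "\u5171\u548c\u56fd", TEXT);

    static List<String> segment(String text, Set<String> dict, Mode mode) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int next = start + 1; // by default, advance one character
            if (mode == Mode.None) {
                // Every character is a token in its own right.
                out.add(text.substring(start, next));
            } else if (mode == Mode.MaxLengthOnly) {
                // Take the longest dictionary word starting here, else one character.
                for (int end = text.length(); end > start + 1; end--) {
                    if (dict.contains(text.substring(start, end))) {
                        next = end;
                        break;
                    }
                }
                out.add(text.substring(start, next));
            } else {
                // AllTokens: the single character plus every dictionary word starting here.
                out.add(text.substring(start, next));
                for (int end = start + 2; end <= text.length(); end++) {
                    String candidate = text.substring(start, end);
                    if (dict.contains(candidate)) {
                        out.add(candidate);
                    }
                }
            }
            start = next;
        }
        return out;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + segment(TEXT, DICT, mode).size() + " tokens");
        }
    }
}
```

With this made-up dictionary, the sketch produces 7 tokens for None, 1 for MaxLengthOnly and 11 for AllTokens. The single MaxLengthOnly token is the whole name, so the embedded word 人民 is not a token of its own in that mode, while AllTokens does emit it; this mirrors the smaller-index versus missed-terms trade-off described above.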
-
-
Method Details
-
values
Returns an array containing the constants of this enum class, in the order they are declared.
Returns:
an array containing the constants of this enum class, in the order they are declared
-
valueOf
Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
Parameters:
name - the name of the enum constant to be returned.
Returns:
the enum constant with the specified name
Throws:
IllegalArgumentException - if this enum class has no constant with the specified name
NullPointerException - if the argument is null
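Because valueOf requires an exact, case-sensitive match and throws on anything else, callers that read the segmentation type from configuration or user input may want a tolerant wrapper. The following is a minimal sketch of that pattern. It uses a hypothetical stand-in enum, SegmentationType, declared with the four documented constant names, so that the example compiles on its own; the declaration order shown is an assumption. The same approach should apply to TdbSegmentationType.valueOf, since the exception behavior is that of any Java enum.

```java
import java.util.Optional;

class ValueOfSketch {

    // Hypothetical stand-in that mirrors the four documented constant names;
    // the declaration order here is an assumption.
    enum SegmentationType { None, MaxLengthOnly, Word, AllTokens }

    // Returns the matching constant, or an empty Optional instead of throwing.
    static Optional<SegmentationType> parse(String name) {
        if (name == null) {
            return Optional.empty(); // valueOf(null) would throw NullPointerException
        }
        try {
            return Optional.of(SegmentationType.valueOf(name));
        } catch (IllegalArgumentException e) {
            // Wrong case, stray whitespace, or an unknown name.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("MaxLengthOnly"));  // exact identifier
        System.out.println(parse("maxlengthonly"));  // wrong case
        System.out.println(parse(" Word"));          // leading whitespace
    }
}
```

Returning an Optional keeps the strict matching rule intact while letting the caller decide on a fallback, for example the default None behavior.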
-
getTypeof
Retrieve the type ID that matches the provided name.
Parameters:
name - The name to match
Returns:
The type ID
-
getName
Return the TRIPxpi protocol name for the current identifier.
Returns:
The name
-