Enum TdbSegmentationType

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Comparable<TdbSegmentationType>

    public enum TdbSegmentationType
    extends java.lang.Enum<TdbSegmentationType>
    Types of token segmentation that can be performed on Chinese text.
    • Enum Constant Summary

      Enum Constants 
      Enum Constant Description
      AllTokens
      The segmentation algorithm will emit all possible segments for a String of Chinese characters.
      MaxLengthOnly
      The segmentation algorithm will attempt to match only the longest possible segment (String) of characters.
      None
      No special word segmentation is performed; each character is indexed as a word in its own right. This is the default behavior and reflects historical TRIP behavior.
      Word
      This segmentation algorithm is similar to MaxLengthOnly, and additionally re-segments all words longer than three Chinese characters.
    • Method Summary

      Modifier and Type Method Description
      java.lang.String getName()
      Returns the TRIPxpi protocol name for this segmentation type.
      static TdbSegmentationType getTypeof(java.lang.String name)
      Returns the segmentation type that matches the provided name.
      static TdbSegmentationType valueOf(java.lang.String name)
      Returns the enum constant of this type with the specified name.
      static TdbSegmentationType[] values()
      Returns an array containing the constants of this enum type, in the order they are declared.
      • Methods inherited from class java.lang.Enum

        compareTo, equals, getDeclaringClass, hashCode, name, ordinal, toString, valueOf
      • Methods inherited from class java.lang.Object

        getClass, notify, notifyAll, wait, wait, wait
    • Enum Constant Detail

      • None

        public static final TdbSegmentationType None
        No special word segmentation is performed; each character is indexed as a word in its own right. This is the default behavior and reflects historical TRIP behavior.
      • MaxLengthOnly

        public static final TdbSegmentationType MaxLengthOnly
        The segmentation algorithm will attempt to match only the longest possible segment (String) of characters. This will result in a smaller index and faster searching, but also has the potential to miss or incorrectly index certain terms.
      • Word

        public static final TdbSegmentationType Word
        This segmentation algorithm is similar to MaxLengthOnly, and additionally re-segments all words longer than three Chinese characters.
      • AllTokens

        public static final TdbSegmentationType AllTokens
        The segmentation algorithm will emit all possible segments for a String of Chinese characters. This will result in a larger index and possibly slower searches than using MaxLengthOnly but has much lower potential for missing terms.
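The trade-off between None, MaxLengthOnly and AllTokens can be illustrated with a toy dictionary-based segmenter. This is a minimal sketch under stated assumptions, not TRIP's actual algorithm: the `SegmentationSketch` class, its `Mode` enum, the `segment` method and the caller-supplied dictionary are all hypothetical. Word mode is left out because its re-segmentation rules are not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy illustration only; this is not TRIP's segmentation algorithm.
class SegmentationSketch {

    // Hypothetical stand-ins for None, MaxLengthOnly and AllTokens.
    enum Mode { NONE, MAX_LENGTH_ONLY, ALL_TOKENS }

    // Splits text into tokens using a caller-supplied dictionary of known words.
    static List<String> segment(String text, Set<String> dict, Mode mode) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            String single = text.substring(i, i + 1);
            if (mode == Mode.NONE) {
                // Every character is a token in its own right.
                tokens.add(single);
                i++;
            } else if (mode == Mode.MAX_LENGTH_ONLY) {
                // Greedy: keep the longest dictionary word starting at i,
                // falling back to the single character, then skip past it.
                String best = single;
                for (int end = i + 2; end <= text.length(); end++) {
                    String candidate = text.substring(i, end);
                    if (dict.contains(candidate)) {
                        best = candidate;
                    }
                }
                tokens.add(best);
                i += best.length();
            } else {
                // ALL_TOKENS (assumed semantics): emit the character and
                // every dictionary word that starts at position i.
                tokens.add(single);
                for (int end = i + 2; end <= text.length(); end++) {
                    String candidate = text.substring(i, end);
                    if (dict.contains(candidate)) {
                        tokens.add(candidate);
                    }
                }
                i++;
            }
        }
        return tokens;
    }
}
```

With a dictionary containing 中国, 中国人, 人民 and 国人, the input 中国人民 yields four single-character tokens in `NONE` mode, the two tokens 中国人 and 民 in `MAX_LENGTH_ONLY` mode, and eight tokens in `ALL_TOKENS` mode. The greedy mode never emits 人民, which shows how a longest-match strategy can miss a term that an exhaustive strategy would index.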
    • Method Detail

      • values

        public static TdbSegmentationType[] values()
        Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
        for (TdbSegmentationType c : TdbSegmentationType.values())
            System.out.println(c);
        
        Returns:
        an array containing the constants of this enum type, in the order they are declared
      • valueOf

        public static TdbSegmentationType valueOf(java.lang.String name)
        Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
        Parameters:
        name - the name of the enum constant to be returned.
        Returns:
        the enum constant with the specified name
        Throws:
        java.lang.IllegalArgumentException - if this enum type has no constant with the specified name
        java.lang.NullPointerException - if the argument is null
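Because `valueOf` requires an exact, case-sensitive match and throws on anything else, callers that accept names from configuration or user input may want a guarded lookup. The sketch below uses a local stand-in enum, `SegType`, that mirrors the documented constants so it compiles without the TRIP library; `parseOrDefault` is a hypothetical helper, not part of this API.

```java
class ValueOfSketch {

    // Local stand-in mirroring the documented constants of TdbSegmentationType.
    enum SegType { None, MaxLengthOnly, Word, AllTokens }

    // Hypothetical helper: resolves a constant name and falls back to None,
    // the documented default, when the name cannot be resolved.
    static SegType parseOrDefault(String name) {
        if (name == null) {
            // valueOf(null) would throw NullPointerException.
            return SegType.None;
        }
        try {
            // valueOf rejects surrounding whitespace, so trim first.
            return SegType.valueOf(name.trim());
        } catch (IllegalArgumentException e) {
            // Unknown or differently-cased names end up here.
            return SegType.None;
        }
    }
}
```

Falling back to `None` is a design choice of this sketch; code that should fail fast on a misspelled name can let the `IllegalArgumentException` propagate instead.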
      • getTypeof

        public static TdbSegmentationType getTypeof(java.lang.String name)
        Returns the segmentation type that matches the provided name.
        Parameters:
        name - The name to match
        Returns:
        The type ID
      • getName

        public java.lang.String getName()
        Returns the TRIPxpi protocol name for this segmentation type.
        Returns:
        The name
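The pairing of `getName()` and `getTypeof(String)` resembles a common enum pattern in which each constant carries an external name and a static lookup maps that name back to the constant. The sketch below shows that pattern with a local stand-in enum. It makes several assumptions: that `getTypeof` matches against the same name `getName` returns, that the lookup is an exact match, and that an unknown name yields `null`. None of this is confirmed by the documentation above, and the protocol strings are placeholders, not the real TRIPxpi names.

```java
import java.util.HashMap;
import java.util.Map;

class ProtocolNameSketch {

    // Stand-in enum; the protocol strings are placeholders, not real TRIPxpi names.
    enum SegType {
        None("placeholder-none"),
        MaxLengthOnly("placeholder-maxlength"),
        Word("placeholder-word"),
        AllTokens("placeholder-alltokens");

        // Reverse index from protocol name to constant, built once at class load.
        private static final Map<String, SegType> BY_NAME = new HashMap<>();
        static {
            for (SegType t : values()) {
                BY_NAME.put(t.protocolName, t);
            }
        }

        private final String protocolName;

        SegType(String protocolName) {
            this.protocolName = protocolName;
        }

        // Returns the external (protocol) name of this constant.
        String getName() {
            return protocolName;
        }

        // Returns the constant for a protocol name, or null if none matches
        // (an assumption of this sketch; the real behavior is undocumented).
        static SegType getTypeof(String name) {
            return BY_NAME.get(name);
        }
    }
}
```

Under these assumptions the two methods round-trip: `SegType.getTypeof(t.getName())` returns `t` for every constant, whereas `valueOf` only accepts the Java identifier.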