Class TdbTextExtractionInfo

java.lang.Object
com.tietoenator.trip.jxp.data.TdbTextExtractionInfo

public class TdbTextExtractionInfo extends Object
Container for text extraction information. Access to instances of this class is provided via properties in the TdbTextField class. Use this class to perform and control text extraction using TRIPcof or TRIPview-C.
  • Method Summary

    Modifier and Type
    Method
    Description
    String
    getBinaryCopyField()
    Get the name of the STRING field that receives a copy of the document data, and/or contains the document data to extract text from.
    boolean
    getClientSide()
    Determines if the text extraction should be performed on the server (false; default) or on the client-side (true).
    boolean
    getExtractFromStored()
    Determines if the text extraction should be performed from the value already stored in the STRING field defined by the property BinaryCopyField.
    boolean
    getExtractText()
    Indicates if text extraction is to be performed.
    String
    getFileName()
    Get the name of the file that the Stream property refers to.
    String
    getPropertyNameField()
    Get the name of the phrase field to receive document property names during text extraction.
    String
    getPropertyValueField()
    Get the name of the phrase field to receive document property values during text extraction.
    InputStream
    getStream()
    Get the stream for document data to extract text from and optionally store a copy of.
    TdbStringField
    getStringField()
    If this method returns a non-null value, text extraction will be performed from any value assigned to the TdbStringField instance in question.
    String
    getTextField()
    The text field in which the extracted text is to be stored.
    void
    setBinaryCopyField(String binaryCopyField)
    Set the name of the STRING field that receives a copy of the document data, and/or contains the document data to extract text from.
    void
    setClientSide(boolean clientside)
    If set to true, text extraction will be performed on the client-side and require a local installation of TRIPcof or of TRIPview-C.
    void
    setExtractFromStored(boolean extractFromStored)
    Determines if the text extraction should be performed from the value already stored in the STRING field defined by the property BinaryCopyField.
    void
    setExtractText(boolean doExtractText)
    Indicates if text extraction is to be performed.
    void
    setFileName(String filename)
    Set the name of the file that the Stream property refers to.
    void
    setPropertyNameField(String propertyNameField)
    Sets the name of the phrase field to receive document property names during text extraction.
    void
    setPropertyValueField(String propertyValueField)
    Sets the name of the phrase field to receive document property values during text extraction.
    void
    setStream(InputStream is)
    Set a stream for document data to extract text from and optionally store a copy of.
    void
    setStringField(TdbStringField stringField)
    If assigned a non-null value, text extraction will be performed from any value assigned to the provided TdbStringField instance.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • getClientSide

      public boolean getClientSide()
      Determines if the text extraction should be performed on the server (false; default) or on the client-side (true).
      Returns:
      True to extract text on client and false to extract text on server.

      If TRIPcof is used, the TRIPcof installation directory must be specified as either a system property with the name TRIPCOF_HOME, or an environment variable with the name TRIPCOF_HOME.

      If TRIPview-C is used, the TRIPview-C installation directory must be specified as either a system property with the name TRIPVIEW_HOME, or an environment variable with the name TRIPVIEW_HOME.

      If the text extraction is performed on the client-side, it is fully performed before commit of the record. The data transmitted to the server for the associated text field will be the extracted text. No TRIPview-C/TRIPcof processing is then done on the server.

    • setClientSide

      public void setClientSide(boolean clientside)
      If set to true, text extraction will be performed on the client-side and require a local installation of TRIPcof or of TRIPview-C.
      Parameters:
      clientside - True to extract text on client and false to extract text on server.

      If TRIPcof is used, the TRIPcof installation directory must be specified as either a system property with the name TRIPCOF_HOME, or an environment variable with the name TRIPCOF_HOME.

      If TRIPview-C is used, the TRIPview-C installation directory must be specified as either a system property with the name TRIPVIEW_HOME, or an environment variable with the name TRIPVIEW_HOME.

      If the text extraction is performed on the client-side, it is fully performed before commit of the record. The data transmitted to the server for the associated text field will be the extracted text. No TRIPview-C/TRIPcof processing is then done on the server.
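
      Before enabling client-side extraction, an application may want to check that one of the two installation directories is actually configured. The following is a minimal sketch using only the JDK. The class ExtractorHomeCheck and its methods are hypothetical helpers, not part of the TRIP API, and checking the system property before the environment variable is an assumption; the text above only says that either may be used.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper; not part of the TRIP API. */
final class ExtractorHomeCheck {

    /**
     * Looks up a setting as a system property first and as an environment
     * variable second. The lookup order is an assumption of this sketch.
     */
    static Optional<String> resolve(String name, Properties props, Map<String, String> env) {
        String value = props.getProperty(name);
        if (value == null || value.isEmpty()) {
            value = env.get(name);
        }
        return (value == null || value.isEmpty()) ? Optional.empty() : Optional.of(value);
    }

    /** True if a TRIPcof or a TRIPview-C installation directory is configured. */
    static boolean canExtractOnClient(Properties props, Map<String, String> env) {
        return resolve("TRIPCOF_HOME", props, env).isPresent()
                || resolve("TRIPVIEW_HOME", props, env).isPresent();
    }

    public static void main(String[] args) {
        boolean configured = canExtractOnClient(System.getProperties(), System.getenv());
        System.out.println("Client-side extraction configured: " + configured);
    }
}
```

      A caller could pass the result straight to setClientSide(boolean), so that extraction falls back to the server when no local installation is configured. The check only confirms that a value is set, not that the directory holds a working installation.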

    • getTextField

      public String getTextField()
      The text field in which the extracted text is to be stored.
    • getBinaryCopyField

      public String getBinaryCopyField()
      Get the name of the STRING field that receives a copy of the document data, and/or contains the document data to extract text from.

      If a STRING field is specified in this property and neither the FileName nor the Stream property is specified, text extraction will be performed from the data already stored in the specified string field. Note that this is only supported for updates of existing records.

    • setBinaryCopyField

      public void setBinaryCopyField(String binaryCopyField)
      Set the name of the STRING field that receives a copy of the document data, and/or contains the document data to extract text from.

      If a STRING field is specified in this property and neither the FileName nor the Stream property is specified, text extraction will be performed from the data already stored in the specified string field. Note that this is only supported for updates of existing records.

      If setStringField(TdbStringField) has been called with a non-null value that refers to a different STRING field than this property does, that value is reset to null when setBinaryCopyField is called with a non-null, non-empty argument.

      When setStringField is called with a non-null value, the BinaryCopyField property is automatically assigned the name of the associated STRING field.
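
      The coupling between the two setters can be easier to follow as code. The sketch below is a stand-in model of the behavior described above and is not the real TdbTextExtractionInfo. The class CopyFieldCouplingSketch and the nested StringFieldStub are hypothetical, and comparing field names with an exact, case-sensitive match is an assumption.

```java
/** Stand-in model of the documented setter coupling; NOT the real class. */
final class CopyFieldCouplingSketch {

    /** Hypothetical stand-in for TdbStringField; only the field name is modelled. */
    static final class StringFieldStub {
        private final String name;

        StringFieldStub(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    private String binaryCopyField;
    private StringFieldStub stringField;

    /** A non-null string field also assigns its name to BinaryCopyField. */
    void setStringField(StringFieldStub field) {
        stringField = field;
        if (field != null) {
            binaryCopyField = field.getName();
        }
    }

    /** A non-null, non-empty name that differs from the string field resets that field. */
    void setBinaryCopyField(String name) {
        boolean hasName = name != null && !name.isEmpty();
        // Exact name comparison is an assumption of this sketch.
        if (hasName && stringField != null && !stringField.getName().equals(name)) {
            stringField = null;
        }
        binaryCopyField = name;
    }

    String getBinaryCopyField() {
        return binaryCopyField;
    }

    StringFieldStub getStringField() {
        return stringField;
    }
}
```

      In this model, calling setBinaryCopyField with the same name as the assigned string field leaves the string field in place, while a different name clears it.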

    • getStringField

      public TdbStringField getStringField()
      If this method returns a non-null value, text extraction will be performed from any value assigned to the TdbStringField instance in question.
    • setStringField

      public void setStringField(TdbStringField stringField)
      If assigned a non-null value, text extraction will be performed from any value assigned to the provided TdbStringField instance.

      When this property is assigned a non-null value, the value of the BinaryCopyField property is assigned the name of the assigned string field automatically.

      When this method is called with a non-null value, the property Stream will be ignored. The property FileName is still required, though.

      If text extraction is to be performed from a value already stored in a string field, it is better not to include the TdbStringField instance at all in the TdbRecord object used for commit, and instead call setExtractFromStored with 'true' as argument.

    • getPropertyNameField

      public String getPropertyNameField()
      Get the name of the phrase field to receive document property names during text extraction.
    • setPropertyNameField

      public void setPropertyNameField(String propertyNameField)
      Sets the name of the phrase field to receive document property names during text extraction.
    • getPropertyValueField

      public String getPropertyValueField()
      Get the name of the phrase field to receive document property values during text extraction.
    • setPropertyValueField

      public void setPropertyValueField(String propertyValueField)
      Sets the name of the phrase field to receive document property values during text extraction.
    • getFileName

      public String getFileName()
      Get the name of the file that the Stream property refers to.

      The file name is a valuable aid to the text extractor in helping to determine the type of the file, if it cannot be determined by any other means.

      If FileName is specified and refers to an existing local file and the Stream property is null, text extraction will be performed from the named file. This is the preferred choice if the application is running on the TRIP server machine.

    • setFileName

      public void setFileName(String filename)
      Set the name of the file that the Stream property refers to.

      The file name is a valuable aid to the text extractor in helping to determine the type of the file, if it cannot be determined by any other means.

      If FileName is specified and refers to an existing local file and the Stream property is null, text extraction will be performed from the named file. This is the preferred choice if the application is running on the TRIP server machine.

    • getStream

      public InputStream getStream()
      Get the stream for document data to extract text from and optionally store a copy of.
    • setStream

      public void setStream(InputStream is)
      Set a stream for document data to extract text from and optionally store a copy of.
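
      The StringField, Stream, FileName and BinaryCopyField properties together decide where the document data comes from. The following sketch summarizes the documented rules as a single decision function. It is an illustration only: the class ExtractionSourceSketch, the Source enum and the method choose are hypothetical, and the order in which the rules are checked is an assumption where the descriptions do not state a precedence.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical summary of the documented source rules; not part of the TRIP API. */
final class ExtractionSourceSketch {

    enum Source { STRING_FIELD_VALUE, STREAM, LOCAL_FILE, STORED_COPY, NONE }

    static Source choose(boolean hasStringField, boolean hasStream,
                         String fileName, String binaryCopyField) {
        boolean hasFileName = fileName != null && !fileName.isEmpty();
        boolean hasCopyField = binaryCopyField != null && !binaryCopyField.isEmpty();

        // A non-null string field wins; the Stream property is then ignored.
        if (hasStringField) {
            return Source.STRING_FIELD_VALUE;
        }
        // Otherwise a supplied stream provides the document data.
        if (hasStream) {
            return Source.STREAM;
        }
        // No stream, but FileName refers to an existing local file.
        if (hasFileName && Files.isRegularFile(Paths.get(fileName))) {
            return Source.LOCAL_FILE;
        }
        // Neither FileName nor Stream: use the data already stored in the
        // STRING field (supported only for updates of existing records).
        if (!hasFileName && hasCopyField) {
            return Source.STORED_COPY;
        }
        return Source.NONE;
    }
}
```

      The function takes booleans for the string field and stream so that it stays independent of the TRIP classes; a real caller would test getStringField() and getStream() for null instead.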
    • getExtractText

      public boolean getExtractText()
      Indicates if text extraction is to be performed. Must be set to true in order for extraction to work.
      Returns:
      true if text extraction is to be performed
    • setExtractText

      public void setExtractText(boolean doExtractText)
      Indicates if text extraction is to be performed. Must be set to true in order for extraction to work.
      Parameters:
      doExtractText - Pass true to perform text extraction.
    • getExtractFromStored

      public boolean getExtractFromStored()
      Determines if the text extraction should be performed from the value already stored in the STRING field defined by the property BinaryCopyField. This property is false by default.
    • setExtractFromStored

      public void setExtractFromStored(boolean extractFromStored) throws TdbException
      Determines if the text extraction should be performed from the value already stored in the STRING field defined by the property BinaryCopyField.

      Set this property to true if you wish to perform a text extraction on data previously added to a STRING field, but do not intend or wish to supply the value again.

      This property is false by default.

      Parameters:
      extractFromStored - True to extract from stored file, false otherwise.
      Throws:
      TdbException - If the TRIPsystem version is too old.
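
      Re-extracting text from a previously stored document involves three of the setters described on this page. The sketch below shows one possible call sequence. Because this page does not show how an instance is obtained from TdbTextField, the code works against a hypothetical interface, ExtractionSettings, that mirrors the three documented setter signatures; a plain Exception stands in for TdbException. Switching extraction off again when the server is too old is a design choice of this sketch, not documented behavior.

```java
/** Hypothetical call-sequence sketch; not part of the TRIP API. */
final class ReextractSketch {

    /** Stand-in mirroring three documented setters of TdbTextExtractionInfo. */
    interface ExtractionSettings {
        void setBinaryCopyField(String binaryCopyField);

        void setExtractText(boolean doExtractText);

        // The real method is documented to throw TdbException.
        void setExtractFromStored(boolean extractFromStored) throws Exception;
    }

    /**
     * Requests extraction from the data already stored in the named STRING
     * field. Returns false if the request was rejected.
     */
    static boolean requestReextraction(ExtractionSettings info, String copyField) {
        info.setBinaryCopyField(copyField);
        // ExtractText must be true for any extraction to take place.
        info.setExtractText(true);
        try {
            info.setExtractFromStored(true);
            return true;
        } catch (Exception rejected) {
            // For example, the TRIPsystem version is too old.
            info.setExtractText(false);
            return false;
        }
    }
}
```

      With the real class, the same three calls would be made on the TdbTextExtractionInfo instance before the record is committed, and the TdbStringField itself would be left out of the TdbRecord, as recommended under setStringField.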