Textract API
Home of the AWS Textract processor family
Our AWS Textract processor family brings the functionality of Amazon's flagship OCR API to your NiFi flow. The processors can analyze .png, .jpg/.jpeg, and .pdf files stored in an Amazon S3 bucket.
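As a rough illustration of the supported file types, the sketch below checks an S3 object key against the extensions listed above before a document is sent for analysis. The helper name `is_supported_object_key` is hypothetical, and matching extensions case-insensitively is an assumption made for this example, not documented processor behavior.

```python
from pathlib import PurePosixPath

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


def is_supported_object_key(object_key: str) -> bool:
    """Return True if the S3 object key ends in a supported extension.

    Hypothetical helper for illustration only. The comparison is
    case-insensitive, which is an assumption of this sketch.
    """
    # S3 keys use "/" as a separator, so PurePosixPath parses them on any OS.
    suffix = PurePosixPath(object_key).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS
```

A check like this could sit upstream of a Textract processor, for example to route unsupported files elsewhere before they reach the API.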
Here is a list of the processors in the AWS Textract processor family:
Common Properties
Common properties are shared by all Textract processors. Every Textract processor includes these properties, plus any additional properties of its own.
Properties whose names are in bold and italics are required.
Textract Region
- A dropdown list of AWS regions. Set this to the region you have configured for Textract.
S3 Region
- A dropdown list of AWS regions. Set this to the region of the S3 bucket(s) you will be pulling documents from for analysis.
Bucket
- An input, supporting Expression Language, that holds the S3 bucket of the file to be analyzed.
Object Key
- An input, supporting Expression Language, that holds the name of the S3 object to be analyzed.
Destination
- A dropdown input that determines which part of the outgoing FlowFile will contain the output. The value can be set to one of the following:
  flowfile-body: the data is written to the FlowFile body. Additionally, the FlowFile's mime.type attribute is set to application/json.
  flowfile-attribute: the data is written to an attribute whose name depends on the processor. This name is listed on the processor's documentation page.
Communications Timeout
- How long the processor waits for an API response before routing the FlowFile to failure.
AWS Credentials Provider Service
- A reusable controller service that stores AWS credentials. If this is not set, you will need to enter the relevant credentials manually in their respective properties.
Access Key ID
- The access key ID of an AWS credential.
Secret Access Key
- The body of an AWS credential's secret access key.
Credentials File
- The path to a properties file (on your instance) containing an AWS access key and secret key.
SSL Context Service
- An optional reusable SSL context service, which will be used to create connections if provided.
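To make the Bucket, Object Key, and Destination properties more concrete, here is a minimal sketch of how they might fit together. It assumes the processors pass the bucket and key to Textract as an S3 document location (the `Document` / `S3Object` shape used by the Amazon Textract API) and models a FlowFile as a plain dictionary. The function names and the `textract.result` attribute name are hypothetical; the real attribute name depends on the processor and is listed on its documentation page.

```python
import json
from typing import Any, Dict


def build_document_location(bucket: str, object_key: str) -> Dict[str, Any]:
    """Build an S3 document location in the shape Amazon Textract accepts.

    That the processors build exactly this structure internally is an
    assumption of this sketch.
    """
    return {"S3Object": {"Bucket": bucket, "Name": object_key}}


def apply_destination(
    flowfile: Dict[str, Any],
    result: Dict[str, Any],
    destination: str,
    attribute_name: str = "textract.result",  # hypothetical attribute name
) -> Dict[str, Any]:
    """Return a copy of the FlowFile with the result placed per Destination."""
    attributes = dict(flowfile["attributes"])
    content = flowfile["content"]
    payload = json.dumps(result)

    if destination == "flowfile-body":
        # The result replaces the body and mime.type is set to JSON.
        content = payload.encode("utf-8")
        attributes["mime.type"] = "application/json"
    elif destination == "flowfile-attribute":
        # The body is left untouched; the result goes into an attribute.
        attributes[attribute_name] = payload
    else:
        raise ValueError(f"Unknown destination: {destination!r}")

    return {"attributes": attributes, "content": content}
```

With `flowfile-attribute`, the original FlowFile content is preserved in this model, which is why that option can be convenient when downstream processors still need the body.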