DetectDocumentText Processor

Part of the AWS Textract processor family

The DetectDocumentText processor extracts text from a document, which can be either an image or a PDF.

Properties

All of our Textract processors also include these common properties.

This processor has no properties of its own beyond the common ones.

Data Output

If the Destination property is set to flowfile-attribute, the output of this processor is written to the FlowFile's ocr.DetectedText attribute, which is created if it isn't already present.

Field Name    Data Type         Description
blocks        array of Block    The list of blocks returned from the API
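
As an illustration of how the blocks field might be consumed downstream, here is a minimal sketch that pulls the line-level text out of the ocr.DetectedText attribute value. It rests on assumptions not stated on this page: that the attribute holds a JSON object with a top-level blocks array, and that each Block keeps the BlockType and Text field names used by the AWS Textract API. The extract_lines helper and the sample data are hypothetical; if the processor serializes the fields with different casing, pass the actual key names through the type_key and text_key parameters.

```python
import json


def extract_lines(attribute_value, type_key="BlockType", text_key="Text"):
    """Return the text of every LINE block in the serialized output.

    Assumption: attribute_value is a JSON string shaped like
    {"blocks": [ {...Block...}, ... ]}. The key names are parameters
    because the exact serialization is not documented here.
    """
    payload = json.loads(attribute_value)
    lines = []
    for block in payload.get("blocks", []):
        # Textract reports PAGE, LINE, and WORD blocks; LINE blocks
        # carry the full text of one line, so WORD blocks are skipped.
        if block.get(type_key) == "LINE" and block.get(text_key):
            lines.append(block[text_key])
    return lines


# Hypothetical attribute value, made up for this example.
sample = json.dumps({
    "blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 2041"},
        {"BlockType": "WORD", "Text": "Invoice"},
        {"BlockType": "WORD", "Text": "2041"},
        {"BlockType": "LINE", "Text": "Total due: 18.50"},
    ]
})

print("\n".join(extract_lines(sample)))
```

Filtering on LINE blocks avoids repeating text, since each word in a line is also reported as its own WORD block.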
