AWS SDK for C++ Version 1.11.440
#include <TextractClient.h>
Amazon Textract detects and analyzes text in documents and converts it into machine-readable text. This is the API reference documentation for Amazon Textract.
Definition at line 23 of file TextractClient.h.
Initializes client to use DefaultCredentialProviderChain, with default http client factory, and optional client config. If client config is not specified, it will be initialized to default values.
Initializes client to use SimpleAWSCredentialsProvider, with default http client factory, and optional client config. If client config is not specified, it will be initialized to default values.
Initializes client to use specified credentials provider with specified client config. If http client factory is not supplied, the default http client factory will be used.
Analyzes an input document for relationships between detected items.
The types of information returned are as follows:
Form data (key-value pairs). The related information is returned in two Block objects, each of type KEY_VALUE_SET: a KEY Block object and a VALUE Block object. For example, Name: Ana Silva Carolina contains a key and value. Name: is the key. Ana Silva Carolina is the value.
Table and table cell data. A TABLE Block object contains information about a detected table. A CELL Block object is returned for each cell in a table.
Lines and words of text. A LINE Block object contains one or more WORD Block objects. All lines and words that are detected in the document are returned (including text that doesn't have a relationship with the value of FeatureTypes).
Signatures. A SIGNATURE Block object contains the location information of a signature in a document. If used in conjunction with forms or tables, a signature can be given a Key-Value pairing or be detected in the cell of a table.
Query. A QUERY Block object contains the query text, alias, and a link to the associated QUERY_RESULT Block object.
Query Result. A QUERY_RESULT Block object contains the answer to the query and an ID that connects it to the query asked. This Block also contains a confidence score.
Selection elements such as check boxes and option buttons (radio buttons) can be detected in form data and in tables. A SELECTION_ELEMENT Block object contains information about a selection element, including the selection status.
You can choose which type of analysis to perform by specifying the FeatureTypes list.
The output is returned in a list of Block objects.
AnalyzeDocument is a synchronous operation. To analyze documents asynchronously, use StartDocumentAnalysis.
For more information, see Document Text Analysis.
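The key-value description above can be sketched as a small pairing routine. This is a minimal illustration that does not use the SDK types: SketchBlock, its fields, childText, and extractKeyValues are hypothetical stand-ins, and the sketch assumes that a KEY block links to its VALUE block and that both link to WORD blocks holding the text.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a Block; NOT Aws::Textract::Model::Block.
struct SketchBlock {
    std::string id;
    std::string blockType;              // e.g. "KEY_VALUE_SET" or "WORD"
    std::string entityType;             // "KEY" or "VALUE" for KEY_VALUE_SET blocks
    std::string text;                   // only filled in for WORD blocks here
    std::vector<std::string> valueIds;  // assumed link from a KEY block to its VALUE block(s)
    std::vector<std::string> childIds;  // assumed link from a block to its WORD blocks
};

// Joins the text of a block's WORD children with single spaces.
std::string childText(const SketchBlock& block,
                      const std::map<std::string, SketchBlock>& byId) {
    std::string out;
    for (const auto& childId : block.childIds) {
        auto it = byId.find(childId);
        if (it == byId.end() || it->second.blockType != "WORD") continue;
        if (!out.empty()) out += ' ';
        out += it->second.text;
    }
    return out;
}

// Pairs the text of every KEY block with the text of the VALUE block(s) it links to.
std::vector<std::pair<std::string, std::string>>
extractKeyValues(const std::vector<SketchBlock>& blocks) {
    std::map<std::string, SketchBlock> byId;
    for (const auto& b : blocks) byId[b.id] = b;

    std::vector<std::pair<std::string, std::string>> pairs;
    for (const auto& b : blocks) {
        if (b.blockType != "KEY_VALUE_SET" || b.entityType != "KEY") continue;
        std::string value;
        for (const auto& valueId : b.valueIds) {
            auto it = byId.find(valueId);
            if (it == byId.end()) continue;  // skip dangling links
            const std::string part = childText(it->second, byId);
            if (!value.empty() && !part.empty()) value += ' ';
            value += part;
        }
        pairs.emplace_back(childText(b, byId), value);
    }
    return pairs;
}
```

With real responses, the same walk would be done over the Block objects returned by AnalyzeDocument; the field names used here are illustrative only.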
An Async wrapper for AnalyzeDocument that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 131 of file TextractClient.h.
A Callable wrapper for AnalyzeDocument that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 122 of file TextractClient.h.
AnalyzeExpense synchronously analyzes an input document for financially related relationships between text.
Information is returned as ExpenseDocuments and separated as follows:
LineItemGroups - A data set containing LineItems, which store information about the lines of text, such as an item purchased and its price on a receipt.
SummaryFields - Contains all other information on a receipt, such as header information or the vendor's name.
An Async wrapper for AnalyzeExpense that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 163 of file TextractClient.h.
A Callable wrapper for AnalyzeExpense that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 154 of file TextractClient.h.
Analyzes identity documents for relevant information. This information is extracted and returned as IdentityDocumentFields, which records both the normalized field and value of the extracted text. Unlike other Amazon Textract operations, AnalyzeID doesn't return any Geometry data.
An Async wrapper for AnalyzeID that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 192 of file TextractClient.h.
A Callable wrapper for AnalyzeID that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 183 of file TextractClient.h.
Creates an adapter, which can be fine-tuned for enhanced performance on user provided documents. Takes an AdapterName and FeatureType. Currently the only supported feature type is QUERIES. You can also provide a Description, Tags, and a ClientRequestToken. You can choose whether or not the adapter should be AutoUpdated with the AutoUpdate argument. By default, AutoUpdate is set to DISABLED.
An Async wrapper for CreateAdapter that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 222 of file TextractClient.h.
A Callable wrapper for CreateAdapter that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 213 of file TextractClient.h.
Creates a new version of an adapter. Operates on a provided AdapterId and a specified dataset provided via the DatasetConfig argument. Requires that you specify an Amazon S3 bucket with the OutputConfig argument. You can provide an optional KMSKeyId, an optional ClientRequestToken, and optional tags.
An Async wrapper for CreateAdapterVersion that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 251 of file TextractClient.h.
A Callable wrapper for CreateAdapterVersion that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 242 of file TextractClient.h.
Deletes an Amazon Textract adapter. Takes an AdapterId and deletes the adapter specified by the ID.
An Async wrapper for DeleteAdapter that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 277 of file TextractClient.h.
A Callable wrapper for DeleteAdapter that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 268 of file TextractClient.h.
Deletes an Amazon Textract adapter version. Requires that you specify both an AdapterId and an AdapterVersion. Deletes the adapter version specified by the AdapterId and the AdapterVersion.
An Async wrapper for DeleteAdapterVersion that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 304 of file TextractClient.h.
A Callable wrapper for DeleteAdapterVersion that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 295 of file TextractClient.h.
Detects text in the input document. Amazon Textract can detect lines of text and the words that make up a line of text. The input document must be in one of the following formats: JPEG, PNG, PDF, or TIFF. DetectDocumentText returns the detected text in an array of Block objects.
Each document page has an associated Block of type PAGE. Each PAGE Block object is the parent of LINE Block objects that represent the lines of detected text on a page. A LINE Block object is a parent for each word that makes up the line. Words are represented by Block objects of type WORD.
DetectDocumentText is a synchronous operation. To analyze documents asynchronously, use StartDocumentTextDetection.
For more information, see Document Text Detection.
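The PAGE, LINE, and WORD hierarchy described above can be walked as in the following sketch. It is a simplified model and not SDK code: TextBlock, its fields, and linesPerPage are hypothetical names, and the parent-to-child links are assumed to be available as lists of block IDs.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a Block; NOT Aws::Textract::Model::Block.
struct TextBlock {
    std::string id;
    std::string type;                   // "PAGE", "LINE", or "WORD"
    std::string text;                   // only used for WORD blocks in this sketch
    std::vector<std::string> childIds;  // assumed parent -> child links
};

// For every PAGE block (in input order), rebuilds the text of each child LINE
// by joining the text of that line's WORD children with single spaces.
std::vector<std::vector<std::string>>
linesPerPage(const std::vector<TextBlock>& blocks) {
    std::map<std::string, const TextBlock*> byId;
    for (const auto& b : blocks) byId[b.id] = &b;

    std::vector<std::vector<std::string>> pages;
    for (const auto& page : blocks) {
        if (page.type != "PAGE") continue;
        std::vector<std::string> lines;
        for (const auto& lineId : page.childIds) {
            auto lineIt = byId.find(lineId);
            if (lineIt == byId.end() || lineIt->second->type != "LINE") continue;
            std::string lineText;
            for (const auto& wordId : lineIt->second->childIds) {
                auto wordIt = byId.find(wordId);
                if (wordIt == byId.end() || wordIt->second->type != "WORD") continue;
                if (!lineText.empty()) lineText += ' ';
                lineText += wordIt->second->text;
            }
            lines.push_back(lineText);
        }
        pages.push_back(lines);
    }
    return pages;
}
```

The result holds one entry per page, each containing that page's lines in the order the PAGE block lists them.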
An Async wrapper for DetectDocumentText that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 342 of file TextractClient.h.
A Callable wrapper for DetectDocumentText that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 333 of file TextractClient.h.
Gets configuration information for an adapter specified by an AdapterId, returning information on AdapterName, Description, CreationTime, AutoUpdate status, and FeatureTypes.
An Async wrapper for GetAdapter that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 369 of file TextractClient.h.
A Callable wrapper for GetAdapter that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 360 of file TextractClient.h.
Gets configuration information for the specified adapter version, including: AdapterId, AdapterVersion, FeatureTypes, Status, StatusMessage, DatasetConfig, KMSKeyId, OutputConfig, Tags and EvaluationMetrics.
An Async wrapper for GetAdapterVersion that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 397 of file TextractClient.h.
A Callable wrapper for GetAdapterVersion that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 388 of file TextractClient.h.
Gets the results for an Amazon Textract asynchronous operation that analyzes text in a document.
You start asynchronous text analysis by calling StartDocumentAnalysis, which returns a job identifier (JobId). When the text analysis operation finishes, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that's registered in the initial call to StartDocumentAnalysis. To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetDocumentAnalysis, and pass the job identifier (JobId) from the initial call to StartDocumentAnalysis.
GetDocumentAnalysis returns an array of Block objects. The following types of information are returned:
Form data (key-value pairs). The related information is returned in two Block objects, each of type KEY_VALUE_SET: a KEY Block object and a VALUE Block object. For example, Name: Ana Silva Carolina contains a key and value. Name: is the key. Ana Silva Carolina is the value.
Table and table cell data. A TABLE Block object contains information about a detected table. A CELL Block object is returned for each cell in a table.
Lines and words of text. A LINE Block object contains one or more WORD Block objects. All lines and words that are detected in the document are returned (including text that doesn't have a relationship with the value of the StartDocumentAnalysis FeatureTypes input parameter).
Query. A QUERY Block object contains the query text, alias, and a link to the associated QUERY_RESULT Block object.
Query Results. A QUERY_RESULT Block object contains the answer to the query and an ID that connects it to the query asked. This Block also contains a confidence score.
While processing a document with queries, look out for INVALID_REQUEST_PARAMETERS output. This indicates that either the per-page query limit has been exceeded or that the operation is trying to query a page in the document that doesn’t exist.
Selection elements such as check boxes and option buttons (radio buttons) can be detected in form data and in tables. A SELECTION_ELEMENT Block object contains information about a selection element, including the selection status.
Use the MaxResults parameter to limit the number of blocks that are returned. If there are more results than specified in MaxResults, the value of NextToken in the operation response contains a pagination token for getting the next set of results. To get the next page of results, call GetDocumentAnalysis, and populate the NextToken request parameter with the token value that's returned from the previous call to GetDocumentAnalysis.
For more information, see Document Text Analysis.
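The NextToken pagination described above follows a simple loop, sketched here without the SDK. ResultPage, FetchPage, and collectAllBlocks are hypothetical names; the fetch callback stands in for one call to GetDocumentAnalysis with the given token, and the maxPages guard is an added safety assumption rather than part of the service contract.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified view of one page of results.
struct ResultPage {
    std::vector<std::string> blocks;  // stand-in for the returned Block objects
    std::string nextToken;            // empty when there are no more results
};

// Stand-in for a single Get* call: takes the NextToken to send
// (empty on the first call) and returns one page of results.
using FetchPage = std::function<ResultPage(const std::string& nextToken)>;

// Calls fetch repeatedly, feeding each returned NextToken into the next
// request, until a page comes back without a token or maxPages is reached.
std::vector<std::string> collectAllBlocks(const FetchPage& fetch,
                                          std::size_t maxPages = 1000) {
    assert(fetch && "fetch must be callable");
    std::vector<std::string> all;
    std::string token;  // no token yet on the first request
    for (std::size_t i = 0; i < maxPages; ++i) {
        ResultPage page = fetch(token);
        all.insert(all.end(), page.blocks.begin(), page.blocks.end());
        if (page.nextToken.empty()) break;  // last page reached
        token = page.nextToken;
    }
    return all;
}
```

The same loop shape applies to the other paginated Get operations on this page, since they describe MaxResults and NextToken in the same terms.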
An Async wrapper for GetDocumentAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 466 of file TextractClient.h.
A Callable wrapper for GetDocumentAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 457 of file TextractClient.h.
Gets the results for an Amazon Textract asynchronous operation that detects text in a document. Amazon Textract can detect lines of text and the words that make up a line of text.
You start asynchronous text detection by calling StartDocumentTextDetection, which returns a job identifier (JobId). When the text detection operation finishes, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that's registered in the initial call to StartDocumentTextDetection. To get the results of the text-detection operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetDocumentTextDetection, and pass the job identifier (JobId) from the initial call to StartDocumentTextDetection.
GetDocumentTextDetection returns an array of Block objects.
Each document page has an associated Block of type PAGE. Each PAGE Block object is the parent of LINE Block objects that represent the lines of detected text on a page. A LINE Block object is a parent for each word that makes up the line. Words are represented by Block objects of type WORD.
Use the MaxResults parameter to limit the number of blocks that are returned. If there are more results than specified in MaxResults, the value of NextToken in the operation response contains a pagination token for getting the next set of results. To get the next page of results, call GetDocumentTextDetection, and populate the NextToken request parameter with the token value that's returned from the previous call to GetDocumentTextDetection.
For more information, see Document Text Detection.
An Async wrapper for GetDocumentTextDetection that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 518 of file TextractClient.h.
A Callable wrapper for GetDocumentTextDetection that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 509 of file TextractClient.h.
Gets the results for an Amazon Textract asynchronous operation that analyzes invoices and receipts. Amazon Textract finds contact information, items purchased, and vendor name, from input invoices and receipts.
You start asynchronous invoice/receipt analysis by calling StartExpenseAnalysis, which returns a job identifier (JobId). Upon completion of the invoice/receipt analysis, Amazon Textract publishes the completion status to the Amazon Simple Notification Service (Amazon SNS) topic. This topic must be registered in the initial call to StartExpenseAnalysis. To get the results of the invoice/receipt analysis operation, first ensure that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetExpenseAnalysis, and pass the job identifier (JobId) from the initial call to StartExpenseAnalysis.
Use the MaxResults parameter to limit the number of blocks that are returned. If there are more results than specified in MaxResults, the value of NextToken in the operation response contains a pagination token for getting the next set of results. To get the next page of results, call GetExpenseAnalysis, and populate the NextToken request parameter with the token value that's returned from the previous call to GetExpenseAnalysis.
For more information, see Analyzing Invoices and Receipts.
An Async wrapper for GetExpenseAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 564 of file TextractClient.h.
A Callable wrapper for GetExpenseAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 555 of file TextractClient.h.
Gets the results for an Amazon Textract asynchronous operation that analyzes text in a lending document.
You start asynchronous text analysis by calling StartLendingAnalysis, which returns a job identifier (JobId). When the text analysis operation finishes, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that's registered in the initial call to StartLendingAnalysis.
To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetLendingAnalysis, and pass the job identifier (JobId) from the initial call to StartLendingAnalysis.
An Async wrapper for GetLendingAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 599 of file TextractClient.h.
A Callable wrapper for GetLendingAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 590 of file TextractClient.h.
Gets summarized results for the StartLendingAnalysis operation, which analyzes text in a lending document. The returned summary consists of information about documents grouped together by a common document type. Information like detected signatures, page numbers, and split documents is returned with respect to the type of grouped document.
You start asynchronous text analysis by calling StartLendingAnalysis, which returns a job identifier (JobId). When the text analysis operation finishes, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that's registered in the initial call to StartLendingAnalysis.
To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetLendingAnalysisSummary, and pass the job identifier (JobId) from the initial call to StartLendingAnalysis.
An Async wrapper for GetLendingAnalysisSummary that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 637 of file TextractClient.h.
A Callable wrapper for GetLendingAnalysisSummary that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 628 of file TextractClient.h.
An Async wrapper for ListAdapters that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 689 of file TextractClient.h.
A Callable wrapper for ListAdapters that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 680 of file TextractClient.h.
Lists all versions of an adapter that meet the specified filtration criteria.
An Async wrapper for ListAdapterVersions that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 663 of file TextractClient.h.
A Callable wrapper for ListAdapterVersions that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 654 of file TextractClient.h.
An Async wrapper for ListTagsForResource that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 714 of file TextractClient.h.
A Callable wrapper for ListTagsForResource that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 705 of file TextractClient.h.
Starts the asynchronous analysis of an input document for relationships between detected items such as key-value pairs, tables, and selection elements.
StartDocumentAnalysis can analyze text in documents that are in JPEG, PNG, TIFF, and PDF format. The documents are stored in an Amazon S3 bucket. Use DocumentLocation to specify the bucket name and file name of the document.
StartDocumentAnalysis returns a job identifier (JobId) that you use to get the results of the operation. When text analysis is finished, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that you specify in NotificationChannel. To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetDocumentAnalysis, and pass the job identifier (JobId) from the initial call to StartDocumentAnalysis.
For more information, see Document Text Analysis.
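The start, notify, and fetch sequence described above reduces to one decision when a completion message arrives. The sketch below models only that decision and does not call the SDK or Amazon SNS: CompletionNotice, NextStep, and decideNextStep are hypothetical names, and the message is assumed to carry the job identifier and the status as plain strings.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of the completion message published to the
// Amazon SNS topic; the real message format is not modeled here.
struct CompletionNotice {
    std::string jobId;
    std::string status;
};

enum class NextStep {
    FetchResults,   // call the matching Get* operation with the JobId
    HandleFailure,  // the job finished with a status other than SUCCEEDED
    Ignore          // the message belongs to a different job
};

// Decides what to do with a completion message for the job that was
// started earlier and identified by startedJobId.
NextStep decideNextStep(const CompletionNotice& notice,
                        const std::string& startedJobId) {
    assert(!startedJobId.empty());
    if (notice.jobId != startedJobId) return NextStep::Ignore;
    if (notice.status == "SUCCEEDED") return NextStep::FetchResults;
    return NextStep::HandleFailure;
}
```

Only the FetchResults branch should lead to GetDocumentAnalysis; treating every other status as a failure is a conservative choice made for this sketch.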
An Async wrapper for StartDocumentAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 755 of file TextractClient.h.
A Callable wrapper for StartDocumentAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 746 of file TextractClient.h.
Starts the asynchronous detection of text in a document. Amazon Textract can detect lines of text and the words that make up a line of text.
StartDocumentTextDetection can analyze text in documents that are in JPEG, PNG, TIFF, and PDF format. The documents are stored in an Amazon S3 bucket. Use DocumentLocation to specify the bucket name and file name of the document.
StartDocumentTextDetection returns a job identifier (JobId) that you use to get the results of the operation. When text detection is finished, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that you specify in NotificationChannel. To get the results of the text detection operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetDocumentTextDetection, and pass the job identifier (JobId) from the initial call to StartDocumentTextDetection.
For more information, see Document Text Detection.
An Async wrapper for StartDocumentTextDetection that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 795 of file TextractClient.h.
A Callable wrapper for StartDocumentTextDetection that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 786 of file TextractClient.h.
Starts the asynchronous analysis of invoices or receipts for data like contact information, items purchased, and vendor names.
StartExpenseAnalysis can analyze text in documents that are in JPEG, PNG, and PDF format. The documents must be stored in an Amazon S3 bucket. Use the DocumentLocation parameter to specify the name of your S3 bucket and the name of the document in that bucket.
StartExpenseAnalysis returns a job identifier (JobId) that you will provide to GetExpenseAnalysis to retrieve the results of the operation. When the analysis of the input invoices/receipts is finished, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that you provide to the NotificationChannel. To obtain the results of the invoice and receipt analysis operation, ensure that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetExpenseAnalysis, and pass the job identifier (JobId) that was returned by your call to StartExpenseAnalysis.
For more information, see Analyzing Invoices and Receipts.
An Async wrapper for StartExpenseAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 837 of file TextractClient.h.
A Callable wrapper for StartExpenseAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 828 of file TextractClient.h.
Starts the classification and analysis of an input document. StartLendingAnalysis initiates the classification and analysis of a packet of lending documents. StartLendingAnalysis operates on a document file located in an Amazon S3 bucket.
StartLendingAnalysis can analyze text in documents that are in one of the following formats: JPEG, PNG, TIFF, PDF. Use DocumentLocation to specify the bucket name and the file name of the document.
StartLendingAnalysis returns a job identifier (JobId) that you use to get the results of the operation. When the text analysis is finished, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that you specify in NotificationChannel. To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If the status is SUCCEEDED you can call either GetLendingAnalysis or GetLendingAnalysisSummary and provide the JobId to obtain the results of the analysis.
If using OutputConfig to specify an Amazon S3 bucket, the output will be contained within the specified prefix in a directory labeled with the job-id. In the directory there are 3 sub-directories:
detailedResponse (contains the GetLendingAnalysis response)
summaryResponse (for the GetLendingAnalysisSummary response)
splitDocuments (documents split across logical boundaries)
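The output layout listed above can be expressed as a small helper that derives the three sub-directory prefixes. This is only a sketch: LendingOutputLayout and lendingOutputLayout are hypothetical names, and the exact key format (forward slashes, trailing slash on each prefix) is an assumption that should be checked against the objects the service actually writes.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the three sub-directory prefixes named in the text.
struct LendingOutputLayout {
    std::string detailedResponse;  // GetLendingAnalysis response
    std::string summaryResponse;   // GetLendingAnalysisSummary response
    std::string splitDocuments;    // documents split across logical boundaries
};

// Builds "<prefix>/<jobId>/<sub-directory>/" for each sub-directory.
// The separator and trailing slash are assumptions made for this sketch.
LendingOutputLayout lendingOutputLayout(const std::string& prefix,
                                        const std::string& jobId) {
    assert(!jobId.empty());
    std::string base = prefix;
    if (!base.empty() && base.back() != '/') base += '/';
    base += jobId + "/";
    return {base + "detailedResponse/",
            base + "summaryResponse/",
            base + "splitDocuments/"};
}
```

A caller could pass these prefixes to an S3 listing call to locate the stored results for one job.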
An Async wrapper for StartLendingAnalysis that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 884 of file TextractClient.h.
A Callable wrapper for StartLendingAnalysis that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 875 of file TextractClient.h.
An Async wrapper for TagResource that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 910 of file TextractClient.h.
A Callable wrapper for TagResource that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 901 of file TextractClient.h.
An Async wrapper for UntagResource that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 936 of file TextractClient.h.
A Callable wrapper for UntagResource that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 927 of file TextractClient.h.
Update the configuration for an adapter. FeatureTypes configurations cannot be updated. At least one new parameter must be specified as an argument.
An Async wrapper for UpdateAdapter that queues the request into a thread executor and triggers associated callback when operation has finished.
Definition at line 963 of file TextractClient.h.
A Callable wrapper for UpdateAdapter that returns a future to the operation so that it can be executed in parallel to other requests.
Definition at line 954 of file TextractClient.h.