DataSource - upbound/provider-aws-kendra@v1.13.0

A block with the configuration information to connect to your Data Source repository. You can't specify the configuration block when the type parameter is set to CUSTOM. Detailed below.

s3Configuration

array

A block that provides the configuration information to connect to an Amazon S3 bucket as your data source. Detailed below.

accessControlListConfiguration

array

A block that provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources. Detailed below.

keyPath

string

bucketName

string

bucketNameRef

object

Reference to a Bucket in s3 to populate bucketName.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

bucketNameSelector

object

Selector for a Bucket in s3 to populate bucketName.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

resolution

string

resolve

string
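As a minimal sketch of the two indirect ways to populate bucketName, the fragment below references a Bucket by name and shows a label selector as a commented-out alternative. The bucket name and the label are hypothetical. The policy values shown (Required, Always) are assumed from general Crossplane reference-policy conventions and should be checked against your provider version.

```yaml
# Fragment of a DataSource spec; only the S3 bucket wiring is shown.
s3Configuration:
  - bucketNameRef:
      name: docs-bucket          # hypothetical name of a Bucket managed resource
      policy:
        resolution: Required     # assumed value: fail if the reference cannot be resolved
        resolve: Always          # assumed value: re-resolve on every reconcile
    # Alternative: select the Bucket by label instead of by name.
    # bucketNameSelector:
    #   matchLabels:
    #     purpose: kendra-docs   # hypothetical label
```

Setting bucketName directly is the third option; in that case neither the reference nor the selector is needed.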

documentsMetadataConfiguration

array

A block that defines the Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document. Detailed below.

s3Prefix

string

exclusionPatterns

array

A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed. Refer to Exclusion Patterns for more examples.

inclusionPatterns

array

A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed. Refer to Inclusion Patterns for more examples.

inclusionPrefixes

array

A list of S3 prefixes for the documents that should be included in the index.
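The S3 fields above might be combined as in the following sketch. The bucket name, prefixes, and glob patterns are hypothetical values chosen for illustration, not defaults of the provider.

```yaml
# Fragment of a DataSource spec; values are illustrative only.
configuration:
  - s3Configuration:
      - bucketName: docs-bucket        # hypothetical bucket
        inclusionPrefixes:
          - manuals/                   # only objects under this prefix are considered
        inclusionPatterns:
          - "*.pdf"                    # quoted so YAML does not read * as an alias
        exclusionPatterns:
          - "*draft*"                  # exclusion wins when a document matches both lists
        documentsMetadataConfiguration:
          - s3Prefix: metadata/        # hypothetical prefix holding per-document metadata files
```

With these values, a PDF under manuals/ whose key contains "draft" would not be indexed, because it matches an exclusion pattern as well as an inclusion pattern.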

webCrawlerConfiguration

array

A block that provides the configuration information required for Amazon Kendra Web Crawler. Detailed below.

authenticationConfiguration

array

A block with the configuration information required to connect to websites using authentication. You can connect to websites using basic authentication of user name and password. You use a secret in AWS Secrets Manager to store your authentication credentials. You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS. Detailed below.

basicAuthentication

array

The list of configuration information that's required to connect to and crawl a website host using basic authentication credentials. The list includes the name and port number of the website host. Detailed below.

credentials

string

credentialsRef

object

Reference to a Secret in secretsmanager to populate credentials.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

credentialsSelector

object

Selector for a Secret in secretsmanager to populate credentials.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

number

number

maxContentSizePerPageInMegaBytes

number

maxLinksPerPage

number

maxUrlsPerMinuteCrawlRate

number

proxyConfiguration

array

Configuration information required to connect to your internal websites via a web proxy. You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS. Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication. To store web proxy credentials, you use a secret in AWS Secrets Manager. Detailed below.

credentials

string

credentialsRef

object

Reference to a Secret in secretsmanager to populate credentials.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

credentialsSelector

object

Selector for a Secret in secretsmanager to populate credentials.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

number

array

A list of regular expression patterns to exclude certain URLs to crawl. URLs that match the patterns are excluded from the index. URLs that don't match the patterns are included in the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL file isn't included in the index. Array Members: Minimum number of 0 items. Maximum number of 100 items. Length Constraints: Minimum length of 1. Maximum length of 150.

urlInclusionPatterns

array

A list of regular expression patterns to include certain URLs to crawl. URLs that match the patterns are included in the index. URLs that don't match the patterns are excluded from the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL file isn't included in the index. Array Members: Minimum number of 0 items. Maximum number of 100 items. Length Constraints: Minimum length of 1. Maximum length of 150.

urls

array

A block that specifies the seed or starting point URLs of the websites or the sitemap URLs of the websites you want to crawl. You can include website subdomains. You can list up to 100 seed URLs and up to 3 sitemap URLs. You can only crawl websites that use the secure communication protocol, Hypertext Transfer Protocol Secure (HTTPS). If you receive an error when crawling a website, it could be that the website is blocked from crawling. When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use Amazon Kendra Web Crawler to index your own webpages, or webpages that you have authorization to index. Detailed below.

seedUrlConfiguration

array

A block that specifies the configuration of the seed or starting point URLs of the websites you want to crawl. You can choose to crawl only the website host names, or the website host names with subdomains, or the website host names with subdomains and other domains that the webpages link to. You can list up to 100 seed URLs. Detailed below.

seedUrls

array

The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs. Array Members: Minimum number of 0 items. Maximum number of 100 items. Length Constraints: Minimum length of 1. Maximum length of 2048.

webCrawlerMode

string

siteMapsConfiguration

array

A block that specifies the configuration of the sitemap URLs of the websites you want to crawl. Only URLs belonging to the same website host names are crawled. You can list up to 3 sitemap URLs. Detailed below.

siteMaps

array

The list of sitemap URLs of the websites you want to crawl. The list can include a maximum of 3 sitemap URLs.
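A web crawler configuration built from the fields described above could look like the sketch below. The seed URL reuses the example host from the descriptions; the numeric limits and the regular expression are illustrative. The webCrawlerMode value HOST_ONLY is an assumption based on the Amazon Kendra API, since this page does not list the allowed values.

```yaml
# Fragment of a DataSource spec; values are illustrative only.
configuration:
  - webCrawlerConfiguration:
      - urls:
          - seedUrlConfiguration:
              - seedUrls:
                  - https://a.example.com/   # HTTPS only, up to 100 seed URLs
                webCrawlerMode: HOST_ONLY     # assumed enum value, not listed on this page
        maxLinksPerPage: 100
        maxContentSizePerPageInMegaBytes: 50
        maxUrlsPerMinuteCrawlRate: 300
        urlInclusionPatterns:
          - ".*/docs/.*"                      # regular expression, not a glob
```

To crawl from sitemaps instead of seed URLs, the seedUrlConfiguration entry would be replaced by a siteMapsConfiguration entry with up to 3 siteMaps.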

customDocumentEnrichmentConfiguration

array

A block with the configuration information for altering document metadata and content during the document ingestion process. For more information on how to create, modify and delete document metadata, or make other content alterations when you ingest documents into Amazon Kendra, see Customizing document metadata during the ingestion process. Detailed below.

inlineConfigurations

array

Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Kendra. Minimum number of 0 items. Maximum number of 100 items. Detailed below.

condition

array

Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra. See condition.

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

operator

string

documentContentDeletion

boolean

target

array

Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value. Detailed below.

targetDocumentAttributeKey

string

targetDocumentAttributeValue

array

The target value you want to create for the target attribute. For example, 'Finance' could be the target value for the target attribute key 'Department'. See target_document_attribute_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

targetDocumentAttributeValueDeletion

boolean
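The examples in the descriptions above ('financial' in _source_uri, 'Finance' for 'Department') can be written as one inline configuration, sketched below. The operator name Contains is an assumption taken from the Amazon Kendra API, and Department is assumed to exist as a custom attribute in the index.

```yaml
# Fragment of a DataSource spec; one inline enrichment rule.
customDocumentEnrichmentConfiguration:
  - inlineConfigurations:
      - condition:
          - conditionDocumentAttributeKey: _source_uri
            operator: Contains               # assumed operator name
            conditionOnValue:
              - stringValue: financial
        target:
          - targetDocumentAttributeKey: Department   # assumed custom attribute
            targetDocumentAttributeValueDeletion: false
            targetDocumentAttributeValue:
              - stringValue: Finance
        documentContentDeletion: false
```

Read as a rule: when a document's source URI contains "financial", set its Department attribute to "Finance" and keep the document content.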

postExtractionHookConfiguration

array

A block that specifies the configuration information for invoking a Lambda function in AWS Lambda on the structured documents with their metadata and text extracted. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation. Detailed below.

invocationCondition

array

A block that specifies the condition used for when a Lambda function should be invoked. For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time. See invocation_condition.

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

string

string

string

string

preExtractionHookConfiguration

array

Configuration information for invoking a Lambda function in AWS Lambda on the original or raw documents before extracting their metadata and text. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation. Detailed below.

invocationCondition

array

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

string

string

string

string

string

string

string

object

Reference to an Index in kendra to populate indexId.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

indexIdSelector

object

Selector for an Index in kendra to populate indexId.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

string

required string

string

object

Reference to a Role in iam to populate roleArn.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

roleArnSelector

object

Selector for a Role in iam to populate roleArn.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

object

string

object

THIS IS A BETA FIELD. It will be honored unless the Management Policies feature flag is disabled. InitProvider holds the same fields as ForProvider, with the exception of Identifier and other resource reference fields. The fields that are in InitProvider are merged into ForProvider when the resource is created. The same fields are also added to the terraform ignore_changes hook, to avoid updating them after creation. This is useful for fields that are required on creation but should not be updated afterwards, for example because an external controller, such as an autoscaler, is managing them.

configuration

array

A block with the configuration information to connect to your Data Source repository. You can't specify the configuration block when the type parameter is set to CUSTOM. Detailed below.

s3Configuration

array

A block that provides the configuration information to connect to an Amazon S3 bucket as your data source. Detailed below.

accessControlListConfiguration

array

A block that provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources. Detailed below.

keyPath

string

bucketName

string

bucketNameRef

object

Reference to a Bucket in s3 to populate bucketName.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

bucketNameSelector

object

Selector for a Bucket in s3 to populate bucketName.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

resolution

string

resolve

string

documentsMetadataConfiguration

array

string

array

array

array

A list of S3 prefixes for the documents that should be included in the index.

webCrawlerConfiguration

array

A block that provides the configuration information required for Amazon Kendra Web Crawler. Detailed below.

authenticationConfiguration

array

basicAuthentication

array

credentials

string

credentialsRef

object

Reference to a Secret in secretsmanager to populate credentials.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

credentialsSelector

object

Selector for a Secret in secretsmanager to populate credentials.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

number

number

maxContentSizePerPageInMegaBytes

number

maxLinksPerPage

number

maxUrlsPerMinuteCrawlRate

number

proxyConfiguration

array

credentials

string

credentialsRef

object

Reference to a Secret in secretsmanager to populate credentials.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

credentialsSelector

object

Selector for a Secret in secretsmanager to populate credentials.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

number

array

array

array

array

array

string

siteMapsConfiguration

array

siteMaps

array

The list of sitemap URLs of the websites you want to crawl. The list can include a maximum of 3 sitemap URLs.

customDocumentEnrichmentConfiguration

array

inlineConfigurations

array

condition

array

Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra. See condition.

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

operator

string

documentContentDeletion

boolean

target

array

Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value. Detailed below.

targetDocumentAttributeKey

string

targetDocumentAttributeValue

array

The target value you want to create for the target attribute. For example, 'Finance' could be the target value for the target attribute key 'Department'. See target_document_attribute_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

targetDocumentAttributeValueDeletion

boolean

postExtractionHookConfiguration

array

invocationCondition

array

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

string

string

string

string

preExtractionHookConfiguration

array

invocationCondition

array

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

string

string

string

string

string

string

string

object

Reference to an Index in kendra to populate indexId.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

indexIdSelector

object

Selector for an Index in kendra to populate indexId.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

string

string

object

Reference to a Role in iam to populate roleArn.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

roleArnSelector

object

Selector for a Role in iam to populate roleArn.

matchControllerRef

boolean

matchLabels

object

policy

object

Policies for selection.

string

string

string

object

string

array

THIS IS A BETA FIELD. It is on by default but can be opted out through a Crossplane feature flag. ManagementPolicies specify the array of actions Crossplane is allowed to take on the managed and external resources. This field is planned to replace the DeletionPolicy field in a future release. Currently, both could be set independently and non-default values would be honored if the feature flag is enabled. If both are custom, the DeletionPolicy field will be ignored. See the design doc for more information: https://github.com/crossplane/crossplane/blob/499895a25d1a1a0ba1604944ef98ac7a1a71f197/design/design-doc-observe-only-resources.md?plain=1#L223 and this one: https://github.com/crossplane/crossplane/blob/444267e84783136daa93568b364a5f01228cacbe/design/one-pager-ignore-changes.md
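As a hedged illustration of the description above, the fragment below restricts Crossplane to observing an existing data source. The field name managementPolicies is inferred from the description (the name itself is missing from this page), and Observe is assumed to be one of the accepted actions.

```yaml
# Fragment of a DataSource manifest; observe-only management.
spec:
  managementPolicies:      # field name inferred from the description above
    - Observe              # assumed action: read the external resource, never change it
```

With only Observe listed, Crossplane would not be expected to create, update, or delete the external resource.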

providerConfigRef

object

ProviderConfigReference specifies how the provider that will be used to create, observe, update, and delete this managed resource should be configured.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string
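A minimal sketch of providerConfigRef follows. The ProviderConfig name is hypothetical and must match a ProviderConfig that exists in your cluster.

```yaml
# Fragment of a DataSource manifest; selects which provider configuration to use.
spec:
  providerConfigRef:
    name: default          # hypothetical ProviderConfig name
```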

publishConnectionDetailsTo

object

PublishConnectionDetailsTo specifies the connection secret config which contains a name, metadata and a reference to secret store config to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource.

configRef

object

SecretStoreConfigRef specifies which secret store config should be used for this ConnectionSecret.

name

required string

policy

object

Policies for referencing.

resolution

string

resolve

string

metadata

object

Metadata is the metadata for connection secret.

object

object

string

required string

writeConnectionSecretToRef

object

WriteConnectionSecretToReference specifies the namespace and name of a Secret to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource. This field is planned to be replaced in a future release in favor of PublishConnectionDetailsTo. Currently, both could be set independently and connection details would be published to both without affecting each other.

name

required string

namespace

required string
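Both required fields of writeConnectionSecretToRef appear in the sketch below. The Secret name and namespace are hypothetical.

```yaml
# Fragment of a DataSource manifest; where connection details are written.
spec:
  writeConnectionSecretToRef:
    name: kendra-datasource-conn     # hypothetical Secret name
    namespace: crossplane-system     # hypothetical namespace
```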

status

object

DataSourceStatus defines the observed state of DataSource.

atProvider

object

No description provided.

arn

string

configuration

array

A block with the configuration information to connect to your Data Source repository. You can't specify the configuration block when the type parameter is set to CUSTOM. Detailed below.

s3Configuration

array

A block that provides the configuration information to connect to an Amazon S3 bucket as your data source. Detailed below.

accessControlListConfiguration

array

A block that provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources. Detailed below.

keyPath

string

bucketName

string

documentsMetadataConfiguration

array

string

array

array

array

A list of S3 prefixes for the documents that should be included in the index.

webCrawlerConfiguration

array

A block that provides the configuration information required for Amazon Kendra Web Crawler. Detailed below.

authenticationConfiguration

array

array

string

string

number

number

maxContentSizePerPageInMegaBytes

number

maxLinksPerPage

number

maxUrlsPerMinuteCrawlRate

number

array

string

string

number

array

array

array

array

array

string

siteMapsConfiguration

array

siteMaps

array

The list of sitemap URLs of the websites you want to crawl. The list can include a maximum of 3 sitemap URLs.

createdAt

string

customDocumentEnrichmentConfiguration

array

inlineConfigurations

array

condition

array

Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra. See condition.

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

operator

string

documentContentDeletion

boolean

target

array

Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value. Detailed below.

targetDocumentAttributeKey

string

targetDocumentAttributeValue

array

The target value you want to create for the target attribute. For example, 'Finance' could be the target value for the target attribute key 'Department'. See target_document_attribute_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

targetDocumentAttributeValueDeletion

boolean

postExtractionHookConfiguration

array

invocationCondition

array

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

string

string

string

string

preExtractionHookConfiguration

array

invocationCondition

array

conditionDocumentAttributeKey

string

conditionOnValue

array

The value used by the operator. For example, you can specify the value 'financial' for strings in the _source_uri field that partially match or contain this value. See condition_on_value.

dateValue

string

longValue

number

stringListValue

array

A list of strings.

stringValue

string

operator

string

lambdaArn