upbound/provider-gcp@v0.41.2
WorkflowTemplate
dataproc.gcp.upbound.io

WorkflowTemplate is the Schema for the WorkflowTemplates API. A Workflow Template is a reusable workflow configuration.

Type: CRD
Group: dataproc.gcp.upbound.io
Version: v1beta1

apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: WorkflowTemplate
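As a sketch of how the fields documented below fit together, here is a minimal example manifest. The metadata name, location, query text, and cluster settings are illustrative assumptions, and the nesting follows the array-typed blocks in the schema below:

```yaml
apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: WorkflowTemplate
metadata:
  name: example-template            # illustrative name
spec:
  forProvider:
    location: us-central1           # required
    jobs:                           # required: the DAG of jobs to submit
      - stepId: first-step
        hiveJob:
          - queryList:
              - queries:
                  - "SHOW DATABASES;"
    placement:                      # required: scheduling information
      - managedCluster:
          - clusterName: example-cluster
            config:
              - gceClusterConfig:
                  - zone: us-central1-a
  providerConfigRef:
    name: default
```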

API Documentation
apiVersion
string
kind
string
metadata
object
spec
object

WorkflowTemplateSpec defines the desired state of WorkflowTemplate

forProvider
object (required)

No description provided.

jobs
array

Required. The Directed Acyclic Graph of Jobs to submit.

array

Job is a Hadoop job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
hiveJob
array

Job is a Hive job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

labels
object
pigJob
array

Job is a Pig job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

The optional list of prerequisite job step_ids. If not specified, the job will start at the beginning of workflow.

array

Job is a Presto job.

array

Presto client tags to attach to this query

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

Job is a PySpark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

array

Job scheduling configuration.

array

Job is a Spark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
array

Job is a SparkR job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

The runtime log config for job execution.

array

Job is a SparkSql job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

stepId
string
labels
object
location
string (required)
array

Template parameters whose values are substituted into the template. Values for parameters must be provided when the template is instantiated.

fields
array

Required. Paths to all fields that the parameter replaces. A field is allowed to appear in at most one parameter's list of field paths. A field path is similar in syntax to a google.protobuf.FieldMask; for example, a field path that references a job's arguments would be specified as jobs['step-id'].sparkJob.args.

name
string
array

Validation rules to be applied to this parameter's value.

regex
array

Validation based on regular expressions.

regexes
array

Required. RE2 regular expressions used to validate the parameter's value. The value must match the regex in its entirety (substring matches are not sufficient).

values
array

Required. List of allowed values for the parameter.

values
array

Required. List of allowed values for the parameter.
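Putting the parameter fields above together, here is a hedged sketch of a single template parameter. The parameter name, field path, and regex are illustrative assumptions:

```yaml
parameters:
  - name: ZONE
    fields:
      - placement.managedCluster.config.gceClusterConfig.zone   # illustrative field path
    validation:
      - regex:
          - regexes:
              - "us-central1-[a-f]"   # value must match the regex in its entirety
```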

array

Required. WorkflowTemplate scheduling information.

array

A selector that chooses target cluster for jobs based on metadata. The selector is evaluated at the time each job is submitted.

zone
string
array

A cluster that is managed by the workflow.

config
array

Required. The cluster configuration.

array

Autoscaling config for the policy associated with the cluster. Cluster does not autoscale if this field is unset.

policy
string
array

Encryption settings for the cluster.

array

Port/endpoint configuration for this cluster

array

The shared Compute Engine config settings for all instances in a cluster.

metadata
object
network
string
array

Node Group Affinity for sole-tenant clusters.

nodeGroup
string
array

Reservation Affinity for consuming Zonal reservation.

key
string
values
array

Required. List of allowed values for the parameter.

array

The URIs of service account scopes to be included in Compute Engine instances. The following base set of scopes is always included: * https://www.googleapis.com/auth/cloud.useraccounts.readonly * https://www.googleapis.com/auth/devstorage.read_write * https://www.googleapis.com/auth/logging.write If no scopes are specified, the following defaults are also provided: * https://www.googleapis.com/auth/bigquery * https://www.googleapis.com/auth/bigtable.admin.table * https://www.googleapis.com/auth/bigtable.data * https://www.googleapis.com/auth/devstorage.full_control

array

Shielded Instance Config for clusters using Compute Engine Shielded VMs. Structure defined below.

tags
array

The Compute Engine tags to add to all instances (see https://cloud.google.com/compute/docs/label-or-tag-resources#tags).

zone
string
array

Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes. You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget): ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role) if [[ "${ROLE}" == 'Master' ]]; then ... master specific actions ... else ... worker specific actions ... fi

array

Lifecycle setting for the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

Security settings for the cluster.

array

The config settings for software inside the cluster.

array

The set of components to activate on the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
labels
object
project
string
version
number
object

THIS IS A BETA FIELD. It will be honored unless the Management Policies feature flag is disabled. InitProvider holds the same fields as ForProvider, with the exception of Identifier and other resource reference fields. The fields that are in InitProvider are merged into ForProvider when the resource is created. The same fields are also added to the terraform ignore_changes hook, to avoid updating them after creation. This is useful for fields that are required on creation, but we do not desire to update them after creation, for example because of an external controller is managing them, like an autoscaler.

jobs
array

Required. The Directed Acyclic Graph of Jobs to submit.

array

Job is a Hadoop job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
hiveJob
array

Job is a Hive job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

labels
object
pigJob
array

Job is a Pig job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

The optional list of prerequisite job step_ids. If not specified, the job will start at the beginning of workflow.

array

Job is a Presto job.

array

Presto client tags to attach to this query

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

Job is a PySpark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

array

Job scheduling configuration.

array

Job is a Spark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
array

Job is a SparkR job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

The runtime log config for job execution.

array

Job is a SparkSql job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

stepId
string
labels
object
array

Template parameters whose values are substituted into the template. Values for parameters must be provided when the template is instantiated.

fields
array

Required. Paths to all fields that the parameter replaces. A field is allowed to appear in at most one parameter's list of field paths. A field path is similar in syntax to a google.protobuf.FieldMask; for example, a field path that references a job's arguments would be specified as jobs['step-id'].sparkJob.args.

name
string
array

Validation rules to be applied to this parameter's value.

regex
array

Validation based on regular expressions.

regexes
array

Required. RE2 regular expressions used to validate the parameter's value. The value must match the regex in its entirety (substring matches are not sufficient).

values
array

Required. List of allowed values for the parameter.

values
array

Required. List of allowed values for the parameter.

array

Required. WorkflowTemplate scheduling information.

array

A selector that chooses target cluster for jobs based on metadata. The selector is evaluated at the time each job is submitted.

zone
string
array

A cluster that is managed by the workflow.

config
array

Required. The cluster configuration.

array

Autoscaling config for the policy associated with the cluster. Cluster does not autoscale if this field is unset.

policy
string
array

Encryption settings for the cluster.

array

Port/endpoint configuration for this cluster

array

The shared Compute Engine config settings for all instances in a cluster.

metadata
object
network
string
array

Node Group Affinity for sole-tenant clusters.

nodeGroup
string
array

Reservation Affinity for consuming Zonal reservation.

key
string
values
array

Required. List of allowed values for the parameter.

array

Shielded Instance Config for clusters using Compute Engine Shielded VMs. Structure defined below.

tags
array

The Compute Engine tags to add to all instances (see https://cloud.google.com/compute/docs/label-or-tag-resources#tags).

zone
string
array

Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes. You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget): ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role) if [[ "${ROLE}" == 'Master' ]]; then ... master specific actions ... else ... worker specific actions ... fi

array

Lifecycle setting for the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

Security settings for the cluster.

array

The config settings for software inside the cluster.

array

The set of components to activate on the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
labels
object
project
string
version
number
array

THIS IS A BETA FIELD. It is on by default but can be opted out through a Crossplane feature flag. ManagementPolicies specify the array of actions Crossplane is allowed to take on the managed and external resources. This field is planned to replace the DeletionPolicy field in a future release. Currently, both could be set independently and non-default values would be honored if the feature flag is enabled. If both are custom, the DeletionPolicy field will be ignored. See the design doc for more information: https://github.com/crossplane/crossplane/blob/499895a25d1a1a0ba1604944ef98ac7a1a71f197/design/design-doc-observe-only-resources.md?plain=1#L223 and this one: https://github.com/crossplane/crossplane/blob/444267e84783136daa93568b364a5f01228cacbe/design/one-pager-ignore-changes.md
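For example, assuming the Management Policies feature flag is enabled, an observe-only resource could be declared as follows (an illustrative fragment, not part of the schema above):

```yaml
spec:
  managementPolicies: ["Observe"]   # Crossplane observes the external resource but never creates, updates, or deletes it
```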

object

ProviderConfigReference specifies how the provider that will be used to create, observe, update, and delete this managed resource should be configured.

name
string (required)
policy
object

Policies for referencing.

resolve
string
object

PublishConnectionDetailsTo specifies the connection secret config which contains a name, metadata and a reference to secret store config to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource.

configRef
object

SecretStoreConfigRef specifies which secret store config should be used for this ConnectionSecret.

name
string (required)
policy
object

Policies for referencing.

resolve
string
metadata
object

Metadata is the metadata for connection secret.

labels
object
type
string
name
string (required)
object

WriteConnectionSecretToReference specifies the namespace and name of a Secret to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource. This field is planned to be replaced in a future release in favor of PublishConnectionDetailsTo. Currently, both could be set independently and connection details would be published to both without affecting each other.

name
string (required)
namespace
string (required)
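A minimal sketch of wiring the connection secret described above (the secret name and namespace are assumptions):

```yaml
spec:
  writeConnectionSecretToRef:
    name: workflowtemplate-conn     # required: name of the Secret to write
    namespace: crossplane-system    # required: namespace of the Secret
```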
status
object

WorkflowTemplateStatus defines the observed state of WorkflowTemplate.

object

No description provided.

id
string
jobs
array

Required. The Directed Acyclic Graph of Jobs to submit.

array

Job is a Hadoop job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
hiveJob
array

Job is a Hive job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

labels
object
pigJob
array

Job is a Pig job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

The optional list of prerequisite job step_ids. If not specified, the job will start at the beginning of workflow.

array

Job is a Presto job.

array

Presto client tags to attach to this query

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

array

Job is a PySpark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

array

Job scheduling configuration.

array

Job is a Spark job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

mainClass
string
array

Job is a SparkR job.

array

HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

The runtime log config for job execution.

array

Job is a SparkSql job.

array

HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

The runtime log config for job execution.

array

A list of queries.

queries
array

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": [ "query1", "query2", "query3;query4" ] } }

stepId
string
labels
object
location
string
array

Template parameters whose values are substituted into the template. Values for parameters must be provided when the template is instantiated.

fields
array

Required. Paths to all fields that the parameter replaces. A field is allowed to appear in at most one parameter's list of field paths. A field path is similar in syntax to a google.protobuf.FieldMask; for example, a field path that references a job's arguments would be specified as jobs['step-id'].sparkJob.args.

name
string
array

Validation rules to be applied to this parameter's value.

regex
array

Validation based on regular expressions.

regexes
array

Required. RE2 regular expressions used to validate the parameter's value. The value must match the regex in its entirety (substring matches are not sufficient).

values
array

Required. List of allowed values for the parameter.

values
array

Required. List of allowed values for the parameter.

array

Required. WorkflowTemplate scheduling information.

array

A selector that chooses target cluster for jobs based on metadata. The selector is evaluated at the time each job is submitted.

zone
string
array

A cluster that is managed by the workflow.

config
array

Required. The cluster configuration.

array

Autoscaling config for the policy associated with the cluster. Cluster does not autoscale if this field is unset.

policy
string
array

Encryption settings for the cluster.

array

Port/endpoint configuration for this cluster

array

The shared Compute Engine config settings for all instances in a cluster.

metadata
object
network
string
array

Node Group Affinity for sole-tenant clusters.

nodeGroup
string
array

Reservation Affinity for consuming Zonal reservation.

key
string
values
array

Required. List of allowed values for the parameter.

array

Shielded Instance Config for clusters using Compute Engine Shielded VMs. Structure defined below.

tags
array

The Compute Engine tags to add to all instances (see https://cloud.google.com/compute/docs/label-or-tag-resources#tags).

zone
string
array

Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes. You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget): ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role) if [[ "${ROLE}" == 'Master' ]]; then ... master specific actions ... else ... worker specific actions ... fi

array

Lifecycle setting for the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

array

Security settings for the cluster.

array

The config settings for software inside the cluster.

array

The set of components to activate on the cluster.

array

The Compute Engine config settings for additional worker instances in a cluster.

array

The Compute Engine accelerator configuration for these instances.

array

Disk option config settings.

image
string
array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

labels
object
project
string
version
number
array

Conditions of the resource.

lastTransitionTime
string (required)
message
string
reason
string (required)
status
string (required)
type
string (required)
Marketplace

Discover the building blocks for your internal cloud platform.

© 2022 Upbound, Inc.
