upbound/provider-gcp@v0.26.0
WorkflowTemplate
dataproc.gcp.upbound.io

WorkflowTemplate is the Schema for the WorkflowTemplates API. A Workflow Template is a reusable workflow configuration.

Type

CRD

Group

dataproc.gcp.upbound.io

Version

v1beta1

apiVersion: dataproc.gcp.upbound.io/v1beta1

kind: WorkflowTemplate
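
For orientation, a minimal manifest sketch follows. The metadata name, labels, location, and query text are placeholders, and the nested block names that the reference below lists only as "array" (the cluster selector, the per-engine job blocks, and the query list) are assumed from the underlying Dataproc API rather than confirmed by this page.

apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: WorkflowTemplate
metadata:
  name: example-workflow-template       # placeholder name
spec:
  forProvider:
    location: us-central1               # placeholder region
    placement:
      - clusterSelector:                # assumed field name
          - clusterLabels:
              env: staging              # placeholder label
    jobs:
      - stepId: first-step
        hiveJob:
          - queryList:                  # assumed field name
              - queries:
                  - "SHOW DATABASES;"   # placeholder query
  providerConfigRef:
    name: default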

API Documentation
apiVersion
string
kind
string
metadata
object
spec
object

WorkflowTemplateSpec defines the desired state of WorkflowTemplate

forProvider
object (required)

No description provided.

jobs
array (required)

Required. The Directed Acyclic Graph of Jobs to submit.

array

Optional. Job is a Hadoop job.

array

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

Optional. The runtime log config for job execution.

mainClass
string
hiveJob
array

Optional. Job is a Hive job.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

A list of queries.

queries
array (required)

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": } }
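
As a sketch of how these queries are supplied under spec.forProvider.jobs (assuming the enclosing list-of-queries block is named queryList, as in the Dataproc API; the step id and query text are placeholders):

    jobs:
      - stepId: run-hive                   # placeholder step id
        hiveJob:
          - queryList:                     # assumed field name
              - queries:
                  - "SHOW TABLES;"
                  - "SELECT 1; SELECT 2"   # two queries in one string, separated by a semicolon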

labels
object
pigJob
array

Optional. Job is a Pig job.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

Optional. The runtime log config for job execution.

array

A list of queries.

queries
array (required)

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": } }

array

Optional. The optional list of prerequisite job step_ids. If not specified, the job will start at the beginning of workflow.

array

Optional. Job is a Presto job.

array

Optional. Presto client tags to attach to this query

array

Optional. The runtime log config for job execution.

array

A list of queries.

queries
array (required)

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": } }

array

Optional. Job is a PySpark job.

array

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

Optional. The runtime log config for job execution.

mainPythonFileUri
string (required)
array

Optional. HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
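
A sketch of a PySpark step under spec.forProvider.jobs, assuming the job block is named pysparkJob and the Python file list is named pythonFileUris (neither name is spelled out in this listing); the bucket paths are placeholders:

    jobs:
      - stepId: run-pyspark                                # placeholder step id
        pysparkJob:                                        # assumed field name
          - mainPythonFileUri: gs://my-bucket/jobs/main.py # placeholder URI
            args:
              - "--date=2024-01-01"                        # illustrative driver argument
            pythonFileUris:                                # assumed field name
              - gs://my-bucket/jobs/helpers.py             # placeholder URI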

array

Optional. Job scheduling configuration.

array

Optional. Job is a Spark job.

array

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

Optional. The runtime log config for job execution.

mainClass
string
array

Optional. Job is a SparkR job.

array

Optional. HCFS URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

args
array

Optional. The arguments to pass to the driver. Do not include arguments, such as --conf, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

array

Optional. HCFS URIs of files to be placed in the working directory of each executor. Useful for naively parallel tasks.

array

Optional. The runtime log config for job execution.

mainRFileUri
string (required)
array

Optional. Job is a SparkSql job.

array

Optional. HCFS URIs of jar files to be added to the Spark CLASSPATH.

array

Optional. The runtime log config for job execution.

array

A list of queries.

queries
array (required)

Required. The queries to execute. You do not need to end a query expression with a semicolon. Multiple queries can be specified in one string by separating each with a semicolon. Here is an example of a Dataproc API snippet that uses a QueryList to specify a HiveJob: "hiveJob": { "queryList": { "queries": } }

stepId
string (required)
labels
object
location
string (required)
array

Optional. Template parameters whose values are substituted into the template. Values for parameters must be provided when the template is instantiated.

fields
array (required)

Required. Paths to all fields that the parameter replaces. A field is allowed to appear in at most one parameter's list of field paths. A field path is similar in syntax to a .sparkJob.args

name
string (required)
array

Optional. Validation rules to be applied to this parameter's value.

regex
array

Validation based on regular expressions.

regexes
array (required)

Required. RE2 regular expressions used to validate the parameter's value. The value must match the regex in its entirety (substring matches are not sufficient).

values
array

Required. List of allowed values for the parameter.

values
array (required)

Required. List of allowed values for the parameter.
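
A sketch of a template parameter with a regex validation rule, assuming the top-level block is named parameters and the rule block is named validation (both appear only as "array" above); the parameter name, field path, and pattern are placeholders:

    parameters:                              # assumed field name
      - name: ZONE
        fields:
          - placement.clusterSelector.zone   # illustrative field path
        validation:                          # assumed field name
          - regex:
              - regexes:
                  - "us-central1-[a-f]"      # placeholder RE2 pattern; must match the whole value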

placement
array (required)

Required. WorkflowTemplate scheduling information.

array

Optional. A selector that chooses target cluster for jobs based on metadata. The selector is evaluated at the time each job is submitted.

clusterLabels
object (required)
zone
string
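
A sketch of selecting an existing cluster by label, assuming the selector block is named clusterSelector (listed only as "array" above); the label and zone are placeholders:

    placement:
      - clusterSelector:                     # assumed field name
          - clusterLabels:
              env: production                # placeholder label
            zone: us-central1-a              # placeholder zone
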
array

A cluster that is managed by the workflow.

clusterName
string (required)
config
array (required)

Required. The cluster configuration.

array

Optional. Autoscaling config for the policy associated with the cluster. Cluster does not autoscale if this field is unset.

policy
string
array

Optional. Encryption settings for the cluster.

array

Optional. Port/endpoint configuration for this cluster

array

Optional. The shared Compute Engine config settings for all instances in a cluster.

metadata
object
network
string
array

Optional. Node Group Affinity for sole-tenant clusters.

nodeGroup
string (required)
array

Optional. Reservation Affinity for consuming Zonal reservation.

key
string
values
array

Required. List of allowed values for the parameter.

array

Optional. The URIs of service account scopes to be included in Compute Engine instances. The following base set of scopes is always included: * https://www.googleapis.com/auth/cloud.useraccounts.readonly * https://www.googleapis.com/auth/devstorage.read_write * https://www.googleapis.com/auth/logging.write If no scopes are specified, the following defaults are also provided: * https://www.googleapis.com/auth/bigquery * https://www.googleapis.com/auth/bigtable.admin.table * https://www.googleapis.com/auth/bigtable.data * https://www.googleapis.com/auth/devstorage.full_control

array

Optional. Shielded Instance Config for clusters using Compute Engine Shielded VMs. Structure defined below.

tags
array

The Compute Engine tags to add to all instances (see https://cloud.google.com/compute/docs/label-or-tag-resources#tags).

zone
string
array

Optional. Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes. You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget): ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role) if [[ "${ROLE}" == 'Master' ]]; then ... master specific actions ... else ... worker specific actions ... fi

array

Optional. Lifecycle setting for the cluster.

array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Optional. The Compute Engine accelerator configuration for these instances.

array

Optional. Disk option config settings.

image
string
array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Optional. The Compute Engine accelerator configuration for these instances.

array

Optional. Disk option config settings.

image
string
array

Optional. Security settings for the cluster.

array

Optional. The config settings for software inside the cluster.

array

No description provided.

array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Optional. The Compute Engine accelerator configuration for these instances.

array

Optional. Disk option config settings.

image
string
labels
object
project
string
version
number
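
A sketch of the managed-cluster form of placement described above, assuming the block names managedCluster and gceClusterConfig (both appear only as "array" in this listing); the cluster name, zone, and network are placeholders:

    placement:
      - managedCluster:                      # assumed field name
          - clusterName: ephemeral-cluster   # placeholder name
            config:
              - gceClusterConfig:            # assumed field name
                  - zone: us-central1-a      # placeholder zone
                    network: default         # placeholder network
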
object

ProviderConfigReference specifies how the provider that will be used to create, observe, update, and delete this managed resource should be configured.

name
string (required)
policy
object

Policies for referencing.

resolve
string
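
A sketch of the reference, assuming the field is named providerConfigRef as the deprecation note below indicates; the ProviderConfig name is a placeholder and the resolve value is one of the standard Crossplane resolution policies:

  providerConfigRef:
    name: default                            # placeholder ProviderConfig name
    policy:
      resolve: Always                        # standard Crossplane resolution policy
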
object

ProviderReference specifies the provider that will be used to create, observe, update, and delete this managed resource. Deprecated: Please use ProviderConfigReference, i.e. providerConfigRef

name
string (required)
policy
object

Policies for referencing.

resolve
string
object

PublishConnectionDetailsTo specifies the connection secret config which contains a name, metadata and a reference to secret store config to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource.

configRef
object

SecretStoreConfigRef specifies which secret store config should be used for this ConnectionSecret.

name
string (required)
policy
object

Policies for referencing.

resolve
string
metadata
object

Metadata is the metadata for connection secret.

labels
object
type
string
name
string (required)
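
A sketch of publishing connection details, assuming the field is named publishConnectionDetailsTo as in stock Crossplane managed resources; the secret name, store config name, and label are placeholders:

  publishConnectionDetailsTo:                # assumed field name
    name: workflowtemplate-conn              # placeholder secret name
    configRef:
      name: default                          # placeholder StoreConfig name
    metadata:
      labels:
        app: dataproc                        # placeholder label
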
object

WriteConnectionSecretToReference specifies the namespace and name of a Secret to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource. This field is planned to be replaced in a future release in favor of PublishConnectionDetailsTo. Currently, both could be set independently and connection details would be published to both without affecting each other.

name
string (required)
namespace
string (required)
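
A sketch of the Secret reference, assuming the field is named writeConnectionSecretToRef as in stock Crossplane managed resources; the Secret name and namespace are placeholders:

  writeConnectionSecretToRef:                # assumed field name
    name: workflowtemplate-conn              # placeholder Secret name
    namespace: crossplane-system             # placeholder namespace
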
status
object

WorkflowTemplateStatus defines the observed state of WorkflowTemplate.

object

No description provided.

id
string
array

Required. WorkflowTemplate scheduling information.

array

A cluster that is managed by the workflow.

config
array

Required. The cluster configuration.

array

Optional. Port/endpoint configuration for this cluster

httpPorts
object
array

Optional. Lifecycle setting for the cluster.

array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

array

Optional. The Compute Engine config settings for additional worker instances in a cluster.

array

Output only. The list of instance names. Dataproc derives the names from cluster_name, num_instances, and the instance group.

array

Output only. The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.

array

Conditions of the resource.

lastTransitionTime
string (required)
message
string
reason
string (required)
status
string (required)
type
string (required)
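
For reference, the conditions reported on Crossplane managed resources typically look like the sketch below; the timestamps and the exact reason strings are illustrative:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Available
      lastTransitionTime: "2022-10-01T00:00:00Z"   # illustrative timestamp
    - type: Synced
      status: "True"
      reason: ReconcileSuccess
      lastTransitionTime: "2022-10-01T00:00:00Z"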