Cluster is the Schema for the Clusters API. Manages a Cloud Dataproc cluster resource.
Type: CRD
Group: dataproc.gcp.upbound.io
Version: v1beta1
apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: Cluster
ClusterSpec defines the desired state of Cluster.
No description provided.
Allows you to configure various aspects of the cluster. Structure defined below.
The autoscaling policy config associated with the cluster. Note that once set, if autoscaling_config is the only field set in cluster_config, it can only be removed by setting policy_uri = "", rather than removing the whole block. Structure defined below.
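The removal rule above can be sketched as a manifest fragment. This is a minimal illustration, assuming the provider exposes the Terraform attributes autoscaling_config and policy_uri as autoscalingConfig and policyUri; the project and policy names are placeholders, so check the CRD schema before relying on them.

```yaml
spec:
  forProvider:
    region: us-central1
    clusterConfig:
    - autoscalingConfig:
      # Assumed field name (camelCase of policy_uri); placeholder policy path.
      - policyUri: projects/my-project/regions/us-central1/autoscalingPolicies/my-policy
```

If autoscalingConfig is the only block set under clusterConfig, detaching the policy later would mean setting policyUri: "" rather than deleting the block.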
The Compute Engine accelerator (GPU) configuration for these instances. Can be specified multiple times. Structure defined below.
Metrics sources to enable.
One or more [available OSS metrics](https://cloud.google.com/dataproc/docs/guides/monitoring#available_oss_metrics) to collect for the metric source.
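A metrics block might look like the following sketch. The field names dataprocMetricConfig, metrics, metricSource and metricOverrides are assumed camelCase forms of the underlying Terraform attributes, and the HDFS source and override value are only examples.

```yaml
spec:
  forProvider:
    clusterConfig:
    - dataprocMetricConfig:
      - metrics:
        # Assumed field names; one entry per metric source to enable.
        - metricSource: HDFS
          metricOverrides:
          - yarn:ResourceManager:QueueMetrics:AppsCompleted
```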
The Customer managed encryption keys settings for the cluster. Structure defined below.
The config settings for port access on the cluster. Structure defined below.
Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster. Structure defined below.
Node Group Affinity for sole-tenant clusters.
Reservation Affinity for consuming zonal reservation.
Corresponds to the label values of reservation resource.
Reference to a ServiceAccount in cloudplatform to populate serviceAccount.
Policies for referencing.
The set of Google API scopes to be made available on all of the node VMs under the service_account specified. Both OAuth2 URLs and gcloud short names are supported. To allow full access to all Cloud APIs, use the cloud-platform scope. See a complete list of scopes here.
Selector for a ServiceAccount in cloudplatform to populate serviceAccount.
Policies for selection.
Shielded Instance Config for clusters using Compute Engine Shielded VMs.
The list of instance tags applied to instances in the cluster. Tags are used to identify valid sources or targets for network firewalls.
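The Compute Engine settings described above could be combined roughly as follows. This is a sketch under assumptions: gceClusterConfig, serviceAccountRef, serviceAccountScopes and tags are inferred from the field descriptions and the Terraform attribute names, and the referenced ServiceAccount name and tag values are hypothetical.

```yaml
spec:
  forProvider:
    clusterConfig:
    - gceClusterConfig:
      # Resolve serviceAccount from a cloudplatform ServiceAccount managed resource
      # (hypothetical name).
      - serviceAccountRef:
          name: dataproc-sa
        # Scopes granted to node VMs; cloud-platform allows access to all Cloud APIs.
        serviceAccountScopes:
        - cloud-platform
        # Network tags used as firewall sources or targets (placeholder values).
        tags:
        - dataproc
        - internal
```

A serviceAccountSelector with matchLabels could be used in place of serviceAccountRef when the ServiceAccount should be chosen by label rather than by name.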
Commands to execute on each node after config is completed. You can specify multiple versions of these. Structure defined below.
The settings for the cluster's auto-deletion schedule. Structure defined below.
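An initialization action might be declared as in the sketch below. The names initializationAction, script and timeoutSec are assumed camelCase forms of the Terraform attributes, and the bucket path is a placeholder.

```yaml
spec:
  forProvider:
    clusterConfig:
    # Each list entry is one command run on every node after configuration completes.
    - initializationAction:
      - script: gs://my-bucket/init/install-tools.sh  # placeholder script location
        timeoutSec: 500                               # assumed field name
```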
The Google Compute Engine config settings for the master instances in a cluster. Structure defined below.
The Compute Engine accelerator (GPU) configuration for these instances. Can be specified multiple times.
Disk Config
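For illustration, a master configuration with a GPU and a custom boot disk might be written as below. Only diskConfig and bootDiskType appear in this page's own examples; masterConfig, numInstances, machineType, accelerators, acceleratorType, acceleratorCount and bootDiskSizeGb are assumed from the Terraform attribute names, and the machine and accelerator types are placeholders.

```yaml
spec:
  forProvider:
    clusterConfig:
    - masterConfig:
      - numInstances: 1
        machineType: n1-standard-4
        # May be repeated to attach more than one accelerator type.
        accelerators:
        - acceleratorType: nvidia-tesla-t4
          acceleratorCount: 1
        diskConfig:
        - bootDiskType: pd-ssd
          bootDiskSizeGb: 100
```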
The config setting for the metastore service associated with the cluster. Structure defined below.
The Google Compute Engine config settings for the additional instances in a cluster. Structure defined below.
Security related configuration. Structure defined below.
Kerberos Configuration
The config settings for software inside the cluster. Structure defined below.
The set of optional components to activate on the cluster. See Available Optional Components.
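Optional components could be enabled as in this sketch. The field names softwareConfig, imageVersion and optionalComponents are assumptions based on the Terraform attributes, and the image version and component are examples only.

```yaml
spec:
  forProvider:
    clusterConfig:
    - softwareConfig:
      - imageVersion: 2.0-debian10  # example image version
        # Assumed field name; values come from the Available Optional Components list.
        optionalComponents:
        - JUPYTER
```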
The Google Compute Engine config settings for the worker instances in a cluster. Structure defined below.
The Compute Engine accelerator configuration for these instances. Can be specified multiple times.
Disk Config
Allows you to configure a virtual Dataproc on GKE cluster. Structure defined below.
Configuration of auxiliary services used by this cluster. Structure defined below.
The config setting for the metastore service associated with the cluster. Structure defined below.
The Spark History Server configuration for the workload.
The configuration for running the Dataproc cluster on Kubernetes. Structure defined below.
The configuration for running the Dataproc cluster on GKE.
GKE node pools where workloads will be scheduled. At least one node pool must be assigned the DEFAULT GkeNodePoolTarget.Role. If a GkeNodePoolTarget is not specified, Dataproc constructs a DEFAULT GkeNodePoolTarget. Each role can be given to only one GkeNodePoolTarget. All node pools must have the same location settings.
(Input only) The configuration for the GKE node pool. If specified, Dataproc attempts to create a node pool with the specified shape. If one with the same name already exists, it is verified against all specified fields. If a field differs, the virtual cluster creation will fail.
The autoscaler configuration for this node pool. The autoscaler is enabled only when a valid configuration is present.
The node pool configuration.
The list of Compute Engine zones where node pool nodes associated with a Dataproc on GKE virtual cluster will be located.
The roles associated with the GKE node pool. One of "DEFAULT", "CONTROLLER", "SPARK_DRIVER" or "SPARK_EXECUTOR".
The software configuration for this Dataproc cluster running on Kubernetes.
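The Dataproc on GKE fields above nest fairly deeply, so a rough sketch may help. Every field name below virtualClusterConfig is an assumed camelCase form of the corresponding Terraform attribute, and the cluster path, namespace, node pool name and component version are placeholders to verify against the CRD schema.

```yaml
spec:
  forProvider:
    region: us-central1
    virtualClusterConfig:
    - kubernetesClusterConfig:
      - kubernetesNamespace: dataproc-jobs
        kubernetesSoftwareConfig:
        - componentVersion:
            SPARK: 3.1-dataproc-7
        gkeClusterConfig:
        - gkeClusterTarget: projects/my-project/locations/us-central1/clusters/my-gke
          nodePoolTarget:
          # At least one node pool needs the DEFAULT role.
          - nodePool: dpgke
            roles:
            - DEFAULT
            nodePoolConfig:
            - locations:
              - us-central1-c
              # The autoscaler is enabled only when a valid configuration is present.
              autoscaling:
              - minNodeCount: 1
                maxNodeCount: 6
              config:
              - machineType: n1-standard-4
```

Because each role can be given to only one node pool target, a second entry would need a different role such as SPARK_EXECUTOR.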
ProviderConfigReference specifies how the provider that will be used to create, observe, update, and delete this managed resource should be configured.
Policies for referencing.
ProviderReference specifies the provider that will be used to create, observe, update, and delete this managed resource. Deprecated: Please use ProviderConfigReference, i.e. providerConfigRef.
Policies for referencing.
PublishConnectionDetailsTo specifies the connection secret config which contains a name, metadata and a reference to secret store config to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource.
WriteConnectionSecretToReference specifies the namespace and name of a Secret to which any connection details for this managed resource should be written. Connection details frequently include the endpoint, username, and password required to connect to the managed resource. This field is planned to be replaced in a future release in favor of PublishConnectionDetailsTo. Currently, both could be set independently and connection details would be published to both without affecting each other.
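The provider and connection-secret fields sit directly under spec, alongside forProvider. A minimal sketch, assuming a ProviderConfig named default and a hypothetical Secret name and namespace:

```yaml
spec:
  # Which ProviderConfig supplies credentials for this managed resource.
  providerConfigRef:
    name: default
  # Where connection details are written (hypothetical Secret name and namespace).
  writeConnectionSecretToRef:
    name: simplecluster-conn
    namespace: crossplane-system
  forProvider:
    name: simplecluster
    region: us-central1
```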
ClusterStatus defines the observed state of Cluster.
No description provided.
Allows you to configure various aspects of the cluster. Structure defined below.
The autoscaling policy config associated with the cluster. Note that once set, if autoscaling_config is the only field set in cluster_config, it can only be removed by setting policy_uri = "", rather than removing the whole block. Structure defined below.
The Compute Engine accelerator (GPU) configuration for these instances. Can be specified multiple times. Structure defined below.
Metrics sources to enable.
One or more [available OSS metrics](https://cloud.google.com/dataproc/docs/guides/monitoring#available_oss_metrics) to collect for the metric source.
The Customer managed encryption keys settings for the cluster. Structure defined below.
The config settings for port access on the cluster. Structure defined below.
Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster. Structure defined below.
Node Group Affinity for sole-tenant clusters.
Reservation Affinity for consuming zonal reservation.
Corresponds to the label values of reservation resource.
The set of Google API scopes to be made available on all of the node VMs under the service_account specified. Both OAuth2 URLs and gcloud short names are supported. To allow full access to all Cloud APIs, use the cloud-platform scope. See a complete list of scopes here.
Shielded Instance Config for clusters using Compute Engine Shielded VMs.
The list of instance tags applied to instances in the cluster. Tags are used to identify valid sources or targets for network firewalls.
Commands to execute on each node after config is completed. You can specify multiple versions of these. Structure defined below.
The settings for the cluster's auto-deletion schedule. Structure defined below.
The Google Compute Engine config settings for the master instances in a cluster. Structure defined below.
The Compute Engine accelerator (GPU) configuration for these instances. Can be specified multiple times.
Disk Config
List of master instance names which have been assigned to the cluster.
The config setting for the metastore service associated with the cluster. Structure defined below.
The Google Compute Engine config settings for the additional instances in a cluster. Structure defined below.
Disk Config
List of preemptible instance names which have been assigned to the cluster.
Security related configuration. Structure defined below.
Kerberos Configuration
The config settings for software inside the cluster. Structure defined below.
The set of optional components to activate on the cluster. See Available Optional Components.
The Google Compute Engine config settings for the worker instances in a cluster. Structure defined below.
The Compute Engine accelerator configuration for these instances. Can be specified multiple times.
Disk Config
List of worker instance names which have been assigned to the cluster.
Allows you to configure a virtual Dataproc on GKE cluster. Structure defined below.
Configuration of auxiliary services used by this cluster. Structure defined below.
The config setting for the metastore service associated with the cluster. Structure defined below.
The Spark History Server configuration for the workload.
The configuration for running the Dataproc cluster on Kubernetes. Structure defined below.
The configuration for running the Dataproc cluster on GKE.
GKE node pools where workloads will be scheduled. At least one node pool must be assigned the DEFAULT GkeNodePoolTarget.Role. If a GkeNodePoolTarget is not specified, Dataproc constructs a DEFAULT GkeNodePoolTarget. Each role can be given to only one GkeNodePoolTarget. All node pools must have the same location settings.
(Input only) The configuration for the GKE node pool. If specified, Dataproc attempts to create a node pool with the specified shape. If one with the same name already exists, it is verified against all specified fields. If a field differs, the virtual cluster creation will fail.
The autoscaler configuration for this node pool. The autoscaler is enabled only when a valid configuration is present.
The node pool configuration.
The list of Compute Engine zones where node pool nodes associated with a Dataproc on GKE virtual cluster will be located.
The roles associated with the GKE node pool. One of "DEFAULT", "CONTROLLER", "SPARK_DRIVER" or "SPARK_EXECUTOR".
The software configuration for this Dataproc cluster running on Kubernetes.
Conditions of the resource.
simplecluster
apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: Cluster
metadata:
  annotations:
    meta.upbound.io/example-id: dataproc/v1beta1/cluster
  labels:
    testing.upbound.io/example-name: simplecluster
  name: simplecluster
spec:
  forProvider:
    clusterConfig:
    - preemptibleWorkerConfig:
      - diskConfig:
        - bootDiskType: pd-standard
    name: simplecluster
    region: us-central1
mycluster
apiVersion: dataproc.gcp.upbound.io/v1beta1
kind: Cluster
metadata:
  annotations:
    meta.upbound.io/example-id: dataproc/v1beta1/job
  labels:
    testing.upbound.io/example-name: mycluster
  name: mycluster
spec:
  forProvider:
    clusterConfig:
    - preemptibleWorkerConfig:
      - diskConfig:
        - bootDiskType: pd-standard
    name: dproc-cluster-unique-name
    region: us-central1
© 2022 Upbound, Inc.