Data Proc API, gRPC: ClusterService
A set of methods for managing Data Proc clusters.
Call | Description |
---|---|
Get | Returns the specified cluster. |
List | Retrieves the list of clusters in the specified folder. |
Create | Creates a cluster in the specified folder. |
Update | Updates the configuration of the specified cluster. |
Delete | Deletes the specified cluster. |
Start | Starts the specified cluster. |
Stop | Stops the specified cluster. |
ListOperations | Lists operations for the specified cluster. |
ListHosts | Retrieves the list of hosts in the specified cluster. |
ListUILinks | Retrieves a list of links to web interfaces being proxied by Data Proc UI Proxy. |
Calls ClusterService
Get
Returns the specified cluster.
To get the list of all available clusters, make a ClusterService.List request.
rpc Get (GetClusterRequest) returns (Cluster)
GetClusterRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the Data Proc cluster. To get a cluster ID make a ClusterService.List request. The maximum string length in characters is 50. |
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true . |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for access to cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
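
The `properties` key format described above (service, then a colon, then the property name) can be illustrated with a small sketch. The helper names `split_property_key` and `group_by_service` are hypothetical and not part of the Data Proc API; the sketch only shows how keys such as `hdfs:dfs.replication` break down per service.

```python
from typing import Dict, Tuple


def split_property_key(key: str) -> Tuple[str, str]:
    """Split a 'service:property' key into (service, property)."""
    # Split on the first colon only, so property names may contain colons.
    service, sep, prop = key.partition(":")
    if not sep or not service or not prop:
        raise ValueError(f"expected 'service:property', got {key!r}")
    return service, prop


def group_by_service(properties: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group a flat properties map by the service prefix of each key."""
    grouped: Dict[str, Dict[str, str]] = {}
    for key, value in properties.items():
        service, prop = split_property_key(key)
        grouped.setdefault(service, {})[prop] = value
    return grouped


if __name__ == "__main__":
    print(group_by_service({
        "hdfs:dfs.replication": "2",
        "dataproc:disable_cloud_logging": "true",
    }))
```

Checking keys on the client side like this is optional; the service remains the authority on which keys it accepts.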
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file |
args[] | string Arguments to the initialization action |
timeout | int64 Execution timeout |
List
Retrieves the list of clusters in the specified folder.
rpc List (ListClustersRequest) returns (ListClustersResponse)
ListClustersRequest
Field | Description |
---|---|
folder_id | string Required. ID of the folder to list clusters in. To get the folder ID make a yandex.cloud.resourcemanager.v1.FolderService.List request. The maximum string length in characters is 50. |
page_size | int64 The maximum number of results per page to return. If the number of available results is larger than page_size , the service returns a ListClustersResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000. |
page_token | string Page token. To get the next page of results, set page_token to the ListClustersResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100. |
filter | string A filter expression that filters clusters listed in the response. Example of a filter: name=my-cluster . The maximum string length in characters is 1000. |
ListClustersResponse
Field | Description |
---|---|
clusters[] | Cluster List of clusters in the specified folder. |
next_page_token | string Token for getting the next page of the list. If the number of results is greater than the specified ListClustersRequest.page_size, use next_page_token as the value for the ListClustersRequest.page_token parameter in the next list request. Each subsequent page will have its own next_page_token to continue paging through the results. |
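
The paging contract between `page_size`, `page_token`, and `next_page_token` can be sketched as a generator. This is a minimal illustration, not client code for the real service: `list_page` is a hypothetical callable standing in for a `ClusterService.List` call, and responses are modelled as plain dicts rather than protobuf messages.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical stand-in for a List call: (page_size, page_token) -> response dict.
ListPage = Callable[[int, str], Dict[str, Any]]


def iter_clusters(list_page: ListPage, page_size: int = 100) -> Iterator[Any]:
    """Yield every cluster by following next_page_token until it is empty."""
    if page_size > 1000:
        raise ValueError("page_size must not exceed 1000")
    token = ""
    while True:
        response = list_page(page_size, token)
        yield from response.get("clusters", [])
        token = response.get("next_page_token", "")
        if not token:
            break


def fake_list_page(page_size: int, page_token: str) -> Dict[str, Any]:
    """In-memory pager used only to exercise the loop above."""
    names = ["a", "b", "c", "d", "e"]
    start = int(page_token or 0)
    end = start + page_size
    return {
        "clusters": names[start:end],
        "next_page_token": str(end) if end < len(names) else "",
    }


if __name__ == "__main__":
    print(list(iter_clusters(fake_list_page, page_size=2)))
```

With a real client, `list_page` would wrap the actual request and return the response's cluster list and token.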
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true . |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for access to cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file |
args[] | string Arguments to the initialization action |
timeout | int64 Execution timeout |
Create
Creates a cluster in the specified folder.
rpc Create (CreateClusterRequest) returns (operation.Operation)
Metadata and response of Operation:
Operation.metadata:CreateClusterMetadata
Operation.response:Cluster
CreateClusterRequest
Field | Description |
---|---|
folder_id | string Required. ID of the folder to create a cluster in. To get a folder ID make a yandex.cloud.resourcemanager.v1.FolderService.List request. The maximum string length in characters is 50. |
name | string Name of the cluster. The name must be unique within the folder. The name can't be changed after the Data Proc cluster is created. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] . |
description | string Description of the cluster. The maximum string length in characters is 256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. The maximum string length in characters for each value is 63. Each value must match the regular expression [-_0-9a-z]* . The string length in characters for each key must be 1-63. Each key must match the regular expression [a-z][-_0-9a-z]* . |
config_spec | CreateClusterConfigSpec Required. Configuration and resources for hosts that should be created with the cluster. |
zone_id | string Required. ID of the availability zone where the cluster should be placed. To get the list of available zones make a yandex.cloud.compute.v1.ZoneService.List request. The maximum string length in characters is 50. |
service_account_id | string Required. ID of the service account to be used by the Data Proc manager agent. |
bucket | string Name of the Object Storage bucket to use for Data Proc jobs. |
ui_proxy | bool Enable UI Proxy feature. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups to place VMs of cluster on. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, logs are not sent to the logging service. |
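
The `name` and `labels` constraints in the table above can be checked on the client before sending a request. The following is a sketch that uses the regular expressions and limits exactly as listed; the function name `validate_create_fields` is hypothetical, and the server remains the authority on validation.

```python
import re
from typing import Dict, List

# Patterns copied from the field descriptions above. The name pattern starts
# with an empty alternative, so an empty name also matches.
NAME_RE = re.compile(r"|[a-z][-a-z0-9]{1,61}[a-z0-9]")
LABEL_KEY_RE = re.compile(r"[a-z][-_0-9a-z]*")
LABEL_VALUE_RE = re.compile(r"[-_0-9a-z]*")


def validate_create_fields(name: str, labels: Dict[str, str]) -> List[str]:
    """Return a list of human-readable constraint violations (empty if none)."""
    errors: List[str] = []
    if not NAME_RE.fullmatch(name):
        errors.append(f"name {name!r} does not match the required pattern")
    if len(labels) > 64:
        errors.append("no more than 64 labels are allowed per resource")
    for key, value in labels.items():
        if not (1 <= len(key) <= 63 and LABEL_KEY_RE.fullmatch(key)):
            errors.append(f"label key {key!r} is invalid")
        if not (len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value)):
            errors.append(f"label value {value!r} is invalid")
    return errors


if __name__ == "__main__":
    print(validate_create_fields("my-cluster", {"env": "prod"}))
    print(validate_create_fields("My-Cluster", {"Env": "prod"}))
```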
CreateClusterConfigSpec
Field | Description |
---|---|
version_id | string Version of the image for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific options. |
subclusters_spec[] | CreateSubclusterConfigSpec Specification for creating subclusters. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for access to cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file |
args[] | string Arguments to the initialization action |
timeout | int64 Execution timeout |
CreateSubclusterConfigSpec
Field | Description |
---|---|
name | string Name of the subcluster. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] . |
role | enum Role Required. Role of the subcluster in the Data Proc cluster. |
resources | Resources Required. Resource configuration for hosts in the subcluster. |
subnet_id | string Required. ID of the VPC subnet used for hosts in the subcluster. The maximum string length in characters is 50. |
hosts_count | int64 Number of hosts in the subcluster. The minimum value is 1. |
assign_public_ip | bool Assign public IP addresses to all hosts in the subcluster. |
autoscaling_config | AutoscalingConfig Configuration for subclusters based on instance groups. |
Resources
Field | Description |
---|---|
resource_preset_id | string ID of the resource preset for computational resources available to a host (CPU, memory etc.). All available presets are listed in the documentation. |
disk_type_id | string Type of the storage environment for the host. |
disk_size | int64 Volume of the storage available to a host, in bytes. |
AutoscalingConfig
Field | Description |
---|---|
max_hosts_count | int64 Upper limit for total instance subcluster count. Acceptable values are 1 to 100, inclusive. |
preemptible | bool Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines. |
measurement_duration | google.protobuf.Duration Required. Time in seconds allotted for averaging metrics. Acceptable values are 1m to 10m, inclusive. |
warmup_duration | google.protobuf.Duration The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected. The maximum value is 10m. |
stabilization_duration | google.protobuf.Duration Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should. Acceptable values are 1m to 30m, inclusive. |
cpu_utilization_target | double Defines an autoscaling rule based on the average CPU utilization of the instance group. Acceptable values are 0 to 100, inclusive. |
decommission_timeout | int64 Timeout to gracefully decommission nodes during downscaling, in seconds. Default value: 120. Acceptable values are 0 to 86400, inclusive. |
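
The acceptable ranges listed for `AutoscalingConfig` can be gathered into one client-side check. This is only a sketch under stated assumptions: `AutoscalingSketch` is a hypothetical stand-in for the real message, durations are held as whole seconds (1m = 60), and optional fields left as `None` are skipped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AutoscalingSketch:
    """Hypothetical plain-Python mirror of AutoscalingConfig (durations in seconds)."""
    max_hosts_count: int
    measurement_duration_s: int  # marked Required in the table
    warmup_duration_s: Optional[int] = None
    stabilization_duration_s: Optional[int] = None
    cpu_utilization_target: Optional[float] = None
    decommission_timeout_s: int = 120  # documented default


def _in_range(name: str, value, low, high, errors: List[str]) -> None:
    """Record an error if value is set and falls outside [low, high]."""
    if value is not None and not low <= value <= high:
        errors.append(f"{name} must be in [{low}, {high}], got {value}")


def check_autoscaling(cfg: AutoscalingSketch) -> List[str]:
    """Return violations of the ranges documented in the table above."""
    errors: List[str] = []
    _in_range("max_hosts_count", cfg.max_hosts_count, 1, 100, errors)
    _in_range("measurement_duration", cfg.measurement_duration_s, 60, 600, errors)
    _in_range("warmup_duration", cfg.warmup_duration_s, 0, 600, errors)
    _in_range("stabilization_duration", cfg.stabilization_duration_s, 60, 1800, errors)
    _in_range("cpu_utilization_target", cfg.cpu_utilization_target, 0, 100, errors)
    _in_range("decommission_timeout", cfg.decommission_timeout_s, 0, 86400, errors)
    return errors


if __name__ == "__main__":
    print(check_autoscaling(AutoscalingSketch(max_hosts_count=10, measurement_duration_s=60)))
    print(check_autoscaling(AutoscalingSketch(max_hosts_count=0, measurement_duration_s=30)))
```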
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The response of the operation if it finished successfully. |
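
The rules for `done`, `error`, and `response` described in the `result` row can be expressed as a small classifier plus a polling loop. This is a sketch only: operations are modelled as plain dicts, and `fetch` is a hypothetical callable that returns the latest state of one operation; neither helper is part of the API.

```python
import time
from typing import Any, Callable, Dict


def operation_outcome(op: Dict[str, Any]) -> str:
    """Classify an Operation-like dict as 'in_progress', 'failed' or 'succeeded'."""
    has_error = op.get("error") is not None
    has_response = op.get("response") is not None
    if op.get("done", False):
        # done == true: exactly one of error or response is set.
        if has_error == has_response:
            raise ValueError("a done operation must set exactly one of error/response")
        return "failed" if has_error else "succeeded"
    # done == false: error is set only if a failure was already detected.
    return "failed" if has_error else "in_progress"


def wait_for(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch() until the operation leaves the 'in_progress' state."""
    for _ in range(max_polls):
        op = fetch()
        if operation_outcome(op) != "in_progress":
            return op
        sleep(interval_s)
    raise TimeoutError("operation did not finish within the polling budget")


if __name__ == "__main__":
    states = iter([{"done": False}, {"done": True, "response": {"id": "c1"}}])
    print(wait_for(lambda: next(states), sleep=lambda _s: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.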
CreateClusterMetadata
Field | Description |
---|---|
cluster_id | string ID of the cluster that is being created. |
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true . |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
Update
Updates the configuration of the specified cluster.
rpc Update (UpdateClusterRequest) returns (operation.Operation)
Metadata and response of Operation:
Operation.metadata:UpdateClusterMetadata
Operation.response:Cluster
UpdateClusterRequest
Field | Description |
---|---|
cluster_id | string ID of the cluster to update. To get the cluster ID, make a ClusterService.List request. The maximum string length in characters is 50. |
update_mask | google.protobuf.FieldMask Field mask that specifies which attributes of the cluster should be updated. |
description | string New description for the cluster. The maximum string length in characters is 256. |
labels | map<string,string> A new set of cluster labels as key:value pairs. No more than 64 per resource. The maximum string length in characters for each value is 63. Each value must match the regular expression [-_0-9a-z]* . The string length in characters for each key must be 1-63. Each key must match the regular expression [a-z][-_0-9a-z]* . |
config_spec | UpdateClusterConfigSpec Configuration and resources for hosts that should be created with the Data Proc cluster. |
name | string New name for the Data Proc cluster. The name must be unique within the folder. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] . |
service_account_id | string ID of the new service account to be used by the Data Proc manager agent. |
bucket | string Name of the new Object Storage bucket to use for Data Proc jobs. |
decommission_timeout | int64 Timeout to gracefully decommission nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive. |
ui_proxy | bool Enable UI Proxy feature. |
security_group_ids[] | string User security groups. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, logs are not sent to the logging service. |
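
Because `update_mask` names the attributes to change, a request builder can derive the mask from the fields actually supplied. The sketch below models the request as a plain dict with a `paths` list, mirroring the shape of `google.protobuf.FieldMask`. The helper `build_update` is hypothetical, and leaving `decommission_timeout` out of the maskable set is an assumption, not documented behavior.

```python
from typing import Any, Dict

# Fields from the UpdateClusterRequest table that describe cluster attributes.
# Assumption: decommission_timeout is a request option and is not masked.
MASKABLE_FIELDS = frozenset({
    "description", "labels", "config_spec", "name", "service_account_id",
    "bucket", "ui_proxy", "security_group_ids", "deletion_protection",
    "log_group_id",
})


def build_update(cluster_id: str, **changes: Any) -> Dict[str, Any]:
    """Build an update-request dict whose mask lists exactly the changed fields."""
    unknown = set(changes) - MASKABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable via mask: {sorted(unknown)}")
    return {
        "cluster_id": cluster_id,
        "update_mask": {"paths": sorted(changes)},
        **changes,
    }


if __name__ == "__main__":
    print(build_update("c1", description="nightly ETL", ui_proxy=True))
```

Deriving the mask from the supplied keyword arguments avoids the mask and the payload drifting apart.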
UpdateClusterConfigSpec
Field | Description |
---|---|
subclusters_spec[] | UpdateSubclusterConfigSpec New configuration for subclusters in a cluster. |
hadoop | HadoopConfig Hadoop-specific options. |
UpdateSubclusterConfigSpec
Field | Description |
---|---|
id | string ID of the subcluster to update. To get the subcluster ID make a SubclusterService.List request. |
name | string Name of the subcluster. Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9] . |
resources | Resources Resource configuration for each host in the subcluster. |
hosts_count | int64 Number of hosts in the subcluster. The minimum value is 1. |
autoscaling_config | AutoscalingConfig Configuration for subclusters based on instance groups. |
Resources
Field | Description |
---|---|
resource_preset_id | string ID of the resource preset for computational resources available to a host (CPU, memory etc.). All available presets are listed in the documentation. |
disk_type_id | string Type of the storage environment for the host. |
disk_size | int64 Volume of the storage available to a host, in bytes. |
AutoscalingConfig
Field | Description |
---|---|
max_hosts_count | int64 Upper limit for total instance subcluster count. Acceptable values are 1 to 100, inclusive. |
preemptible | bool Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines. |
measurement_duration | google.protobuf.Duration Required. Time in seconds allotted for averaging metrics. Acceptable values are 1m to 10m, inclusive. |
warmup_duration | google.protobuf.Duration The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected. The maximum value is 10m. |
stabilization_duration | google.protobuf.Duration Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should. Acceptable values are 1m to 30m, inclusive. |
cpu_utilization_target | double Defines an autoscaling rule based on the average CPU utilization of the instance group. Acceptable values are 0 to 100, inclusive. |
decommission_timeout | int64 Timeout to gracefully decommission nodes during downscaling, in seconds. Default value: 120. Acceptable values are 0 to 86400, inclusive. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for access to cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file |
args[] | string Arguments to the initialization action |
timeout | int64 Execution timeout |
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The response of the operation if it finished successfully. |
UpdateClusterMetadata
Field | Description |
---|---|
cluster_id | string ID of the cluster that is being updated. |
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true . |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
Delete
Deletes the specified cluster.
rpc Delete (DeleteClusterRequest) returns (operation.Operation)
Metadata and response of Operation:
Operation.metadata:DeleteClusterMetadata
Operation.response:google.protobuf.Empty
DeleteClusterRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the cluster to delete. To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50. |
decommission_timeout | int64 Timeout to gracefully decommission nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive. |
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The response of the operation if it finished successfully. |
DeleteClusterMetadata
Field | Description |
---|---|
cluster_id | string ID of the Data Proc cluster that is being deleted. |
Start
Starts the specified cluster.
rpc Start (StartClusterRequest) returns (operation.Operation)
Metadata and response of Operation:
Operation.metadata:StartClusterMetadata
Operation.response:Cluster
StartClusterRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the cluster to start. To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50. |
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The response of the operation if it finished successfully. |
StartClusterMetadata
Field | Description |
---|---|
cluster_id | string ID of the Data Proc cluster that is being started. |
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection inhibits deletion of the cluster. |
log_group_id | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true . |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for access to cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file |
args[] | string Arguments to the initialization action |
timeout | int64 Execution timeout |
Stop
Stops the specified cluster.
rpc Stop (StopClusterRequest) returns (operation.Operation)
Metadata and response of Operation:
Operation.metadata:StopClusterMetadata
Operation.response:Cluster
StopClusterRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the cluster to stop. To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50. |
decommission_timeout | int64 Timeout to gracefully decommission nodes, in seconds. Default value: 0. Acceptable values are 0 to 86400, inclusive. |
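The constraints on StopClusterRequest can be checked before a call is made. The sketch below models the request as a plain dict rather than a generated protobuf message, and `build_stop_request` is a hypothetical helper; the limits are the ones stated in the table above.

```python
MAX_CLUSTER_ID_LENGTH = 50
MAX_DECOMMISSION_TIMEOUT = 86400  # seconds, inclusive upper bound


def build_stop_request(cluster_id: str, decommission_timeout: int = 0) -> dict:
    """Validate the documented limits and return the request fields as a dict."""
    if not cluster_id:
        raise ValueError("cluster_id is required")
    if len(cluster_id) > MAX_CLUSTER_ID_LENGTH:
        raise ValueError("cluster_id must be at most 50 characters long")
    if not 0 <= decommission_timeout <= MAX_DECOMMISSION_TIMEOUT:
        raise ValueError("decommission_timeout must be between 0 and 86400")
    return {
        "cluster_id": cluster_id,
        "decommission_timeout": decommission_timeout,
    }
```

With a generated client, the same fields would be passed to the StopClusterRequest message instead of a dict.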
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The normal response of the operation in case of success. For Stop, the response is a Cluster. |
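The rules for done, error, and response can be summarized in a small Python sketch. It treats an Operation as a plain dict, and both the function name `operation_state` and the state labels are hypothetical; a real client would inspect the generated Operation message instead.

```python
def operation_state(op: dict) -> str:
    """Classify an operation according to its done, error and response fields."""
    done = bool(op.get("done", False))
    has_error = op.get("error") is not None
    has_response = op.get("response") is not None

    if has_error and has_response:
        raise ValueError("error and response are mutually exclusive")
    if done:
        if has_error:
            return "failed"
        if has_response:
            return "succeeded"
        raise ValueError("a finished operation must carry error or response")
    if has_response:
        raise ValueError("response must not be set before the operation is done")
    # Not done: an error may already be set if a failure was detected.
    return "failing" if has_error else "in_progress"
```

A polling loop would typically keep requesting the operation until the state is either "succeeded" or "failed".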
StopClusterMetadata
Field | Description |
---|---|
cluster_id | string ID of the Data Proc cluster that is being stopped. |
Cluster
Field | Description |
---|---|
id | string ID of the cluster. Generated at creation time. |
folder_id | string ID of the folder that the cluster belongs to. |
created_at | google.protobuf.Timestamp Creation timestamp. |
name | string Name of the cluster. The name is unique within the folder. The string length in characters must be 1-63. |
description | string Description of the cluster. The string length in characters must be 0-256. |
labels | map<string,string> Cluster labels as key:value pairs. No more than 64 per resource. |
monitoring[] | Monitoring Monitoring systems relevant to the cluster. |
config | ClusterConfig Configuration of the cluster. |
health | enum Health Aggregated cluster health. |
status | enum Status Cluster status. |
zone_id | string ID of the availability zone where the cluster resides. |
service_account_id | string ID of service account for the Data Proc manager agent. |
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster. |
ui_proxy | bool Whether UI Proxy feature is enabled. |
security_group_ids[] | string User security groups. |
host_group_ids[] | string Host groups hosting VMs of the cluster. |
deletion_protection | bool Deletion protection prevents the cluster from being deleted. |
log_group_id | string ID of the cloud logging log group to write logs to. If not set, the default log group for the folder is used. To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true. |
Monitoring
Field | Description |
---|---|
name | string Name of the monitoring system. |
description | string Description of the monitoring system. |
link | string Link to the monitoring system. |
ClusterConfig
Field | Description |
---|---|
version_id | string Image version for cluster provisioning. All available versions are listed in the documentation. |
hadoop | HadoopConfig Data Proc specific configuration options. |
HadoopConfig
Field | Description |
---|---|
services[] | enum Service Set of services used in the cluster (if empty, the default set is used). |
properties | map<string,string> Properties set for all hosts in *-site.xml configurations. The key should indicate the service and the property. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property in the file /etc/hadoop/conf/hdfs-site.xml . |
ssh_public_keys[] | string List of public SSH keys for accessing cluster hosts. |
initialization_actions[] | InitializationAction Set of initialization actions. |
InitializationAction
Field | Description |
---|---|
uri | string URI of the executable file. |
args[] | string Arguments to the initialization action. |
timeout | int64 Execution timeout. |
ListOperations
Lists operations for the specified cluster.
rpc ListOperations (ListClusterOperationsRequest) returns (ListClusterOperationsResponse)
ListClusterOperationsRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the cluster to list operations for. The maximum string length in characters is 50. |
page_size | int64 The maximum number of results per page to return. If the number of available results is larger than page_size , the service returns a ListClusterOperationsResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000. |
page_token | string Page token. To get the next page of results, set page_token to the ListClusterOperationsResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100. |
ListClusterOperationsResponse
Field | Description |
---|---|
operations[] | operation.Operation List of operations for the specified cluster. |
next_page_token | string Token for getting the next page of the list. If the number of results is greater than the specified ListClusterOperationsRequest.page_size, use next_page_token as the value for the ListClusterOperationsRequest.page_token parameter in the next list request. Each subsequent page will have its own next_page_token to continue paging through the results. |
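The paging contract between page_token and next_page_token can be sketched as a generator. Here `list_page` is a hypothetical callable standing in for the ListOperations call, requests and responses are modeled as plain dicts, and treating an empty next_page_token as the end of the list is an assumption rather than something stated above.

```python
from typing import Callable, Iterator


def iter_operations(
    list_page: Callable[[dict], dict],
    cluster_id: str,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield operations page by page until no next_page_token is returned."""
    page_token = ""
    while True:
        response = list_page(
            {
                "cluster_id": cluster_id,
                "page_size": page_size,
                "page_token": page_token,
            }
        )
        yield from response.get("operations", [])
        page_token = response.get("next_page_token", "")
        if not page_token:  # assumption: an empty token marks the last page
            return


def fake_list_page(request: dict) -> dict:
    """In-memory stand-in for the service, used only for illustration."""
    pages = {
        "": {"operations": [{"id": "op1"}, {"id": "op2"}], "next_page_token": "t1"},
        "t1": {"operations": [{"id": "op3"}], "next_page_token": ""},
    }
    return pages[request["page_token"]]


for operation in iter_operations(fake_list_page, "my-cluster-id", page_size=2):
    print(operation["id"])
```

The same loop shape applies to other list calls that return a next_page_token, with the result field name changed accordingly.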
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false , it means the operation is still in progress. If true , the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and there was no failure detected, neither error nor response is set. If done == false and there was a failure detected, error is set. If done == true , exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. |
ListHosts
Retrieves the list of hosts in the specified cluster.
rpc ListHosts (ListClusterHostsRequest) returns (ListClusterHostsResponse)
ListClusterHostsRequest
Field | Description |
---|---|
cluster_id | string ID of the cluster to list hosts for. To get a cluster ID, make a ClusterService.List request. The maximum string length in characters is 50. |
page_size | int64 The maximum number of results per page to return. If the number of available results is larger than page_size , the service returns a ListClusterHostsResponse.next_page_token that can be used to get the next page of results in subsequent list requests. Default value: 100. The maximum value is 1000. |
page_token | string Page token. To get the next page of results, set page_token to the ListClusterHostsResponse.next_page_token returned by a previous list request. The maximum string length in characters is 100. |
filter | string A filter expression that filters hosts listed in the response, for example name=my-host. The maximum string length in characters is 1000. |
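A request that filters hosts by name might be assembled as in the sketch below. It models ListClusterHostsRequest as a plain dict, `build_list_hosts_request` is a hypothetical helper, and only the name=my-host form shown in the table is used; any other filter syntax is outside what this sketch assumes.

```python
MAX_FILTER_LENGTH = 1000
MAX_PAGE_SIZE = 1000


def build_list_hosts_request(
    cluster_id: str,
    host_name: str = "",
    page_size: int = 100,
    page_token: str = "",
) -> dict:
    """Return request fields, adding a name filter when host_name is given."""
    if page_size > MAX_PAGE_SIZE:
        raise ValueError("page_size must not exceed 1000")
    request = {
        "cluster_id": cluster_id,
        "page_size": page_size,
        "page_token": page_token,
    }
    if host_name:
        expression = f"name={host_name}"
        if len(expression) > MAX_FILTER_LENGTH:
            raise ValueError("filter must be at most 1000 characters long")
        request["filter"] = expression
    return request
```

Omitting host_name leaves the filter unset, so the request covers all hosts in the cluster.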
ListClusterHostsResponse
Field | Description |
---|---|
hosts[] | Host Requested list of hosts. |
next_page_token | string Token for getting the next page of the list. If the number of results is greater than the specified ListClusterHostsRequest.page_size, use next_page_token as the value for the ListClusterHostsRequest.page_token parameter in the next list request. Each subsequent page will have its own next_page_token to continue paging through the results. |
Host
Field | Description |
---|---|
name | string Name of the Data Proc host. The host name is assigned by Data Proc at creation time and cannot be changed. The name is generated to be unique across all Data Proc hosts that exist on the platform, as it defines the FQDN of the host. |
subcluster_id | string ID of the Data Proc subcluster that the host belongs to. |
health | enum Health Status code of the aggregated health of the host. |
compute_instance_id | string ID of the Compute virtual machine that is used as the Data Proc host. |
role | enum Role Role of the host in the cluster. |
ListUILinks
Retrieves a list of links to web interfaces being proxied by Data Proc UI Proxy.
rpc ListUILinks (ListUILinksRequest) returns (ListUILinksResponse)
ListUILinksRequest
Field | Description |
---|---|
cluster_id | string Required. ID of the Hadoop cluster. The maximum string length in characters is 50. |
ListUILinksResponse
Field | Description |
---|---|
links[] | UILink Requested list of UI links. |
UILink
Field | Description |
---|---|
name | string |
url | string |