Method get

Returns the specified subcluster.

To get the list of all available subclusters, make a list request.

HTTP request

GET https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{clusterId}/subclusters/{subclusterId}

Path parameters

Parameter | Description
clusterId | Required. ID of the Data Proc cluster that the subcluster belongs to. The maximum string length is 50 characters.
subclusterId | Required. ID of the subcluster to return. To get a subcluster ID, make a list request. The maximum string length is 50 characters.
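
As an illustration of the request above, the following minimal Python sketch builds the URL from the two path parameters and prepares the GET request without sending it. The helper names (`subcluster_url`, `build_get_request`) are hypothetical, and passing an IAM token in a Bearer `Authorization` header is an assumption that this page does not state.

```python
import urllib.request
from urllib.parse import quote

API_BASE = "https://dataproc.api.cloud.yandex.net/dataproc/v1"
MAX_ID_LENGTH = 50  # limit documented for both path parameters


def subcluster_url(cluster_id: str, subcluster_id: str) -> str:
    """Build the URL for the get method, checking the documented ID limits."""
    for name, value in (("clusterId", cluster_id), ("subclusterId", subcluster_id)):
        if not value:
            raise ValueError(f"{name} is required")
        if len(value) > MAX_ID_LENGTH:
            raise ValueError(f"{name} is longer than {MAX_ID_LENGTH} characters")
    return (
        f"{API_BASE}/clusters/{quote(cluster_id, safe='')}"
        f"/subclusters/{quote(subcluster_id, safe='')}"
    )


def build_get_request(cluster_id: str, subcluster_id: str, iam_token: str) -> urllib.request.Request:
    """Prepare (but do not send) the GET request.

    Assumption: the API accepts an IAM token in a Bearer Authorization header.
    """
    return urllib.request.Request(
        subcluster_url(cluster_id, subcluster_id),
        headers={"Authorization": f"Bearer {iam_token}"},
        method="GET",
    )
```

The prepared request could then be sent with `urllib.request.urlopen(request)` and the response body decoded as JSON.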

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "clusterId": "string",
  "createdAt": "string",
  "name": "string",
  "role": "string",
  "resources": {
    "resourcePresetId": "string",
    "diskTypeId": "string",
    "diskSize": "string"
  },
  "subnetId": "string",
  "hostsCount": "string",
  "autoscalingConfig": {
    "maxHostsCount": "string",
    "preemptible": true,
    "measurementDuration": "string",
    "warmupDuration": "string",
    "stabilizationDuration": "string",
    "cpuUtilizationTarget": "number",
    "decommissionTimeout": "string"
  },
  "instanceGroupId": "string"
}
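
Several values in this response are string-encoded: int64 fields arrive as JSON strings, and `createdAt` is an RFC 3339 timestamp. The sketch below shows one way a client might convert them after decoding. The function name `parse_subcluster` is hypothetical, and the duration fields are left untouched because their exact string format is not described here.

```python
import json
from datetime import datetime


def parse_subcluster(body: str) -> dict:
    """Decode a Subcluster response and convert string-encoded values."""
    data = json.loads(body)

    # int64 fields arrive as JSON strings, so convert them explicitly.
    if "hostsCount" in data:
        data["hostsCount"] = int(data["hostsCount"])
    resources = data.get("resources", {})
    if "diskSize" in resources:
        resources["diskSize"] = int(resources["diskSize"])
    autoscaling = data.get("autoscalingConfig", {})
    for key in ("maxHostsCount", "decommissionTimeout"):
        if key in autoscaling:
            autoscaling[key] = int(autoscaling[key])

    # createdAt is an RFC 3339 string. Before Python 3.11, fromisoformat()
    # rejects a trailing "Z", so rewrite it as an explicit UTC offset.
    if "createdAt" in data:
        text = data["createdAt"]
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        data["createdAt"] = datetime.fromisoformat(text)
    return data
```

Fields that are absent from the response are skipped, so the same function works for subclusters with or without an `autoscalingConfig`.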

A Data Proc subcluster. For details about the concept, see documentation.

Field Description
id string

ID of the subcluster. Generated at creation time.

clusterId string

ID of the Data Proc cluster that the subcluster belongs to.

createdAt string (date-time)

Creation timestamp.

String in RFC3339 text format.

name string

Name of the subcluster. The name is unique within the cluster.

The string length in characters must be 1-63.

role string

Role that is fulfilled by hosts of the subcluster.

  • MASTERNODE: The subcluster fulfills the master role.

    Master can run the following services, depending on the requested components:

      • HDFS: Namenode, Secondary Namenode
      • YARN: ResourceManager, Timeline Server
      • HBase Master
      • Hive: Server, Metastore, HCatalog
      • Spark History Server
      • Zeppelin
      • ZooKeeper

  • DATANODE: The subcluster is a DATANODE in a Data Proc cluster.

    DATANODE can run the following services, depending on the requested components:

      • HDFS DataNode
      • YARN NodeManager
      • HBase RegionServer
      • Spark libraries

  • COMPUTENODE: The subcluster is a COMPUTENODE in a Data Proc cluster.

    COMPUTENODE can run the following services, depending on the requested components:

      • YARN NodeManager
      • Spark libraries

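
The role-to-service lists above can be captured in a small lookup table, for example to find which subcluster roles may host a given service. This is only a sketch: the dictionary `ROLE_SERVICES`, the helper `roles_running`, and the flattened service names (such as "HDFS Namenode") are illustrative choices, not identifiers returned by the API.

```python
from typing import Dict, List

# Services each role can run, depending on the requested components.
ROLE_SERVICES: Dict[str, List[str]] = {
    "MASTERNODE": [
        "HDFS Namenode", "HDFS Secondary Namenode",
        "YARN ResourceManager", "YARN Timeline Server",
        "HBase Master",
        "Hive Server", "Hive Metastore", "Hive HCatalog",
        "Spark History Server", "Zeppelin", "ZooKeeper",
    ],
    "DATANODE": [
        "HDFS DataNode", "YARN NodeManager",
        "HBase RegionServer", "Spark libraries",
    ],
    "COMPUTENODE": ["YARN NodeManager", "Spark libraries"],
}


def roles_running(service: str) -> List[str]:
    """Return the roles whose hosts can run the given service."""
    return [role for role, services in ROLE_SERVICES.items() if service in services]
```

For instance, `roles_running("YARN NodeManager")` yields both DATANODE and COMPUTENODE, which reflects that compute subclusters add processing capacity without HDFS storage.
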
resources object

Resources allocated for each host in the subcluster.

resources.resourcePresetId string

ID of the resource preset for computational resources available to a host (CPU, memory, etc.). All available presets are listed in the documentation.

resources.diskTypeId string

Type of the storage environment for the host. Possible values:

  • network-hdd: network HDD drive
  • network-ssd: network SSD drive

resources.diskSize string (int64)

Volume of the storage available to a host, in bytes.
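
Because `diskSize` is a byte count encoded as a string, a client usually converts it before displaying it. A minimal sketch, assuming binary units (1 GiB = 1024³ bytes) and a hypothetical helper name:

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def disk_size_gib(disk_size: str) -> float:
    """Convert the string-encoded byte count from resources.diskSize to GiB."""
    return int(disk_size) / GIB
```

With this helper, the value "21474836480" corresponds to 20 GiB.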

subnetId string

ID of the VPC subnet used for hosts in the subcluster.

hostsCount string (int64)

Number of hosts in the subcluster.

autoscalingConfig object

Configuration for instance-group-based subclusters.

autoscalingConfig.maxHostsCount string (int64)

Upper limit on the total number of instances in the subcluster.

Acceptable values are 1 to 100, inclusive.

autoscalingConfig.preemptible boolean

Whether the subcluster uses preemptible instances. Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines.

autoscalingConfig.measurementDuration string

Required. Time in seconds allotted for averaging metrics.

Acceptable values are 60 seconds to 600 seconds, inclusive.

autoscalingConfig.warmupDuration string

The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected.

The maximum value is 600 seconds.

autoscalingConfig.stabilizationDuration string

Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should.

Acceptable values are 60 seconds to 1800 seconds, inclusive.

autoscalingConfig.cpuUtilizationTarget number (double)

Defines an autoscaling rule based on the average CPU utilization of the instance group.

Acceptable values are 10 to 100, inclusive.

autoscalingConfig.decommissionTimeout string (int64)

Timeout, in seconds, for gracefully decommissioning nodes during downscaling. Default value: 120.

Acceptable values are 0 to 86400, inclusive.
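
The acceptable ranges listed for the autoscalingConfig fields can be collected into one table and checked together. The sketch below is illustrative only: `AUTOSCALING_RANGES` and `autoscaling_violations` are hypothetical names, and it assumes every value has already been converted to a plain number (durations and timeouts in seconds).

```python
# (minimum, maximum) per field; None means no documented bound on that side.
AUTOSCALING_RANGES = {
    "maxHostsCount": (1, 100),
    "measurementDuration": (60, 600),
    "warmupDuration": (None, 600),
    "stabilizationDuration": (60, 1800),
    "cpuUtilizationTarget": (10, 100),
    "decommissionTimeout": (0, 86400),
}


def autoscaling_violations(config: dict) -> list:
    """Return a message for each present field that is outside its range."""
    problems = []
    for field, (low, high) in AUTOSCALING_RANGES.items():
        if field not in config:
            continue  # absent fields are not checked
        value = config[field]
        if low is not None and value < low:
            problems.append(f"{field}={value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{field}={value} is above the maximum {high}")
    return problems
```

An empty list means every field that is present falls inside its documented range.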

instanceGroupId string

ID of the Compute Instance Group for autoscaling subclusters.
