
Method create

  • HTTP request
  • Path parameters
  • Body parameters
  • Response

Creates a job for a cluster.

HTTP request

POST https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{clusterId}/jobs

Path parameters

Parameter Description
clusterId Required. ID of the cluster to create a job for. The maximum string length is 50 characters.
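
For illustration only, here is a minimal sketch of calling this endpoint with Python's requests library, assuming the usual Yandex.Cloud pattern of passing an IAM token in the Authorization header. The cluster ID, token, and job body below are placeholders, not values taken from this reference:

import requests

# Placeholders: substitute a real cluster ID (up to 50 characters) and IAM token.
cluster_id = "<cluster ID>"
iam_token = "<IAM token>"

# The {clusterId} path parameter is substituted directly into the request URL.
url = f"https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{cluster_id}/jobs"

# A minimal body: a Hive job defined inline through queryList (see Body parameters below).
body = {
    "name": "example-hive-job",
    "hiveJob": {
        "queryList": {"queries": ["SELECT 1;"]}
    }
}

response = requests.post(url, headers={"Authorization": f"Bearer {iam_token}"}, json=body)
print(response.json())  # an Operation resource, described in the Response section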

Body parameters

{
  "name": "string",

  //  includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`
  "mapreduceJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",

    // `mapreduceJob` includes only one of the fields `mainJarFileUri`, `mainClass`
    "mainJarFileUri": "string",
    "mainClass": "string",
    // end of the list of possible fields `mapreduceJob`

  },
  "sparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainJarFileUri": "string",
    "mainClass": "string"
  },
  "pysparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainPythonFileUri": "string",
    "pythonFileUris": [
      "string"
    ]
  },
  "hiveJob": {
    "properties": "object",
    "continueOnFailure": true,
    "scriptVariables": "object",
    "jarFileUris": [
      "string"
    ],

    // `hiveJob` includes only one of the fields `queryFileUri`, `queryList`
    "queryFileUri": "string",
    "queryList": {
      "queries": [
        "string"
      ]
    },
    // end of the list of possible fields `hiveJob`

  },
  // end of the list of possible fields

}
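
As an illustrative example (not part of the schema above), a request body for a Spark job could look like the following Python dictionary. All bucket paths, file names, and the class name are hypothetical, and only one of mapreduceJob, sparkJob, pysparkJob, or hiveJob may be present in a single request:

# Hypothetical body for a Spark job; every URI and name below is a placeholder.
spark_job_body = {
    "name": "example-spark-job",
    "sparkJob": {
        "args": ["s3a://example-bucket/input/", "s3a://example-bucket/output/"],
        "jarFileUris": ["s3a://example-bucket/jobs/libs/extra-lib.jar"],
        "mainJarFileUri": "s3a://example-bucket/jobs/spark-app.jar",
        "mainClass": "org.example.SparkApp",
        "properties": {"spark.executor.memory": "2g"},
    },
}
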
Field Description
name string

Name of the job.

Value must match the regular expression |[a-z][-a-z0-9]{1,61}[a-z0-9].

mapreduceJob object
Specification for a MapReduce job.
includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob

mapreduceJob.args[] string

Optional arguments to pass to the driver.

mapreduceJob.jarFileUris[] string

JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

mapreduceJob.fileUris[] string

URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

mapreduceJob.archiveUris[] string

URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

mapreduceJob.properties object

Property names and values, used to configure Data Proc and MapReduce.

mapreduceJob.mainJarFileUri string
mapreduceJob includes only one of the fields mainJarFileUri, mainClass

HCFS URI of the .jar file containing the driver class.

mapreduceJob.mainClass string
mapreduceJob includes only one of the fields mainJarFileUri, mainClass

The name of the driver class.

sparkJob object
Specification for a Spark job.
includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob

sparkJob.args[] string

Optional arguments to pass to the driver.

sparkJob.jarFileUris[] string

JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

sparkJob.fileUris[] string

URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

sparkJob.archiveUris[] string

URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

sparkJob.properties object

Property names and values, used to configure Data Proc and Spark.

sparkJob.mainJarFileUri string

The HCFS URI of the JAR file containing the main class for the job.

sparkJob.mainClass string

The name of the driver class.

pysparkJob object
Specification for a PySpark job.
includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob

pysparkJob.args[] string

Optional arguments to pass to the driver.

pysparkJob.jarFileUris[] string

JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

pysparkJob.fileUris[] string

URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

pysparkJob.archiveUris[] string

URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

pysparkJob.properties object

Property names and values, used to configure Data Proc and PySpark.

pysparkJob.mainPythonFileUri string

URI of the file with the driver code. Must be a .py file.

pysparkJob.pythonFileUris[] string

URIs of Python files to pass to the PySpark framework.

hiveJob object
Specification for a Hive job.
includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob

hiveJob.properties object

Property names and values, used to configure Data Proc and Hive.

hiveJob.continueOnFailure boolean

Flag indicating whether a job should continue to run if a query fails.

hiveJob.scriptVariables object

Query variables and their values.

hiveJob.jarFileUris[] string

JAR file URIs to add to CLASSPATH of the Hive driver and each task.

hiveJob.queryFileUri string
hiveJob includes only one of the fields queryFileUri, queryList

URI of the script with all the necessary Hive queries.

hiveJob.queryList object
List of Hive queries to be used in the job.
hiveJob includes only one of the fields queryFileUri, queryList

hiveJob.queryList.queries[] string

List of Hive queries.
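
To make the mainJarFileUri / mainClass alternative in mapreduceJob concrete, here is a hypothetical MapReduce job specification (all paths and names are illustrative). Because the two fields are mutually exclusive, only mainJarFileUri is set:

# Hypothetical MapReduce job body; mainJarFileUri and mainClass are a one-of pair,
# so mainClass is intentionally omitted.
mapreduce_job_body = {
    "name": "example-mapreduce-job",
    "mapreduceJob": {
        "args": ["-input", "s3a://example-bucket/input/", "-output", "s3a://example-bucket/output/"],
        "mainJarFileUri": "s3a://example-bucket/jobs/wordcount.jar",
        "properties": {"mapreduce.job.reduces": "2"},
    },
}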

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "description": "string",
  "createdAt": "string",
  "createdBy": "string",
  "modifiedAt": "string",
  "done": true,
  "metadata": "object",

  //  includes only one of the fields `error`, `response`
  "error": {
    "code": "integer",
    "message": "string",
    "details": [
      "object"
    ]
  },
  "response": "object",
  // end of the list of possible fields

}

An Operation resource. For more information, see Operation.

Field Description
id string

ID of the operation.

description string

Description of the operation. 0-256 characters long.

createdAt string (date-time)

Creation timestamp.

String in RFC3339 text format.

createdBy string

ID of the user or service account who initiated the operation.

modifiedAt string (date-time)

The time when the Operation resource was last modified.

String in RFC3339 text format.

done boolean

If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available.

metadata object

Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any.

error object
The error result of the operation in case of failure or cancellation.
includes only one of the fields error, response

error.code integer (int32)

Error code. An enum value of google.rpc.Code.

error.message string

An error message.

error.details[] object

A list of messages that carry the error details.

response object
includes only one of the fields error, response

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is the standard Create/Update, the response should be the target resource of the operation. Any method that returns a long-running operation should document the response type, if any.
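
Because the call returns a long-running Operation, a common pattern is to poll it until done becomes true and then inspect either error or response. A rough sketch, assuming the generic Yandex.Cloud Operation endpoint at operation.api.cloud.yandex.net (not part of this reference) and placeholder credentials:

import time
import requests

iam_token = "<IAM token>"
operation_id = "<operation ID from the create response>"

# Assumed generic Operation endpoint; see the Operation API documentation.
url = f"https://operation.api.cloud.yandex.net/operations/{operation_id}"
headers = {"Authorization": f"Bearer {iam_token}"}

while True:
    operation = requests.get(url, headers=headers).json()
    if operation.get("done"):
        break
    time.sleep(5)  # arbitrary poll interval

if "error" in operation:
    print("Job creation failed:", operation["error"])
else:
    print("Job created:", operation.get("response"))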
