
Method get

  • HTTP request
  • Path parameters
  • Response

Returns the specified job.

HTTP request

GET https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{clusterId}/jobs/{jobId}

Path parameters

Parameter Description

clusterId
Required. ID of the cluster to request a job from. The maximum string length in characters is 50.

jobId
Required. ID of the job to return. To get a job ID, make a list request. The maximum string length in characters is 50.
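
As a quick illustration, this is roughly what the call looks like from Python with the requests library. This is a minimal sketch, not official client code: the YC_TOKEN environment variable and the sample IDs are assumptions, and it presumes a valid IAM token passed as a Bearer token (see Authentication in the API).

import os
import requests

# Assumed for illustration: an IAM token exported as YC_TOKEN.
IAM_TOKEN = os.environ["YC_TOKEN"]

def get_job(cluster_id: str, job_id: str) -> dict:
    """Fetch the description of a single Data Proc job."""
    url = (
        "https://dataproc.api.cloud.yandex.net/dataproc/v1"
        f"/clusters/{cluster_id}/jobs/{job_id}"
    )
    resp = requests.get(url, headers={"Authorization": f"Bearer {IAM_TOKEN}"})
    resp.raise_for_status()  # raise on any non-2xx status
    return resp.json()

# Hypothetical cluster and job IDs (each at most 50 characters).
job = get_job("my-cluster-id", "my-job-id")
print(job["status"])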

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "clusterId": "string",
  "createdAt": "string",
  "startedAt": "string",
  "finishedAt": "string",
  "name": "string",
  "createdBy": "string",
  "status": "string",

  // includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`
  "mapreduceJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",

    // `mapreduceJob` includes only one of the fields `mainJarFileUri`, `mainClass`
    "mainJarFileUri": "string",
    "mainClass": "string",
    // end of the list of possible fields `mapreduceJob`

  },
  "sparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainJarFileUri": "string",
    "mainClass": "string"
  },
  "pysparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainPythonFileUri": "string",
    "pythonFileUris": [
      "string"
    ]
  },
  "hiveJob": {
    "properties": "object",
    "continueOnFailure": true,
    "scriptVariables": "object",
    "jarFileUris": [
      "string"
    ],

    // `hiveJob` includes only one of the fields `queryFileUri`, `queryList`
    "queryFileUri": "string",
    "queryList": {
      "queries": [
        "string"
      ]
    },
    // end of the list of possible fields `hiveJob`

  },
  // end of the list of possible fields

}

A Data Proc job. For details about the concept, see the documentation.

Field Description

id (string)
ID of the job. Generated at creation time.

clusterId (string)
ID of the Data Proc cluster that the job belongs to.

createdAt (string, date-time)
Creation timestamp. String in RFC3339 text format.

startedAt (string, date-time)
The time when the job was started. String in RFC3339 text format.

finishedAt (string, date-time)
The time when the job finished. String in RFC3339 text format.
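
Because these three fields are plain RFC3339 strings, clients have to parse them before doing any time arithmetic. A small sketch of computing a job's run time with only the standard library (the timestamp values are made up for illustration):

from datetime import datetime

def parse_rfc3339(ts: str) -> datetime:
    # fromisoformat() before Python 3.11 rejects a trailing "Z",
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Illustrative values in the documented format.
job = {"startedAt": "2021-05-01T10:00:00Z", "finishedAt": "2021-05-01T10:12:30Z"}
elapsed = parse_rfc3339(job["finishedAt"]) - parse_rfc3339(job["startedAt"])
print(f"job ran for {elapsed.total_seconds():.0f} s")  # job ran for 750 s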

name (string)
Name of the job, specified in the create request.

createdBy (string)
ID of the user who created the job.

status (string)
Job status.
  • PROVISIONING: Job is logged in the database and is waiting for the agent to run it.
  • PENDING: Job is acquired by the agent and is in the queue for execution.
  • RUNNING: Job is being run in the cluster.
  • ERROR: Job failed to finish properly.
  • DONE: Job is finished.
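
PROVISIONING, PENDING, and RUNNING are transient, while ERROR and DONE are terminal, so a client can poll this method until the job settles. A hedged sketch reusing the get_job helper from the request example above (the 10-second interval is an arbitrary choice):

import time

TERMINAL_STATUSES = {"DONE", "ERROR"}

def wait_for_job(cluster_id: str, job_id: str, poll_seconds: float = 10.0) -> dict:
    """Poll the get method until the job reaches a terminal status."""
    while True:
        job = get_job(cluster_id, job_id)  # helper defined earlier
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)  # still PROVISIONING / PENDING / RUNNING
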
mapreduceJob (object)
Specification for a MapReduce job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

mapreduceJob.args[] (string)
Optional arguments to pass to the driver.

mapreduceJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

mapreduceJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

mapreduceJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

mapreduceJob.properties (object)
Property names and values, used to configure Data Proc and MapReduce.

mapreduceJob.mainJarFileUri (string)
HCFS URI of the .jar file containing the driver class.
mapreduceJob includes only one of the fields mainJarFileUri, mainClass.

mapreduceJob.mainClass (string)
The name of the driver class.
mapreduceJob includes only one of the fields mainJarFileUri, mainClass.

sparkJob (object)
Specification for a Spark job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

sparkJob.args[] (string)
Optional arguments to pass to the driver.

sparkJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

sparkJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

sparkJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

sparkJob.properties (object)
Property names and values, used to configure Data Proc and Spark.

sparkJob.mainJarFileUri (string)
The HCFS URI of the JAR file containing the main class for the job.

sparkJob.mainClass (string)
The name of the driver class.

pysparkJob (object)
Specification for a PySpark job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

pysparkJob.args[] (string)
Optional arguments to pass to the driver.

pysparkJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

pysparkJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

pysparkJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

pysparkJob.properties (object)
Property names and values, used to configure Data Proc and PySpark.

pysparkJob.mainPythonFileUri (string)
URI of the file with the driver code. Must be a .py file.

pysparkJob.pythonFileUris[] (string)
URIs of Python files to pass to the PySpark framework.

hiveJob (object)
Specification for a Hive job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

hiveJob.properties (object)
Property names and values, used to configure Data Proc and Hive.

hiveJob.continueOnFailure (boolean)
Flag indicating whether a job should continue to run if a query fails.

hiveJob.scriptVariables (object)
Query variables and their values.

hiveJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Hive driver and each task.

hiveJob.queryFileUri (string)
URI of the script with all the necessary Hive queries.
hiveJob includes only one of the fields queryFileUri, queryList.

hiveJob.queryList (object)
List of Hive queries to be used in the job.
hiveJob includes only one of the fields queryFileUri, queryList.

hiveJob.queryList.queries[] (string)
List of Hive queries.
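
Since a job carries exactly one of mapreduceJob, sparkJob, pysparkJob, or hiveJob, a client can branch on whichever key is present in the response. A minimal sketch of summarizing the type-specific spec (the summary strings are illustrative, not part of the API):

def describe_job_spec(job: dict) -> str:
    """Summarize the single type-specific spec carried by a job."""
    if "mapreduceJob" in job:
        spec = job["mapreduceJob"]
        # mapreduceJob holds only one of mainJarFileUri / mainClass.
        return "MapReduce entry point: " + (
            spec.get("mainJarFileUri") or spec.get("mainClass", "")
        )
    if "sparkJob" in job:
        spec = job["sparkJob"]
        return f"Spark: {spec.get('mainClass')} from {spec.get('mainJarFileUri')}"
    if "pysparkJob" in job:
        return "PySpark main file: " + job["pysparkJob"].get("mainPythonFileUri", "")
    if "hiveJob" in job:
        spec = job["hiveJob"]
        # hiveJob in turn holds only one of queryFileUri / queryList.
        if "queryFileUri" in spec:
            return "Hive script: " + spec["queryFileUri"]
        return f"Hive: {len(spec['queryList']['queries'])} inline queries"
    return "no job specification present"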
