
Method get

  • HTTP request
  • Path parameters
  • Response

Returns the specified job.

HTTP request

GET https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters/{clusterId}/jobs/{jobId}

Path parameters

Parameter Description

clusterId
Required. ID of the cluster to request a job from. The maximum string length in characters is 50.

jobId
Required. ID of the job to return. To get a job ID, make a list request. The maximum string length in characters is 50.
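
As a quick illustration, this is roughly what the call looks like from Python with the requests library. This is a minimal sketch, not official client code: the YC_TOKEN environment variable and the sample IDs are assumptions, and it presumes a valid IAM token passed as a Bearer token (see Authentication in the API).

import os
import requests

# Assumed for illustration: an IAM token exported as YC_TOKEN.
IAM_TOKEN = os.environ["YC_TOKEN"]

def get_job(cluster_id: str, job_id: str) -> dict:
    """Fetch the description of a single Data Proc job."""
    url = (
        "https://dataproc.api.cloud.yandex.net/dataproc/v1"
        f"/clusters/{cluster_id}/jobs/{job_id}"
    )
    resp = requests.get(url, headers={"Authorization": f"Bearer {IAM_TOKEN}"})
    resp.raise_for_status()  # raise on any non-2xx status
    return resp.json()

# Hypothetical cluster and job IDs (each at most 50 characters).
job = get_job("my-cluster-id", "my-job-id")
print(job["status"])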

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "clusterId": "string",
  "createdAt": "string",
  "startedAt": "string",
  "finishedAt": "string",
  "name": "string",
  "createdBy": "string",
  "status": "string",

  // includes only one of the fields `mapreduceJob`, `sparkJob`, `pysparkJob`, `hiveJob`
  "mapreduceJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",

    // `mapreduceJob` includes only one of the fields `mainJarFileUri`, `mainClass`
    "mainJarFileUri": "string",
    "mainClass": "string",
    // end of the list of possible fields `mapreduceJob`

  },
  "sparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainJarFileUri": "string",
    "mainClass": "string"
  },
  "pysparkJob": {
    "args": [
      "string"
    ],
    "jarFileUris": [
      "string"
    ],
    "fileUris": [
      "string"
    ],
    "archiveUris": [
      "string"
    ],
    "properties": "object",
    "mainPythonFileUri": "string",
    "pythonFileUris": [
      "string"
    ]
  },
  "hiveJob": {
    "properties": "object",
    "continueOnFailure": true,
    "scriptVariables": "object",
    "jarFileUris": [
      "string"
    ],

    // `hiveJob` includes only one of the fields `queryFileUri`, `queryList`
    "queryFileUri": "string",
    "queryList": {
      "queries": [
        "string"
      ]
    },
    // end of the list of possible fields `hiveJob`

  },
  // end of the list of possible fields

}

A Data Proc job. For details about the concept, see the documentation.

Field Description

id (string)
ID of the job. Generated at creation time.

clusterId (string)
ID of the Data Proc cluster that the job belongs to.

createdAt (string, date-time)
Creation timestamp. String in RFC3339 text format.

startedAt (string, date-time)
The time when the job was started. String in RFC3339 text format.

finishedAt (string, date-time)
The time when the job finished. String in RFC3339 text format.
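
Because these three fields are plain RFC3339 strings, clients have to parse them before doing any time arithmetic. A small sketch of computing a job's run time with only the standard library (the timestamp values are made up for illustration):

from datetime import datetime

def parse_rfc3339(ts: str) -> datetime:
    # fromisoformat() before Python 3.11 rejects a trailing "Z",
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Illustrative values in the documented format.
job = {"startedAt": "2021-05-01T10:00:00Z", "finishedAt": "2021-05-01T10:12:30Z"}
elapsed = parse_rfc3339(job["finishedAt"]) - parse_rfc3339(job["startedAt"])
print(f"job ran for {elapsed.total_seconds():.0f} s")  # job ran for 750 s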

name (string)
Name of the job, specified in the create request.

createdBy (string)
ID of the user who created the job.

status (string)
Job status.
  • PROVISIONING: Job is logged in the database and is waiting for the agent to run it.
  • PENDING: Job is acquired by the agent and is in the queue for execution.
  • RUNNING: Job is being run in the cluster.
  • ERROR: Job failed to finish properly.
  • DONE: Job is finished.
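
PROVISIONING, PENDING, and RUNNING are transient, while ERROR and DONE are terminal, so a client can poll this method until the job settles. A hedged sketch reusing the get_job helper from the request example above (the 10-second interval is an arbitrary choice):

import time

TERMINAL_STATUSES = {"DONE", "ERROR"}

def wait_for_job(cluster_id: str, job_id: str, poll_seconds: float = 10.0) -> dict:
    """Poll the get method until the job reaches a terminal status."""
    while True:
        job = get_job(cluster_id, job_id)  # helper defined earlier
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)  # still PROVISIONING / PENDING / RUNNING
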
mapreduceJob (object)
Specification for a MapReduce job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

mapreduceJob.args[] (string)
Optional arguments to pass to the driver.

mapreduceJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

mapreduceJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

mapreduceJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

mapreduceJob.properties (object)
Property names and values, used to configure Data Proc and MapReduce.

mapreduceJob.mainJarFileUri (string)
HCFS URI of the .jar file containing the driver class.
mapreduceJob includes only one of the fields mainJarFileUri, mainClass.

mapreduceJob.mainClass (string)
The name of the driver class.
mapreduceJob includes only one of the fields mainJarFileUri, mainClass.

sparkJob (object)
Specification for a Spark job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

sparkJob.args[] (string)
Optional arguments to pass to the driver.

sparkJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

sparkJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

sparkJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

sparkJob.properties (object)
Property names and values, used to configure Data Proc and Spark.

sparkJob.mainJarFileUri (string)
The HCFS URI of the JAR file containing the main class for the job.

sparkJob.mainClass (string)
The name of the driver class.

pysparkJob (object)
Specification for a PySpark job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

pysparkJob.args[] (string)
Optional arguments to pass to the driver.

pysparkJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Data Proc driver and each task.

pysparkJob.fileUris[] (string)
URIs of resource files to be copied to the working directory of Data Proc drivers and distributed Hadoop tasks.

pysparkJob.archiveUris[] (string)
URIs of archives to be extracted to the working directory of Data Proc drivers and tasks.

pysparkJob.properties (object)
Property names and values, used to configure Data Proc and PySpark.

pysparkJob.mainPythonFileUri (string)
URI of the file with the driver code. Must be a .py file.

pysparkJob.pythonFileUris[] (string)
URIs of Python files to pass to the PySpark framework.

hiveJob (object)
Specification for a Hive job.
Includes only one of the fields mapreduceJob, sparkJob, pysparkJob, hiveJob.

hiveJob.properties (object)
Property names and values, used to configure Data Proc and Hive.

hiveJob.continueOnFailure (boolean)
Flag indicating whether a job should continue to run if a query fails.

hiveJob.scriptVariables (object)
Query variables and their values.

hiveJob.jarFileUris[] (string)
JAR file URIs to add to CLASSPATH of the Hive driver and each task.

hiveJob.queryFileUri (string)
URI of the script with all the necessary Hive queries.
hiveJob includes only one of the fields queryFileUri, queryList.

hiveJob.queryList (object)
List of Hive queries to be used in the job.
hiveJob includes only one of the fields queryFileUri, queryList.

hiveJob.queryList.queries[] (string)
List of Hive queries.
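
Since a job carries exactly one of mapreduceJob, sparkJob, pysparkJob, or hiveJob, a client can branch on whichever key is present in the response. A minimal sketch of summarizing the type-specific spec (the summary strings are illustrative, not part of the API):

def describe_job_spec(job: dict) -> str:
    """Summarize the single type-specific spec carried by a job."""
    if "mapreduceJob" in job:
        spec = job["mapreduceJob"]
        # mapreduceJob holds only one of mainJarFileUri / mainClass.
        return "MapReduce entry point: " + (
            spec.get("mainJarFileUri") or spec.get("mainClass", "")
        )
    if "sparkJob" in job:
        spec = job["sparkJob"]
        return f"Spark: {spec.get('mainClass')} from {spec.get('mainJarFileUri')}"
    if "pysparkJob" in job:
        return "PySpark main file: " + job["pysparkJob"].get("mainPythonFileUri", "")
    if "hiveJob" in job:
        spec = job["hiveJob"]
        # hiveJob in turn holds only one of queryFileUri / queryList.
        if "queryFileUri" in spec:
            return "Hive script: " + spec["queryFileUri"]
        return f"Hive: {len(spec['queryList']['queries'])} inline queries"
    return "no job specification present"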
