© 2022 Yandex.Cloud LLC

Yandex Data Proc images

Written by
Yandex Cloud

For a complete list of current and deprecated Data Proc images, see Runtime environment.

2.0

Base components

The following components have been updated:

  • HBase — 2.2.7.
  • Hadoop — 3.2.2.
  • Hive — 3.1.2.
  • Livy — 0.8.0.
  • Oozie — 5.2.1.
  • Spark — 3.0.2.
  • Tez — 0.10.0.
  • Zeppelin — 0.9.0.

Deprecated components have been removed:

  • Flume
  • Sqoop

Python and machine learning libraries

Python has been updated to version 3.8.10.

The following libraries have been updated:

  • IPython — 7.19.0.
  • ipykernel — 5.3.4.
  • Matplotlib — 3.2.2.
  • pandas — 1.1.3.
  • PyArrow — 1.0.1.
  • PyHive — 0.6.1.
  • scikit-learn — 0.23.2.
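To confirm that a Python environment matches the library versions listed above for image 2.0, a check along the following lines may help. This is a minimal sketch, not an official tool: the dictionary keys are assumed to be the PyPI distribution names of the listed libraries, and `installed_version` and `find_mismatches` are hypothetical helper names introduced only for illustration.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions listed above for image 2.0.
# Keys are assumed PyPI distribution names, not names taken from the image itself.
IMAGE_2_0_LIBRARIES = {
    "ipython": "7.19.0",
    "ipykernel": "5.3.4",
    "matplotlib": "3.2.2",
    "pandas": "1.1.3",
    "pyarrow": "1.0.1",
    "PyHive": "0.6.1",
    "scikit-learn": "0.23.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in find_mismatches(IMAGE_2_0_LIBRARIES).items():
        print(f"{name}: expected {want}, found {found or 'not installed'}")
```

The `lookup` parameter makes it possible to run the comparison against a recorded list of versions instead of the live environment.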

The following libraries have been removed:

  • CatBoost
  • LightGBM
  • TensorFlow
  • XGBoost
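Code written for an earlier image may still import one of these libraries. The sketch below shows one way to detect that before a job runs. It assumes the usual import names for the four libraries (`catboost`, `lightgbm`, `tensorflow`, `xgboost`), and `missing_modules` is a hypothetical helper, not part of Data Proc.

```python
from importlib import util

# Assumed import names of the libraries listed above as absent from image 2.0.
REMOVED_IN_2_0 = ("catboost", "lightgbm", "tensorflow", "xgboost")


def missing_modules(names, finder=util.find_spec):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REMOVED_IN_2_0)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All checked libraries are importable.")
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays cheap even for large libraries.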

1.4

Base components

The following components have been updated:

  • HBase — 1.3.5.
  • Hadoop — 2.10.0.
  • Hive — 2.3.6.
  • Flume — 1.9.0.
  • Livy — 0.7.0.
  • Oozie — 5.2.0.
  • Spark — 2.4.6.
  • Sqoop — 1.4.7.
  • Tez — 0.9.2.
  • Zeppelin — 0.8.2.
  • Zookeeper — 3.4.14.
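Comparing this list with the one for image 2.0 shows which components moved to a new major version between the two images. The following sketch copies the version strings from both lists. `major` and `major_upgrades` are hypothetical helper names. Components that appear in only one list are skipped, because absence from a list of updates does not by itself mean a component was removed.

```python
# Base component versions copied from the 1.4 and 2.0 lists in this article.
IMAGE_1_4 = {
    "HBase": "1.3.5", "Hadoop": "2.10.0", "Hive": "2.3.6", "Flume": "1.9.0",
    "Livy": "0.7.0", "Oozie": "5.2.0", "Spark": "2.4.6", "Sqoop": "1.4.7",
    "Tez": "0.9.2", "Zeppelin": "0.8.2", "Zookeeper": "3.4.14",
}
IMAGE_2_0 = {
    "HBase": "2.2.7", "Hadoop": "3.2.2", "Hive": "3.1.2", "Livy": "0.8.0",
    "Oozie": "5.2.1", "Spark": "3.0.2", "Tez": "0.10.0", "Zeppelin": "0.9.0",
}


def major(version):
    """Return the leading numeric field of a dotted version string."""
    return int(version.split(".")[0])


def major_upgrades(old, new):
    """List components present in both images whose major version increased."""
    shared = old.keys() & new.keys()
    return sorted(name for name in shared if major(new[name]) > major(old[name]))


if __name__ == "__main__":
    for name in major_upgrades(IMAGE_1_4, IMAGE_2_0):
        print(f"{name}: {IMAGE_1_4[name]} -> {IMAGE_2_0[name]}")
```

Major version changes are the ones most likely to need attention when moving jobs between images, which is why the sketch ignores minor and patch differences.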

Python and machine learning libraries

Python has been updated to version 3.7.9.

The following libraries have been updated:

  • CatBoost — 0.20.2.
  • IPython — 7.9.0.
  • ipykernel — 5.1.3.
  • LightGBM — 2.3.0.
  • Matplotlib — 3.1.1.
  • pandas — 0.25.3.
  • PyArrow — 0.13.0.
  • PyHive — 0.6.1.
  • scikit-learn — 0.21.3.
  • TensorFlow — 1.15.0.
  • XGBoost — 0.90.
