Yandex.Cloud
  • Services
  • Why Yandex.Cloud
  • Solutions
  • Pricing
  • Documentation
  • Contact us
Get started
Yandex Data Proc
  • Use cases
    • Configuring networks for Data Proc clusters
    • Using Apache Hive
    • Running Spark applications
    • Running applications from a remote host
    • Copying files from Yandex Object Storage
  • Step-by-step instructions
    • All instructions
    • Creating clusters
    • Connecting to clusters
    • Updating subclusters
    • Managing subclusters
    • Deleting clusters
  • Concepts
    • Data Proc overview
    • Host classes
    • Hadoop and component versions
    • Component interfaces and ports
    • Component web interfaces
    • Auto scaling
    • Decommissioning subclusters and hosts
    • Network in Data Proc
    • Quotas and limits
  • Access management
  • Pricing policy
  • API reference
    • Authentication in the API
    • gRPC
      • Overview
      • ClusterService
      • JobService
      • ResourcePresetService
      • SubclusterService
      • OperationService
    • REST
      • Overview
      • Cluster
        • Overview
        • create
        • delete
        • get
        • list
        • listHosts
        • listOperations
        • listUILinks
        • start
        • stop
        • update
      • Job
        • Overview
        • create
        • get
        • list
        • listLog
      • ResourcePreset
        • Overview
        • get
        • list
      • Subcluster
        • Overview
        • create
        • delete
        • get
        • list
        • update
  • Questions and answers
  1. Concepts
  2. Hadoop and component versions

Subcluster runtime environment

  • Current images
  • Deprecated images

When creating a Data Proc cluster, you can choose the image version (versions of Hadoop and additional components).

The components for both current and deprecated image versions are listed below. Each image version includes Conda (a Python environment management system) and a set of machine learning tools.

Current images

Components Image 1.2 Image 1.3 Image 1.4
Hadoop and component versions
Hadoop 2.10.0 2.10.0 2.10.0
Tez 0.9.2 0.9.2 0.9.2
Hive 2.3.6 2.3.6 2.3.6
Zookeeper 3.4.14 3.4.14 3.4.14
HBase 1.3.5 1.3.5 1.3.5
Sqoop 1.4.7 1.4.7 1.4.7
Oozie 5.2.0 5.2.0 5.2.0
Spark 2.4.6 2.4.6 2.4.6
Flume 1.9.0 1.9.0 1.9.0
Zeppelin 0.8.2 0.8.2 0.8.2
Livy — 0.7.0 0.7.0
Python and machine learning library versions
Python 3.7.7 3.7.9 3.7.9
PyArrow 0.13.0 0.13.0 0.13.0
ipykernel 5.1.3 5.1.3 5.1.3
TensorFlow 1.15.0 1.15.0 1.15.0
CatBoost 0.20 0.20.2 0.20.2
PyHive 0.6.1 0.6.1 0.6.1
LightGBM 2.3.0 2.3.0 2.3.0
XGBoost 0.90 0.90 0.90
scikit-learn 0.21.3 0.21.3 0.21.3
pandas 0.25.3 0.25.3 0.25.3
IPython 7.9.0 7.9.0 7.9.0
Matplotlib 3.1.1 3.1.1 3.1.1

Deprecated images

Note

These image versions are deprecated. Consider using the recent image versions.

Components Image 1.0 Image 1.1
Hadoop and component versions
Hadoop 2.8.5 2.10.0
Tez 0.9.1 0.9.2
Hive 2.3.4 2.3.6
Zookeeper 3.4.6 3.4.14
HBase 1.3.3 1.3.5
Sqoop 1.4.6 1.4.7
Oozie 4.3.0 4.3.1
Spark 2.2.1 2.4.4
Flume 1.8.0 1.8.0
Zeppelin 0.7.3 0.8.2
Python and machine learning library versions
Python 3.7 3.7.5
PyArrow 0.11.1 0.13.0
ipykernel 5.1.0 5.1.3
TensorFlow 1.13.1 1.15.0
CatBoost 0.14.2 0.20
PyHive — 0.6.1
LightGBM 2.2.3 2.3.0
XGBoost 0.82 0.90
scikit-learn 0.21.1 0.21.3
pandas 0.24.2 0.25.3
IPython 7.5.0 7.9.0
Matplotlib 3.0.3 3.1.1
In this article:
  • Current images
  • Deprecated images
Language / Region
Careers
Privacy policy
Terms of use
Brandbook
© 2021 Yandex.Cloud LLC