Yandex Data Proc

Subcluster runtime environment

  • Version 1.1
  • Version 1.0

When you create a Data Proc cluster, you choose an image version, which determines the versions of Hadoop and the additional components installed on the cluster. Each image version also includes Conda (a Python environment management system) and a set of machine learning tools.

Version 1.1

Hadoop and component versions:

  • Hadoop 2.10.0
  • Tez 0.9.2
  • Hive 2.3.6
  • ZooKeeper 3.4.14
  • HBase 1.3.5
  • Sqoop 1.4.7
  • Oozie 4.3.1
  • Spark 2.4.4
  • Flume 1.8.0
  • Zeppelin 0.8.2

Python and machine learning library versions:

  • Python 3.7.5
  • PyArrow 0.13.0
  • ipykernel 5.1.3
  • TensorFlow 1.15.0
  • CatBoost 0.20
  • PyHive 0.6.1
  • LightGBM 2.3.0
  • XGBoost 0.90
  • scikit-learn 0.21.3
  • pandas 0.25.3
  • IPython 7.9.0
  • Matplotlib 3.1.1
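
To confirm that a host's Python environment matches the version 1.1 list above, a check along the following lines may help. This is a minimal sketch, not an official tool: the dictionary keys are assumed distribution names (the list above gives display names only), and the helper names `installed_version` and `find_mismatches` are hypothetical.

```python
import sys

# Versions copied from the version 1.1 list above.
# The keys are ASSUMED distribution names; the source gives display names only.
EXPECTED_1_1 = {
    "pyarrow": "0.13.0",
    "ipykernel": "5.1.3",
    "tensorflow": "1.15.0",
    "catboost": "0.20",
    "PyHive": "0.6.1",
    "lightgbm": "2.3.0",
    "xgboost": "0.90",
    "scikit-learn": "0.21.3",
    "pandas": "0.25.3",
    "ipython": "7.9.0",
    "matplotlib": "3.1.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Python 3.7 has no importlib.metadata; fall back to setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, (want, found) in sorted(find_mismatches(EXPECTED_1_1).items()):
        print(f"{name}: expected {want}, found {found}")
```

The comparison is an exact string match, so a package that reports its version in a different form (for example, with an extra trailing `.0`) would be flagged even though it is the same release. Passing a custom `lookup` makes it possible to test the comparison without touching the real environment.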

Version 1.0

Hadoop and component versions:

  • Hadoop 2.8.5
  • Tez 0.9.1
  • Hive 2.3.4
  • ZooKeeper 3.4.6
  • HBase 1.3.3
  • Sqoop 1.4.6
  • Oozie 4.3.0
  • Spark 2.2.1
  • Flume 1.8.0
  • Zeppelin 0.7.3

Python and machine learning library versions:

  • Python 3.7
  • PyArrow 0.11.1
  • ipykernel 5.1.0
  • TensorFlow 1.13.1
  • CatBoost 0.14.2
  • LightGBM 2.2.3
  • XGBoost 0.82
  • scikit-learn 0.21.1
  • pandas 0.24.2
  • IPython 7.5.0
  • Matplotlib 3.0.3
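
When deciding between image versions, it can be useful to see which Hadoop components differ. The sketch below compares the two "Hadoop and component versions" lists from this page; the dictionaries are transcribed from those lists, and the function name `diff_versions` is a hypothetical helper introduced for illustration.

```python
# Component versions transcribed from the version 1.0 and 1.1 lists on this page.
V1_0 = {
    "Hadoop": "2.8.5", "Tez": "0.9.1", "Hive": "2.3.4", "ZooKeeper": "3.4.6",
    "HBase": "1.3.3", "Sqoop": "1.4.6", "Oozie": "4.3.0", "Spark": "2.2.1",
    "Flume": "1.8.0", "Zeppelin": "0.7.3",
}
V1_1 = {
    "Hadoop": "2.10.0", "Tez": "0.9.2", "Hive": "2.3.6", "ZooKeeper": "3.4.14",
    "HBase": "1.3.5", "Sqoop": "1.4.7", "Oozie": "4.3.1", "Spark": "2.4.4",
    "Flume": "1.8.0", "Zeppelin": "0.8.2",
}


def diff_versions(old, new):
    """Split components into changed, unchanged, and newly added groups."""
    changed = {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }
    unchanged = sorted(name for name in old if new.get(name) == old[name])
    added = sorted(set(new) - set(old))
    return changed, unchanged, added


if __name__ == "__main__":
    changed, unchanged, added = diff_versions(V1_0, V1_1)
    for name, (before, after) in sorted(changed.items()):
        print(f"{name}: {before} -> {after}")
    print("Unchanged:", ", ".join(unchanged))
```

With the lists as given, Flume is the only component whose version is the same in both images. The same function can be applied to the Python library lists, where PyHive appears only in version 1.1.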
© 2021 Yandex.Cloud LLC