© 2022 Yandex.Cloud LLC

Working with component network interfaces

Written by
Yandex.Cloud
  • Port forwarding
  • Components and ports

Data Proc enables you to create clusters accessible from the internet or only from a cloud network. However, we recommend making service component interfaces inaccessible from outside Yandex Cloud in any configuration. To connect externally to components like HDFS NameNode and YARN ResourceManager, you can route traffic via an intermediate VM with a public IP address.

Port forwarding

To access a component's network interface from the internet, create an intermediate virtual machine in Yandex Compute Cloud and forward ports through it.

Requirements for an intermediate VM:

  • A public IP address.
  • Placement in the same network as the Data Proc cluster.
  • Security group settings that allow traffic between the VM and the cluster on the ports of the required components.

For step-by-step instructions on how to configure security groups for port forwarding, see Connecting to clusters.

To connect to the desired Data Proc host port, run the following command:

ssh -A -J <VM public IP address> -L <port number>:<FQDN of Data Proc host>:<port number> root@<FQDN of Data Proc host>

You can find the FQDN of the Data Proc host on the Data Proc cluster page, in the Hosts tab, under the Hostname column.
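
As an illustration, the command above can be wrapped in a small shell function that fills in the placeholders. This is only a sketch and not part of any Data Proc tooling: the function name dataproc_tunnel_cmd and the example IP address 203.0.113.10 are hypothetical. The function prints the command instead of running it, so the result can be checked first.

```shell
# Hypothetical helper: print the port-forwarding command for one component port.
# Arguments: <VM public IP address> <FQDN of Data Proc host> <port number>
dataproc_tunnel_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: dataproc_tunnel_cmd <vm-ip> <host-fqdn> <port>" >&2
        return 1
    fi
    vm_ip=$1
    host_fqdn=$2
    port=$3
    # The same number is used for the local and the remote port,
    # as in the command above.
    printf 'ssh -A -J %s -L %s:%s:%s root@%s\n' \
        "$vm_ip" "$port" "$host_fqdn" "$port" "$host_fqdn"
}
```

For example, dataproc_tunnel_cmd 203.0.113.10 <FQDN of Data Proc host> 8088 prints the command for port 8088. While the printed command is running, that port of the Data Proc host should be reachable on the local machine at localhost:8088.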

The port numbers used for Data Proc components are given below.

Components and ports

Service                          Port
HBase Master                     16010
HBase REST                       8085
HDFS NameNode                    9870
HiveServer2                      10002
MapReduce Application History    19888
Oozie                            11000
Spark History                    18080
YARN Application History         8188
YARN ResourceManager             8088
Zeppelin                         8890
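
To avoid retyping port numbers, the list above could be kept in a small lookup function. This is a sketch under assumptions: the function name dataproc_component_port and the short keys such as yarn-rm are made up for this example, and only the port numbers come from the list above.

```shell
# Hypothetical helper: print the port for a Data Proc component.
# The short keys are invented for this sketch; the port numbers
# are copied from the list above.
dataproc_component_port() {
    case "$1" in
        hbase-master)      echo 16010 ;;
        hbase-rest)        echo 8085 ;;
        hdfs-namenode)     echo 9870 ;;
        hiveserver2)       echo 10002 ;;
        mapreduce-history) echo 19888 ;;
        oozie)             echo 11000 ;;
        spark-history)     echo 18080 ;;
        yarn-history)      echo 8188 ;;
        yarn-rm)           echo 8088 ;;
        zeppelin)          echo 8890 ;;
        *)
            # Unknown key: report it on stderr and signal failure.
            echo "unknown component: $1" >&2
            return 1
            ;;
    esac
}
```

For example, port=$(dataproc_component_port yarn-rm) stores 8088 in the port variable, which can then be substituted for <port number> in the ssh command from the Port forwarding section.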
