Working with component network interfaces
Data Proc lets you create clusters that have only Yandex.Cloud internal addresses. The network and web interfaces of components in such clusters cannot be reached from outside the cloud network. To connect externally to components such as HDFS NameNode and YARN ResourceManager, route traffic through an intermediate VM with a public IP address.
To access a component's network interface from the internet, create an intermediate virtual machine in Yandex Compute Cloud.
Requirements for the intermediate VM:
- It has a public IP address assigned.
- It is hosted in the same network as the Data Proc cluster you want to reach.
- Its security group settings allow traffic exchange with the cluster on the ports of the relevant components.
For step-by-step instructions on how to configure security groups for port forwarding, see Connecting to Data Proc clusters.
To connect to the desired Data Proc host port, run the following command:
ssh -A -J <VM public IP address> -L <port number>:<FQDN of Data Proc host>:<port number> root@<FQDN of Data Proc host>
You can find the FQDN of the Data Proc host on the Data Proc cluster page: open the Hosts tab and look in the Hostname column.
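As an illustration of how the placeholders in the command above are filled in, here is a minimal sketch. The helper name `build_tunnel_cmd`, the IP address `203.0.113.10`, and the host name `dataproc-master.example.internal` are hypothetical values chosen for the example and are not real Data Proc hosts. The helper only prints the command, so you can check it before running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the ssh command that forwards one port
# through the intermediate VM. It does not open a connection itself.
# Arguments: <VM public IP address> <FQDN of Data Proc host> <port number>
build_tunnel_cmd() {
    jump_ip="$1"
    host_fqdn="$2"
    port="$3"
    printf 'ssh -A -J %s -L %s:%s:%s root@%s\n' \
        "$jump_ip" "$port" "$host_fqdn" "$port" "$host_fqdn"
}

# Placeholder values for illustration only (assumed, not from a real cluster).
build_tunnel_cmd 203.0.113.10 dataproc-master.example.internal 8088
```

Because `-L` binds the forwarded port on your local machine, the component's web interface should then be reachable at `http://localhost:<port number>` while the SSH session stays open. If the intermediate VM uses a different login than your local user name, the `-J` option also accepts the `user@host` form.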
The port numbers used for Data Proc components are given below.
Components and ports
|Service|Port|
|---|---|
|HDFS NameNode|9870|
|MapReduce Application History|19888|
|YARN Application History|8188|
|YARN ResourceManager|8088|
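If you open tunnels often, the ports listed above can be kept in a small lookup function. The sketch below is one possible way to do that. The function name `component_port` and the short aliases (`hdfs-namenode`, `yarn-rm`, and so on) are made up for this example and are not Data Proc identifiers.

```shell
#!/bin/sh
# Hypothetical lookup: maps an illustrative alias to the port from the table.
# Prints the port number, or returns a non-zero status for an unknown alias.
component_port() {
    case "$1" in
        hdfs-namenode)     echo 9870 ;;
        mapreduce-history) echo 19888 ;;
        yarn-history)      echo 8188 ;;
        yarn-rm)           echo 8088 ;;
        *)
            echo "unknown component: $1" >&2
            return 1
            ;;
    esac
}

# Example: print the port for YARN ResourceManager.
component_port yarn-rm
```

The printed value can be substituted for `<port number>` in the `ssh` command.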