Configuring an Apache Kafka® source endpoint
When creating or editing an endpoint, you can define:
- Yandex Managed Service for Apache Kafka® cluster connection or custom installation settings, including those based on Yandex Compute Cloud VMs. These are required parameters.
- Additional parameters.
Managed Service for Apache Kafka® cluster
Connection to the cluster by its ID specified in Yandex Cloud. Available only for clusters deployed in Yandex Managed Service for Apache Kafka®.
- Managed Kafka: Select the cluster to connect to.
- Security groups: Select the cloud network to host the endpoint and security groups for network traffic.
This will let you apply the specified security group rules to the VMs and clusters in the selected network without changing the settings of these VMs and clusters. For more information, see Network in Yandex Data Transfer.
- Authentication: Select the SASL or Without authentication type.
If SASL is selected, specify the hashing mechanism and the username and password of the account that Data Transfer will use to connect to the topic.
- Topic: Specify the name of the topic to connect to.
Custom installation
Connection to an Apache Kafka® cluster using explicitly specified network addresses and ports of the broker hosts.
- Broker URI: Specify broker host IPs or FQDNs.
If the Apache Kafka® port number differs from the standard one, specify it with a colon after the host name:
<broker host IP or FQDN>:<port number>
- PEM certificate: If transmitted data needs to be encrypted, for example, to meet the requirements of PCI DSS, upload the certificate file or add its contents as text.
- SSL: Use encryption to protect the connection.
- Endpoint network interface: Select or create a subnet in the desired availability zone.
If the source and target are geographically close, connecting via the selected subnet speeds up the transfer.
- Security groups: Select the cloud network to host the endpoint and security groups for network traffic.
This will let you apply the specified security group rules to the VMs and clusters in the selected network without changing the settings of these VMs and clusters. For more information, see Network in Yandex Data Transfer.
- Authentication: Select the SASL or Without authentication type.
If SASL is selected, specify the hashing mechanism and the username and password of the account that Data Transfer will use to connect to the topic.
- Topic: Specify the name of the topic to connect to.
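As a rough illustration of the custom installation settings above, the sketch below normalizes broker addresses into the host:port form and assembles comparable client-side connection properties. The default port 9092, the helper names normalize_broker and build_client_config, the SCRAM-SHA-512 default, and the use of librdkafka-style property names are assumptions made for this example. They are not part of Data Transfer, which takes these values through the endpoint form.

```python
DEFAULT_KAFKA_PORT = 9092  # assumption: the "standard" port; adjust for your cluster


def normalize_broker(address, default_port=DEFAULT_KAFKA_PORT):
    """Return the address as host:port, appending the default port if none is given.

    IPv6 literals are not handled in this sketch.
    """
    address = address.strip()
    if not address:
        raise ValueError("empty broker address")
    host, sep, port = address.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host name or IP.
        return f"{address}:{default_port}"
    if not host or not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"invalid broker address: {address!r}")
    return f"{host}:{int(port)}"


def build_client_config(brokers, use_ssl=False, username=None, password=None,
                        mechanism="SCRAM-SHA-512"):
    """Build a dict of librdkafka-style properties (hypothetical helper)."""
    config = {"bootstrap.servers": ",".join(normalize_broker(b) for b in brokers)}
    if username is not None:
        # SASL authentication, with or without transport encryption.
        config["security.protocol"] = "SASL_SSL" if use_ssl else "SASL_PLAINTEXT"
        config["sasl.mechanism"] = mechanism
        config["sasl.username"] = username
        config["sasl.password"] = password
    else:
        # "Without authentication" in the endpoint form.
        config["security.protocol"] = "SSL" if use_ssl else "PLAINTEXT"
    return config
```

A dict like this could be handed to a Kafka client library to check that the brokers, credentials, and topic are reachable before the endpoint is created.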
Additional parameters
Warning
Data is processed in the following order:
- Transformation.
- Conversion.
- Transformation rules
The rules that Cloud Functions uses to process an incoming stream:
- Processing function: Select one of the functions created in Yandex Cloud Functions.
- Service account: Select or create a service account that the processing function will start under.
- Number of attempts: Set the number of attempts to invoke the processing function.
- Buffer size to send: Set the buffer size (in bytes). When the buffer is full, its data is transferred to the processing function.
The maximum buffer size is 3.5 MB. For more information about restrictions that apply when working with functions in Cloud Functions, see the corresponding section.
- Sending interval: Set the interval (in seconds) after which the data from the stream is transferred to the processing function.
Note
If the buffer becomes full or the sending interval expires, the data is transferred to the processing function.
- Call timeout: Set the maximum time (in seconds) to wait for a response from the processing function.
Warning
Values in the Sending interval and Call timeout fields are specified with the s postfix, for example, 10s.
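The interplay between the buffer size and the sending interval can be sketched as follows. This is a minimal model of the behavior described in the note above, not the service's implementation: the SendBuffer class, the send callable, and the parse_seconds helper are hypothetical names introduced for illustration.

```python
import time


def parse_seconds(value):
    """Parse a duration written with the s postfix, for example "10s"."""
    if not value.endswith("s"):
        raise ValueError(f"expected a value with the s postfix, got {value!r}")
    return float(value[:-1])


class SendBuffer:
    """Collects messages and flushes when the buffer is full or the interval expires."""

    def __init__(self, send, max_bytes, interval, clock=time.monotonic):
        self._send = send              # hypothetical callable that invokes the function
        self._max_bytes = max_bytes    # buffer size to send, in bytes
        self._interval = parse_seconds(interval)
        self._clock = clock
        self._items = []
        self._size = 0
        self._last_flush = clock()

    def add(self, message):
        """Append one message (bytes) and flush if either condition is met."""
        self._items.append(message)
        self._size += len(message)
        self.flush_if_due()

    def flush_if_due(self):
        """Send buffered messages if the buffer is full or the interval has expired."""
        full = self._size >= self._max_bytes
        expired = self._clock() - self._last_flush >= self._interval
        if self._items and (full or expired):
            self._send(self._items)
            self._items = []
            self._size = 0
            self._last_flush = self._clock()
```

In this model, a small buffer with a long interval sends data mostly on size, while a large buffer with a short interval sends mostly on time. The clock parameter exists only so the timing can be controlled when experimenting.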
- Conversion rules:
- Data format: Select one of the available formats: JSON or CSV.
- Data schema: Specify the schema as a list of fields or upload a file with a description of the schema in JSON format.
Sample data schema:
[
    {
        "name": "request",
        "type": "string"
    }
]
- Add unmarked columns: Select this option to have the fields missing in the schema appear in the _rest column.
- Allow null in key columns: Select this option to allow the null value in key columns.
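To illustrate how the conversion options above might interact, the following sketch applies a schema to a single JSON message. The function name convert_message, the "key" flag on schema entries, and the exact shape of the _rest value are assumptions made for this example only; the service's actual conversion logic may differ.

```python
import json


def convert_message(raw, schema, add_unmarked=False, allow_null_keys=False):
    """Map one JSON message onto the schema columns (illustrative only)."""
    record = json.loads(raw)
    row = {}
    for field in schema:
        name = field["name"]
        # A field that is absent from the message is treated as null here.
        value = record.pop(name, None)
        # "key" is a hypothetical flag marking key columns in this sketch.
        if value is None and field.get("key") and not allow_null_keys:
            raise ValueError(f"null value in key column {name!r}")
        row[name] = value
    if add_unmarked and record:
        # Fields not listed in the schema are collected in the _rest column.
        row["_rest"] = record
    return row
```

With the sample schema, a message that also carries a status field would produce only the request column unless add_unmarked is enabled, in which case status ends up under _rest.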