Connecting external dictionaries

You can add external dictionaries to a cluster and remove them. For more information about dictionaries, see the ClickHouse documentation.

Adding a dictionary

  1. Select the cluster:

    1. Go to the folder page and select Managed Service for ClickHouse.
    2. Click on the name of the cluster and open the Dictionaries tab.
    3. Click Add dictionary.
  2. Configure the dictionary source parameters:

    Managed Service for ClickHouse supports several types of dictionary sources: HTTP(s), PostgreSQL, MySQL, ClickHouse, and MongoDB. Different types of sources take different connection parameters.

    • URL: HTTP(s) source URL.
    • File format: HTTP(s) source file format. For more information about formats, see the ClickHouse documentation.
    • Replicas: List of MySQL replicas to use as the dictionary source. You can configure replica-specific connection parameters (port, username, and password) or set general parameters.
    • Host: Source host name. The host must be in the same network as the ClickHouse cluster.
    • Priority: MySQL replica priority. During a connection attempt, ClickHouse reads from replicas based on their priority. The lower the number, the higher the priority.
    • Port: Port for connecting to the source.
    • User: Source database username.
    • Password: Password for accessing the source database.
    • Database: Name of source database.
    • Selection condition: Condition for selecting rows to generate a dictionary from. For example, the selection condition id=10 is the same as the SQL command WHERE id=10. This parameter is available for MySQL and ClickHouse.
    • Table: Source table name.
    • Collection: Name of the collection for the MongoDB source.
    • Dictionary status check: Optional parameter. An SQL query to check changes in a dictionary. ClickHouse updates the dictionary only if the results of this query change. For more information, see the ClickHouse documentation.
    • SSL mode: Mode for establishing a secure SSL TCP/IP connection to the PostgreSQL database. For more information, see the PostgreSQL documentation.

    For more information about dictionary sources and their connection parameters, see the ClickHouse documentation.

  3. Configure the structure and how the dictionary is stored in memory:

    Managed Service for ClickHouse supports multiple methods for storing dictionaries in memory: flat, hashed, cache, range_hashed, complex_key_hashed, and complex_key_cache. For more information about how to store dictionaries in memory, see the ClickHouse documentation.

    Parameters vary for the different methods of storing dictionaries in memory:

    • Cache size: Number of cache cells for the cache and complex_key_cache methods . For more information about the cache, see the ClickHouse documentation.
    • Numeric key: Dictionary key column name. The key column must be in the UInt64 data type. Used for flat, hashed, cache, and range_hashed methods. For more information about keys, see the ClickHouse documentation.
    • Composite key: Columns that make up the composite key of a dictionary. Specify the column IDs and their data types. Used for the complex_key_hashed and complex_key_cache methods. For more information about composite keys, see the ClickHouse documentation.
    • Start of range column and End of range column: Columns indicating the start and end of the range for the range_hashed method. Specify the column IDs and their data types. For more information about ranges, see the ClickHouse documentation.
    • Data columns: List of columns with dictionary data:
      • Name: Column name.
      • Type: Column data type.
      • Default value: Default value for an empty element. When loading a dictionary, all empty elements are replaced with this value. You can't use NULL in this field. Optional parameter.
      • Expression: Expression that ClickHouse executes with the column value. Optional parameter.
      • Hierarchical: Hierarchical support flag.
      • Injective: Flag that shows whether the id -> attribute image is injective

    For more information about column parameters, see the ClickHouse documentation.

  4. Set how often the dictionary updates:

    ClickHouse loads dictionaries to the RAM and periodically updates them. You can set a fixed update interval in seconds or specify a range and ClickHouse chooses a uniformly random time within this range. This is necessary for distributing the load on the dictionary source when upgrading on a large number of servers.

    For more information about dictionary updates, see the ClickHouse documentation.

If you don't have the Yandex.Cloud command line interface yet, install it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

To add an external dictionary to a ClickHouse cluster:

  1. View a description of the CLI command for adding dictionaries:

    $ yc managed-clickhouse cluster add-external-dictionary --help
    
  2. Run the add dictionary command. Example of the command for a PostgreSQL dictionary:

    $ yc managed-clickhouse cluster add-external-dictionary \
       --name <ClickHouse cluster name> \
       --dict-name <dictionary name> \
       --structure-id <key column name> \
       --structure-attribute name=<data column name>,type=<data type>,null-value=<empty element value >,expression=<expression>,hierarchical=<true|false>,injective=<true|false> \
       --fixed-lifetime <update interval in seconds> \
       --layout-type <flat|hashed|cache|range_hashed|complex_key_hashed|complex_key_cache> \
       --postgresql-source db=<source DB name>,table=<source table name>,port=<connection port>,user=<source DB user name>,password=<source DB password>,ssl-mode=<disable|allow|prefer|require|verify-ca|verify-full> \
       --postgresql-source-hosts <source DB hosts> \
    

You can add a dictionary using the createExternalDictionary method.

Viewing a list of dictionaries

  1. Go to the folder page and select Managed Service for ClickHouse.
  2. Click on the name of the cluster and open the Dictionaries tab.

If you don't have the Yandex.Cloud command line interface yet, install it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

To add an external dictionary to a ClickHouse cluster:

  1. View a description of the CLI command for getting detailed cluster information:
$ yc managed-clickhouse cluster get --help
  1. Run the command:
$ yc managed-clickhouse cluster get <cluster name>

The added dictionaries are displayed in the dictionaries: section of command execution results.

You can view a list of dictionaries and other cluster parameters using the get method.

Removing a dictionary

  1. Go to the folder page and click Managed Service for ClickHouse.
  2. Click on the name of the cluster and open the Dictionaries tab.
  3. Click image in the line of the necessary host and select Delete.

If you don't have the Yandex.Cloud command line interface yet, install it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

To remove an external dictionary:

  1. View a description of the CLI command to remove dictionaries:
$ yc managed-clickhouse cluster remove-external-dictionary --help
  1. Run the command to remove the dictionary:
$ yc managed-clickhouse cluster remove-external-dictionary \
  --name <cluster name> \
  --dict-name <dictionary name> \

You can remove a dictionary using the deleteExternalDictionary method.