This feature is in the Preview stage.
You can improve the quality of machine translation in a specific field of expertise by retraining the model on your own data. Retraining doesn't degrade the quality of everyday-language translation.
What data is required for retraining
Retraining requires pairs of source and translated segments in TMX format. A noticeable effect requires tens of thousands of such pairs.
Texts should match the target knowledge domain as closely as possible (for example, legal, medical, or oil and gas texts). Mixing subject areas leads to worse results.
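If your segment pairs are not yet stored as TMX, the sketch below shows one possible way to produce a minimal TMX 1.4 file with the Python standard library. The function name build_tmx, the header attribute values, and the en/ru language pair are illustrative assumptions, not requirements stated by the service.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute used on <tuv> elements.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="ru"):
    """Return a TMX 1.4 document (bytes) with one <tu> per (source, translation) pair.

    Hypothetical helper: header values below are placeholders.
    """
    root = ET.Element("tmx", version="1.4")
    ET.SubElement(root, "header", {
        "creationtool": "example-script",   # placeholder tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for source, translation in pairs:
        # Each translation unit holds the source segment and its translation.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, translation)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

To save the result, write the returned bytes to a file opened in binary mode, for example open("segments.tmx", "wb"). Keeping one subject area per file makes it easier to avoid mixing domains.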
How to retrain a model
Fill out the model retraining application: enter information about your cloud and attach the TMX file. Retraining takes approximately two weeks. The model ID is then sent to the email address specified in the application.
To use a model, enter its ID in the model parameter when sending a request.
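As a rough illustration, a translation request that passes the model ID might be assembled as follows. Only the model parameter comes from the text above. The endpoint URL, the other body fields (folderId, texts, targetLanguageCode), the Bearer authorization scheme, and the function name build_translate_request are assumptions, so check them against the API reference before use.

```python
import json
import urllib.request

# Assumed Translate API endpoint; verify against the API reference.
TRANSLATE_URL = "https://translate.api.cloud.yandex.net/translate/v2/translate"


def build_translate_request(texts, target_lang, folder_id, model_id, iam_token):
    """Build (but do not send) a POST request that selects a retrained model.

    Hypothetical helper: every field name except "model" is an assumption.
    """
    body = {
        "folderId": folder_id,              # assumed field: folder that owns the model
        "texts": list(texts),               # assumed field: strings to translate
        "targetLanguageCode": target_lang,  # assumed field: target language
        "model": model_id,                  # ID received by email after retraining
    }
    return urllib.request.Request(
        TRANSLATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {iam_token}",  # assumed auth scheme
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(request). Because the model is tied to a folder, the folder in the request presumably has to be the one specified in the application.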
Who will have access to the resulting model
Yandex.Cloud doesn't use the data you submit to train its own models. The resulting model is only available in the folder specified in the application.