Yandex DataSphere is a machine learning (ML) development environment that combines the familiar Jupyter® Notebook interface, serverless computing technology, and seamless use of different computing resource configurations. Yandex DataSphere helps significantly reduce the cost of machine learning compared to computing on your own hardware or other cloud platforms.
If you have never used Jupyter Notebook, give it a try: notebooks let you execute code step by step and see the results immediately. They are also handy for analytical reports and articles, because you can add explanations in Markdown between code cells.
Advantages of the service
Ready-to-use development environment
You don't need to spend time creating and maintaining VMs: when you create a project, the computing resources for it are allocated automatically.
The VM comes with the JupyterLab development environment and pre-installed packages for data analysis and ML (such as TensorFlow, Keras, and NumPy), which you can start using immediately. See the full list of pre-installed packages.
If you're missing a package, you can install it right from the notebook.
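In a Jupyter-based notebook, installing a package is typically done from a cell with the `%pip install` magic. The sketch below shows the same idea in plain Python: check whether a module is importable and, if not, build a pip command bound to the notebook's interpreter. The helper names and the package name `some_package` are hypothetical, and DataSphere may expect a specific form of the install command that is not described here.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package: str) -> list:
    """Build a pip command that targets the interpreter running the notebook."""
    return [sys.executable, "-m", "pip", "install", package]


# Hypothetical package name, used only for illustration.
package = "some_package"
if not is_installed(package):
    # Pass this list to subprocess.check_call(...) to actually run the install.
    print(" ".join(pip_install_command(package)))
```

Using `sys.executable -m pip` rather than a bare `pip` keeps the install in the same environment as the running kernel.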
Automatic maintenance of computing resources
The service manages resource allocation automatically. If you aren't running any computations, no resources are allocated. If you use early access features, vCPU and memory usage is shown directly in the notebook interface.
Saving states at shutdown
If you close the notebook tab, the state of the interpreter, all variables, and computation results are saved. You can continue working when you reopen your project.
Some variables can't be serialized and therefore aren't saved. For example, a variable that holds a file open for writing:
f = open("file.txt", "w")
When you assign such a variable, a warning is shown:
The following variables cannot be serialized:
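To see why an open file handle can't be saved, here is a minimal sketch that uses Python's standard `pickle` module as a stand-in serializer. The mechanism DataSphere actually uses is not stated here, so treat `pickle` and the helper `find_unserializable` as assumptions made for illustration.

```python
import pickle
import tempfile


def find_unserializable(namespace: dict) -> list:
    """Return the names in `namespace` whose values pickle cannot serialize."""
    failed = []
    for name, value in namespace.items():
        try:
            pickle.dumps(value)
        except Exception:
            # Open file handles, sockets, locks and similar objects end up here.
            failed.append(name)
    return failed


# A temporary file stands in for open("file.txt", "w") from the example above.
with tempfile.TemporaryFile("w") as f:
    variables = {"numbers": [1, 2, 3], "label": "ok", "f": f}
    unserializable = find_unserializable(variables)

print("Not serializable:", unserializable)
```

Plain data such as lists and strings serializes without issue, while the file handle is tied to an operating system resource and is rejected. A common workaround is to keep the file path in a variable and reopen the file when needed.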
Managing computing resources
Different computing resources are required for different tasks. For some of them, a regular processor is enough, but for others, you need a GPU.
DataSphere supports different computing resource configurations. By default, projects run with the minimal c1.4 configuration (32 GB RAM and 4 vCPUs).
You can change the configuration at any time while working in the notebook. The state of the interpreter is preserved.
Sharing results
You can export your notebook as HTML with all your calculation results and cell explanations and share a link to the report in this format. Exporting in other formats is currently not supported.
Current service limitations
For more information about service limits, see Quotas and limits.