DataSphere projects

Updated on March 26, 2024

A project is a user's main workspace that serves as a single entry point for all DataSphere features. A project allows you to perform computations on Yandex Cloud VMs with standard configurations and stores DataSphere user resources.

A notebook is an *.ipynb file that you work with in the JupyterLab development environment. In a notebook, you write code in cells and can add Markdown comments between them. The code is run for each cell separately. Cells can be run in any order.
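As a rough illustration of what "cells" means on disk, the sketch below builds a minimal notebook in the standard Jupyter nbformat 4 JSON layout and counts its cells by type. The cell contents and the `count_cells` helper are hypothetical and used only for illustration; nothing here is specific to DataSphere.

```python
import json

# Minimal notebook in the Jupyter nbformat 4 layout: a list of cells,
# each marked as either "markdown" or "code". Contents are made up.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load data"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # the cell has not been run yet
            "outputs": [],
            "source": ["values = [1, 2, 3]\n", "total = sum(values)"],
        },
    ],
}


def count_cells(nb):
    """Return how many cells of each type the notebook contains."""
    counts = {}
    for cell in nb["cells"]:
        cell_type = cell["cell_type"]
        counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts


# Round-trip through JSON, which is how an .ipynb file is stored.
restored = json.loads(json.dumps(notebook))
cell_counts = count_cells(restored)
print(cell_counts)
```

Because each code cell carries its own `execution_count` and `outputs`, cells can be run separately and in any order.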

Project storage

DataSphere provides 10 GB of free storage for each project. You can increase the storage size for an additional charge. For the cost of expanding the main storage, see DataSphere pricing.

You can upload small amounts of data (up to 100 MB) to your DataSphere project through the UI. To upload larger amounts of data, use your network storage or databases. Datasets are also a convenient option for large data.
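Before uploading, it can help to check a file against the 100 MB limit. The following is a minimal sketch, assuming the limit is read as 100 × 1024 × 1024 bytes (the text does not say whether MB is decimal or binary); the `fits_ui_upload` helper is hypothetical and not part of any DataSphere API.

```python
import os
import tempfile

# Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
UI_UPLOAD_LIMIT_BYTES = 100 * 1024 * 1024


def fits_ui_upload(path, limit=UI_UPLOAD_LIMIT_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


# Demonstration with a small temporary file (1 KiB of zero bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1024)
    sample_path = tmp.name

if fits_ui_upload(sample_path):
    print("Small enough to upload through the UI")
else:
    print("Use network storage, a database, or a dataset instead")

os.remove(sample_path)
```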

Configuring a project runtime environment

Projects are created with a preset development environment and pre-installed packages. DataSphere provides several Docker images of the environment with a choice of Python versions and libraries. The DS Default (Python 3.10) image is used by default, but you can select another standard image. For a list of all pre-installed packages, see List of pre-installed software. If a package you need is missing, you can install it directly from a notebook cell or build your own Docker image.
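One way to find out which packages are missing is to test whether they can be imported. This minimal sketch uses the standard library's `importlib.util.find_spec`; the `missing_packages` helper and the package name `hypothetical_pkg_xyz` are made up for illustration. It prints a `%pip install` line (the standard IPython magic) that you could run in a notebook cell. The import name and the name on PyPI sometimes differ, so check before installing.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


# "json" ships with Python; "hypothetical_pkg_xyz" is a made-up name.
needed = ["json", "hypothetical_pkg_xyz"]

for name in missing_packages(needed):
    # In a notebook cell, you would run the printed line as-is.
    print(f"%pip install {name}")
```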

DataSphere Notebook

DataSphere Notebook lets you run computations on a VM as if you were working in a local JupyterLab notebook. It provides the selected configuration for long-term use and keeps the VM assigned to the project notebook until you forcibly return it to the pool of available VMs or until the timeout expires. By default, the VM is released if no computations are run in the project within three hours. You can change this value in the project settings.
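The idle timeout rule can be sketched as simple date arithmetic. This is only an illustration of the rule as described above, assuming the timer restarts from the most recent computation; the function names are hypothetical and do not correspond to a DataSphere API.

```python
from datetime import datetime, timedelta

# Default from the text: the VM is released after three idle hours.
DEFAULT_IDLE_TIMEOUT = timedelta(hours=3)


def release_time(last_computation, idle_timeout=DEFAULT_IDLE_TIMEOUT):
    """Return the moment the VM would be released if nothing else is run."""
    return last_computation + idle_timeout


def is_released(last_computation, now, idle_timeout=DEFAULT_IDLE_TIMEOUT):
    """Return True if the idle timeout has already expired at `now`."""
    return now >= release_time(last_computation, idle_timeout)


last_run = datetime(2024, 3, 26, 10, 0)
print(release_time(last_run))                              # default timeout
print(release_time(last_run, timedelta(minutes=30)))       # shorter project setting
print(is_released(last_run, datetime(2024, 3, 26, 12, 0)))
```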

Changes to cell code are saved automatically. You can disable notebook autosaves in the JupyterLab settings by selecting Settings ⟶ Autosave Documents in the top menu. To keep the interpreter state or cell output, you need to save it yourself.
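One common way to keep selected variables between sessions is to serialize them to a file. The sketch below uses Python's standard `pickle` module; the file name, the helper functions, and the sample values are assumptions made for illustration, and this is not a DataSphere-specific mechanism. Only load pickle files that you created yourself or otherwise trust.

```python
import os
import pickle
import tempfile


def save_state(variables, path):
    """Write a dictionary of variables to `path` using pickle."""
    with open(path, "wb") as f:
        pickle.dump(variables, f)


def load_state(path):
    """Read back a dictionary previously written by save_state."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical location and values; in a project you would pick your own path.
state_path = os.path.join(tempfile.gettempdir(), "notebook_state.pkl")
save_state({"learning_rate": 0.01, "history": [0.9, 0.7, 0.5]}, state_path)

restored_state = load_state(state_path)
print(restored_state["history"])
```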

You can link multiple VM configurations to a single project. When running computations in your notebook for the first time, select a configuration to use for them.

Billing for DataSphere Notebook starts when the first computation is run in a notebook and continues for as long as the VM is assigned to the project. For details on usage costs, see DataSphere pricing.

JupyterLab console

In DataSphere Notebook, you can use the JupyterLab console with an interactive Python interpreter. The console runs on a separate VM instance with the c1.4 configuration. To open the console, on the JupyterLab home page, select DataSphere Kernel under Console. Enter commands in the console input line and run them with the Shift + Enter keyboard shortcut.

Closing the console does not stop its VM instance, which keeps running. To shut down the console VM and stop paying for it, use the widget in the top-right corner of the screen or on the project home page.

JupyterLab extensions

The following JupyterLab extensions are available in Dedicated mode:

JupyterLab-latex
JupyterLab-widgets (ipywidgets)
JupyterLab-code-formatter (black, isort)
JupyterLab-execute-time
JupyterLab-limit-output
JupyterLab-spellchecker
JupyterLab-templates
