© 2022 Yandex.Cloud LLC
Creating a VM with a GPU

Written by Yandex Cloud, improved by amatol

    This section provides guidelines for creating VMs with a GPU. For more information about VM configurations, see Graphics accelerators.

    By default, the cloud has a zero quota for creating VMs with GPUs. To change the quota, contact technical support.

    You can create VMs on Intel Broadwell with NVIDIA® Tesla® V100 and Intel Cascade Lake with NVIDIA® Tesla® V100 in the ru-central1-a and ru-central1-b availability zones.

    VMs on the AMD EPYC™ with NVIDIA® Ampere® A100 and Intel Ice Lake with NVIDIA® Tesla® T4 platforms can be created only in the ru-central1-a availability zone.
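
The zone restrictions above can be written down as a small lookup table, which is handy for checking a platform and zone pair before you request a VM. This is only a sketch: the table repeats what this article states, and the names PLATFORM_ZONES and is_available are hypothetical.

```python
# Zone availability exactly as stated in this article (assumption: nothing
# beyond these two zones is covered here).
PLATFORM_ZONES = {
    "Intel Broadwell with NVIDIA Tesla V100": {"ru-central1-a", "ru-central1-b"},
    "Intel Cascade Lake with NVIDIA Tesla V100": {"ru-central1-a", "ru-central1-b"},
    "AMD EPYC with NVIDIA Ampere A100": {"ru-central1-a"},
    "Intel Ice Lake with NVIDIA Tesla T4": {"ru-central1-a"},
}


def is_available(platform: str, zone: str) -> bool:
    """Return True if the article lists the platform as available in the zone."""
    return zone in PLATFORM_ZONES.get(platform, set())


if __name__ == "__main__":
    print(is_available("AMD EPYC with NVIDIA Ampere A100", "ru-central1-b"))
```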

    Management console

    To create a VM:

    1. In the management console, select the folder to create the virtual machine in.

    2. In the list of services, select Compute Cloud.

    3. Click Create VM.

    4. Under Basic parameters:

      • Enter a name and description for the VM. Naming requirements:

        • The length can be from 3 to 63 characters.
        • It may contain lowercase Latin letters, numbers, and hyphens.
        • The first character must be a letter. The last character can't be a hyphen.

        Note

        The VM name is used to generate an internal FQDN only once: when creating a VM. If the internal FQDN is important to you, choose an appropriate name for the VM at the creation stage.

      • Select an availability zone to put your virtual machine in.
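
If you generate VM names in a script, the naming requirements from this step can be checked before the VM is created. The following is a minimal sketch of those three rules as a regular expression; the function name is_valid_vm_name is hypothetical, and the service may enforce additional rules not listed here.

```python
import re

# 3 to 63 characters: a lowercase Latin letter first, then lowercase letters,
# digits, or hyphens, and a letter or digit (not a hyphen) at the end.
_VM_NAME = re.compile(r"[a-z][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_vm_name(name: str) -> bool:
    """Check a VM name against the naming requirements listed above."""
    return _VM_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("gpu-instance", "Gpu-instance", "vm-", "ab"):
        print(candidate, is_valid_vm_name(candidate))
```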

    5. Under Image/boot disk selection, click the Cloud Marketplace tab and select a GPU-oriented image and an operating system version.

      For VM instances running on the Intel Broadwell with NVIDIA® Tesla® V100 and Intel Cascade Lake with NVIDIA® Tesla® V100 platforms, special images are available in Cloud Marketplace: Windows 2016 Datacenter GPU (windows-2016-gvlk-gpu), Ubuntu 16.04 LTS GPU (ubuntu-1604-lts-gpu), and Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu). The images have NVIDIA drivers pre-installed.

      For VM instances running on the Intel Ice Lake with NVIDIA® Tesla® T4 platform, an Ubuntu image is available: Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu).

      For VM instances running on the AMD EPYC™ with NVIDIA® Ampere® A100 platform, a special Ubuntu image is available: Ubuntu 20.04 LTS GPU A100 (ubuntu-2004-lts-gpu-a100). We recommend using a standard image from Yandex Cloud. You can also install the drivers on another standard image yourself or create a custom image with pre-installed drivers.

    6. (optional) Under Disks and file storage, configure a boot disk:

      • Select the disk type.
      • Specify the necessary disk size.
    7. (optional) Under Disks and file storage, select the File storage tab, attach a file storage, and enter the device name.

    8. Under Computing resources:

      • Choose a platform:
        • Intel Broadwell with NVIDIA® Tesla® V100.
        • Intel Cascade Lake with NVIDIA® Tesla® V100.
        • AMD EPYC™ with NVIDIA® Ampere® A100.
        • Intel Ice Lake with NVIDIA® Tesla® T4.
      • Select a virtual machine configuration specifying the required number of GPUs.
      • If necessary, make your VM preemptible.
    9. Under Network settings:

      • Enter a subnet ID or select a cloud network from the list.
        If you don't have a network, click Create network to create one:
        • In the window that opens, enter the network name and folder to host the network.
        • (optional) To automatically create subnets, select the Create subnets option.
        • Click Create.
          Each network must have at least one subnet. If there is no subnet, create one by selecting Add subnet.
      • In the Public IP field, choose a method for assigning an IP address:
        • Auto: Assign a random IP address from the Yandex Cloud IP pool. With this, you can enable DDoS protection using the option below.
        • List: Select a public IP address from the list of previously reserved static addresses. For more information, see Making a dynamic public IP address static.
        • No address: Don't assign a public IP address.
      • In the Internal address field, select the method for assigning internal addresses: Auto or Manual.
      • (optional) Create a record for the VM in the DNS zone. Expand the DNS settings for internal addresses section, click Add record and specify the zone, FQDN and TTL for the record. For more detail, please see Cloud DNS integration with Compute Cloud.
      • Select appropriate security groups. If there is no such field, all incoming and outgoing traffic is allowed for the VM.
    10. Under Access, specify the data required to access the VM:

      • (optional) Select or create a service account. By using a service account, you can flexibly configure access rights for your resources.

        For VMs with a Linux-based operating system:

        • Enter the username in the Login field.

        Alert

        Don't use the username root or other names reserved by the operating system. To perform operations that require superuser permissions, use the sudo command.

        • In the SSH key field, paste the contents of the public key file.

        For VMs with a Windows-based operating system:

        • When you create a VM, the Administrator user is automatically created in the operating system. In the Password field, set a password for this user to log in to the VM via RDP.

          Do not use passwords that are easy to guess. Passwords must meet the Windows security policy.

          In Windows Server images from Yandex Cloud, the Administrator user's password expiration is disabled by default.

        • (optional) If necessary, enable access to the serial console.

    11. Click Create VM.

    The virtual machine appears in the list.

    CLI

    If you don't have the Yandex Cloud command line interface yet, install and initialize it.

    The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

    1. View a description of the CLI command for creating a VM:

      yc compute instance create --help
      
    2. Prepare the key pair (public and private keys) for SSH access to the VM.

    3. Select a public image.

      To get a list of available images, run the following command:

      yc compute image list --folder-id standard-images
      +----------------------+----------------------------------+-----------------------+----------------------+--------+
      |          ID          |               NAME               |         FAMILY        |     PRODUCT IDS      | STATUS |
      +----------------------+----------------------------------+-----------------------+----------------------+--------+
      ...
      | fdv7ooobjfl3ts9gqp0q | windows-2016-gvlk-gpu-1548913814 | windows-2016-gvlk-gpu | dqnnc72gj2ist3ktjj1p | READY  |
      | fdv4f5kv5cvf3ohu4flt | ubuntu-1604-lts-gpu-1549457823   | ubuntu-1604-lts-gpu   | dqnnb6dc7640c5i968ro | READY  |
      ...
      +----------------------+----------------------------------+-----------------------+----------------------+--------+
      

      For VM instances running on the Intel Broadwell with NVIDIA® Tesla® V100 and Intel Cascade Lake with NVIDIA® Tesla® V100 platforms, special images are available in Cloud Marketplace: Windows 2016 Datacenter GPU (windows-2016-gvlk-gpu), Ubuntu 16.04 LTS GPU (ubuntu-1604-lts-gpu), and Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu). The images have NVIDIA drivers pre-installed.

      For VM instances running on the Intel Ice Lake with NVIDIA® Tesla® T4 platform, an Ubuntu image is available: Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu).

      For VM instances running on the AMD EPYC™ with NVIDIA® Ampere® A100 platform, a special Ubuntu image is available: Ubuntu 20.04 LTS GPU A100 (ubuntu-2004-lts-gpu-a100). We recommend using a standard image from Yandex Cloud. You can also install the drivers on another standard image yourself or create a custom image with pre-installed drivers.
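
When the image family is chosen in a script, the pairing of platforms and images described above can be checked before the create command is run. The sketch below records only what this article states; the names GPU_IMAGE_FAMILIES and image_matches_platform are hypothetical, and the Intel Ice Lake with NVIDIA® Tesla® T4 platform is left out because its platform ID is not given in this article.

```python
# Image families per platform ID, as listed in this article.
# gpu-standard-v1 / v2: Intel Broadwell / Intel Cascade Lake with Tesla V100.
# gpu-standard-v3: AMD EPYC with Ampere A100.
_V100_FAMILIES = ["windows-2016-gvlk-gpu", "ubuntu-1604-lts-gpu", "ubuntu-2004-lts-gpu"]

GPU_IMAGE_FAMILIES = {
    "gpu-standard-v1": _V100_FAMILIES,
    "gpu-standard-v2": _V100_FAMILIES,
    "gpu-standard-v3": ["ubuntu-2004-lts-gpu-a100"],
}


def image_matches_platform(platform_id: str, image_family: str) -> bool:
    """Return True if the article lists the image family for the platform."""
    return image_family in GPU_IMAGE_FAMILIES.get(platform_id, [])


if __name__ == "__main__":
    print(image_matches_platform("gpu-standard-v3", "ubuntu-1604-lts-gpu"))
```

A custom image or a standard image with manually installed drivers would not appear in this table, so treat a negative result as a prompt to double-check rather than as a hard error.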

    4. Create a VM in the default folder:

      yc compute instance create \
        --name gpu-instance \
        --zone ru-central1-a \
        --platform=gpu-standard-v1 \
        --cores=8 \
        --memory=96 \
        --gpus=1 \
        --network-interface subnet-name=default-ru-central1-a,nat-ip-version=ipv4 \
        --create-boot-disk image-folder-id=standard-images,image-family=ubuntu-1604-lts-gpu \
        --ssh-key ~/.ssh/id_rsa.pub
      

      Where:

      • name: VM name.

        Note

        The VM name is used to generate an internal FQDN only once: when creating a VM. If the internal FQDN is important to you, choose an appropriate name for the VM at the creation stage.

      • zone: Availability zone.

      • platform: Platform ID:

        • gpu-standard-v1 for Intel Broadwell with NVIDIA® Tesla® V100.
        • gpu-standard-v2 for Intel Cascade Lake with NVIDIA® Tesla® V100.
        • gpu-standard-v3 for AMD EPYC™ with NVIDIA® Ampere® A100.
      • cores: Number of vCPUs.

      • memory: Amount of RAM, in GB.

      • gpus: Number of GPUs.

      • preemptible: If required, make your VM preemptible.

      • create-boot-disk: Boot disk image. The ubuntu-1604-lts-gpu family is Ubuntu 16.04 LTS GPU with CUDA drivers.

      • nat-ip-version: Public IP. To create a VM without a public IP address, omit the nat-ip-version=ipv4 option.

      This creates a VM named gpu-instance with one GPU, 8 vCPUs, and 96 GB of RAM. To check its parameters, run:

      yc compute instance get --full gpu-instance
      

      Result:

      name: gpu-instance
      zone_id: ru-central1-a
      platform_id: gpu-standard-v1
      resources:
        memory: "103079215104"
        cores: "8"
        core_fraction: "100"
        gpus: "1"
      status: RUNNING
      ...
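
As a rough sketch, the create command above can be assembled as an argument list in Python, and the memory value in the result, which is reported in bytes, can be converted back to GB for comparison. The helper names build_create_command and memory_bytes_to_gb are hypothetical; the flags are copied from the command above, the values in the example call are illustrative, and the sketch only builds the argument list without running yc.

```python
import shlex


def build_create_command(name, zone, platform, cores, memory_gb, gpus,
                         subnet_name, image_family, ssh_key_path,
                         public_ip=True):
    """Build the argument list for `yc compute instance create`."""
    nic = f"subnet-name={subnet_name}"
    if public_ip:
        # Leave this part out to create a VM without a public IP address.
        nic += ",nat-ip-version=ipv4"
    return [
        "yc", "compute", "instance", "create",
        "--name", name,
        "--zone", zone,
        "--platform", platform,
        "--cores", str(cores),
        "--memory", str(memory_gb),
        "--gpus", str(gpus),
        "--network-interface", nic,
        "--create-boot-disk",
        f"image-folder-id=standard-images,image-family={image_family}",
        "--ssh-key", ssh_key_path,
    ]


def memory_bytes_to_gb(value):
    """Convert the `memory` field of the result (bytes) to GB (2**30 bytes)."""
    return int(value) / 2**30


if __name__ == "__main__":
    cmd = build_create_command(
        name="gpu-instance", zone="ru-central1-a", platform="gpu-standard-v1",
        cores=8, memory_gb=96, gpus=1,
        subnet_name="default-ru-central1-a",
        image_family="ubuntu-1604-lts-gpu",
        ssh_key_path="~/.ssh/id_rsa.pub",
    )
    print(shlex.join(cmd))
    print(memory_bytes_to_gb("103079215104"))
```

If you pass the list to subprocess.run instead of a shell, the ~ in the key path is not expanded, so expand it first, for example with os.path.expanduser.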
      

    API

    To create a VM, use the create method for the Instance resource.
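
For orientation, here is a sketch of what a request body for the create method might look like when built in Python. The field names (folderId, resourcesSpec, bootDiskSpec, networkInterfaceSpecs, and so on) are written from memory of the REST reference and should be treated as assumptions to verify against the Instance resource description. The function name build_create_request and all argument values are hypothetical, and the sketch only builds the JSON body without sending it.

```python
import json


def build_create_request(folder_id, subnet_id, image_id, ssh_user,
                         ssh_public_key, disk_size_gb=50):
    """Build a JSON-serializable body for creating a VM with one GPU.

    Assumption: 64-bit integer fields (cores, memory, gpus, size) are passed
    as strings, and memory and disk size are given in bytes.
    """
    return {
        "folderId": folder_id,
        "name": "gpu-instance",
        "zoneId": "ru-central1-a",
        "platformId": "gpu-standard-v1",
        "resourcesSpec": {
            "cores": "8",
            "memory": str(96 * 2**30),
            "gpus": "1",
        },
        "bootDiskSpec": {
            "diskSpec": {
                "size": str(disk_size_gb * 2**30),
                "imageId": image_id,
            },
        },
        "networkInterfaceSpecs": [
            {
                "subnetId": subnet_id,
                # Requests a public IPv4 address; drop this key for none.
                "primaryV4AddressSpec": {
                    "oneToOneNatSpec": {"ipVersion": "IPV4"},
                },
            },
        ],
        "metadata": {"ssh-keys": f"{ssh_user}:{ssh_public_key}"},
    }


if __name__ == "__main__":
    body = build_create_request(
        folder_id="<folder ID>",
        subnet_id="<subnet ID>",
        image_id="<image ID>",
        ssh_user="<username>",
        ssh_public_key="<SSH key contents>",
    )
    print(json.dumps(body, indent=2))
```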

    Terraform

    If you don't have Terraform, install it and configure the Yandex Cloud provider.

    1. In the configuration file, describe the parameters of resources that you want to create:

      resource "yandex_compute_instance" "vm-1" {
      
        name        = "vm-with-gpu"
        # Platform ID; must match the image selected for the boot disk.
        platform_id = "gpu-standard-v1"
      
        resources {
          cores  = <number of vCPU cores>
          memory = <RAM amount, GB>
          gpus   = <number of GPUs>
        }
      
        boot_disk {
          initialize_params {
            # ubuntu-1604-lts-gpu image from the list of public images.
            image_id = "fdv4f5kv5cvf3ohu4flt"
          }
        }
      
        network_interface {
          subnet_id = yandex_vpc_subnet.subnet-1.id
          # Assign a public IP address automatically.
          nat       = true
        }
      
        metadata = {
          ssh-keys = "<username>:<SSH key contents>"
        }
      }
      
      resource "yandex_vpc_network" "network-1" {
        name = "network1"
      }
      
      resource "yandex_vpc_subnet" "subnet-1" {
        name           = "subnet1"
        zone           = "<availability zone>"
        network_id     = yandex_vpc_network.network-1.id
        # Address range of the subnet (example value).
        v4_cidr_blocks = ["192.168.10.0/24"]
      }
      

      Where:

      • yandex_compute_instance: Description of the VM:
        • name: VM name.

        • platform_id: ID of the platform:

          • gpu-standard-v1 for Intel Broadwell with NVIDIA® Tesla® V100.
          • gpu-standard-v2 for Intel Cascade Lake with NVIDIA® Tesla® V100.
          • gpu-standard-v3 for AMD EPYC™ with NVIDIA® Ampere® A100.
        • resources: The number of vCPU cores, the amount of RAM, and the number of GPUs available to the VM. The values must match the selected platform.

        • boot_disk: Boot disk settings. Specify the ID of the selected image. You can get the image ID from the list of public images.

          For VM instances running on the Intel Broadwell with NVIDIA® Tesla® V100 and Intel Cascade Lake with NVIDIA® Tesla® V100 platforms, special images are available in Cloud Marketplace: Windows 2016 Datacenter GPU (windows-2016-gvlk-gpu), Ubuntu 16.04 LTS GPU (ubuntu-1604-lts-gpu), and Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu). The images have NVIDIA drivers pre-installed.

          For VM instances running on the Intel Ice Lake with NVIDIA® Tesla® T4 platform, an Ubuntu image is available: Ubuntu 20.04 LTS GPU (ubuntu-2004-lts-gpu).

          For VM instances running on the AMD EPYC™ with NVIDIA® Ampere® A100 platform, a special Ubuntu image is available: Ubuntu 20.04 LTS GPU A100 (ubuntu-2004-lts-gpu-a100). We recommend using a standard image from Yandex Cloud. You can also install the drivers on another standard image yourself or create a custom image with pre-installed drivers.

        • network_interface: Network settings. Specify the ID of the selected subnet. To automatically assign a public IP address to the VM, set nat = true.

        • metadata: In the metadata, pass the public key for VM access via SSH. For more information, see VM instance metadata.

      • yandex_vpc_network: Description of the cloud network.
      • yandex_vpc_subnet: Description of the subnet your VM will connect to.
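
The ssh-keys metadata value in the configuration above has the form <username>:<SSH key contents>. If you generate configurations from a script, a small helper can build this value and reject the root username, which this article advises against. This is a sketch only; the function name ssh_keys_metadata is hypothetical, and it checks just the root case rather than every name reserved by the operating system.

```python
def ssh_keys_metadata(username: str, public_key: str) -> dict:
    """Return the metadata mapping with the ssh-keys entry for one user.

    public_key is the contents of the public key file, for example the text
    of ~/.ssh/id_rsa.pub.
    """
    if username == "root":
        raise ValueError("Don't use the username root; use sudo on the VM instead.")
    if not username or not public_key.strip():
        raise ValueError("Both a username and a public key are required.")
    # Strip the trailing newline that key files usually end with.
    return {"ssh-keys": f"{username}:{public_key.strip()}"}


if __name__ == "__main__":
    print(ssh_keys_metadata("<username>", "<SSH key contents>\n"))
```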

      Note

      If you already have suitable resources, such as a cloud network and subnet, you don't need to describe them again. Use their names and IDs in the appropriate parameters.

      For more information about resources that you can create with Terraform, please see the provider documentation.

    2. Make sure that the configuration files are correct.

      1. In the command line, go to the directory where you created the configuration file.

      2. Run the check using the command:

        terraform plan
        

      If the configuration is described correctly, the terminal displays a list of the resources to be created and their parameters. If there are errors in the configuration, Terraform points them out.

    3. Deploy the cloud resources.

      1. If the configuration doesn't contain any errors, run the command:

        terraform apply
        
      2. Confirm that you want to create the resources.

      Afterwards, all the necessary resources are created in the specified folder. You can check that the resources are there with the correct settings using the management console.

    When a VM is created, it is assigned an IP address and hostname (FQDN). This data can be used for SSH access.

    You can make a public IP address static. For more information, see Making a VM's public IP address static.

    See also

    • Learn how to change the VM configuration.
