Installation Tutorial for Google Cloud Platform Virtual Machine

Context

When setting up a virtual machine, building an adequate developer environment can be challenging or fail for obscure reasons. This repository uses the Nix Package Manager as a solution.

This guide has a single prerequisite: a running virtual machine. While this tutorial uses a Google Cloud Virtual Machine as an example, it should work on any Linux machine; please open an issue if you encounter any problems.

The steps are as follows:

  1. Connect to the virtual machine.

  2. Install Nix Package Manager.

  3. Pull the configuration file.

  4. Apply your credentials.

  5. Apply your environment.

Once finished, you should have a working environment with all the basic utilities you need.

To learn how to use this configuration and add project templates, see the Workflow section.

Installation Guidelines

0. Prerequisites

This tutorial assumes:

  • A GCP virtual machine has already been set up.

  • A GPU is available, but the drivers may not be pre-installed from GCP.

  • The disk is already configured.

  • The server is running Ubuntu 22.04 with x86_64 architecture.

1. Connect to the Virtual Machine

gcloud compute ssh ["name of the machine"] --zone ["name of the Zone"] --project ["name of the project"] --tunnel-through-iap

You can configure your gcloud settings so you don’t have to enter the project and zone every time. See the gcloud config set documentation.
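
As a minimal sketch of that setup, the helper below stores a default project and zone with gcloud config set. The function name and the example values are hypothetical; the properties it sets are project and compute/zone.

```shell
# Hypothetical helper: store a default project and zone in the active gcloud configuration.
set_gcloud_defaults() {
  local project="$1"   # your GCP project ID
  local zone="$2"      # the zone of your VM, e.g. us-central1-a
  gcloud config set project "$project" &&
    gcloud config set compute/zone "$zone"
}

# Example call (placeholder values, replace with your own):
#   set_gcloud_defaults my-project us-central1-a
```

Once these defaults are stored, the --zone and --project flags can be left out of the gcloud compute ssh command.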

2. Install Nix Package Manager

The goal here is not to switch completely from Linux to NixOS. We simply install Nix as a package manager, in particular to specify our user configuration with home-manager. A good tutorial about home-manager and the installation steps can be found at evertras/simple-homemanager. If you are only interested in getting your server running, continue with this guide.

Run the following command to install Nix as a package manager in multi-user mode (see NixOS documentation):

sh <(curl -L https://nixos.org/nix/install) --daemon

Next, you must manually add some extra functionality to your Nix config at /etc/nix/nix.conf. In particular, you should:

  • Enable the experimental-features nix-command and flakes:

    • nix-command provides a more convenient CLI for Nix.

    • flakes allows you to work with a flake.nix file, where the core of your environment is specified.

  • Add root and your username (the result of the whoami command) as trusted users.

    • This allows the user to have their own ~/.config/nix/nix.conf. That user-level configuration then takes precedence, and you won’t have to modify the system-wide Nix config by hand anymore, as long as the setting is declared in your flake.nix.

To do so, run:

sudo nano /etc/nix/nix.conf

And add:

experimental-features = nix-command flakes
trusted-users = root hhakem

Save your changes and close the file. Note that sudo is required because /etc/nix/nix.conf is a system-wide file: by default, on your GCP machine you can only change things at the user level, and modifying anything at the root level requires superuser access.
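
If you prefer not to edit the file by hand, the two lines can also be generated for the current user and appended. This is only a sketch: the helper name is hypothetical, and it assumes the file does not already define these two settings.

```shell
# Hypothetical helper: print the two nix.conf lines for a given user
# (defaults to the current user, as reported by whoami).
nix_conf_lines() {
  local user="${1:-$(whoami)}"
  printf 'experimental-features = nix-command flakes\n'
  printf 'trusted-users = root %s\n' "$user"
}

# Preview the lines first:
#   nix_conf_lines
# Then append them as root:
#   nix_conf_lines | sudo tee -a /etc/nix/nix.conf
```

On a systemd-based machine, the Nix daemon may need a restart (sudo systemctl restart nix-daemon) before the new settings are picked up.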

3. Pull the Configuration File

Now that the Nix package manager is available, pull the configuration file from this repo. Nix makes this convenient and allows you to run flakes hosted on git repositories. Here, you will run the #install command specified in apps/x86_64-linux/install as a shell script. It is made available as a Nix app with the mkApp function in flake.nix.

Run the following:

nix --extra-experimental-features 'nix-command flakes' run github:HugoHakem/nix-os.config?ref=main#install

A few things will happen here:

cleanup             # Remove any folder named: "nix-os.config-main.zip" or "nix-os.config-main"
check_installer     # Verify `nix` is available
download_config     # Download the repo as a zip file with `curl` and extract it with `unzip`.
                    # No need to install them beforehand: if they are missing, they are provided through `nix shell`.

cleanup             # Remove the installation folder again, as it is no longer needed.
                    # The main config is now available under `nixos-config/`
check_nvidia        # Check if NVIDIA drivers are installed
prompt_reboot       # Ask whether you want to reboot your machine (recommended if NVIDIA drivers are installed)

When checking for NVIDIA drivers, the installer verifies that nvidia-smi runs correctly. If it does not, the installer offers to install the drivers for you. Please see the note on NVIDIA drivers installation for a detailed explanation. If your machine does not have a GPU, or you prefer to handle this yourself, simply answer no.
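
The same kind of check can be approximated by hand. The sketch below is not the installer's actual code; it only illustrates the idea of testing whether a command exists and exits successfully. The function name is hypothetical.

```shell
# Hypothetical helper: succeed only if the given command exists and runs without error.
# Defaults to nvidia-smi, which exits with a non-zero status when it cannot reach the driver.
driver_ok() {
  local cmd="${1:-nvidia-smi}"
  command -v "$cmd" >/dev/null 2>&1 && "$cmd" >/dev/null 2>&1
}

if driver_ok; then
  echo "NVIDIA driver looks functional"
else
  echo "nvidia-smi is missing or failing; drivers are probably not installed"
fi
```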

4. Apply Your Credentials

Before actually applying your environment configuration, you need to define your credentials. In particular, you will define:

  • GIT_NAME

    • This does not have to be the same as your GitHub username. It will be the name used to sign your commits.

  • GIT_EMAIL

    • This should be the email associated with your GitHub account.

In addition, user is set from the output of whoami.

Run the following commands:

  1. Go into the nixos-config directory:

    cd nixos-config/
    
  2. Run the apply function (which, if you are curious, is a bash script detailed in apps/x86_64-linux/apply):

    nix run .#apply
    

This will override the following lines in flake.nix:

user = "hhakem";
git_name = "Hugo";
git_email = "hhakem@broadinstitute.org";

If git is already configured on the machine, the script reuses that information. If you are not satisfied with the result, you must change those lines in flake.nix manually. You will not be able to change these values from the git CLI, because this is how Nix works: the configuration is immutable, except where you specify otherwise, for perfect reproducibility.
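
To see which values were written, you can print the three assignments back from flake.nix. This is a minimal sketch, assuming the lines keep the name = "value"; shape shown above; the function name is hypothetical.

```shell
# Hypothetical helper: print the user, git_name and git_email assignments from a flake file.
show_credentials() {
  local file="${1:-flake.nix}"
  grep -E '^[[:space:]]*(user|git_name|git_email)[[:space:]]*=' "$file"
}

# Run from inside nixos-config/:
#   show_credentials
# Values currently set in your global git configuration, if any:
#   git config --global user.name
#   git config --global user.email
```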

5. Apply Your Environment

You are now ready to apply your environment configuration. Run this command (while still in the nixos-config/ folder):

nix run .#build-switch

Again, if you are curious, the build-switch app is defined in apps/x86_64-linux/build-switch.

Workflow

Adding New System Packages

  • The standard way to add new packages is by updating modules/shared/packages.nix. Please visit modules/shared/README.md for more details. You will find explanations and examples on how to add new packages, create files directly (which is less common), or configure programs.

  • If you believe the package you want to install is Linux-specific, note that this configuration is intended to be compatible with both macOS and Linux. In such cases, you should modify the modules/linux/ configuration.

  • If a Nix package does not work for some reason, patches should be applied in the overlays directory. Another use case for overlays is to override certain package attributes. For example, after updating your macOS version, some packages might break and require patches.

Additionally, note that hosts/linux.nix exists, but you typically won’t need to modify this file on a day-to-day basis.

After making changes to your configuration, run the following command to apply those changes to your system:

nix run .#build-switch

There are utilities to roll back to a previous version, which can be run with:

nix run .#rollback

You will be prompted to select which generation to roll back to. This can be useful for restoring a working system, but it will not restore your nixos-config files. Therefore, it is recommended to keep your nixos-config/ folder in a Git repository (for example, hosted on GitHub).
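
Putting the folder under version control can be sketched as follows. The helper name, the default path and the commit message are assumptions; adapt them to where your nixos-config/ folder lives.

```shell
# Hypothetical helper: record the current state of the configuration in Git.
snapshot_config() {
  local dir="${1:-$HOME/nixos-config}"
  local message="${2:-Snapshot of working configuration}"
  # Create the repository on first use only.
  [ -d "$dir/.git" ] || git -C "$dir" init
  git -C "$dir" add -A &&
    git -C "$dir" commit -m "$message"
}

# Example call after a successful build-switch:
#   snapshot_config ~/nixos-config "Working config"
```

With such snapshots in place, rolling back a generation and checking out the matching commit keeps the system and the configuration files in step.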

Maintenance

When performing multiple builds or making significant changes to your configuration, some packages may remain in your /nix/store, or previous versions of your home-manager configuration may still be saved for rollback purposes. It is advisable to periodically run:

nix-collect-garbage -d

Then, in your nixos-config/ folder, run:

nix run .#build-switch

If you want to try installing a package with a specific version, you can test it first with:

nix shell nixpkgs#[name-of-the-package]

If you are able to get a certain version of a package using nix shell but not through this configuration, it may be because your flake.lock refers to previous versions of your flake.nix inputs. In that case, try:

nix flake update 
# Optionally, specify the input you want to update

Please refer to the documentation on nix flake update. Be aware that updating your inputs may cause breaking changes due to unsupported options or syntax changes. Only update if necessary. It is recommended to initialize a Git repository for your nixos-config/ so you can revert lock changes if needed.
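
A cautious way to update a single input is sketched below. It assumes a recent Nix release, where the input name is passed to nix flake update as a positional argument (older releases used nix flake lock --update-input instead). The helper name is hypothetical, and nixpkgs is only assumed to be one of this flake's inputs.

```shell
# Hypothetical helper: update one flake input, keeping a copy of the previous lock file.
update_one_input() {
  local input="$1"   # e.g. nixpkgs (assumed to be one of the flake's inputs)
  cp flake.lock flake.lock.bak &&
    nix flake update "$input"
}

# Run from inside nixos-config/:
#   update_one_input nixpkgs
# If the new lock breaks the build, restore the previous one:
#   mv flake.lock.bak flake.lock
```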

Using Templates

The goal of setting up your environment is ultimately to work on coding projects. In the templates/ folder, you will find a template for a Python Machine Learning project to get started.

Connecting to the VM with VSCode

Connecting to the Virtual Machine through VSCode can be done in two ways:

  • Code Tunnel is great and requires minimal setup, but every time you turn off the VM, you’ll have to repeat the connection steps.

  • Remote SSH is best if you want a one-time setup and is actually not difficult to put in place.

Code Tunnel

This requires a GitHub or Microsoft account.

  • Connect to the Virtual Machine:

    gcloud compute ssh ["name of the machine"] --zone ["name of the Zone"] --project ["name of the project"] --tunnel-through-iap
    
  • Create a background session with your preferred tool (tmux or screen), for example by starting a new tmux session.

  • Launch the code tunnel:

    code tunnel
    

    Follow the instructions from there; you may have to visit a web link, connect to your GitHub or Microsoft account, and enter an authentication code.

  • Exit the process (without killing it):

    • tmux: <Ctrl + b> d

    • screen: <Ctrl + a> d

  • You are now ready to connect with VSCode:

    • Open the command panel with <Cmd + Shift + P>

    • Enter: Remote-Tunnels: Connect to Tunnel...

    • You can now use your GitHub or Microsoft account to connect.
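
The tmux part of these steps can be sketched as follows. The session name code-tunnel is an arbitrary choice and the helper name is hypothetical.

```shell
# Hypothetical helper: run `code tunnel` inside a named tmux session and attach to it.
start_tunnel_session() {
  local session="${1:-code-tunnel}"
  # Reuse the session if it already exists, otherwise start it detached.
  tmux has-session -t "$session" 2>/dev/null ||
    tmux new-session -d -s "$session" 'code tunnel'
  # Attach to follow the login instructions; detach later with <Ctrl + b> d.
  tmux attach-session -t "$session"
}

# Example call on the VM:
#   start_tunnel_session
```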

Remote SSH

This tutorial is inspired by this blog post.

I will present both the standard way to add a new Host in your SSH config and my preferred way. If you are doing this for the first time, I recommend following the standard way first and then my preferred way. Otherwise, jump to My Preferred Way.

Standard way:

When connecting to the VM, you usually run:

gcloud compute ssh ["name of the machine"] --zone ["name of the Zone"] --project ["name of the project"] --tunnel-through-iap

If you add the --dry-run flag, this will return the actual ssh command, which will look like this (simplified with variables in angle brackets <...>):

/usr/bin/ssh -t -i ~/.ssh/google_compute_engine -o CheckHostIP=no -o HashKnownHosts=no -o HostKeyAlias=compute.<VM_ID> -o IdentitiesOnly=yes -o StrictHostKeyChecking=yes -o UserKnownHostsFile=~/.ssh/google_compute_known_hosts -o "ProxyCommand <PYTHON-BIN> -S -W ignore <GCLOUD.py> compute start-iap-tunnel '<VM_NAME>' '%p' --listen-on-stdin --project=<PROJECT_NAME> --zone=<ZONE> --verbosity=warning" -o ProxyUseFdpass=no <USER>@compute.<VM_ID>

This SSH command can be added in VSCode so that it recognizes your machine via SSH. However, the command is not formatted properly for VSCode:

  • /usr/bin/ssh should be ssh

  • "ProxyCommand ..." should be ProxyCommand="..."

The following command applies both fixes:

gcloud compute ssh ["name of the machine"] --zone ["name of the Zone"] --project ["name of the project"] --tunnel-through-iap --dry-run | \
sed 's/-o "ProxyCommand \([^"]*\)"/-o ProxyCommand="\1"/' | \
sed 's/\/usr\/bin\/ssh/ssh/'
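
To see what the two sed expressions change, they can be tried on a shortened, made-up stand-in for the --dry-run output. The sample string and the function name are hypothetical; the sed expressions are the ones used above.

```shell
# The same two sed expressions, wrapped in a function that reads the command on stdin.
fix_ssh_command() {
  sed 's/-o "ProxyCommand \([^"]*\)"/-o ProxyCommand="\1"/' |
    sed 's/\/usr\/bin\/ssh/ssh/'
}

# Shortened, made-up stand-in for the --dry-run output.
sample='/usr/bin/ssh -t -o "ProxyCommand python3 gcloud.py compute start-iap-tunnel my-vm %p" user@compute.123'

printf '%s\n' "$sample" | fix_ssh_command
# prints: ssh -t -o ProxyCommand="python3 gcloud.py compute start-iap-tunnel my-vm %p" user@compute.123
```

With the real gcloud compute ssh ... --dry-run output, the same pipeline yields the command in the form VSCode expects.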

You can now copy the output and do the following in VSCode:

  • Open the command panel with <Cmd + Shift + P>

  • Enter: Remote-SSH: Add New SSH Host...

  • Paste the output you copied.

This will generate an entry in your ~/.ssh/config file like:

Host compute.<VM_ID>
  HostName compute.<VM_ID>
  IdentityFile ~/.ssh/google_compute_engine
  CheckHostIP no
  HashKnownHosts no
  HostKeyAlias compute.<VM_ID>
  IdentitiesOnly yes
  StrictHostKeyChecking yes
  UserKnownHostsFile ~/.ssh/google_compute_known_hosts
  ProxyCommand <PYTHON-BIN> -S -W ignore <GCLOUD.py> compute start-iap-tunnel '<VM_NAME>' '%p' --listen-on-stdin --project=<PROJECT_NAME> --zone=<ZONE> --verbosity=warning
  ProxyUseFdpass no
  User <USER>

You are now ready to connect to your machine with SSH in VSCode:

  • Open the command panel with <Cmd + Shift + P>

  • Enter: Remote-SSH: Connect to Host...

  • Select the Host variable of your VM, e.g., compute.<VM_ID>

Additionally, you no longer have to run the traditional gcloud compute ssh ... command. You can simply do:

ssh <USER>@compute.<VM_ID>

My Preferred Way

The previous method works, but there is one underlying problem when using the google-cloud-sdk provided by Nix: the <PYTHON-BIN> and <GCLOUD.py> paths will look like this:

<PYTHON-BIN> : /nix/store/90myxg4ckim260mw8mv741b4knykzx50-python3-3.12.9-env/bin/python
<GCLOUD.py> : /nix/store/h618r2jp07djzgsh7ymbpgy6vy1yvwcl-google-cloud-sdk-515.0.0/google-cloud-sdk/lib/gcloud.py

If you hardcode these paths in your SSH config, there is a high chance that, if you update your nixos-config, those paths will change (due to hash or version changes). You don’t want to always update your ~/.ssh/config whenever you update your nixos-config. Ideally, based on the gcloud you have, you should be able to retrieve where the <PYTHON-BIN> and <GCLOUD.py> are located.

Fortunately, this is specified under:

gcloud info

This displays a long config for gcloud, and in particular, you can read:

Python Location: [<PYTHON-BIN>]
...
Installation Root: [<GCLOUD_PATH>] # Note: <GCLOUD.py> = <GCLOUD_PATH>/lib/gcloud.py

Hence, the idea of creating a custom SSH Host config, where the ProxyCommand points to a script that fetches <PYTHON-BIN> and <GCLOUD_PATH> from gcloud info and then composes the actual ProxyCommand.

This is what modules/shared/config/gcp-ssh-script.sh does. The script should already be executable, but if it is not, make it so by running the following from your home directory:

chmod +x nixos-config/modules/shared/config/gcp-ssh-script.sh
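
To illustrate the idea, here is a minimal sketch of how such a script could derive the ProxyCommand from gcloud info. It is not the content of gcp-ssh-script.sh: the function names and the sample paths are hypothetical, and it assumes the Python Location and Installation Root lines have the bracketed form shown above.

```shell
# Hypothetical helper: print the value inside [ ] for a labelled line of `gcloud info` (read on stdin).
info_value() {
  sed -n "s/^[[:space:]]*$1: \[\([^]]*\)\].*\$/\1/p" | head -n 1
}

# Hypothetical helper: compose the start-iap-tunnel command from the `gcloud info` text.
build_proxy_command() {
  local info="$1" vm="$2" project="$3" zone="$4" port="$5"
  local python_bin sdk_root
  python_bin="$(printf '%s\n' "$info" | info_value 'Python Location')"
  sdk_root="$(printf '%s\n' "$info" | info_value 'Installation Root')"
  printf '%s -S -W ignore %s/lib/gcloud.py compute start-iap-tunnel %s %s --listen-on-stdin --project=%s --zone=%s --verbosity=warning\n' \
    "$python_bin" "$sdk_root" "$vm" "$port" "$project" "$zone"
}

# Made-up excerpt standing in for the real `gcloud info` output.
sample_info='Python Location: [/nix/store/example-python/bin/python]
Installation Root: [/nix/store/example-sdk/google-cloud-sdk]'

build_proxy_command "$sample_info" my-vm my-project us-central1-a 22
# prints: /nix/store/example-python/bin/python -S -W ignore /nix/store/example-sdk/google-cloud-sdk/lib/gcloud.py compute start-iap-tunnel my-vm 22 --listen-on-stdin --project=my-project --zone=us-central1-a --verbosity=warning
```

In a real script, the first argument would be "$(gcloud info)" and the resulting command would be executed rather than printed, so the paths are resolved again on every connection.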

Here is the suggested SSH Host config:

Host <HOST_NAME> 
  HostName compute.<VM_ID>
  IdentityFile ~/.ssh/google_compute_engine
  HostKeyAlias compute.<VM_ID>
  IdentitiesOnly yes
  StrictHostKeyChecking yes
  CheckHostIP no
  UserKnownHostsFile ~/.ssh/google_compute_known_hosts
  ProxyCommand ~/nixos-config/modules/shared/config/gcp-ssh-script.sh <VM_NAME> <PROJECT> <ZONE> %p
  User <USER>

Note: The “%p” in the ProxyCommand is very important: SSH replaces it with the port to connect to on the remote host (port 22 by default, see documentation). Some variables are omitted compared to the SSH Host config above, as they are defaults (see documentation).

You only need to specify the following:

  • <HOST_NAME>: any name you want to use to refer to your VM.

The rest of the info can be found in the details of the VM instance on the Google Cloud Console.

  • <VM_ID>

    • You can also run:

      gcloud compute instances describe ["name of the machine"] --zone ["name of the Zone"] --project ["name of the project"] --format="value(id)"
      
  • <VM_NAME>

  • <PROJECT>

  • <ZONE>

You are now ready to connect to your VM from VSCode or using a simple SSH command as