Format Data Disks with Azure VM Custom Script Extension for Linux

bash-add-fstab
There is often a need to add data disks to your virtual machines in Microsoft Azure to store… well, data, and there are a lot of good reasons to place data on a separate data disk. Adding data disks is a pretty easy and straightforward process; for more information about Azure disks and images, see the documentation. Attaching new disks to a VM can be accomplished through the management portal or one of the various automated approaches: the REST API, PowerShell, or the cross-platform Node.js-based scripting API (xPlat). Once the new disk is provisioned and attached to your VM, you then need to initialize it: partition it, format it, mount it, and ensure it is mounted again when the system boots. This is pretty easy to do by following a series of steps that the Azure folks have nicely documented in How to Attach a Data Disk to a Linux Virtual Machine. Doing this by hand is fine when you configure one virtual machine, maybe snapshot it and go from there, and rarely need to repeat the steps. Still, steps like this are begging for automation, so one Friday evening we put together a quick script. The script is currently available as a gist on my GitHub account; I have made a number of improvements that still need to be tested before I publish the updates there. Please have a look; any suggestions are appreciated.

autopart.sh – azure disk initialization script
https://gist.github.com/trentmswanson/9c22bb71182e982bd36f

The script is a simple Linux shell script and has only been tested with the recent Ubuntu image on Azure (Ubuntu 14.04 LTS). I have no idea whether or how it works with other distributions or versions. Basically, the script looks at the attached disks and, for any that have not been partitioned, it partitions, formats, and mounts them and adds an fstab entry. You can run the script multiple times; it should only initialize new drives and do nothing if there are no new drives to initialize.
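To make the "safe to run more than once" idea concrete, here is a minimal sketch of the kind of checks such a script can rely on. It is not the code from the gist: the function names, the ext4 filesystem type, and the defaults,nofail mount options are all assumptions made for illustration, and the destructive steps (partitioning, formatting, mounting) are deliberately left out.

```shell
# Hypothetical helpers, not taken from the gist.

# Succeeds if the disk already has a partition node, e.g. /dev/sdc1 for /dev/sdc.
# $1 = disk device path
is_partitioned() {
    for part in "$1"[0-9]*; do
        [ -e "$part" ] && return 0
    done
    return 1
}

# Prints the fstab line for a filesystem UUID and mount point.
# ext4 and "defaults,nofail" are assumed values, not the gist's.
# $1 = filesystem UUID, $2 = mount point
fstab_line() {
    printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$1" "$2"
}

# Appends the fstab line only when the UUID is not listed yet,
# so running it again does not create duplicate entries.
# $1 = filesystem UUID, $2 = mount point, $3 = fstab path (default /etc/fstab)
add_fstab_entry() {
    fstab="${3:-/etc/fstab}"
    if ! grep -q "^UUID=$1[[:space:]]" "$fstab" 2>/dev/null; then
        fstab_line "$1" "$2" >> "$fstab"
    fi
}
```

In a full script, a loop over the attached disks would skip any disk for which is_partitioned succeeds and call add_fstab_entry only after the new partition has been formatted and mounted.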

Adding Data Disks

Once we have a virtual machine with data disks attached, we need to execute our script to initialize the disks. There are a number of ways to run the script on the remote machine. For the most part I would just execute scripts from my local development machine against the remote VM using something like: ssh user@vmext.cloudapp.net 'sudo bash -s' < ~/Documents/autopart.sh. You could also use plink if you are working from a Windows machine. You could copy the script over using scp or sftp, or download it on the VM from GitHub or wherever you want using curl or wget. Using ssh, scp, sftp, plink, wget, or curl is easy enough, but we also have custom script extensions. Integrating this into my PowerShell or xPlat provisioning script was easy enough with the Azure VM CustomScript extension. The extension also makes it easy for me to download and run scripts from a secure BLOB storage account. Let’s have a look at this process.
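As a rough sketch, two of the options above can be wrapped in small shell functions. The function names are made up for illustration, and the host and URL you pass in are your own; curl is assumed to be installed on the VM for the second variant.

```shell
# Hypothetical wrappers; names, host and URL are placeholders.

# Stream a local script to the VM and run it as root without copying it first.
# $1 = user@host, $2 = path to the local script
run_script_over_ssh() {
    ssh "$1" 'sudo bash -s' < "$2"
}

# Have the VM download the script itself and run it as root.
# $1 = user@host, $2 = URL of the script
run_script_from_url() {
    ssh "$1" "curl -fsSL '$2' | sudo bash -s"
}
```

For example, run_script_over_ssh user@vmext.cloudapp.net ~/Documents/autopart.sh is equivalent to the one-liner above. The first form keeps the script on your machine; the second is handy when the script already lives at a URL the VM can reach.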

Create a Linux VM and attach disks

To get started you will need to provision a Linux VM and attach some disks (I have only tested this with the recent Ubuntu image in Azure). This can be accomplished through the management portal, scripts, or the API, however you like. For information on creating a Linux virtual machine in Azure and attaching data disks, have a look at the documentation to Create a Virtual Machine Running Linux.
Make sure the default ‘Install the VM Agent’ option remains checked.
new-linux-virtual-machine
Attach some data disks to your virtual machine.
virtual-machine-data-disks

Initialize the disks

We have a new virtual machine with some data disks that need to be initialized so that we can start putting data on them. Instead of working through a bunch of steps, we simply run our script whenever we add a new data disk. As mentioned previously, this can be done via SSH, but here we will have a look at the custom script VM extension. We first need to place the script somewhere the extension can download it from, and since it’s already a gist on GitHub we can simply use the raw URL for the gist. You might want to fork it or move it to some other location if you are going to continue using it. The xPlat scripts do not support VM extensions today, though support is coming soon, so we will use PowerShell. You can set up your PowerShell environment for working with the Azure management API however you like, but I tend to stick with management certificates and load them in using something like the following. I usually have one or more console windows open with this state, or shortcuts that open one for each of my various environments.
powerhsell-subscription
Our script then needs to simply:

  1. Get a reference to the virtual machine
  2. Set script files location and command to execute settings
  3. Set some vm extension settings
  4. Run the script, grab a drink and celebrate

execute-vm-script
I used the ‘raw’ URL of the gist, and the script extension runs it with superuser permissions, as this script requires.
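To illustrate step 2, the sketch below builds the settings as a small JSON document. It assumes the Linux CustomScript extension reads its public settings from the keys fileUris and commandToExecute; the function name and the example.com URL are placeholders, so substitute the raw URL of your own copy of the script.

```shell
# Hypothetical helper; the URL below is a placeholder, not the gist's raw URL.

# Prints public settings JSON for the extension.
# Assumes the "fileUris" / "commandToExecute" keys; performs no JSON escaping,
# so the arguments must not contain double quotes or backslashes.
# $1 = URL of the script to download, $2 = command to run after the download
custom_script_settings() {
    printf '{"fileUris": ["%s"], "commandToExecute": "%s"}\n' "$1" "$2"
}

custom_script_settings "https://example.com/autopart.sh" "bash autopart.sh"
```

The resulting string is what a provisioning script would hand to the extension as its public configuration; the command runs in the directory the file was downloaded to, which is why a bare file name is used.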

If you happen to see the exception “Provision Guest Agent must be enabled on the VM object before setting IaaS VM Access Extension.” and you know the correct version of the agent is installed, you can get around this today by setting $vm.GetInstance().ProvisionGuestAgent = $true. I’m not sure whether this is a bug in the Azure PowerShell cmdlets, but it seems to work for me.

Have a look

Let’s have a look at our instance now. You can query for the results of the script execution using the PowerShell commands or xPlat scripts. We will use SSH here; the newly attached data disks should be initialized, and we should see a new folder for each data disk at whatever location we configured.
I see my data disks, this is good
mediafolder-list
Entries have been added to /etc/fstab
fstab-view
Nice, with minimal effort our drives are partitioned, formatted, and configured to be mounted when the system boots.
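The same check can be scripted. The sketch below lists fstab mount points under a base directory that are not currently mounted; the function name is hypothetical, and the base directory is whatever location you configured (the /datadisks used in the comment is only an example).

```shell
# Hypothetical check, not part of the gist.

# Prints every fstab mount point under a base directory that is not mounted.
# $1 = base directory (for example /datadisks)
# $2 = fstab path (default /etc/fstab)
# $3 = mounts table path (default /proc/mounts)
unmounted_fstab_entries() {
    # Select non-comment fstab lines whose mount point starts with the base directory.
    awk -v base="$1" '$1 !~ /^#/ && index($2, base) == 1 { print $2 }' "${2:-/etc/fstab}" |
    while read -r mp; do
        # Report the mount point if the mounts table has no matching entry.
        awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "${3:-/proc/mounts}" ||
            echo "$mp"
    done
}
```

An empty result means every data disk listed in fstab under that directory is mounted.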

Logs

The VM extension logs information that can be useful while getting things set up, in case anything goes wrong with loading or executing the script, such as a wrong URL, Windows line endings, or even a bug in the script.
extension-logs-folder
extension-log
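Windows line endings are easy to rule out before uploading the script. The following sketch uses two hypothetical helper functions: one detects carriage returns in a file and the other prints a cleaned copy.

```shell
# Hypothetical helpers for checking a script before publishing it.

# Succeeds if the file contains at least one carriage return (CR) character.
# $1 = path to the script
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Prints the file with all carriage returns removed.
# $1 = path to the script
strip_crlf() {
    tr -d '\r' < "$1"
}
```

For example: has_crlf autopart.sh && strip_crlf autopart.sh > autopart.unix.sh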

Sum of things

In parting, we have created and shared a useful script for initializing data disks for Linux virtual machines in Microsoft Azure, and looked at some ways of running the script on our VM, including the VM CustomScript extension. I for one am excited about the possibilities VM extensions create, especially the CustomScript extension. Have a look at the script and let me know if this was helpful or if you have any suggestions. I have integrated this with some other scripts and have nearly completely automated the provisioning of an Elasticsearch cluster on Azure. This will be covered in an upcoming post.

Trent Swanson