Metering a SaaS Multi-tenant Solution on Azure: Blob Sizes

In the Cloud Ninja sample, we tried to tackle one of the most frequently asked questions about running a multi-tenant application in the SaaS model: how to keep track of resource usage.

This post is a longer explanation of why we simply used the Length property of the blob itself to approximate the usage.
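As a rough illustration of the idea, and not the Cloud Ninja code itself, the Python sketch below sums a length attribute over the blobs that belong to one tenant. The `Blob` record and the `estimate_tenant_usage` function are hypothetical stand-ins; a real implementation would read each blob's length from whatever storage client library it uses.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Blob:
    """Hypothetical stand-in for one entry of a blob listing."""
    name: str
    length: int  # content length in bytes, as reported by the storage service


def estimate_tenant_usage(blobs: Iterable[Blob]) -> int:
    """Approximate a tenant's storage usage as the sum of its blob lengths."""
    return sum(blob.length for blob in blobs)


# Example: blobs listed from one tenant's private and public containers.
tenant_blobs = [
    Blob("logo.png", 20_480),
    Blob("attachment.pdf", 1_048_576),
]
usage_bytes = estimate_tenant_usage(tenant_blobs)
```

The sketch deliberately ignores per-blob and per-container overhead; the rest of the post explains why that is a reasonable simplification.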

For starters, please see this post by the storage team to better understand how Azure Storage billing works.

One important caveat: we can only estimate the usage and cannot guarantee that it matches the figure Azure billing uses to generate the bills, because Microsoft has not yet published any official billing APIs.

The first question is: what is the best way to measure storage usage? The answer is that it really depends. So, starting with that question, let’s look at the situation in the Cloud Ninja sample.

In our implementation we have two containers for each tenant, one private and one public. Public containers are intended to store things like logos (as implemented in the sample). Private containers are for blobs containing items like attachments to the tasks (not implemented in the sample).

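Let’s start with the containers. The article referenced above computes the storage consumed by a container as:

48 bytes + Len(ContainerName) x 2 bytes + For-Each Metadata [3 bytes + Len(MetadataName) + Len(Value)] + For-Each Signed Identifier [512 bytes]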

The container names for Cloud Ninja start with tntp_ or tnts_ (for private or shared) and continue with the tenant ID. In Azure Storage, container names can be from 3 to 63 characters long (please see here). Let’s go with the longest possible name for a container. In the solution, we do not have any custom metadata, and only the application itself accesses the blobs, so each container carries one signed identifier. This makes:

48 + 63 x 2 + 0 + 512 = 686 bytes per container, and 1,372 bytes per tenant.
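The same arithmetic can be written as a small Python sketch. The constants (48 bytes base, 2 bytes per name character, 512 bytes per signed identifier) are taken from the calculation above; the function name and the `metadata_bytes` parameter are illustrative assumptions, with metadata passed in as a precomputed byte count because the sample stores none.

```python
def container_overhead_bytes(name_length: int,
                             metadata_bytes: int = 0,
                             signed_identifiers: int = 0) -> int:
    """Estimate the billable overhead of one container, in bytes."""
    base = 48                                  # fixed per-container cost
    name = 2 * name_length                     # 2 bytes per name character
    identifiers = 512 * signed_identifiers     # 512 bytes per signed identifier
    return base + name + metadata_bytes + identifiers


# Worst case from the text: 63-character name, no metadata, one signed identifier.
per_container = container_overhead_bytes(63, signed_identifiers=1)
per_tenant = 2 * per_container  # one private and one public container
```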

Assuming the sample Task Application becomes very popular and we end up with 10,000 customers in total, the containers themselves consume approximately 0.01 GB of storage. If you have a quick look at the current pay-as-you-go pricing (the most expensive plan), you can see that you will be charged $0.15 per GB stored per month. I will stop doing the arithmetic here.
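For readers who do want to finish the arithmetic, here is a minimal sketch using the figures quoted above (1,372 bytes per tenant, 10,000 tenants, $0.15 per GB per month). Treating a GB as 1024^3 bytes is an assumption made for this example.

```python
BYTES_PER_TENANT = 1_372       # two worst-case containers, from the text
TENANTS = 10_000               # hypothetical customer count, from the text
PRICE_PER_GB_MONTH = 0.15      # pay-as-you-go price quoted in the text
BYTES_PER_GB = 1024 ** 3       # assumption: binary gigabytes

total_bytes = BYTES_PER_TENANT * TENANTS
total_gb = total_bytes / BYTES_PER_GB
monthly_cost = total_gb * PRICE_PER_GB_MONTH

print(f"{total_bytes:,} bytes = {total_gb:.4f} GB, "
      f"about ${monthly_cost:.4f} per month")
```

The container overhead for all tenants together costs a small fraction of a cent per month, which is why it can be left out of the metering.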

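Let’s move on to the blobs themselves. Cloud Ninja uses only block blobs, not page blobs. For block blobs, the previously mentioned post gives this formula:

124 bytes + Len(BlobName) x 2 bytes + For-Each Metadata [3 bytes + Len(MetadataName) + Len(Value)] + 8 bytes + number of committed and uncommitted blocks x Block ID size in bytes + SizeInBytes(data in unique committed data blocks stored) + SizeInBytes(data in uncommitted data blocks)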

Blob names cannot be longer than 1,024 characters according to the MSDN documentation. Our tenants could therefore upload files with really long names, but let’s be reasonable: they will likely prefer human-readable names, so we assume file names of around 50 characters.

We do not maintain any metadata on the blobs themselves either, and the API we use commits the data as soon as the blob is uploaded. The maximum size of a block is 4 MB. Assuming that most of the files customers upload to the containers will be smaller than 4 MB, we can conclude that each blob will occupy only one block. According to the storage services whitepaper, the maximum size of a block ID is 64 bytes.

To dissect the formula above, we have overhead + size in bytes for committed blocks + size in bytes for uncommitted blocks.

Overhead becomes:  124 + 50 x 2 + 0 + 8 + 1 x 64 = 296 bytes.

Since we will not have any uncommitted blocks, and our files will be smaller than 4 MB most of the time, the estimated size occupied by a blob becomes 296 bytes + its length in bytes. Next to the length, the overhead can safely be ignored, which is why the Length property alone is a good enough approximation.
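To see how small the overhead is next to the content, the calculation can be sketched in Python. The constants come from the figures in this post (124 bytes base, 2 bytes per name character, 8 bytes fixed, 64-byte block IDs, 4 MB blocks); the function names are illustrative, and the assumption that a blob is split into as few blocks as possible is a simplification.

```python
MAX_BLOCK_BYTES = 4 * 1024 * 1024  # 4 MB block limit quoted in the text
BLOCK_ID_BYTES = 64                # worst-case block ID size from the text


def block_count(content_length: int) -> int:
    """Blocks needed, assuming full-size blocks and at least one block."""
    return max(1, (content_length + MAX_BLOCK_BYTES - 1) // MAX_BLOCK_BYTES)


def blob_overhead_bytes(name_length: int,
                        blocks: int = 1,
                        metadata_bytes: int = 0) -> int:
    """Estimate the per-blob overhead of a block blob, in bytes."""
    return 124 + 2 * name_length + metadata_bytes + 8 + blocks * BLOCK_ID_BYTES


def estimated_blob_bytes(name_length: int, content_length: int) -> int:
    """Overhead plus committed content; uncommitted blocks are assumed absent."""
    blocks = block_count(content_length)
    return blob_overhead_bytes(name_length, blocks) + content_length


overhead = blob_overhead_bytes(50)            # 50-character name, one block
one_mb = 1024 * 1024
overhead_share = overhead / estimated_blob_bytes(50, one_mb)
```

For a 1 MB file with a 50-character name, `overhead_share` is well below a tenth of a percent of the estimated size.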

Engineering is an art of approximations, and there are no one-size-fits-all solutions. Cloud Ninja is just an attempt to demonstrate one possible answer to the kind of questions we have been hearing from our customers.

Let your days be filled with building new things!
