🤔 Explain Hot & Cold Storage (like S3 and Glacier)

By Eugene Cheah | March 23, 2019

With so many distributed systems disrupted the past few days. I thought it would be a good idea to expand on the topic as a series.

#hugops to all those sysadmins who work on these things.

You might also like: 🤔 Explain Distributed Storage - and how it goes down for GitHub / UIlicious / cloud / etc - DEV Community 👩‍💻👨‍💻

Background Context

If you go around distribution storage talks, conferences, or articles, you may commonly hear the following two terms:

hot
cold

What makes it extremely confusing is that these terms are used inconsistently across technology platforms.

And due to the complete absence of an official definition: SSD, for example, can be considered "hot" in one, and "cold" in another.

Of which at this point, even experienced sysadmins who are unfamiliar with the terms would go...

So what the heck is Hot & Cold ???

With the lack of an official definition, I would suggest the following...

Data temperature terms (hot / warm / cold), are colourful terms used to describe data or storage hardware, in relative terms of each other, along with a chosen set of performance criteria (Latency, IOPS, and/or Bandwidth) on the same platform (S3, Gluster, etc).
In general, the "hotter" it is, the "faster" it is
~ Eugene Cheah

Important to note: these temperature metrics are used as relative terms for a platform. The colorful analogy of red hot and cold blue is not interchangeable across technology platforms.

The following table describes in very inaccurate terms, which would be considered hot or cold for a specific platform technology. Along with common key criteria (besides cost, as that is always a criterion).

Platform	"Hot"	"Cold"	Common Key Criteria
High-Speed Caching (For HPC computing)	Computer RAM	SSD	Data Latency
File Servers	SSD	HDD	IOPS
Backup Archive	HDD	Tape	Bandwidth

Hence due to the relative difference of terms and criteria. I re-emphasize, do not mix these terms across different platforms.

For example, a very confusing oddball: Intel Optane memory, which is designed specifically for high-speed low latency data transfer, and is slower in total bandwidth, when compared to SSD.

This makes it a good candidate for high capacity caching server. But "in most workloads", is ill-suited for file servers, due to its lower bandwidth, and high price per gig, when compared to SSD.

If you're confused by Optane, its alright, move on - almost everyone including me is.

Why not put everything into RAM or SSD?

After all, ain't you a huge "keep it simple stupid" supporter?

For datasets below 1 TB, you could and probably should for simplicity, choose a single storage medium and stick with it. As mentioned in the previous article in the series - Distributed systems do have performance overheads that only makes it worth it at "scale" (especially one with hybrid storage)

However, once you cross that line, hot and cold storage, slowly becomes a serious consideration, after all...

In general, the faster the storage is in 2019, the more expensive it gets. The following table shows the approximate raw material cost for the storage components without their required overheads (like motherboards, networking, etc).

Technology	Reference Type	Price per GB	Price per TB	Price per PB
ECC RAM	16 GB DDR3	$2.625	$2,625	$2.6 million
SATA SSD	512 GB SSD	$0.10	$100	$100,000
Hard Drive	4TB 5900RPM	$0.025	$25	$25,000
Tape Storage	6TB Uncompressed LTO 7	$0.01	$10	$10,000

"Reference Type" pricing for each technology is already intentionally chosen for the lowest price per GB, during 2017-2018

When it comes to Terabyte or even Petabyte worth of data, the raw cost in ram sticks alone would be $2.6k per Terabyte or $2.6 million per Petabyte.

This is even before an easy 10x+ multiplier for any needed equipment (eg. motherboard), infrastructure, redundancy, replacements, electricity, and finally its manpower that cloud providers give "as a service".

And since I'm neither bill nor mark, it's a price out of reach for me. Or most of us for that matter.

As a result: for practical reasons, once the data set hits a certain scale, most sysadmins of distributed systems will use some combination of "hot" storage specced out to the required performance workload (which is extremely case by case). With a mix of colder storage, to cut down on storage cost.

In such a setup, the most actively used files, which would need high performance, would be in quick and fast hot storage. While the least actively used file would be in slow cheap cold storage.

Such a mix of hot and cold storage can be done on both a single server, or (as per this series) on a much larger scale across multiple servers, or even multiple nations. Where one can take advantage of much cheaper servers in Germany or Frankfurt, where electricity and cooling are cheaper.

It is pretty cool, that UIlicious cold storage is literally being cooled by snow

How is the data split across fast and slow then?

Distribution of data can be performed either by the storage technology chosen itself is supported. Such as GlusterFS, or ElasticSearch. Where it can be done automatically, based on its predefined configuration.

Cloud storage technologies, such as S3, similarly have the configuration to automatically migrate such storage into a "colder" state, according to a predefined age rule.

Alternatively, the distribution of "hot & cold" storage could also be done within the application, instead of single platform technology. Which would let it have very fine-grained control over as it moves its storage workload from one system or another according to its expected use case.

In such a case the term "hot or cold" would refer to how the application developers plan this out with their DevOps and sysadmins. A common example would be storing hot data in a network file server or even SQL database, with cold data inside S3 like blob storage.

Why do some articles define it (confusingly) by the age of data then?

Because this would represent a commonly, over-generalized use case. Nothing more, nor less.

Statistically speaking, and especially over a long trend line, your users are a lot more likely to read (or write) a recently created file. Then an old 5+ year file.

You can easily see this in practice if you have a really good internet connection. By randomly (for the first time in a long time) jumping back in facebook or google to open up images that are over 5+ years old. And compare the load times to your immediate news feed.

Though if you are unlucky, while everyone's "hot" working files are loading blazing fast, you can find that your year-old "cold" files are sadly gone for a few hours.

This is unfortunately what happened to me recently during the Google outage...

While such a rule of thumbs like "if older than 31 days, put into cold storage", is convenient...

Do not fall for an easy rule of thumbs (for distributed storage, or any technology)

The flaw is that it completely ignores a lot of the nuances that one should be looking into for one use case.

Such as situations where you would want to migrate cold data to hot data. Or to keep it in hot data way longer than a simple 31 days period. Which is very workload-specific.

Small nuances if not properly understood, and planned for, can come back to either bring services down to a crawl, when heavy "hot" data like traffic hits the "cold" data storage layer. Or worse, for cloud blob storage, an unexpected giant bill.

And perhaps the worse damage is done when: starting new learners down a confusing wrong path of learning.

I have met administrators who were obsessed over configuring their hot/cold structure purely by days, without understanding their workload in detail.

OK, how about showing some example workloads then?

For example, uilicious.com heaviest file workload consists of test scripts and results, where the user writes and runs 1000's test scripts like these...

// Lets go to dev.to
I.goTo("https://dev.to")

// Fill up search
I.fill("Search", "uilicious")
I.pressEnter()

// I should see myself or my co-founder
I.see("Shi Ling")
I.see("Eugene Cheah")

... which then get shared publicly on the internet, or team issue tracker 😎.

The resulting workload ends up being the following :

When a test result is viewed only once (when executed), it is very likely to be never viewed again.
When a test result is shared and viewed multiple times (>10) in a short period, it means it's being "shared" within a company or even social media. And will get several hundred or even thousands more requests soon after. In such a case, we would want to immediately upgrade to "hot" data storage. Even if the result was originally over a month+ old.
When used with our monitoring scheduling feature. Which automates the execution of scripts. Most people view such "monitoring" test results, only within the first 2 weeks. And never again soon after.

As a result, we make tweaks to glusterfs (our distributed file store) specifically to support such a workload. In particular to the logic in moving data from hot to cold, or vice visa.

Other notable examples of workloads are ...

Facebook like "On This Day", showcasing images from the archive in the past
Accounting and audit of company data, at the start of the year, for previous year data
Seasonal promotion content (New Year, Christmas, etc)

Because good half of the workloads I have seen so far would benefit from unique configurations that would not make sense to other workloads. And even at times fly against the simple rule of thumb.

My personal advice is to stick to the concept of hot and cold, being fast data or slow data accordingly, allowing you to better visualize your workload. Instead of date creation time, which oversimplifies the concept.

A quick reminder of one of my favorite quotes of all time:

Alternatively : There is no silver bullet

In the end, it is important to understand your data workload, especially as you scale into the petabyte scale.

So why didn't we just call them Fast N Slow

> And don't get me started on how much I hate the term "warm" 😡

Oh damn, I so wish they did so. And we would be never needing this whole article.

However, while I could not trace the true history of the term (this would be an interesting topic on its own)

If you rewind back before the era of SSD, and to the predecessor to SSD storage for the early '20s.

These high-speed 15,000 RPM HDD are so blazing hot, they needed their own dedicated fans once you stack them side by side.

These hot drives would run multiple laps around the slower cheaper drives spinning at 5,400 RPM, which served as cold storage.

Hot data was literally, blazing hot spinning wheels of data.

btw: these collectables go for $2,000 each

For random context, your car wheels go at around 1,400 RPM when speeding along 100 mp/h (160 km/h). So imagine 10x that - it's a speed that you wouldn't want your fingers to be touching. (not that you can, all the hard-drive images you see online are lies, these things come shipped with covers)

The temperature terms are also used to describe different levels of backup sites, a business would run their server infrastructure with. With hot backup sites, having every machine live and running, waiting to take over when the main servers fail. To cold backup sites, which would have the power unplugged, or may even be empty room space for servers to be placed later.

So yea, sadly there are some good reasons why the term caught on 😂 (as much as I hate it)

So what hardware should I be using for hot and cold, for platform X?

Look at the platform guide, or alternatively, wait for part 3 of this series 😝

In the meantime: if you are stuck deciding, take a step outside the developer realm, and play yourself a pop song as you think about it.

Happy Shipping 🖖🏼🚀

About Eugene Cheah

Does UI test automation, web app development, and part of the GPUJS team.