Making sense of EC2 Instance Types
Do AWS EC2 instance types confuse you? Do you wonder why the 'Storage Optimized' type is called 'D' rather than the more intuitive 'S'? Or why 'M', which should logically stand for 'Memory Optimized', actually means 'General Purpose', while 'R' is the 'Memory Optimized' one? This odd naming convention made it hard for me to remember which letter belonged to which family and when to use which. If you feel the same way, read on.
After going through a lot of AWS documentation, I got a sense of what the AWS folks had in mind when they assigned a particular letter to each instance type. The table below summarizes the instance types and their typical use cases.
Note: The second column here is not the official word from Amazon (in fact, Amazon does not say anywhere what the letters stand for). Nonetheless, it should help you recall the properties of an instance type just by looking at its name.
Now, while many of these are pretty self-explanatory, some need a little detail:
ARM Based (A): ARM is a RISC (Reduced Instruction Set Computer) CPU architecture, as opposed to x86 processors, which are CISC (Complex Instruction Set Computer) based. ARM is the primary architecture behind most devices that run mobile and embedded operating systems such as Android, Windows Mobile and, on some hardware, Chrome OS. As the name suggests, these instance types should be used for applications built for ARM. In addition, architecture-agnostic applications can potentially run on A instances.
Burstable (T): These are similar to the Main choice (M) instances, except that they can have short 'bursts' of performance above their baseline. For example, the baseline performance of a t2.micro is 10% of the 1 vCPU it gets. But if need be, it can use 100% of the vCPU for 6 minutes in every hour (the AWS documentation on CPU credits explains how this works in detail). The advantage is that you don't have to provision your instances for the 'extreme' cases. You can provision for your 'average' case and rest assured that the instance can still handle occasional sudden peaks in usage.
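The t2.micro numbers above follow from simple arithmetic, which can be sketched in a few lines of Python. This is a simplified model rather than AWS's actual accounting: it assumes one CPU credit equals one vCPU running at 100% for one minute, and it ignores any credit balance carried over from earlier hours. The function names are made up for illustration.

```python
def credits_earned_per_hour(vcpus: int, baseline_percent: float) -> float:
    """Credits accrued in one hour.

    Assumption: one CPU credit = one vCPU at 100% utilization for one minute,
    so running at the baseline for 60 minutes earns exactly what it spends.
    """
    return vcpus * baseline_percent * 60 / 100


def burst_minutes_per_hour(vcpus: int, baseline_percent: float,
                           utilization_percent: float = 100.0) -> float:
    """Minutes per hour the instance could run at the given utilization,
    spending only the credits earned in that same hour (no saved balance)."""
    earned = credits_earned_per_hour(vcpus, baseline_percent)
    spent_per_minute = vcpus * utilization_percent / 100
    return earned / spent_per_minute


if __name__ == "__main__":
    # t2.micro figures from the text: 1 vCPU with a 10% baseline.
    print(credits_earned_per_hour(1, 10))   # credits earned each hour
    print(burst_minutes_per_hour(1, 10))    # minutes at 100% of the vCPU
```

With a 10% baseline on 1 vCPU, the model gives 6 credits per hour, which is where the "6 minutes at 100%" figure comes from.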
FPGA (F): FPGA stands for Field Programmable Gate Array. Well, that doesn't make it any clearer, does it? Unlike a CPU, which has its entire circuitry hardwired, an FPGA chip has multiple logic blocks (or 'arrays' of logic 'gates') whose connections can be 'reprogrammed' (rewired at the hardware level) out in the 'field' (after the chip has left the factory and is in use). It's like having a customizable CPU purpose-built for your application, and it can give you performance gains that would be difficult for a general-purpose CPU to match.
High Frequency (Z): The newest ones on the block, these instance types provide the fastest single-thread performance. They are suitable if your application cannot benefit much from parallelization and you'd rather have a few very fast cores than many slower ones. Also, some software (e.g. integrated circuit design software, or relational databases) has per-core licensing fees. Z instances make sense in such cases, because getting the same work done on fewer, faster cores means paying for fewer licenses.
Dense Storage (D): These instances are Hard Disk Drive (HDD) based and hence better suited to high-throughput sequential reads. They offer the highest disk throughput for the price (hence the name 'Dense'), and are suitable for applications with moderate compute and low-to-moderate memory needs.
HDD Storage (H): These are similar to the Dense Storage family (in the sense that they too are HDD based), but they offer a balance of higher compute and memory (along with high throughput, of course). This makes them suitable for applications such as MapReduce workloads, distributed file systems and log processing.
High IOPS (I): These instances are Solid State Drive (SSD) based and hence offer very high random IO performance. This makes them a great choice for transaction processing databases and NoSQL databases.
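To make the mnemonics easier to use, the families described in this post can be collected into a small lookup, sketched below in Python. The labels are this post's unofficial mnemonics, not Amazon's terminology, and the table covers only the letters mentioned in the text. The helper name `family_of` and the parsing rule (a single leading letter followed by a generation number, as in t2.micro) are assumptions made for illustration.

```python
import re

# Unofficial mnemonics from this post; only the letters discussed in the text.
FAMILY_MNEMONICS = {
    "a": "ARM Based",
    "t": "Burstable",
    "m": "Main choice (General Purpose)",
    "r": "Memory Optimized",
    "f": "FPGA",
    "z": "High Frequency",
    "d": "Dense Storage",
    "h": "HDD Storage",
    "i": "High IOPS",
}


def family_of(instance_type: str) -> str:
    """Return the mnemonic for an instance type name such as 't2.micro'.

    Assumption: the family is a single leading letter followed by the
    generation number. Anything else is reported as unknown rather than
    guessed at.
    """
    match = re.match(r"^([a-z]+)(\d+)", instance_type.strip().lower())
    if match is None or len(match.group(1)) != 1:
        return "Unknown family"
    return FAMILY_MNEMONICS.get(match.group(1), "Unknown family")


if __name__ == "__main__":
    for name in ("t2.micro", "m5.large", "i3.xlarge", "z1d.large"):
        print(name, "->", family_of(name))
```

Names with a multi-letter prefix are deliberately returned as unknown instead of being matched on their first letter, since that would silently give a wrong answer.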