Your first project on AWS — the basics
Chapter 1: regions, availability zones, VPC, subnets, CIDR, NAT, and security groups.
The series
- Chapter 1 — The Basics
- Chapter 2 — Elastic Container Service
- Chapter 3 — Application Load Balancer
- Chapter 4 — S3 and CloudFront
If you are stepping into the cloud world for the first time, it can be quite intimidating. This series aims to give you a taste of the AWS cloud through a very simple first project. Before we jump into creating our cloud components, let's understand some basic concepts.
What is cloud computing?
As defined by Wikipedia, cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. In simple words, it means you can get hardware or software on demand at the click of a button and start using it right away.
Why should we use it? The top reason is not having to manage any hardware: you can spin up new instances whenever you need them. Apart from this, the cloud also gives you a host of services you can use, such as user management, image recognition, and load balancing. At present, there are over 200 services on AWS alone.
Regions and Availability Zones
Regions are physical locations around the world where AWS data centers are located. Each Region contains multiple physically separated availability zones, and each availability zone is made up of one or more data centers. AWS allows you to set up servers across availability zones, so you have a backup if one zone fails.
VPC
A virtual private cloud (VPC) is a virtual network inside your AWS account. At its core it is a range of IP addresses allocated to you, expressed as a CIDR (Classless Inter-Domain Routing) block.
An IPv4 address is a 32-bit number written as 4 blocks of 8 bits each, for example 10.0.0.0. Each block can hold a value from 0 to 255, i.e. 2⁸ = 256 possible values. This notation represents a single IP address, whereas CIDR represents a range of IP addresses, for example 10.0.0.0/20. The /20 means that the first 20 bits of the address are fixed as the network prefix, giving us 12 bits to play with, which means 2¹² = 4096 addresses. The /20 is the prefix length; the matching subnet mask has its first 20 bits set to 1 and looks like this in binary: 11111111 11111111 11110000 00000000 (255.255.240.0 in decimal).
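The arithmetic above can be checked with Python's standard `ipaddress` module. This is only a small sketch to confirm the numbers for the 10.0.0.0/20 range used in this chapter; the variable names are mine.

```python
import ipaddress

# The VPC range used throughout this chapter.
vpc = ipaddress.ip_network("10.0.0.0/20")

prefix_bits = vpc.prefixlen          # bits fixed by the /20 prefix
host_bits = 32 - prefix_bits         # bits left over for addresses
total_addresses = vpc.num_addresses  # 2 ** host_bits
netmask = str(vpc.netmask)           # subnet mask in dotted decimal

# Render each of the mask's four bytes as 8 binary digits.
netmask_binary = " ".join(f"{octet:08b}" for octet in vpc.netmask.packed)

print(prefix_bits, host_bits, total_addresses)
print(netmask)
print(netmask_binary)
```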
Subnets
The 4096 addresses can be further divided into smaller groups called subnets, which are also represented by CIDR blocks.
How do we divide 10.0.0.0/20? To do this we need to know the starting address and the ending address of the range. The starting address is the IP address specified in the CIDR block. To find the ending address, apply a bitwise OR between the start address and the bitwise inverse of the subnet mask:
00001010 00000000 00000000 00000000 OR
00000000 00000000 00001111 11111111 =
00001010 00000000 00001111 11111111 = 10.0.15.255
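The same bitwise OR can be written out in a few lines of Python. This is a sketch of the calculation described above, with `last_address` being a helper name I made up; the `ipaddress` module also exposes the result directly as `broadcast_address`.

```python
import ipaddress


def last_address(cidr: str) -> str:
    """Return the last address of a CIDR block: start OR inverted mask."""
    network = ipaddress.ip_network(cidr)
    start = int(network.network_address)
    mask = int(network.netmask)
    # Invert the mask and keep only the low 32 bits.
    inverse_mask = ~mask & 0xFFFFFFFF
    return str(ipaddress.ip_address(start | inverse_mask))


print(last_address("10.0.0.0/20"))
```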
Now that we know the range runs from 10.0.0.0 to 10.0.15.255, we can pick a CIDR for our subnet inside it. Let's choose 10.0.14.0/28, which gives us 32 - 28 = 4 remaining bits and 2⁴ = 16 IP addresses, or 10.0.14.0/24 for 2⁸ = 256 addresses. We can create a second subnet with the CIDR 10.0.13.0/24, and so on, as long as the subnets do not overlap. Keep in mind that AWS reserves 5 addresses in every subnet, so slightly fewer are usable.
A subnet can be public or private. A public subnet is one whose route table has a route to the VPC's internet gateway (resources in your VPC connect to the internet through the internet gateway), and a private subnet is one that does not. There are, however, times when you want your private resources to reach the internet, for example to download software updates. So, how do we do it? By routing the private subnet's internet-bound traffic to a NAT gateway, which in turn connects to the internet gateway.
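Before creating subnets it helps to check that each candidate CIDR really sits inside the VPC range and does not collide with another subnet. The sketch below does that with the standard `ipaddress` module. The `describe` helper and the 10.0.16.0/24 candidate are my own additions, the latter included only to show a block that falls outside the VPC.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/20")


def describe(cidr: str):
    """Return (fits inside the VPC, number of addresses) for a CIDR block."""
    subnet = ipaddress.ip_network(cidr)
    return subnet.subnet_of(vpc), subnet.num_addresses


for cidr in ["10.0.14.0/28", "10.0.14.0/24", "10.0.13.0/24", "10.0.16.0/24"]:
    print(cidr, describe(cidr))

# 10.0.14.0/28 and 10.0.14.0/24 are alternatives, not siblings: they overlap.
small = ipaddress.ip_network("10.0.14.0/28")
large = ipaddress.ip_network("10.0.14.0/24")
second = ipaddress.ip_network("10.0.13.0/24")
print(small.overlaps(large), large.overlaps(second))
```

The counts are raw block sizes, before any addresses the platform reserves.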
NAT Gateway
A NAT (network address translation) gateway is hosted on a public subnet and routes internet-bound packets from the private subnets to the internet gateway. The reverse is not true: resources on the internet cannot initiate connections through the NAT gateway. For this to work, the private subnet's route table is defined as follows.
Assuming the VPC CIDR block is 10.0.0.0/20, any destination IP in this range is considered local and is looked up within the VPC; anything else (the default route, 0.0.0.0/0) is routed to the NAT gateway.
A NAT gateway works by maintaining a mapping from each private source IP and source port to the NAT gateway's public IP and an available source port.
In this example, a virtual machine opens an internet-bound connection through the NAT gateway from a private IP and a port (together called the source tuple). The NAT gateway replaces the private IP with its public IP (185.221.69.47) and a newly allocated source port, then calls the target IP on the specified port. The response from the target is translated back to the original source tuple, so it makes its way back to the virtual machine that initiated the connection inside the private subnet.
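To make the routing decision concrete, here is a toy lookup over such a route table. It is a sketch, not how AWS implements routing: the `ROUTES` list, the `resolve_target` helper, and the target labels are illustrative names, and it assumes the usual rule that the most specific (longest prefix) matching route wins.

```python
import ipaddress

# Private subnet route table: VPC traffic stays local, the rest goes to NAT.
ROUTES = [
    ("10.0.0.0/20", "local"),
    ("0.0.0.0/0", "nat-gateway"),  # placeholder label for the NAT gateway
]


def resolve_target(destination_ip: str) -> str:
    """Pick the target of the most specific route that contains the IP."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in ROUTES:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    # The longest prefix wins, so /20 beats the /0 default route.
    return max(matches)[1]


print(resolve_target("10.0.13.5"))     # an address inside the VPC
print(resolve_target("203.0.113.10"))  # an address on the internet
```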
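The translation table can be pictured as two dictionaries, one for each direction. The class below is a deliberately simplified sketch: `NatTable`, its method names, the private address 10.0.13.5, and the starting port are all made up for illustration, and a real NAT gateway also tracks the destination and protocol of each connection.

```python
from itertools import count


class NatTable:
    """Toy source-NAT table mapping private source tuples to public ports."""

    def __init__(self, public_ip: str, first_port: int = 1024):
        self.public_ip = public_ip
        self._next_port = count(first_port)
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Rewrite a private source tuple to the gateway's public tuple."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            public_port = next(self._next_port)
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port: int):
        """Map a response back to the original tuple, or None if unknown."""
        return self._inbound.get(public_port)


nat = NatTable("185.221.69.47")
public_tuple = nat.translate_outbound("10.0.13.5", 44321)
print(public_tuple)
print(nat.translate_inbound(public_tuple[1]))
print(nat.translate_inbound(50000))  # no mapping: unsolicited traffic
```

The last call returns `None` because nothing inside the VPC opened that port, which mirrors why hosts on the internet cannot initiate connections through the gateway.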
Internet Gateway
There is not much to say here: the internet gateway allows internet traffic to enter or leave the VPC. Only one internet gateway can be attached to a VPC.
Security Groups
Security groups, or SG as depicted in the diagram above, work as a virtual firewall on your resources, controlling inbound and outbound connections. Each security group has two sets of rules: inbound rules and outbound rules. Inbound rules specify which sources are allowed to connect to the resource, and outbound rules specify the destinations the resource can connect to.
In the above example, we allow inbound connections only on port 5432, as this is where our PostgreSQL database listens for new connections. The first rule has as its source the ID of another security group, which means that only resources associated with that security group are allowed to connect. The second rule has an IP address in CIDR format, allowing only that particular IP to connect.
185.221.69.47/32 means a single IP. Why? Because /32 fixes all 32 bits, leaving 0 bits to create a range.
sg-lambda is a security group defined for a Lambda function (the serverless compute service of AWS); this rule allows the Lambda function to connect to the PostgreSQL database.
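The two inbound rules described above can be modelled in a few lines to see which connections pass. This is a sketch of the matching logic only, not the AWS API: the rule dictionaries, the `is_allowed` helper, and the caller address 10.0.13.5 are hypothetical, and it relies on the fact that security groups only contain allow rules, so anything unmatched is denied.

```python
import ipaddress

# Inbound rules for the database security group, as described in the text.
INBOUND_RULES = [
    {"port": 5432, "source_sg": "sg-lambda"},
    {"port": 5432, "source_cidr": "185.221.69.47/32"},
]


def is_allowed(port: int, source_ip: str, source_sgs=()) -> bool:
    """Return True if any inbound rule matches the incoming connection."""
    ip = ipaddress.ip_address(source_ip)
    for rule in INBOUND_RULES:
        if rule["port"] != port:
            continue
        # Rule 1 style: the caller belongs to the referenced security group.
        if rule.get("source_sg") in source_sgs:
            return True
        # Rule 2 style: the caller's IP falls inside the CIDR block.
        cidr = rule.get("source_cidr")
        if cidr and ip in ipaddress.ip_network(cidr):
            return True
    return False  # nothing matched, so the connection is denied


print(is_allowed(5432, "10.0.13.5", source_sgs=("sg-lambda",)))
print(is_allowed(5432, "185.221.69.47"))
print(is_allowed(5432, "185.221.69.48"))  # one address off the /32
print(is_allowed(22, "185.221.69.47"))    # right IP, wrong port
```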