Metabase Up and Running

Creating a Virtual Private Cloud

As mentioned earlier, AWS configures a default VPC for you in each region when you create your account. You can also create your own VPC, which is exactly what we will learn to do in this section. Even if your organization already has an AWS account with a VPC configured, going through this section will still be a valuable learning experience. It will also help you understand what configuration changes may be necessary for your existing VPC to launch Metabase.

In this section, we'll learn how to create a VPC configured with the required network infrastructure for Metabase. Specifically, that means creating a VPC with two public subnets in two different availability zones, each with a route to an internet gateway in its route table. We'll also learn what all this means, and why it's important. To get started, search for the VPC service in the AWS Management Console, just as we did with the IAM service. This will take you to the VPC Management Console:

  1. At the top of this page, you will click a blue button reading Launch VPC Wizard. This is the easiest way to create the kind of VPC we'll need, and although it abstracts away some of the finer points on configuring a VPC, that is fine for our purposes.
  2. On the next page, you'll be presented with four different VPC configurations, as in Figure 2.7. Choose the first one, VPC with a Single Public Subnet. We actually will need two public subnets in our VPC, but since that is not an option in the wizard, we'll go with this one and create another subnet in the next section:

    Figure 2.7 – The 4 VPC configurations. You want to choose the highlighted VPC with a single public subnet

  3. On the next page, Step 2: VPC with a Single Public Subnet, enter a name in the VPC name field, for example, vpc-metabase.
  4. Pick an Availability Zone for the public subnet, for example, us-east-1a. Make a note of the availability zone you pick.
  5. You may leave the subnet's IPv4 CIDR address defaulted to 10.0.0.0/24. Make a note of this address.
  6. Click Create VPC.

Since those steps were quite technical, let's summarize what we just did.

First, we created a VPC, which can just be thought of as a range of IP addresses that we can use to have resources communicate with one another.

Next, we created a subnet, which is a subset of the IP addresses in our VPC.

Our subnet is a public subnet, which just means that in addition to being able to communicate with other resources in our VPC, it can also receive requests from the public internet.

Lastly, we specified that resources in our public subnet will be launched in the us-east-1a Availability Zone, which is one of several isolated locations, each made up of one or more data centers, within the Northern Virginia (us-east-1) region.
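
To make the idea of "a range of IP addresses" concrete, here is a minimal sketch using Python's standard ipaddress module. It assumes the VPC itself uses the 10.0.0.0/16 range (the local route we will see later in this section suggests this, but check your own VPC), and the describe helper is our own name, not part of any AWS tool.

```python
import ipaddress

# Assumption: the VPC range is 10.0.0.0/16; confirm this in your own console.
vpc = ipaddress.ip_network("10.0.0.0/16")
# The public subnet range we kept as the default in step 5.
public_subnet = ipaddress.ip_network("10.0.0.0/24")


def describe(network):
    """Return (first address, last address, address count) for a CIDR block."""
    return (str(network[0]), str(network[-1]), network.num_addresses)


print(describe(vpc))                 # ('10.0.0.0', '10.0.255.255', 65536)
print(describe(public_subnet))       # ('10.0.0.0', '10.0.0.255', 256)
print(public_subnet.subnet_of(vpc))  # True: the subnet is a subset of the VPC
```

The smaller the number after the slash, the larger the range, which is why the /16 VPC can hold many /24 subnets.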

Now that we've created our VPC with a single public subnet, the next thing we need to do is add another public subnet in a different availability zone. It turns out that this is a requirement for running Metabase. Let's get started by navigating to the VPC dashboard page, which you should be redirected to anyway after creating your VPC:

  1. On the left rail, find and click the Subnets option.
  2. At the top of the screen, click the blue Create Subnet button.
  3. On the next page, enter the following:

    a. Name tag: Put anything you like here, for example, Subnet 2.

    b. VPC: Pick vpc-metabase, which we just created.

    c. Availability zone: Pick any zone except the zone you already created the public subnet in. If you chose us-east-1a for the first one, you could pick us-east-1b here.

    d. IPv4 CIDR block: If you used 10.0.0.0/24 in the first step, you can use 10.0.2.0/24 in this step. Understanding CIDR blocks is beyond the scope of this tutorial, although if you want to learn more, the specification (RFC 4632) can be found at https://tools.ietf.org/html/rfc4632.
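
If you would rather pick your own CIDR blocks, the two rules to respect are that each subnet must sit inside the VPC's range and that no two subnets may overlap. The following is a rough sketch of that check using Python's standard ipaddress module. The check_subnets function is a hypothetical helper of our own, and the 10.0.0.0/16 VPC range is an assumption you should verify against your own VPC.

```python
import ipaddress


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems found; an empty list means the plan looks valid."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []

    # Rule 1: every subnet must fall inside the VPC's address range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")

    # Rule 2: no pair of subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")

    return problems


# The two blocks used in this chapter pass both rules.
print(check_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.2.0/24"]))  # []
```

Any block that returns an empty list here, such as 10.0.1.0/24 in place of 10.0.2.0/24, would serve equally well for the second subnet.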

You have just created the second subnet in your VPC. To make it a public subnet, we need to allow it to receive traffic from outside our VPC. We will do this by associating the subnet with a route table that has a route to an internet gateway. An internet gateway was already created when we made our first subnet, so we will simply reuse it here.

Let's learn how to make our subnet a public one:

  1. From the VPC dashboard, select Subnets again.
  2. Click on the button next to Subnet 2, the subnet you just created.
  3. Scroll down to the bottom of the page. You will see a tab called Route Table.
  4. The Route Table should have a single row in it, with a Destination of 10.0.0.0/16 and a Target of local. Click the blue button titled Edit route table association.
  5. On the Edit route table association page, find the dropdown, which shows your current route table ID by default. Click it; there will be only one option to choose from. That option is the route table associated with our first public subnet, and it is what we want to use for our second public subnet as well, so select it. Because this route table has a route to an internet gateway, it will allow resources in your subnet to communicate with the public internet.
  6. To confirm you selected the correct route table, check that it now shows two rows. The second row will have a Destination of 0.0.0.0/0 and a link to the internet gateway in the Target column, similar to Figure 2.8, but of course with different IDs. Click Save:

Figure 2.8 – Route table associations
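
To see why the second row is what makes the subnet public, it helps to know that a route table sends each packet to the target of the most specific matching route. The sketch below imitates that lookup in plain Python. The ROUTES list and the pick_target function are our own illustration, and igw-EXAMPLE is a placeholder rather than a real gateway ID.

```python
import ipaddress

# Hypothetical route table mirroring the two rows described in step 6.
ROUTES = [
    ("10.0.0.0/16", "local"),      # traffic that stays inside the VPC
    ("0.0.0.0/0", "igw-EXAMPLE"),  # everything else; placeholder gateway ID
]


def pick_target(destination_ip, routes=ROUTES):
    """Return the target of the most specific route matching the address."""
    address = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the traffic has nowhere to go
    # The longest prefix (largest number after the slash) is the most specific.
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


print(pick_target("10.0.2.15"))      # local
print(pick_target("93.184.216.34"))  # igw-EXAMPLE
```

With only the first row, an address outside 10.0.0.0/16 matches nothing, which is the situation our second subnet was in before we changed its route table association.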

Congratulations, your VPC is now properly configured with the required network infrastructure for Metabase: two public subnets in two different availability zones, each with a route to an internet gateway in its route table.

Important note

If your organization is already using AWS, you will want to work with the IT or operations team to make sure your VPC has what is needed, which is at least two public subnets in two different availability zones.

Now that we have our user account and VPC set up, we are all but ready to deploy our application.

Deciding on scalability, availability, and cost

In this section, we will learn about scalability, availability, and financial cost. This will involve weighing how scalable and available our app needs to be against the monetary costs that come with a more resource-intensive deployment.

First, let's learn about scalability, or our app's ability to grow or shrink based on how many users are accessing it. If you were wondering why the service we use is called Elastic Beanstalk, you should be able to connect the dots here.

Scalability

Scalability refers to how the number of EC2 instances running your app will grow (or shrink) as more (or fewer) users use it. While the default configuration for your application is to start with a single EC2 instance, a single load balancer, and a single database, it is set up to automatically launch up to three additional EC2 instances to handle increased traffic.

Depending on the size of your organization and how many users you expect will be using your Metabase instance at one time, Elastic Beanstalk may add additional instances. You can customize the rules for how you want your app to scale up and down, too, and we will learn how to do that when we configure our app in the next section.

Deciding how scalable to make your app doesn't require you to decide how many compute instances you want; it just requires you to decide the minimum and maximum number of instances that can be deployed by Elastic Beanstalk. The actual deployment of these resources will be handled automatically (or based on rules you decide) to ensure your app is in good health.
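
One way to picture the minimum and maximum is as a pair of guard rails around whatever instance count the scaling rules ask for. The following is a simplified sketch, not how Elastic Beanstalk is actually implemented. The instances_to_run function is a hypothetical name, and the defaults of 1 and 4 are taken from the description above of one instance plus up to three additional ones.

```python
def instances_to_run(desired, minimum=1, maximum=4):
    """Keep the requested instance count within the configured bounds."""
    if minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    return max(minimum, min(desired, maximum))


print(instances_to_run(0))   # 1: never drops below the minimum
print(instances_to_run(3))   # 3: within bounds, so unchanged
print(instances_to_run(10))  # 4: capped at the maximum
```

Setting the minimum and maximum to the same number effectively turns scaling off, which is a quick way to cap costs.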

Availability

Availability refers to how accessible your app is when things like outages happen. Each AWS region has a number of Availability Zones in it. From time to time, these Availability Zones can go down. Downtime is infrequent and usually does not last long. Services can avoid the negative consequences of these outages by being available in multiple Availability Zones. That way, if one goes down, all traffic is routed to the other one.

Tolerance for an outage is a complex equation. For example, a service with many users, such as Netflix or Facebook, may have zero tolerance: they need their service running 24/7. If you typically have 4 or 5 users per day using your Metabase instance for an average of 30 minutes, you can likely tolerate an outage, although it's up to you to determine your tolerance.

Cost

The more instances you have running your app, the higher your app's cost will be. Recall that on the AWS Free tier, you get 750 hours a month of free compute (EC2, size t2.micro) and database service. There are 744 hours in a 31-day month, so you have enough coverage on the Free tier to run one EC2 instance and one database instance all year long. Should you exceed 750 hours per month, know that at the time of this writing, the on-demand t2.micro EC2 instance costs $0.0116 per hour and the t2.small I recommend for the app costs $0.023 per hour (https://aws.amazon.com/ec2/pricing/on-demand/).
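
The arithmetic above is easy to reproduce for your own situation. Here is a small sketch that uses the hourly rates quoted in this section, which may have changed since the time of writing. The monthly_cost function is our own helper, and it uses Decimal so the dollar amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

FREE_HOURS_PER_MONTH = 750
HOURS_IN_31_DAY_MONTH = 31 * 24  # 744


def monthly_cost(hourly_rate, instances=1, hours=HOURS_IN_31_DAY_MONTH, free_hours=0):
    """Cost in dollars of running `instances` for `hours` each, after free hours."""
    billable_hours = max(instances * hours - free_hours, 0)
    return Decimal(hourly_rate) * billable_hours


# One t2.micro for a full month fits inside the Free tier's 750 hours.
print(monthly_cost("0.0116", free_hours=FREE_HOURS_PER_MONTH))  # 0.0000

# Two t2.micro instances exceed it by 738 hours.
print(monthly_cost("0.0116", instances=2, free_hours=FREE_HOURS_PER_MONTH))  # 8.5608

# One t2.small, with no free hours applied since the free compute is t2.micro.
print(monthly_cost("0.023"))  # 17.112
```

Pass the rate as a string so that Decimal reads it exactly as written.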

I recommend using the following 2x2 matrix framework, as pictured in Figure 2.9, to decide on the best solution for you. If you are not sure what is best for you, I recommend the High Scalability and Low Availability option. I recommend high scalability because I think for most organizations a single instance will be adequate, but the option to scale it up if needed for a few dollars a month is worth it. I recommend low availability because I think an application like Metabase doesn't need 100% uptime like Netflix or Facebook.

Figure 2.9 – Decision-making framework for scalability, availability, and cost