Preview environment for Code reviews, part 1: Concept
Since preparing such an environment (especially the implementation) requires a lot of explanation and justification, this blog post is the first part of a series of articles covering how to build it.
Who is this article addressed to?
This article shows the path to building a staging environment from scratch (a separate one for each individual Code Review), working through the various problems we discover along the way until we reach our goal.
The information will be useful both for Web Developers and for Junior/Mid DevOps engineers with a little knowledge of the AWS (Amazon Web Services) and GCP (Google Cloud Platform) clouds who would like to set up such an environment for themselves (a fairly common need in new projects).
Goal: To test a feature before deployment to the production environment (definition of "what")
In the day-to-day work of a web application developer, we often encounter situations where we want to check how the application works. This happens after the source code has been modified, but before it's deployed to the production environment (or pushed to the production VCS branch).
We want the test environment to be as isolated as possible from changes made by other features being developed in parallel. Basically, we want to avoid the scenario where another person has developed a new feature that is still under test (and not finished), but whose code changes disrupt our new feature in the same staging environment.
First attempt: A single (shared) staging environment (but is it enough?)
Usually, at the beginning of the development of IT projects, the choice falls on the simple solution of building a single shared environment known as staging.
Deployment to the staging environment takes place in reaction to all changes on a single branch of the VCS repository (e.g. `dev` in Git), even before the code reaches the production branch (`main`).
In this environment, it is possible to test an already running application in several ways (with e2e, automated, or manual tests). In addition, we can see how our new feature integrates with the code changes other developers have made in the meantime.
The advantage of this approach is, of course, simplicity and relatively low cost. We only need one copy of the production environment to serve as staging, usually with reduced data volumes (a 100% data copy is not needed) and anonymized personal data.
On the other hand, it also has considerable disadvantages:
We are testing the application only after the source code has been integrated with everything else (i.e. after the Code Review has been accepted and the merge carried out). So we can no longer prevent this code from reaching the `dev` branch if the application turns out to work differently than expected after the changes, nor stop it from affecting other new features that are being tested.
To put out such a `dev` fire, we can introduce an additional bug fix as a separate code review or revert the changes in the branch, both of which are additional steps to perform. Often, separate database migrations are also required to restore the environment to its previous state.
A single shared staging environment makes it almost impossible to test two different features separately (each feature must be merged into the `dev` branch before we can test it). This is a bottleneck when it comes to testing in isolation, where sometimes we do not know whether code changes made by another person have caused errors in our feature.
Here is an example of what such a process might look like:
Second attempt: Separate staging environment (so-called “preview”) for each Code Review
To avoid the problems mentioned above, a staging environment can be built even before the Code Review is done and before the code is merged into the `dev` or `main` branch. This involves building a unique environment (called a "preview") separately for each Code Review (which is usually carried out in the form of a proposed Merge/Pull Request).
This way, anyone checking out the code will be able to see a preview of the application and test it, without affecting other parallel features (or being affected by them), even before the code gets into the `dev` (or `main`) continuous integration branch.
It would seem that building separate environments requires many servers and the cost of running separate clusters, which in some cases is true.
However, there are situations where this can be cleverly optimized (and this is precisely the case we will describe in this article). This way, we still need only one staging cluster to handle multiple separate versions of the application at once.
An example of how such a process might look:
Description of a sample application from the big-picture perspective
The application will be simplified to static HTML files, which should be served by some HTTP server.
We will not encounter databases or other additional tools here. For the purpose of this article, we will not focus on the application architecture itself, but on the infrastructure around it.
The process of creating new features involves changes to the static HTML files on a separate branch of the Git repository named, for example, `feature/new-subpage`. To carry out the Code Review, a Merge Request from this branch to the `main` branch will be created (going straight into production for maximum simplicity).
It is worth recalling that we aim to build a preview for each Code Review separately before the code goes into production.
Let's additionally assume that:
- the application is hosted in a single bucket in the AWS S3 service
- the domain and DNS server are managed by AWS Route 53
- the Git repository for "Nice-app" is hosted on GitLab, so we also have Merge Requests and CI in GitLab
- GitHub is also available as an option (as the example company, which is our client, also has access to it for its other projects)
In terms of infrastructure, we can choose between tools and services from two clouds: AWS (Amazon Web Services) and GCP (Google Cloud Platform).
In the GCP cloud, we additionally have a running Kubernetes cluster (GKE), which is used for other purposes within the company for which the application is being developed.
Choice of the route to reach the goal (definition of "how")
We have already chosen what we want to achieve (the second attempt), and we have gathered the requirements, the information about how the example application runs, and an initial list of available tools. It is now time to think about how we will achieve this on the technical side.
Building URLs for separate previews
Let's assume that the application's root domain is "nice-app.com" (we will use this name from now on in a series of articles on this topic).
If each Merge Request and the commits within it trigger the generation of a separate preview, a unique identifier (hash) has to be generated for each of them. It can then be used as a subdomain prefix under which the preview is served, according to the pattern:
"<hash>.staging.nice-app.com"
We have deliberately added the `staging` prefix to leave the address pool of the main domain (such as "*.nice-app.com") free and not littered with identifiers.
This unique hash can be taken from the commit hash of the feature branch in Git (for consistency and to avoid redundant hash-generation operations).
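As an illustration, here is a minimal sketch of how such a hostname could be derived in a build script. It assumes GitLab CI (which exposes the short commit hash as the predefined variable `CI_COMMIT_SHORT_SHA`) and falls back to asking Git directly when run locally:

```python
import os
import subprocess

def preview_hostname(root_domain: str = "staging.nice-app.com") -> str:
    """Build the preview hostname "<hash>.staging.nice-app.com" for the current commit."""
    # In GitLab CI the short commit hash is already available as an environment variable.
    short_hash = os.environ.get("CI_COMMIT_SHORT_SHA")
    if not short_hash:
        # Local fallback: ask Git for the abbreviated hash of HEAD.
        short_hash = (
            subprocess.check_output(["git", "rev-parse", "--short", "HEAD"])
            .decode()
            .strip()
        )
    return f"{short_hash}.{root_domain}"

print(preview_hostname())  # e.g. "a4ffbc.staging.nice-app.com"
```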
It would look as follows:
Hosting and deployment
Since the production environment is based on hosting in a single AWS S3 bucket, we can reproduce the same for the staging environment.
To reduce the costs (financial and operational) of creating and handling multiple buckets at once, we should definitely reject the scenario of creating one bucket per preview.
Instead, we would create one subdirectory per preview in the same staging bucket. The name of the subdirectory would match the hash used in the subdomain and the commit hash in Git.
For example, a commit with the hash `a4ffbc` inside a Merge Request that adds a feature to the production branch `main` will have a preview available under the URL: "http://a4ffbc.staging.nice-app.com".
This subdomain would redirect traffic to a subdirectory inside the AWS S3 staging bucket “http://nice-app-staging.s3-website.eu-central-1.amazonaws.com/a4ffbc”.
This would avoid creating a separate bucket per Merge Request.
Deployment would then come down to three steps (sketched below):
- creating a subdomain,
- assigning it to the environment for the Merge Request,
- uploading the static HTML files to a subdirectory in the AWS S3 staging bucket.
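Below is a rough `boto3` sketch of these steps. The bucket name matches the staging bucket used in this article; the hosted-zone ID and the reverse-proxy IP address are placeholders, and pointing the A record at a reverse proxy anticipates the solution described in the next section:

```python
import mimetypes
from pathlib import Path

import boto3  # assumes AWS credentials are already configured in the CI environment

STAGING_BUCKET = "nice-app-staging"      # staging bucket from this article
HOSTED_ZONE_ID = "Z0000000EXAMPLE"       # placeholder hosted-zone ID for nice-app.com
REVERSE_PROXY_IP = "203.0.113.10"        # placeholder public IP of the reverse proxy

def upload_preview(build_dir: str, commit_hash: str) -> None:
    """Upload the static HTML build into the <hash>/ subdirectory of the staging bucket."""
    s3 = boto3.client("s3")
    for path in Path(build_dir).rglob("*"):
        if path.is_file():
            key = f"{commit_hash}/{path.relative_to(build_dir)}"
            content_type = mimetypes.guess_type(path.name)[0] or "binary/octet-stream"
            s3.upload_file(str(path), STAGING_BUCKET, key,
                           ExtraArgs={"ContentType": content_type})

def create_preview_subdomain(commit_hash: str) -> None:
    """Create/refresh the A record <hash>.staging.nice-app.com pointing at the proxy."""
    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": f"{commit_hash}.staging.nice-app.com",
                "Type": "A",
                "TTL": 60,
                "ResourceRecords": [{"Value": REVERSE_PROXY_IP}],
            },
        }]},
    )
```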
Redirecting the subdomain to a subdirectory in the AWS S3 bucket
We are left with the question of the redirection to the subdirectory itself. For a type A record (subdomain), the DNS server redirects traffic only to a server's IP address. We also want the subdomain to be preserved in the browser, because all links on the static HTML pages are absolute paths and therefore do not include the application domain as a prefix.
The implication is that we need some kind of reverse-proxy server which, available at a given IP address, will route traffic for a given subdomain to the corresponding subdirectory in the AWS S3 staging bucket.
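To make the mapping concrete, here is a deliberately minimal, illustrative sketch of what such a proxy has to do: read the hash from the Host header and fetch the corresponding subdirectory from the bucket's website endpoint (the two approaches below differ only in where this logic runs):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Website endpoint of the staging bucket (see the example URL above).
S3_WEBSITE = "http://nice-app-staging.s3-website.eu-central-1.amazonaws.com"

class PreviewProxy(BaseHTTPRequestHandler):
    """Toy reverse proxy: "a4ffbc.staging.nice-app.com/about.html" -> "<S3_WEBSITE>/a4ffbc/about.html"."""

    def do_GET(self):
        commit_hash = self.headers["Host"].split(".")[0]   # "a4ffbc.staging..." -> "a4ffbc"
        target = f"{S3_WEBSITE}/{commit_hash}{self.path}"
        # No error handling or streaming here; a real setup would use a production-grade server.
        with urlopen(target) as upstream:
            body = upstream.read()
        self.send_response(upstream.status)
        self.send_header("Content-Type", upstream.headers.get("Content-Type", "text/html"))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), PreviewProxy).serve_forever()
```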
Approach 1: Reverse-proxy adapted to the current infrastructure (possible cost reduction)
One solution could be to use the existing k8s cluster in the GCP cloud that the example company has. One of the Pods in the cluster, available at a public IP address, could be a reverse-proxy server. A container would need to be defined, e.g. in Docker, which would use a ready-made server implementation such as Nginx. Such a container would then be deployed inside this k8s cluster.
The advantage of such a solution is that the cost is relatively low, as you don't have to start up a separate cluster, and you keep control of the Pod with the reverse-proxy server in an already prepared environment. In addition, the team behind it knows the infrastructure, so it would be able to maintain such a Pod (both in terms of monitoring and potential failures).
Approach 2: A flexible reverse-proxy equivalent (but requiring additional tools)
Another solution could be to use services built into the AWS cloud.
If the application could be exposed to the world using the AWS CloudFront service, then instead of putting a reverse-proxy server into the k8s cluster, a custom Lambda could intercept and redirect traffic between CloudFront and the AWS S3 staging bucket. This type of Lambda is called Lambda@Edge (for a detailed description, see the documentation).
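As a rough illustration (Lambda@Edge also supports Node.js; this is just a Python sketch of the idea, not the solution we ultimately chose), a handler attached to CloudFront's "origin request" event could rewrite the requested path based on the Host header:

```python
def handler(event, context):
    """Sketch of a Lambda@Edge origin-request handler: prepend the preview hash to the path."""
    request = event["Records"][0]["cf"]["request"]
    host = request["headers"]["host"][0]["value"]   # e.g. "a4ffbc.staging.nice-app.com"
    commit_hash = host.split(".")[0]
    # "/index.html" becomes "/a4ffbc/index.html" before CloudFront contacts the S3 origin.
    request["uri"] = f"/{commit_hash}{request['uri']}"
    return request
```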
The advantage is not having to worry about the maintenance of such a server (serverless), while the disadvantage is the close dependence on the AWS cloud (a dedicated solution) and the additional costs associated with new services the company has not used so far (team members would have to learn how to handle CloudFront and AWS Lambda).
Selected solution for reverse-proxy server
As the company is keen to keep costs relatively low and to maintain and use solutions it is already familiar with (such as the k8s cluster), the choice fell on a reverse-proxy server running as a Pod with a Docker container in k8s (Approach 1).
Retention policies and closing the environments
Finally, it is worth mentioning an often-overlooked fact: creating new previews for each commit steadily increases the amount of data to be stored, the number of subdomains in the DNS server, and so on.
The customer we work for should not pay for the storage of data that is no longer needed or used, or that is out of date. Therefore, the assumption is that we delete a preview environment as soon as its Merge Request has been closed or merged.
In addition, we should have a retention policy under which a preview is automatically deleted 7 days after creation (we assume that this is enough time to check the Merge Request).
In this way, the customer will pay only for cloud data that is used and needed, and the assets in the services will stay well organized instead of turning into a garbage dump.
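Sticking with the same `boto3` sketch and placeholder names as in the deployment example, the cleanup could boil down to a bucket-wide lifecycle rule plus a job triggered when the Merge Request is merged or closed:

```python
import boto3

STAGING_BUCKET = "nice-app-staging"
HOSTED_ZONE_ID = "Z0000000EXAMPLE"   # same placeholder hosted-zone ID as before
REVERSE_PROXY_IP = "203.0.113.10"    # must match the record created at deployment time

def enable_seven_day_retention() -> None:
    """One-off setup: expire every object in the staging bucket 7 days after creation."""
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=STAGING_BUCKET,
        LifecycleConfiguration={"Rules": [{
            "ID": "expire-previews-after-7-days",
            "Filter": {"Prefix": ""},        # applies to all preview subdirectories
            "Status": "Enabled",
            "Expiration": {"Days": 7},
        }]},
    )

def delete_preview(commit_hash: str) -> None:
    """Remove a preview's files and its DNS record once the Merge Request is merged or closed."""
    boto3.resource("s3").Bucket(STAGING_BUCKET).objects.filter(
        Prefix=f"{commit_hash}/"
    ).delete()

    # Route 53 requires the DELETE change to match the existing record exactly.
    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={"Changes": [{
            "Action": "DELETE",
            "ResourceRecordSet": {
                "Name": f"{commit_hash}.staging.nice-app.com",
                "Type": "A",
                "TTL": 60,
                "ResourceRecords": [{"Value": REVERSE_PROXY_IP}],
            },
        }]},
    )
```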
Summary
Thanks to the information gathered and the decisions made, we have crystallized the concept of building separate environments for each Code Review (Merge Request).
The "Nice-App" application for which they are to be created was assumed to be simplified to the necessary minimum to focus only on the infrastructure. We based our walkthrough on use cases in the environment and with the help of tools from a sample customer (reminiscent of a Case Study). In this way, we came closer to simulating working with a real customer.
In the next article (part 2), we will focus on the configuration and implementation of this concept. Be prepared to be challenged with different tools, technologies, and unexpected problems!