System as Code - introduction to GitOps
Nowadays we like to have everything as code. There is a good reason for it - having infrastructure as code has a lot of advantages: consistency, modularity, resistance to human error, version control, and more. The same can be said about other “things as code” - configuration as code, pipelines as code, and others.
So a natural question arises: is it possible to have the whole system as code?
The answer is simple: yes, it is! Thanks to Kubernetes and the GitOps approach, one can declaratively describe the desired state of the whole system. In this short article, I’ll try to explain what GitOps actually is and how it can make your life easier.
What is GitOps?
Normally, when someone has a question like this, it’s quite obvious how to find an answer: let’s google it and examine a few of the results:
GitOps uses Git repositories as a single source of truth to deliver infrastructure as code.
Pretty easy. Put the Terraform code into a git repository and that’s it.
GitOps is an operational framework that takes DevOps best practices used for application development such as version control, collaboration, compliance, and CI/CD tooling, and applies them to infrastructure automation.
Very similar, but we also have to use CI/CD.
GitOps upholds the principle that Git is the one and only source of truth.
This is very important. Everyone who has ever maintained documentation knows that the second law of thermodynamics really works, and the entropy of the Universe increases.
Developers already use Git for the source code of the application – GitOps extends this practice to an application’s configuration, infrastructure, and operational procedures.
OK, things are getting more interesting - I can imagine keeping configuration and infrastructure (as code) in git, but... operational procedures?
Is there really no single definition of GitOps? Actually, there is. Some time ago a few big players like GitHub, Azure and Amazon (there are more) started a GitOps Working Group under the Cloud Native Computing Foundation. As an outcome, this WG announced the OpenGitOps 1.0 principles.
OpenGitOps 1.0
The desired state of a GitOps managed system must be:
- Declarative
A system managed by GitOps must have its desired state expressed declaratively.
- Versioned and Immutable
The desired state is stored in a way that enforces immutability, versioning and retains a complete version history.
- Pulled Automatically
Software agents automatically pull the desired state declarations from the source.
- Continuously Reconciled
Software agents continuously observe the actual system state and attempt to apply the desired state.
As you can see - OpenGitOps is much more than storing the infrastructure code in git.
Let’s analyze these principles.
Declarative
Describing the system declaratively separates the desired state from the implementation - the commands, API calls, scripts, etc. used to achieve that state. Basically, we write down facts about the system, not procedures.
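To make the difference between facts and procedures more concrete, here is a minimal sketch in Python. The service names, the replica counts and the plan function are all made up for illustration; real tools work on much richer manifests. The point is only that the operator writes the desired dictionary, and the steps are derived from it.

```python
# Desired state: facts about the system, not the steps to reach it.
# Service names and replica counts are hypothetical.
desired = {"web": 3, "worker": 2}
actual = {"web": 1, "cache": 1}


def plan(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Derive the actions that turn the actual state into the desired one."""
    actions = []
    for name, replicas in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name} with {replicas} replicas")
        elif actual[name] != replicas:
            actions.append(f"scale {name} from {actual[name]} to {replicas}")
    # Anything running that is not declared should go away.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


for action in plan(desired, actual):
    print(action)
```

Nobody had to write “scale web up” by hand; it follows from comparing the two states.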
Versioned and Immutable
The desired system state must be stored in a system that enforces immutability. It means that once a state is created, it can’t be changed. If one wants to change the state of the system, a new version is created, and the previous version is kept in the history. The “git” part of the word GitOps obviously comes from the fact that the system state is stored in a git repository. But according to this principle, it can be any system that fulfills this requirement - for example, an S3 bucket with versioning enabled.
In most cases the git repository is a natural choice - the commits are immutable, and new commits describe the new versions.
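The idea of “versioned and immutable” can be sketched with a tiny append-only store. This is not how git or S3 is implemented; the VersionedStore class and its methods are hypothetical and only illustrate the principle that a change never edits an existing version, it adds a new one.

```python
import hashlib
import json


class VersionedStore:
    """Append-only store: every change creates a new version, none is edited."""

    def __init__(self) -> None:
        self._versions: list[tuple[str, str]] = []  # (version id, serialized state)

    def commit(self, state: dict) -> str:
        payload = json.dumps(state, sort_keys=True)
        parent = self._versions[-1][0] if self._versions else ""
        # The id covers the content and the parent id, so rewriting history
        # would change every later id (similar in spirit to git commits).
        version_id = hashlib.sha256((parent + payload).encode()).hexdigest()[:12]
        self._versions.append((version_id, payload))
        return version_id

    def get(self, version_id: str) -> dict:
        for vid, payload in self._versions:
            if vid == version_id:
                return json.loads(payload)  # a fresh copy on every read
        raise KeyError(version_id)

    def history(self) -> list[str]:
        return [vid for vid, _ in self._versions]


store = VersionedStore()
v1 = store.commit({"web": {"image": "shop/web:1.4.0"}})
v2 = store.commit({"web": {"image": "shop/web:1.5.0"}})
print(store.history() == [v1, v2])
```

Because get returns a fresh copy, changing the returned dictionary does not touch what is stored; the only way to change the system is another commit.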
Pulled Automatically
This is something specific to OpenGitOps. Most definitions of the GitOps approach tell us only about storing the desired state. This principle requires “something” - the software agents running on the managed system - to actively watch the storage and pull the latest version. It implies that after setting up such a system, the commit is the last step the operator needs to worry about. How the state gets applied to the running system is not the operator’s responsibility.
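A rough sketch of such an agent is shown below. The PullAgent class, the fetch_latest callback and the apply callback are hypothetical names; real agents such as the ones in FluxCD or ArgoCD are far more elaborate. The sketch only shows the direction of the flow: the agent asks the source for the latest revision, nobody pushes anything into the system.

```python
from typing import Callable, Optional


class PullAgent:
    """Runs inside the managed system and pulls the desired state itself."""

    def __init__(
        self,
        fetch_latest: Callable[[], tuple[str, dict]],
        apply: Callable[[dict], None],
    ) -> None:
        self._fetch_latest = fetch_latest  # returns (revision, desired state)
        self._apply = apply
        self._last_revision: Optional[str] = None

    def poll_once(self) -> bool:
        """Check the source once; apply and return True if the revision changed."""
        revision, state = self._fetch_latest()
        if revision == self._last_revision:
            return False
        self._apply(state)
        self._last_revision = revision
        return True


# An in-memory stand-in for the git repository (an assumption of this sketch).
source = {"revision": "a1", "state": {"web": 1}}
applied = []
agent = PullAgent(lambda: (source["revision"], source["state"]), applied.append)

agent.poll_once()                                  # first revision is applied
agent.poll_once()                                  # nothing changed, skipped
source.update(revision="b2", state={"web": 3})     # the operator "commits"
agent.poll_once()                                  # new revision is applied
print(applied)
```

In a real setup poll_once would be called on a timer or triggered by a notification, but the operator’s job still ends with the commit.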
Continuously Reconciled
As with the previous principle, no one (meaning no person) needs to worry about when and how the desired state is applied to the running system. It’s done automatically by the software. It works much like a Kubernetes cluster: applying a pod manifest to the cluster doesn’t guarantee that the pod will run - that depends on several things, such as available resources, node selectors, affinity, etc. But the system will do its best to fulfill the operator’s requests.
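The “do its best” part can be sketched as a single reconciliation pass that is repeated forever. Everything here is a simplification invented for illustration: the state is just a replica count per service, and capacity stands in for all the real constraints (resources, node selectors, affinity).

```python
def reconcile_once(desired: dict[str, int], actual: dict[str, int], capacity: int) -> bool:
    """Move `actual` toward `desired` within `capacity`; return True if they match."""
    # Remove anything that is running but not declared.
    for name in list(actual):
        if name not in desired:
            del actual[name]
    # Grant each declared service as many replicas as the capacity allows.
    for name, wanted in sorted(desired.items()):
        used_by_others = sum(n for other, n in actual.items() if other != name)
        actual[name] = min(wanted, max(capacity - used_by_others, 0))
    return actual == desired


desired = {"api": 2, "web": 3}
actual = {"web": 1, "debug": 1}

print(reconcile_once(desired, actual, capacity=4), actual)  # not enough room yet
print(reconcile_once(desired, actual, capacity=5), actual)  # capacity was added
actual["web"] = 0                                           # manual drift
print(reconcile_once(desired, actual, capacity=5), actual)  # drift is repaired
```

The desired state never changed during these three passes; only the circumstances did, and the loop kept steering the system back toward what was declared.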
Do I want to use GitOps?
TL;DR - yes, you do ;-).
Let’s consider a few advantages of using this approach and check how it can make a developer's life easier.
Git as a single source of truth
Everybody with some experience in IT knows this: there is no such thing as documentation that is up to date. Well, there is, but after a few New York seconds it becomes outdated (the New York second is the shortest possible measurement of time, standardised as the time between the lights turning green and the taxi behind you beeping its horn).
The GitOps approach solves this problem - the desired state of the system stored in the git repository is not only the documentation of the system, it is also the system state itself! It’s probably possible to write a book about all the advantages of having a single source of truth, but most of them are obvious.
Git as a single tool to manage the system
Since the desired state is pulled and applied to the running system automatically, the only thing the developer needs to do is commit and push the changes to the git repository. It eliminates the need to learn new tools, and you as a developer already know git, don’t you?
The history of the system
In most cases, it’s very important to know not only what the actual state of the system is, but also what the previous state was, and generally - the whole history. It’s especially important when something goes wrong and the operator needs to know what was changed in the system. Basically, the git repository provides full auditability of the system.
(Relatively) easy rollback
An important consequence of the previous point: what to do if something does go wrong? It’s code, isn’t it? And what do developers do if they want to restore the previous state of the code? Of course, they revert the offending commit.
In most cases, it will work for the whole system too! Well, unless we made a database migration or something similar, but for stateless services it will work without any problems.
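What a rollback means for the history can be sketched in a few lines. The history list, the version numbers and the revert_to function are made up for this example; the point is that a rollback does not delete anything, it adds one more version whose content equals an earlier one.

```python
# A hypothetical history of desired states, oldest first.
history = [
    {"id": 1, "state": {"web": "1.4.0"}},
    {"id": 2, "state": {"web": "1.5.0"}},  # the faulty release
]


def revert_to(history: list[dict], target_id: int) -> dict:
    """Append a new version that restores the state of `target_id`."""
    target = next(v for v in history if v["id"] == target_id)
    new_version = {"id": history[-1]["id"] + 1, "state": dict(target["state"])}
    history.append(new_version)
    return new_version


rolled_back = revert_to(history, target_id=1)
print(rolled_back["id"], rolled_back["state"])
print(len(history))  # the faulty release is still part of the history
```

The agents then see a new latest version and reconcile the system toward it, exactly as they would for any other change.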
PRs and code reviews
We have exactly the same tools as any other developer working with the code. We can create PRs, ask for code review, run some automatic tests with CI to eliminate obvious syntax errors, etc. It significantly increases the reliability and security of the managed system.
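As an illustration of the kind of automatic check a CI job could run on a PR, here is a small sketch. The required fields and the validate function are assumptions made for this example and do not correspond to any particular tool; in practice one would rather use a schema validator for the real manifest format.

```python
# Fields every (hypothetical) service manifest in the repository must have.
REQUIRED_FIELDS = ("name", "image", "replicas")


def validate(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest passes."""
    errors = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in manifest]
    replicas = manifest.get("replicas")
    if replicas is not None and (not isinstance(replicas, int) or replicas < 0):
        errors.append("replicas must be a non-negative integer")
    return errors


good = {"name": "web", "image": "shop/web:1.5.0", "replicas": 3}
bad = {"name": "web", "replicas": -1}

print(validate(good))
print(validate(bad))
```

A CI pipeline would fail the PR whenever the returned list is not empty, so obvious mistakes never reach the branch the agents pull from.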
What to do if a real disaster affects the data center?
Of course, the most valuable part of the system is the data. This is why it’s so important to have a reliable backup policy, to store the backups in several geographically distant places, etc.
But assuming you have the backup of the data - recreating the whole system in the new place is a matter of minutes. Exactly the same services, in exactly the same versions.
Of course, GitOps doesn’t solve all the problems of the world. Let’s go through some potential issues with GitOps (and possible solutions).
Multiple environments
It’s common nowadays to have multiple environments for the system - dev, staging, UAT, production, etc.
How to implement this using GitOps? There is no single best practice. Some use multiple branches, some use multiple directories in one repository, and some simply use multiple repositories. Each approach has its pros and cons, so it’s a good idea to experiment a bit. When using GitOps together with Kubernetes, it’s a good idea to leverage Kustomize - it helps to keep the code DRY.
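The idea of a shared base plus small per-environment differences can be sketched as follows. This is not how Kustomize itself works internally; the deep_merge helper, the image name and the overlay values are invented for illustration only.

```python
import copy


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a copy of `base` with the values from `overlay` applied on top."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# The shared definition is written once...
base = {
    "image": "shop/web:1.5.0",
    "replicas": 1,
    "resources": {"cpu": "100m", "memory": "128Mi"},
}
# ...and each environment only states what differs from it.
overlays = {
    "dev": {},
    "production": {"replicas": 5, "resources": {"memory": "512Mi"}},
}

environments = {name: deep_merge(base, patch) for name, patch in overlays.items()}
print(environments["production"])
```

Whether the overlays live in branches, directories or separate repositories is a separate decision; the base stays in one place either way.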
Secrets
The GitOps approach postulates keeping everything in the git repository, but - as probably everybody knows - it’s a terrible idea to store secrets (credentials) unencrypted in git, even if the repository is private.
There are several workarounds for this. Some of them - like Bitnami Sealed Secrets or SOPS - help to keep the secrets encrypted in git and automatically decrypt them on the cluster. Others avoid keeping the secrets in git at all and just point to secrets stored elsewhere - the most important example is the External Secrets Operator.
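The second approach, pointing to a secret instead of storing it, can be sketched like this. The manifest layout, the secret_ref key and the resolve_secrets function are hypothetical and do not mirror the External Secrets Operator API; a plain dictionary stands in for the external secret store.

```python
# What lives in git: only a reference to the secret, never its value.
manifest = {
    "name": "web",
    "env": {
        "LOG_LEVEL": "info",
        "DB_PASSWORD": {"secret_ref": "prod/db/password"},
    },
}


def resolve_secrets(manifest: dict, secret_store: dict[str, str]) -> dict:
    """Replace secret references with values at apply time, on the cluster side."""
    resolved_env = {}
    for key, value in manifest["env"].items():
        if isinstance(value, dict) and "secret_ref" in value:
            resolved_env[key] = secret_store[value["secret_ref"]]
        else:
            resolved_env[key] = value
    return {**manifest, "env": resolved_env}


# Stand-in for a vault or cloud secret manager (an assumption of this sketch).
secret_store = {"prod/db/password": "example-value"}

running_config = resolve_secrets(manifest, secret_store)
print(sorted(running_config["env"]))
```

The repository stays safe to share and review, because the value only exists in the secret store and in the running system.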
Easy for technical staff, difficult for business
Everybody knows and uses git. Well, all technical people do. If someone from the business asks the common question “what version, and which features, do we have on production now?”, they probably won’t be happy with the answer “you can check it on GitHub”.
This forces us to create some additional tools which “translate” the git repository to something easy to read by non-technical people.
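Such a “translator” does not have to be complicated. Below is a minimal sketch that turns a desired state into a short, readable summary; the shape of the services dictionary (a version and a list of features per service) is an assumption made for this example, not something GitOps tools provide out of the box.

```python
def release_report(environment: str, services: dict[str, dict]) -> str:
    """Render the desired state of one environment as plain text."""
    lines = [f"Currently deployed on {environment}:"]
    for name, info in sorted(services.items()):
        features = ", ".join(info["features"]) or "no feature notes"
        lines.append(f"- {name} {info['version']}: {features}")
    return "\n".join(lines)


# Hypothetical data, e.g. parsed from the manifests and release notes in git.
production = {
    "web": {"version": "1.5.0", "features": ["new checkout", "dark mode"]},
    "billing": {"version": "2.1.3", "features": []},
}

print(release_report("production", production))
```

Since the input is the same repository the agents pull from, a report like this cannot drift away from what is actually declared for the environment.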
Implementations of OpenGitOps 1.0
The two most important tools implementing the OpenGitOps principles are probably FluxCD and ArgoCD (there are others, but these two seem to be the most widely used).
Which is better? Which to choose? As with every such question, there is only one answer: it depends. Both have excellent documentation and quick-start guides, which help to get familiar with the tool.
There are several articles and videos comparing these tools - I encourage you to familiarize yourself with them.
I personally prefer FluxCD, but that absolutely doesn’t mean it’s better than ArgoCD. To me, ArgoCD seems to be a bit simpler, so it has a gentler learning curve.
On the other hand, FluxCD has multiple controllers and multiple CRDs, which in my very humble opinion better reflects the objects it manages.
One killer feature of ArgoCD is its graphical user interface, which is probably the reason why many developers choose this tool.
If you still don’t use GitOps in your project - I really think you should! It really simplifies life, especially when used together with Kubernetes. The larger the project, the more difficult it is to harness all its components. It’s possible to start small, with only part of the system, and check if it fits you. But I’m pretty sure you’ll love it 🙂.