Code Modularity and Clear Boundaries Matter Now More Than Ever
These are interesting times to live in, right? Some say it’s the biggest change in the industry so far: the era of AI writing code by itself - or at least the era of AI-assisted coding. It seems everyone has an opinion on how AI will shape our lives and our profession as software engineers. I'd like to share my view on a specific topic: modularity, encapsulation, and proper software boundaries in the age of AI coding.
The importance of modularity
When discussing modularity, we often consider microservices - or at least a modular monolith. However, code modularity, in general, is grounded in the ability to build something from smaller, well-defined blocks, much like LEGO bricks. Ultimately, it doesn’t really matter if you deploy it as a microservice or as a regular monolithic app.
Poorly split microservices can lead to even worse issues than non-modularized monoliths: tightly coupled systems that are difficult to maintain and scale, where a change in one part leads to cascading modifications in others - the so-called ‘shotgun surgery.’ For example, this happens when you split your code based on technical functions or team organization rather than the business capabilities you need to support.
In contrast, when modularity is applied correctly - with encapsulation, maintaining a minimal API surface, and proper boundaries - it allows for more flexible and maintainable systems. The potential area and scope of changes are usually much smaller.
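To make that concrete, here is a minimal sketch in Go (any language with packages and visibility rules would do). The billing package and every name in it are hypothetical, invented purely for illustration: one small exported surface, everything else unexported, so a change to the pricing rules never escapes the package.

```go
// Package billing is a hypothetical example of a module with a minimal API
// surface: callers only see Invoice, New and Total. Everything else is an
// unexported implementation detail.
package billing

import "errors"

// Invoice is the only type callers need to know about.
type Invoice struct {
	lines []line // unexported: callers cannot reach into the internals
}

// line is an implementation detail hidden behind the boundary.
type line struct {
	amountCents int64
}

var ErrEmptyInvoice = errors.New("invoice has no lines")

// New builds an invoice from raw line amounts (in cents).
func New(amounts ...int64) *Invoice {
	inv := &Invoice{}
	for _, a := range amounts {
		inv.lines = append(inv.lines, line{amountCents: a})
	}
	return inv
}

// Total returns the invoice total including tax. How tax is computed is a
// private detail; changing it never forces changes outside this package.
func (i *Invoice) Total() (int64, error) {
	if len(i.lines) == 0 {
		return 0, ErrEmptyInvoice
	}
	var sum int64
	for _, l := range i.lines {
		sum += l.amountCents + tax(l.amountCents)
	}
	return sum, nil
}

// tax is unexported - the kind of code an AI (or a colleague) can freely
// rewrite without widening the blast radius of the change.
func tax(amountCents int64) int64 {
	return amountCents * 23 / 100 // illustrative flat rate
}
```

If the same logic were instead split by technical function - one service for validation, one for tax, one for persistence - a single pricing change would touch all of them, which is exactly the ‘shotgun surgery’ described above.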
The human role in AI-assisted development
Some argue that modularity and proper boundaries are only necessary for us, mere human developers. If AI will soon be able to regenerate even massive codebases with every change, maybe we should stop caring about them? However, someone will still need to review these changes, right? You don’t want to push massive changes to production and just hope for the best.
Even with full test coverage to ensure AI-generated code is correct, maintaining such a test suite becomes a nightmare, especially when changes are frequent and cover the entire application or many dependent services.
Who tests the tests?
Imagine having to update hundreds of tests every time AI regenerates a significant portion of your codebase. This happens when responsibilities leak across boundaries and internals aren’t kept private or hidden behind well-defined interfaces, so changes ripple into many areas. It’s exactly the ‘shotgun surgery’ mentioned above, playing out in your codebase: a single change requires modifications across multiple unrelated areas. Sure, you may ask AI to generate the tests, but how do you make sure they are correct? Who tests the tests?
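One way to limit that churn is to pin down behaviour through the public surface only, so a regenerated implementation doesn’t invalidate the tests. The sketch below continues the hypothetical billing package from earlier; the file name and import path are made up for illustration.

```go
// billing_test.go - a hypothetical test that exercises only the package's
// public API. If an AI regenerates the internals (the line type, the tax
// helper), this test keeps compiling and keeps guarding the contract we
// actually care about.
package billing_test

import (
	"testing"

	"example.com/app/billing" // hypothetical import path
)

func TestTotalIncludesTax(t *testing.T) {
	inv := billing.New(1000) // 10.00 in cents
	got, err := inv.Total()
	if err != nil {
		t.Fatalf("Total() returned error: %v", err)
	}
	if want := int64(1230); got != want { // 23% illustrative tax rate
		t.Fatalf("Total() = %d, want %d", got, want)
	}
}

func TestEmptyInvoiceIsRejected(t *testing.T) {
	if _, err := billing.New().Total(); err == nil {
		t.Fatal("expected an error for an empty invoice")
	}
}
```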
Assuming we, as software engineers, will transition to being mainly reviewers of AI-generated code, we’d better keep the area of changes focused to minimize complexity. It’s in our best interest. It will make our lives easier, as the amount of code to review will be reduced to a bare minimum, and, more importantly, it will be focused, with changes applied within a specific module, package, or component - you name it.
It will be easier for us to load and keep the context in our heads while reviewing, allowing for a more conscious review and a more informed acceptance or rejection of the changes.
Being the ultimate gatekeepers
On the other hand, what if the LLM happily throws one more ‘if’ onto the already massive pile of ‘if’s in our logic? It’s on us, as the ultimate gatekeepers, to decide when it has become spaghetti code and to conduct - or order - a rework of the entire solution. That rework may be slightly more complicated than the original change itself, but it serves a different goal: readability for our future selves.
And I haven’t even started on reviewing and ensuring the security of LLM-generated code. It’s just a matter of time before we see widespread attacks where malicious code is suggested and injected by an LLM, or supply chain attacks are attempted when installing suggested dependencies. That topic, for sure, deserves another blog post.
The role of LLMs as teammates
LLMs are powerful tools - there is no doubt about that - but they can’t write modular code yet.
Modularity is not something you can measure. There is no way to tell if one codebase is more modular than another. It all depends on the context the code is used in, the constraints and architecture drivers, the purpose the code serves, etc. It’s hard to train AI models on such fuzzy qualities and expect them to get it right. Again, some may say LLMs don’t need modularity, as they don’t care about code readability - it’s all machine, after all.
Human in the loop
They also don’t care about our architectural drivers or the goals we optimize a given code path for. While we can express some of these drivers in tests, someone must maintain these tests over time, and here we’re back to the point above - there is still a human in the loop.
This inability of LLMs to write well-crafted, modular code shouldn’t be an excuse for us to stop caring about it. Besides the fact that we still read the code, another reason is that the more focused the change area is, the better the results from LLMs will be - similar to how concrete prompts yield better outcomes when we chat with, for example, ChatGPT.
Modern LLMs
For instance, asking an LLM to implement a feature or perform a refactoring whose impact you know will be scoped to a package, module, or component is likely to yield better results than when the change is spread across the entire application or multiple microservice repositories.
You may say that ‘modern’ LLMs are good with large contexts and can keep entire codebases in context while assisting us. Sure, they are, but it’s like giving a model a well-crafted, precise prompt with well-specified boundaries versus throwing a general question at it and expecting great results. LLMs are trained on human-produced datasets and problems that are naturally constrained in scope so people can work on them effectively. There is a reason the entire prompt engineering discipline was born.
Setting guardrails for AI-generated code
As engineers, we are responsible for setting up guardrails for AI-generated code and adjusting whatever the AI proposes accordingly. LLMs don’t consider our project’s business context unless it is explicitly provided (can you even give the entire context to an LLM? Is that doable?). They simply generate code based on patterns and prompts.
It means it’s on us to ensure that the code they generate aligns with our architectural goals, quality, and maintainability requirements. To keep our codebase in shape, we’d have to review the code, make suggestions, iterate on the proposed solution, etc. The ability to make these changes within a relatively narrow scope - where most of the changes are confined to 'private' areas and are implementation details - gives us extra comfort, fewer compatibility issues, and allows us to truly understand and accept the changes.
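Continuing the hypothetical billing sketch from earlier, this is the shape of change that is comfortable to review: the diff touches only unexported details, the exported Invoice API is untouched, and the behaviour-level tests still pass.

```go
// billing/tax.go - a hypothetical refactor confined to implementation details
// of the billing package; nothing exported changes.
package billing

// taxRatePercent extracts the rate that was previously hard-coded in tax().
const taxRatePercent = 23

// tax keeps the same behaviour, so the public-contract tests above still hold.
func tax(amountCents int64) int64 {
	return amountCents * taxRatePercent / 100
}
```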
Foundations of software engineering don’t change
I don't believe that good software engineers are doomed in the age of AI. While our daily tasks will change significantly, the foundations of software architecture and engineering will remain crucial.
In fact, these principles will become even more important as we rely more and more on AI in our development workflows. By genuinely focusing on modularity, encapsulation, and proper boundaries, we can increase the chances that AI-generated code is correct in one place and doesn’t unexpectedly change things in other parts of the system, because we can fully understand and validate the changes.
AI is reshaping how we build software. There is no doubt about that. But in my opinion, it won't replace the need for good design and architecture, nor will it invalidate all the foundations and practices of good software craftsmanship. Instead, it will amplify the importance of these principles.