How to support software architect skills development in your company
At the beginning of this year we conducted an internal survey among our developers about skills development and career goals. One of its conclusions was that there is a group of people who, in the more or less defined future, would like to take on more architect-type roles in projects and focus on software design. We discussed how we can support them and help them gain the skills they need to become an architect. One of the key ideas we came up with was to start the Software Architecture Discussion Group.
What is the Software Architecture Discussion Group
Architectural Katas
The original idea for the group was to meet fortnightly and solve Architectural Katas. What are those? Exercises whose goal is to discuss technology options and create a rough solution to a task given upfront. Simply put, such Katas teach us how to design software architecture. Architectural Katas come from Ted Neward and were originally meant to be solved in small groups. We have modified this concept a bit to account for the fact that our company is fully distributed. First, we chose a Kata to be solved, then everyone who wanted to participate prepared their own rough solution to the given problem in a form that could be presented in about 5 minutes. After two weeks we held a meeting on Zoom where we showed our solutions and discussed their advantages and disadvantages. So far we have covered the following problems this way:
Where’s Fluffy
The problem in this Kata was to design an application for finding missing pets. The most important topics we discussed concerned research: how many active users such an application may have and how many pets go missing daily. That information allowed us to plan the right solution without too much overengineering (developers somehow have such tendencies). Another important decision was the choice of a database that supports searching by GPS coordinates, such as PostgreSQL with PostGIS or MongoDB.
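To make the idea of searching by GPS coordinates more tangible, here is a minimal Python sketch of a radius search based on the haversine formula. It is only an illustration, not the solution presented at the meeting: the function names and the dictionary-based sighting records are assumptions, and a real application would rather delegate this to a geospatial index in the database than scan a list in memory.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a rough estimate


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sightings_within(sightings, lat, lon, radius_km):
    """Return the sightings (hypothetical dicts with 'lat' and 'lon') inside the radius."""
    return [
        s for s in sightings
        if haversine_km(lat, lon, s["lat"], s["lon"]) <= radius_km
    ]
```

A linear scan like this is fine for a handful of records; with many pets and users, the geospatial capabilities of the chosen database are what keep such a query fast.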
Who’s your daddy
This time the topic was building the largest genealogical graph in history. Did you know that more than 108 billion people have ever been born, 12 billion of them during the last 120 years? We discussed how many people are likely to be added to the application, how many users and records the competition has, and which graph database would suit such volumes, e.g. Neo4j or ArangoDB.
Going Going Gone
The task was to design a system for an auction company. The described auction model allowed people to participate both live and online, so it didn’t require low latencies. An additional challenge was to support live streaming of auctions to thousands of participants; WebRTC with a CDN could be used for that.
Fantasy Fantasy NFL
An application for fantasy football, a game in which participants manage virtual teams made up of real players. The main challenges for this task were the scale and the requirement to create video footage with highlights for every fantasy team. Our research showed that there are 256 matches during one season, so building complicated machine learning for automatic extraction of the best moments of specific players may not pay off; it may be simpler to just hire people to mark them. The application would then show users the videos for all players from their virtual team one by one. A CDN (Content Delivery Network) can be quite useful here due to the number of users. It is also worth thinking about the licences for video codecs: whether they are free or whether they carry additional royalties.
Interested in how to design software architecture? Check out "Software Architecture Guide" by Martin Fowler and my blog post about "How to design microservices architecture"
Technology research meetings
To add some variety we have introduced another meeting format. One of the simplest definitions of an architect is that they are not an ultra-expert in any specific technology (knowing all possible configuration options), but a person who knows many technologies well enough to determine whether each is the right choice for a given problem and what its advantages and disadvantages are. In other words: breadth of knowledge. Nowadays it is often difficult to stay up to date with every technology that appears on the market. Additionally, thorough research is difficult and takes time. In technology research meetings we choose a technology, write down the questions we want answered and divide the work among a group of a few people. Two weeks later we meet, share our findings and discuss who has used the technology and whether it fits any ongoing or past project.
The usual questions we ask are:
- What are the use cases?
- What is the architecture?
- Does it scale, how is it clustered?
- Safety: is replication and/or automatic fail-over supported?
- What happens during network partition/split brain?
- Does it support at-least-once or effectively exactly-once delivery?
- What are the alternative solutions?
- Are there any hosted offerings?
- What’s the cost?
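The delivery guarantee question is easier to discuss with an example. Below is a minimal Python sketch, under our own assumptions, of how "effectively exactly-once" processing is often obtained on top of at-least-once delivery: the consumer remembers the IDs of messages it has already handled and ignores redeliveries. The class and method names are hypothetical and do not come from any of the technologies listed below, and a real system would have to persist the seen IDs together with the state rather than keep them in memory.

```python
class IdempotentConsumer:
    """Sums payment amounts, tolerating duplicate deliveries of the same message."""

    def __init__(self):
        self.seen_ids = set()  # IDs of messages that were already applied
        self.total = 0

    def handle(self, message_id, amount):
        """Apply the message once; return False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False  # redelivery: the effect was already applied
        self.total += amount
        self.seen_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
# At-least-once delivery: message "m1" arrives twice.
results = [
    consumer.handle("m1", 10),
    consumer.handle("m2", 5),
    consumer.handle("m1", 10),
]
```

The broker may deliver a message more than once, but its effect on the state is applied only once, which is what "effectively exactly-once" usually means in practice.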
So far we have analyzed and discussed:
Apache Ignite
Ignite is an in-memory computing platform. It can act as a cache, a data grid or an in-memory database. To learn more, take a look at our blogpost summing up our findings: “Cure your FOMO - what is Apache Ignite in 5 minutes”.
Apache Pulsar
Pulsar is a messaging and streaming platform often compared to Apache Kafka. You can find a detailed comparison of them in our blogpost “Comparing Apache Kafka and Apache Pulsar”.
Istio
Nowadays you can observe a rising popularity of service meshes. You may already have heard about Istio, Linkerd, Consul or AWS App Mesh. We decided to talk about Istio, its use cases and what it offers in terms of security, observability, traffic management, resiliency and testing. Sorry, no SoftwareMill blogpost here! (yet?)
Apache Spark
Spark is not a new tool (its history dates back to 2009!) but it is still popular on the market. It offers quite a lot of features, so we organized two meetings to cover all of them. We talked about:
- Architecture, what happens during node failure?
- Use cases, when to use it, with what amounts of data?
- RDDs, Datasets, DataFrames - advantages and disadvantages
- Shuffling and when it happens
- Spark SQL
- Deployment options & cloud offerings
- SparkR, PySpark and notebooks (Jupyter, Zeppelin, …)
- Stream processing, realtime and micro batches, guarantees, checkpoints, persistence, windowing
- MLlib
- GraphX
- Testing
- Monitoring
- Spark alternatives and how it compares to Hadoop and Kafka Streams
Auth0
Auth0 is described as an identity management platform and offers tons of features. During the meeting we discussed authentication and authorization capabilities, conformance to the OAuth2 and OpenID standards, and features related to anomaly detection, branding, hooks, rules, extensions & others. To read a bit more about Auth0 and decide whether it's for you, take a look at our blogpost “Never write a UserService again, or when to use external microservices”.
EventStore
Event Store, created by Greg Young, is a database built for Event Sourcing. So what is Event Sourcing? It is an architectural approach in which, instead of the current state, we store the events that describe how the state was modified; the current state can be rebuilt at any time by replaying them. Quite a popular option is to implement Event Sourcing with Akka Persistence or just a plain SQL database.
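For readers who have not met Event Sourcing before, here is a minimal Python sketch of the idea. It is not the API of Event Store or Akka Persistence; the bank-account domain and the names Deposited, Withdrawn, apply_event and replay are assumptions made purely for illustration. Only the events are stored, and the balance is derived by folding over them.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply_event(balance, event):
    """Compute the next state from the current state and a single event."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise ValueError(f"Unknown event: {event!r}")


def replay(events, initial=0):
    """Rebuild the current state by applying all stored events in order."""
    return reduce(apply_event, events, initial)


# The event log is the source of truth; the balance is never stored directly.
event_log = [Deposited(100), Withdrawn(30), Deposited(5)]
balance = replay(event_log)
```

Because the log is append-only, replaying a prefix of it gives the state at any earlier point in time, which is one of the reasons the approach is attractive when an audit trail is required.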
Discussions
Apart from Katas and technology research meetings we have a third format: general discussions about project problems or about topics from blog posts we have read.
Config service
During one of the meetings we talked about building a service for handling dynamic business-related parameters. One of the requirements was a full audit log. We discussed the pros and cons and how to integrate such a service with various technologies. Apache Kafka was mentioned as one way to notify services about parameter changes.
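A rough Python sketch of such a service might look as follows. It is an illustration under our own assumptions rather than the design discussed at the meeting: the class, field and method names are hypothetical, everything lives in memory, and plain in-process callbacks stand in for whatever notification channel (for example a Kafka topic) would be used in practice.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class AuditEntry:
    key: str
    old_value: Any
    new_value: Any
    changed_by: str


class ConfigService:
    """Keeps current parameter values plus an append-only audit log of changes."""

    def __init__(self):
        self._values: Dict[str, Any] = {}
        self.audit_log: List[AuditEntry] = []
        self._listeners: List[Callable[[AuditEntry], None]] = []

    def subscribe(self, listener):
        """Register a callback invoked after every parameter change."""
        self._listeners.append(listener)

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value, changed_by):
        """Change a parameter, record who changed it and notify subscribers."""
        entry = AuditEntry(key, self._values.get(key), value, changed_by)
        self._values[key] = value
        self.audit_log.append(entry)
        for listener in self._listeners:
            listener(entry)


service = ConfigService()
notifications = []
service.subscribe(notifications.append)
service.set("max_discount", 10, changed_by="alice")
service.set("max_discount", 15, changed_by="bob")
```

Every change ends up in the audit log with the previous value and its author, and each subscriber receives the same entry, so consumers never need to poll the service for updates.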
Entity service
This meeting was a discussion about a set of articles related to the Entity service model. The question was: is it a pattern or an antipattern?
What was the result of our discussion? The famous: it depends ;)
Software Development Reading Club
Apart from the Architecture discussion group, we have a much older and very popular initiative: the reading club. We start by voting on a book title and then read it chapter by chapter. During bi-weekly meetings we discuss what we have read, explain the less clear parts and think about applications. Currently we’re reading Database Internals by Alex Petrov, quite a difficult book about the data structures & algorithms used in databases. During previous editions we read various publications about microservices, architecture and functional programming.
Conclusions
There are a lot of options to improve your skills. Of course not everyone has time to attend all possible meetings, so we try to record them and let colleagues watch them later. If you’d like to organize your own Architecture Kata, you can choose one of the topics available on Ted Neward's website. If you have any questions about our meeting formats, just let us know! Remember, as Stephen Covey said, it is worth taking the time to Sharpen the Saw.