3 pillars of machine learning projects
What are the most important competencies that are required for an ML project’s success? How are machine learning projects different from software development in terms of project management?
We talked to Paweł Morkisz about building successful ML solutions and managing machine learning projects.
Paweł Morkisz — Deep Learning Algorithms Manager at Nvidia and Assistant Professor at AGH University of Science and Technology in Cracow. Former CTO & Chief Researcher at ReliaSol, a startup providing predictive maintenance solutions. He has a PhD in computational mathematics and co-authored a variety of research papers within Mathematical Analysis, Computational Mathematics and Probabilistic Methods. His work at Nvidia is focused on deep learning for time-series and recommender systems.
Can you tell us a little about your experience with machine learning projects so far?
I've seen ML projects from all angles: I was the CTO of a startup where we built a system that predicts failures based on sensor data, I work at a university where we conduct research in the field of ML, and I now work for Nvidia, where we provide tools that enable others to build ML projects.
What are the skills that you pay attention to when building a machine learning team?
I make sure that in a team, there are skills in 3 domains. First, expert knowledge is necessary — we need someone who knows what problem we're solving. If we are building a system for industry 4.0, we need someone who understands the machines we work with. If we're working on a medical project, we need someone who understands this data — probably a doctor or someone who can interpret medical records. A domain expert is always needed because these competencies are necessary to present the problem the right way and build a machine learning system well.
In addition to that, skills related to data science are also a must. There needs to be someone who can, to put it simply, convert data into a model, generalize what’s in the data into a specific piece of software. For example, if we want to have a system that will detect machine failures, we need a person who, when they receive data from sensors, will be able to build a model that will report it when the system detects something dangerous.
Last but not least, and that's a thing that sometimes seems to be forgotten, every team needs someone who can deploy all that. Machine learning doesn't end with model creation. Machine learning ends when the system is working and is integrated with other parts of the IT system so that it is fed and processed automatically. It's best if there's still some control over quality and the possibility to recalibrate or improve the model. We need to have these 3 pillars to build something meaningful. Of course, we may find a person who has all these competencies, but it is worth understanding that these are orthogonal problems that we're solving.
If I were to give advice to a manager who will work with an ML project for the first time, I would say they should understand and perform an in-depth business analysis of the problem: what they're actually solving. There are many examples where someone forgot to do their homework, built a system, and it turned out that it didn't work.
Let's take a common example: a system that recognizes whether a person is sick, where we test for some rare disease with an incidence of, say, 1 in 10,000. An algorithm that tells everyone they are healthy would be wrong only once in 10,000 cases, so it's practically never wrong. One mistake in ten thousand? That's a pretty good result. But the usefulness of such a model? None, because it does the opposite of what we wanted. We would prefer it to be wrong 5 times out of 10,000 and send 5 people for further tests than to send everyone home and let that 1 person in 10,000 die.
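The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the population size and incidence come from the example above, while the second model (one that flags the sick person plus 5 healthy people) and the helper name `metrics` are assumptions made up for the comparison.

```python
# Hypothetical screening population from the example: 10,000 people, 1 sick.
POPULATION = 10_000
SICK = 1


def metrics(true_positives: int, false_positives: int) -> tuple[float, float]:
    """Return (accuracy, recall) for a model evaluated on the population above."""
    true_negatives = POPULATION - SICK - false_positives
    accuracy = (true_positives + true_negatives) / POPULATION
    recall = true_positives / SICK  # share of sick people the model actually finds
    return accuracy, recall


# Model A: tells everyone they are healthy (no positives at all).
acc_a, recall_a = metrics(true_positives=0, false_positives=0)

# Model B (assumed): finds the sick person and wrongly flags 5 healthy people.
acc_b, recall_b = metrics(true_positives=1, false_positives=5)

print(f"Model A: accuracy={acc_a:.4f}, recall={recall_a:.0%}")
print(f"Model B: accuracy={acc_b:.4f}, recall={recall_b:.0%}")
```

Model A wins on accuracy but never finds the one person it was built to find, which is why accuracy alone is the wrong yardstick here.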
It really is crucial to understand what problem we are solving exactly, what the purpose is, what the costs and benefits of what is returned to us are. It's about presenting the problem in a way that the objective function is understandable to our model. There are a lot of examples of mistakes made at this stage of the project. I had a situation at the university when one of my students was submitting his assignment and said that he was working on a project where the model recognizes animals in photos. That's a pretty standard problem for Deep Learning. The student then said that for various animal species, the accuracy is around 90%, but for wolves — 100%. Are wolves 'easier'? No, that's not possible. So we took a look at the training and validation sets. It turned out that the reason behind this was very simple: all the wolves in the data set were photographed on snow, with a white background. And they were the only animals on a white background, so the model has learned how to effectively detect white backgrounds, and this can actually be done with 100% accuracy. This is a very good example of a training set that isn't representative — because if we gave the model a wolf on green grass, there is no chance it would classify it as a wolf when it was trained only on wolves on the snow.
In practice, it is very important to understand that for supervised learning, the model will only detect in new data the things we've previously taught it. It's crucial to take this into account when preparing the requirements for the system.
What makes ML projects different from other software solutions, and why are they sometimes considered difficult to manage?
The role of the project manager is to make sure that the business analysis and the industry expert's perspective are well understood and reflected in what the data scientist sets out to model. If the two sides don't reach an agreement and each just assumes they know it all, they end up repeating bad examples like the ones I've mentioned before.
I believe that most projects have a problem at this stage where the data scientist got the wrong information about what is expected from them. There's another interesting example: there was a system built in China that called people and offered a new phone subscription. The system's only goal was to sell as many subscriptions as possible, so the system began to lie. It promised customers unlimited calls, roaming, the whole package at extremely low prices — so people agreed. Who wouldn’t? The system was 100% successful. But after a month, complaints came flooding in: what was on the invoice did not match what was sold, so the company checked the call recordings, and they found that the system learned that if it follows this path and lies, it will definitely sell. This system had a wrong objective function. It maximized the likelihood of selling a product but there were no specific requirements or constraints that there's only a range of options that can be offered. This is again an example of how important it is for business analysis to go hand in hand with modeling.
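A toy sketch can show what a missing constraint does to an objective like this. It is not how the system in the story worked; the `Offer` class, the offer names, and the probabilities are all made up for illustration. The only point is that the same "maximize the chance of a sale" rule picks a very different offer once the set of allowed offers is part of the problem statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    sale_probability: float  # made-up estimate that the customer says yes
    deliverable: bool        # can the company actually honor this offer?


# Hypothetical offers; the last one is the "lie" that always sells.
OFFERS = [
    Offer("standard plan", 0.10, True),
    Offer("discounted plan", 0.25, True),
    Offer("unlimited everything, almost free", 0.95, False),
]


def best_offer_unconstrained(offers: list[Offer]) -> Offer:
    """Objective with no constraints: just maximize the chance of a sale."""
    return max(offers, key=lambda o: o.sale_probability)


def best_offer_constrained(offers: list[Offer]) -> Offer:
    """Same objective, restricted to offers the business can really deliver."""
    allowed = [o for o in offers if o.deliverable]
    return max(allowed, key=lambda o: o.sale_probability)


print(best_offer_unconstrained(OFFERS).name)
print(best_offer_constrained(OFFERS).name)
```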
Machine learning often involves experimental work and sometimes things don't go right the first time around. Is there a specific way to manage such an uncertain process?
There are different approaches. Depending on what our budget is and how important the problem is, we can start with a pilot, and this is something I observed in projects that were successful. In such projects, we tried to identify a Pareto pattern at the business analysis level (e.g. 90% of problems are generated by 10% of cases), solve the problem only for this group, and prove that our methods can work and be rolled out. If that went well, then we extended it to other aspects. This approach is usually safer, less painful, and helps us quickly assess whether there's anything missing in the data, pipeline or software. On the other hand, it also drastically extends the entire implementation process because doing something like this can take as much time as building the entire system. There is also some risk that the pilot works out but we will not be able to extend it further.
Starting with a pilot is justified, though, especially in projects where we don't know if something will work. If I were to build a license plate recognition system, I wouldn't take the pilot path because we know how it's solved, we know that these systems work. But if it's something completely new, no one in the world has done something like that before, I would advise doing a pilot.
If we have a very large budget, the problem is weighty, and we have an investor who’s willing to back us, then going 'in full swing' increases the probability of success. Startups, however, sometimes promise investors too much: funding comes in, the startup does not deliver, and everyone is disappointed. Building such solutions milestone by milestone is a good approach, and it prevents the startup from diluting too much, because raising subsequent rounds tailored to a given stage means that each time we tackle the part of the problem that is the least risky at that moment. If I only have an idea and I want $10 million, someone can say "Ok, give me 50% of the company in exchange". But if I took $1 million for 10% of the shares, built a working MVP or PoC, and then gave another 10% for further funding, in the end I would give away 20%, not half of the shares, because the company's value increases dramatically between the moment when all I have is an idea and a pitch and the moment when I have a working system. There are also situations when someone believes that putting 10 times the money into a project will speed it up, or when you have to fight because there is a lot of competition, and you have to take all of that into account. There is no generic answer here; you can't say up front that you need, for example, half a million dollars to start an ML project. Machine learning can be many different things, and these projects are very different.
Now the question is: how to run an ML project well? There is no definite answer to this. Certainly, when designing the system, it is worth ensuring that it is modular enough to allow us to extend it to other functionalities and new areas. What is also very important in my opinion is the separation of the specific models we use from the rest of the system. That is: not being attached to one type of model. Let's imagine that we have a system based on classic machine learning, e.g. decision trees, and it works very well, it is 90% accurate. Suddenly, it turns out that someone has published a paper and a repository with a new deep learning method that is 98% accurate. If we're tied to a specific library and a specific type of model, then we have to rewrite the system. But if it's built in a modular way, i.e. the system takes input, processes it, and returns the output, with an appropriate abstraction layer between these modules, then it’s not a problem: we only attach other methods underneath and we can really switch to a new type of model.
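One way to picture such an abstraction layer is the minimal Python sketch below. All names here (`FailureModel`, `ThresholdModel`, `MovingAverageModel`, `MonitoringSystem`) are hypothetical, and the two "models" are trivial stand-ins rather than real decision trees or neural networks. The point is only that the rest of the system depends on a small interface, so the model underneath can be swapped without touching it.

```python
from typing import Protocol, Sequence


class FailureModel(Protocol):
    """The only thing the rest of the system knows about a model."""

    def predict(self, readings: Sequence[float]) -> bool: ...


class ThresholdModel:
    """Stand-in for the 'classic' model: any reading above a fixed threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, readings: Sequence[float]) -> bool:
        return max(readings) > self.threshold


class MovingAverageModel:
    """Stand-in for a newer method: mean of the last `window` readings is high."""

    def __init__(self, window: int, threshold: float) -> None:
        self.window = window
        self.threshold = threshold

    def predict(self, readings: Sequence[float]) -> bool:
        recent = readings[-self.window:]
        return sum(recent) / len(recent) > self.threshold


class MonitoringSystem:
    """Takes input, asks the model, returns output. Knows nothing about model internals."""

    def __init__(self, model: FailureModel) -> None:
        self.model = model

    def handle(self, readings: Sequence[float]) -> str:
        return "ALERT" if self.model.predict(readings) else "OK"


readings = [1.0, 2.0, 9.0, 2.0]
old_system = MonitoringSystem(ThresholdModel(threshold=5.0))
new_system = MonitoringSystem(MovingAverageModel(window=2, threshold=6.0))
print(old_system.handle(readings), new_system.handle(readings))
```

Swapping the model is a one-line change where the system is constructed; `MonitoringSystem` itself stays the same.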
What about estimating the time of work in ML projects? Is it worth estimating at all?
When it comes to estimation, it is quite a difficult question because in experimental work, it is difficult to determine how long it will take to build a model. What I usually do is brainstorm what methods we would like to use, we list them, and for each such approach we can already plan something specifically, and for each component, we can assign an estimate. That's also safe when we are aware that the project may not work out. For example, someone asks us to do a pilot in an area where no one has done it before — the deal should be safe for both parties. The contractor of such a system wants to know that they will be paid for their work, regardless of whether the model works or not. The client wants to at least know what's been done.
Decomposing a big problem into specific procedures that we will carry out as part of the experiments allows us to later show that we tested given methods, that each of them produced these results, and that here are the scripts with which you can replicate them. It's documented: we have done all the work and the client can take what we delivered. Then we also don't have to rely so heavily on the outcome. We can have a success fee, but we are sure that when we do our job and conduct the experiments, someone will pay us for it. It is nice to set a milestone here, e.g. if we achieve sufficient quality, we can go further. But if we see it's not going well (it's 50-50: we're either right or wrong), it is clear that something is missing: it may be the data, or the problem is poorly presented, or it simply cannot be done at the moment. This must be detected as early as possible in order not to invest a lot in such projects.
It is also helpful to know how to identify problems that are experimental and risky. When it comes to, for example, image recognition, practically everything that a human can recognize, the computer can recognize as well. If we can limit ourselves to image analysis and there are some patterns that a human can recognize with the naked eye, then chances are high that this project is not risky, that there is a solution that can be done. The only question is whether we have a training set and whether this set will be representative of what will happen in the real system. If so — it should be successful, then planning several types of networks should be enough.
Obviously, the larger the project, the more difficult it is to make an estimation, so decomposing it is advised. However, you have to be aware that making a PoC or a pilot makes the entire process longer. For some projects it is not necessary because, returning to the example of license plate recognition, if I only search for numbers, I have to do a lot of work twice. We can also divide the project into parts so that the PoC is not focused on machine learning; instead, we prove that it works as a separate module, and we only build the system when this module works. We can carve a piece out of the project from whichever side we like, but it's important that we do the business analysis thoroughly.
ML projects are carried out by interdisciplinary teams — how are they integrated with other tasks within an engineering project?
I believe that it should come out already at the stage of building the concept and business analysis, i.e. we should get information about what we expect from the analytics module. If we want to put given data into the model, we want to get a specific answer and we expect accuracy measured by some metric. We want these two worlds to work independently: on the one hand, there is this model, we know its characteristics, and on the other hand, there is a system that gives something to the model, receives something from it and on this basis, we should have a ready-made solution.
I also think that this first stage, i.e. business analysis, must not be omitted. I don't like it when someone says “here you have the data, we'll send you a sample, do the system”. Without some kind of workshop, if the project is not really trivial in terms of application and business analysis, it is unfortunately doomed to failure. There are always many aspects to consider when building a model.
Do you think it is possible to develop good practices in ML project management and follow them in such projects?
It all depends on the specific case. For some business applications, an accuracy of 50% is great and that's something we want, something that brings us a lot of money, but in other cases, an accuracy of 98% means that the system can't be applied.
When we were carrying out a project related to predicting machine breakdowns, there were very big differences in what the clients expected: one wanted to be notified 5 minutes prior to the failure, and the other said "if I don't know 3 days before it happens, it doesn't have any value for us". Such differences emerged at the stage of building business requirements. Building analysis of functionalities and describing them well is definitely an element that should be replicated in ML projects.
I haven't seen any collection of such good practices for ML projects but I've also never looked for any, they could probably be developed. But when we hire the right people, their experience is already enough. The project management itself can be carried out by the team or a dedicated person. It's great if we have a person with project management competences and experience with ML projects because they understand the risks. However, it's ok to have a good project manager without prior ML experience, give them their first project, guide them through the process, and they will learn fast.
The role of the project manager is partly to take PM responsibilities off the shoulders of the team, but also to understand, coordinate, and spot the risks in these three orthogonal areas. If the PM role within the ML team is given to a data scientist, they will pay the most attention to the modeling, while business analysis and implementation will get less attention. If we entrust management to the person responsible for the deployment part of the project, it is possible that the system will be well designed and well implemented, but the model will not return what we need. Because we have an interdisciplinary team, having someone responsible for coordinating this work is a big advantage. I really like it when there is a project manager on the project who requires a timeline, determines what is to be delivered and when, accounts for all of it, and watches over and checks the quality. The project manager is also the one who states that something has gone wrong, which is sometimes difficult. A data scientist might say that the metric we were supposed to measure isn't looking great, but on a different metric the model works very well. Then the project manager verifies that, asking the business analyst whether changing the metric really does not matter, because it may turn out that we're on the way to building a system that tells sick people they're healthy. The PM is the person who watches over the entire process.
I am a fan of the approach in which we set specific requirements and quality criteria for individual components and try to account for them. Otherwise, it may turn out that we will not get through the integration phase because each module works fine in isolation, but we will not be able to fit them together. Here, good project management preparation is important. Clearly define the criteria: when is it green, yellow, red? Check it by asking the team a simple question: "Are you in the green, yellow or red group?" "Red. But look, here ..." "No. It's red, we report it as red, we report there is a problem." Sometimes this relentlessness saves projects. The project manager does not need detailed knowledge of ML, or actually any at all, but after a few projects, they will definitely gain the necessary know-how. But different teams have different needs, and that’s also true for project management: there isn’t one universal way to handle it.
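Criteria like these can be written down so that the status is computed rather than argued about. The sketch below is one possible shape, not a method described in the interview: the function names, the thresholds, and the component names are assumptions, and it supposes that every component has a single "higher is better" metric agreed on up front.

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Higher number means a worse status; used to pick the worst one overall.
SEVERITY = {Status.GREEN: 0, Status.YELLOW: 1, Status.RED: 2}


def component_status(measured: float, green_at: float, yellow_at: float) -> Status:
    """Map a 'higher is better' metric onto the thresholds agreed in advance."""
    if measured >= green_at:
        return Status.GREEN
    if measured >= yellow_at:
        return Status.YELLOW
    return Status.RED


def project_status(statuses: list[Status]) -> Status:
    """The project is only as healthy as its worst component."""
    return max(statuses, key=lambda s: SEVERITY[s])


# Hypothetical components with made-up measurements and thresholds.
report = {
    "model accuracy": component_status(0.92, green_at=0.95, yellow_at=0.90),
    "pipeline uptime": component_status(0.999, green_at=0.99, yellow_at=0.95),
    "integration tests passed": component_status(0.70, green_at=0.98, yellow_at=0.90),
}
overall = project_status(list(report.values()))
print(report, overall)
```

With the thresholds fixed before the work starts, "red" is simply what the numbers say, and the conversation can move on to what to do about it.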