Guide to AWS ML Specialty Certificate
As an ML Engineer with significant Software Engineering experience, I often find myself trying to reinvent the wheel, so to speak. After multiple hours of development, I often discover a ready-to-use solution that works better than my custom code and, more importantly, does not need to be maintained later.
Of course, off-the-shelf solutions are often limited in their configuration options and may change at the whim of their authors, but in most cases they let you build a solution with less effort and in a fraction of the time.
To mitigate that and gain in-depth knowledge about the pros and cons of such solutions, I decided to take the AWS ML Specialty certification exam. It gives me a comprehensive view of what is possible and where the shortcomings lie in AWS ML services (the cloud I have the most experience with).
In this blog post, I will cover an exam guide followed by four interesting AWS services that might significantly improve your ML pipeline within minutes of being set up.
Why is it worth having?
Firstly, you can learn what is happening under the hood. In a time when AI is being adopted at almost every step, Software Engineers can benefit from understanding how the underlying models and architectures work, so that they can build and fix them according to the requirements.
Additionally, you can learn something new even as someone with experience. As an ML practitioner, I was able to systemize some parts of my ML knowledge that I had less experience with, e.g., which time-series solution is best for which setup.
Moreover, you can certify your knowledge of both ML and AWS. Having such a certificate shows a prospective employer that you can handle ML topics and quickly deploy solutions on AWS. Even with multiple years of experience in the field, I found some of the tools covered in this post interesting and not so standard (often going beyond training simple models in SageMaker).
Finally, you can feel like a real-life ML consultant while learning. In a significant part of my work, I am often tasked with finding the most suitable solution given requirements from less ML-savvy people. The exam questions are often framed like such real-world problems (e.g., build something like Amazon Alexa with the least effort), and you must provide the client with the most suitable solution (consider a combination of end-to-end tools like Lex and Polly).
AWS ML Study path
The Specialty certificate sits at the top of the machine-learning certifications available from AWS. Still, if you are reading this and feel less advanced, you might consider the lower-level certificates, which might be more suitable for your knowledge. Here, I mean the two ML-related ones (at the moment of writing, they are still in beta):
- AI Practitioner (Foundational level) is also suitable for people with fewer tech skills.
- Machine Learning Engineer (Associate level) significantly overlaps with the ML Specialty level, but with easier-to-grasp detail and less ML expertise required.
Additionally, as ML positions might have different faces (MLOps, Prompt Engineer, Data Scientist or ML Engineer), I can recommend the following guide (from the official website of AWS):
Source: AWS journey path
Exam structure and details
The ML Specialty exam is an AWS certification exam consisting of multiple-choice questions (some with multiple correct answers). For this particular exam, you must answer 65 questions in 180 minutes. It is a good blend of cloud-related knowledge and ML-specific expertise.
Questions come from the following domains:
Source: Official ML Specialty Study Guide
To pass, you have to score at least 750 (on a scale from 100 to 1,000).
According to the guide, the questions within each domain can cover the following topics (an excerpt from the AWS study guide):
- Domain 1: Data Engineering
- 1.1 Create data repositories for machine learning.
- 1.2 Identify and implement a data-ingestion solution.
- 1.3 Identify and implement a data-transformation solution.
- Domain 2: Exploratory Data Analysis
- 2.1 Sanitize and prepare data for modelling.
- 2.2 Perform feature engineering.
- 2.3 Analyze and visualise data for machine learning.
- Domain 3: Modeling
- 3.1 Frame business problems as machine learning problems.
- 3.2 Select the appropriate model(s) for a given machine learning problem.
- 3.3 Train machine learning models.
- 3.4 Perform hyperparameter optimisation.
- 3.5 Evaluate machine learning models.
- Domain 4: Machine Learning Implementation and Operations
- 4.1 Build machine learning solutions for performance, availability, scalability, resiliency, and fault tolerance.
- 4.2 Recommend and implement the appropriate machine learning services and features for a given problem.
- 4.3 Apply basic AWS security practices to machine learning solutions.
- 4.4 Deploy and operationalise machine learning solutions.
As the above requirements might seem a bit vague, I will cover some helpful learning resources in the next section.
Useful resources
To prepare for the exam, I completed two courses: one from Whizlabs and one on Udemy (from Stéphane Maarek and Frank Kane).
The Udemy course from Stéphane Maarek and Frank Kane is the best resource for passing the exam. They present the material clearly, highlighting the information that might appear on the exam.
The Whizlabs course, on the other hand, is a great resource for gaining hands-on experience (with more than 12 labs), which will help you better grasp how certain tools look in the AWS Console and what parameters are available.
AWS Skill Builder is an excellent tool to supplement your learning and find additional knowledge in its articles.
Last but not least is the AWS documentation (whether for ML, SageMaker, or other data-related tools), which came in handy for a variety of exam questions.
Tips for the exam
As you may have noticed, I completed quite a few practice exams to prepare and gain confidence, and I can offer a few tips:
- Mind the time limit when taking the exam. During practice, I finished on average within 60% of the time, while during the real exam I used the entire available time. If you are not a native English speaker, you can request an extra 30 minutes.
- Consider booking a practice exam with AWS. It will tell you whether you are ready and give you a better feel for how the real exam will go.
- Do not limit yourself to the simplest and most popular question sets. Practice more complex questions as well to prepare for more challenging scenarios, which may also be present in the exam.
Interesting AWS ML features
In this section, I will cover a few of the features of AWS I learned about while preparing for this certificate.
Two faces of the Sagemaker Canvas
Sagemaker Canvas workflow as a no-code solution for ML workflows, source: aws.amazon.com
As someone who prefers to understand how underlying components work, I was sceptical about the whole AutoML trend. However, AWS SageMaker Canvas is on a whole new level. This solution allows the creation of an end-to-end ML lifecycle in a drag-and-drop fashion. This kind of solution is popular in AI; Roboflow, Encord, and SuperAnnotate provide similar solutions.
Although this solution might not be the first choice for seasoned Machine Learning Engineers who need custom setups, as its target group is a less technical audience, it can still be useful for simpler use cases that need to move quickly.
The cost itself isn't large, as one instance costs $1.90/hour (with small extra fees when you have more than 5 GB of data).
On the other hand, as anyone who has worked with AWS knows, there might come a time when you suddenly forget that a particular service is running and your AWS bill keeps growing. This is common, but with SageMaker Canvas it can go to the next level: there have been reports of AWS changing its charging rules, causing extra costs on a forgotten Canvas instance.
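For a rough sense of scale (a back-of-the-envelope sketch assuming the ~$1.90/hour rate mentioned above; actual pricing varies by region and may change), here is what a Canvas session left running unnoticed could add to your bill:

# Back-of-the-envelope cost of a forgotten SageMaker Canvas session
# (assumes the ~$1.90/hour rate mentioned above; actual pricing may differ by region).
HOURLY_RATE_USD = 1.90

for days in (1, 7, 30):
    hours = 24 * days
    print(f"Left running for {days:2d} day(s): ~${HOURLY_RATE_USD * hours:,.2f}")

A single forgotten month-long session already lands in the four-digit range, which is why the vigilance mentioned below matters.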
To sum up, AWS Sagemaker Canvas might be a valuable tool when you need to produce a solution quickly; however, you must be extra vigilant about removing unused resources and closely monitoring the cost.
Solving the fairness and bias with AWS Clarify
Global explanations on iris dataset using AWS Clarify, source docs.aws.amazon.com
AWS Clarify is a tool that allows you to evaluate and interpret the models' results and possible biases in your dataset. Focusing on these areas might be required when adhering to guidelines like ISO/IEC 42001 when preparing data, creating a model, or running it in production.
When detecting bias, you need to define demographic groups (e.g., race, gender, age), and after that, you can use one of the available metrics (a short illustration follows the list):
- Class Imbalance - i.e., one class might be underrepresented.
- Demographic Parity Difference - measuring the difference in positive outcomes between demographic groups (i.e., whether white males are more likely to get higher credit scores).
- Difference in Positive/Negative Outcomes - measuring the difference in the likelihood of positive and negative outcomes between the groups.
- Disparate Impact - i.e., checking whether minority groups are not unfairly disadvantaged.
- Equal opportunity difference - measuring the difference in true positive rates between groups.
- Average odds difference - comparing differences in false positive and true positive rates across the groups.
- Conditional Demographic Parity - checks for fairness in positive outcomes across the groups while conditioning on other features.
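To make a couple of these metrics concrete, here is a minimal sketch in plain pandas (not the Clarify API) computing class imbalance and disparate impact on a made-up loan-approval table; the column names and values are purely illustrative:

import pandas as pd

# Hypothetical loan-approval data; "group" is the demographic facet, "approved" the outcome.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [ 1,   1,   1,   0,   1,   0,   1,   0,   0,   0 ],
})

# Class Imbalance: (n_a - n_b) / (n_a + n_b) for facet sizes n_a and n_b.
n_a = (df["group"] == "A").sum()
n_b = (df["group"] == "B").sum()
class_imbalance = (n_a - n_b) / (n_a + n_b)

# Disparate Impact: ratio of positive-outcome rates between the two groups.
rate_a = df.loc[df["group"] == "A", "approved"].mean()
rate_b = df.loc[df["group"] == "B", "approved"].mean()
disparate_impact = rate_b / rate_a

print(f"Class imbalance:  {class_imbalance:.2f}")   # 0.20 -> group B is underrepresented
print(f"Disparate impact: {disparate_impact:.2f}")  # ~0.38, well below 1.0 -> group B is disadvantaged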
When considering the models' interpretability, AWS Clarify offers Kernel SHAP for both local interpretability (sample level) and global interpretability (dataset level). Such values can help understand feature attributions at both training and deployment time, which could be a good starting point for data drift detection.
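Clarify computes these attributions with Kernel SHAP under the hood; as a rough illustration of the underlying idea outside of AWS, below is a minimal sketch using the open-source shap package on a scikit-learn model trained on the iris dataset (per-sample local values, with mean absolute values as a simple global view):

import numpy as np
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a small model on the iris dataset (the same dataset as in the figure above).
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def predict_setosa(data):
    # Probability of the first class; a single output keeps the example simple.
    return model.predict_proba(data)[:, 0]

# Kernel SHAP is model-agnostic: it only needs a prediction function and a background sample.
background = shap.sample(X, 50)
explainer = shap.KernelExplainer(predict_setosa, background)

# Local explanations: per-feature attributions for individual predictions.
local_values = explainer.shap_values(X[:5])

# Global view: average absolute attribution per feature across the explained samples.
global_importance = np.abs(local_values).mean(axis=0)
print(global_importance)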
If you are interested in Explainable AI in Computer Vision, consider checking our library: FoXAI.
Read: FoXAI for pneumonia
A simple way to reduce labelling time with AWS Ground Truth Autolabeling
SageMaker Ground Truth for LIDAR data, source: aws.amazon.com
As tools for labelling go, the AWS one is similar to the others. However, a distinguishable feature is active learning (effectively making labelling a hybrid process where the efforts of human labellers are combined with an automatically retrained model). Combined with the vast model repository available on the AWS cloud (Bedrock, dedicated services or custom deployed models), it might create quite a powerful mix.
As for setting it up, AWS suggests in its guide labelling at least 1,250 objects manually first (with 5,000 being a strong recommendation). After that, the workflow above trains a model for the given domain and runs it in batch mode to label the unlabelled part of the dataset (whenever the confidence of a predicted label exceeds the configured threshold). Low-confidence examples are sent to human labellers. These two steps repeat iteratively until no low-confidence examples remain. This hybrid workflow can greatly decrease labelling time (according to AWS, by up to 70%). For more detailed information, I recommend the following guide from AWS.
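To make the loop concrete, here is a small, self-contained simulation of that hybrid logic on synthetic data (scikit-learn stands in for whatever model Ground Truth trains internally, and the ground-truth array plays the role of the human annotators; the 0.9 threshold and batch size are illustrative, not Ground Truth defaults):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y_true = make_classification(n_samples=2000, n_features=10, random_state=0)
CONFIDENCE_THRESHOLD = 0.9   # illustrative value
HUMAN_BATCH_SIZE = 50        # illustrative value

labelled = list(range(200))                          # the manually labelled seed set
unlabelled = list(range(200, len(X)))
labels = {i: int(y_true[i]) for i in labelled}
human_count = 0

while unlabelled:
    # Retrain on everything labelled so far, then score the remaining pool.
    model = LogisticRegression(max_iter=1000).fit(
        X[labelled], [labels[i] for i in labelled]
    )
    proba = model.predict_proba(X[unlabelled])

    auto, low_confidence = [], []
    for pos, idx in enumerate(unlabelled):
        p = proba[pos]
        if p.max() >= CONFIDENCE_THRESHOLD:
            auto.append((idx, int(p.argmax())))      # confident -> accept the machine label
        else:
            low_confidence.append(idx)               # uncertain -> a human should decide

    for idx, label in auto:
        labels[idx] = label
        labelled.append(idx)
    for idx in low_confidence[:HUMAN_BATCH_SIZE]:    # humans take a batch of uncertain items
        labels[idx] = int(y_true[idx])               # ground truth stands in for annotators
        labelled.append(idx)
        human_count += 1

    unlabelled = low_confidence[HUMAN_BATCH_SIZE:]   # the model retries the rest next round

print(f"Humans labelled {human_count} of {len(X) - 200} objects")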
AWS Ground Truth is also famous for its on-demand cheap Mechanical Turk workforce. However, most solutions either require specialised domain knowledge or have high data privacy standards, limiting the usability of the AWS workforce.
Human oversight with AWS Augmented AI
When designing a Machine Learning solution, it is nearly impossible to achieve 100% accuracy. You often have to find the best trade-off between false positives and false negatives. But what if I told you that is not necessarily a lost cause?
AWS Augmented AI allows the introduction of human oversight on low-confidence samples, effectively creating a hybrid system with humans in the loop for more challenging examples.
In summary (more details here), what you have to do is:
- Create a labelling workforce in Sagemaker Ground Truth (a few clicks in the console).
- Create human workflow via A2I and assign the label workforce to it (another few clicks).
- Invoke the workflow where the response is needed.
Below, you can see the code (step 3) for detecting sensitive and adult content using the Amazon Rekognition service. If your use case requires a custom task, consider this tutorial.
import boto3

# Rekognition client; the bucket, image name, flow definition ARN, and human loop name
# below are placeholders to be replaced with your own values.
client = boto3.client("rekognition")

response = client.detect_moderation_labels(
    Image={
        "S3Object": {
            "Bucket": "amzn-s3-demo-bucket",
            "Name": "image-name.png"
        }
    },
    HumanLoopConfig={
        "FlowDefinitionArn": "arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name",
        "HumanLoopName": "human-loop-name",
        "DataAttributes": {
            # Declare that the image is free of PII and adult content so the workforce may review it.
            "ContentClassifiers": ["FreeOfPersonallyIdentifiableInformation", "FreeOfAdultContent"]
        }
    }
)
Code example to invoke human in the loop job, source: AWS guide on A2I
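Once the loop has been triggered, you can check whether the human review has finished and where the results were written; below is a minimal sketch using the SageMaker A2I runtime client (the human loop name is the placeholder from the snippet above):

import boto3

# Amazon A2I runtime client for inspecting human loops started by the call above.
a2i = boto3.client("sagemaker-a2i-runtime")

loop = a2i.describe_human_loop(HumanLoopName="human-loop-name")
print(loop["HumanLoopStatus"])  # e.g., InProgress or Completed
if loop["HumanLoopStatus"] == "Completed":
    # The reviewers' answers are stored as JSON in S3.
    print(loop["HumanLoopOutput"]["OutputS3Uri"])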
Conclusions
In this blog post, we covered a short exam guide that should help you get the certificate while also presenting multiple tools that might be helpful for your ML deployment. These range from Sagemaker Canvas, a no-code solution for creating ML workflows, to AWS Clarify, a bias and interpretability tool, and finally, human-in-the-loop solutions for labelling and deploying ML models.
Reviewed by: Adam Wawrzyński