Getting Started with Machine Learning

December 16, 2024 | 10 min read

Disclaimer: I am not an expert; I'm a learner. I've outlined what has worked for me, and although I have tried to make it more general, this list is still subjective to some extent.

This list assumes the following:

  • that you have no prior background when it comes to machine learning
  • that you have completed high school math (maybe you've forgotten, but that's okay)
  • that you don't know much about Python but aren't too unfamiliar with programming languages. If you know some Python, awesome. If you're an expert, even better. If you know another programming language, that'll help a lot! You more or less know where you fit here so move forward accordingly; let's assume that you are a beginner in Python (or programming in general).

There's no particular order to this, and the resources I suggest could be subjective. As a learner, you're free to roam around, but having some basic foundation when it comes to math & programming (Python) is key.

I'm going to paste some materials below. These aren't necessarily in order.

My suggestion: complete [1], get a basic understanding of Python (and some PyTorch too, if you have time) from [2] (you do not need to master Python or PyTorch; learn as you go!), complete the 3B1B playlist from [3.1.1], making sure you come away with a basic high-level understanding of the concepts, and then start [3.2]. The rest can be done concurrently with something else. If you skip some parts, make sure you come back and cover them later. By that I mean the topic, not necessarily the specific course!

Note

If you find that something below isn't good enough or interactive enough for you, it's okay to use another resource to cover that topic. It's all subjective, anyway. You probably know what works best for you. What's important is that you cover that topic in some way.

Okay, here's the list.

1. Math

Again, this could be subjective, but it's important to get the absolute foundations down. So, I suggest:

1.1 Math for ML Specialization

This specialization covers basic linear algebra, calculus, and statistics required for ML. You don’t need to pay; just audit the courses.

This series only offers a basic understanding, but I consider that important: having some mathematical intuition is essential, even if it’s just a basic one.

Here's the specialization link.

1.2 Essence of Linear Algebra by 3Blue1Brown

I highly recommend this. The material is in YouTube video format as well as reading format. You can watch/do this concurrently with the linear algebra course above.

Here's the playlist link.

1.3 Differential Equations by 3Blue1Brown

I highly recommend this playlist as well. It's available as a YouTube playlist and in text format. You can watch/do this concurrently with the calculus course above.

Here's the playlist link.

1.4 Essence of Calculus by 3Blue1Brown

This is another playlist I highly recommend. It's available as a YouTube playlist and in text format. You can watch/do this concurrently with the calculus course above.

Here's the playlist link.

Disclaimer Regarding the 3B1B Courses

These courses are for getting a high-level preview of the concepts (a mental model of sorts) and shouldn't be relied upon as your only way of learning the material. For that, you need paper, a pen, and some proper math exercises to solve.

This is where my next suggestion comes in.

1.5 Mathematics for Machine Learning, Math Academy

This is not for someone who wants to get the basics done and jump into machine learning relatively quickly. It's a paid course ($49/month), but if you are willing to learn serious math for machine learning (which I highly recommend if you can afford the monthly subscription), this is the best place I've found for ML math (or any sort of math, for that matter). This website has been my biggest revelation of 2024, and it ensures you actually learn. The scaffolding it applies is something I have never seen anywhere else, and the spaced repetition, alongside the site's AI that closely monitors your progress and always tracks your strengths and weaknesses, is something I regard as one of the best things I've ever seen. In my experience this is the best way to learn math, and I highly recommend it. If you're willing to take this route, you do not need to take any math-for-ML courses anywhere else (except for maybe the 3B1B courses to build some mental models).

You can take a look at how this site works here. I am not affiliated with this site by any means; I just think it's that good.

Here's the course link.

2. Programming

You probably already know where to look to learn this, but allow me to paste some materials below anyway so you can see whether they're helpful to you.

2.1 Practical Python by Dabeaz

I really like this course. It's got roughly 40 hours of intense work that's done mostly in the terminal (which I believe is a great thing to get used to), focusing mainly on script writing, basic data manipulation, and program organization. It's an awesome way to get started with serious Python.

Here's the course link.

2.2 PyTorch - Zero to Mastery by Daniel Bourke

PyTorch is arguably the most popular library for deep learning, and it's very widely used in research papers. It's a very powerful library that allows us to do a lot of things, such as manipulating multi-dimensional data, implementing and training neural networks, and more.

Now, let us swing back to the topic of this PyTorch book. I absolutely adore this series. It's primarily an online book, with a 24-hour YouTube video available (covering the first 5 chapters out of 10) if you're more of a video person (I personally think just referring to the book is fine, but it's up to you). It takes us from the absolute basics of PyTorch and many machine learning concepts, in a hands-on, code-first way, all the way to implementing a research paper. It's a great way to get started with PyTorch.

Here's the course link and the YouTube video link.

2.3 Official NumPy Docs

NumPy is a very important library for manipulating data (mostly in the form of arrays when it comes to machine learning). Alongside PyTorch, this is another library that we must know about. PyTorch and NumPy may seem similar, but the main difference lies in where the computation can run: PyTorch tensors (think of them as n-dimensional arrays) are similar to NumPy arrays, but can also be operated on by a GPU. NumPy arrays are mainly used in classical machine learning algorithms (such as k-means or decision trees in scikit-learn), whereas PyTorch tensors are mainly used in deep learning, which requires heavy matrix computation. You can take a look at this document for a quick comparison.
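To make the difference described above a bit more concrete, here's a minimal sketch that runs the same matrix multiplication with a NumPy array and with a PyTorch tensor. The variable names are made up for illustration, and it assumes NumPy is installed; the PyTorch half is skipped if `torch` isn't available, and it only uses a GPU if PyTorch can see one.

```python
import numpy as np

try:
    import torch  # optional here: the PyTorch half is skipped if it is missing
except ImportError:
    torch = None

# A 2x2 NumPy array; NumPy arrays live in regular CPU memory.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
a_product = a @ a  # matrix multiplication on the CPU

if torch is not None:
    # torch.from_numpy wraps the array's data as a tensor (same values, same dtype).
    t = torch.from_numpy(a)
    # Use the GPU when PyTorch can see one, otherwise stay on the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    t = t.to(device)
    t_product = t @ t  # the same operation, now running on `device`
    # Move the result back to the CPU before converting it to a NumPy array.
    same = np.allclose(a_product, t_product.cpu().numpy())
    print(f"device={device}, results match: {same}")
```

The code looks almost identical on both sides, which is the point: the tensor version just has the extra option of moving the data to a GPU with `.to(device)`.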

I recommend completing the absolute basics and the fundamentals sections from the official NumPy documentation to get started.

2.4 Official PyTorch Docs

This is the official PyTorch documentation. I highly recommend first going through the PyTorch book by Daniel Bourke from above, and then using this documentation as a reference. We do not necessarily need to learn or know everything in it; we just need to know where to look when we need to.

Here's the official documentation link.

2.5 Advanced Python Mastery by Dabeaz

This course can be thought of as a more advanced version of the Practical Python course. It is not meant for beginners, but for those who have a good grasp of Python and want to learn more advanced topics; therefore, if you want to skip this one, it is completely fine. That being said, if you want to move beyond writing scripts to writing more sophisticated programs and build a more complete mental model of the Python language itself and how it works, this is the course for you.

Here's the course link.

3. The "ML" Stuff

This section is where I'm going to list out some of the resources I've found useful for getting started with machine learning. People have their own views on how to start this as well, so I’ll list out the ones that worked (and are working) pretty well for me.

3.1 The "Theory" Stuff

3.1.1 The Neural Network Playlist by 3Blue1Brown

I consider these videos incredibly intuitive and highly recommend watching this before getting into the crux of machine learning to get a high-level understanding of the concepts. Text format available too!

Here's the site link and the YouTube playlist link.

3.1.2 Machine Learning Specialization by Andrew Ng

Andrew Ng is one of the most well-regarded teachers and pioneers in the field of machine learning, and this recently revised specialization, with its 3 courses, is a great way to get started with the basics of the theory side of machine learning. This specialization also has labs, which, if you have full access to the course, can be done in Python. It does make use of TensorFlow, but you can easily do the labs on your own in PyTorch (or even JAX!) if you want.

Here's the course link.

3.1.3 ML Playlist by StatQuest

I like to think of Josh Starmer's channel as my go-to for learning anything machine learning or statistics related whenever I run into a concept that I find difficult to understand. I find his videos engaging and intuitive, and I use his StatQuest videos on ML for reference (just like the PyTorch docs). I recommend you do this as well if you feel like it :)

Here's the YouTube playlist link.

3.2 The Hands-On Approach

As the heading suggests, this section is about the hands-on approach to machine learning. This is where we get our hands dirty and start building things, and personally, I think this is the only way to actually learn to do machine learning. Everything so far has been to lead us to this point, and I think this is the quintessential part of the entire process.

3.2.1 Machine Learning with PyTorch and Scikit-Learn by Raschka, Liu, and Mirjalili

The topics this book covers range from traditional machine learning (the fundamental concepts, including data preprocessing, model evaluation, hyperparameter tuning, etc.), mainly using scikit-learn, to deep learning (such as CNNs, RNNs, Transformers, parallelization, GNNs, Reinforcement Learning, etc.) using PyTorch. It's a ridiculously packed book, and I highly recommend it.

If you are interested in knowing more about the topics covered in this book, you can take a look at the book link, its detailed table of contents, and the blog by the author explaining the book content.

3.2.2 Neural Networks: Zero to Hero Series by Andrej Karpathy

Andrej Karpathy is one of the most, if not the most, well-regarded instructors in the field of deep learning, and it's incredible that a series like this, with its level of quality and detail, is available for free. He takes a very hands-on approach to teaching, starting from the absolute basics (Python classes, derivatives, gradient descent, micrograd) and then moving from a bigram counting model to makemore, nanoGPT, tokenizers, and more.

As of December 2024, the series is up to 8 parts, and I highly recommend you go through each and every one of them and replicate what he teaches. Andrej is an incredible teacher; all the concepts, regardless of their difficulty, that he illustrates in his videos are extremely well-explained, and the way he builds up everything is something I think is second to none.

Here's the link to the series.
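To give a rough taste of two of the early topics named above (derivatives and gradient descent), here's a tiny sketch in plain Python. It is not code from the series; the function `f` and the helper names `numerical_derivative` and `gradient_descent` are made up for illustration, and the learning rate and step count are arbitrary choices.

```python
def f(x):
    """A simple bowl-shaped function whose minimum is at x = 3."""
    return (x - 3.0) ** 2


def numerical_derivative(func, x, h=1e-5):
    """Estimate the slope of func at x with a central difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


def gradient_descent(func, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope to move toward a minimum."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * numerical_derivative(func, x)
    return x


x_min = gradient_descent(f, x0=0.0)
print(x_min)  # a value very close to 3
```

Training a neural network follows the same loop in spirit: measure how the loss changes with respect to each parameter, then nudge the parameters in the direction that lowers it.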

4. Some More Resources

Do some small-scale side projects too! Kaggle is nice. Also, try to consume as much as you can from people like Sebastian Raschka in addition to Andrej Karpathy.

You can try reading some ML-related books and some papers as well. Here are some books I was told are good (I have not yet read them all, so please explore them at your own discretion). Links are provided for each book.

For papers, it's quite open but you can take a look at the links below for a kickstart:

I also really like this site that lists out tons of ML resources created by learners and experts alike.

Lastly, if you want to be a part of the community of driven learners, you can take a look at all the people I follow on Twitter. These are some of the most active and amazing people I've met online, and I've learned a lot from them.

Good luck!