There are surely more articles on this topic than there are letters in the alphabet, so one might reasonably ask: why write this essay at all?

The thing is, the field is evolving so fast that each year1 even the most recent courses become outdated. For software engineers, the best way to think about AI is as software we don’t yet fully understand2.

When you abstract AI as software, you can find intuitive ways to understand it and build a lot of value with it. At the end of the day, deep learning is just optimization and matrix multiplication expressed in code.
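To make that claim concrete, here is a minimal sketch (my own illustration, not from any particular course): a one-layer linear "network" is literally a matrix multiply, and learning is literally gradient descent on a loss. All names and numbers below are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # inputs
true_W = np.array([[2.0], [-1.0], [0.5]])
y = X @ true_W                           # targets from a known linear map

W = np.zeros((3, 1))                     # parameters to learn
lr = 0.1
for _ in range(200):
    pred = X @ W                         # forward pass: a matrix multiplication
    grad = X.T @ (pred - y) / len(X)     # gradient of the squared-error loss
    W -= lr * grad                       # optimization: one gradient-descent step

print(np.round(W.ravel(), 2))            # recovers roughly [ 2.  -1.   0.5]
```

Everything in modern deep learning is an elaboration of this loop: more layers, nonlinearities, and automatic differentiation, but still optimization over matrix multiplications.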

So how does one master it? There is no “right path”. The way to do it is to put in the effort over a long period of time until things start to make sense. There’s a huge compounding effect to practicing engineering that most people don’t talk about, and it’s especially valuable in deep learning because of the complexity of its software.

There is no secret roadmap you need to follow. Instead, you need to put in the 10,000+3 hours of work: explore ideas, see what works and what doesn’t, take paths that don’t yield fruitful results, and iterate.

What’s worked for me is iterative, intense sprints of going over what the field has produced: following companies’ blogs to see how they’re using AI, reading research papers and high-quality blogs, getting familiar with the limitations of current systems, chatting with smart AI researchers and exchanging ideas, learning to understand papers on my own and identify what’s actually4 pushing the field forward, implementing research papers, and building things from scratch.

An excellent starting point for learning about neural networks is Andrej Karpathy’s course5 “Neural Networks: Zero to Hero”. Micrograd is very well written, simple, and helps build the right framework for thinking about deep learning code.
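The core idea behind micrograd can be sketched in a few dozen lines. This is a toy scalar autograd engine in the spirit of micrograd, not Karpathy’s actual code: each `Value` remembers how it was produced, so gradients can flow backward through the graph via the chain rule.

```python
class Value:
    """A scalar that tracks its computation graph for backpropagation."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad          # d(a+b)/da = 1
            other.grad += out.grad         # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a = Value(2.0)
b = Value(-3.0)
c = a * b + a          # c = a*b + a
c.backward()
print(a.grad, b.grad)  # dc/da = b + 1 = -2.0, dc/db = a = 2.0
```

Once this clicks, frameworks like PyTorch stop being magic: they do the same bookkeeping, just over tensors instead of scalars.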

When I was in high school, I was a physics geek. I was lucky to have a strong relationship with a very smart physics professor whom I would ask for advice about what to read. He taught me something that stuck with me. He used to say: “Listen, there are many books about physics, but usually there’s one book, from a scientist who really knows what’s happening, that all the other books are just copying from while adding different diagrams”.

A lot of content online is just copied and pasted from elsewhere, so there is no need to go over ALL the resources6. Instead, I learned that some people know more about what they’re talking about than others. Try to find those people, and learn how they actually think about neural networks.

Good luck!




Notes

  1. Since the release of ChatGPT, it’s safe to say each month.

  2. Sam Altman: “AI is how we describe software that we don’t quite know how to build yet, particularly software we are either very excited about or very nervous about”. Source: https://twitter.com/sama/status/1663983174030901249 

  3. Malcolm Gladwell is the one who popularized this idea. 

  4. “So I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’ And I did. I plowed through all those things and it all started sorting out in my head.” Source: Dallas Innovates 

  5. Karpathy.ai 

  6. Compiled list of blogs on ML Fundamentals, and ML Research