If your dream is to become a university professor and work as a scientist, then choosing to pursue a PhD is an easy decision for you to make. However, if you’re a competitive engineer coming from a STEM background with a passion for software engineering, looking to make an impact in machine learning, choosing whether or not to do a PhD is a tough career decision.

Doing a PhD in machine learning is definitely worth it if you either want to pursue a career in AI research or you think that the job prospects after getting a PhD are much better.

However, it is difficult to judge early on whether an approximately 5-year journey will be worth it, take your career to the next level, and help you make a bigger impact in the industry, especially without enough experience and context about how things work in the real world.

If you find job opportunities that allow growth and enable you to be in a better position after a few years, then you might want to think twice before pursuing a PhD. Your marketable skills and experience will be the top thing your prospective employers will look for at any stage of your career, and adding an advanced degree itself to your resume doesn’t always increase your opportunities.

If you do a PhD, you’ll be judged on the results and impact you made during that period. You don’t want your position in the job market to be the same afterward. For machine learning jobs specifically, practical industry experience is also highly valued nowadays, given the explosion of MLOps tools that are changing the way data-driven products are built.

Last year I went through the whole process, from applying to researching the pros and cons of each career path, and got to hear advice from notable people whose success I would love to match one day.

In this blog, I won’t take sides on which career path is better, since that isn’t even a valid question. The question is rather “Is doing a PhD a great career path for YOU?”. I personally chose not to do a PhD, at least for now. In this post, I will share what I learned, which will hopefully give you a better idea of what to expect and help you make your own decision.

Disclaimer #1: I’ve been fortunate to hear advice from prominent people in research and industry that I have big respect for. A lot of advice that I share in this blog about being a successful PhD student is from what I learned from researchers themselves and online content, not from my personal experience.

Disclaimer #2: It’s been pointed out to me by one of my peers that I’d be better off not talking about research metrics like h-index or conference rankings, due to the disagreement on such topics in the community and their irrelevance to what advancing science actually is. I have my own take on the whole system as well, but in this blog, I’ll just talk about the process in a hackable way for improving employability prospects.

Doing a PhD

One thing to think about is your job market potential after you finish your PhD. Remember, you will be spending a lot of your time on a very niche domain that probably won’t be important in the near future.

Tooling in machine learning is developing rapidly, so many tasks no longer require specialized training or skills. Distributed training, for example, would probably have required PhD-level expertise in the past, but now there are frameworks that automate the process and make it painless.

What is a PhD?

A PhD is fundamentally a contract between yourself and another party, like a university or research institution, in which you are provided with supervision, tools, funding, and technical assistance to conduct research (fundamental or applied) that pushes the state of the art forward in a narrow part of a field.

It’s also a way to accumulate knowledge, training, and skills to evolve as a junior researcher yourself.

A PhD is a journey, a safe haven where you can focus on a specific topic of interest for a long period of time (~5 years). It will give you independence as well as a safe environment to do so.

PhD Applications

Since a PhD is also a degree, it has its own application process and success criteria for applicants. I would say it’s hard to get into a top program if you haven’t been preparing ahead of time.

Getting admitted into a PhD program is similar to the undergrad process: you need to prepare ahead of time to get the right grades, research experience, “maybe” GRE scores, extracurricular activities, etc.

The higher you aim for, the higher the acceptance bar is going to be.

Although it might sound counterintuitive, a senior research scientist from Google pointed out to me that you can already infer with fairly high confidence your chances of getting into a PhD program at a particular school before applying.

Due to the exponential growth in deep learning, there’s fierce competition for AI research roles. The entry bar has become very high, and it’s very tough nowadays to get into PhD programs at top schools without a couple of publications at top-tier conferences or journals.

Another important factor is your network, which is probably what will get you your interviews in the first place.

With hundreds of applicants holding nearly identical resumes, if your resume doesn’t make the person reading it say “WOW I should interview this applicant!”, then a referral will help you get that interview.

What will a PhD provide you?

A PhD will give you the resources and freedom to explore what you want. A PhD is also currently a main qualifying factor for scientist roles and some software engineering roles in big tech companies (Google, Amazon, Meta, etc.).

A PhD can be useful for:

  • Inventing new things
  • Asking the right questions
  • Training in a very narrow domain (might be valuable for a company)
  • Doing cutting edge stuff

As Shreya Shankar puts it, during a PhD you can be “your own boss” in a stable career option for a few years with clear mentorship.

Also, a PhD will most importantly help you develop a research vision, a taste of what works and what doesn’t and how to explore the boundaries of science or state-of-the-art.

Things to Consider

You will probably do most of your work in the last part of your PhD; at the beginning, you’ll be investing your time in exploring what’s cool and impactful to work on.

It’s critical that you carefully choose your school, collaborators, advisor, and most importantly your area of research, because they will affect the level of success you get out of your PhD journey.

School

The school is a strong factor in your success during your PhD, with all the network and resources it can provide you with. A PhD is a graduate program, and to some degree, your school can impact you landing internships and jobs afterward.

Advisor

Choose your advisor(s) wisely. A PhD is, in my opinion, more about who your advisor is. If your advisor is very well known in their field, then the university’s name matters less. If your advisor is not a top-tier scientist in their field, then the name of the institution carries more weight.

Think about questions like: “Will my advisor promote my research in the community I am in?”, “How long have they been taking PhD students?”, etc. Sometimes, a new advisor starting to take students will work twice as hard as an established one for your success and training.

Also, you can reach out to existing or former students to ask about their experiences on a personal level. Your advisor will be more important than the university for your success.

Postdocs

Why look at postdocs’ profiles? Well, they will probably be your collaborators and mid-level supervisors. Even though most of them won’t be around for your whole PhD journey, your research will somehow overlap with theirs, and you’ll benefit from their expertise.

Lab Alumni

Look closely at lab alumni: where did they end up in their careers? Did their advisors help them get where they are?

You’ll probably cite a lot of their work as well during your PhD, and they can help you land internships or jobs along the way.

Research Topic

This one is the hardest to assess, especially as a junior trying to become a researcher.

You can start off by asking yourself the following questions:

  1. What’s the upper bound for the success of your research? What kind of impact will happen if you solve your research problems?
  2. What audience does your work target? How important is what you’ll be working on for them?
  3. Can your project scale fast to bigger applications? Think of language models, where a few incremental milestones can later be assembled to create a big network that solves bigger problems.

I think choosing a research topic deserves a whole essay itself, but these questions would be a good place to start from.

Pay

One thing that you should be aware of is that although most PhDs are fully funded, the actual salary is usually close to minimum wage and is lower than what you’d get working in the industry. I wish research pay wasn’t like that, but that’s how it is right now.

Also, internships at industry labs pay fairly well, which can be a source of extra money while finishing your PhD. But in general, I think the economic tradeoff can pay for itself in the long run if you achieve outlier success, so hang in there!

Maximize the Journey

If you’ve decided to do a PhD, then it’s important to be aware ahead of time about how to maximize the journey, since pushing the boundaries of science alone won’t be enough for you to get a job and have a successful career.

As Sam Altman’s blog about how to become successful puts it, you should always aim to have strong data points in your career that will give you better prospects in the future.

My favorite “Cracking the Coding Interview” equivalent for research is the “A Survival Guide to a PhD” blog post from Andrej Karpathy. He sort of cracks the whole process of doing a PhD and how to make the best of it.

Give Talks

One piece of advice that I got about doing a PhD is to give talks. You want to maximize your visibility during your PhD in order to improve your opportunities afterward.

Giving talks and participating in workshops will immensely help you gain visibility and open opportunities for collaborations.

Write

You also want to write papers, tutorials, and blogs as much as you can. Think of your written output as a language through which your peer researchers can discover who you are and what type of potential you have.

Your work is sort of your “way in” to research institutions and jobs. You’ll probably introduce yourself with what you’ve done during your PhD rather than the school you did your PhD at or your advisor’s name.

Your publications and contributions during your PhD are what most of your prospective employers will look at.

Intern

If you’re going for a PhD, it’s valuable to intern at top industry AI research labs like DeepMind during your placement.

You’ll get to build a strong network, collaborate with a lot of smart people, and do high-quality work due to the resources that these companies have.

At this point in time, compute power and dataset scale can make a huge difference in the quality of your results, which probably doesn’t need further elaboration in deep learning since the ImageNet era.

Network

As Richard Hamming puts it, science fiction paints this picture of scientists as introverted geniuses living alone in this world.

Reality is, it’s the complete opposite. Science is built with collaboration, and those who manage to keep in touch with the outside world and work hard will know what types of problems are important to tackle and solve.

PhD in the Industry

I am not sure if there’s an exact equivalent elsewhere, but in France there’s a type of research program called “CIFRE PhD”, which is a PhD done jointly with a professor at a university/lab and in collaboration with a company. It’s like the Industrial Fellowship Programs at Google Brain.

The project is usually funded by a company and usually pays more than normal university PhD programs. The research focus is more applied, focusing on research and development problems for a company’s product.

Pros

  • More hands-on experience
  • Get to work in the industry
  • Get to use heavy equipment that universities might not have at the same level (clusters of GPUs, for example)
  • Bigger teams with stronger engineering talent for people to support your research experiments
  • Upper bound for the scale of projects is larger

Cons

  • Big constraints on the type of research you are doing, might spend 3 years improving accuracy from 90% to 95%
  • Not recommended if you’re looking to work in academia as a professor and lab director
  • Type of research is applied, whereas a PhD in a university lab is usually more fundamental with more freedom to try out things that you want to explore
  • Constraints on what you can publish, since some algorithms/results are proprietary

Academic vs. Industry Research

In schools, the focus is usually on the fundamental side of research. The focus in industry is applied research: How can I build this technology that helps me improve my products? You’ll have more collaborators, engineers, and senior people.

At industry labs you’ll have bigger amounts of computing power and infrastructure. For example, Yann LeCun says that big companies like Meta “can build a new ImageNet size dataset in one day”!

A lot of what Meta AI does, for example, is applied research, and their research quality is no lower than that of any top school in AI. Several important papers in vision, for example, came out of Meta AI (formerly known as FAIR).

It’s also very common in machine learning in particular for researchers to spin off new frameworks or tools for AI straight out of research labs, like Ray, which came out of the RISELab at UC Berkeley.

AI Residency Programs

Getting into an AI residency program is probably as hard as getting into grad school for a PhD at a top school. The interview processes are highly selective, but if you get in, there’s a lot of flexibility and freedom in choosing which topics to work on during your residency, and you get amazing collaborators.

If you come from a non-computer science but strong research background and want to get into AI research, doing a residency is a very helpful way to dip your toes into the field.

Decision Algorithm

Thoughts

It’s hard to tell which career path is better since you can either do a PhD or not and still find a great job in the end.

I once came across this response from Andrej Karpathy for people making the hard choice, choosing between software engineering offers at FAANG vs. PhD at MIT/Stanford (Karpathy’s advice on Quora).

If you have offers and aren’t sure if you should go, you can always go to the industry, work, and then apply again if you decide to. Just be careful not to spend too much time in the industry on skills that are irrelevant for research.

Also, you don’t want to go back to school with more responsibilities. There is an actual effect of youth and freedom, so you want to maximize your throughput during that timeline in your life.

Having research experience before going to grad school is, in my opinion, very valuable. It’ll give you a taste of what research looks like, and you’ll get a better understanding of your capabilities and whether or not you’d want to do a PhD.

Look for Exponential Growth

One career hack I learned is to always seek exponential growth curves, and then try to jump on them. You never want to grow linearly at the early stages of your career, so you can assess a PhD offer from this perspective.

A PhD can be an amazing experience to grow both personally and professionally. Ask yourself: “If I went to do a PhD in CS at Stanford, would my skills and opportunities grow exponentially? What alternatives do I have?”

Sometimes the value of the program is not in the degree itself, but rather in the opportunity to change your environment and go into the heart of California.

Think in deltas: if you went into this program for ~5 years, how much of an improvement would it make in your career and job prospects afterwards? Where would you see yourself in 3-5 years?

At the end of the day, companies don’t mind hiring the most highly trained person for the same salary, but would you want to invest that much in training for a low ROI?

Will you be able to do impressive work? Will you have strong outputs during that timeline? You will be judged harshly on that. Employers will look at what types of papers you produced, where you ranked in the author list, what your contributions to those papers were, etc.

Think of a PhD as a way to show the world what you’re capable of producing when given the resources, time, and freedom to do so. It can therefore reflect negatively on your career if your output during that time is weak.

There are different routes to doing impressive work as well. A PhD is more of a protective environment to explore and think more by yourself, whereas in the industry you will be evaluated with different metrics.

Opportunities Afterwards

At the Tesla Autonomy Day in 2019, Elon Musk said to the person introducing Andrej Karpathy “there’s too many PhDs from Stanford”, shifting the focus to the actual contributions and competence that make him the right person to lead the AI team at Tesla, rather than the degrees he holds.

Although it’s arguable that his PhD opened up those opportunities, such as teaching the course at Stanford and his role at OpenAI (correlation vs. causation?), it’s always important to think about what your actual contributions and outputs are during the PhD rather than the degree itself.

A good way to think of it is: What will my opportunities after doing this PhD program look like? Will I be in a better position? Will it broaden my network? Will it open collaborations with Meta AI people? Do I really love this topic and want to spend ~5 years on it? Will my skills and knowledge built throughout this long period be useful in the market? Are the skills transferable?

For example, working on federated learning for the biomedical sector might not by itself get you a job in the industry, but you can reframe your experience to be useful for companies like Google that use such techniques on mobile apps to train language models.

Roles

The list of all roles in the industry that are available for the machine learning stack is beyond the scope of this blog, so I’m just going to talk about the 3 main types of roles, which are transferable between different titles and positions.

Software Engineer

Lots of companies hire PhD graduates for software engineering roles and will probably match them to different positions than they would regular graduates (which doesn’t always translate to being better for those who hold a PhD).

Companies like Google, for example, will let you join with a PhD (which can take 5 years) at an L4 SWE level, whereas going from L3 to L4 can take an average engineer 12-24 months.
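To make that tradeoff concrete, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above (a 5-year PhD and a 12-24 month L3 to L4 promotion) and assumes, hypothetically, that both people start at the same time; the function name is made up for illustration, and none of this is official leveling data.

```python
# Rough timeline arithmetic based only on the figures quoted in the text.
# Assumption (hypothetical): both people start at the same moment, one
# entering a PhD and the other joining as an L3 engineer.

PHD_YEARS = 5                 # "a PhD (which can take 5 years)"
L3_TO_L4_MONTHS = (12, 24)    # "12-24 months" for an average engineer


def years_at_l4_when_phd_joins(phd_years, promo_months):
    """Years the industry engineer has already spent at L4
    by the time the PhD graduate joins at L4."""
    return phd_years - promo_months / 12


fast = years_at_l4_when_phd_joins(PHD_YEARS, L3_TO_L4_MONTHS[0])
slow = years_at_l4_when_phd_joins(PHD_YEARS, L3_TO_L4_MONTHS[1])
print(f"Head start at L4: {slow:.0f} to {fast:.0f} years")
```

Under these assumptions, the engineer who skipped the PhD has a head start of roughly 3 to 4 years at L4, which is only a statement about level, not about the kind of work or opportunities each path opens.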

Machine Learning Engineer

Although it’s getting popular for employers to ask for a PhD as a requirement, it’s not actually a hard requirement, and you can get the same job without one.

A machine learning engineer (MLE) is fundamentally an engineer with a machine learning background. As Chip Huyen said:

A machine learning engineer operates in between a scientist and a software engineer. You’ll be a builder, but since deep learning is highly empirical-driven, you’ll need to know how to set the right experiments to get the results you want and therefore wear both scientist and engineering hats.

SWE < MLE < Scientist

If you want to learn more about machine learning-driven products with an end-to-end pipeline, watch Chip Huyen’s course and its project demos to get a peek at what building machine learning products looks like (CS 329S Demo Day 2022 - Stanford).

Research Scientist

If you’re into research that much and planning to work as a researcher in the AI industry then you’re probably looking to work at FAANG AI (or equivalent) industry labs.

Another option for working at these types of institutions is to work as a research engineer (where you don’t need a PhD). For example, take a look at Aleksa Gordic’s article on how he landed a job at DeepMind as a research engineer without a PhD or a machine learning degree.

Also, lots of these AI institutions need hard-core engineering talent to build the tools and infrastructure that enable them to build such high-performing systems.

Starting a Company

A common form of this question is “If my dream is to launch a startup, should I do a PhD? Is it useful?” What about the founders of Google, Sergey Brin and Larry Page? They were both PhD students at Stanford working on web search when they founded a search engine company.

I love the Paul Graham approach for this: “Do Things that Don’t Scale”. Rather than starting with the question of “Should I do a PhD to start a company?”, start with “What are the most impactful problems that I want to work on?”.

Start with that question in mind first, and then optimize for making an impact. If doing a PhD will help you towards your goal, then that’s great! But don’t do a PhD for the sake of starting a company one day.

So, to think from the Google founders’ perspective, the focus was on searching efficiently and developing an algorithm to do that. The company was then a product of their work.

AI Tools and Infrastructure

I feel that my experience resonates really well with what Chip Huyen describes here about how most of her friends were passionate about ML research, but she wanted to make an impact outside of research in AI.

Currently, there has been a shift in the mentality and approach to developing AI technology. A lot of AI industry leaders, like Andrew Ng for example, have stated clearly that we should start to adopt the “Data-Centric” approach to AI development, improving the performance of deep learning models by improving the data instead of changing the models themselves.

Also, Sam Altman, the CEO of OpenAI, keeps stressing on Twitter the need for strong engineers to build AI systems.

Therefore, the AI industry is in immense need of hard-core engineers that can build tools, infrastructure, and AI-powered products that actually use machine learning in the real world.

These are rate-limiting factors for the development of machine learning products, and they are overlooked by lots of peers or disregarded as unimportant.

Conclusion

It’s probably difficult to make this decision, but 10 years from now you will be in a good place in your career regardless of what you decide now, so be easy on yourself!

A good way to start deciding is to look at the offers and opportunities you actually have, before ruling out or committing to options that you might not get a chance to try.

Although I personally decided not to pursue a PhD for now, I’m really grateful that I started my career at a fast-growing startup that lets me experience what building a company from scratch and making an impact looks like.

It’s hard to conclude which career path is optimal since there’s a huge scope of decision variables and constraints among different people. But I would summarize with the following:

  1. You can get your dream job with or without a PhD.
  2. You can go to the industry for a few years and then choose whether or not you want to go back into academia (Shreya Shankar, for example, is doing a PhD in databases at the RISELab, where the founders of Ray started).
  3. Doing a PhD might open an opportunity for business like what happened with William Falcon (founder of grid.ai).
  4. Plan, but don’t over-plan.
  5. You don’t need to have a PhD or work in research to do useful stuff. Chip Huyen is one of my favourite people out in the real-world AI space, and doesn’t have a PhD.