The LLM hype and the job of the data scientist

I’ll start by stating that I am writing this on October 31st, 2023. If you are reading this post even one week from now, knowledge, trends and technology in AI might have changed so much that it seems to have been written in the Stone Age.

by Christian Hamböck

I recently came across a Reddit discussion about how the LLM hype destroys data science, and it resonated strongly with what we are experiencing in hiring here at viesure. For one of our projects, we want to use a Large Language Model (LLM, the technology behind ChatGPT or Bard) to search and summarize information from company-internal documents. When we thought about hiring an expert in the field, we were aware that such experts are a rare species on the job market, since LLMs became popular only a few months ago. So we broadened our job ad to cover a wider field, hoping to attract generalists who are interested in becoming specialists. That was the point when we realized that we didn’t know what the “general field” of an LLM specialist is. Spoiler alert: it is not (necessarily) data science.

My first intuition was a machine learning engineer, someone who has experience with deep learning, builds models from scratch, and understands the mathematics inside the black box of an LLM. However, it soon became clear that we would not build or even train an LLM ourselves, since this is prohibitively expensive. Instead, this is done by the large cloud service providers, who let you access the trained model via APIs, and, if you prefer, even inside your own secure environment (to avoid any potential “leakage” of your private data).

A “classic” data science profile (like mine) would not be the right fit either. I am used to following a process called CRISP-DM (Cross-Industry Standard Process for Data Mining), in which I try to solve business stakeholders’ problems by working with data and statistical programming. The results are often reports, visualizations and predictions with various degrees of automation. Again, the “statistical programming” part of an LLM is already done.

We saw that we don’t need someone who knows how to build an LLM, but rather someone who knows how to use it. “Use” means knowing how to feed it the right data, how to write instructions (“prompts”) that get it to produce outputs that are faithful to this data, and how to integrate it into applications for end users. Does this mean the “general field” we are looking for is classic backend engineering, and LLMs are just a new tool in the toolbox? Some of our engineers dedicated themselves to the task and got familiar with LangChain, a Python framework which lets you connect and configure all the relevant parts you need for an application around a language model.
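To make the three parts of “use” a bit more concrete, here is a toy sketch in plain Python. It is not LangChain and not what our engineers built; it only illustrates the idea of picking the right data and wrapping it in a prompt. The document names and texts are invented, the simple word-overlap ranking is an assumption standing in for the more sophisticated search a real application would likely use, and the final step of sending the prompt to a model is left out because it depends on the provider.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question: str, document: str) -> int:
    """Toy relevance score: number of distinct question words found in the document."""
    doc_tokens = set(tokenize(document))
    return sum(1 for token in set(tokenize(question)) if token in doc_tokens)


def retrieve(question: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the top_k documents with the highest overlap score."""
    ranked = sorted(
        documents,
        key=lambda name: score(question, documents[name]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: dict[str, str], top_k: int = 2) -> str:
    """Combine instructions, the retrieved documents and the question into one prompt."""
    names = retrieve(question, documents, top_k)
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical internal documents, invented for illustration only.
DOCS = {
    "travel_policy": "Employees book business travel through the internal travel portal.",
    "vacation_policy": "Vacation requests must be approved by the team lead in advance.",
    "it_security": "Passwords must be rotated regularly and never shared.",
}

prompt = build_prompt("How do I book business travel?", DOCS, top_k=1)
print(prompt)
```

The resulting string is what would be sent to the model through the provider’s API. Frameworks such as LangChain aim to package steps like these (loading documents, retrieving the relevant ones, templating the prompt, calling the model) into configurable building blocks.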

So far, they describe their experience as a rollercoaster ride. The framework and its dependencies are updated almost daily. Every few weeks, new evidence points to critical issues in one part or another of the current conceptual approach, or even suggests that it should be replaced with something entirely different. Additionally, vendors come up with their own proprietary solutions, promising better results than LangChain.

The field is composed of many different subdomains, including Machine Learning Engineering, Data Science, Data Engineering, Natural Language Processing and Backend Engineering. Many of them might not be relevant at a given point in time, but could become crucial overnight when a new approach or trend appears. This means that a real specialist in the emerging field of LLM application building would have to be a jack of all trades.

Should you specialize in this field? As the Reddit discussion highlights, data scientists are divided on whether it’s a good idea to become an LLM specialist and, if so, whether they are in the best position to do so. Because they have a deeper sense of the technology’s limitations and potential pitfalls, they also often find themselves managing excessive expectations. This raises the question of what happens when the initial hype ends. My recommendation: don’t forget your skills in your original field! 😉