There has never been a more favorable environment for a data science career than there is today. If anything, the stock of data science professionals is at its highest. Few fields can match the prestige and viability of a successful data science career. And if worldwide trends and our dependence on ever-advancing technology are any barometer, it is safe to say that data science expertise (big data, artificial intelligence, machine learning, and so on) is not only here to stay but is going to grow further (Olhede and Wolfe, 2018).
Andrew Flowers, a well-known economist and author, said, “More employers than ever are looking to hire data scientists,” while Forrester Research analyst Brandon Purcell said that “the demand for data scientists will only grow, as organizations are increasingly relying on data-driven insights.” There are many similar endorsements from technical and non-technical experts alike, enough to safely conclude that data science is here to stay and will shape our future (Donoho, 2017).
As we step into 2020, it is worth looking at the career graph in LinkedIn’s 2019 report on the most promising jobs for reassurance about the possibilities that data science is set to create. Because of its multidisciplinary nature, the field has more openings than there are qualified data professionals, and the current gap between the demand for and supply of skilled data scientists bears this prediction out. Data science expertise plays a significant role in career development regardless of the field or nature of a specific job. Whether in business marketing, city planning, healthcare, or security, data-driven analytics and logistics are now the backbone of nearly every profession (Costa and Santos, 2017).
But despite these positive outcomes and trends, a contradictory debate is gaining momentum: data scientists are said to be leaving their jobs or changing companies at an astounding pace. Why is that? Is it even true? Is something being kept from us? These and many more questions have been surfacing lately. In this article, we address these concerns based on the personal experience of data professionals themselves.
Let us begin with the Financial Times report that set the ripple effect in motion. According to the Financial Times’s recent investigation, “Data scientists are spending an average of 2 hours a week looking for a new job.” It also reported that “Machine learning specialists topped its list of developers who said they were looking for a new job, at 14.3 percent. Data scientists were a close second, at 13.2 percent.” The report is based on data gathered by the Stack Overflow Developer Survey, in which nearly 90,000 developers from around the world took part.
The credibility of this report and the responses of nearly 90,000 developers should be enough to establish that there is, in fact, a problem. Even if data scientists are not outright quitting the profession, they are at least looking to switch jobs. That brings us to the crucial question: why exactly are so many data scientists dissatisfied with their seemingly dream jobs?
Jonny Brooks-Bartlett, a data scientist at Deliveroo, pointed out four factors based on his own experience and perception. He believes that if we set our minds to eliminating these causes, we can retain the stimulation and reward that come with being an accomplished data scientist. To that end, let’s go over the big reasons that data scientists leave their jobs.
Conflict Between Expectation & Reality
The gap between expectations and reality is huge. Many junior data scientists sign up for professional commitments without knowing the ground realities of their work. There are several reasons for these false expectations, and listing them all is simply not possible. A common assumption among aspiring data scientists is that they will be expected to develop out-of-this-world machine learning algorithms or solve complex problems that lead to life-changing decisions.
In some cases, aspiring data scientists are self-educated through specialized data science books and online courses, which impart adequate knowledge only until the individual is exposed to real-world datasets. Working on real-world projects is quite different from winning online data science competitions or doing practice runs. Many new data scientists turn out to be uninformed about basics like:
- Machine learning pipeline functionalities
- Pertinent adjacent fields, e.g., software engineering
- Model deployment
- Data cleaning significance
Not just that, newcomers often get so consumed by high-end machine learning tools and frameworks that they do not even register that they are falling short of the company’s expectations.
In other cases, companies that hire data scientists do not attach much value to building the infrastructure needed to get the most out of AI. This becomes especially problematic when there are no senior data experts in place. Hiring a team that mixes experienced and junior data scientists is the right approach, yet companies these days overlook these critical steps and end up with an unhappy working relationship for all involved.
New data scientists join the workforce with the idea of just writing cutting-edge machine learning algorithms to extract actionable insights. This becomes a faraway dream in the absence of a workable data infrastructure, and they end up producing analytic reports or setting up basic structure instead. Data scientists work to a certain methodology, and it is hardly worthwhile for a company that only cares about a presentable chart in a high-level meeting.
This attitude leads to frustration and disappointment on both sides: data experts, because they have sub-standard mechanisms with which to do any real good, and companies, because value is not delivered swiftly. All of it leaves data scientists dissatisfied with their role.
The most challenging aspect of data science is politics. Like anywhere else, sycophancy toward the movers and shakers of the establishment is rampant in data divisions. Many aspiring data scientists believe that mastering the toughest algorithms, such as Support Vector Machines or DeLorean, is a sure way to make themselves indispensable. The ground reality is the opposite: you must strive to remain in the good books of those with the most clout. That means doing ad hoc tasks, such as diligently extracting information from databases for the right people at the right time, or anything else that helps maintain a good perception.
It is simply not what an aspiring data scientist has in mind before starting professional practice, which takes us back to the mismatch between expectation and reality. The frustration, the sense of ineptness, and the pressure to keep up the boot-licking all add up to heavy dissatisfaction.
Many traditional companies have that one person in the most influential position who does not know much about anything. This is the kind of person we discussed in the second point above. These individuals seem incapable of thinking about or absorbing anything beyond their own dealings. You often find yourself actively working for their good word while knowing that they have no idea what your job or potential is. They do not understand what a data scientist is or what responsibilities the position entails. This means that you will have to be their analytics expert, database expert, reporter, and more, all at once.
These unrealistic expectations of a data scientist’s skill set are not limited to the company’s non-technical administrators. Almost everyone that data scientists work with assumes they know everything to do with data and machine learning. The expectation often extends into other professional disciplines, such as computer programming and statistics, where you are also supposed to know everything there is to know. The assumptions lead to more assumptions: if you know a certain thing, then you have access to its data, and if you have access, then you know all the answers to all the problems in the world (Moulton, 1984).
What makes it worse is your own reluctance to push back on these unreasonable assumptions. As a junior data scientist who is only just starting out, you will be convinced that the people who matter will think less of you if you do. This inner conflict, coupled with unreasonable work demands, aggravates the situation like nothing else. The best way to counter this particular problem is to stay away from jobs with a description like: “We are looking for data scientist who knows his/her way around Spark, Hadoop, Hive, Pig, SQL, Neo4J, MySQL, Python, R, Scala, A/B Testing, NLP, anything machine learning and anything data related.” It screams of a company with no idea about data scientists’ capabilities or about its own data strategy. Such companies will expect anyone with data in their resume to fix all the data problems they are facing.
Lack of Growth for Data Scientists
Stagnancy is a curse for growth. You cannot expect to get by with the same skill set when times are changing at lightning speed. Data professionals in particular love new challenges, which is just as well, because if there is one field that is rife with challenges today, it is data science. The Natural Language Processing (NLP) domain is the best example of the rapid advancement that data professionals have to keep up with.
Data scientists like few things better than experimenting with new technologies, techniques, and frameworks. Building and rebuilding the same logistic regression model feels like hitting a brick wall after a while, and many data professionals lose motivation because their work offers no new challenges.
It becomes an even greater source of dissatisfaction for data scientists employed in blue-chip companies, whose room for flexibility is as slim as their size is grand. This is one of the main reasons that data experts working in start-ups or medium-sized companies appear more contented with their jobs. The following are three reasons that lead data scientists to leave their high-profile jobs:
- First is the lack of proper infrastructure in terms of computing systems and access to advanced tools that enhance a data scientist’s role.
- The second reason is the limited scope of a company. In such cases, it gets really difficult for data scientists to infer insights from narrow operational capacity data.
- The third, and still the main, reason is the absence of research & development (R&D) facilities. Data scientists perform their best by exploring the field beyond the scope and vision of the company.
Despite being a dynamic field, data science is currently hanging onto some rules, processes, and practices that seem carved in stone and that demotivate highly skilled data professionals. This also makes it difficult for companies to retain this sought-after talent in their teams. However, the aim of this article is not to discourage aspiring data scientists; if anything, apart from these few negative aspects, the job is among the most satisfying and rewarding there is. We believe that, for the sake of both parties, robust systems and a fulfilling working environment should be in place to ensure job retention for data scientists.
- Olhede, S.C., & Wolfe, P.J. (2018, May). The Future of Statistics and Data Science. Statistics & Probability Letters, 136, 46-50. https://doi.org/10.1016/j.spl.2018.02.042
- Costa, C., & Santos, M.Y. (2017, December). The Data Scientist Profile and its Representativeness in the European e-Competence Framework and the Skills Framework for the Information Age. International Journal of Information Management, 37(6), 726-734. https://doi.org/10.1016/j.ijinfomgt.2017.07.010
- Donoho, D. (2017, August 1). 50 Years of Data Science. Journal of Computational and Graphical Statistics, 26(4), 745-766. https://doi.org/10.1080/10618600.2017.1384734
- Moulton, R. (1984, February). Data Security is a Management Responsibility. Computers & Security, 3(1), 3-7. https://doi.org/10.1016/0167-4048(84)90020-8