Data is not a "profession" (plus a series of 'useful' analogies)
Am I right or am I wrong? And if i'm right and Data is not a profession, what needs to change to make me wrong?
“Context” of this post
I often find writing helps me coalesce and refine my thoughts, especially when I find myself saying the same things over and over.
So this article is a brain dump / train of thought article on the lack of “profession” in the data profession, and the analogies I repeatedly use to describe that problem.
Is Data a “Profession”?
This week seems to be a week of “thinking”, triggered by posts on the various active data communities I engage with and conversations with awesome data people I know.
This thinking was triggered by my response to a post from Christine Carragee over on the Technical Freelancer Academy (TFA) community run by SeattleDataGuy:
https://the-technical-freelancer-academy.circle.so/c/what-did-you-accomplish-this-week/translating-from-english-into-english-conference-session-advice-request
(if you're not a member, join now)
Christine and I had a great conversation on the TFA around this (which of course inspired me to write the rest of this post).
I replied to Christine's post with:
I take a different view on all this.
My plumber doesn’t try to educate me on the best way of connecting the pipes when something breaks, nor do they try to get me to understand the size of the pipe or the tools they use to connect the pipes.
They are professionals, they solve my problem, I trust them to solve my problem and I pay them for that value.
When I am coaching data and analytics teams I tell them the same thing: stop asking for permission on how to do your job, the tools you should use, the way you should work. You are the professionals.
Instead focus on understanding the organisational problems that can be solved using data and information and then help solve them.
Christine replied with this:
I think this analogy breaks down because you don't hand your plumber a middle school science experiment and ask them to enable it in your house -- you don't buy parts and ask a plumber to install them and there are building inspectors and standard codes they have to follow whereas there aren't those same standards in many software contexts.
Your water heater fails, you don't ask the plumber to install one that could allow your entire extended family to take hot showers and run the dishwasher and laundry all at once, but we get asked all the time to build systems not for the current needs but some unmeasurable future scale/demand without expectation of a current tradeoff.
It's a great response to help me refocus my thoughts on how to enable effective collaboration — translating data into physical analogies is super useful.
And that made me think in this way:
I think the analogy is perfect for those very reasons you outline.
How can we call ourselves engineers, when there are no standards for our engineering?
My plumber charges me a lot more if I try to fix the pipes myself and make it harder for them to do the job properly, because it takes them more time. Why do we let people hand us a pile of badly formed code and ask us to make it work?
My plumber won’t let me buy the wrong equipment and force them to use it, so why do we?
I think you have triggered today's LinkedIn post.
It's not quite “today”, but here is the post.
The use of Analogies
I loved this comment from Christine:
translating data into physical analogies is super useful.
Then Christine sent me a link to this LinkedIn post:
https://www.linkedin.com/posts/samuelivanecky_data-ai-analytics-activity-7340747839578288129-cvPm
A Chef, a Plumber and a Surgeon join a Data Team ….
Which made me think: I often use the analogies of a Chef, a Plumber, a Surgeon and an Accountant to describe the lack of profession in most data professionals, and based on Sam’s post I am not alone.
Chefs
I use the analogy of a Chef in a kitchen to explain the balance between the art of expertise and the efficiency of experience a Chef brings to the work they do, coupled with the intense focus on quality.
I often liken data modeling to the combination of art and science that a Chef displays.
I often highlight the fact that a Chef would not let rotten tomatoes get into the meals they create, and ask why data professionals allow bad quality data to be included in the data work we deliver, then make it the information consumer's job to work out what tasted bad and how to fix it.
I have written about the rotten tomato problem before:
Why do we delegate responsibility for some of the most important data work to people outside our data teams and our data profession?
Plumbers
I use the analogy of a Plumber and their Ways of Working to ask why data professionals ask unqualified people who have never worked in their domain (aka stakeholders) what tools they should use or how they should work, whereas qualified Plumbers know what tools to use for each job and how to do their job.
Plumbers also have standards they must follow, and people who both set those standards and inspect that they have been met.
Christine makes this point with this part of her comment:
you don't buy parts and ask a plumber to install them and there are building inspectors and standard codes they have to follow
Why do we not have standards we need to learn and follow as data “professionals”?
Surgeons / Accountants
I use the analogy of Surgeons and Accountants and their professional bodies that make sure that the Surgeons and Accountants are trained to a code, a code that has been developed and tested over centuries, and then held to account when they break that code.
I highlight that a Surgeon and the surgical team will always check to make sure all the gauze and clamps are removed before sewing the patient up, not leave it to the patient to discover it themselves.
The Surgeon works to understand the problem they need to solve and then uses their expertise and experience to solve that problem.
Why do we not have proven data ways of working, a data code, that we are held to account against if we don’t follow it?
Why do we deliver data that might or might not be fit for purpose, might or might not get the job done, and let our stakeholders discover whether it is or isn’t, instead of us?
Why do we allow ourselves to be given ‘tasks to be done’, not problems to be solved?
The problems in the Data “profession”
As I started to write about all the problems I see in the data profession in detail, I started to run out of time (I try to time box my train of thought articles).
So, over to my usual pattern of giving the content I have written so far to my ChatGPT co-writing machine/friend, to give me back a list that I then review, edit and augment to add my personal thoughts and flavour (em dashes etc purposely left in):
Lack of Industry-Wide Standards
Other professions operate within established standards (e.g. plumbing codes, surgical checklists, accounting regulations). Data has no universally accepted standards or codes of conduct across roles.
I.e. we make shit up on a regular basis. And then a few years later we make it all up again.
No Formal Accountability Structures
Surgeons and accountants are held to account by professional bodies if they breach protocols. In data, poor decisions or low-quality outputs rarely lead to professional consequences.
Hell, we can’t even agree on a set of common data modeling techniques we will all use, or the semantic definition of the term “semantics layer”.
Weak or Non-Existent Professional Bodies
Professions like law, medicine, and engineering have strong professional organisations that govern certification, ethics, and education. In data, bodies like DAMA exist, but adoption is inconsistent and their influence limited.
I'll write more on this at the end of the article, as I for one am not a certified member of DAMA.
Role Ambiguity and Title Confusion
“Data professional” can mean wildly different things — analyst, engineer, scientist, steward, architect — with no consistent skillset, training, or pathway.
Specialisation in data is common, but it is common in other professions too; there is specialisation in Accounting and in the Medical professions, so this is no excuse for us.
Tool-Led, Tool Mandated
Unlike plumbers or surgeons who bring their tools and decide the best approach, data professionals are often asked to use tools chosen by others (stakeholders, execs, vendors), undermining their expertise.
For some reason we like to ask what we should use, or wait to be told what to use, rather than bringing the toolkit we are most experienced in using with us and using that to do the work the way we know how to do it best.
Stakeholder Overreach
In other professions, clients describe the problem, and the expert chooses the solution. In data, stakeholders often specify both the what and the how for the work to be done, without the domain expertise to make those decisions.
I think the whole process of a “data request” is the fundamental anti-pattern that causes this.
No Agreed Ways of Working
There’s no equivalent of an audit checklist or surgical protocol for delivering data or data products. Every team reinvents their process, leading to inefficiency, poor quality, and constant rework.
Even when you get an external person or organisation in to ‘audit’ your ways of working or what the data team produces, it's more of a review than an audit, where the basis of the review is itself based on opinion, not a set of standards. It's people making shit up to see if somebody else made the right shit up.
Inconsistent Quality Expectations
Rotten tomatoes don’t make it into a chef’s dish. Bad data often makes it into reports, dashboards, and models — with the burden of identifying issues falling on the consumer, not the producer.
In what other profession is it ok to deliver a crap product and then still expect the consumer of that product to buy or use your next crappy product? It's probably a result of the fact that most stakeholders don’t have a choice to “buy” their data products elsewhere. I think GenAI is about to change this one; woe betide the data profession if/when this happens.
Over-Reliance on Individual Heroics
Instead of systemic approaches to quality, reliability, or design, data work often relies on individual knowledge, tribal practices, or undocumented experience.
Teams of one, expert data professionals who hoard and don’t share knowledge, and constant replacement of data teams and data team members all contribute to this one.
Failure to Distinguish Between Art and Engineering
Is data science an art or a science? Is data engineering actually engineering? The profession has a deep identity crisis, lacking clarity on where craft ends and rigour begins.
Ask an “Analytics Engineer” or a “Data Engineer” what educational path they took to get their role, then ask a “Structural Engineer”, and compare the difference.
Poor Incentives and Misaligned Measures
Success is often measured by delivery (a dashboard built, a model trained), not by value (a decision improved, a problem solved), which widens the gap between data teams and business impact.
It's not our fault, but it is our problem. The focus on keeping busy with busy work (number of Jira tickets cleared, # data tables loaded, lines of code written, # of dashboards produced and supported), rather than organisational problems solved or organisational value delivered, does not help us help ourselves.
Task Execution Over Problem Solving
Professionals in mature fields are trained to diagnose and treat. Data teams are frequently handed a “task list” instead of a problem to solve — diminishing autonomy and reducing effectiveness.
Problem solving is a skill, one every data professional needs. But we are often taught tools and technologies, not skills. We allow ourselves to be handed those lists of tasks, rather than demanding to understand the problem to be solved or the job to be done, and using our skills to use data to diagnose and resolve it.
Limited Barriers to Entry
Anyone can become a “data professional” with minimal training. There’s no licensing, no apprenticeship model, and no gatekeeping — which can be both a strength (low barrier to entry) and a flaw (variable quality).
This is only going to get worse in the age of “AI”. The training ground where a junior data person was buddied up with a senior to learn the craft is about to disappear. And with the latest ‘vibe’ coding and “AI” powered natural language data tools, the barrier to entry to this space is dropping to almost zero.
No Shared Language or Canon
Professions have shared frameworks, textbooks, and language. Data lacks a commonly accepted canon — what one team calls “modelling” another might call “dbt code”, leading to confusion and misalignment.
Although I have hope on this one, given the number of data books being published in 2025 to help with this problem.
Lack of Self-Regulation
Many professions self-regulate to preserve credibility. Data teams often chase the latest tech or trend without critically assessing its suitability, maturity, or ethical implications.
This is both a problem in the data domain, where we all love to play with the next greatest tool or technology, and a symptom of our stakeholders constantly looking to the latest “silver bullet” to solve the problems they experience when trying to engage data professionals to help solve their organisational problems with data.
I think my ChatGPT friend nailed most of the problem points, and of course I spent more time editing and adding my thoughts than I expected.
That's another point that Joe Reis highlighted today on the Practical Data Modeling Discord, referencing this article:
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Maybe the “AI” doesn’t make us as efficient as we think it does.
Makes me think: would other professions just jump in and start using new “AI” technologies based on the ‘vibe’ that they made them more efficient?
I think the example of the Medical domain's use of vision-based Machine Learning models to identify cancer etc. from scans shows us they apply a little more rigour to their profession before adopting new shiny things that supposedly make their work easier, faster and better.
It's not just a problem with the Data Engineering “profession”
Joe Reis posted this on the Practical Data Modeling discord:
As I’ve said publicly, data science is to science what a Subway Sandwich Artist is to art
https://discord.com/channels/1327066929122508858/1373091334671437934/1392882337385152602
(if you're not a member, join now!)
The problem of a lack of professional standards and ways of working is rife across the entire data domain.
Maybe data is more of an art than a professional engineering or science discipline; we certainly behave like it is.
But even if this is true, I am not sure that it should be.
So where are our data domain, industry wide, standards?
This week I was privileged enough to be able to present a short overview on the Information Product Canvas template to the DAMA SA meetup hosted by Howard Diesel and Debbie Diesel.
At the beginning of the Meetup session they discussed what is happening with DAMA, DMBOK and CDMP.
I am not a member of DAMA, I have scanned the DMBOK at best, I am not CDMP certified.
A lot of people I respect in the data domain, people I think of as being at the level of a data expert or a data coach, are, I would say, the same as me.
Why aren't these things the standard you need to pass to call yourself a “data professional”?
I need to investigate that question a little more to find out the answer.
Feel free to drop a comment on your thoughts on this one.
Maybe we are “Crafts People” not “Professionals”
Maybe I should liken data people more to an analogy of a furniture restorer or a roof thatcher.
A person who has learned a valuable craft over years and years of experience doing the work to become an expert, rather than a member of a profession based on standards, training people to those standards, and holding people to account for meeting those standards.
What say you?
No one says yes to “vibe surgery”.