Join Shane Gibson as he chats with Chris Bergh on improving your team's way of working by using DataOps patterns.
A list of great DataOps resources
Open Source Data Observability
Open Source DataOps Data Quality TestGen
Listen
Listen on all good podcast hosts or over at:
https://podcast.agiledata.io/e/dataops-patterns-with-chris-bergh-episode-59/
Read
Read the podcast transcript at:
https://agiledata.io/podcast/agiledata-podcast/dataops-patterns-with-chris-bergh/#read
Google NotebookLM Briefing
Briefing Document: DataOps Patterns and Principles with Chris Bergh
Introduction
This document summarises the key themes and ideas discussed in the AgileData Podcast interview with Chris Bergh, co-founder of DataKitchen. Bergh, a techie with a background at NASA and MIT, brings a unique perspective to DataOps, drawing from his experiences with software development, lean manufacturing, and a strong focus on customer success. The conversation unpacks what DataOps really means and how it applies to data and analytics teams, while also touching on the common problems these teams face and some possible solutions.
Key Themes
DataOps is More Than Just DevOps for Data: While DevOps practices like automation and CI/CD are important, they are "necessary but not sufficient" for DataOps. Bergh emphasises that DataOps also requires incorporating principles from lean manufacturing (like statistical process control and data quality testing) and a strong focus on the customer. "It's about productivity. It's about making your customers more successful instead of, let's do more automation, which is a means to get there."
The Data Factory Analogy: Bergh and host Shane Gibson discuss the analogy of a data team being like a factory, but acknowledge it's more complex than that. It's about the "left to right process of integrating data and producing insight," but also about the "perpendicular" value stream of rapidly iterating on and changing data products. This dual nature means data teams have to be both "manufacturing teams and software teams."
Customer-Centricity: A central idea is the need for data teams to move away from a purely technical focus to become more customer-centric, "helping teams get less focused on technology and more focused on their customer." Teams should be focused on delivering value to their customers, not just producing data.
Waste Reduction: The importance of reducing waste is a recurring theme. Waste isn't just about over-engineering or over-collecting data, but also about building things that aren't used and processes that aren't efficient. “It’s really about maximizing what you don’t have to do is the real key here,” says Bergh.
Team Empowerment and Ownership: A big chunk of the discussion is about empowering data teams to make decisions and take ownership of their processes. This includes the ability to stop the line, fix problems, and not be afraid to surface issues. As Bergh puts it, “As a leader, you own the result. Not the person who cut it up, not the supplier. It’s like you own it. And so you have to fix the process." He believes that "95 percent of the time... when you have problems, it’s mainly the process people work in and not the person."
The Need for Metrics (DataOps DORA Metrics): The lack of clear metrics for data teams was identified as a significant problem. Bergh highlights that while software teams use DORA metrics (Deployment Frequency, Lead Time, Change Failure Rate, Mean Time to Recovery), data teams don’t have an equivalent. They should be measuring things like error rates, cycle time, utilization/value of data products, and time spent actually creating value. “As data and analytic teams, we’re so unanalytic about how we run our organizations.”
The Importance of Feedback Loops: The conversation repeatedly comes back to the importance of getting feedback – from customers, from the data itself, and from the team’s processes. These feedback loops are essential for reducing waste, improving quality, and delivering value.
Data Quality is a Linchpin: Bergh believes that data quality issues are often a critical bottleneck for data teams. Poor data quality often leads to blame and a lack of progress. He is pushing for data quality teams to target specific business-critical data elements rather than trying to fix everything at once. They should also give the teams who are ultimately in charge of fixing the data specific, actionable information to work from.
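As an illustration of the metrics theme above, here is a minimal Python sketch of how a data team might compute DORA-style measures from its own change log. It is not taken from the podcast: the `Change` record, its field names, and the `summarise_changes` function are hypothetical, and the sketch assumes the team already records when work on a change started, when it was deployed, and whether it caused an error in production.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    """One deployed change to a data product (hypothetical record shape)."""
    started: datetime      # when work on the change began
    deployed: datetime     # when it reached production
    caused_error: bool     # whether it led to a production error


def summarise_changes(changes, period_days):
    """Return deployment frequency, mean lead time and change failure rate."""
    count = len(changes)
    if count == 0:
        return {"deployments_per_week": 0.0,
                "mean_lead_time_hours": None,
                "change_failure_rate": None}
    # Lead time here means elapsed time from start of work to deployment.
    total_lead = sum((c.deployed - c.started for c in changes), timedelta())
    return {
        "deployments_per_week": count / (period_days / 7),
        "mean_lead_time_hours": total_lead.total_seconds() / 3600 / count,
        "change_failure_rate": sum(c.caused_error for c in changes) / count,
    }


# Made-up sample data: four changes over a 28-day period.
day = datetime(2024, 1, 1)
sample = [
    Change(day, day + timedelta(hours=24), False),
    Change(day, day + timedelta(hours=48), True),
    Change(day, day + timedelta(hours=12), False),
    Change(day, day + timedelta(hours=12), False),
]
metrics = summarise_changes(sample, period_days=28)
print(metrics)
```

Even a rough calculation like this gives a team a baseline for cycle time and error rate that it can track from one period to the next.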
Important Ideas and Facts
DataOps is not new: While some might see DataOps as a recent buzzword, the underlying principles are derived from more established domains.
Teams can make incremental changes: Even in "blame" cultures, there are ways to make small improvements, such as quality circles, refactoring, and automating repeatable tasks.
Version Control is not enough: Teams need to have effective environments and test data in order to properly implement version control.
Teams are often overly focused on technology, not value: This leads to the development of data products that are unused or not valuable.
The "Excel user" problem: The data team often delivers complex data that business users immediately deconstruct in Excel, highlighting a mismatch between what's produced and what's needed. The discussion uses the analogy of a three-kitchen structure where the final meal is often deconstructed.
Leaders should focus on process: Leaders need to take ownership of their team's processes and create environments that enable them to succeed, rather than blaming people for errors.
There needs to be a balance in applying practices: There's not a single "right way" to implement these principles. For example, deciding how many attributes to bring in with data collection has a "balance and context that matters and having the discussion".
"Whiny little bitches": Managers who blame others should be called out.
Data modeling is a lost art: There are experts out there who can model data quickly and accurately, but it's important to enable the rest of the team.
Metrics are key for DataOps: If you aren't measuring the performance of your data teams, it will be difficult to drive change.
Instrument your data to understand value: Data teams should track who is using what data and how often to gain visibility into what their customers value.
Data contracts: Data contracts can be a way of getting teams to focus on the end-to-end cycle and take more accountability.
DataOps = Reduction of Waste: It can be viewed as a method to reduce waste in data processes.
Quality Circles: A practice in which the team looks at every error and tries to fix it. A simple way to start is to put all errors into a spreadsheet.
Data Quality is the key: Data quality can be the key to moving forward with Agile data teams.
There are too many "Field of Dreams" Data Teams: Building it doesn't mean they will come; value needs to be the focus.
Open Source Tools and Resources: DataKitchen offers open-source tools, training programs, and blog content that can help teams implement DataOps principles.
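The idea of instrumenting data to understand value, listed above, can be sketched in a few lines of Python. This is an illustrative assumption rather than anything described in the podcast: the access-log shape (dictionaries with `user` and `product` keys), the catalogue list, and both function names are hypothetical.

```python
from collections import Counter


def usage_by_product(access_log):
    """Count how many times each data product was accessed."""
    return Counter(entry["product"] for entry in access_log)


def unused_products(catalogue, access_log):
    """List catalogued data products that nobody accessed."""
    used = usage_by_product(access_log)
    return sorted(p for p in catalogue if used[p] == 0)


# Made-up sample data for illustration only.
catalogue = ["sales_dashboard", "churn_model", "inventory_report"]
access_log = [
    {"user": "ana", "product": "sales_dashboard"},
    {"user": "ben", "product": "sales_dashboard"},
    {"user": "ana", "product": "churn_model"},
]

usage = usage_by_product(access_log)
unused = unused_products(catalogue, access_log)
print(usage.most_common())
print(unused)
```

Products that show up in the unused list are candidates for the "Field of Dreams" conversation: either find out why customers are not using them or stop spending effort on them.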
Quotes for Emphasis
"It’s a set of technical practices and management paradigms for data and analytic teams to drive customer success and be more productive." – Chris Bergh defining DataOps.
"I think the patterns here are not new... I just think we're, if anything, we're just trying to take these principles and say, look, they have a unique instantiation and data and analytics, but the ideas are old." – Chris Bergh on the origins of DataOps principles.
“I’m really sick of... the whiny little bitches who blame other people. I’m really tired of it, especially the people who lead teams.” – Chris Bergh on leadership accountability.
“The most important metadata of any organization is code.” – Chris Bergh on active metadata.
"If your customers trust your data, that means you have very low errors." – Chris Bergh
Conclusion
This discussion provides a valuable framework for understanding and implementing DataOps. Chris Bergh’s insights, drawn from a long and varied career, highlight the importance of customer focus, continuous improvement, and team empowerment. The conversation also serves as a call to action for data teams to step up and take ownership of their processes and, ultimately, the value they provide to the wider organisation. By focusing on waste reduction and implementing solid metrics, data teams can move from being seen as cost centres to strategic assets within an organisation.