The agile part of DataOps
In a previous article, we defined DataOps as a “combination of tools and methods inspired by Agile, DevOps and Lean Manufacturing” (thanks to DataKitchen for this definition).
Let’s focus on the agile part and why it is so relevant for your data use cases.
1. “Data is like a box of chocolates, you never know what you’re gonna get.”
This is the nature of data: you cannot guess the content and quality of your data sources before seeing them. Exploratory Data Analysis is the starting point, and you will have to adjust based on what you find. Some of the metrics or dimensions you specified will turn out to be impossible to build from the available sources. Sometimes you will have to build reference tables because the data is too granular.
This is where Agile comes in. Because your requirements are not stable, you must always be able to adapt and change. A data project with fixed dates and fixed requirements does not exist in real life. This is also the difference between an A team and a normal team. As Mary Poppendieck puts it: “A late change in requirements is a competitive advantage.”
2. Rescheduling is a competency, not a fatality
Agile is about mastering your schedule. You are not guessing anymore. You have your backlog, your user stories, your story points and your velocity, so you can start to calculate when the work will be done and what you will have delivered by then. To quote Robert C. Martin in this brilliant video: “You do agile not to go fast. You do agile to know how fast you are.”
The heart of agile is here: produce data to know where we are and where we will be in 2 or 3 sprints. This is also where, as a team, we can discuss our priorities. It does not mean we have to remove scope; it is about deciding what we should do first.
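The calculation described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the velocity is taken as the plain average of past sprints, and the backlog, story names, point values and function names are all hypothetical.

```python
import math
from statistics import mean


def forecast_sprints(remaining_points, past_velocities):
    """Estimate how many sprints are needed to finish the remaining backlog."""
    if not past_velocities:
        raise ValueError("need at least one completed sprint to measure velocity")
    velocity = mean(past_velocities)  # assumption: simple average of past sprints
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(remaining_points / velocity)


def scope_within(backlog, past_velocities, sprints):
    """Return the stories, taken in priority order, that fit in `sprints` sprints."""
    capacity = mean(past_velocities) * sprints
    planned, used = [], 0
    for name, points in backlog:
        if used + points > capacity:
            break  # stop at the first story that no longer fits
        planned.append(name)
        used += points
    return planned


# Hypothetical backlog, already sorted by priority: (story, points)
backlog = [
    ("ingest sales feed", 8),
    ("customer reference table", 13),
    ("revenue dashboard", 5),
    ("churn metric", 8),
]
past_velocities = [10, 12, 14]  # hypothetical points completed in past sprints

remaining = sum(points for _, points in backlog)
print(forecast_sprints(remaining, past_velocities))       # sprints needed for everything
print(scope_within(backlog, past_velocities, sprints=2))  # what fits in 2 sprints
```

The second function is the priority discussion in code form: nothing is removed from the backlog, the order simply decides what lands in the next sprints.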
3. Technical Excellence or the fastest route is not always a straight line
The forgotten part of agile software development is technical excellence. Even in 2020, Agile is often taken to mean “quick and dirty”. In fact, quick and dirty always ends up slow and dirty because of the technical debt it creates. This is where Agile goes hand in hand with DevOps. You must have the technical stack to be able to deliver. If you do not have it, you have to build it. In the end, you have a better chance of going faster, and you will not push all the quality problems into your run phase.
DataOps is all about this technical excellence. Data is a journey, and you will have tons of pipelines to build. Don’t be Sisyphus, with your data as the boulder.
4. The human side of Data
If you think data is a cold and scientific subject, it is not. In the end, you have users who will use the data, and nothing is more relative than “good data” or a “good dashboard”. Proximity with the business and daily communication are the key success factors for your data use case.
This is the opportunity to capture all the business rules about data quality and convert them into automated data tests.
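As a minimal sketch of what that conversion could look like, each rule heard from the business can be written as a small named check and run against every row of a dataset. The rules, column names and sample rows below are hypothetical, and a real project would likely rely on a dedicated data testing tool rather than this hand-rolled helper.

```python
def check_rows(rows, rules):
    """Apply each named rule to every row; return (row_index, rule_name) failures."""
    failures = []
    for index, row in enumerate(rows):
        for name, rule in rules.items():
            if not rule(row):
                failures.append((index, name))
    return failures


# Hypothetical business rules, each expressed as a predicate on one row
rules = {
    "order_id is present": lambda r: r.get("order_id") is not None,
    "amount is not negative": lambda r: r.get("amount") is not None and r["amount"] >= 0,
    "country is a 2-letter code": lambda r: isinstance(r.get("country"), str)
    and len(r["country"]) == 2,
}

# Hypothetical sample data
rows = [
    {"order_id": 1, "amount": 120.0, "country": "FR"},
    {"order_id": 2, "amount": -5.0, "country": "FR"},
    {"order_id": None, "amount": 30.0, "country": "France"},
]

failures = check_rows(rows, rules)
for index, rule_name in failures:
    print(f"row {index}: failed '{rule_name}'")
```

Keeping the rule names in the words the business users themselves use makes a failing test something they can read and discuss.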