How to Avoid Data Quality Entropy in Agile Transformations

Photo by Stephen Phillips - on Unsplash

Picture it: you have set up your organization to collect predictable, cohesive data. You have what you need and you’re getting things done. Everything is steady. Then one day, your technology partners decide to undergo an agile transformation. All of a sudden, you’re overwhelmed by much more rapid change in the user experience and a related uptick in missing or improperly collected data. You ask yourself what happened to cause such a foundational shift.

Why Does Data Quality Suffer in Agile Transformations?

Agile methodologies are supposed to lead to better code quality. So why does data quality suffer?

Many top-down agile transformations, in which management dictates that teams change their ways of working to fit a framework such as Scrum, include a hyper-focus on speed of delivery. In this new environment, features or components are added more quickly than before. Under pressure from management, newly formed agile teams put speed of delivery ahead of most other things, including analytics tracking and data collection standards. As a result, website updates go to production with broken or incomplete data collection, and sometimes with no tagging at all.

These gaps in tracking often go unnoticed until you check your reporting to see how new features or pages are performing. Even if you have tools in place that tell you immediately when data is not being collected, simply knowing the problem exists is not enough. Your next step is to submit a bug report to the team, which then has to be prioritized, scheduled, and worked. Your bug may not be important enough to fix within a sprint or two, so it may go weeks or months without resolution. The same process applies even when you use a tag management system (TMS).

Adding to this, natural attrition and shifts in team dynamics bring additional challenges: new developers join the team and seasoned developers leave, and scrum masters and product owners may transition as well. In this whirlwind of activity, and as organizations push for faster delivery, the rules governing analytics tend to be deprioritized or lose focus entirely.

As a result, over time, analytics tracking is done improperly (or not at all), fixes are not prioritized or well understood, and an organization’s downstream data quality suffers.

How to Work Better, Faster

Deployment tools that support continuous product design, such as feature flags, can help. We have found that many data inconsistencies are introduced not by TMS publishes but by changes in the application itself. Feature flags allow for faster, more nimble changes in production deployments. When a development team introduces an anomaly in the data, it can be detected more quickly during the scaled rollout and fixed within a sprint, because the problem shows up during the test cycle instead of post-release. (To learn more about how Data Sentinel can be used to automatically detect these issues, please reach out!)
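To make the scaled-rollout idea concrete, here is a minimal sketch in Python. It assumes a hand-rolled percentage flag and a simple "tracking coverage" comparison between users who see the new feature and users who do not. The function names, the session shape (`{"events": <count>}`), and the 5% threshold are all hypothetical choices for illustration, not part of any particular feature-flag product or of Data Sentinel.

```python
import hashlib


def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Deterministically place a user in a 0-99 bucket and compare it
    to the rollout percentage, so the same user always gets the same answer."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent


def tracking_coverage(sessions: list[dict]) -> float:
    """Share of sessions that emitted at least one analytics event.
    Each session is assumed to look like {"events": <int>} (hypothetical shape)."""
    if not sessions:
        return 0.0
    tracked = sum(1 for s in sessions if s["events"] > 0)
    return tracked / len(sessions)


def rollout_looks_healthy(control: list[dict], treatment: list[dict],
                          max_drop: float = 0.05) -> bool:
    """Flag the rollout as unhealthy if coverage in the flagged-on group
    falls more than max_drop below the flagged-off group.
    The 0.05 default is an arbitrary illustrative threshold."""
    drop = tracking_coverage(control) - tracking_coverage(treatment)
    return drop <= max_drop
```

A check like this could run while the flag is at a small percentage. If `rollout_looks_healthy` returns `False`, the team can hold the rollout and fix the tracking inside the same sprint rather than after a full release.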

You’ll also want to be mindful about communicating the value of data collection to the delivery team. Agile transformations put a huge focus on user stories: the “why” behind the work. You will need to establish a healthy working relationship with your development teams and clearly articulate why you require reliable, stringent data collection standards. By doing so, you will be able to build credibility. In a more perfect world, the teams could build data collection validation into their user stories as part of the acceptance criteria, allowing you to provide feedback along the way instead of discovering problems in the data weeks or months after the work is done.
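As one way to picture data collection validation as an acceptance criterion, the sketch below checks a tracking event against a small set of required fields. The field names (`event_name`, `page_id`, `timestamp`) and the `validate_event` helper are hypothetical; a real team would substitute its own data collection standard.

```python
# Hypothetical data collection standard: field name -> expected Python type.
REQUIRED_FIELDS = {
    "event_name": str,
    "page_id": str,
    "timestamp": int,
}


def validate_event(event: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means
    the event meets the (assumed) standard above."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
        elif expected_type is str and not event[field].strip():
            problems.append(f"empty value for {field}")
    return problems
```

A user story could then carry an acceptance criterion along the lines of "every event the feature emits passes `validate_event` with no problems", which a developer can assert in an automated test before the story is marked done.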

Taking these actions will help ensure that data collection is at the forefront of a developer’s mind as they develop, rather than an afterthought or a matter for post-release data fixes.




Data Sentinel