Many of us have been using Agile methodologies for doing product development or for doing IT Projects very successfully. I have noticed that Agile methodologies are very well suited for addressing enterprise data and information quality (EDIQ) initiatives. I almost feel like Agile and Enterprise data quality (addressing of) was a match made in heaven.
Let us inspect key tenets of the Agile methodologies and relate those to what one has to go through in addressing enterprise data/information quality issues (EDIQ).
- Active User involvement (Collaborative and Cooperative approach): Fixing/addressing data quality issues has to be done in collaboration with the data stake holders, business users. Creating a virtual team where business, IT, data owners participate is critical for the success of data quality initiatives. While IT provides necessary fire power in terms of technology and means to correct data quality, ultimately it’s the business owners/data owners who will decide the actual fixes.
- Team empowerment for making and implementing decisions: Executive sponsorship and empowerment of the team working on data quality are key components of a successful enterprise data/information quality initiative. Teams should be empowered to make necessary fixes to the data and the processes. They should also be empowered to do enforcement/ implementation of these newly defined/refined processes for addressing immediate data quality and ensuring ongoing data quality standard is met.
- Incremental, small releases and iterate: As we know, big bang, fix it all approach for addressing data quality does not work. In order to address data quality realistically, incremental approach with iterative correction is the best way to go. This has been discussed in couple of recent article in “ Missed it by that much” by Jim Harris, and in my own article
- Time boxing the activity: Requirements for data quality evolve. Usually scope of activities will expand as team starts working on data quality initiative. Key to success is to chunk the work and demonstrate incremental improvements in a time boxed fashion. This almost always forces data quality teams to prioritize the data quality issues and then address them in the priority order (this really helps in optimally deploy resources of the organization to get biggest bang for the buck)
- Testing/Validation is integrated in the process and it is done early and often: This is my favorite. Many times data quality is addressed by first fixing the data itself in environments like data warehouse/marts or alternate repositories for immediate impact on business initiatives. Testing these fixes for their accuracies and validating its impact will provide for a framework as to how you ultimately want data quality issues fixed (What additional process you might want to inject, what additional checks you might want to do at the source system side, what are the patterns you are seeing in the data etc…). Early testing/validation will create immediate impact with business initiatives and business side will be more inclined to invest and dedicate resources in addressing data quality on ongoing basis.
- Transparency and Visibility: All throughout the work one does for fixing data quality, it is extremely important to provide clear visibility into the issues and impact of the data quality, efforts and commitments it will take to fixing data quality and the ongoing progress made towards achieving business goals about data and information quality. Maintaining a scorecard for all data quality fixes and showing trend of improvements is a good way to provide visibility into improvements done in enterprise data quality. I had discussed this in my last article and here is a sample scorecard.
There are many other aspects of Agile methodologies which are applicable to the enterprise data quality initiatives
a) Like capturing requirements at high level and then elaborating those in visual formats in a team setting
b) Don’t expect that you are going to build a perfect solution in first iteration
c) Completing task on hand before taking up another task
d) Clear definition of what a task “completion” means etc…
In summary, I really feel that Agile methodologies can be easily adapted and used in implementing enterprise data/information quality initiatives. Use of Agile methodologies in will ensure higher and quick success. They are like a perfect couple “Made for each other”
In my next post, I will take real life example and compare and contrast actual artifacts of Agile methodologies with the artifacts which are required to be created for enterprise data/information quality (EDIQ) initiatives.
Resources: There are several sites about Agile methodologies, I really like couple of them