There is a lot of buzz lately around Agile BI and Agile data warehousing. Typical questions include: Does it produce better results? Is it faster? How much does it cost? And, most importantly, is the Agile methodology a fit for data warehousing and business intelligence projects?
What is Agile?
Companies are marketing their BI and DW products as agile solutions, meaning that they are easy to work with, adaptable to business changes and quick to stand up (think BI data discovery tools and data warehouse appliances). While the topic of an enterprise agile solution is worthy of discussion, there is another type of agile. In project management circles, when the word Agile comes up in conversation, it is always in the context of the Agile software development methodology, which is also the context for this article. I won't spend time discussing the Agile Manifesto or its guiding principles here. Instead, I'll talk about how Agile applies to BI/DW projects. If Agile techniques are new to you, there are plenty of resources available on the web to learn more, with Wikipedia being as good a starting point as any.
Why Choose Agile?
Does Agile produce better results than other methodologies when building a DW/BI application? The only real answer is that it depends. Building a BI application, a data warehouse or some other type of software is a complex process regardless of the methodology you choose. It is a commonly quoted statistic that over 50% of projects fail and some studies have even measured failure rates as high as 80%. So does using Agile increase the success percentage? The short answer is probably not.
So if it does not improve project success rates, then why bother? If we are all familiar with traditional waterfall delivery methods, why not stick with what we know? There are two great answers to these questions, and in my opinion they confirm why we should all be delivering projects using Agile techniques. First, Agile delivers business value to our customers quickly, which is critical because time to market is a competitive advantage. For example, give the business access to customer data from the first source while you work on loading the second. The business can start analyzing the available data immediately, and as sources are added its results become richer. Second, delivering functionality incrementally puts a budgetary check-and-balance system in place. If the project is going to be a failure, you have an opportunity to make that assessment after each incremental delivery. In a waterfall project there is a good chance you won't know (or won't admit) the project has failed until you are delivering it. So while the total project cost may be similar at the end, there are huge savings if you can recognize a project failure early.
Will it Work for Warehousing?
Agile may deliver value to customers faster, and may be more cost effective, but can it work for Business Intelligence and Data Warehouse projects? Absolutely! It is interesting that, given the benefits, it has not yet gained mainstream acceptance in the warehousing and reporting project space. Agile has been slow to catch on because project teams are resistant to change. As more projects are successfully delivered, I believe we will see Agile become the delivery methodology of choice. Let's start with why it makes sense in data warehousing. A warehouse project plan can have iterations built in fairly easily if framed correctly. Splitting delivery by subject area and/or data source is a natural way to segment the schedule and provides an easy way to deliver value early and often. The quicker we can get data loaded and into production, the sooner data consumers can use it for decision making. One potential downside of a subject area or source-based approach is the risk of implementing without strong architectural and modeling guidelines in place. Knowing how keys will be structured, how data quality issues will be handled and what the overall warehouse roadmap looks like is critical to an Agile delivery of a warehouse solution. Upfront planning around modeling and architecture will smooth the subject or source-based iterations. As with all projects, communication is critical, and with an Agile delivery it is important to communicate the inevitability of rework as the warehouse evolves. Setting these expectations ahead of time will pay dividends when the first major change is required. Even with potential rework, the benefits gained by putting data in the hands of the business earlier more than pay for the cost of that future rework.
Will it Work for Business Intelligence?
Once a subject area or source data load is ready, the next step is to provide data to consumers using a Business Intelligence tool. Once again, Agile techniques are ideal for building and deploying a business intelligence application. Building the semantic layer can be done in a similar fashion to developing the warehouse itself. As long as a clear architecture roadmap has been set, the semantic layer can be a fast follower to the subject area or source file ETL load, perhaps even rolling out with the subject area itself. Individual reports, dashboards and cubes can be managed and prioritized via a backlog and delivered as soon as they are finished, adding incremental value on a regular basis.
Delivering a warehouse project using Agile will feel uncomfortable at first. It probably won't work perfectly, and you may have to adapt the processes to work with your organizational policies and procedures. Stick with it, listen to your team and don't be afraid to make adjustments as you go (it is agile, after all). If you go in with an open mind and a clear roadmap, Agile will deliver business value faster than you ever imagined possible.