Data & Agile: Communicate Early, Communicate Often
Data people and developers don't really get along very well. There have been lots of reasons for this historically, and, I'm guessing, as time goes on, there will continue to be conflict between these two groups of people. Sometimes these disagreements are petty; others are more fundamental. One of the areas I've seen cause the most strife is shared project work using an Agile software development lifecycle. I know talking about Agile methodologies and data-related projects/items in the same sentence is a recipe for a serious religious battle, but here I want to keep the conversation to a specific couple of items.
The first of these two items is what happens when an application is developed using an ORM and a language that allow the dev team to not focus on the database or its design. Instead, the engineer(s) only need to write code and allow the ORM to design and build the database schema underneath. (Although this has been around for longer than Agile processes have been, I've seen a lot more of it on Agile projects.) This can lead to disconnects for a Development DBA-type person tasked with ensuring good database performance for the new application or for a Business Intelligence developer extracting data to supply to a Data Mart and/or Warehouse.
Kind of by its nature, this use of an ORM means the data model might not be done until a lot of the application itself is developed…and this might be pretty late in the game. In ETL Land, development can't really start until actual columns and tables exist. Furthermore, if anything changes later on, it can take a lot of effort to rework the ETL processes. For a DBA who is interested in performance-tuning new data objects/elements, there may not be much to do here: the model is defined in code, and there isn't an abstraction layer that can "protect" the application from changes the DBA may want to make to improve performance.
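To make the "anything changes later on" problem a bit more concrete, here is a minimal sketch of a schema drift check an ETL developer might run against a database the ORM keeps reshaping. Everything in it is an assumption for illustration: the customer table, its columns, the EXPECTED_COLUMNS mapping, and the use of an in-memory SQLite database stand in for whatever the real application and ETL tooling look like.

```python
import sqlite3

# Hypothetical: the columns the ETL job was built against, keyed by table name.
EXPECTED_COLUMNS = {
    "customer": {"id", "name", "email"},
}


def actual_columns(conn, table):
    """Return the set of column names SQLite reports for a table."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1] for row in rows}  # index 1 of each row is the column name


def schema_drift(conn, expected):
    """Map each drifted table to a (missing, unexpected) pair of column sets."""
    drift = {}
    for table, wanted in expected.items():
        found = actual_columns(conn, table)
        missing, unexpected = wanted - found, found - wanted
        if missing or unexpected:
            drift[table] = (missing, unexpected)
    return drift


def demo():
    conn = sqlite3.connect(":memory:")
    # Pretend the ORM renamed "email" to "email_address" in a later sprint.
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT, email_address TEXT)"
    )
    return schema_drift(conn, EXPECTED_COLUMNS)


if __name__ == "__main__":
    print(demo())
```

A check like this doesn't replace talking to the dev team, but running it on every build at least turns a silent rename into a visible failure before the ETL load breaks.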
The other problem affects Business Intelligence projects a little more specifically. In my experience, it's easy for answers to "why" questions that have already been asked to get lost in the documentation of User Stories and their associated Acceptance Criteria. Understanding why data elements are defined the way they are is super-important to designing useful BI solutions. Of course, the BI developer is going to want/need to talk to the SMEs directly, but an Agile project's schedule doesn't always allot time for this.
I've found the best way to handle all this is focusing on an old problem in IT and one of the fundamental tenets of the Agile method: Communication. I'll also follow that up with a close second-place: Teamwork. Of course, these things should be going on from Day 1 with any project…but they are especially important if either item discussed above is threatening to cause major problems on a project. As data people, we should work with the development team (and the Business Analysts, if applicable) from the get-go, participating in early business-y discussions so we can get all of the backstory. We can help the dev team with data design to an extent, too. From a pure DBA perspective, there's still an opportunity to work on indexing strategies in this scenario, but it takes good communication.
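As a rough illustration of the indexing point, the sketch below shows how an index added alongside an ORM-generated table changes the query plan without touching the application's model code. The orders table, the idx_orders_customer index, and the query are all hypothetical, and SQLite is used only because it keeps the example self-contained; a real engagement would use the actual queries the dev team shares.

```python
import sqlite3

# Hypothetical query the application issues through its ORM.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan_details(conn):
    """Return the human-readable steps SQLite plans for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return [row[3] for row in rows]  # index 3 of each row is the detail text


def demo():
    conn = sqlite3.connect(":memory:")
    # Stand-in for a table the ORM created with no secondary indexes.
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    before = plan_details(conn)
    # The DBA's contribution: an index matching the query's filter column.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = plan_details(conn)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)
    print("after: ", after)
```

The catch is that the DBA only knows which columns to index if someone tells them which queries the application will actually run, which is where the communication comes in.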
Nosing into this process will take some convincing if a shop's process is already pretty stable. It may even involve "volunteering" some time for the first couple of projects, but I'm pretty confident that everyone will quickly see the benefits, both in the quality of the project outcome and in how little time anyone spends "waiting" on the data team.
I've had mixed feelings (and results) working this type of project, but with good, open communication, things can go alright. For readers who have been on these projects, how have they gone? Are data folks included directly as part of the Agile team? Has that helped make things run more smoothly?