Investigating the Database Family Tree
Investigating your family tree can be an interesting experience. For example, what if you discovered you were related to a famous person who won a Nobel Prize or performed a heroic act? Conversely, what if you realized you had an ancestor who was an infamous criminal?
Much like examining your genealogy can be an exciting adventure, looking at the family tree of your database can prove to be just as rewarding.
Databases occasionally undergo a phenomenon known as drift. Drift happens when two entities—whether it’s tables, views, procedures, or data—are no longer congruent with each other. When this occurs, there’s a drift between those two things. You must find the database drift and reconcile the differences to prevent further harm.
The longer the drift remains, the harder it will likely be to correct it. There is a concept known as “data gravity,” which means the more data you have in any given structure, the harder it is to move the data into any other structure.
Just as your genealogy shows you how you relate to other individuals in your family tree, the database family tree depicts how individual tables relate to one another within the database. These relationships between tables help define how data is stored within the database. For example, a foreign key will depict a parent/child relationship between data elements, such as shown below.
Figure 1—An example of a “Database Family Tree” (Source: https://www.flickr.com/photos/caius/2300154566/)
Your genealogy could reveal certain health concerns if you discovered your great aunt on your mother’s side had some affliction that could ultimately be passed onto you. Similarly, a database family tree can reveal potential concerns in the branches where another branch or a relationship should be.
For example, let’s examine the foreign key relationship example from Figure 1. If you have two tables such as Department and Course and they don’t have an existing foreign key defined, then data from the Department table could be deleted, leaving orphaned records in the Course table.
Figure 2 - Table Relationships
Constraints such as foreign key relationships help ensure the data relates to another data element in a way that enforces data consistency. Failing to have this constraint in place could eventually lead to several issues if it’s not corrected.
Other than removing or altering the constraint, there’s no way for an application to manipulate the data in violation of the defined relationship. The foreign key enforces the relationship. You could say it enforces the branch in the family tree.
Furthermore, if the family tree between two databases is off, such as between Production and Development, then you have schema drift. Schema drift occurs when the schema of an object in the source database doesn’t mirror the schema of the target database. The family tree is no longer the same across the two databases. If you don’t have any type of continuous integration or deployment process configured, schema drift usually means something has made a change to the target database without correctly updating the source database, or the source code deployment process has broken.
Dependents are entities that rely on another entity. Kids, for example, are dependents because they rely on parents to provide shelter, clothes, food, and pretty much every part of their wellbeing. Genealogical trees show necessarily dependent descendants.
Spoiler! Database objects can also have dependents. For example, a stored procedure could have multiple dependencies on tables or views to run. If those specific tables or views do not exist, then the procedure cannot function properly. In large databases, discovering dependencies can be a pain unless you know where to look. You can do this within SQL Server Management studio as shown in Figure 3.
Figure 3 - Dependencies
In the above example, the procedure p_sel_get_orignization_by_employee is dependent on the Employee table and two scalar functions, GetAncestor and ToString. You can alter the view to see what objects are dependent on this procedure. If you had a large procedure, however, this list could be cumbersome and difficult to read.
There are several ways you could look for schema drift or missing relationships, but none of them are complete solutions. Here are some ways you could look at the database family tree:
- Use SQL Server Management Studio
- Write a script to print out the objects and manually examine the output
- Use Visual Studio with database projects
- Purchase a third-party ERD tool
- Find free tools
Each of these options has a place in the toolbelt for identifying and examining drift. Third-party entity relationship diagram (ERD) tools tend to have high per-user cost. While there are some free and open-source tools in this space, those tools often don’t offer the robust functionality you need to manage dependency tracking or source code generation. Ultimately, the right answer is to locate the database schema in a source control system to ensure schema drift is as rare as possible.
While the database family tree might not be as exciting as your own genealogy, taking the time to evaluate the differences in the family tree of your databases could shed some light on issues that need to be corrected. Schema drift within the tree can cause issues. Don’t be afraid to look at the past to make appropriate decisions for the future. After all, the data you save could be your own.
When you start researching tools to help manage schema relationships or dependences, look at the SolarWinds DBA xPress free tool. It provides a high degree of visibility into your database family tree. This will help you evaluate the overall health of your database and allow you to make any needed corrections.
To get started with a free download, click here.