But we’re not talking about a database administrator’s (DBA) iTunes library. We’re talking about highly sensitive and important data that can be lost or compromised.
It’s time to stop treating data as a commodity. We need to create a secure and reliable data recovery plan. And we can get that done by following a few core strategies.
Here are six easy steps you can take to prevent data loss.
Build a Recovery Plan
Novice DBAs think of backups as the starting point for preventing data loss. Experienced senior DBAs know that the real starting point is building the recovery plan.
The first thing to do here is to establish a Recovery Point Objective (RPO) that determines how much data loss is acceptable. Knowing the acceptable level of risk gives DBAs a baseline for where to focus their recovery efforts. Then, work on a Recovery Time Objective (RTO) that shows how long the business can afford to be without its data. Is a two-day restore period acceptable, or does it have to be 15 minutes?
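To make the two objectives concrete, here is a minimal sketch that checks a failure scenario against an RPO and an RTO. The function names, the 15-minute RPO, the two-hour RTO, and the timestamps are all hypothetical values chosen for illustration, not recommendations.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """True if the data written since the last good backup fits within the RPO."""
    return failure_time - last_backup <= rpo


def meets_rto(restore_duration: timedelta, rto: timedelta) -> bool:
    """True if the measured restore time fits within the RTO."""
    return restore_duration <= rto


# Hypothetical targets and measurements.
rpo = timedelta(minutes=15)
rto = timedelta(hours=2)
last_backup = datetime(2024, 1, 1, 12, 0)
failure_time = datetime(2024, 1, 1, 12, 40)

rpo_ok = meets_rpo(last_backup, failure_time, rpo)   # 40 minutes of data at risk
rto_ok = meets_rto(timedelta(minutes=90), rto)       # restore took 90 minutes
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this scenario the restore is fast enough, but backups are too infrequent for the stated RPO, so the backup schedule is what needs to change.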
Finally, remember that “high availability” and “disaster recovery” are different. A DBA managing three nodes with data flowing between each may assume that if something happens to one node the other two will still be available. But replication copies mistakes as faithfully as it copies good data, so an error on one node will most likely be replicated across all of them. You had better have a recovery plan in place when this happens.
If not, then you should consider having an updated resume.
Understand That Snapshots != Database Backups
There’s a surprising amount of confusion about the differences between database backups, server tape backups, and snapshots. Many administrators mistakenly believe that a storage area network (SAN) snapshot is good enough as a database backup, but that snapshot is only a set of data reference markers. The same issue exists with VM snapshots. Remember that a true backup is one that allows you to recover your data to a transactionally consistent view at a specific point in time.
Also consider the backup rule of three, where you save three copies of everything, in two different formats, and with one off-site backup. Does this contain hints of paranoia? Perhaps. But it also perfectly illustrates what constitutes a backup, and how it should be done.
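The rule of three is easy to check mechanically. The sketch below assumes a hypothetical inventory of backup copies, and it treats “format” as the storage media type, which is one possible reading of the rule. The class and field names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str   # where the copy lives (hypothetical label)
    media: str      # storage format, e.g. "disk", "tape", "object-storage"
    offsite: bool   # True if the copy is kept away from the primary site


def satisfies_rule_of_three(copies: list[BackupCopy]) -> bool:
    """Three copies, at least two different formats, at least one off-site."""
    enough_copies = len(copies) >= 3
    enough_formats = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_formats and has_offsite


inventory = [
    BackupCopy("primary-datacenter", "disk", offsite=False),
    BackupCopy("primary-datacenter", "tape", offsite=False),
    BackupCopy("remote-vault", "tape", offsite=True),
]
print(satisfies_rule_of_three(inventory))
```

Dropping the remote copy from the inventory fails the check twice over: only two copies remain, and none of them is off-site.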
Make Sure the Backups Are Working
There is only one way to know if your backups are working properly, and that is to try doing a restore. A successful restore proves that the backups are running, not failing, and that they are actually usable. It also gives you a way to verify that your recovery plan is working and meeting your RPO and RTO.
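As a small illustration of a restore test, the sketch below uses SQLite from the Python standard library as a stand-in for a production database engine. It takes a backup, restores it into a fresh database, times the restore against a hypothetical RTO, and compares the result with the source. The function names and the five-second RTO are assumptions for this example; a real restore test would use your database platform’s own backup and restore tooling.

```python
import sqlite3
import time


def table_counts(conn: sqlite3.Connection) -> dict:
    """Row count for every user table, used to compare source and restored copy."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0] for t in tables}


def restore_test(source: sqlite3.Connection, rto_seconds: float) -> dict:
    """Back up `source`, restore the backup elsewhere, and verify the result."""
    backup = sqlite3.connect(":memory:")
    source.backup(backup)                      # take the backup

    started = time.perf_counter()
    restored = sqlite3.connect(":memory:")
    backup.backup(restored)                    # restore into a fresh database
    elapsed = time.perf_counter() - started

    integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
    return {
        "integrity_ok": integrity == "ok",
        "counts_match": table_counts(source) == table_counts(restored),
        "within_rto": elapsed <= rto_seconds,
    }


# Hypothetical source database with a little data in it.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
source.executemany("INSERT INTO orders (total) VALUES (?)", [(9.99,), (24.50,), (3.75,)])
source.commit()

report = restore_test(source, rto_seconds=5.0)
print(report)
```

The point is the shape of the check: restore somewhere other than production, confirm the data is intact, and record how long it took.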
Use Encryption
Data at rest on the server should always be encrypted, and that protection should cover the database backups as well as the database itself. There are a couple of options for this. DBAs can either encrypt the database backup file itself, or encrypt the entire database. That way, if someone takes a backup, they won’t be able to access the information without a key.
DBAs must also ensure that if a backup device is lost or stolen, the data stored on the device remains inaccessible to anyone without the proper keys. Disk-level encryption tools like BitLocker can be useful in this capacity.
Monitor and Collect Data
Real-time data collection and real-time monitoring should also be used to help protect data. Combined with network monitoring and other analysis software, data collection and monitoring will improve performance, reduce outages, and maintain network and data availability.
Collecting data in real time allows administrators to perform proper data analysis and forensics, making it easier to track down the cause of an intrusion, which can itself be detected through monitoring. Combined with log and event management, this gives DBAs the visibility to identify potential threats through unusual queries or suspected anomalies. They can then compare the queries to their historical information to gauge whether or not the requests represent potential intrusions.
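One simple way to compare current activity with historical information is a z-score test. The sketch below is an assumed approach for illustration, not a description of any particular monitoring product. It flags an hourly query count that sits more than three standard deviations from the historical mean; the threshold and the sample numbers are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], current: float, threshold: float = 3.0) -> bool:
    """Flag `current` if it is more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return current != mu
    return abs(current - mu) / sigma > threshold


# Hypothetical hourly query counts for one account.
history = [120, 115, 130, 125, 118, 122, 127]

normal_hour = is_anomalous(history, 124)
suspicious_hour = is_anomalous(history, 900)
print(normal_hour, suspicious_hour)
```

A flagged value is only a prompt to investigate. Legitimate batch jobs or month-end reports can trigger the same signal as an intrusion.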
Test, Test, Test
This is assuming a DBA has already tested backups, but let’s make it a little more interesting. Let’s say a DBA is managing an environment with 3,000 databases. It’s impossible to restore them every night; there’s simply not enough space or time.
In this case, DBAs should take a random sampling of their databases to test. Shoot for a sample size that gives a confidence level of at least 95 percent across the 3,000 databases in deployment, while leaving a small margin of error (much like a political poll). From this information DBAs can gain confidence that they will be able to recover any database they administer, even if that database is in a large pool. If you’re interested in learning more, check out this post, which gets into further detail on database sampling.
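For a sense of scale, the sketch below works out a sample size using the standard formula for estimating a proportion with a finite population correction, reading the 95 percent figure as a confidence level, as in polling. The 5 percent margin of error, the worst-case proportion of 0.5, and the database names are assumptions made for this example.

```python
import math
import random


def sample_size(population: int, z: float = 1.96, margin: float = 0.05, p: float = 0.5) -> int:
    """Sample size for a proportion, with finite population correction.

    z=1.96 corresponds to a 95 percent confidence level; p=0.5 is the
    most conservative assumption about the underlying failure rate.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)     # size for an unlimited population
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Hypothetical inventory of 3,000 database names.
databases = [f"db_{i:04d}" for i in range(3000)]

n = sample_size(len(databases))
rng = random.Random(42)                 # fixed seed only so the demo is repeatable
to_restore = rng.sample(databases, n)   # draw without replacement
print(n, to_restore[:3])
```

Under these assumptions, about 341 test restores are enough, which is far more manageable than 3,000. Drawing a fresh random sample on each test cycle means every database eventually gets a turn.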
Data is your most precious asset. Don’t treat it like it’s anything but that. Make sure no one is leaving server tapes lying around cubicles, practice the backup rule of three, and, above all, develop a sound data recovery plan.