Each new President-Elect talks about the goals they have for their first 100 days in office. Life as a new (or accidental) DBA will be no different. Well, maybe a little different, because as a new DBA, you likely have a 90-day probationary period.

 

That’s right: a DBA has less time to show their value than the president! That means you had better be prepared to hit the ground running. But don’t panic! I’ve put together this post to help you get started on the right foot.

 

What DBAs Have in Common With the President

 

DBAs have much in common with the president. First, half the people around you doubt whether you are qualified to hold your job. Second, every time you make a decision or plot a course of action, you will be criticized even by your supporters. Third, you will be judged by what you accomplish in your first one hundred days, good or bad, even if it was something not in your control.

 

Also consider that the president is subject to approval ratings. You will have your own version of this: your annual performance review. Come review time, you want your approval ratings to be as high as possible.

 

Right about now, you're probably reading this and thinking that being a DBA is the worst job in all of IT. Perhaps it is, but as long as you are aware of these things when you start, the role may not be as awful as it sounds.

 

Your first objective is to create an action plan. If you think you can show up, grab a slice of bacon, and ease into your new position, then you are mistaken. Your bacon can wait until after you start gathering the information you need in order to do your new job effectively.

 

Here’s a quick list of the questions you need to ask yourself:

 

  • What servers are you responsible for?
  • What applications are you expected to support?
  • What time of day are the applications used?
  • Who are your customers?
  • Are the databases being backed up properly right now?
  • How would you know if the backups were failing?

 

Even that list of basics shows how the role of a DBA can quickly become overwhelming. That is why you need to put together a checklist of the bare essentials and get started. Then you can start making short-term plans for improvements.

 

Trust me, it is easier than it sounds. You just need to be organized.

 

The Initial Checklist

 

By now, you should be sitting at your desk on what we will call Day Zero. Your initial meetings with HR are over, you have gotten a tour of the place, and you are making sure you have the access you need to get started.

 

The very first piece of information you need is a list of servers and systems you are responsible for. Without that little nugget of information, it will be difficult to make headway as you start your long, slow journey upstream.

 

Because I like making lists and categorizing things, I have divided this initial checklist into sections. One section pertains to gathering information on what I simply call your stuff. Another section deals with finding information on your customer’s stuff. The last section is what I call your action plans. Focus your efforts on these three areas on Day Zero: find your stuff, find your customer’s stuff, and start making an action plan.

 

A sample checklist might look like this:

 

  1. Create a list of servers
  2. Check that database backups are running
  3. Spot check and verify that you can do a restore from one of those backups
  4. Build a list of customers
  5. List the most important databases
  6. List upcoming deliverables/projects
  7. Establish environmental baselines
    1. Server configuration check
    2. Instance configuration check
    3. Database configuration check
  8. Compose your recovery plan (not your backup plan, your recovery plan)

 

Notice that the checklist is missing things people will tell you are a must for DBAs to be doing daily: index maintenance, performance tuning, reviewing event logs, and so on. Sure, all of those things are necessary, but we are still on your list of items for Day Zero. Everything I have mentioned will take you more than a few days to gather. If you get tied up troubleshooting some stored procedure on Day Zero, then you are setting yourself up for a massive failure should a disaster hit before you have had time to document your recovery plan.

 

Would you rather be a hero for telling that developer to stop writing cursors or a hero for informing a customer that you can have their database available again in less than 30 minutes? I know which choice I would make so soon after starting a new position.

 

On Day Zero, explain to your manager that you will be gathering this inventory data first. By taking the initiative to perform due diligence, you are showing them that your first mission is to safeguard their data, your job, and their job as well. They probably won’t be able to produce the inventory for you, and they are going to want it even more than you do. You will have plenty of time later on for the other stuff. It will fall naturally into your environment baseline and subsequent action plans as you bring standards to your enterprise.

 

Let’s look at why each of the items in the checklist is important to address from Day Zero.

 

Create a List of Servers

 

Trust me: at some point, someone will walk up to you and start talking about a server you never knew existed. And they will be very confused as to why you have never heard of the server, since they work with it all the time. After all, there is a database there, and you are the DBA, so you should already know all of this, right?

 

Do your best to gather as much information right away about the servers you are expected to administer. That way, you will know more about what you are up against and it will help you when it comes time to formulate your action plans. These plans will be very different if you have five or five hundred instances to look after.

 

Start compiling this list by asking your immediate supervisor and go from there. The trail may take you to application managers and server administrators. For example, your boss might say that you are responsible for the payroll databases, but what are “the payroll databases”? You will need to do some detective work to track down the specific databases involved. But this detective work will pay off by deepening your knowledge and understanding of where you work.

 

If you are looking for a technical solution to finding database servers, there are a handful of ways to get the job done. The easiest is to use a third-party monitoring tool that discovers servers and the applications running on them. You could also use free tools like SQL Power Doc out on CodePlex.

 

You should also have a list of servers that are not your responsibility. There is a chance that vendors maintain some systems in your environment. If something goes wrong with one of those servers, it is important to know who is responsible for what. And if someone tells you that you do not need to worry about a server, my advice would be to get that in writing. When disaster strikes, you had better be able to provide proof about the systems that are and are not your responsibility.
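To make the inventory concrete, here is a minimal sketch of how you might record it, assuming nothing more than a plain Python script. The server names, field names, and helper function are all hypothetical; the point is simply that "mine", "not mine", and "not mine, but nobody put it in writing" are three different lists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerRecord:
    name: str
    source: str                          # who told you about this server
    is_mine: bool                        # True if you are expected to administer it
    confirmed_in_writing: bool = False   # do you have written proof of that answer?
    databases: List[str] = field(default_factory=list)


def missing_written_confirmation(servers):
    """Servers you were told to ignore, but with no written proof on file."""
    return [s.name for s in servers
            if not s.is_mine and not s.confirmed_in_writing]


# Hypothetical Day Zero inventory.
inventory = [
    ServerRecord("PAYROLL-SQL01", "your manager", True, databases=["PayrollProd"]),
    ServerRecord("VENDOR-SQL07", "vendor support", False, confirmed_in_writing=True),
    ServerRecord("LEGACY-SQL02", "hallway conversation", False),
]

mine = [s.name for s in inventory if s.is_mine]
print("Responsible for:", mine)
print("Get this in writing:", missing_written_confirmation(inventory))
```

A spreadsheet works just as well. What matters is that every server has an answer to "is it mine?" and a note on who gave you that answer.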

 

Check Database Backups

 

Once you identify the servers you are responsible for, the next step is to verify that the databases are being backed up properly. Do not assume that everything is working perfectly. Check that the backup files exist (both system and user databases) and check to see if there have been any recent failures.

 

You will also want to note the backup schedule for the servers and databases. You can use that information later to verify that the databases are being backed up to meet business requirements. You would not want to find out that the business is expecting a point-in-time restore ability for a database that is only being backed up once a week.
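As a rough illustration of that comparison, the sketch below checks the age of the newest backup for each database against the amount of data loss the business says it can tolerate. The database names, timestamps, and tolerances are made up, and how you collect the "newest backup" times depends entirely on your platform and tooling.

```python
from datetime import datetime, timedelta


def backups_violating_tolerance(last_backup, tolerance, now):
    """Return databases whose newest backup is older than the business allows.

    last_backup: dict of database name -> datetime of its newest backup
    tolerance:   dict of database name -> timedelta of acceptable data loss
    A database with a tolerance but no recorded backup is always reported.
    """
    late = []
    for db, allowed in tolerance.items():
        taken = last_backup.get(db)
        if taken is None or now - taken > allowed:
            late.append(db)
    return sorted(late)


# Hypothetical numbers for illustration only.
now = datetime(2024, 1, 8, 9, 0)
last_backup = {
    "PayrollProd": datetime(2024, 1, 1, 2, 0),   # weekly backup only
    "Intranet": datetime(2024, 1, 8, 2, 0),      # backed up last night
}
tolerance = {
    "PayrollProd": timedelta(minutes=15),        # business expects point-in-time
    "Intranet": timedelta(hours=24),
    "Reporting": timedelta(hours=24),            # no backup found at all
}

print(backups_violating_tolerance(last_backup, tolerance, now))
```

Here the weekly-only payroll database and the database with no backup at all are the ones that get flagged, which is exactly the conversation you want to have with the business before a disaster rather than after.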

 

I cannot stress this enough: if there is one thing you need to focus on as a DBA, it is ensuring that you can recover in the event of a disaster. Any good recovery plan starts with having a reliable database backup strategy.

 

Verify That You Can Restore

 

There is one, and only one, way for you to verify that your backups are good: you need to test that they can be restored. Focus your efforts on any group or set of databases. The real goal here is for you to become familiar with the restore process in your new shop, as well as to verify that the backups are usable.

 

Make certain you know all aspects of the recovery process for your shop before you start poking around on any system of importance. It could save you some embarrassment later, should you sound the alarm that a backup is not valid when it turns out the only thing not valid is your understanding of how things work. And these practice restores are a great way to make certain you are able to meet the RPO and RTO requirements.
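One simple habit is to time every practice restore and compare it against the recovery time you have promised. The sketch below assumes a hypothetical `restore_fn` callable that wraps whatever your shop's real restore procedure is and returns True on success; the stand-in functions here only pretend to do work.

```python
import time


def timed_restore_drill(restore_fn, rto_seconds):
    """Run one practice restore and report whether it met the RTO.

    restore_fn:  a callable (hypothetical) that performs the restore and
                 returns True on success, False on failure
    rto_seconds: the recovery time objective, in seconds
    """
    start = time.perf_counter()
    ok = bool(restore_fn())
    elapsed = time.perf_counter() - start
    return {
        "restored": ok,
        "elapsed_seconds": elapsed,
        # A failed restore never counts as meeting the RTO, however fast it failed.
        "within_rto": ok and elapsed <= rto_seconds,
    }


def fake_restore():
    """Stand-in for a real restore; sleeps briefly and reports success."""
    time.sleep(0.01)
    return True


def failing_restore():
    """Stand-in for a restore that discovers the backup is unusable."""
    return False


print(timed_restore_drill(fake_restore, rto_seconds=30 * 60))
print(timed_restore_drill(failing_restore, rto_seconds=30 * 60))
```

Keep the results. A history of drill times is far more convincing than a promise that a restore "should" take about half an hour.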

 

Build a List of Customers

 

You must find the customers for each of the servers you are responsible for administering. Note that this line of inquiry can result in a very large list. With shared systems, you could find that everyone has a piece of every server!

 

The list of customers is vital information. For example, if you need to reboot a server, it is nice to know who you need to contact in order to explain that the server will be offline for five minutes while it is rebooted. And while you compile your list of customers, it does not hurt to know who the executives are and which servers they are most dependent upon.

 

When you start listing out the customers, you should also start asking about the applications and systems those customers use, and the time of day they are being used the most. You may be surprised to find some systems that people consider relatively minor are used throughout the day while other systems that are considered most important are used only once a month.

 

List the “Most Important” Databases

 

While you gather your list of customers, go one step further and find out what their most important databases are. This could be done by (1) asking them, (2) asking others, and then (3) comparing those lists. You will be surprised to find how many people can forget about some of their systems and need a gentle reminder about their importance. As DBAs, we recognize that some databases are more important than others, especially given any particular time of day, week, or month.

 

For example, you could have a mission-critical data warehouse. Everyone in the company could tell you that this system is vital. What they cannot tell you, however, is that it is only used for three days out of the month. For the rest of the month, the database could be offline for weeks and no one would say a word.

 

That does not mean that when these systems are not used, they are not important. But if 17 different groups mention some tiny database, and each of them considers it to be of minor importance, you may consider it very important because it is touched by so many different people.
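If you jot down which databases each group mentions, counting the mentions takes only a few lines. This is just a sketch with invented group and database names; it ranks databases by how many distinct groups rely on them, not by how important any single group says they are.

```python
from collections import Counter


def rank_by_mentions(mentions):
    """Rank databases by the number of distinct groups that mention them.

    mentions: dict of group name -> list of databases that group relies on
    Returns a list of (database, group_count) pairs, most-mentioned first.
    """
    counts = Counter(
        db
        for dbs in mentions.values()
        for db in set(dbs)          # count each group once per database
    )
    return counts.most_common()


# Hypothetical interview notes.
mentions = {
    "Finance": ["Warehouse", "Lookup"],
    "HR": ["Lookup"],
    "Sales": ["Lookup", "CRM", "Warehouse", "Lookup"],
}

for db, groups in rank_by_mentions(mentions):
    print(f"{db}: mentioned by {groups} group(s)")
```

A database that shows up near the top of this list but near the bottom of everyone's stated priorities is a good candidate for a closer look.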

 

List Upcoming Projects and Deliverables

 

You want to minimize the number of surprises that await you. Knowing what projects are currently planned helps you understand how much time you will be asked to allocate for each one. And do keep in mind that you will be expected to maintain a level of production support, in addition to your project support and the action tasks you are about to start compiling.

 

You’ll also want to know which servers will be decommissioned in the near future so that you don’t waste time performance tuning servers that are on death row.

 

Establish Environmental Baselines

 

Baselining your environment is a necessary function that gets overlooked. The importance of having a documented starting point cannot be stressed enough. Without a starting point as a reference, it will be difficult for you to chart and report your progress over time.

 

You have already done one baseline item: you have evaluated your database backups. You know how large they are, when they are started, and how long they take. Now take the time to document the configurations of the server, the instance, and the individual databases.
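Once a configuration is written down, checking for drift later is a simple comparison. The sketch below assumes you have captured settings as name/value pairs by whatever means your platform offers; the setting names and values shown are illustrative only.

```python
def diff_against_baseline(baseline, current):
    """Return settings that differ between the baseline and the current snapshot.

    Both arguments are dicts of setting name -> value. The result maps each
    changed setting to an (old, new) tuple; a setting missing from one side
    shows up with None on that side.
    """
    changes = {}
    for key in sorted(set(baseline) | set(current)):
        old, new = baseline.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Illustrative snapshots; names and values are assumptions, not recommendations.
baseline = {
    "max server memory (MB)": 16384,
    "max degree of parallelism": 4,
    "data file drive": "C:",
}
current = {
    "max server memory (MB)": 16384,
    "max degree of parallelism": 8,
    "data file drive": "D:",
    "backup compression default": 1,
}

for setting, (old, new) in diff_against_baseline(baseline, current).items():
    print(f"{setting}: {old} -> {new}")
```

The same comparison works at all three levels in the checklist (server, instance, and database) as long as you keep one snapshot per level.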

 

Then you can focus on collecting basic performance metrics: memory, CPU, disk, and network. This is where third-party tools shine, as they do the heavy lifting for you.

 

Compose Your Recovery Plan

 

Notice how I said recovery plan as opposed to backup plan. In your checklist thus far, you have already verified your database backups are running, started to spot check that you can restore from your backups, and have gotten an idea of your important databases. Now is the time to put all of this together in the form of a disaster recovery (DR) plan.

 

Make no mistake about it: should a disaster happen, your job is on the line. If you fail to recover because you are not prepared, then you could easily find yourself reassigned to “special projects” by the end of the week. The best way to avoid that is to practice, practice, practice. Your business should have some scheduled DR tests perhaps once a year, but you should perform your own smaller DR tests on a more frequent basis.

 

And don’t forget about recovering from past days or weeks. If your customer needs a database backup restored from two months ago, make sure you know every step in the process in order to get that job done. If your company uses an offsite tape storage company, and it takes two days to recall a tape from offsite, then you need to communicate that fact to your users ahead of time as part of your DR plans.

 

Track Your Progress

As a DBA, a lot of your work is done behind the scenes. In fact, people will often wonder what it is you do all day, since much of your work is never actually seen by the end-users. Your checklist will serve you well when you try to show people some of the tangible results that you have been delivering.

 

No matter how many people you meet and greet in the coming weeks, you need to provide evidence of tangible results to your manager and others. If your initial checklist shows that you have twenty-five servers, six of which have data and logs on the C: drive and two others of which have no backups at all, it is going to be easy for you to report later that your twenty-five servers now have backups running and all drives configured properly.
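Here is a small sketch of that before-and-after report, using the numbers from the example above. The record layout and field names are hypothetical; any format that lets you count the problems on Day Zero and count them again later will do.

```python
def summarize(servers):
    """Count the headline problems in a list of server records."""
    return {
        "servers": len(servers),
        "data_on_c_drive": sum(1 for s in servers if s["data_on_c"]),
        "no_backups": sum(1 for s in servers if not s["has_backups"]),
    }


def make_server(index, data_on_c=False, has_backups=True):
    """Build one hypothetical server record."""
    return {"name": f"SQL{index:02d}", "data_on_c": data_on_c, "has_backups": has_backups}


# Day Zero: 25 servers, 6 with data and logs on C:, 2 others with no backups.
day_zero = (
    [make_server(i, data_on_c=True) for i in range(1, 7)]
    + [make_server(i, has_backups=False) for i in range(7, 9)]
    + [make_server(i) for i in range(9, 26)]
)

# Later: same servers, problems fixed.
later = [make_server(i) for i in range(1, 26)]

before, after = summarize(day_zero), summarize(later)
for key in before:
    print(f"{key}: {before[key]} -> {after[key]}")
```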

 

One thing I have learned in my years as a DBA: no one cares about effort, only the end result. Make certain you keep track of your progress so that the facts can help provide a way to understand exactly what you have been delivering.

 

 
