Understanding Backup Technologies and What They Can Do for You
The backup technology landscape is almost as complex as the environments it serves to protect. Do you go with point solutions to solve specific problems or a broader solution to consolidate backups into one platform? In this session, we discuss the latest backup approaches for each of the different types of IT environments you may need to protect.
Hi, everyone, I'm Ali Mahmoud, Product Marketing Manager here at SolarWinds.
I'm Keith Young, the backup guru for SolarWinds Backup.
Today we're talking about understanding the backup technology landscape and how to protect your company. So, Keith, when it comes to backup, the thing that I talk to a lot of IT pros about, MSPs, anybody who has anything to do with backup and protecting data, recovery is the absolute number one thing that I hear. When people are screaming, when systems are down, when the business can't run, getting up and running again is absolutely priority number one.
Absolutely. In fact, you're only as good as your last mistake. So, if you're unable to recover quickly and get your customer back up and running within an hour, for example, they'll remember you when it's hour two.
Yep, absolutely. It's that real crunch time when people are hovering over your shoulder and you just want to make sure the data's there. You can recover it. You can get back to that particular system, no matter where it is- if it's local, if it's remote. It can be a real challenge given the variety of backup technologies that are out there.
Yeah, and not just the backup technologies. Let's look at the environments that we're facing as well. We're talking about virtual environments, VMware, Hyper-V, and different operating systems from Mac to Linux too. These are going to have a real impact on how you recover, where you're going to recover to, and what that experience might look like.
Yeah, absolutely. Some of the things we want to talk about today are what are the metrics that you're holding yourself to. How fast do you need to recover? How frequently are you doing your backups to make sure that you haven't lost a certain amount of data? All of the different factors that come into it, and then there's the different locations, different environments, different technologies that you have to support every day.
So, Keith, you've helped a lot of people with their backup set-ups, configurations. What do you think are a couple things that make up a good backup strategy?
Great, let's start from the early days. Tape, disk- anybody can copy data onto a tape or a disk platform. That's really simple to do. That's been around for generations. Fast forward to today, though. How does that work in a disaster scenario- in the on-demand scenarios we live in today? Three, four, five hours to recover from a disk or a tape. Maybe a couple of days. That's not good enough in our world today. But I want to talk a little bit about the 3-2-1 backup strategy because that's at the core of what we do. To some people, though- what does that mean, exactly? Well, we have three copies of data. We have our production data. We have a local copy of the data, and we have an offsite copy. In my world, I like to think that's the Cloud. We're not going to ship that off to any specific site. We're going to do that into a Cloud- an automated process that makes life a lot easier. The backup copies are typically on two different types of media. We want to have that availability for us from a recovery standpoint. This is going to make our lives a lot easier. If history has told us anything, time is of the essence when it comes to recovery.
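As a rough illustration of the 3-2-1 rule described above, here's a minimal sketch (in Python, with made-up plan data) of what checking a backup plan against it might look like:

```python
# 3-2-1 rule: at least 3 copies of the data, on at least 2 different
# media types, with at least 1 copy offsite. All names are hypothetical.

def satisfies_3_2_1(copies):
    """Each copy is a dict like {"medium": "disk", "offsite": False}."""
    total = len(copies)
    media = {c["medium"] for c in copies}
    offsite = sum(1 for c in copies if c["offsite"])
    return total >= 3 and len(media) >= 2 and offsite >= 1

plan = [
    {"medium": "disk",  "offsite": False},   # production data
    {"medium": "disk",  "offsite": False},   # local backup copy
    {"medium": "cloud", "offsite": True},    # automated cloud copy
]
print(satisfies_3_2_1(plan))  # True
```

Dropping the cloud copy, or keeping everything on one medium, makes the same check fail.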
Absolutely, yep. I picture- there's a couple IT people that I deal with on a regular basis. I just picture that person being stuck in that crunch moment and waiting on a tape to come back from some contractor that's taken it offsite for replication. There's got to be a better way, really, at the end of the day. We talk about this digital transformation and so many parts of IT, and the desire that so many IT people have to go away from tape, to go pure digital so they can have data access on demand.
I still don't think that tape is ever going to go away. I think there's still a place for it. It's just not in our world. For the many people that I work with and the IT admins dealing with this every day, it's just a real stress point. We talk about cataloging tapes. We're talking about making a proper inventory and providing proper stewardship of the actual data on the tape itself, from the point of production into storage. This is really challenging for a lot of IT admins. They're busy doing other things- putting out fires in other areas. Let's automate this process. Create a Cloud-based or hybrid-type solution for these types of environments to offload admin tasks that just take up too much time.
Yeah, I really think automation is the key here. You've said it a couple times. The idea that I would have to get up from my desk, walk halfway across the building or to a separate campus to actually go physically retrieve a tape is just a time delay that most businesses can't stomach nowadays.
Actually, I've got an interesting story about that. A customer I was serving many years ago in a tape environment had a massive loader- a 40-tape LTO loader- and this thing had so many tapes that the guys couldn't keep up with cataloging them. Where they were storing them was in a fire safe three feet away from the actual tape loader itself, which, by all accounts, served no purpose, because if something were to happen, all that data's right there. We finally figured out the data problem and where the sprawl was taking place. We consolidated some of that. We moved some to the Cloud, and some was actually better served on the tape. We shrunk their recovery window from five hours to 45 minutes. The idea behind it is that we found a better way to automate and create some efficiencies using a hybrid backup solution.
Yeah, and I don't think you'll hear either of us say that there's a one-size-fits-all answer. There is no one backup technology that solves absolutely every problem on the planet. But I like that strategy of taking 80% of tape away and keeping the 20% where it's absolutely necessary while automating the rest. That's, I think, what most businesses are going through right now.
I'm a big fan of a one A, one B solution. I think that it's important that we can't always rely 100% on one backup solution. I'd love to say we could in some instances but there's going to be those environments that are going to require us to have specific technologies that are going to do certain things for some legacy applications, for example, that can't be put in the Cloud or can't be changed in their current format. So these are some of the things that we have to be considerate of when we are planning out our backup strategy, of course.
Keith, you've dealt with a lot of different scenarios. When it comes to planning, when it comes to metrics, you know, strategizing out how to build those RTOs, RPOs, what are some metrics that you're typically seeing out there?
What I'm typically seeing in the SLA world is the question of how to build them, because that's the MSP's biggest problem in my experience. What does a one-hour SLA look like compared to a four-hour one? How long can we be down? How do we plan these to ensure that our customer is satisfied and it makes sense for them too? We go through a strategic analysis of the business and prioritize the devices and the data that can and can't be down. Because if you ask the CEO of your organization, certainly he's going to say nothing can be down: I have a hundred servers in my organization; they all have to be up all the time; that's just how my business runs. But he has no idea what those hundred servers are doing. They could be development servers, et cetera- all these different things.
That's when you show him the bill. Say, well, that will cost this much. Right? [Laughs]
Absolutely. I'd be more than happy to prioritize all of your devices at the same level, but that doesn't make sense. So, we sit down with the IT administrators and analyze what makes sense to them as for the organization. We create a plan that's based on their budget, and it's also based on what they can live with and live without for the day.
Do you see any particular servers that are prioritized over others recovery time-wise?
Typically, we follow a flow starting with our communication servers. So, Exchange: that certainly wants to be a priority in pretty much any organization. If it's not already hosted Exchange in an Office 365-type environment- if it's on-premise Exchange- certainly we want to have that up. Next up are our core applications. What does that look like in our organization? Is it our CRM? Is it the applications that are looking after our manufacturing? Those types of things. Then we just scale down from there. Eventually, we find out there are a lot of devices that we just don't need to prioritize. That's just based on, you know, something blowing up in the organization. What I mean by that is the server goes down, blue screens; we need an hour or two to go rebuild it. Of course, we don't ever want to have to recover an entire inventory of servers. That sucks. That's the worst thing we could possibly do because that takes a lot of time to bring it up, bring it down, move it over, start it back up again- all these different things. So we really want to be focused on recovering the device itself. Sorry, I should say repairing the device before we recover. Of course, we are the insurance policy that says I have one hour to do this. If I don't, and I'm unable to fix this server in that one hour, I have a backup image ready that I can install, inject, and get going right away. It creates a really good recovery strategy when we think about that. That all needs to be baked into the SLA conversation that we have.
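The prioritization flow described here- communications first, core applications next, everything else after- can be sketched as a simple ordering. The tier names and server names below are hypothetical:

```python
# Rank servers for recovery order by an assigned tier, mirroring the
# flow above. Tiers and the server inventory are made up for illustration.

TIER = {"communications": 0, "core-app": 1, "general": 2}

def recovery_order(servers):
    """servers: list of (name, role) tuples; returns names, most urgent first."""
    return [name for name, role in sorted(servers, key=lambda s: TIER[s[1]])]

servers = [
    ("dev-build-01", "general"),
    ("crm-db",       "core-app"),
    ("exchange-01",  "communications"),
]
print(recovery_order(servers))  # ['exchange-01', 'crm-db', 'dev-build-01']
```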
So, Keith, in this talk we want to go through a couple of different backup technology options that are out there for the different types of environments. Jumping right into it, I see a lot of different backup technology out there. We want to break it down into a couple of buckets that I think you're very familiar with. There's this workstation problem of every employee using Dropbox or some unsecured file share program. We try to get them to back up to a network share. They just don't do it. Then when data gets lost, who do they come crying to, right? So there's that file-to-Cloud option- making it as secure as possible, obviously. Then there's virtual image. I would say virtual servers are probably the next most common backup requirement. That's almost always going to be on local storage- mandatory for critical systems. You talked about communication systems and, of course, critical applications. Anything critical like that, with a fast restore time, is definitely going to be local image-based. Then we are seeing a lot of solutions that are actually bringing virtual images up to the Cloud. That's something we want to dig into a little bit in this talk, because the common fear you hear with that is: how am I going to offsite a terabyte of data, and then how am I going to recover a terabyte of data? Obviously, you're not going to do it for your critical systems.
Sure, certainly. So let's start where you began- the file component. Now I think it's every IT admin's dirty little secret that we keep stuff on the desktop, and we know we shouldn't. It just doesn't make sense when we have a network share with some security wrapped around it. Obviously, it's the right thing to do for the business to keep everything in the same place for the organization. Now, my desktop looks like a Christmas tree. There are all kinds of colorful files and folders everywhere. So I'm just as guilty, but I also have a really good backup solution to protect me, and I'm confident that it's doing its job on my behalf. I'd like to address that in the sense that while we can appreciate the network share responsibility we create for our organizations, we also have to appreciate the fact that the everyday computer user is keeping stuff on the desktop, and we have to be mindful of that desktop as well. That's even before we get to the servers. We touched on a few different things, from the virtual image to the Cloud and getting that terabyte up into the Cloud. I mean, absolutely. We need to look at a technology that allows us to minimize the amount of data we're taking to the Cloud. For example, delta-block technology gives us the ability to move data across the WAN once. Once we seed that data- get that initial backup, that one terabyte, to the Cloud- we're never going to continually replicate a terabyte every time we run a backup job. This is really important when it comes to data optimization and WAN optimization. We want to minimize the stress we put on the network when we're running our backup jobs.
I think this is what YouTube and Netflix did for the streaming industry, right? Whereas before I would have to buffer something entirely before I could view it, and if any point that failed, I lost everything. I couldn't download whatever I was trying to view. But being able to stream video and doing it incrementally at a block level is critical for backup, especially if you're talking about getting a terabyte offsite.
Absolutely, and so how do you do that efficiently? We go in with a 256-bit HAVAL hash code, and we mark all of the blocks and bytes of the data. Then, every time a backup job runs, we scan all that information, and we move only the stuff that is different. Real simple. I mean, let's not over-complicate things by constantly replicating the same thing over and over again, which is really easy to do in some legacy environments.
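A simplified sketch of the block-level change detection described here: hash fixed-size blocks, compare against the previous run's hashes, and ship only what differs. Python's standard library doesn't include HAVAL, so SHA-256 stands in for the 256-bit hash; the block size is illustrative:

```python
# Detect which blocks changed between two backup runs by comparing
# per-block hashes. Only those blocks would be sent across the WAN.
import hashlib

BLOCK_SIZE = 4096  # bytes per block (illustrative)

def block_hashes(data: bytes):
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(old_hashes, data: bytes):
    """Return indices of blocks that differ from the previous backup."""
    new_hashes = block_hashes(data)
    return [i for i, h in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != h]

v1 = b"A" * 8192 + b"B" * 4096   # three blocks
v2 = b"A" * 8192 + b"C" * 4096   # only the last block changed
print(changed_blocks(block_hashes(v1), v2))  # [2]
```

With a daily change rate around 1%, as mentioned below, only roughly 10 GB of a 1 TB volume would need to move offsite each day.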
Yeah, it's just proper indexing, right? It's exactly what Google Desktop did- sorting everything, making it searchable. We never thought that was possible when we were living in the days of Windows search. So I think it's about having a proper indexing and organization system and then only uploading what changes. How much data changes on a daily basis? Typically, it might be less than 1% on an average system, and that is not a lot of data to get offsite each day.
Absolutely, and we want to be offsite because that's the new millennium, right? We're all being told- and it's absolutely the right thing to do- to follow 3-2-1: production, local, and offsite. Offsite in our world is the Cloud. There are a lot of great solutions out there from a Cloud storage capacity standpoint. Let's just look at it from raw storage. I hear it in conversations with a lot of folks I speak to daily; they're telling me, well, Keith, I can pay next to nothing for cheap storage in the Cloud. I say, well, that's great. How much does it cost to recover that data? Do you have the tools and the software in place that are going to bring that information back to you efficiently, and in exactly the same format that you had hoped for when you sent it up there in the first place? And, of course, how long is that going to take? Are you going to be throttled? Because what goes up should come back down quickly- but it doesn't always.
Yep. That reminds me of the two other technologies we wanted to talk about. One is the on-premise appliance- this all-in-one box. Can I just put in a SAN that has brains in it, that can pull all the data and back it up? That's one option. Then there's the hybrid approach, which is a combination of Cloud and on-premise. When I think about what you're talking about- putting all that data into the Cloud, recovering it- there's Glacier; there are all these technologies out there. They're cheap to upload to. They're expensive to download from. What's the recovery time? What's the reliability? What's the compliance around it? There are so many other factors that I think the average IT person would have to solve if they were to try to cobble together their own solution. I really think that comes down to each individual's time. How much time in a day do you have to build your own backup solution? Or are you going to go to the point solution for the server, the point solution for the workstations, or the combination system that has its own Cloud or its own appliance? There's a lot of choice out there for sure.
Absolutely, and good things happen and bad things happen. Bad things can happen to good people with the best of intentions. We could spend a lifetime buying life insurance, decide one day that we don't need it, and the following day get hit by a bus. The investment you've made for the previous 20 years is for naught the moment you decided to change. Backup is much the same circumstance: it is an assurance. We'd never want to have to recover anything if we can avoid it. Let's look at it that way. Conceivably, it's an insurance policy. We can manage this insurance policy in a good, better, best scenario, just like we do for our home and our cars, and we cherish those items. So, let's cherish our data in the same fashion and apply that methodology to our thinking when we look at backup as a solution, software as a service, and how it all works together.
And it all comes back to recoverability, right? It's like you said. The day that you need it, you wish you had backed it all up. You know, pick your solution, but just have a solution, I think, at the end of the day.
Really, yeah, have a solution, even if it's not perfect- it can just be something. I spoke to a VP of Finance once, and he kept a lot of stuff on his workstation- a lot of spreadsheets, a lot of information. I asked him, if we lost that workstation, how long would it take to recover that data- to rebuild it, not just recover it? Rebuild it, because we're not backing up this workstation. This is not happening right now. He said four years to rebuild all of the data that he had on his workstation. I said, four years? How much do you think it's going to cost to back up that workstation per month? Eighty gigs? A hundred gigs? Not very much at all- a cup of coffee a day. If we think about it in those made-for-T.V., Shamwow-type terms, I mean, these are the types of things that we have to consider- and we're not. Twenty-six percent, I believe, from what I've understood, are backing up their workstations. So that means 74% of the workstation environment is unprotected.
Yep, it's a very common scenario. I think that goes to our ability to pitch this to upper management. Again, it's not protect everything, right? You gave the tape example earlier of reducing tape down to 20%. Maybe only 15% of the workstations are critical. Maybe it's half of them. I think at the end of the day, you know the environment better than most people, and you're going to have to make that call on whether it's worth a couple of dollars a month, or however much it is, to protect those workstations. I think the key- the fear, and why workstations haven't been backed up- is that doing full disaster recovery on a workstation, full image backup, is cumbersome. It's tedious, and why would you ever want to go configure that on a thousand workstations? But having something that can do that a little more automatically makes it more appealing for sure, I think.
Simplify that process for yourself too. Just eliminate the thought of backing up the whole image of the workstation. Just get the core data. You and I both know, and we all know, that we can blow away a Windows 10 box and rebuild that in a matter of minutes. It's not something that's going to take a lot of time to do. It's the data that we need to bring back. I can reinstall all my applications. Most things are web-based anyways now, so what's the big deal? But all the files I'm working on, the folders, the important stuff, the presentation for my boss that's due tomorrow morning but I blue-screened and lost my computer- those are the things that I want to protect.
And it comes back to the recovery metrics that we were talking about earlier of you're going to have different metrics for every different type of data, every different type of workstation, every server. From workstation to workstation, it's okay to have different recovery times, and back up and running doesn't mean recovering a full system. Back up and running means I can do my job. Doing my job means you've recovered one PowerPoint for me or one Excel spreadsheet.
That's a great point. What does it mean to you- back up and running? I'm working, and I'm doing the job that I'm paid to do. That's important for the organization. So when we ask that CEO, when we talk about prioritizing those hundred servers, because we know that that question's going to come up, and we're always going to hear it, when we go through that prioritization, it starts to make sense. We start to uncover- peel back that onion. We find a lot of the truth of what we need to protect and how we're going to protect it. We create policies that allow us to do our job efficiently again. We automate with policy. We push that out and say these devices get this policy. Over here will receive this backup policy. Now we've automated this process. IT administrators are spending minutes instead of hours a day administering backup.
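The policy automation described above- devices receive the backup policy mapped to their role rather than hand-built configurations- might be sketched like this. The policy contents and device names are invented for illustration:

```python
# Assign a backup policy per device based on its role, so administering
# backup is a matter of tagging devices, not configuring each one.

POLICIES = {
    "server":      {"frequency_hours": 1,  "retention_days": 90, "local_copy": True},
    "workstation": {"frequency_hours": 24, "retention_days": 30, "local_copy": False},
}

def assign_policies(devices):
    """devices: dict of name -> role; returns dict of name -> policy."""
    return {name: POLICIES[role] for name, role in devices.items()}

fleet = {"exchange-01": "server", "laptop-042": "workstation"}
policy_plan = assign_policies(fleet)
print(policy_plan["laptop-042"]["frequency_hours"])  # 24
```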
So, Keith, let's jump into the different environments that a typical IT pro is going to have to deal with. We've talked about different backup technologies. We've talked about recovery being the most important thing, but at the end of the day, you have your metric recovery speed, recoverability. Then you have the different types of technology you're using to solve those problems. But what about all the different environments because it might require different technologies for different environments.
Absolutely, and so cobbling solutions together is a really challenging task. We want to try to minimize those single points of failure. I often find, in speaking with some of the IT managers I work with, that they're running three, four, five, six different backup solutions to complete a task within these different environments. So, we've got the head office, for example. Head office is a priority because they have the most devices and the most servers. So they're going to get this backup solution to protect the environment. Then we start to dive into: what about the satellite office? We need to protect them as well. So we open up a new dashboard and create a new environment for them. Now we're using a different backup software, because the one we use for the main office is too expensive. We can't afford to put it in our satellite office because we didn't plan for this budget cost overrun. No one was thinking about that when they made this plan, because they didn't have a good advisor or a good consultant to help them navigate this path. So they add another solution. Then they move on to the next satellite or remote office. They've got some remote workers as well, and they're using a different solution. So now we've got three different solutions, and we've got one guy managing this, and he's got to be this Jack of all trades. That's really hard to do when you have to be the master of so many different things. You have to know your virtualization, your workstations, your servers- all these different things- from the network to the security, and now you want me to take on this backup task as well? Come on, this is crazy. I can't handle this. We come in and we produce an idea that says, look, let's centralize this. Let's manage it from a single pane of glass. Let's look at it all in one place and protect every type of environment, regardless of location, and give them the same level of service.
Huh, that seems pretty easy.
Yeah, and I think it's the Holy Grail that everybody searches for. We said earlier that there is no backup solution that solves every single problem on the planet. At head office, we might have various kinds of servers on various OSes running different applications, and you can't have the same product protect them all. We have to have a dedicated Linux-type backup solution. We're lucky if we only have three solutions, and that's what you're saying. At the end of the day, if we can solve the problem of having to log in to three, four, or five different applications just to make sure data's protected, I think that's one of the most important things we can do. If I can just see all the devices and all the data that I have to protect, and make sure that all of my backups ran, that's a huge chunk of automation that just saved me so much time every single day on verifying backups.
Absolutely, and let's take that verifying component a little bit further. How do we know everything is running okay? We can log in, check the pane of glass, and verify that the data seems to be intact- but what if something fails? How am I going to be notified? We create notification policies that send us emails, trigger tickets into our ticketing desk, and do all these wonderful things to give us the time to react to a backup issue. That could be that the server is down- it's offline, there's something wrong with the device itself. You may already have gotten a notification from your monitoring tools that there's something wrong with that server, but our backup can also provide you with monitoring that your monitoring tool might have missed.
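A hedged sketch of such a notification policy: scan each job's last-run status, open a ticket for failures, and email about anything that hasn't checked in. The job records and alert channels here are hypothetical, not a real product API:

```python
# Emit alerts for failed backup jobs and for devices that have gone
# silent longer than expected. All data structures are illustrative.
from datetime import datetime, timedelta

MAX_SILENCE = timedelta(hours=25)  # a daily job more than an hour late

def alerts(jobs, now):
    """jobs: list of dicts with 'device', 'status', 'last_run' (datetime)."""
    out = []
    for job in jobs:
        if job["status"] == "failed":
            out.append(f"ticket: backup FAILED on {job['device']}")
        elif now - job["last_run"] > MAX_SILENCE:
            out.append(f"email: no backup from {job['device']} in over a day")
    return out

now = datetime(2024, 1, 2, 9, 0)
jobs = [
    {"device": "exchange-01", "status": "ok",     "last_run": now - timedelta(hours=2)},
    {"device": "crm-db",      "status": "failed", "last_run": now - timedelta(hours=2)},
    {"device": "file-srv",    "status": "ok",     "last_run": now - timedelta(hours=30)},
]
print(alerts(jobs, now))
```

In practice the strings would be calls into an email gateway or ticketing system; the point is that silence, not just failure, should trigger a notification.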
I think that's critical. At the end of the day, just making sure that you're getting notifications in a timely manner because you're not sitting there staring at a dashboard all day long for any application that it comes to. So having ticketing go through your ticketing system, emailing you, notifying you in any way that works for you. Make sure that you know what's going on, and you can react to it pretty quickly.
Yeah, no one's going to complain if they get a lot of notifications, but they'll certainly let you know if they haven't heard from you in a while. So, a little over-communication is not a bad thing.
Absolutely. [Dramatic music] So, Keith, we talked about SLAs at a service level before, but every company should have a recovery point objective and a recovery time objective. So, how much data are you willing to lose, and then, from the time a system fails, how fast do you need to recover to get back up and running? A lot of people outside of IT don't get those concepts, so it's really important to talk to the CEO and say, no, you have to understand that I can only back up so frequently, and then from the time it fails, that takes recovery time, so our total data loss is this much.
Absolutely. When we're thinking about RTO and RPO, they should be considered equally when you're building an SLA. They complement each other; they work hand in hand to ensure that we're providing the right recovery times and recovery points. Think about the current climate of malware and crypto viruses- things like that- and the value of a good RPO. Let's set aside RTO for a moment. A recovery point objective can bring you back to an hour ago, two hours ago, or two days ago- whatever that might be- when you become compromised. Think about how happy that manager, that CEO, is going to be when I point, click, and recover that data in moments.
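The RPO/RTO arithmetic in this exchange can be made concrete with a back-of-the-envelope calculation: with backups every `interval` hours, the worst-case data loss is one full interval, and the total disruption is that loss window plus the restore time. The numbers below are illustrative:

```python
# Worst-case RPO/RTO math for an SLA conversation. With hourly backups
# and a two-hour restore, at most one hour of data is lost and the
# business feels about three hours of total disruption.

def worst_case(interval_hours, restore_hours):
    rpo = interval_hours   # data created since the last backup is lost
    rto = restore_hours    # time until the system is usable again
    return {"max_data_loss_h": rpo, "max_downtime_h": rto,
            "total_disruption_h": rpo + rto}

print(worst_case(interval_hours=1, restore_hours=2))
# → {'max_data_loss_h': 1, 'max_downtime_h': 2, 'total_disruption_h': 3}
```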
We hear about hospitals and colleges and giant campuses going down with every workstation infected, and then the days, the weeks- I've even heard of a month- to get back up and running. So when you have daily backups on every workstation, if you can recover instantly from any situation like that, you're the hero. I think this comes down to: Keith and I know that backup is important. You know that backup is important. But going to other people in other departments, going to the owners of companies and talking to them about thousands of dollars of cost- it's not an easy conversation. But when you have those kinds of stats with you, when you talk about the pervasiveness of ransomware and how long it would take to recover, that's a real conversation. It's not me versus the CEO. It's me saying, Mr. CEO, what can you tolerate? If we were to get hit, are you okay with us being offline for a week? Well, of course he's not, right? It makes our job easier getting solutions in the door that are going to proactively solve these problems so that we don't have that nightmare six months down the road.
It creates that hero moment for you as well. You've recovered that data, whatever that is- that one file all the way down to the image. I like to break that hero moment down into three groups. What I've done is I've got my Cloud backup and restore. That's just everything going to the Cloud. It's the lowest cost. It's really easy to do. We put an agent on a device, and it just backs up to the Cloud at a set schedule. Really, it's good for workstations. I think that's the best place for it. I don't think servers really should be participating in that conversation.
Yeah, and we think about the remote workers, the satellite offices. It's a real challenge to have installed backup and manage it remotely, but if I can have something that I just kind of set it and forget it- it's pushing to the Cloud- the data's there. That's ideal for anything that's remote, that's out of the building.
Absolutely. Then I also have local-plus-Cloud backup and restore. This is simple. It's going to be fast recovery because I have a local copy. I can use it in my server environments where I have maybe an SLA of four to eight hours. So I've got that handled, and I can also put it on my workstations as well. Then I follow that up with the third part of the equation, which is standby servers and Cloud continuity. So, this is where we become proactive in our environments. Now we have the agents deployed. It's the exact same agent on the workstation that I'm going to use on my server, so that's going to make life a lot easier. Then I follow along and create some continuity planning, where I have a standby server offsite somewhere- in another location, in another Cloud. So I'm ready to spin that device up in minutes if the call is made. That gets my hero moment again. So that's my three-pronged approach to creating hero moments for yourself.
Yeah, and I think another thing to add on to that is, when you think of offsiting data and long-term archiving, where are you doing your monthly snapshots? Where are they going? Are you going to have that hero moment when someone needs to recover data that's two or three years old in a compliance-oriented environment?
Yeah, create a simplified archiving policy that's built right into your backup routine. And your hero moment becomes a massive hero moment when you can roll back to 2015 to get that file back in Finance because somebody had deleted it or moved it, whatever the case may be. But, wow, everyone really will show the love that moment when you can pull that back in a matter of minutes.
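A simplified archiving policy like the one described- daily snapshots kept for a month, month-end snapshots kept for a year, year-end snapshots kept indefinitely (a grandfather-father-son scheme)- might look like this sketch. The rules and dates are illustrative, not a product's actual retention logic:

```python
# Decide whether a given snapshot date survives pruning under a simple
# grandfather-father-son retention scheme.
from datetime import date, timedelta

def keep(snapshot: date, today: date) -> bool:
    age = (today - snapshot).days
    if age <= 31:
        return True                                # daily tier
    is_month_end = (snapshot + timedelta(days=1)).day == 1
    if is_month_end and age <= 366:
        return True                                # monthly tier
    is_year_end = snapshot.month == 12 and snapshot.day == 31
    return is_year_end                             # yearly tier, kept forever

today = date(2018, 6, 1)
print(keep(date(2015, 12, 31), today))  # True  -> the 2015 file is recoverable
print(keep(date(2016, 3, 14),  today))  # False -> pruned long ago
```

This is how "roll back to 2015" stays possible without keeping every snapshot ever taken.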
So, Keith, kind of wrapping this up as we head to the end of this session, I want to talk about how to make sure data is going to be there when you absolutely need it. There's a couple elements of that. There's what storage media are you using. Are you offsiting the data or separating it at least? Are you testing it and what kind of security policies do you have around it? So, starting with storage media, talk to me about tape versus disk versus SAN and other options that are available.
Well, that's great. I was hoping we would get a chance to pick on tape just a little bit more. I didn't feel we beat it up enough. There's still a home for it, right? It's still relevant. It's a legacy technology, but I think we have to plan and prepare for the future generations that are taking over these environments and what they are looking for when it comes to a backup and disaster recovery plan. They're not thinking tape. That's someone born in the '70s. [Laughs] So let's think of it that way.
It's a legacy technology. It's in maintenance mode. Let's not have it be part of our strategy going forward. Let's not buy more of it or base new system decisions on a technology that fails 25% of the time, according to some IT pros I've talked to.
Precisely. You don't buy a brand new car and put your old tires back on. It's just not something you do. So let's use that sort of thinking when we plan out these environments and how we're going to make this transition away from tape- and even disk, for that matter. Disk does work in a lot of environments, but there's a saying that there are two types of hard drives: ones that have failed and ones that are about to fail. [Laughs] We really have to be mindful of that when we're planning out our backup plan. So what do we do? Eliminate that whole technology? We don't have to, per se, but we're still going to need some of it somewhere from a local appliance standpoint. We want to have that local backup copy. It still contains disks, but we also know that we have the Cloud. We have an offsite repository, which we are able to recover from at any given moment.
And if we are offsiting data, maybe we're okay with tape being one of the copies. So we have that three-copy strategy. We're okay with tape being local, and we're okay if it has a 75% success rate, because we have the Cloud. We have something offsite that is more secure and more reliable.
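The reasoning here can be made concrete with a little probability. Under the simplifying assumption that copies fail independently, and using hypothetical failure rates for disk and Cloud (only the 25% tape figure comes from the discussion), a rough sketch looks like this:

```python
# Rough illustration of why an unreliable local tape copy is acceptable
# once independent copies exist elsewhere. The disk and Cloud failure
# rates below are hypothetical examples, and independence of failures
# is an assumption, not a guarantee.

def p_all_copies_fail(failure_rates):
    """Probability that every copy fails, assuming independent failures."""
    p = 1.0
    for rate in failure_rates:
        p *= rate
    return p

# A lone tape with a 25% failure rate loses data one time in four.
print(p_all_copies_fail([0.25]))               # 0.25

# Tape (25%) + local disk (5%) + offsite Cloud copy (0.1%):
print(p_all_copies_fail([0.25, 0.05, 0.001]))  # 1.25e-05
```

The point of the sketch is that even a copy with a poor success rate still multiplies down the odds of total loss, which is exactly the argument for tolerating tape as one copy among three.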
Yeah, and it really gives you recovery options. Whether you want to recover a virtual image, recover a physical image, or just the data itself, you can still do that from the Cloud- albeit more slowly than from a local copy, but we're planning for that. We already planned to have a local copy to recover from much faster. The Cloud is the last stop on the train ride of recovery.
Okay, so moving away from storage media to offsiting data, what are the options available today, and can I trust the Cloud? Can I offload that much data to the Cloud? What happens when I'm starting with a full server and I’ve got to dump that whole seed up to the Cloud? What does that look like?
It's a business decision, first of all. We talked a little bit about that. Do we take all 100 servers, or do we just parse them out and pick the right ones to go to the Cloud? We want to seed that data as well. We want to introduce it to the Cloud first and then start backing up, so we can allow a natural backup process to take shape, as opposed to fighting traffic.
Similar to how I would take a tape offsite, I'm shipping a hard drive or a tape or something to the Cloud to manually copy that into a data center.
That depends where you live, of course. In a lot of big cities now, we have got some great fiber. With 100 meg upload, putting 300 gigs up is not going to take more than the weekend. So, probably faster than shipping it. I would recommend looking at where you can upload from. Maybe you have a faster upload at your client's site or your office has got a faster upload. Bring it back to you and upload it from there and get that data to the Cloud. But there's more than one way to get the data there, and it's a lot easier now than it has ever been.
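The "300 gigs on a 100 meg upload" claim is easy to check with back-of-the-envelope arithmetic. This sketch assumes sustained line-rate throughput and ignores protocol overhead, so real seed uploads will run somewhat longer:

```python
# Back-of-the-envelope seed-time estimate for the example above.
# Assumes the link sustains its full rated speed; real-world
# throughput (TCP overhead, contention) will stretch this out.

def upload_hours(size_gb, mbps):
    """Hours to upload size_gb gigabytes at mbps megabits per second."""
    gigabits = size_gb * 8            # gigabytes -> gigabits
    seconds = gigabits * 1000 / mbps  # 1 gigabit = 1000 megabits
    return seconds / 3600

print(round(upload_hours(300, 100), 1))  # ~6.7 hours, well under a weekend
```

Running the same estimate for a client's slower line before committing to a seed window is exactly the "look at where you can upload from" advice in practice.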
And we see a lot of large companies that have their own data centers, so they want to offsite to their own data centers, into their private Cloud. Like you said, the larger cities have fast enough upload speeds, and companies have large enough infrastructure to take care of a lot of this themselves. It all comes down to having those three copies, using the right types of media that are going to be there when you need them, and making sure that you absolutely have an offsite copy somewhere. We don't ever want to have all our eggs in one basket and then have some disaster happen.
It's a recovery option. We don't want to be pigeonholed into, great, we've got to go recover that tape. Let's call the tape recovery place and wait.
Because nobody likes to wait- not today. [Laughs]
So, the next thing that we wanted to cover off was a little bit on security. When it comes to sending all of this data around, how can we make sure that the data is not vulnerable to being intercepted, that it's encrypted- things like that? What are best practices around security?
We want to ensure that the data is encrypted in transit and at rest. That's primary- any solution out there is going to tell us that, because that's what we want. We want to have some choice as well. Compliance standards give us 128-, 256-, or 448-bit Blowfish encryption; these are some of the more popular options out there. So, have some options, and create a standard for your organization where you decide which of those three you're going to sit within. From there, we also think about things like zero-knowledge encryption. Who owns these keys? Is it going to be my customer? Is it my customer's customer? I certainly don't want the vendor- the software provider- to own these encryption keys. That's a point of access that I don't want anyone to have. So, remove them from the equation. Now it's just down to the managed service provider or the customer who actually owns the keys. If they don't have those keys, we cannot recover the data.
Yeah, and it's important. It's a trust-building exercise. So much of backup and data protection, data security, has to do with trust. When you remove this part of the equation- when the vendor can never access your data, even if they wanted to- you don't ever have to worry about that aspect of it. Even if somebody hacked their systems, nobody could read your data. Your data is safe and secure.
Absolutely. Trust from the software provider to the managed service provider, and another layer of trust from the managed service provider down to their customer. It all works in a great fashion as a symbiotic relationship. But keep in mind, too, we want encryption options that align with current compliance standards, such as HIPAA or PCI, for example- these are the main ones we hear about in the market, anyway. These encryption options help our providers meet the compliance requirements they have. We're not going to make you compliant, but we certainly will help you get there.
Yeah, and your technology should do that. Your technology can't make you compliant. It has to fit into your environment, your system, but it should make it dead simple to comply with everything that's around the technology.
Yeah, for anyone who likes to have a bit of light reading, there's about 450 pages of the PCI Compliance Guide. It's a lot of fun to read. Have a whack at it. [Laughs]
I actually got assigned that at one point, and I did have to read some of it. [Laughs]
Yeah, it's a lot of reading, but you'll certainly find that there are a lot of shared responsibilities that flow from the actual data owner to the managed service provider to the service providers. So it's a good idea to identify where you belong inside of those compliance requirements, and you'll find that if you read HIPAA and other compliance standards as well, they're structured very similarly with regard to how you fit into those requirements.
Absolutely. So, one of the last topics we want to touch on is testing. We probably saved it for last because [laughs] it's the one that nobody likes to talk about. But testing is one of those unfortunate things. It's difficult. It's time-consuming, but it's absolutely necessary. Like you said, we can't just depend on one backup technology for every single type of environment. Certainly, we can consolidate as much as possible, but having a multi-tiered strategy, with testing as part of it, is just critical.
You're going to want to create some sort of standard operating procedure, or SOP, and identify testing as part of that. What I talk about quite often is screenshot verification. That's really nice- we know that we can go in and verify that we can boot those devices. But let's take it a step further. Let's create a monthly, quarterly, or maybe twice-yearly plan where we go out and actually bring these up in test environments. We show the customer that we can turn them on and light them up in a Cloud of their choosing, in their other environments, in their crash sites- some sort of environment that lets them see that, in the event of a disaster, we can do this in X time. We promised you one hour; we actually did it in fifteen minutes. Is it better to over-promise and under-deliver, or under-promise and over-deliver? How does that go again? [Laughing] No one's going to complain.
At the end of the day, if we have no test plan and something goes wrong, it's our fault. If we have a test plan, we just verified that data. We have the SOP. We have the checkbox. We're in a much better position if something were to go wrong with the data. I think a lot of us have been in the kind of situation where we were responsible for something, we failed to tick the box, and really, we were at fault in that case. But having the checklist helps, and again- quarterly. Quarterly is acceptable. Just verifying- and this is where we see a lot of the failures in certain types of storage media and other things- but you want to know about those ahead of time so that you can make the copy, because we always want to bring everything back up to three copies of the data, two different types of media, and one copy offsite. At any point, a copy of the data, a storage medium, or the offsite copy could fail, and then we're no longer fully covered. So, testing is what ensures that we'll have that 3-2-1 across the board.
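The 3-2-1 check described here is simple enough to express as a sketch. The copy records below are hypothetical examples; the function just encodes the rule: at least three copies, at least two media types, at least one offsite.

```python
# Minimal sketch of the 3-2-1 rule check: three copies of the data,
# two different media types, one copy offsite. The inventory records
# here are hypothetical examples for illustration.

def meets_3_2_1(copies):
    """copies: list of dicts with 'media' and 'offsite' keys."""
    enough_copies = len(copies) >= 3
    enough_media = len({c["media"] for c in copies}) >= 2
    has_offsite = any(c["offsite"] for c in copies)
    return enough_copies and enough_media and has_offsite

backups = [
    {"media": "disk",  "offsite": False},  # production copy
    {"media": "tape",  "offsite": False},  # local tape copy
    {"media": "cloud", "offsite": True},   # offsite Cloud copy
]
print(meets_3_2_1(backups))      # True

# Lose the offsite copy and the quarterly test should flag it:
print(meets_3_2_1(backups[:2]))  # False
```

Running a check like this against a current inventory each quarter is one way to turn the "tick the box" checklist into something that actually fails loudly when a copy has dropped out.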
Really, testing means to me that I've tested it so I never actually have to do it in a production environment. That's our true goal, right? We never want to actually recover a server. We just want to know that we can. Recovering a server means something bad has happened, it's irreversible, and we now have to recover from another point in time. This is not something I would ever want to do, simply because of all the work that's involved. I have an SLA. I have a managed service agreement. There's a lot of money changing hands, and I want to keep that to myself as a business owner. That's my thinking. However, as an IT administrator, I also don't want to do it, because there's a lot of legwork involved. I have to assemble my team. I have to revert back to my standard operating procedure to make sure we're following all the right rules of recovery that we have practiced and practiced. We know how to do this really well, but at the end of the day, I just don't want to do it. [Laughs]
Absolutely. None of us wants to do it, but none of us wants to be in the situation where the company goes down for whatever reason and we failed to test the backups, so we cannot recover. Nobody wants to deliver that message.
I'd rather be prepared for something I never want to do.
Yes. So, Keith, we deal with a lot of MSPs and a lot of IT pros in a lot of different environments, and I just want to bring it all back home and wrap this up. Recoverability is number one, absolutely. We talked about different technologies, different strategies, and picking the right technology for every single type of environment. To summarize, I see it as two sides. There are the file-to-Cloud options on workstations- we've got to protect that desktop data that people aren't securing. Then there are, of course, those critical virtual servers, servers at headquarters, and remote offices, and you need a solution that's going to recover a full database down to the table, Exchange down to the email inbox, or an entire server for full disaster recovery. You're going to want to be able to recover that on-premises- we need fast recovery on that kind of stuff. That covers a lot of the solutions I see out there. When it comes to SolarWinds' approach to backup, our backup product takes a hybrid approach, and as a philosophy we think hybrid backup is the best approach because it really gives you the best of both worlds. I don't want to have to make a choice between workstations or servers- this or that- it's the and-and-and, right? So it's any type of file: files, databases, applications, and full systems. It's servers and workstations, and I can do it on-premises or from the Cloud, for remote offices and remote workers. It really is that single pane: I log into one tool and I can protect everything.
Absolutely, and the combination of those factors gives you the fastest and most reliable backups available, plus the ability to scale- and that's important too- from one to ten thousand devices, all under that single pane you mentioned. That's a really important part of an IT administrator's workday.
Absolutely. Well, we hope you've enjoyed today's session. We'd love to hear from you in the chat. Maybe share some of your horror stories or let us know what you think about today's talk. For THWACKcamp, I'm Ali Mahmoud.
And for THWACKcamp, I'm Keith Young, SolarWinds backup artist. [Upbeat music]