Putting ITIL 4 into Practice with SolarWinds Service Desk

ITIL 4 certifications demonstrate a thorough understanding of how to deliver value to the modern organization through services. Matching that understanding with the capabilities of your ITSM solution is a powerful combination.

Join an ITIL 4 Managing Professional and an ITSM Senior Sales Engineer for a crash course on applying the seven guiding principles of ITIL 4 and how they tie into using the SolarWinds Service Desk platform. Whether you’re a certified ITIL pro or a new help desk technician hoping to work more efficiently, learn five ways to deliver better service to your organization.

We’ll cover:

  • How to restore service quickly through incident management
  • Diagnosing causes of related incidents through problem management
  • Tips for change request intake and reviewing change records
  • Building relationships between record types for more complete data
  • Applying the seven guiding principles of ITIL 4 in daily tasks

Transcript:

Hey there, everybody. Welcome, and thanks for taking the time to sit down with us this afternoon. My name is Liz Beavers, and I have my colleague Sean Sebring joining us.

Hello, howdy.

Hey there, Sean. Today we're going to be talking about the importance of having ITSM strategies backed by ITIL guiding principles. But before we dive in too deep, we did want to take a moment and say a huge thanks to our IT pros that are helping us continue to operate smoothly despite our remote office situation incurred with the pandemic. So, huge thank you to you all. Sean, can you tell us a little bit about what we're going to be talking through today?

I sure will. As Liz mentioned, things have been pretty bananas lately. So, what we're going to be focusing on is just some of the ITIL best practices that surround our every day-to-day things, and one of the most common ones that you might be relating to right now is incident management. We're gonna talk about incident management, how it can potentially transform or grow into a problem, problem management, as well as a little bit about change management as well.

That sounds like a great agenda. I think these are definitely going to be helpful hints, again, especially because we're not all in the same place, and for some of us, like myself, this is the first time that we're being challenged with working from home. So, with that, I know for me, personally, when I was first issued the stay-at-home order, I had to figure out what to do in the absence of having my keyboard and mouse. I actually had to put in a ticket so that I was able to go to my traditional office and retrieve those items so that I could still be effective from home. But it sounds like that's our typical one-off ticket. Sean, can you give us a bit of a rundown for what incident management is?

Yes, absolutely. Incident management, like you had mentioned, is really about helping to restore services to folks. So, incident management to me is kind of like break/fix. Something's not working. It's really closely related to service request management, which might be something where you're asking to get something new, but this has been something that people have been challenged with a lot, especially in these last couple months.

Definitely.

So, asking for somebody to help you fix something in your environment, asking for something new, and the Service Desk, that backbone of providing the incident and service request management, the Service Desk providing those solutions, especially in this remote time, it's more important than ever, Liz.

I couldn't agree with you more. Certainly helpful to understand incident management as a whole. So, let's actually take a look at how that's orchestrated in SolarWinds Service Desk.

Awesome. So, as a technician, as I come in and sign into my IT tools, I can see that first thing in front of me, I'm looking at my Help Desk queue. As a Service Desk tech, I want to see the work that's relevant to me, maybe what came in last night, what do I need to pick up, and today, right in front of me, I can see that I've got a "Can't Connect to Wifi" issue. So, this could be a particularly standard case for someone who's just starting to work remotely. I can hover over this eagle eye, take a quick peek. Maybe I need to get a little bit more details about the ticket, popping it over here in the right. But really, I want to get into the meat and potatoes. Let's take a look at this ticket and see what's going on, why can't we connect to Wi-Fi?

So, as you're jumping into the ticket, Sean, this definitely sounds like and looks like this could have been assigned to multiple people. So, I definitely think having something like SolarWinds Service Desk, where you can easily shift the assignment, can certainly help in that ITIL guiding principle of collaborate and promote visibility, as it actually looks like there are two technicians already in here.

Yeah, I think that that's a good point. So, I'm able to assign this to myself because I can see over here in the top right someone else is reviewing this ticket with me. So, assigning it to myself will do a couple things. One, it'll help change the state so this ticket's now assigned, so the requester knows someone's working on it. They can see that Thomas has picked it up. But up in the top right, someone else, a technician on my team is looking at it with me, so something I can do to further that conversation about collaborating and promoting visibility is we can do an @ mention to another technician and just let him know, "Hey, we've got this."

What a great feature. I think that is immensely beneficial, particularly, again, in our remote climate, so that even though the technicians aren't physically in the same office together, they still have that understanding that they're working towards the same goal, which is a quick close for this user to have their Wi-Fi restored.

Absolutely, and if I go kind of back up towards the top, under where we saw that Jane was taking a look at the ticket with me, we've got this very fancy honeycomb icon which is saying we've got 26 suggestions in here. So, as a tech, we talked about incident management being restoring service quickly, so connection to Wi-Fi, that could be a pretty big issue for someone. They probably can't do anything if they can't connect to the internet, so if I expand these 26 suggestions, let's see what it's got for us.

That's really impressive, and I think a huge benefit that many of our technicians have gotten to experience with SolarWinds Service Desk is these artificial intelligence-backed recommendations. So, particularly in terms of being able to share the appropriate content, a technician, just like you're sharing here, Sean, is able to include with one click the appropriate knowledge base article to help this requester continue progressing with their day.

Yeah, these Smart Suggestions are really helpful, especially when we talk about getting that service expedited. Without having to do any research myself, I was able to hit a drop-down menu. I was able to see a solution, find the one that's relevant to this issue, and attach it to the ticket, pop it into the comment box, and it's sent out all with pretty much two clicks. So, sending this out to the requester is going to be my solution that I've provided, and I had to do little to no real leg work, troubleshooting, or investigation.

And just those few actions as well easily strengthen those suggestions moving forward. So, it really is a culmination of adoption over time.

So, not every ticket's going to feel this way, and we'll talk about this a little more later, but being able to quickly and efficiently resolve a person's issue just from being able to open it up, see a suggestion, and hand that to the requester so that they can move along with what they need to do for their work is a great way to look at incident management.

Wow, Sean. Just like you mentioned earlier, those different motions for handling a one-off ticket certainly were efficient so that that requester has their service established for their Wi-Fi connection and can keep moving forward with their day. But I know that there aren't always cases where it's so easy to wrap it up with a quick knowledge base article, and I think that's a great opportunity to lean on our artificial intelligence when identifying trends if there have been multiple tickets reported of the same issue. So, with that, can you talk us through a little bit about problem management?

Yeah, let's take another look at the incident queue here. So, we haven't really left incident management yet. We're looking at the network escalation queue. Previously, we were looking at the Help Desk queue, and one thing I want to call out is some of the frontline work that you do can lead into other practices, so right here we can see a trend of connectivity issues that have come in. So, that trend analysis that Liz was mentioning, automations are there, and we definitely want to leverage them, but it really starts at that first level, at the Service Desk. So, seeing this connectivity issue, it's at a critical priority. Let's take a deeper dive into it and see how the automations can help us. Last time we used the automation, we were looking at a solution that could help provide a quick fix for that issue connecting to the Wi-Fi. This time when we take a peek, we're going to see that trend that Liz was alluding to. As we take a broader view, we can see that our Smart Suggestions are showing us that we've got multiple tickets coming in about network slowness, including a few from Orion Alerts.

That's a great point, so with that, we now have that collaboration, not only with the Service Desk team, but also for your networking and your monitoring folks with the integration to those alerts from Orion.

It's awesome that they came from our monitoring tool. One of the important things not just about incident management, but about all these other practices we're going to talk about today is showing the relationships between these objects. So, from this Smart Suggestion, as well as just from an incident page here, we can make the association and build those relationships. One of the things I know because I've worked in this kind of environment before is that I want to look at the oldest issue relating to this to see when did this start, and see what kind of related issues it has. Right from the Smart Suggestions, I was able to build a relationship between these incidents. Catching those trends, whether it's from observing the trend myself, or having the Smart Suggestions help guide me to that, is going to be important for not just incident management, but the other practices that we're going to look at today.
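The trend-spotting idea described here can be sketched in a few lines of Python. This is only an illustration of the concept, not how Smart Suggestions works: the `Incident` class, the `find_trends` function, the category labels, and all ticket data are hypothetical. It groups incidents by category, flags any category with several tickets, and puts the oldest ticket first so you can see when the issue started.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Incident:
    id: int
    title: str
    category: str   # hypothetical label, e.g. "network"
    opened: str     # ISO 8601 timestamp, so string order matches time order


def find_trends(incidents, min_count=3):
    """Return categories with at least min_count incidents, oldest ticket first."""
    groups = defaultdict(list)
    for incident in incidents:
        groups[incident.category].append(incident)
    return {
        category: sorted(items, key=lambda i: i.opened)
        for category, items in groups.items()
        if len(items) >= min_count
    }


# Made-up sample data for illustration only.
incidents = [
    Incident(1, "Slow VPN", "network", "2020-05-13T08:15"),
    Incident(2, "Wi-Fi drops", "network", "2020-05-13T09:02"),
    Incident(3, "Monitoring alert: high latency", "network", "2020-05-13T02:30"),
    Incident(4, "Printer jam", "hardware", "2020-05-13T08:40"),
]

trends = find_trends(incidents)
for category, items in trends.items():
    print(category, [i.id for i in items])
```

A real tool would likely match on ticket text rather than a single category field, but the grouping step is the same basic idea.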

Absolutely, and right from this screen, a technician is not only going to be able to make the associations among like incidents, but by simply clicking that attach button, they're also going to be able to report that as a problem. So, as we start moving through not only our one-off incident management practices, but also connecting like incidents and maybe identifying a larger issue, I think this is a great opportunity for us to take a step back and dive into problem management. So, Sean, talk us through what problem management entails.

So, in my ITIL opinion, problem management's one of the most underutilized practices, and it's got huge benefit that it can add to any organization. A lot of the times we're more focused on fighting the immediate fire in front of us than recognizing that that fire could spread. So, problem management is about identifying root causes, it's about seeing if there's a workaround, and then it's about, also, is this a problem that we can't get rid of today? Is this going to be a known error that fills in our known error database?

Awesome, so it looks like we're already in the problem record, Sean, and that looks like a ton of information. Walk me through what we should be generally identifying when we're working through identified problems.

So, the SolarWinds Service Desk makes it really easy. What we see on screen, a lot of this was actually already prebuilt for us. So, when we look at this, just like every other record type, there's a title, there's a description, and what this record has done for us is it's actually broken down some of those things I mentioned for problem management. It's got a field especially for defining that cause. It's got a field especially for identifying the impact, right? What are our symptoms that we're experiencing, and is there a workaround to this? In our example scenario that we've got where we're having this slow connectivity, these slow response times, it looks like we actually don't have a workaround for this. So, if this is a permanent, more immovable problem that we've got, this could actually be a known error that we just have continued to be reported, and a known error, by definition, is just a problem that can't be resolved at this time, and that's what populates the known error database.
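To make the problem record fields concrete, here is a minimal Python sketch. It assumes a hypothetical `ProblemRecord` class whose field names are invented for illustration and are not the product's schema. It follows the definition given above, where a known error is a problem that can't be resolved at this time.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProblemRecord:
    title: str
    description: str
    root_cause: str = ""           # what caused the problem, once identified
    symptoms: str = ""             # the impact users are experiencing
    workaround: str = ""           # empty string means no workaround exists
    resolvable_now: bool = True    # hypothetical flag: can we fix this today?
    related_incident_ids: List[int] = field(default_factory=list)


def known_errors(problems):
    """Problems that cannot be resolved at this time populate the known error database."""
    return [p for p in problems if not p.resolvable_now]


# Made-up records for illustration only.
problems = [
    ProblemRecord(
        title="Slow network response times",
        description="Multiple reports of slow connectivity",
        symptoms="Slow response times",
        resolvable_now=False,
        related_incident_ids=[1, 2, 3],
    ),
    ProblemRecord(
        title="Badge reader resets",
        description="Reader reboots after power loss",
        workaround="Power-cycle the reader",
    ),
]

kedb = known_errors(problems)
print([p.title for p in kedb])
```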

Definitely helpful to break that down and understand that a bit further, and I think, generally, what we see for folks that are using known error within problem management, they're also crafting knowledge content. So, just like earlier, where we saw the recommendations for which knowledge base articles would be appropriate, maybe for those trending tickets, if this were a known error, we could have shared that here too.

As we look at this problem example a little bit more, we can see that the root cause here was a firmware update that took place on a device in our environment. Let's go back to the relationships down here and see if we can make a relationship to the change that caused that firmware update to result in these problems.

So, with this, Sean, it actually sounds like that the change that was implemented that we saw through that related tab actually is what caused those reported incidents, and even that problem, but I think generally, when I consider change management, or what they now are calling change enablement, I think of having to request a change. Let's actually talk through that a little bit.

So, I am very excited that this is our example case here, because like you mentioned, I think most people think of a change to fix a problem. That's not always the case, so here we're going to actually look at how a change caused a problem.

That's really helpful to understand that a little bit further. So, before we take a look at change management in the Service Desk, what is change management?

So, as I was mentioning, documentation is huge. Change management, from the definition, there's a few things that I like to call out, and that's the assessing of the change, that's the authorization of the change, and then managing the schedule, right? So, we need to assess what is your plan? What are you gonna do to fix it if it doesn't work, and how are we gonna make sure that it was successful? And then authorizing what are the key stakeholders involved? So, a lot of things that I like to see when an organization is adopting change management is the inclusion of more and more business stakeholders. It's not just IT involved in these changes. And then, lastly, the schedule, so if we've got multiple changes taking place at a time, there's gonna be a lot of risk for that conflict, so making sure that we're scheduling our changes appropriately is going to be a huge piece of change management.

I couldn't agree with you more, Sean. So, now that we've got a little bit of change management understanding under our belts, let's actually take a look at how that is enacted in the Service Desk.

So, we're jumping back into the Service Desk, and we're looking at that problem record where we had the network response issues. Going back to our related objects here, I can see the change has now been associated, and we're going to take a look at it.

So, just like you were mentioning earlier, Sean, documentation is key, and I think having that streamlined visibility through that one centralized tab is going to be very, very powerful for teams that use it. But now that we've gotten into the change record, this looks really similar to what we saw in the problem management space.

And I think that's a good thing. Having a consistent view, but with making sure that the information you're documenting is specific to what you're trying to record on. So, in this case, we're looking at a change versus a problem, but I kind of am already familiar with the format. I've still got a title of this change, I've still got a description, but rather than reviewing the things that I did on a problem, now I'm looking at my change plan. What am I going to do to implement this change? My rollback plan. What am I going to do to reverse it if this isn't a successful change? And then my test plan. What am I going to do to make sure that this is going to be successful, or say, "Hey, everything worked exactly as we had planned"?

And I think with this one, it looks like, given the life cycle that we've been following from those trending tickets, through the recorded problem, into the change, we're actually going to need to rely on our rollback plan.

Absolutely, and documentation, I'll say it again. That is why documenting in change management is so important. If you don't know what your rollback plan is, how would you come back from a negative change, right? Change can have big impact. That's why the assessment phase is so important in change management, and why having a very well-documented rollback plan, which we don't really have a whole lot here, but help us with your imaginations. Having that rollback plan is going to be critical to a change management practice.
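The point about documenting all three plans can be shown with a small check. This is a sketch under stated assumptions: the `ChangeRecord` class and the `missing_plans` helper are hypothetical names made up for this example, not part of the Service Desk product. The helper lists which of the change, rollback, and test plans are still blank before a change goes to approval.

```python
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    title: str
    description: str
    change_plan: str = ""     # what will be done to implement the change
    rollback_plan: str = ""   # how to reverse it if the change fails
    test_plan: str = ""       # how success will be verified


def missing_plans(change):
    """Return the names of any plan fields that are empty or whitespace only."""
    plans = {
        "change_plan": change.change_plan,
        "rollback_plan": change.rollback_plan,
        "test_plan": change.test_plan,
    }
    return [name for name, text in plans.items() if not text.strip()]


# Made-up change record for illustration only.
firmware_change = ChangeRecord(
    title="Firmware update",
    description="Update firmware on a network device",
    change_plan="Apply the new firmware during the maintenance window",
    rollback_plan="   ",
    test_plan="Confirm response times after the update",
)

print(missing_plans(firmware_change))
```

A check like this is only a gate on empty fields; whether a rollback plan is actually good enough is still a judgment for the people assessing the change.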

Great, Sean. So, when you were initially walking us through this concept of change management, you also alluded to scheduling. I definitely know that time is going to be sensitive around this, because you're making a change to the infrastructure. So, let's dive in a little bit further to that scheduling element.

Yeah, and one of the great things about this Service Desk product is it actually does a conflict checker. So, if we're looking at this example, it looks like May 12th, starting around 9 p.m., actually ending at May 13th around 2 a.m., that's when this change took place. There weren't any changes around it, and so a conflict checker, having that automated feature built in, saying were there any other changes scheduled during this time, that's already part of the app, so that helps you manage that without having to worry about filling in a calendar all on your own.
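The core of a conflict check like this is an interval overlap test. Below is a minimal Python sketch of that idea, not the product's implementation. The `ChangeWindow` class, the function names, the year 2020, and the two other scheduled changes are all assumptions added for illustration; only the May 12th, 9 p.m. to May 13th, 2 a.m. window comes from the example on screen.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ChangeWindow:
    name: str
    start: datetime
    end: datetime


def overlaps(a, b):
    """Two windows conflict when each one starts before the other ends."""
    return a.start < b.end and b.start < a.end


def find_conflicts(candidate, scheduled):
    """Return every scheduled change whose window overlaps the candidate's."""
    return [change for change in scheduled if overlaps(candidate, change)]


# The window from the example; the year is assumed.
firmware = ChangeWindow(
    "Firmware update",
    datetime(2020, 5, 12, 21, 0),
    datetime(2020, 5, 13, 2, 0),
)

# Hypothetical other changes, made up to show a conflict and a non-conflict.
others = [
    ChangeWindow("Database patch", datetime(2020, 5, 13, 1, 0), datetime(2020, 5, 13, 3, 0)),
    ChangeWindow("Switch replacement", datetime(2020, 5, 14, 22, 0), datetime(2020, 5, 14, 23, 0)),
]

conflicts = find_conflicts(firmware, others)
print([c.name for c in conflicts])
```

With strict comparisons, a change that starts exactly when another ends is not treated as a conflict; a team that wants a buffer between changes could pad each window before checking.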

And I definitely think having that conflict calendar in there as well helps us again provide visibility to those other teams. So, if there was another change that was being orchestrated around the same time, other technicians would be able to know that, given that we're not all in the same location anymore.

Yeah, and from this example, we can see that, hey, it looks like this took place last night, wrapping up this morning, so, hey, that could be exactly why we're having these issues. That just kind of solidifies our identifying this change was what resulted in that problem.

Now, I know when you were initially talking us through what change management entails, you mentioned the need to have approvers. So, let's talk a little bit about that process flow and the inclusion of receiving approval to move forward with the change.

So, I think that's a great thing to talk about, because after we assessed this, right, so an approver might see the change. They would review your plans, both change, rollback, and your test plan, and say, "Okay, I think that this is going to be in good shape." Having those business stakeholders buy in, and having that level of "let's keep talking about it" visibility and collaboration on the change record is going to be crucial. Not only that, but it's always nice to have a little bit of, let's call it, accountability. So, now we've done pretty much all that we need to do when we're looking at how a change is addressed, right? We've assessed our plans, we've managed the schedule, and we've seen that this has approval.

So, now that we're having to actually enact our rollback plan, I think this is a great time for us to actually rewind and take stock of all the different events that we reviewed today with you, Sean, through that full life cycle throughout the Service Desk.

Yeah, and let's do that rewind. This was a case where the rollback plan, thank goodness, was documented, and documentation, again, is going to be something I won't stop saying today. So, from here, we're going to enact that rollback plan, and this change was related now to the problem, right? And when we think about problem management, we know that this change was the root cause, right, so what is our solution going to be? The solution for the problem is the rollback plan on the change, right? And when we started, we were actually just at the incident management level.

That's right.

But that led us up to our problem management.

That's exactly right, Sean, so when we actually started today's journey, we reviewed incident management and how it could be applicable both to a one-off ticket about somebody having a Wi-Fi issue, or it could be recording and tracking multiple issues of the same nature.

And because we found that trend of incidents, we were able to identify this as a problem and move into the problem management section of the Service Desk.

And from the problem, that then prompted us to associate that to a change that had previously taken place to the infrastructure. So, as you mentioned before, and I don't know if our audience is tired of hearing it yet, documentation is key to identify all those different elements of what took place in the life cycle throughout our Service Desk. I do think, though, it's safe to say having those established strategies backed by that ITIL foundation is going to be very beneficial for teams so that they are able to react and adapt, given whatever circumstance that comes their way, even something like working remotely.

And one of my favorite things about ITIL 4 is that these are no longer called processes, they're called practices, and I like that because practice makes perfect. Not every change is gonna go according to plan. Problems won't always have an easy root cause to identify, so practice makes perfect. Keep trying, and let automated tools, like the SolarWinds Service Desk, with easy documentation fields help you get that done.

Thanks so much for your time today, Sean. It was great sitting down with you. I know that I took a lot from the conversation, and I hope everybody in our audience did too. Happy to take questions, and certainly, if you need additional assistance, you can always go to www.solarwinds.com/service-desk. Thanks, everybody. Take it easy.

  • Been using ITIL since around 1999 I think it was... that was when Remedy was a company of less than 20 people! At the time I was working for GE and got my Six Sigma green belt. In those days they used to give you a real green or black belt. Remedy was one of the first helpdesk tools that was fully built around ITIL. This was around the same time the US started to embrace the international ITIL standard that was born abroad. At the time there were still competing frameworks out there.

    Bill