Root Cause Paralysis

So far this month, we've talked about the difficulty of monitoring complex, interconnected systems; the merging of traditional IT skills; and tool sprawl. You've shared some great insights into these common problems. I'd like to finish up my diplomatic tenure with yet another dreaded reality of life in IT: Root Cause Analysis.

Except... I've seen more occurrences of Root Cause Paralysis lately. I'll explain. I watched a complex system suffer a major outage because of a simple misconfiguration on an unmonitored storage array. That misconfiguration, in turn, exposed several bad design decisions that had been built on top of it. Once the incident had been resolved, management demanded a root cause analysis to determine the exact cause of the outage and to implement a permanent corrective action. All normal, reasonable stuff.

The Paralysis began when representatives from multiple engineering groups arrived at the RCA meeting. They were the usual suspects: network, application, storage, and virtualization. We began with the network, and the network engineers presented a ton of performance and log data from the morning of the outage to show that all was well in Cisco-land. (To their credit, the network guys even suggested a few highly unlikely scenarios in which their equipment could have caused the problem.) We moved on to the application team, who presented some SCOM reports showing high latency just before and during the outage. But when we got to the virtualization and storage components, all we had was a hearty, "everything looked good." That was it. No data, no reports, no graphs to quantify "good."
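(For what it's worth, it doesn't take much to quantify "good." Here's a minimal sketch of the kind of thing I'd want someone to bring: it assumes you've exported per-datastore latency samples from whatever monitoring tool you use into a hypothetical CSV named latency.csv with timestamp, datastore, and read_latency_ms columns, and it just summarizes them so the meeting starts from numbers instead of adjectives.)

```python
# Summarize exported latency samples so "everything looked good" comes with numbers.
# Assumes a hypothetical export, latency.csv, with columns:
#   timestamp, datastore, read_latency_ms
import csv
import statistics
from collections import defaultdict

samples = defaultdict(list)
with open("latency.csv", newline="") as f:
    for row in csv.DictReader(f):
        samples[row["datastore"]].append(float(row["read_latency_ms"]))

for datastore, values in sorted(samples.items()):
    values.sort()
    p95 = values[int(len(values) * 0.95)]  # rough 95th percentile
    print(f"{datastore}: avg={statistics.mean(values):.1f} ms, "
          f"p95={p95:.1f} ms, max={max(values):.1f} ms "
          f"({len(values)} samples)")
```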

So my final questions for you:

  1. Has this situation played out in your office before?
  2. What types of information do you bring with you to defend your part of the infrastructure?
  3. Do you prep for these meetings, or do you just show up and hope for the best?

Go!
