
Geek Speak


One of the best parts of my job is helping customers benefit from implementing virtualization management solutions. In most cases, they start looking for a solution only after they experience problems: slow VMs, slow applications, you know the story.


Then what? Fire in the house. The first person they call is the “guy who manages” the whole system – my client – to complain about slow application or network performance. By then it’s already too late to prevent the problem, especially if there is no virtualization management suite in place that can not only monitor the environment but also predict trouble with “what if” scenarios.

Wouldn't it be better to fix problems before they occur? Slow VM performance always has a source somewhere, and it would be nice to be alerted that, for example, “in one month you’ll run out of space in your datastore,” or “in two months the capacity of your two brand-new ESXi hosts will no longer be sufficient” because you’re creating five new VMs per day.
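That kind of “what if” alert can be sketched with a simple linear projection over daily used-space samples. The numbers and the 500 GB capacity below are invented for illustration; a real management suite would use richer trending models.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until a datastore fills up, using a simple
    linear fit over daily used-space samples (GB)."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / \
            sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None  # usage flat or shrinking: no projected fill date
    return (capacity_gb - samples[-1]) / slope

# One week of daily used-space readings (GB) on a 500 GB datastore
usage = [310, 318, 324, 333, 341, 348, 356]
remaining = days_until_full(usage, 500)
print(f"Projected to fill in ~{remaining:.0f} days")
```

At roughly 7–8 GB of growth per day, this sample projects the datastore filling in under three weeks – exactly the kind of early warning that prevents the 3 a.m. phone call.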


Admins usually have to firefight at least these two things:


  • Internal problems on applications they run on their infrastructure.
  • Problems related to VM sprawl where the virtualization system becomes inefficient and the only solution would be to throw in more hardware.


One of the first functions of every virtualization management suite is the capability to get the full picture – to see, for example, that half of a client’s VMs are performing poorly because they’re over- (or under-) provisioned with CPU or memory, or that every other VM is running with snapshots.


Other problems can arise from poorly performing existing hardware, which is often out of warranty and too old to do the job required of it.

Now what? Buy first and optimize later? No, that’s not what I would usually advise. The optimization phase consists of helping the client solve their immediate problem first, and then advising them to implement a solid virtualization management solution. The client thanks you not only for the help that saved their bacon, but also for the advice that will save them from future problems.

FTP, FTPS, and SFTP are the most widely used file transfer protocols in the industry today. All three differ in terms of the data exchange process, security provisions, and firewall considerations. Let’s discuss how they differ so it’s easier for you to select the right protocol based on your requirements.



File Transfer Protocol (FTP)

FTP works in a client-server architecture: one computer acts as the server that stores data, and another acts as the client that sends files to or requests files from the server. FTP typically uses port 21 for communication, and the FTP server listens on that port for client connections.

FTP exchanges data using two separate channels:

  • Command Channel: The command channel is used for transmitting commands (e.g., the USER and PASS commands) over port 21 (on the server side) between the FTP client and server. This channel remains open until the client sends the QUIT command or the server forcibly disconnects due to inactivity.
  • Data Channel: The data channel is used for transmitting data. In active mode FTP, the data channel is normally on port 20 (on the server side); in passive mode, a random port is selected and used. This channel carries directory listings (the LIST command) and file transfers (the STOR and RETR commands for uploading and downloading files). Unlike the command channel, the data channel closes its connection once the data transfer is complete.


FTP is an unencrypted protocol and is susceptible to interception and attack. The requirement that ports remain open also poses a security risk.
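The two-channel flow above can be seen in Python’s standard ftplib module. This is a minimal sketch; the host, credentials, and filename are caller-supplied placeholders, not a real server.

```python
from ftplib import FTP

def fetch_report(host, user, password, filename):
    """Download one file over plain, unencrypted FTP.

    host/user/password/filename are placeholders supplied by the
    caller; nothing here is specific to any real server."""
    ftp = FTP()
    ftp.connect(host, 21)            # command channel on port 21
    ftp.login(user, password)        # USER/PASS sent in cleartext
    ftp.set_pasv(True)               # passive mode: server picks the data port
    with open(filename, "wb") as fh:
        ftp.retrbinary(f"RETR {filename}", fh.write)  # transfer over the data channel
    ftp.quit()                       # QUIT closes the command channel
```

Note that everything here, including the password, crosses the wire unencrypted – which is exactly the weakness FTPS and SFTP address.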


File Transfer Protocol over SSL (FTPS)

FTPS is an extension to FTP that adds support for cryptographic protocols such as Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL). FTPS allows encryption of both the control and data channel connections, either concurrently or independently. There are two possible FTPS methods:

  • Implicit FTPS: This is a simple technique that involves using standard secure TLS sockets in place of plain sockets at all points. Since standard TLS sockets require an exchange of security data immediately upon connection, it is not possible to offer standard FTP and implicit FTPS on the same port. For this reason another port needs to be opened – usually port 990 for the FTPS control channel and port 989 for the FTPS data channel.
  • Explicit FTPS: In this technique, the FTPS client must explicitly request security from an FTPS server, and then set up a mutually agreed encryption method. If a client does not request security, the FTPS server can either allow the client to continue in unsecured mode or refuse/limit the connection.


The primary difference between the two techniques is that with the explicit method, FTPS-aware clients can invoke security with an FTPS-aware server without breaking overall FTP functionality for non-FTPS-aware clients, whereas with the implicit method, all clients of the FTPS server must be aware that SSL is to be used on the session, making it incompatible with non-FTPS-aware clients.


SSH File Transfer Protocol (SFTP)

SFTP is not FTP run over SSH, but rather a new protocol designed from the ground up to provide secure file access, file transfer, and file management over any reliable data stream. There is no concept of separate command and data channels; instead, both data and commands are encrypted and transferred in specially formatted binary packets over a single connection secured by SSH.

  • For basic authentication, you may use a username and password to secure the file transfer, but for more advanced authentication, you can use SSH keys (a public and private key pair).
  • Though SFTP clients are functionally similar, you cannot use a traditional FTP client to perform file transfer via SFTP. You must use an SFTP client for this.


A major functional benefit of SFTP over FTP and FTPS is that, in addition to file transfer, you can also perform file management functions such as permission and attribute manipulation, file locking, and more.
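A rough illustration of that single-connection model, assuming the third-party paramiko library is installed; the host, key file, and remote path are placeholders invented for this sketch.

```python
def sftp_manage(host, user, keyfile, path):
    """Sketch of SFTP transfer plus file management over one SSH
    connection. Uses the third-party paramiko library (assumed
    installed); host/user/keyfile/path are illustrative placeholders."""
    import paramiko  # third-party SSH/SFTP implementation

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, port=22, username=user,
                   key_filename=keyfile)      # SSH key authentication
    sftp = client.open_sftp()                 # one encrypted channel for everything

    sftp.put("report.csv", f"{path}/report.csv")  # file transfer...
    sftp.chmod(f"{path}/report.csv", 0o640)       # ...plus permission management
    print(sftp.stat(f"{path}/report.csv"))        # ...and attribute inspection

    sftp.close()
    client.close()
```

The `chmod` and `stat` calls are the point of the example: they are file management operations that plain FTP and FTPS simply don’t offer as part of the protocol.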







Security and Encryption

  • FTP: Unencrypted information exchange on both command and data channels. Communication is human-readable.
  • FTPS: Encryption happens on both command and data channels via either implicit or explicit SSL/TLS. Communication is not human-readable.
  • SFTP: All information exchanged between server and client is encrypted via the SSH protocol, which also secures the session. Communication is not human-readable, as it’s in a binary format.

Firewall Port for Server

  • FTP: Allow inbound connections on port 21.
  • FTPS: Allow inbound connections on port 21 and/or ports 990 and 989.
  • SFTP: Allow inbound connections on port 22.

Firewall Port for Client

  • FTP: Allow outbound connections to port 21 and the passive port range defined by the server.
  • FTPS: Allow outbound connections to port 21 and the passive port range defined by the server.
  • SFTP: Allow outbound connections to port 22.


Choosing which protocol to use for file transfer depends entirely on your requirements and how secure you want the file sharing to be. An effective approach is to use a third-party managed file transfer server that supports all three options, so it’s convenient for you to adjust as your needs change.

I’m very pleased to announce that Thomas LaRock, AKA sqlrockstar, has joined the Head Geek team. Many of you already know Tom from his work at Confio, where he played a critical part in their rapid growth as the Ignite product Evangelist. He remains in that role for what is now SolarWinds Database Performance Analyzer (DPA) and we’re thrilled to add his expertise to our group.


What makes Thomas such a highly ranked GEEK*?

  • He’s a Microsoft SQL Server Certified Master and six-time Microsoft SQL Server MVP. (He joins Lawrence Garvin as the second Microsoft MVP on the Head Geek team!)
  • He is known as “SQL Rockstar” in SQL circles and on his blog, which he started in 2003.
  • With over 15 years of IT industry experience, he has worked as a programmer, developer, analyst, and database administrator (DBA), among other roles.
  • He wrote “DBA Survivor: Become a Rock Star DBA,” sharing his wisdom and experience with junior- and mid-level DBAs to help them excel in their careers.
  • He is President of the Board of Directors for the Professional Association for SQL Server (PASS), an independent, not-for-profit association dedicated to supporting, educating and promoting the global SQL Server community.


*GEEK: We use this term with all the affection we can muster. Geeks rule. But you knew that, right?

Tom really knows his stuff and he’s great at sharing that knowledge. Seriously, see for yourself on his blog: http://thomaslarock.com. Oh, and he’s already received his SolarWinds lab coat, has started appearing in SolarWinds Lab episodes, and is writing on Geek Speak. Keep it up, Thomas!


Tom lives in Massachusetts with his family and loves running, basketball and until recently, rugby. He also enjoys cooking and is a film junkie. He earned a Master’s in Mathematics from Washington State, and also loves bacon.


Welcome Thomas!


Twitter: @HeadGeeks, @SQLRockstar

thwack: SQLRockstar



Are you ready for another IT Blogger Spotlight? We sure are! This month, we managed to catch up with Brandon Carroll, who blogs over at Global Config and can be found on Twitter where he goes by @brandoncarroll. Here we go...


SWI: Brandon, Global Config is much more than a blog. How would you describe it in a nutshell?


BC: Global Config actually started out as just a blog when I was working on the CCIE Security lab exam. As time went by, though, and as I started doing training on my own I converted the blog into the frontend for my company, Global Config Technology Solutions, which is now a Cisco Learning Partner.


SWI: So then, what’s the Global Config blog all about these days?


BC: Well, I cover a number of different topics, primarily around network security and Cisco products. I also try to get some exposure to other vendors, too, though. For example, I do tutorial posts on how to do certain things with, for example, SolarWinds IP Address Manager and Engineer’s Toolset. Really, I only blog about products that I’m able to get my hands on and use personally. Still, since it’s my company, every now and then I sneak in a post about productivity and Mac, iPad and iPhone apps I think are particularly neat or handy.


SWI: So, you started blogging back when you were working the CCIE Security lab exam, but what are some of your favorite topics to blog about?


BC: Honestly, I just like to blog about whatever fascinates me. And I know that sounds weird, but sometimes I’m fascinated by a product and other times I’m fascinated by a concept or a topic that is covered in one of the courses I teach. Really, the most enjoyable things to write about are the topics that I come up with rather than the topics that somebody asks me to write about. That doesn’t mean that when my students ask me a question I don’t enjoy blogging the answer because I really do enjoy that as well. But there’s just something about taking a thought and putting it down in a blog post and then knowing that other people are reading it and finding value. That it might be helping them solve a problem.


SWI: Interesting. Do you find certain types of posts end up being more popular than others?


BC: Typically, my most popular posts are the tutorial posts. I also see quite a bit of interest in posts related to the Cisco ASA. Sometimes I’ll also do posts about great consumer products that end up being pretty popular. For example, last year I did a post about a D-Link product and using it for IPv6 connectivity. It ended up being one of my most popular posts.


SWI: So, how’d you get into IT in the first place?


BC: When I was 18 or 19, I was trying to become a firefighter. In fact, I joined the Air Force in hopes that I would become a firefighter. After I left the Air Force, however, I found it was very difficult to get a job in that line of work so I ended up working a number of odd jobs. During that time, I applied for a job at the phone company GTE and was hired as a field technician. As field technicians, not only did we install circuits for customers, but we also had laptops. When they would break, somebody had to fix them, and after some time that somebody ended up being me. I would spend part of my day fixing laptops for the other technicians that I worked with. From there, I transferred into a group called EPG, or the Enhanced Products Group, and it was there that I learned how frame relay, frame relay switches, and ATM switches worked, and I was also introduced to the world of Cisco routers and Cisco networking. That must've been right around 1998 or 1999.


SWI: Well, you’ve come a long way since then. As the experienced IT pro you are, what are some of your favorite tools?


BC: Oh, there are too many to list. It’s actually really hard to pick my favorites. One thing I tend to do is jump from tool to tool depending on what I’m trying to accomplish. I like SolarWinds Network Topology Mapper quite a bit because as a consultant I can quickly get a map of a customer’s network and compare it to what they tell me they have. I also like SecureCRT and ZOC6, which are terminal applications for the Mac. Of course, there’s NMap and Wireshark to name a few more.


SWI: OK, time for the tough question: What are the most significant trends you’re seeing in the IT industry right now and what’s to come?


BC: Software Defined Networking. I think that a controller-based solution will ultimately be what everybody ends up using, or at least how everybody implements their technology, and we’re going to see less and less of this hardcoded configuration of data and control planes on individual devices. I think we’re also going to see a lot more virtualization. I don’t think we’re anywhere near seeing the end of innovation there, and I believe a lot of the newer products we see in the virtual space are going to be security products. Overall, I think we are in a major transition right now, so being in IT, or even starting out in IT, at this point in time is going to be very interesting over the next couple of years.


SWI: OK, last question: I’m sure running Global Config keeps you pretty busy, but what do you like to do in your spare time?


BC: Well, I’m a family man and I like to do things with the kids, so when I’m not working or blogging we like to go camping and ride dirt bikes. We recently bought a truck and a fifth wheel trailer, so we’ve been visiting some local campgrounds. It’s an opportunity to disconnect the phones and teach my young ones what it’s like to play a board game for a couple hours. I don’t think people do that enough anymore.

Centralizing globally distributed corporate IT departments is a large challenge for enterprises. A distributed system not only taxes enterprise resources, but also threatens to impede the efficiency of the growing company. In such organizations, it’s the responsibility of the IT team to manage the infrastructure, technology solutions, and services spread out among thousands of employees across the globe.


In addition, IT teams must support all employee-facing technology including networks, servers, help desk, asset management, and more. In short, it’s a tough job to support distributed site locations of varying sizes, mostly because of the large number of networking devices and the several hundred systems and datacenters housing both virtualized and physical systems.

Disparate IT Management vs. Holistic IT Infrastructure Management

Large enterprises often end up operating individually by region with each location managing its own infrastructure. This fragmented approach consumes a huge amount of resources and diminishes operational efficiency.


Some additional consequences to this approach can include:


  • Regional accountability for individual growth versus the company as a whole
  • Absence of global alignment around service availability and monitoring
  • Duplication of efforts—multiple administrators performing the same tasks


A better, more efficient approach to managing globally distributed IT departments is to build and unify a team equipped to scale as the company grows. This can only be accomplished by adopting a holistic approach towards IT infrastructure management.


A holistic management method provides teams with greater global visibility, alert notification and management, capacity management, service availability, and the ability to measure metrics beyond ‘just uptime’. The key to achieving operational efficiency is to maintain a central access point for all information required for complete visibility, stability and capacity across the global IT infrastructure.


In a network of numerous multi-vendor devices and device types, it’s vital that monitoring and maintenance is centralized. It’s important to leverage end-to-end visibility by prioritizing and focusing efforts to understand the nodes that need attention, those that are nearing full capacity, and those that can be left alone for now.


By unifying monitoring capabilities, IT teams can increase operational efficiency and function efficiently as one unit as their organizations grow and evolve. Some early advantages of a unified approach include:


  • Shared monitoring information that provides faster response to downtime
  • Greater visibility into how changes to business critical devices impact the network
  • Ability to monitor and manage multi-vendor devices under one management umbrella
  • Successfully and confidently meeting end-user SLAs


Holistic IT infrastructure management can be achieved by investing in a software solution or tool. It’s important to choose a tool that helps meet the goals of the IT team, is cost effective, and requires minimal training.


NetSuite is a leading provider of cloud-based business management software covering enterprise resource planning (ERP), customer relationship management (CRM), e-commerce, inventory, and more for over 20,000 organizations and subsidiaries worldwide. NetSuite uses the SolarWinds Solution Suite to centrally manage its globally distributed IT organization.

The Heartbleed survey results are in – and the good news is that the vast majority of SolarWinds thwack users are in the know and on top of it (of course, this comes as no surprise!).


Here are the results:



  • Of the 61 respondents surveyed, only 6.6% were not sure whether they were affected by the Heartbleed vulnerability, and 100% were aware of the vulnerability.

  • When asked whether their organization had a clear action plan to address Heartbleed, a whopping 81% either were not vulnerable or had fully addressed the vulnerability. Only 5% were still trying to identify steps or didn’t know what to do.




  • The cleanup of Heartbleed has made an impact on IT, but for the most part there is confidence in fast remediation. Almost half of respondents said it took only hours to address; 30% said days, 11% said weeks, and 10% were not sure.






  • When asked about the overall effort/cost of tasks associated with the cleanup, the largest cited effort was following up with vendors to determine whether products were affected. Replacing digital certificates was cited second, and addressing customer concerns about the privacy of data came third. Surprisingly, addressing internal concerns about the vulnerability ranked last, which is either a promising indicator of fast, clear communication as part of the incident response process or a sign of lacking security awareness.




  • And finally, in terms of cleanup effort, the answers were fairly even across operating systems, websites, and third-party applications.


So, to sum up, it’s great to see that in spite of all the hype, while Heartbleed may have been painful, it wasn’t devastating to most. We put a lot of effort into providing a fast vendor response on our end, and we hope that made it a bit easier for our beloved IT pros out there.

We know mobile devices are must-have tools for your end users, and they’re growing more accustomed to having options when it comes to picking their corporate-connected mobile devices. Two end user mobile enablement strategies seem to be leading the pack: BYOD (bring your own device) and CYOD (choose your own device). BYOD, of course, involves end users providing their own devices from a virtually unlimited set of possibilities and connecting them to the corporate network and resources. CYOD, on the other hand, involves end users selecting their soon-to-be corporate-connected mobile devices from a defined list of devices that IT supports and can have more control over, the idea being that the burden on you is lessened because you don’t have to be prepared to manage every mobile device under the sun.


So we’re curious, has your organization settled on one of these strategies over the other? If so, we’d love to hear about your first-hand experience implementing either of these policies—or a hybrid approach—into your organization, and how your company arrived at the decision to go the route they did. If you have implemented a CYOD policy, what benefits have you seen? Was it successful or did your employees revolt and turn to smuggling in their own devices anyway? I'm looking forward to hearing your feedback.


And if you haven’t already taken our BYOD vs. CYOD poll, you can find it here.

Just as network admins keep watch on configuration changes to networking hardware, it’s important to monitor config changes in a virtual environment. Network configuration errors are definitely a nightmare and can take the network down, but config errors in a virtualized host or VM can impact every server and application that depends on them. In a virtual environment, most repair time is spent investigating what changed in a system.

Unlike network devices, whose config files we want to keep in a good state and not change unless required, the virtual environment is by nature dynamic: VMs migrate between hosts and clusters, and resources are constantly provisioned and reallocated. All of this results in configuration changes. Config changes also occur during routine processes such as software updates, security patches, hot fixes, and memory, CPU, and disk upgrades. If we don’t keep track of what is changing in the virtual layer, we can’t easily diagnose and troubleshoot performance problems.




The best solution is to map the state of your virtual environment and all its components (clusters, VMs, hosts, and datastores) over time, and to maintain the historical status of that map as it evolves and changes. You need to be able to compare the current and historical configuration state of a specific VM between different dates and times, and also to compare one VM’s configuration with another’s over a specific time period. That way you’ll be able to see what has changed and gain the visibility needed to troubleshoot config changes. For example, you can:

  • Compare configuration of VMs before and after vMotion or live migration
  • Monitor whether configuration of a VM has changed over time
  • Monitor resource allocation (CPU, memory, disk and network) to VMs as this directly impacts VM configuration
  • Monitor VM workload: To meet growing workload the hypervisor can provision more resources to a VM and this could result in a config change
  • VM Sprawl: Zombie/stale/inactive VMs present in the virtual environment can cause resource contention amongst the active VMs and, in turn, cause config changes at the host level
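The before/after comparison described above can be sketched with plain dictionaries; the snapshot keys here are illustrative, not a real vendor schema.

```python
def diff_config(before, after):
    """Report which settings changed between two VM config snapshots.
    Snapshots are plain dicts; keys are invented for illustration."""
    keys = set(before) | set(after)
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}

# Two snapshots of the same VM, taken before and after a migration
snap_monday = {"vcpus": 2, "memory_mb": 4096, "host": "esx01", "snapshots": 0}
snap_friday = {"vcpus": 4, "memory_mb": 4096, "host": "esx02", "snapshots": 2}

print(diff_config(snap_monday, snap_friday))
# Each entry maps a setting to its (old value, new value) pair
```

Applied across dated snapshots of every VM, this is the core of the historical comparison: unchanged settings drop out, and only the drift that needs explaining remains.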


Virtualization config changes are part of the package; there are bound to be changes as VMs keep moving and migrating. This doesn’t mean VM migration is risky. Flexibility of VM movement is one of the benefits of server virtualization, and we should leverage it. Employ the right virtualization management tool to ensure your virtual environment is mapped and its configuration changes are captured and reported!

Like everyone else, I am very busy at my job, and I tend to be a jack of all trades. I find myself running from fire to fire, getting things under control. I wish I had time to be more proactive and do everything by the book, but the reality of a short-staffed global manufacturing company doesn’t always allow for such luxuries.

Sometimes things are insanely busy and I don’t know how I get through the day, but every once in a while there is a lull in the action, and it’s during these times that I try to focus on making my network management station better and catch up on proactive maintenance. I know that in the long run, every hour I spend on the NMS will be repaid many times over in stress savings and performance improvements.

These lulls don’t happen very often, and when they do, I don’t have months to learn an overly complex, high-priced NMS that requires a degree in computer programming to get the simplest tasks done. I know what I need to get done; I just need to translate that into a task in the NMS.

That’s why I really like the SolarWinds suite of network management products: they are cost effective, easy to use, easy to install, and very easy to run and maintain. And if I ever have an issue, I can jump over to the thwack community and find what I’m looking for quickly and efficiently. There are so many great products from SolarWinds.

They are a company that truly gets it. They understand that I don’t have a lot of time to spend learning SQL programming, Oracle databases, or Perl reports. I just want to monitor my network with a user-friendly graphical interface. A few clicks here and there and I am literally up and running.

Network management is not the main part of my job; in fact, it’s a rather small part. I need to run and protect the business, keep the lights on, and keep the machines running. There is such a huge return on investment with Orion. It has saved my bacon many times, and while nothing is perfect, I have to say that I am very happy and sleep better at night knowing I’ve got someone watching my back. I hate being woken up by the pager at 3 a.m., but that happens a lot less nowadays. Now I know what the problem is and can get to work on fixing it, instead of fumbling around wondering what happened while people ask me for status every 5 minutes.

How are you using your NMS? Is it your full-time job? I think most people are like me: they have a lot of other things to get done. I am so thankful the product is so easy to use; it really makes my job that much better.

It’s hard to fix something if you don’t know what is broken, and my NMS is my eye in the sky. Those help desk tickets won’t stop coming in and the phone doesn’t stop ringing, but hey, I’ve got this. How about you?

michael stump

Root Cause Paralysis

Posted by michael stump Apr 28, 2014

So far this month, we've talked about the difficulty of monitoring complex, interconnected systems; the merging of traditional IT skills; and tool sprawl. You've shared some great insights into these common problems. I'd like to finish up my diplomatic tenure with yet another dreaded reality of life in IT: Root Cause Analysis.


Except... I've seen more occurrences of Root Cause Paralysis lately. I'll explain. I recently watched a complex system suffer a major outage because of a simple misconfiguration on an unmonitored storage array, and that simple misconfiguration in turn revealed several bad design decisions that were predicated on it. Once the incident was resolved, management demanded a root cause analysis to determine the exact cause of the outage and to implement a permanent corrective action. All normal, reasonable stuff.


The Paralysis began when representatives from multiple engineering groups arrived at the RCA meeting. It was the usual suspects: network, application, storage, and virtualization. We began with a discussion of the network, and the network engineers presented a ton of performance and log data from the morning of the outage to indicate that all was well in Cisco-land. (To their credit, the network guys even suggested a few highly unlikely scenarios in which their equipment could have caused the problem.) We moved to the application team, who presented some SCOM reports that showed high latency just before and during the outage. But when we got to the virtualization and storage components, all we had was a hearty, "everything looked good." That was it. No data, no reports, no graphs to quantify "good."


So my final questions for you:


  1. Has this situation played out in your office before?
  2. What types of information do you bring with you to defend your part of the infrastructure?
  3. Do you prep for these meetings, or do you just show up and hope for the best?



Before we discuss the “how” of cleaning up the firewall rule base, let’s understand the “why” behind the need to perform clean-up in the first place.

  • Firewall Performance Impact: The firewall rule base tends to keep growing as network and security admins adjust it to address firewall policy changes. If left unchecked, your rule base can swell to hundreds or even thousands of rules, which makes it harder for the firewall to process and leads to reduced performance.
  • Firewall Configuration Errors: With complex rule sets, unused and duplicate rules can cause config errors. Given the massive size of the rule base, it becomes more difficult for the administrator to figure out the cause of an error and rectify it.
  • Security Vulnerability: An unmanaged, unchecked firewall rule base can contain rules and objects that open up a security gap in your network. You may not intend them to be there, and you may never know these old and unused rules exist in your firewall, yet they pose a threat to your network access control.
  • Regulatory Compliance Requirements: Compliance policies such as PCI DSS require cleaning up unused firewall rules and objects. According to PCI DSS 3.0 requirement 1.1.7, firewall and router rule sets have to be reviewed at least every six months.


So it falls to the administrator to identify redundant, duplicate, old, unused, and shadowed rules and remove them from the rule base to achieve optimized firewall performance. Let’s discuss how you can do this.



Structural redundancy analysis needs no additional data; it is based on identifying rules that are covered by other rules with either the same action (redundant rules) or the opposite action (shadowed rules). In either case, a rule that is redundant or shadowed is a candidate for elimination. You can employ an automated firewall management tool to conduct a structural redundancy analysis and identify such rules. Automated tools help you generate a report and even a clean-up script. In addition to the redundant and shadowed rules themselves, you should also find the rules that cause the redundancy, unreferenced objects, time-inactive rules, disabled rules, and so on.
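As a toy model of structural redundancy analysis, the sketch below flags rules hidden by an earlier, broader rule. Rule matching is simplified to exact-or-wildcard fields; real tools compare address ranges and port ranges properly.

```python
def covers(general, specific):
    """True if `general` matches every packet `specific` matches.
    Rules are simplified to (src, dst, port) with '*' as a wildcard."""
    return all(g == "*" or g == s for g, s in zip(general, specific))

def structural_analysis(rules):
    """Flag each rule hidden by an earlier, broader rule: redundant if
    the actions agree, shadowed if they differ. Rules are evaluated
    top-down, as a firewall would."""
    findings = []
    for i, (match_i, action_i) in enumerate(rules):
        for match_j, action_j in rules[:i]:
            if covers(match_j, match_i):
                kind = "redundant" if action_j == action_i else "shadowed"
                findings.append((i, kind))
                break
    return findings

rules = [
    (("10.0.0.0/8", "*", "443"), "allow"),
    (("10.0.0.0/8", "*", "443"), "allow"),   # duplicate of rule 0 -> redundant
    (("*", "*", "*"), "deny"),
    (("192.168.1.5", "*", "22"), "allow"),   # hidden by the deny-all -> shadowed
]
print(structural_analysis(rules))
# -> [(1, 'redundant'), (3, 'shadowed')]
```

Both findings are candidates for the clean-up script: the duplicate adds processing cost for nothing, and the shadowed allow rule will never fire as written.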



Log usage analysis identifies rules and objects that can be eliminated based on zero usage, as determined from log data. Firewall management tools generally use one of two techniques to gather log data: the first uses log data files, while the second collects log data directly from the device or management server. Here again, a report and a clean-up script are generated.
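Log usage analysis can be sketched the same way. The log format below is invented for illustration, with each line tagged `rule=<id>`; real firewalls each have their own log layout.

```python
from collections import Counter

def unused_rules(all_rule_ids, log_lines):
    """Find rules with zero hits in the firewall log. The log format
    is invented for illustration: each line ends with 'rule=<id>'."""
    hits = Counter(line.rsplit("rule=", 1)[1].strip()
                   for line in log_lines if "rule=" in line)
    return [r for r in all_rule_ids if hits[r] == 0]

logs = [
    "2014-04-20 10:01 ALLOW tcp 10.1.1.5:443 rule=web-in",
    "2014-04-20 10:02 DENY  tcp 10.1.1.9:23  rule=default-deny",
    "2014-04-20 10:07 ALLOW tcp 10.1.1.5:443 rule=web-in",
]
print(unused_rules(["web-in", "ftp-in", "default-deny"], logs))
# -> ['ftp-in']  (a zero-usage rule: candidate for the clean-up script)
```

In practice you would run this over months of logs, not three lines, so that rules used only by periodic jobs aren’t flagged by mistake.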


In both cases, you can run the script to remove the identified rules and objects from the firewall rule base; the cleaned-up rules may be removed from the configuration or simply disabled. It’s most effective to conduct the log usage analysis first and clean up the unnecessary rules, and then generate the structural clean-up report to identify additional rules that can be removed.



  • Redundant or duplicate rules slow firewall performance because they force the firewall to process more rules in its sequence
  • Orphaned or unused rules make rule management more complex and create a security risk by leaving a port or VPN tunnel open
  • Shadowed rules can prevent other critical rules from ever taking effect
  • Conflicting rules may create backdoor entry points
  • Unnecessarily bloated firewall rule bases complicate firewall security audits
  • Erroneous rules with typographical or specification inaccuracies can cause rules to malfunction





A Day in the Life of a DBA

Posted by karthik Apr 22, 2014

A database admin has responsibilities to fulfill around the clock: ensuring databases are backed up, attending to application breakdowns that affect database performance, verifying the accuracy of information within the organization’s database, and constantly monitoring the entire database server. Fulfilling all these responsibilities is what makes a DBA one of the most valuable players in an organization. On any given day, database admins have a set of routine tasks to attend to, including:


SQL Server® Logs

DBAs view SQL logs to verify that SQL Agent jobs have completed all required operations; an incomplete job status can lead to errors within the database. Looking at SQL logs regularly ensures an issue or a database error doesn’t go unnoticed for an extended period. Login failures, failed backups, database recovery time, etc. are key fields a DBA looks for in SQL logs. Reviewing SQL logs is beneficial, especially when you have critical databases in your environment.


Performance Tuning

To fully maximize the potential of the database server and ensure applications don’t suffer downtime due to a SQL issue, it has become best practice for DBAs to monitor SQL Server performance metrics. Whether an issue stems from an expensive query, fragmented indexes, or database capacity, DBAs can set up baseline thresholds and be notified whenever a metric is about to reach its threshold. It also helps to glance through these metrics to see workloads and throughput so you can adjust your database accordingly.
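As a rough illustration of that "notify before the threshold is reached" pattern, the sketch below checks a sample against an invented set of thresholds (the metric names and limits are examples, not SQL Server defaults) and raises a warning inside a configurable margin of the limit:

```python
# Hypothetical metric thresholds: (limit, direction of the bad side).
thresholds = {
    "page_life_expectancy_s": (300, "below"),   # bad when it drops too low
    "batch_requests_per_s":   (5000, "above"),  # bad when it climbs too high
}

def check(metric, value, warn_margin=0.1):
    """Return 'critical' past the limit, 'warning' within warn_margin of it."""
    limit, direction = thresholds[metric]
    if direction == "above":
        if value >= limit:
            return "critical"
        if value >= limit * (1 - warn_margin):
            return "warning"
    else:
        if value <= limit:
            return "critical"
        if value <= limit * (1 + warn_margin):
            return "warning"
    return "ok"

print(check("page_life_expectancy_s", 310))  # within 10% of the floor
print(check("batch_requests_per_s", 5200))   # threshold already crossed
```

The warning band is what buys the DBA time to act before the metric actually crosses the line.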


Database Backup

DBAs have to regularly test backups and make sure they can be restored. This protects them from issues with application and user backups that have been deployed to a different host, server, or datacenter. Regular backup testing also helps them verify that they’re staying within the SLA.


Reporting & Dashboard

As a database grows, the complexity of maintaining and monitoring it grows too. Database issues have to be addressed as soon as possible, so DBAs need real-time data on SQL performance before a disaster occurs. Up-to-date information in the form of reports and dashboards provides visibility into the server, hardware resources, SQL queries, etc. DBAs need access to reports for database size, availability and performance, expensive queries, transaction logs, database tables by size, and so on.


Activities such as database maintenance, data import/export, and running troubleshooting scripts are other areas where DBAs focus their time. To manage and optimize SQL Server, it’s essential to consider a performance monitoring tool that comprehensively monitors vital metrics within your SQL Server and simplifies your “to do” list of activities.


Read this whitepaper to learn more about how to monitor your SQL Server.

I had an opportunity recently to interview a long time SolarWinds Server & Application Monitor (SAM) customer, Prashant Sharma, IT Manager, Fresenius Kabi (India).


KR: As an IT manager, what are your primary roles and responsibilities?

PS: I’m in charge of the whole of IT where I manage the IT environment, look at IT security, data center performance, monitoring, and also application development.


KR: What other SolarWinds products do you currently own other than SAM and how are you using SAM?

PS: Other than SAM, we currently have NPM and NTA. For monitoring IT security, we use RSA, but I’m not happy with RSA as it’s too expensive to maintain, and we are looking at replacing it with LEM. We have around 98 nodes, and we use SAM to monitor the performance of servers. We monitor critical applications like SQL, SharePoint, IIS, and AD, and we also monitor custom and out-of-the-box applications for R&D using SAM. Since we have a huge R&D center, we monitor applications in both development and production environments. We use the integrated virtualization module built into Orion to monitor our virtual environment, as we are a VMware shop.


KR: Why did you choose SAM and what other products did you look at before narrowing down on SAM?

PS: We chose SolarWinds products because they are easy to implement and troubleshoot. We were actually able to set up in about an hour, and we have never had to reach out to SolarWinds for any issues in the last 4 years of owning the product. SAM is cost-effective software, and it has all the features we want. We also evaluated other products from CA and WhatsUp Gold, but ended up going with SolarWinds as it was fairly simple and straightforward. We also own NPM and NTA, and are able to monitor for issues easily with those as well.


KR: How was it before using SAM and how have things changed right now?

PS: SolarWinds is the only company we use for monitoring. Before SAM we didn’t know how to troubleshoot issues, and didn’t know where to go to find and solve them. Now we are able to identify performance issues more easily with SAM and save a lot of time and money. I also leverage thwack to find answers to anything I need from other IT pros.


Learn more about Server & Application Monitor

I work at a global automotive manufacturer with offices all over the globe, and one of the ways we use our network management system is capacity planning. For example, if I have a plant in Beijing, China, I can set an alert to let me know when a circuit goes over a predefined threshold, and when it does, I can start the process of ordering a new circuit. As you might imagine, the logistics of ordering a circuit in China are full of all kinds of red tape and take a while to put in place. I can use all the head start I can get, and Orion gives me the lead time I need to get things done.
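As a back-of-the-envelope sketch of how that head start works, a simple linear trend over recent peak-utilization samples can estimate when a circuit will cross its upgrade threshold. The sample data and threshold below are made up; a real NMS does this projection against stored historical data:

```python
# Hypothetical daily peak utilization samples for one circuit (percent).
daily_peak_pct = [52, 54, 53, 56, 58, 59, 61, 62]
threshold_pct = 80

# Ordinary least-squares fit of utilization against day number.
n = len(daily_peak_pct)
xs = range(n)
x_mean = sum(xs) / n
y_mean = sum(daily_peak_pct) / n
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_peak_pct))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

if slope > 0:
    # Day at which the fitted line hits the threshold, minus today.
    days_until = (threshold_pct - intercept) / slope - (n - 1)
    print(f"~{days_until:.0f} days until {threshold_pct}% at current growth")
```

Even a crude projection like this turns "the circuit is full" into "the circuit will be full in N days," which is exactly the lead time a slow procurement process needs.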


Another thing I really like the NMS for is monitoring manufacturing plants all over the globe, where the main job is building cars, not information technology. Any outage can cost thousands to millions of dollars, and the pressure is on. Orion helps me do my job, so I can help the workers do theirs. This way we can deliver enterprise-class IT support with a skeleton crew at the actual site.


Some network management systems are hard to use, cost hundreds of thousands of dollars, and require a PhD to figure out. Orion is so easy to use, and there is so much help from the online community: just click, input a bit of management information, and you’re up and running. It’s not all roses and chocolate, but you can be up and running in a few hours or a few days. This is in contrast to the few months or years demanded by overly complicated network management systems, whose well-paid consultants eat up the project’s budget.


One particular incident I can recall recently: the switching blades in our chassis switches turned out to have a manufacturing defect, one that only manifested itself above a certain temperature. When the defect hit, the blade just shut down. Production stopped, pagers went off, and phones rang; everyone wanted to know when the network would be back up, and why it went down. I got the blade RMA’ed to the manufacturer, and the problem was solved.


I then found out this was a widespread defect, and the only way to tell if a blade was affected was to send its serial number to the manufacturer. Network management to the rescue: I was able to gather the serial numbers from all the blades on all the switches globally. Let me tell you, hundreds of blades were part of this problem, and they are now being scheduled for replacement instead of us waiting for them to fail. The NMS saved my hide this time; a massive failure like that would not have been fun. Orion has paid for itself so many times over, it’s just a no-brainer, a necessary tool of the trade.


To summarize, network management is good for capacity planning, for monitoring sites without a full IT staff, and for network inventory, so you know how your network is put together. And finally, Orion is just so easy to use: set it up and let it do its job. I couldn’t imagine how I would do my job without it and stay sane.


It’s so much better to be proactive than reactive. While nothing is perfect, an NMS can make your job a lot easier and maybe even a bit of fun. It’s the shock absorber that helps you avoid the potholes of an IT career, so relax and enjoy; it’s going to be a smoother ride from here on out.


Too Many Tools

Posted by michael stump Apr 21, 2014

I'll make an assumption: if you're a Thwack user and you're reading this post, you've got an interest in systems and applications monitoring. Or maybe you just really want those points. Or that Jambox. Whatever's cool with me.


But if you're a tools geek, this post is for you.


Tool suites aspire to monitor everything in your infrastructure, from hardware, through networking and virtualization, right up to the user. But to be honest, I've never seen a single solution capable of monitoring an entire environment. Often this is not a technological problem: organizational structures encourage multiple solutions to a single problem (e.g., every team purchases a different tool to monitor its systems when a single tool, or a subset of tools, would suffice). Other times, tool sprawl is the result of staff or contractor turnover; everyone wants to bring the tools they know well with them to a new job, right?


Tool sprawl is a real problem for IT shops. The cost alone of maintaining multiple tools can be staggering. But too many tools creates its own problem: you'll likely end up monitoring some systems many times over, and will almost certainly miss others because of confusion over who is monitoring what. As a result, you'll have more tools than you need, and the job still won't be done.


How do you manage or prevent tool sprawl at work? Do you lean on a single vendor, like SolarWinds, to address every monitoring need that arises? Or do you seek out best-of-breed, point solutions for each monitoring need, and accept the reality that there is no single pane of glass?
