In my last post, we discussed implementing voice into a new environment, now I figured we would discuss troubleshooting that environment after it's initially deployment. Only seems natural right?
Now, due to its nature troubleshooting VoIP issues can be quite different then troubleshooting your typical data TCP/UDP applications and more often then not we will have another set of tools to troubleshoot VoIP related issues. (And hopefully some of those tools are integrated with our day-to-day network management tools)
IP SLA/RPM Monitoring*:
This is definitely one of my favorite networking tools, IP SLA monitoring allows me see what the network looks like from a different perspective (usually a perspective closer to the end-user). There are a few different IP SLA operations we can use to monitor the performance of a VoIP network. UDP Jitter is one of those particular operations that allow us to get a deeper insight into VoIP performance. Discovering variances in jitter could point to an incorrectly sized voice queue or possible WAN/transit related issues. Have you ever considered implementing a DNS or DHCP IP SLA monitor?
*Keep in mind IP SLA monitoring can also be used outside of monitoring the VoIP infrastructure, other operations support TCP/HTTP/FTP/etc protocols so you can get the user's perspective for other mission critical applications.
Another great tool to have in the arsenal. NetFlow is an easy way to see a breakdown of traffic from the interface perspective. This allows you to verify your signaling and RTP streams are being marked correctly. It also provides you with the ability to verify other applications/traffic flows are not getting marked into the Voice queue unintentionally. Different vendors can run their own variations of NetFlow but at the end of the day they all provide very similar information, many of the newer NetFlow versions allow more granular control of what information is collected and where it collected from.
While this one is not an actual tool itself. Keeping an eye on your MOS scores can quickly identify trouble spots (if the end-users don't report it first that is) by identifying poor quality calls.
Depending on what you trying to troubleshooting you can definitely get some great insight from polling some more specific information using the UnDP:
(Some of these will only be manageable for smaller locations/deployments)
- Number of registered phones - displayed in a graph format so you can easily drops in registered phones
- CME Version - Specific for Cisco routers running CME, but keeping track of the CME versions could help isolate issues to a specific software set.
- Below are a few others I have created as well, below is a sample VoIP dashboard.