Troubleshooting Issues – Administrators Play Network Gumshoe
Network admins definitely play the role of Network Gumshoe. Dealing with daily problems like bandwidth hogs, IP conflicts, rogue users, and more, administrators spend a considerable amount of time investigating and resolving network issues. But are they really equipped for this kind of troubleshooting? Is there a specific troubleshooting process for finding problematic users/devices while ensuring minimal downtime?
Employees often bring in devices pre-configured with IP addresses from prior Internet connections (at home or elsewhere). Such a device can cause an IP conflict with a critical application server and interrupt its services. In other cases, IP conflicts happen when a network admin accidentally assigns a duplicate IP address, or when a rogue DHCP server operating in the network hands out IP addresses at will. Bandwidth issues crop up in the presence of a YouTube hog, or when someone misuses company resources for unofficial purposes. Finally, rogue users who’ve somehow gained entry to the network may attempt to access confidential data or restricted networks. All these frequently occurring incidents threaten to upset any smoothly functioning network.
In any case, the primary goal of a network admin is to fix an issue with minimal downtime and take steps to ensure that it doesn’t happen again. For issues associated with problematic users/devices in a network, here are four simple steps to follow when troubleshooting:
- Quickly identify and investigate the problematic user/device.
- Locate the problematic user/device.
- Immediately remediate the problematic user/device.
- Take steps to prevent the same situation from happening again.
- To quickly detect problems in the network, it’s best to have a monitoring tool in place. Depending on which specific area of the network needs monitoring, admins can set up timely alerts and notifications. Specific monitoring tools are available to help, including those that let you see the up/down status of your devices, IP address space usage, user/device presence in the network, etc. Once the bandwidth hog, IP conflict, or rogue DHCP server is identified, the first step of the troubleshooting process is complete.
- The next critical step is determining whether the user/device in question actually caused the problem. You need to look at detailed data that reveals the amount of bandwidth used, who used it, and for what application. You should also look at details on devices in an IP conflict and determine what type of conflict it was, look for the presence of unauthorized devices in the network, and so on. This investigation should also provide data on the location of the user/device in the network, including details like switch port information, or the Wireless Access Point (WAP), if it’s a wireless device.
- The third step is remediation. Whatever caused the network interruption needs to be fixed. Knowing the location of the problem, as established in the previous step, is very helpful in taking immediate action. Admins can either physically locate the device and unplug its network access, or use tools that enable the remote shutdown of devices. The remote option is especially helpful for admins working with networks spread over large areas or multiple locations. The critical point here is that network access needs to be revoked immediately.
- Finally, take steps to prevent the same problem from happening again. If a problematic user/device is to blame, make sure these systems are either blocked from the network or trigger a notification when they try to rejoin it. Create checkpoints and monitoring mechanisms so that you can take proactive measures and prevent unauthorized users from entering your network.
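
To make the detection and investigation steps a little more concrete, here is a minimal Python sketch. It is not tied to any particular monitoring product: the function names, the `(ip, mac)` and `(host, application, bytes)` record shapes, and the `port_map` lookup table are all hypothetical stand-ins for whatever data your own tools export (ARP tables, DHCP logs, flow records, switch port mappings).

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC.

    `observations` is an iterable of (ip, mac) pairs. The input shape is an
    assumption; real data might come from ARP tables or DHCP logs.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(mac.lower())  # normalize case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


def top_talkers(flows, limit=3):
    """Rank hosts by total bytes transferred, largest first.

    `flows` is an iterable of (host, application, byte_count) records,
    a simplified, hypothetical stand-in for flow-export data.
    """
    totals = defaultdict(int)
    for host, _application, byte_count in flows:
        totals[host] += byte_count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]


def locate(mac, port_map):
    """Look up the switch port or access point where a MAC was last seen."""
    return port_map.get(mac.lower(), "unknown")


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    observations = [
        ("10.0.0.5", "AA:BB:CC:00:00:01"),
        ("10.0.0.5", "aa:bb:cc:00:00:02"),
        ("10.0.0.9", "aa:bb:cc:00:00:03"),
    ]
    flows = [
        ("10.0.0.9", "video", 900),
        ("10.0.0.5", "email", 100),
        ("10.0.0.9", "web", 50),
    ]
    port_map = {"aa:bb:cc:00:00:02": "switch-2/port-14"}

    print(find_ip_conflicts(observations))
    print(top_talkers(flows, limit=1))
    print(locate("AA:BB:CC:00:00:02", port_map))
```

Keeping detection, ranking, and location as separate small functions mirrors the steps above: each one answers a single question (is there a conflict, who is using the bandwidth, where is the device), and the answers feed directly into remediation.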
What troubleshooting processes do you follow in your organization? Feel free to share your experiences, which fellow network admins might find useful.