Network monitoring tracks the state of the network and primarily looks for faults. At the most basic level, we want to know if devices and interfaces are "up." This is a simple binary reachability test: your device is either reachable or not, either "up" or "down." However, just because a device is reachable does not mean there are no faults in the network. If a circuit is dropping packets, performance suffers, and the loss can make the circuit unusable even though it is "up." Time to stop thinking in terms of reachability and start thinking in terms of availability.
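To make the difference concrete, here is a minimal Python sketch of the lossy-circuit case. It is an illustration, not how any particular NMS works: the `ProbeResult` type, the function names, and the 2% loss threshold are all assumptions chosen for the example, and the probe counts would come from whatever ping-style test your tooling already runs.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of one batch of ping-style probes (hypothetical structure)."""
    sent: int
    received: int


def is_reachable(result: ProbeResult) -> bool:
    # Reachability view: any reply at all means the device is "up".
    return result.received > 0


def loss_percent(result: ProbeResult) -> float:
    # Percentage of probes that never got a reply.
    if result.sent <= 0:
        raise ValueError("at least one probe must be sent")
    return 100.0 * (result.sent - result.received) / result.sent


def availability_state(result: ProbeResult, loss_threshold_pct: float = 2.0) -> str:
    # Availability view: a reachable circuit can still be degraded.
    # The 2% default is an assumed threshold, not a standard value.
    if not is_reachable(result):
        return "down"
    if loss_percent(result) > loss_threshold_pct:
        return "degraded"
    return "available"


lossy_circuit = ProbeResult(sent=100, received=85)
print(is_reachable(lossy_circuit))        # True
print(availability_state(lossy_circuit))  # degraded
```

A reachability check reports this circuit as "up," while the availability check flags it as degraded. The right threshold depends on the service the circuit carries, so treat the default here as a placeholder.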
Availability is a service-oriented concept that asks, "Is the service this widget provides available to its users?" Is the service 100% available, or is it degraded in some way? Here are some examples of situations that simple reachability monitoring has difficulty detecting:
In the first two cases, you will probably hear about it from the end users. In the last two cases, you might not find out until something else changes in the network and causes a (possibly confusing) outage, probably along with a bunch of trouble tickets.
Are you thinking in terms of availability or reachability? Is your NMS configured to match your mindset?