RichardLetts

Not enough information here: what type of devices? E.g. I have 13,000 wireless access points, but only 20 wireless controllers -> only 20 elements being polled. Some of our switches have two monitored interfaces (3 elements), some of our routers have 1,000 VLAN interfaces (1000 elements), so for NPM count elements = number…

in Server HW and SW specs for 20000 device environment Comment by RichardLetts February 2016

As aLTeReGo mentioned, there is an OOTB alert for node rebooted, but that actually looks for the restart of the SNMP agent, which is not useful if your admins restart the daemon/service. If the system supports the Host Resources MIB (and the NET-SNMP commonly found on Linux and Solaris does) then look at: hrSystemUptime.UnDP and…

in OS Restart Monitoring Comment by RichardLetts April 2013

Note: though you think the node is not in NCM, it may still be there... I've found that if I delete a node in NPM it stays in NCM, but is unfindable through the web UI. I have to go into the database and manually delete the entry with the database manager. Here is the query I use: SELECT n1.agentip, n1.nodeCaption from…

in Finding out who removed a Node from NPM, when there is no log in the events, and node is not present in NCM anymore Comment by RichardLetts March 2014

1. Avocent. ACS 6000 Advanced Console Server | Serial Consoles | Avocent 2. 8 and 16 ports (there are larger units, but we do not need that many ports) 3. not much 4. varies by model -- it's good to be able to mix and match the hardware for the location. 5. varies by model (~$3.5K) 6. See above. In our datacenters the…

in What serial-to-Ethernet terminal service hardware do you use to provide remote access to switches & routers when SSH isn't available? Comment by RichardLetts February 2016

If this is a 'carrier ethernet' type connection then we use EXFO equipment to verify performance at Layer 2. http://www.exfo.com/products/quality-service-assessment/service-assurance-platform/bu5-verifier-series/bv-10 We've found that there can be unexpected bottlenecks inside vendor networks, and sending them an ITU-T…

in WAN speed testing tool needed Comment by RichardLetts January 2017

How far away is your poller from the device being monitored? If I am reading the graph correctly you have a round-trip time of more than 2 seconds on occasion. The default ICMP ping timeout is 2500ms -- the command line ping may have a longer timeout (you have not included the output of ping so I can only guess as to what…

in False Packet Loss Comment by RichardLetts February 2015
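
The timeout point in the False Packet Loss comment can be illustrated with a small sketch. It assumes a poller that counts any reply slower than its timeout as lost, using the 2500 ms default quoted in the comment; the function name and the round-trip samples are hypothetical and only there for illustration.

```python
def apparent_loss(rtts_ms, timeout_ms=2500):
    """Return the percentage of probes a poller would count as lost.

    A reply that arrives after timeout_ms is counted as lost even
    though the packet was delivered. None means no reply at all.
    """
    if not rtts_ms:
        return 0.0
    lost = sum(1 for rtt in rtts_ms if rtt is None or rtt > timeout_ms)
    return 100.0 * lost / len(rtts_ms)


# Hypothetical samples (ms): some slow replies, nothing actually dropped.
samples = [120, 2100, 2600, 180, 3100, 150, 140, 2200, 130, 160]
print(apparent_loss(samples))                   # with the 2500 ms timeout
print(apparent_loss(samples, timeout_ms=4000))  # with a longer timeout
```

With a long enough timeout the same samples show no loss at all, which is one way a command line ping and the poller could disagree.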

Doesn't the standard one work? The Brocades I have seem to report their utilization just fine. Also, which model of Brocade switch, what type of interfaces, and running in what mode? These all affect what gets exposed in the Brocade MIB (some vendors have very poor MIBs, it's up to us as product buyers to complain to them…

in Does anyone have a sample alert for monitoring ports on a Brocade switch? Comment by RichardLetts January 2013

What manufacturer? Normally it's monitored through the controller.

in Need some help with setting up alerts for AP's Comment by RichardLetts September 2014

Check out Re: Alerts on Groups; there is a custom SQL query in there to check if the alerting node is a member of a group... AND ( nodes.nodeid IN ( SELECT cms.entityid FROM containers c INNER JOIN containermembersnapshots cms ON cms.containerid=c.containerid WHERE cms.entitytype='Orion.Nodes' AND c.name='Firewalls') )…

in Advanced Alert Question: How to create an Advanced Alert for nodes that belong to certain groups. Comment by RichardLetts February 2015
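
The group-membership filter quoted in the comment above can be tried out against a toy database. This is only a sketch: it uses an in-memory SQLite database with a made-up, minimal version of the `nodes`, `containers` and `containermembersnapshots` tables (the real Orion schema has many more columns), and the node captions and the `nodes_in_group` helper are hypothetical.

```python
import sqlite3

# Toy stand-in for the Orion tables; only the columns the filter touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (nodeid INTEGER PRIMARY KEY, caption TEXT);
CREATE TABLE containers (containerid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE containermembersnapshots (
    containerid INTEGER, entityid INTEGER, entitytype TEXT);
INSERT INTO nodes VALUES (1, 'fw-a'), (2, 'sw-b'), (3, 'fw-c');
INSERT INTO containers VALUES (10, 'Firewalls'), (20, 'Switches');
INSERT INTO containermembersnapshots VALUES
    (10, 1, 'Orion.Nodes'), (10, 3, 'Orion.Nodes'), (20, 2, 'Orion.Nodes');
""")

# Same IN (...) sub-select as in the comment, with the group name bound
# as a parameter instead of the literal 'Firewalls'.
GROUP_FILTER = """
SELECT nodes.caption FROM nodes
WHERE nodes.nodeid IN (
    SELECT cms.entityid
    FROM containers c
    INNER JOIN containermembersnapshots cms
        ON cms.containerid = c.containerid
    WHERE cms.entitytype = 'Orion.Nodes' AND c.name = ?
)
ORDER BY nodes.caption
"""


def nodes_in_group(name):
    """Return captions of nodes whose ID appears in the named group."""
    return [row[0] for row in conn.execute(GROUP_FILTER, (name,))]


print(nodes_in_group("Firewalls"))
```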

I've posted on this topic extensively, and there are several ideas out there for improving the alert engine. The trap and syslog processors work in real time on incoming messages, whereas the alert manager runs asynchronously based on what is in the database. You have to think hard about how this is going to work together…

in Alert / Event from Syslog or Trap Comment by RichardLetts October 2013

Moved to 11 [from 10.6.1] because we really want NCM 7.3.1 [and NPM 11.x is a prerequisite]. 12 looks to be a big change, and I'm hoping we'll get one of the betas up on our test server once we have NCM 7.3.1. We have 10 SolarWinds servers in our install [five pairs running the FOE].

in Large scale deployments waiting for 12 or moving on 11? Comment by RichardLetts August 2014

BUMP... Heard today we might be adding some ADVA equipment [of some kind] to our network...

in How to Backup Config of ADVA TDM? Comment by RichardLetts January 2015

We use iperf for performance testing: when commissioning circuits we create a private network and protect the rest of the traffic from it to ensure consistent and meaningful results. We have bulk transfer tests, and sweep-ping tests during bulk transfers. Telcos sometimes fail to deliver the committed bandwidth on the first…

in Feature Request: NPM:: Speed Test Comment by RichardLetts October 2014

Which system time? The NPM server or a network component? If the latter, and the device supports the Host Resources MIB, then use UnDP to pull hrSystemDate. Why do you need this information? What are you going to do with it? I'm going to point out that not every device supports that MIB, and it would be far better to make sure…

in Getting system time with NPM Comment by RichardLetts July 2015

Yes, you can do this, but I'm not a PowerShell programmer; I do everything with Perl & SQL (and Python once I get an editor that copes with a language where white space is significant -- whoever came up with that was having a throwback moment to FORTRAN) so I can't help with the PowerShell part. I see there is an upcoming…

in Help with Powershell Script to Bulk Change Polling Method from SNMP/ICMP to WMI Comment by RichardLetts January 2016

What do you mean by: 'but I want to monitor their secondary internet connection.' -- what are you trying to find out? [Excuse the poor Windows Paint picture, but I work a lot better with pictures in cases like this] If you want to know when traffic is on A instead of B then I would put a custom property on the interface A,…

in Interface Alerting Comment by RichardLetts May 2018

You should not need to turn off the firewall, but you may need to punch holes through it. What is displayed when you click on each of the red dots? E.g. a working MSMQ TCP shows "Found process 'mqsvc' listening on TCP port 1801." The diagnostic message there can be helpful. Note: some of the 'errors' come from SolarWinds…

in Solarwinds Active Diagnostics: Network Ports failing Comment by RichardLetts November 2015

As far as I can see, only the SolarwindsInformationServicev3 and Solarwinds.Alerting.Service are 64-bit. The others are all 32-bit.

in 64 bit process on NPM 11.5.2? Comment by RichardLetts November 2015

mibs.cfg has no relationship to machineType -- those values appear to be hard-coded into the application. After 6+ years I still don't know why machineType has not been pulled out into a separate table that we can edit. You can only fix this by opening a support case and getting a buddydrop. OR you can define your own F5 device…

in F5's - Unknown Machine Type Comment by RichardLetts March 2016

Yes. Note: this has absolutely nothing to do with SolarWinds, and is purely a matter of you configuring your network and Windows network properly. See: Which IP address is used as the source? - SolarWinds Worldwide, LLC. Help and Support. Basically, if the SolarWinds server has an IP address physically on the same subnet as the…

in Can Orion primary engine work on 2 IP addresses? Comment by RichardLetts June 2018

We put the ticket number in the Acknowledgement of the alert and use a custom SWQL query: SELECT NodeID, Notes FROM Orion.AlertStatus INNER JOIN Orion.Nodes ON Nodes.NodeID = AlertStatus.ActiveObject AND AlertStatus.ObjectType = 'Node' WHERE Nodes.NodeID = ${NodeID}

in Visible Ticket number or comment in Node Down Resource Comment by RichardLetts October 2013

Something like this: WHERE ( apm_alertsandreportsdata.componentname = 'Service 1A' AND apm_alertsandreportsdata.componentstatus = 'Down' ) AND EXISTS (SELECT 1 FROM apm_alertsandreportsdata apm2 WHERE ( apm2.componentname = 'Service 1B' AND apm2.componentstatus = 'Down' )) should generate an alert on 1A being down only if…

in Advanced Alerting - Auto correlation Comment by RichardLetts November 2013
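
The EXISTS condition in the auto-correlation comment above can be checked against a toy table. This sketch assumes an in-memory SQLite table named `apm_alertsandreportsdata` with just the two columns the condition uses; the real view has more columns, and the `alert_rows` helper is hypothetical.

```python
import sqlite3

# The WHERE clause from the comment, wrapped in a SELECT so it can run.
QUERY = """
SELECT a.componentname
FROM apm_alertsandreportsdata a
WHERE (a.componentname = 'Service 1A' AND a.componentstatus = 'Down')
  AND EXISTS (
      SELECT 1 FROM apm_alertsandreportsdata apm2
      WHERE apm2.componentname = 'Service 1B'
        AND apm2.componentstatus = 'Down')
"""


def alert_rows(statuses):
    """Run the condition against a {component name: status} mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE apm_alertsandreportsdata "
        "(componentname TEXT, componentstatus TEXT)"
    )
    conn.executemany(
        "INSERT INTO apm_alertsandreportsdata VALUES (?, ?)",
        list(statuses.items()),
    )
    rows = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return rows


print(alert_rows({"Service 1A": "Down", "Service 1B": "Down"}))
print(alert_rows({"Service 1A": "Down", "Service 1B": "Up"}))
```

In this toy setup the query returns the 1A row only when a 'Down' row for 1B exists as well.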

Follow this document: Rebuild the Orion Website - SolarWinds Worldwide, LLC. Help and Support. There is nothing that needs to be kept from the inetpub\Solarwinds website -- re-running the config wizard after following the steps should fix it right up.

in Orion NPM webconsole will not start Comment by RichardLetts September 2017

Note for people doing this: you should probably supply multiple SNMP walks for the same device-- one for the core instance and others for [some of] the VRF instances showing what attributes are in the 'root' SNMP instance, and those parts which are global/ filtered / specific to the VRF instance. e.g. Global: system.*…

in Having Juniper and want to see VRFs in NPM? Comment by RichardLetts September 2013

CPU load can never be negative: all possible valid values are >=0, so the first condition is always true.

in CPU Alert only fires once Comment by RichardLetts January 2015

I was using groupofgroups as view limitations -- basically they are broken because the query that checks to see what should be displayable on the page is huge. I think in previous versions the calculation was done in SWIS by evaluating the container snapshot membership; instead this version appears to be re-evaluating the…

in NPM 11.5.2 GroupofGroups limitations breaking Information Service v3 Comment by RichardLetts October 2015

In my environment any discards are significant -- I don't really care what the percentage is (1% discards on a 100Gbps link is 1Gbps of network traffic not reaching its destination). Discards normally indicate some fairly significant engineering issue (e.g. bandwidth mismatch) that needs to be addressed.

in Discard Percentage Alert Comment by RichardLetts June 2016
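
The arithmetic in the discard comment above can be written out as a one-line calculation. This sketch assumes the discard percentage is applied to a fully utilized link, as in the comment's own example; the function name is hypothetical.

```python
GBPS = 1_000_000_000  # bits per second in one gigabit per second


def discarded_bps(link_bps, discard_percent):
    """Traffic lost to discards, in bits per second, at full link rate."""
    return link_bps * discard_percent / 100


# The example from the comment: 1% discards on a 100 Gbps link.
print(discarded_bps(100 * GBPS, 1) / GBPS, "Gbps discarded")
```

Even a small percentage is a large absolute amount of traffic on a fast link, which is why a fixed percentage threshold may hide a real problem.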

The ACL on the WebImageCache may be improperly ordered and cannot be repaired by the PermissionChecker tool. You will need to open the permissions editor from the Windows folder browser and correct the order before re-running the permissions checker.

in erro to do update NPM Comment by RichardLetts August 2016

Look at using the Dynamic alert thresholds in 10.7 (I don't think you can use the 95th percentile figure as that is dynamically calculated on the graphs from the reported time period). I'm looking at this as we're establishing more metrics for the correct/expected operation of the network as part of our ITIL maturity…

in Alert on node latency over 95th percentile Comment by RichardLetts January 2016

You need to define rules in the SNMP trapviewer on the server. You can color the trap, or tag it with a value, or do something else with it...

in Replacing SNMP Trap Content with Custom Text Comment by RichardLetts June 2015
