kb1lxm

Comments

  • For me it is usually the same four or five processes that keep showing as "process not found" by APM, also at random intervals. While it has been doing it frequently the past two weeks, it seems to have stopped since the 2nd. Not something that I can cause to happen on demand.
  • Actually noticing this too, sometimes it will say the process is running even though the APM says it's "Down". Sometimes I do catch it and actually see the process running in the chart though.
  • Just opened up a ticket on this issue, Case # 345781. Once again had no issues purging them; it's now more a question of why this is happening.
  • The Linux boxes have two dual core processors, so I guess what I'm seeing makes sense, however sometimes for a particular process it would read as 1069% utilization. If Solarwinds is simply passing along information from the system, could the initial reply from the system be formatted differently than expected? The…
  • Is it not possible to set up a loop that goes out and returns each row with a RawStatus of 3 until it hits the end? Is that a SQL limitation or an Alert limitation? I only ask because I have to do something similar but with hard drive status.
  • jrich, Yeah I ran into an issue with monitoring cluster Virtual Resources behind a firewall. Here's the deal: when you send an SNMP query to a cluster resource IP, it will get picked up on that IP. However, because of how the binding works, the SNMP Agent on the cluster node is what will reply. So we have a clustered SQL…
  • Well I set up a report that I get every morning that shows me which volumes are "unknown". This usually helps me find the following problems: 1.) Servers where the SNMP service has failed/hung 2.) Servers where the SNMP settings are misconfigured 3.) Misconfigured firewall rules 4.) Servers that are hung or locked up 5.)…
  • Hmm, and two months later I have another 4.2 GB of these orphaned files in my Temp directory. While I could just clear these out, I would really like to understand why they are appearing in the first place. If the job engines are crashing randomly, could that be a factor?
  • The IBM Director can get those items especially since I can see alerts for low disk space in the IBM Director Console (It has the default thresholds that we don't use.) We monitor the blades individually with Solarwinds, using SNMP to get the CPU/Memory/Network Interface/and Disk Space. We really only use IBM Director for…
  • Actually we have been doing that, I have a Process Monitor checking to see if the application is running and I am using the Event Log Monitor to look for any app crashes or errors. It would be nice to be able to tell if the service account is no longer connected as a way to explain the "Why it's down" and not "What is…
  • It's not a firewall issue. We generally don't have firewalls between the remote sites and the Solarwinds servers so in these cases they shouldn't be interfering. The Firewall Rule for Solarwinds also includes all three polling engines.
  • Thanks, I'm sure you can come up with something more elegant than I. After playing with NMAP for a little bit I've come up with one way to do it. I also have a script that parses a text file looking for a string and, if it sees it, quits with the appropriate exit code/statistic/message combo. So my…
  • Moderator please delete this double post.
  • Stopped all the Solarwinds services and was able to delete the 4 GB of folders with no issues. Thanks for the help!
  • I'm not sure. Here's my setup: the VMware box is currently running ESX 3.5 Update 2, and I am also running NPM 9.1 SP5. The Validated Answer is geared towards older versions of ESX and people using slightly older versions of NPM.
  • I am seeing this issue as well with SNMP Process Monitors on our AIX Servers. We're running NPM 9.5.1 and I'm seeing multiple processes reporting that they are using 99% of the CPU, and there are six of them doing it. We're running IBM SNMPBASE, 3.1.03-v3, Feb 11 2000, Compiled Jun 22 2009 12:00:35 but we also see this…
  • Jiri, I meant the components side of the page, the right hand side where I see the Component Name, Port (if applicable), statistic, and cpu/memory stats. This is where I see duplicate entries with different values. I only had this particular template assigned to one server and am seeing the duplicates. As a test I just…
  • So Solarwinds is set up to Identify the operating system of the device and use the appropriate OID? Using the previously mentioned OID with various tools I can see volumes on my Red Hat, Windows, and HPUX systems. Would just like to know which ones Solarwinds is looking for and how it determines what type they are.
  • Also, if you click "List Resources" in node management and it says something to the effect of the system "may be down or unavailable", chances are you just need to restart the SNMP Service on the customer's servers.
  • Jspanitz, If you take one of the nodes and select "List Resources" what happens? Do you get an error saying the node is down or not responding to SNMP? You did go to "Edit Settings" and change the community string on these nodes right? Orion has no clue if you change the community string on the server. You have to tell it…
  • Hmm... isn't the "Node Description" and "Machine Type" information simply gathered from the SNMP Agent on the device itself? Sounds like you might need to reach out to the device vendor.
  • Well you could try a TCP Port Monitor on 3389 to check to see if the MS RDP Port is open. Next time it locks up, try the following command from your desk: "telnet servername 3389". That will try to connect using Port 3389; if you get a connection then the server is listening, but if it fails you can use this as a trigger.…
  • All, after I cleared out the files the last time, we set our antivirus software not to scan that particular folder anymore. This seems to have stopped the problem. We also excluded it from backups.
  • With either UnDP or the SNMP Poller in APM look at OID 1.3.6.1.2.1.33.1.6.1. This is the OID for the number of alarms present on the Liebert device. I have used this for the GXT and the 610 series Liebert UPS with no problems.
  • Actually I wonder if maybe Solarwinds isn't recognizing the response it gets back from an SNMP Query. I notice that in VMware, the Win 2k8 R2 boxes are being reported as "Windows NT" on the ESX 3.5 Hosts and technically this isn't wrong because Win 2k8 R2 is in the NT family of OS. Anyone know the OID that Solarwinds uses…
  • We monitor the Cisco Switches in our Blade Chassis via SNMP so we can get their health status and whatnot. We do ICMP Ping on the Advanced Management Modules and are looking into SNMP custom pollers for the various aspects of it. We also use SNMP Traps to get any realtime errors forwarded by the AMM. We do monitor the blades…
  • So then what does the TransformExpression field do and why is it there?
  • I've been having this same issue with the MSA 500, MSA2012SA and Spectra Logic's nTier500, looks like a lot of companies are buying up dotHILL devices and rebranding. I reached out to HP support and the only thing I was able to get from them are the trap MIBs. I'm not sure on how HP Insight monitors these, but I also did…
  • Well I think part of the issue is how the reboot alert is written. Usually it's: "If Lastboottime has Changed, trigger alert", which works great most of the time. However, when you unmanage a node it stops polling, which means it's no longer tracking the lastboottime. So when the node is remanaged it sees that the…
  • Getting similar issues with the Windows Process Monitor as well. I'm trying to monitor w3wp.exe on a server, but the component keeps going into "unknown status" with "unexpected error occured. Invalid class". I guess first of all, what WMI call does Solarwinds use to determine if the process is running? Does it use Win32_Process in…
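
For anyone who wants to script the RDP check mentioned above (the "telnet servername 3389" test), here is a minimal Python sketch of the same idea. It only tests whether a TCP connection can be opened, which is what the telnet test does. The function name port_is_open and the 3-second timeout are my own choices for illustration, not anything Solarwinds provides, and "servername" is a placeholder.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it mirrors "telnet host port" by attempting a plain
    TCP connect and reporting whether anything is listening.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling port_is_open("servername", 3389) returns True when the server accepts the connection and False otherwise, so the result can be turned into whatever exit code or trigger your monitor expects. A successful connect only shows the port is listening, not that RDP logins actually work.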