Comments
-
@"ahbrook" Here are a few examples I use:
Measuring an increase over a five-minute span:
/api/v1/query?query=increase(<some metric>{cluster=~"<some cluster name>"}[5m])
This one will give you a per-second rate over a five-minute period:
/api/v1/query?query=rate(<some metric>[5m])
Depending on the metric, you can also grab…
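For anyone pasting these into a poller or script, the PromQL expression generally needs to be URL-encoded first. Here is a minimal Python sketch of building the two query URLs above; the server address, metric name, and cluster pattern are made-up placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return an instant-query URL for the Prometheus HTTP API (/api/v1/query)."""
    # urlencode percent-encodes braces, quotes, brackets, and parentheses.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical server, metric, and cluster names, for illustration only.
base = "http://prometheus.example:9090"
increase_url = build_query_url(
    base, 'increase(http_requests_total{cluster=~"prod-.*"}[5m])'
)
rate_url = build_query_url(base, "rate(http_requests_total[5m])")

print(increase_url)
print(rate_url)
```

The resulting strings can be handed to whatever HTTP client or API poller is doing the collection.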
-
We fall into this group as well and would like to participate.
-
A little update on this: I've started using the new SAM API poller to query the Prometheus API directly. While there are limits to how many metrics I can collect, it's definitely a move in the right direction.
-
What I suspect you're looking for is going to come down to the component type that you're using. There are a fair number of tables that store historical statistic data, but which one to use depends on the component. For example: To query for historical data from something like an ODBC component, you can look in…
-
I suppose it depends on what the site looks like when it's down. I've run into cases where a site wasn't working but still returning HTTP 200 response codes. I think Solarwinds will always assume the page is up at that point. You could try adding a text match if there is a word on the page you can expect to be there when…
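To illustrate the text-match idea outside of the product, here is a small Python sketch. It only shows the logic (status code plus an expected word); the function name and sample pages are hypothetical, and it is not how the HTTP monitor is implemented internally.

```python
def looks_up(status_code: int, body: str, expected_text: str) -> bool:
    """Treat the page as up only if it returns 200 AND contains the expected text."""
    return status_code == 200 and expected_text in body


# Hypothetical responses: a healthy page, and an error page that still returns 200.
healthy = "<html><body><h1>Welcome to the portal</h1></body></html>"
broken = "<html><body>Service temporarily unavailable</body></html>"

print(looks_up(200, healthy, "Welcome"))  # healthy page passes
print(looks_up(200, broken, "Welcome"))   # 200 response, but the text is missing
```

A status-code check alone would report both pages as up; the text match is what separates them.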
-
We upgraded to 12.0.1 last month and it looks to have fixed the problem. Thank you Solarwinds Dev Team!
-
If the terminal can support something like a bash script, you should be able to drop the Perl portion off the script monitor and just execute the line, but as was mentioned earlier, you'll need to return a numeric value for the monitor to work properly. I do something similar to parse netstat output and check for read only…
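As a rough sketch of the "return a numeric value" part, here is the same idea in Python: count matching lines in some command output and print the number. The `Statistic:` and `Message:` output lines are how I recall SAM script monitors expecting results, so treat that format as an assumption and check it against the documentation for your version. The sample text and the `LISTEN` pattern are made up for the example.

```python
def format_result(text: str, needle: str) -> str:
    """Count lines containing `needle` and format the result for a script monitor."""
    count = sum(1 for line in text.splitlines() if needle in line)
    # Assumed output format: a Message line and a numeric Statistic line.
    return f"Message: {count} line(s) matched {needle!r}\nStatistic: {count}"


# Hypothetical netstat-style output; a real script would capture the command's output.
sample = """tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 10.0.0.5:22 10.0.0.9:51234 ESTABLISHED
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN"""

print(format_result(sample, "LISTEN"))
```

The important part is that the last thing the monitor sees is a plain number it can threshold on.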
-
If this is an HTTP monitor component, would adding the hostname/FQDN to the header field suffice? I have to go that route sometimes for our web front ends that host multiple sites.
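For reference, this is roughly what the header trick does at the HTTP level, sketched with Python's standard library. The IP address and site name are placeholders, and the request is only built here, not sent.

```python
from urllib.request import Request


def request_for_site(address: str, site_hostname: str) -> Request:
    """Build a GET for a specific front end, naming the wanted site in the Host header."""
    return Request(f"http://{address}/", headers={"Host": site_hostname})


# Hypothetical front-end address and site name.
req = request_for_site("192.0.2.10", "intranet.example.com")
print(req.full_url, req.get_header("Host"))

# To actually send it: urllib.request.urlopen(req, timeout=10)
```

The connection goes to the front end's address, while the Host header tells the web server which of its hosted sites should answer.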
-
Following up on what I've tried on my side. I suspect this had something to do with the Solarwinds Cortex service as it also had a massive memory footprint. I traced this back to disk queuing issues on the Solarwinds SQL service and it appears this had cascaded its way back to the individual agents if that's even possible.…
-
Would setting the alert trigger against the "Process Instance Count" field on the process monitor component work for you? I think that corresponds to the number of separate PIDs running for a given process.
-
Is this set up as an external node or ICMP? The way I understand how the DHCP option works for node polling is Solarwinds has to be able to successfully resolve the hostname by pinging it. I don't believe the external nodes will do this since they don't check status by default. On a possibly related note, you'll also need…
-
Yes, I believe that is the case.
-
I applied 2018.4 this morning and it doesn't look any better. The leak seems slower but it's still present. Oddly enough, it seems to only be really affecting my servers in AWS that are hosted in Europe.
-
I've encountered some similar problems in the past where some custom performance counters just wouldn't work. You might also try using the WMI Monitor and use a WQL query to see if that will return the right value.
-
Had the same problem that came back when we upgraded to WPM 2.2.1 and NPM 12.0.1. There's a workaround that was given to me back when I first ran into it prior to this upgrade. If you're interested, here are the instructions that were provided to me. We haven't observed any ill effects from this, but YMMV. Here are the…
-
I'm seeing the same problem with the High CPU alert after upgrading from 11.5 to 11.5.2 this morning. Some of the servers have never gone above 25% CPU usage, and in one case the alert fired for a server that has been powered off for three days. I've temporarily moved the alert off the threshold flag and it seems to have cleared the problem up…
-
FWIW, I had similar issues. Adding single users worked fine but groups would always fail. After some trial and error, I was able to get the groups to work by changing the group claim from "Security Groups" to "Groups Assigned to the Application". The groups were created as security groups, so I'm not sure why it wasn't…
-
A question about the disk space portion: is it your intent to alert on all volumes under 10% and under x available bytes? If so, you should be able to move the available space to the first condition and drop the second condition. May I also ask what the logic is behind combining these into a single alert? Separating…
-
The first thing that comes to mind is whether there are any command line filters set in the component, or possibly permissions on the server itself. I ran into an issue a while back where the UID mapping on the Linux host was messed up and Solarwinds couldn't pull a process list. A quick test would be to check if you can see…
-
This was confirmed as a bug by support.
-
Off the top of my head this is "sort of" possible. You'll likely need to define cross-account roles in each sub-organization to delegate access to an IAM user within the root account. See "Tutorial: Delegate Access Across AWS Accounts Using IAM Roles" (AWS Identity and Access Management). It would still require you to define each…
-
The 'Edit Alert' page. The trick to reproducing this is to set the 'Group By' dropdown to [No Grouping] prior to editing an alert.
-
Following up on this post. I applied hotfix 2 to one of my installations and I'm happy to report the initial results have been very positive. CPU and Memory charts have started populating again and memory usage on the Windows Agents appears to be under control. I'm in a cool down right now to see how this one works before…
-
This last 12.4 update has been a handful for me. 1.) One of my installs has started having invalid subscription issues, which have caused some stability issues. It was crashing at least once a day for a while. I was able to mitigate it somewhat by moving nodes away from Agent polling. The problem is still there, but it's been…
-
If I understand what you're describing, I believe that's how it's supposed to work. The 'show list' link will display the objects in your environment that meet the trigger condition. If you wanted to test the alert, you could force the transaction to go down and leave the trigger condition as is to see if it fires the…
-
I don't think the credentials on the component are getting passed through to the script. Get-Counter may be one of the cmdlets that doesn't allow for credentials to be specified and will only send whatever account was used to kick off the command (i.e. the one used to start the Windows service for the poller). It kind of…
-
The HTTP/HTTPS component will do a GET and check for text matches. It does a pretty good job of keeping tabs on the response codes and I use it for some endpoints that will present a text change or HTTP error code when something is wrong. In cases where you have to collect counters or other text, I would try using one of…
-
A great many of our application monitors are non-standard as well. For most of them, we've gone with either WMI or SNMP process monitors (depending on how the node is polled) and a second port monitor component in case the process hangs around but isn't accepting connections.
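The port check half of that pairing boils down to "can I open a TCP connection?". Here is a minimal Python sketch of that test; the function name is my own, and the demo uses a throwaway local listener so it doesn't depend on any real service.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


# Demo against a temporary local listener on an OS-assigned port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(port_accepts_connections("127.0.0.1", demo_port))
listener.close()
```

A process can show up in the process list and still fail this check, which is the situation the second component is there to catch.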
-
Have you tried: ${N=SwisEntity;M=ComponentAlert.StatisticData} I think that's the one I use for the ODBC components in my alerts.
-
Sorry for the late reply. I've been able to get away with just restarting the services.