mesverrum · Observability Architect · ✭✭✭✭✭

Comments

  • The way that OOTB alert checks for a "reboot" is by watching for a time change on a sysUpTime counter. Unfortunately those time changes can happen for many reasons besides a reboot. For Windows servers, if it were my environment, I would disable that alert and trigger on Windows event log ID 6008 instead.
  • That's not how SolarWinds licenses work. A UPS counts against one "Node" in your NPM license. How many data points you collect is not relevant to licensing. It gets a little more complicated when it comes to switches, as they do charge by the number of monitored interfaces, but when it comes to power, it's just nodes.
  • I'd say the only use case for SNMP polling of ESXi hosts directly is if you run local disks, as those don't show up under the vCenter integration.
  • The database maintenance does defrag indexes, but I don't believe it does rebuilds or CHECKDB. I've always set those up manually with my own maintenance jobs on the server. Also it's not "continuously" maintaining itself, it runs once each night at a scheduled time; 2:15 am server time is the default. In big environments…
  • Yeah, the only way to be confident that the time is up to date would be to delay your recovery message by at least one full polling interval.
  • I checked my repo and I have this query for the Custom Query widget that I used before. It might give you enough clues to get where you want to be in your dashboard: https://github.com/Mesverrum/MyPublicWork/blob/master/SWQL/Node%20Log%20Events
  • Two things. You can't do a GROUP BY and a HAVING without an aggregation function like count(*), so this isn't going to be valid SQL or SWQL anyway. Your select line needs to be changed to some kind of count to make sense with the rest of the query. Second thing is a question about how many log events you get on this server.…
  • You should be able to just remove the remoteport stuff from the where. This command runs as the current user on the system, so you don't pass it credentials anywhere. You do need to pay attention to the SAM setting for PowerShell scripts where it asks if you want to do local or remote execution, because local mode would…
  • There are two approaches I have used for this scenario. The "use this device" button does work, but it's slightly limiting. I don't know exactly what wasn't working for you when you used it, but the more flexible way I usually do it is with the Custom Query widget and writing my SWQL query in such a way that I filter…
  • If the ticketing system is SNOW then using the integration will populate the SNOW INC number, but that only works natively for that one scenario. I've hijacked that field in the DB before and used it for custom integrations I made to other tools, but you have to be pretty comfortable with the idea of hacking away at your…
  • While it may seem simple, this is a pretty big departure from the whole way discovery works today and would require the many pollers to coordinate in ways they currently don't, so I'm just saying don't hold your breath on this getting picked up. Also might be good for you to include the link to your actual FR you made…
  • Probably because that SWQL folder was written long before I was using Grafana for anything. The queries were all written for the Orion Custom Query widget and include variables that wouldn't exist for a Grafana dashboard. I just pulled up some of them to take a look and any variables like ${anything} would need to be removed, and…
  • In the dashboard itself you don't have to specify a URL; that part was a bit confusing for me the first time. Basically you are already inheriting the base URL from the data source config, and since pretty much all operations we would use for a dashboard are aimed at the same URL, I just leave it blank. I expect you are…
  • Don't have IPAM in my current role, but it was something pretty basic like update ipam_ipnode <this is maybe not the table name for this, but it's been a long time since I saw it> set engineid = x where <whatever logic made sense at the time>
  • Honestly I found the GUI for this to be pretty frustrating, so I always ended up just editing them directly in the DB by changing the associated engineid.
  • Anything that requests data from the database does put load on it. I've brought SolarWinds to its knees running poorly thought out queries in the past. It just depends how much data you are trying to extract and how often.
  • I presented on this exact topic in 2020 for THWACKcamp: thwack.solarwinds.com/.../custom-properties-the-2020-edition. And here's a deck from a presentation I did at a SWUG that has some more detail about ways to auto set properties (and by extension automatically get nodes into a group where the SAM templates would…
  • The API respects their user permissions, so if the user doesn't have node management they can't delete. Now if you want to get to a situation where they only have a subset of the admin permissions you are probably just going to want to stand up a reverse proxy with rules restricting the endpoints and lock down the actual…
  • Neighbor change alerts are one of the out-of-the-box examples, but I don't think they have all the details about what the changes were. I don't think the event data actually contains all of that, so how would you anticipate that SW would include them in the error?
  • No, the PDC software is a proxy between Grafana Cloud and your internal environments. This is how basically all Grafana Cloud customers use it.
  • Correct, the Custom HTML widget doesn't normally support that kind of thing (except when you DIY it in JavaScript). That's why I suggested using the Custom Query widget.
  • I'll also mention that the Grafana PDC doesn't have to be on the same box, it can be set up to essentially act as a proxy within your environment so it can reach anything that the server you run the PDC on can reach. Directly connecting your dashboards to the DB is not recommended for a variety of reasons so you should…
  • As an FYI, you can order by math expressions, you just have to make sure there is already a column for that math. If you made a total errors column you would just sort that one and be good. You can hide columns from the output report. Or just use the linked report mentioned already. 
  • Don't have a live environment to flesh out this query, but from memory the SWQL will be roughly like this for the Custom Query widget: Select c.name From orion.containers c Join orion.containermembersnapshots cs on cs.containerid = c.containerid Join orion.nodes n on n.nodeid = cs.objectid Join orion.ipaddresses ip on…
  • Q: Why does it say the node is down when I can RDP into it? A: Because we can't ping the IP shown here. Is this still the IP of the device? Have there been any firewall changes that would stop me from pinging it?
  • That query looks like it should still work, although I kind of hate the way they did that join, and you have to be really careful about using LIKE with wildcards that way because they can be awful for performance. If you are going down that road I expect you would get better results by using parsing rules and applying tags…
  • Your polling engine is a Windows server and it would connect to the target over HTTPS to get the cert, so it doesn't matter what OS is on the target server.
  • I was just "reminiscing" about those days with a colleague. My first in-person consulting gig was a SolarWinds upgrade that I completely butchered the first time around, but thankfully we had taken backups, so I took a second crack at it and everything worked. It was back when you needed to navigate through the matrix…
  • So I'll just throw it out there that I don't love relying on UDT for rogue detection because it only checks the tables every 30 minutes. That's a LONG time for someone to be on your network, so I wouldn't want my boss to have a false sense of security about things.
  • SolarWinds used to have a product, Patch Manager, that covered your B scenario. I'm not sure if it's still around or if it's been retired, as it was getting really long in the tooth and works essentially as an add-on to WSUS. It was minimally integrated with the core Orion platform, so you wouldn't necessarily be alerted about…
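
To illustrate the first comment above, about reboot alerts that watch a sysUpTime counter: here is a minimal Python sketch of why an "uptime went backwards" check can fire without a real reboot. It is not SolarWinds' actual alert logic. The function names are made up for illustration, and the only fact assumed is that SNMP sysUpTime is a 32-bit TimeTicks value counted in hundredths of a second.

```python
# Hypothetical sketch, not SolarWinds' actual alert logic.
# SNMP sysUpTime is a 32-bit TimeTicks counter in hundredths of a second.
WRAP_TICKS = 2**32


def looks_like_reboot(prev_ticks: int, curr_ticks: int) -> bool:
    """Naive check: flag a reboot if uptime went backwards between two polls."""
    return curr_ticks < prev_ticks


def days_until_wrap() -> float:
    """How long a device can stay up before the 32-bit counter rolls over."""
    return WRAP_TICKS / 100 / 86400


if __name__ == "__main__":
    # A real reboot: uptime drops from one day to five minutes.
    print(looks_like_reboot(8_640_000, 30_000))        # True
    # A counter rollover five minutes later looks the same to the naive check.
    print(looks_like_reboot(WRAP_TICKS - 500, 29_500))  # True
    print(round(days_until_wrap(), 1))                 # 497.1
```

Both cases return True, so the check alone cannot tell a rollover from a reboot, which is one reason to key the alert on an explicit event instead.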