mesverrum · Observability Architect · ✭✭✭✭✭

Comments

  • Not sure where you might want to include it, but similar to the INT problem there is a case where a value is in the DB as a STRING and we want to do math with it. I often do the trick with table.attribute * 1 as a way to cast a STRING as an INT. Then the other common one that gets people is when they want to turn an INT…
  • There isn't exactly anything specific to Power BI, but it has an option where you can set it up to use a connection to a database to get data. You just need to give Power BI the correct username and password to get the data from the SQL server directly and set it up with all the relevant SQL queries to populate…
  • No, generally speaking I tell people not to bother with acknowledging anymore if they are integrated with ServiceNow. It already has all the relevant reporting on ticket assignment, timestamps, etc., so there's not really much to be gained by trying to also track that stuff in Orion.
  • siteid is an integer, so using the + syntax makes SWQL think you are trying to do math and it gives up. Use concat('https://eoc.com/server',S.SiteID,N.DetailsURL) and see if that works better.
  • I don't like the DirectLink account for a variety of reasons. I usually disable it because of annoying situations similar to this and just add an AD group into Orion with minimal permissions that covers all Active Directory users. My power users and admins all just get logged into their proper accounts every time, and…
  • Do you actually need to know the CurrentUsers? If you don't, then I'd suggest the least painful way to deal with this would be using SCM. You could create a custom profile that just dumps this whole table into Orion every 5m or so and then just notifies you when the contents change. If you don't have SCM, or want to stick to…
  • Superficially the SWQL you are showing looks like it would work, but I think it's weird that you are joining to the NCM nodes table instead of orion.nodes. It's probably not going to make a functional difference, just a less intuitive way to do this query. I'll just give you a heads up, if you have a good sized…
  • There's no field that was available in the old report writer that isn't available in the web-based reports now, and if you want to get really fancy you can use a custom SQL/SWQL data source to do basically anything you want.
  • The report writer got deprecated many, many years ago; it's been web-based reporting for probably about 10 years, IIRC: documentation.solarwinds.com/.../core-creating-a-new-web-based-report-sw1322.htm
  • You probably should do a manually defined join instead of just using the navigation properties across all those tables. When you bring in the Nodes.NodeIPAddresses table you are getting a row for every IP * every interface, because it is just joining at the nodeid level. A DBA would call it a Cartesian product if you…
  • There is a subcategory lower down on the page specifically for SAM permissions; that's where I would expect to see the ones you need.
  • I'm not sure that you have the right cause in mind, because SNMP nodes totally can have SAM apps assigned to them. There are a few types of components specifically for SNMP nodes, and several others that do not care what protocol the node is polled with because the components use their own protocols. Hard to troubleshoot…
  • I do the same: use PowerShell scripts to parse the logs for things that I think are relevant. The big caution there is that Orion logs are super verbose and it can take a lot of time and compute to crunch through them all. Alternatively you could use a log aggregation tool and run reports out of that.
  • Without getting into some pretty complicated hacks, the short answer is no. Android doesn't use the protocols that NPM uses, such as SNMP and WMI, and its security model makes it pretty hard to just collect system level stats from a phone or tablet or whatever unless they are rooted and allow for things to run outside the…
  • If you are also trying to capture workstations you are probably going to need to segment it across multiple SEM servers. Have all the servers report to one instance, have some workstations on another, etc etc. Break it up as much as you have to in order to keep the performance and scalability where you need it. SEM…
  • My guess is this OID is a tabular poller set up to handle two temp sensors, but one of them isn't in use, so it probably returns a 0, which gets transformed to 32.
  • Alerting at the app level is less noisy, but the downside is that it also tends to be less detailed. There just aren't good OOTB variables to get specific component level stats if you aren't alerting on the specific components. If you changed your alerting strategy to alert at the component level you can pull in a lot more…
  • Making it so each node can speak to all pollers makes life much easier from an administrative and failure recovery perspective, and locking them down doesn't seem like it gains you much in terms of benefits? I always found the agent to be a bit flaky the same way you describe and fortunately didn't have the need to use…
  • What makes you think groupname is an available attribute in SWQL? Is that a custom property you have, or are you trying to find all nodes that are a member of the PI servers group? If it's a custom property you would want to join to the nodes custom properties table; if it's a group you would join to the tables for container…
  • In the past I used to sink a LOT of time into building GUIs and layouts like this, but over time I found that as I got better at SQL it was just easier/faster/more flexible to write it out as a SQL query that captures all the same info and lays it out in a table. Still able to get all the color coordination and…
  • Are you maybe getting timeouts when trying to poll the node? How do you poll? WMI, SNMP, Agent? Might be something funky going on there.
  • This problem isn't really a SEM-specific issue. How would you prevent any Linux server that does DNS lookups in the course of normal activity from making lookups against specific addresses? I feel like a hacky solution would be to insert all those bad domains into a hosts file on the server mapped to 0.0.0.0. Maybe someone…
  • That's a contributing factor to why I always tell people who run big environments to offload as much of the polling as possible to APEs. The primary has a bunch of other jobs to do and if it gets stuck on something then you get all kinds of errors manifesting all over the system. If you have the option I would try to cut…
  • The message is kind of vague, but what actually happens is that the node level data is still there, so things like CPU and memory history will usually continue without issue. The part where you potentially lose data is that SNMP and WMI/Agent have very different ways of reporting on the data of child objects like volumes…
  • How big is your environment? How many alert actions trigger per day? I've definitely seen the alert engine get overwhelmed in big environments or when there is a very high number of alert actions to take. Common symptoms I've seen would be for the alert actions to not fire off at all, or if they use variables the…
  • I've been living inside that database for about 10 years, and I can guarantee you that the alertobjects table does not show alerts that have never triggered before. If you add a brand new node, go look and see what it has in that table: it will be nothing until that node triggers an alert, and then it will only be…
    in Alerting Comment by mesverrum April 2024
  • For the reboot, it is stored in the device templates; you can see some examples of the OOTB templates below in the docs link to see how the syntax works. The command that runs for that job is whatever that device has in its profile for the Command Name="Reboot"…
  • Looking at your earlier query, I believe this would work: SELECT Component.Uri, Component.DisplayName FROM Orion.APM.Component AS Component JOIN orion.APM.MultipleStatisticData data on data.ComponentID = component.ComponentID WHERE data.ComponentID = '1234' AND data.Name != 'Offline' AND (SELECT SUM(data.NumericData) as…
  • The limitation of that logic is that it will only show you alerts that have already triggered on each object before. So it's not a report of all possible alerts, just the ones you have run into historically.
    in Alerting Comment by mesverrum March 2024
  • You need to check the events table to make sure which eventid or eventtype (I don't recall the exact column name) actually shows up for node down events, then add that to your WHERE conditions, e.g. AND downevents.eventid = 2 or whatever the correct one is.
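
On the temperature poller comment above (an unused sensor returning 0 that shows up as 32): that is what you would expect if the poller transform is a Celsius-to-Fahrenheit conversion. The comment doesn't say which transform is configured, so treat that as an assumption. A rough sketch in Python, with a made-up helper name and made-up readings, just to show the arithmetic:

```python
def celsius_to_fahrenheit(celsius):
    """Standard C-to-F conversion; ASSUMED to be what the poller transform does."""
    return celsius * 9 / 5 + 32


# Hypothetical raw values from a two-row tabular poller:
# row 1 is a live sensor, row 2 is an unused slot that reports 0.
raw_rows = [41, 0]

transformed = [celsius_to_fahrenheit(value) for value in raw_rows]

for raw, shown in zip(raw_rows, transformed):
    print(f"raw={raw} -> displayed={shown}")
```

If that is what's going on, the 32 is not a real reading, and filtering out the row whose raw value is 0 (or unassigning the unused row from the poller) would be the place to look.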