Comments
-
We are running an Active/Active implementation across two distinct data centers. Each data center has 8 Pollers and a dedicated database server. Our challenge is there is no way we would ever convince our admins to make changes in two environments. So we need a way to synchronize the databases if a DR situation occurs…
-
ServiceNow does it pretty well, so does WorkDay. But I am with you AlTeReGo, a monitoring tool/system with the amount of data flowing, good luck with that implementation :-)
-
An Observation: Just had it in Houston. I really enjoyed this event. We were finally able to meet folks that we have been working with for years, many of whom have had a direct impact on our success and helped us prevent failures for any number of reasons. And for me personally it was good to meet Zach and Jess, two people…
-
Deja Vu.... After many years of doing this I have a different perspective. I have seen the evolution of "what do you want to monitor/alert on", answer = EVERYTHING (so you want me to page you at 2:30 in the morning on EVERYTHING) to let the Subject Matter Expert decide. As I see it, SolarWinds is already capturing Average…
-
Just had this event in Houston. After being a SolarWinds customer for many years, it was good to finally meet some of the folks that have had a direct impact on our success and helped us navigate through a couple of difficult times. Starting with Zach and Jess, two people I have 'always' had the greatest respect for, Zach is…
-
We are running a POC with the new HA solution now. We started testing the multi-segment Always On database configuration this morning. So far we are extremely impressed. CJ is providing oversight, albeit to this point we have had no issues. After we complete the database testing we will start on the Orion HA testing. In…
-
To me, it's a no-brainer, you have to do a proof-of-concept. You are going to uncover things that most of us have not seen. We are in flight with a much smaller HA implementation, 2 data centers, same city about 40 miles apart. We have 8 pollers at each DC. We are running about 75,000 elements across this infra. We have…
-
I will take a shot at this. First, I went to the Houston event. Some things you will take away from this event, (1) Insights into the direction of the platform (future roadmap), (2) Peer discussions, (3) Engagement with some of the top folks at SolarWinds. The SolarWinds folks that are there, are TOP SHELF, trust me, I…
-
Thanks for the correction aLTeReGO and thanks for your help in getting through that one, it was a huge get.
-
Kevin, around 10:00 in you mention how SWQL handles orphaned interfaces, volumes and assigned applications when you delete a node, and SQL Server Management Studio (or any other query editor) does not. Question: doesn't the out-of-the-box Database Maintenance attempt to delete the orphaned records associated with a deleted node?
-
I like it, great catch SnackEye.
-
I need to check this, thanks for the question....
-
.net patches will be the most fun, ugh....
-
I agree with Martian. I will add: anytime you have an issue like this, try running your browser in "incognito" mode. This bypasses the browsing history, cookies, and site data typically stored on your device. A good debug technique. Sorry for the colors, whoops....
-
Assuming you're going with Virtuals, VMware Workstation or Oracle VirtualBox. You firstly need the memory to allocate, say 32 GB, 64 GB better but not needed. One lesson learned I will tell you is where you will get the most contention (depending on the number of Virtuals you create in your lab, (Domain Controller, Database…
-
My case number was: 01580006. The bug description and solution are here: community.pivotal.io/.../ReferenceError-display-is-not-defined-ReferenceError-display-is-not-defined-at-Array-process
-
OK, here was my solution (hate to admit it). The problem for me was that we had never seen this before; additionally, it was working on 3 of the 4 upgraded pollers (multiple HA environments, long story). The RabbitMQ bug and solution are documented here…
-
Question: Are APEs supported for HA deployments? Yes they are. Question: if so, I assume the steps for adding APEs is identical to a standalone MPE. Simple (once you have done it a few times), go to SETTINGS/My Deployment and create a pool between your MPE and your APE. Keep in mind, you will HAVE to purchase an "HA…
-
SELECT COUNT(*) AS 'Number of Alerts' FROM [NetPerfMon].[dbo].[AlertHistory] WHERE TimeStamp > DATEADD(hour, -24, GETDATE())
-
8572 over the last 24 hours
-
Figured it out myself (got lucky I guess). Anyway, in two of our environments (no idea why or how) the password changed on one of the member servers in our RabbitMQ clusters. It was not patching, cyber changes, or firewall changes; I have no clue. Regardless, to fix it I ran these three commands on the problem member. The Issue:…
-
We are "not" currently having issues with the alerting engine (running 2023.2.1), however, we made (9) production changes and optimizations to alleviate the load. This apparently gave us the headroom we needed to prevent the issue from occurring "initially", daily. @acurrent, if you are heavily using trigger actions,…
-
Bob did you ever get this working? So we have tested source to target through the firewall (on-premise to EC2 instance(s)), working. But now I am thinking, ok, we are going to build 100s of servers in our cloud instance so I need a single firewall rule (that contains all ports required) to allow us to monitor from…
-
Seems like we are talking about more than one issue in this thread. What I know is, there is an alerting issue we found in 2023.2 and it also exists in 2023.2.1, 2023.2.2 and 2023.3. Development yesterday confirmed it is a bug, however, there are steps that can be taken to mitigate the problem until a…
-
Not sure this will help, but if you are blocked (this is a little dated, but) * What about Carbon Black (if you run it on your servers) * What about CrowdStrike (if you run it on your servers) I get you were able to download other files, but this kinda sounds suspect b/c I am unaware of anyone else seeing this issue.…
-
I wish you the best of luck. I did just confirm in our bi-weekly meeting with SolarWinds this bug DOES exist in 2023.3. For this issue: (Alerting engine STALLS and stops processing Alerts) In the Alerting.Service.V2.log on your main poller. Look for (1) WARN SolarWinds.Orion.Core.Alerting.Service.AlertConfigurationLock -…
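A minimal sketch of that log check in Python, for anyone who would rather not eyeball the file. The logger name is the one quoted above; everything else is an assumption: the function names are made up for illustration, the match is a plain substring test because I am not assuming anything about the line layout, and you pass the path to your own copy of Alerting.Service.V2.log.

```python
import sys
from pathlib import Path

# Logger name as quoted in the comment above.
LOCK_LOGGER = "SolarWinds.Orion.Core.Alerting.Service.AlertConfigurationLock"


def find_lock_warnings(lines):
    """Return (line_number, text) for every WARN line that names the lock logger.

    Hypothetical helper: uses substring checks only, no assumed log layout.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if "WARN" in line and LOCK_LOGGER in line:
            hits.append((number, line.rstrip()))
    return hits


def scan_log(path):
    """Scan one log file on disk and return the matching lines."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_lock_warnings(handle)


if __name__ == "__main__":
    # Usage: python scan_alert_log.py <path to Alerting.Service.V2.log>
    if len(sys.argv) != 2:
        sys.exit("usage: scan_alert_log.py <path-to-log>")
    for number, text in scan_log(sys.argv[1]):
        print(f"{number}: {text}")
```

This only tells you whether the WARN entries are present and on which lines; it says nothing about whether the engine is currently stalled.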
-
YES, it is also in 2023.3, found out today! We are running 2023.2.1 across (4) environments: Test, QA (a small HA test environment), our Watcher Environment (monitors our production Orion environment exclusively) and production, an HA environment with 9 APEs. Have no experience with 2023.3 yet. I was waiting on you to tell…
-
Development was able to recreate this issue today, confirmed bug in 2023.2.1. Description: The Alerting Service/Engine stops processing alerts for an extended period of time. Initially for us, hours. We then made several changes to lessen the runtime load on the engine, which in turn lessened the time for the stall to minutes…