Level 12

Re: Multi-Subnet Failover (WAN/DR) Deployment

Can you help me understand the "virtual hostname"? Is it known only internally to SolarWinds, or is it the actual name of a record in DNS?

Product Manager

Re: Multi-Subnet Failover (WAN/DR) Deployment

The virtual hostname is optional. It is a dynamically updated DNS name, typically used for accessing the Orion web interface. It ensures that users are always directed to the 'active' member of the pool.

Level 12

Re: Multi-Subnet Failover (WAN/DR) Deployment

So, thinking about this a little more, doesn't that make it a requirement?  If the virtual hostname is the record in DNS, doesn't that mean traps have to point to that record as well to avoid trap disruption in a failover?

Product Manager

Re: Multi-Subnet Failover (WAN/DR) Deployment

Some customers prefer not to deal with DNS for one reason or another. They instead front-end the Orion server with a network load balancer such as an F5. For Syslog and Traps, they configure their devices to send to both members of the pool. So in those cases, a virtual hostname is completely optional.

Level 12

Re: Multi-Subnet Failover (WAN/DR) Deployment

Oh wow, an LB just seems like overkill, won't be doing that for sure.  Virtual hostname it is, thanks for your help.

Level 13

Re: Multi-Subnet Failover (WAN/DR) Deployment

I'm running into a similar question. I'm not highly knowledgeable on "the network side" of things, so hopefully I'm not going too far down the rabbit hole for something obvious.

To be clear on how setting up a multi-subnet failover works: The only method supported today is to use a "virtual hostname" which is a DNS CNAME/Alias record (I don't know the record type).

This means that anything sending data *to* Orion, via SNMP/Syslog/etc will have to use the DNS "virtual hostname" name so it will route to the current IP address/active server.

When an Orion failover occurs the new "active" Orion server updates the DNS record of the "virtual hostname" with a new IP address (of the new active server).

My questions revolve around the caching of the old IP associated with the DNS virtual hostname scenario:

There are a few warnings in the docs about the IP address caching on anything connecting to Orion, since you're using a DNS name with a changing IP address.

For users, this means they may have to refresh their browser cache. I'm not too concerned about them for this scenario.

For external devices sending in SNMP/Syslog data, I'm not sure how this is handled, as we have old (ancient?) and "weirdo" things sending in SNMP/Syslog. I don't think I could get all of the device owners to make sure their devices flush their DNS caches, and I don't know whether that is even configurable on some devices.

In addition, I've asked around and apparently some devices can *only* be configured to use a single IP address (no DNS names) to send SNMP/Syslog data to.

This means that, when Orion fails over, I really can't say how much SNMP/Syslog data I may lose due to external devices not picking up the new IP. Some can't even use a DNS name, so what do I do with those?

What it looks like is I need to have some network device with a static IP address that all the remote devices connect to, that then routes to my DNS "virtual hostname" entry. This device then has a low DNS cache refresh time...or something.

I talked to my network team and they indicated that something like a load balancer can route traffic based on testing which node is "available". I'm not too sure, but the Orion primary and secondary servers should both be "up", so it comes down to running some specific tests from the load balancer to determine which one is the primary. They mentioned checking an HTTP status page/URL, etc., but I don't know what Orion services would be "up" on the secondary or what would make a good test.

My questions:

- Has anyone decided to use a network load balancer or other solution to handle the above scenario?  If so, are you running "tests" for determining traffic routing or just keeping the DNS caching of the load balancer time low?

- If you didn't load balance and just used a network device to replicate all incoming SNMP/Syslog data to both IP addresses of the servers in the HA pool (bypassing the virtual hostname), will the secondary Orion server pick it up? From what I've read, some secondary services will be running to handle failover, but I don't know what's "not running". I assume the SNMP/Syslog services would not be running on the non-active server.

Product Manager

Re: Multi-Subnet Failover (WAN/DR) Deployment

tigger2  wrote:

The only method supported today is to use a "virtual hostname" which is a DNS CNAME/Alias record (I don't know the record type).

In a multi-subnet failover configuration a virtual hostname is optional and provided as a convenience feature. Some customers opt to use alternative means of directing traffic to the Orion server, such as a Network Load Balancer.

This means that anything sending data *to* Orion, via SNMP/Syslog/etc will have to use the DNS "virtual hostname" name so it will route to the current IP address/active server.

That's certainly one option, though most customers opt instead to configure their devices to send NetFlow, Syslog & SNMP Traps to both members of the pool. A few have created NCM Configuration Alert Actions to update the Syslog, Trap, NetFlow destinations on their devices to point to the 'Active' pool member when a failover occurs. There really are quite a few options available. You just need to pick the option that works best for you in your environment.
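To illustrate the "send to both members of the pool" approach from the sender's side, here is a minimal Python sketch. It assumes plain UDP syslog; the two addresses, the `demo` tag, and the function names are made up for illustration and are not taken from Orion or any device configuration.

```python
import socket

# Hypothetical addresses for the two HA pool members; replace with your own.
POOL_MEMBERS = [("192.0.2.10", 514), ("198.51.100.10", 514)]


def format_syslog(message, facility=1, severity=5, tag="demo"):
    """Build a minimal BSD-style syslog payload: <PRI>tag: message."""
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_to_all(payload, destinations):
    """Send the same UDP datagram to every destination; return the number of sends that succeeded."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for host, port in destinations:
            try:
                sock.sendto(payload, (host, port))
                sent += 1
            except OSError:
                # UDP is fire-and-forget; a failure for one member should not block the other.
                continue
    return sent
```

Calling `send_to_all(format_syslog("link down"), POOL_MEMBERS)` sends one copy to each member, so the sender never needs to know which one is currently active.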

tigger2 wrote:

For users, this means they may have to refresh their browser cache.

Modern browsers maintain their own DNS cache, separate from the operating system. Unfortunately, this browser cache does not respect certain key components of DNS, such as the TTL for when a DNS entry should expire from the cache. This means that users who are actively working in the Orion web interface when a failover occurs may need to close their browser and reopen it before they can resume their session. A load balancer or a transparent proxy such as nginx can be used as a workaround if this is bothersome.

tigger2 wrote:

What it looks like is I need to have some network device with a static IP address that all the remote devices and users connect to, that then routes to my DNS "virtual hostname" entry. This device then has a low DNS cache refresh time...or something.

The TTL used by HA for the virtual hostname is already very low, at one minute. The issue is that browsers do not respect that value within their own cache, even though the operating system fully does.
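For anyone writing their own sender or relay, the behaviour described above (re-resolve the name once the TTL has elapsed) can be sketched in Python as follows. This is an illustration only: `TtlResolver` is a hypothetical helper, the 60-second default simply mirrors the one-minute TTL mentioned above, and `socket.gethostbyname` does not expose the record's real TTL, so the value is assumed rather than read from DNS.

```python
import socket
import time


class TtlResolver:
    """Cache one hostname's address and re-resolve it once the TTL has elapsed."""

    def __init__(self, hostname, ttl_seconds=60.0,
                 resolve=socket.gethostbyname, clock=time.monotonic):
        self.hostname = hostname
        self.ttl_seconds = ttl_seconds
        self._resolve = resolve      # injectable so the lookup can be replaced in tests
        self._clock = clock          # injectable so time can be simulated
        self._address = None
        self._expires_at = 0.0

    def address(self):
        """Return the cached address, refreshing it if it is missing or expired."""
        now = self._clock()
        if self._address is None or now >= self._expires_at:
            self._address = self._resolve(self.hostname)
            self._expires_at = now + self.ttl_seconds
        return self._address
```

A sender that calls `address()` before each message would pick up the new active server within roughly one TTL of a failover, instead of holding on to the old IP indefinitely.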

Level 16

Re: Multi-Subnet Failover (WAN/DR) Deployment

aLTeReGo

Are there any use cases available of folks using an F5 load balancer for SolarWinds HA? And does this require SolarWinds to be in active-active mode?

Product Manager

Re: Multi-Subnet Failover (WAN/DR) Deployment

Many Orion HA customers utilize Load Balancers in lieu of a virtual hostname. HA is still active/passive. The Load Balancer simply watches to see which server is 'alive' (usually through health checks) and directs traffic to the active member.
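The health-check logic a load balancer applies can be sketched roughly as below. This is only an illustration of the idea in Python, not how an F5 monitor is configured: the member names and URLs are hypothetical, and this thread does not say which Orion page or service reliably distinguishes the active member from the standby, so the choice of probe is an assumption you would need to validate.

```python
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


def pick_active(members, probe=http_probe):
    """Return the name of the first member whose health check passes, or None."""
    for name, url in members:
        if probe(url):
            return name
    return None
```

For example, `pick_active([("primary", "http://primary.example/"), ("secondary", "http://secondary.example/")])` returns whichever member answers first in list order, which matches the active/passive model: traffic goes to one server at a time.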

Level 16

Re: Multi-Subnet Failover (WAN/DR) Deployment

So if an F5 is used, do we still need the HA configuration to be done at the application level?

On Fri, Jun 29, 2018, 12:42 AM aLTeReGo
