
Product Blog


After four months, it is time again to write another article about another product.
As it happens, we’ve added a new toy to our portfolio:

SolarWinds Access Rights Manager (ARM)

Some of you may know it under its former name, 8MAN.

 

What exactly does ARM do? And who came up with this TLA?

The tool validates permissions within Active Directory®, Exchange™, SharePoint®, and file servers. So who has access to what, and where does the permission come from?

Users, groups, and effective permissions can be created, modified, or even deleted.

Reports and instant analysis complete the package.

Everything works out of an elegant user interface, and you can operate it—even if you aren’t a rocket scientist.

 

ARM can be installed on any member server and comes with minimal requirements.
The OS can be anything from Windows Server 2008 SP1 upward; give it two cores and four gigs of RAM, and you’re golden, even for some production environments. The data is stored in SQL Server 2008 or later.

The install process is quick.

 

 

Once installed, the first step is to click the configuration icon on the right-hand side. The color is 04C9D7, and according to the internet, it is called “vivid arctic blue,” but let’s call it turquoise.
On that note, let me tell you: I am German and unable to pronounce turquoise, so I am calling it Türkis instead.

 

 

The next step is to create an AD and SQL® user and connect to the database:

 

 

ARM is now available, but not yet ready to use.

 

 

We need to define a data source, so let’s attach AD. The default settings will use the credentials already stored in ARM for directory access.

 

 

In my example, an automated search kicks off in the evening. When you set it up for the first time, I suggest clicking the arrow manually once to get some data to work with.
Attention: Don’t do this with 10,000 users in the early morning.

Alright, that’s it.


Now click the orange—sorry, F99D1C—icon to start the tool.

 

 

Login:

 

 

The first thing we see is the dashboard:

 

 

Let’s deal with the typical question, “Why was that punk able to access X at all?”
The main reason for this is probably a nested authorization, which isn’t obvious at first glance.
But now ARM comes into play.
Click on Accounts and enter Mr. Punk’s name into the search box above:


 

The result is a tree diagram showing the group memberships, and it is easy to see where the permission is coming from.
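If you want a feel for what that tree resolves, here is a minimal Python sketch of following nested memberships. It only illustrates the idea and says nothing about how ARM works internally; the `member_of` mapping and every account and group name in it are invented.

```python
from collections import deque

# Hypothetical directory data: each account or group maps to the groups
# it is a DIRECT member of. All names are invented for illustration.
member_of = {
    "mr.punk": ["Helpdesk"],
    "Helpdesk": ["IT-Staff"],
    "IT-Staff": ["FS-Finance-Read"],
    "FS-Finance-Read": [],
}

def membership_path(account, target_group, member_of):
    """Return the shortest chain of memberships from account to target_group,
    or None if the account is neither a direct nor a nested member."""
    queue = deque([[account]])
    seen = {account}
    while queue:
        path = queue.popleft()
        for group in member_of.get(path[-1], []):
            if group == target_group:
                return path + [group]
            if group not in seen:  # guards against circular nesting
                seen.add(group)
                queue.append(path + [group])
    return None

path = membership_path("mr.punk", "FS-Finance-Read", member_of)
print(" -> ".join(path))
```

The breadth-first search returns the shortest chain, which is usually the one you want to show an auditor first.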

 

 

If you click on a random icon, you will see more details—give it a try.
You can also export the graphic as a picture.
On the right side, you will find AD attributes:

 

 

Now it gets convenient. It is possible to edit any record right from here:

 

 

Oh yes, I don’t trust vegetarians!

By the way, this box here is mandatory on any change, as proper change management requires the setting of notes.

 

 

And while we’re at it, right-click on an account:

 

 

Let’s walk from AD to file permissions. It’s only a short walk, I promise.
Click Show access rights to resources as seen above.

Now we need to select a file server:

 

 

On the right, we see the permissions in detail:

 

 

We ship ARM with a second GUI in addition to the client—a web interface accessible from anywhere, where you find tools for other tasks.

Typical risks are ready for your review out of the box. Just click on Risks. I know you want to do it:

 

 

You’ll find some interesting information, like inactive accounts:

 

 

Permanent passwords:

 

 

Or everybody’s darling, the popular “Everyone” permission on folders:

 

 

One does not simply “Minimize Risks,” but give it a try:

 

 

I could initiate changes directly from here – also in bulk.

 

By the way, any change made via ARM will be automatically logged.
The logbook is at the top of the local client, and we can generate and export reports:

 

You may have seen this above already, but you can find more predefined reports directly on the Start dashboard:

 

 

Let’s address one or two specific topics.

Windows Server 2016 introduced a feature called temporary group membership.
It can be quite useful; for example, in the case of an employee working in a project team who requires access to specific elements for the duration of the project. That additional authorization will expire automatically after whatever time has been set.

Practical, isn’t it?

 

But also consider this: Someone might seize an opportunity and grant themselves temporary access to a resource, knowing that the membership change will disappear again, which makes the whole process difficult, if not impossible, to trace after the fact.

But not anymore! Here we go:

 

 

If you hover over this box here…


…you will find objects on the right side:

 

 

For this scenario, these two guys here might be interesting:

 

Unfortunately, in my lab, there’s nothing to see right now, so let’s move on.

 

ARM allows routine tasks to be performed right from the UI; for example, creating new users or groups, assigning or removing permissions, and much more.
This becomes even more interesting when templates, or profiles, are introduced.

Let’s change into the web client. Click the cogwheel on top, then choose Department Profiles:

 

 

On the right side, click Create New.

 

 

The profile needs a shiny name:

 

 

Always make sure people who operate microwaves receive proper training. But that’s a different story.

More buttons on the left side; I will save it for now:

 

 

Starting now, you can assign new hires to these profiles, and everything else is taken care of by the tool, like assigning group memberships or setting AD attributes.

 

Of course, these profiles are also baselines, and there is a predefined report available showing any deviations from the standard. Just click Analysis and User Accounts.
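Conceptually, a deviation check like this boils down to comparing sets. The sketch below is only a rough illustration under that assumption; the profile contents, group names, and function names are invented and do not reflect ARM's data model.

```python
# Hypothetical baseline: the groups each department profile prescribes.
profile_groups = {
    "Marketing": {"Marketing-All", "FS-Marketing-Modify"},
}

def deviations(actual_groups, profile_name, profiles):
    """Compare an account's group memberships with its profile baseline."""
    expected = profiles[profile_name]
    return {
        "missing": sorted(expected - actual_groups),  # prescribed but absent
        "extra": sorted(actual_groups - expected),    # present but not prescribed
    }

def is_compliant(actual_groups, profile_name, profiles):
    """An account is compliant when nothing is missing and nothing is extra."""
    result = deviations(actual_groups, profile_name, profiles)
    return not result["missing"] and not result["extra"]

# An account that picked up one group too many over time.
drifted = {"Marketing-All", "FS-Marketing-Modify", "FS-Finance-Read"}
print(deviations(drifted, "Marketing", profile_groups))
```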

 

 

Select a profile and off you go:

 

 

Elyne is compliant; congratulations. But that’s hardly surprising, as she is the only employee in Marketing:

 

 

These are just a few features of ARM. Other interesting topics would be the integration of different sources, or scripts for more complex automation. This is food for future postings.

 

But you know what I like most about ARM, as a computer gamer?
You can click on just about anything.

Try this out; it’s on the left side of the Start dashboard:

 

 

Have fun exploring.

Woes of Flow

A poem for Joe

 

It uncovers source and destination

without hesitation.

Both port and address

to troubleshoot they will clearly assess.

Beware the bytes and packets

bundled in quintuplet jackets,

for they are accompanied by a wild hog

that will drown your network in a bog.

The hero boldly proclaims thrice,

sampling is not sacrifice!

He brings data to fight

but progress is slow in this plight.

 

Mav Turner

 

For network operators, one of the most common and important troubleshooting tasks is tracking down the bandwidth hogs consuming capacity in our network infrastructure. We have a wealth of data at our fingertips to accomplish this, but it’s sometimes challenging to reconcile it into a clear picture.

 

Troubleshooting high utilization usually begins with an alert for exceeding a threshold. In the Orion Platform’s alerting facility, there are several conditions we can set up to identify these thresholds for action. The classic—and simple—approach is to set a threshold for utilization defined as a percentage of the available capacity. The Orion Platform also supports baselining utilization in a trailing window and setting adaptive thresholds. Next, you need to investigate to determine what’s driving utilization and decide what action to take.
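To make the two threshold styles concrete, here is a small Python sketch. It is an illustration only: the post does not describe how the Orion Platform computes its baselines, so the "mean plus two standard deviations" rule, the 80% default, and all sample values are assumptions.

```python
from statistics import mean, stdev

def exceeds_static(utilization_pct, threshold_pct=80.0):
    """Classic check: utilization as a percentage of available capacity."""
    return utilization_pct > threshold_pct

def adaptive_threshold(history_pct, sigmas=2.0):
    """Assumed baseline rule: mean of a trailing window plus `sigmas`
    sample standard deviations."""
    return mean(history_pct) + sigmas * stdev(history_pct)

# Hypothetical trailing window of utilization readings (percent).
history = [20.0, 22.0, 18.0, 21.0, 19.0]
limit = adaptive_threshold(history)

current = 45.0
print(f"static alert:   {exceeds_static(current)}")
print(f"adaptive alert: {current > limit} (limit {limit:.1f}%)")
```

A reading of 45% stays well under a static 80% threshold, yet it is far outside what this interface normally carries, which is the case adaptive thresholds are meant to catch.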

 

Usually, the culprit is a particular application generating an unusual level of traffic. We can get some insights into application traffic volumes from a NetFlow analyzer tool like NetFlow Traffic Analyzer.

 

So, why don’t the volume measurements match exactly from these two sources of data? Aren’t interface utilization values the same as traffic volume data from NetFlow?

 

Let’s review the metrics we’re working with, and how this data comes to us.

 

Interface capacity, the rate at which we can move data through an interface, is modeled as an object in SNMP, and we pick that up from each interface as part of the discovery and import process into Network Performance Monitor (NPM). It can be overridden manually, which helps because some agents don’t populate that object in SNMP correctly.

 

Interface utilization is calculated from the difference in total data sent and received between polls, divided by the time interval between polls. The chipset provides a count of octets transmitted or received through the interface, and this value is exposed through SNMP. The Orion Platform polls it, then normalizes it to the unit in which the interface speed is expressed, usually bits per second.
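The arithmetic can be sketched in a few lines of Python. This is a generic illustration of the counter-delta calculation, not the Orion Platform's code; the function name and sample values are made up. The inputs correspond to the standard IF-MIB octet counters (for example `ifHCInOctets`) and interface speed.

```python
def utilization_percent(octets_prev, octets_now, interval_s, if_speed_bps,
                        counter_bits=64):
    """Percent utilization for one direction between two polls.

    octets_prev / octets_now: octet counter values from consecutive polls
    interval_s:               seconds between the two polls
    if_speed_bps:             interface capacity in bits per second
    counter_bits:             counter width (32 for ifInOctets, 64 for ifHCInOctets)
    """
    # Modulo arithmetic absorbs a single counter wrap between polls.
    delta_octets = (octets_now - octets_prev) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / interval_s
    return 100.0 * bits_per_second / if_speed_bps

# Hypothetical readings: 1 Gbps interface polled every five minutes.
pct = utilization_percent(
    octets_prev=1_000_000_000,
    octets_now=14_125_000_000,
    interval_s=300,
    if_speed_bps=1_000_000_000,
)
print(f"{pct:.1f}% utilized")
```

Note that the result is an average over the whole polling interval, which matters when comparing it with flow data.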

 

SNMP Polled Utilization

 

The metrics reported by SNMP about data received or sent through the interface include all traffic: layer 2 traffic that isn’t propagated beyond a router, as well as application traffic that is routed. Some of the data that flows through the interface isn’t application traffic. Examples include address resolution protocol traffic, some link-layer discovery protocols, some link-layer authentication protocols, some encapsulation protocols, some routing protocols, and some control/signaling protocols.

 

For a breakdown of application traffic, we look to flow technologies like NetFlow. Flow export and flow sampling technologies are normalized into a common flow record, which is populated with network and transport layer data. Basic NetFlow records include ICMP traffic, as well as TCP and UDP traffic. While it’s possible on some platforms to enable an extended template that includes metrics on layer 2 protocols, this is not the default behavior for NetFlow, or any of the other flow export protocols.

 

Top N Applications traffic volumes

 

The sFlow protocol takes samples from layer 2 frames, and forwards those. So, while it’s possible to parse out layer 2 protocols from sFlow sample packets, we generally normalize sFlow along with the flow export protocols to capture ICMP, TCP, and UDP traffic, and discard the layer 2 headers.

 

When we work with flow data, we’re focusing on the traffic that is generally most variable and represents the applications that most often drive the high utilization we’re investigating. But in terms of the volumes represented, flow technologies examine only a subset of the total utilization we see through SNMP polled values.

 

SNMP Polled versus application flow volumes

 

An additional consideration is timing. SNMP polling and NetFlow exports run on independent schedules and are not synchronized. We may, for example, poll using SNMP every five minutes and average the rate of bandwidth utilization over that entire period. In the meantime, we may have NetFlow exports from our devices configured to send every minute, or we may be using sFlow and continuously receiving samples. Looking at the same one-minute period, we may therefore see very different values for interface utilization and for the application traffic that is likely the main driver of our high utilization.
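A tiny numeric example shows how far the two views can drift apart. The per-minute rates below are invented purely for illustration.

```python
# Hypothetical per-minute traffic rates (Mbps) as a flow tool might report them.
one_minute_rates_mbps = [40, 45, 50, 400, 55]

# A five-minute SNMP poll only sees the average over the whole interval.
five_minute_average = sum(one_minute_rates_mbps) / len(one_minute_rates_mbps)
peak_minute = max(one_minute_rates_mbps)

print(f"five-minute average: {five_minute_average:.0f} Mbps")
print(f"busiest minute:      {peak_minute} Mbps")
```

The one-minute burst to 400 Mbps is obvious in the per-minute data but shows up as a modest 118 Mbps average in the polled value.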

 

SNMP Polling and flow export over time intervals

 

If we’re using sFlow exclusively, our accuracy can be mathematically quantified. The accuracy of randomly sampled data—sFlow, or sampled NetFlow—depends solely on the number of samples arriving over a specific interval. For example, a sample arrival rate of ~1/sec for a 10G interface running at 35% utilization and sampling at 1:10000 yields an accuracy of +/-3.91% for one minute at a 98% confidence interval. That accuracy increases as utilization grows or over time as we receive a larger volume of samples. You can explore this in more detail using the sFlow Traffic Characterization Worksheet, available here: https://thwack.solarwinds.com/docs/DOC-203350
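For intuition, the commonly cited approximation for packet sampling puts the relative error at roughly z * sqrt(1/c), where c is the number of samples collected for the traffic class and z depends on the confidence level (about 1.96 for 95%). The Python sketch below applies that approximation. It is not meant to reproduce the worksheet's figures, which may rest on additional inputs such as average packet size; the function name is made up.

```python
from math import sqrt
from statistics import NormalDist

def sampling_error_percent(samples, confidence=0.95):
    """Approximate relative error (in percent) of a sampled traffic estimate,
    using the normal approximation: error ~ z * sqrt(1 / samples)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    return 100.0 * z * sqrt(1.0 / samples)

# More samples, whether from higher utilization or a longer window,
# shrink the error.
for c in (100, 1_000, 10_000):
    print(f"{c:>6} samples -> +/-{sampling_error_percent(c):.2f}%")
```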

 

So, what’s the best way to think about the relationship between utilization and flow-reported application traffic?

 

  • Utilization is my leading indicator for interface capacity. This is the trigger for investigating bandwidth hogs.
  • Generally, utilization will alert me when there’s sustained traffic over my polling interval.
  • Application traffic volumes are almost always the driver for high utilization.
  • I should expect that the utilization metric and the application flow metrics will never be identical. The longer the time period, the closer they will track.
  • SNMP-based interface utilization provides the tools to answer the questions:
    • What is the capacity of the interface?
    • How much traffic is being sent or received over an interface?
    • How much of the capacity is being used?
  • Flow data provides the tools to answer the questions:
    • What application or applications?
    • How much, over what interval?
    • Where’s it coming from?
    • Where is it going?
    • What’s the trend over time?
    • How does this traffic compare to other applications?
    • How broadly am I seeing this application traffic in my network?

 

Where can I learn more about flow and utilization?

 

An Overview of Flow Technologies

https://www.youtube.com/watch?v=HJhQaMN1ddo

 

Visibility in the Data Center

https://thwack.solarwinds.com/community/thwackcamp-2018/visibility-in-the-data-center

 

Calculate interface bandwidth utilization

https://support.solarwinds.com/Success_Center/Network_Performance_Monitor_(NPM)/Knowledgebase_Articles/Calculate_interface_bandwidth_utilization

 

sFlow Traffic Characterization Worksheet

https://thwack.solarwinds.com/docs/DOC-203350

Choosing the right monitoring tool can be difficult. You have fires to put out, time is limited, and your allocated budget may rival that of a first grader's allowance. When budgets are tight, there's nothing better than free, and many of you may lean on open-source solutions. These tools usually have no price tag and are essentially "free," but we have a saying here at SolarWinds®... "Is it free like a puppy, or free like a beer?"

 

While open-source software has no purchase cost, the caveat is that you usually need to put extensive work into getting it up and running. What if you had an alternative? A monitoring solution already purpose-built for you, that is intuitive and helps cover the essentials. I'd like to introduce you to SolarWinds ipMonitor® Free Edition. The free edition of ipMonitor offers all the same functionality as the paid software and supports up to 50 monitors.

 

ipMonitor is a comprehensive monitoring solution for your network devices, servers, and applications in a consolidated view. The tool is streamlined for simple agent-less monitoring of availability, status, and performance metrics in a lightweight tool that can be installed almost anywhere.

 

Perfect for even the smallest satellite office, ipMonitor sets up in minutes, uses minimal resources, and is completely self-contained, so there is no need to install a web front end or separate database and be forced to maintain it.

 

 

Use and customize built-in dashboards to organize the critical data in your environment.

Easily track response time, hardware health, or bandwidth of your firewalls, routers, and switches.

Monitor servers for CPU, memory, drive space, and even critical services.

 

 

Drill down to investigate in more granular detail and view historical statistics.

Click a chart to instantly generate an automated report to share, print, or save.

Leverage built-in service monitors or assign port checks.

Pull performance counters or simulate user experience through built-in wizards.

 

  Take advantage of simplified NOC views to quickly pinpoint areas of concern.


 

 

There is a ton of power packed in such a small package, and best of all - it's FREE!  Download it for yourself. Check it out here: ipMonitor Free Edition | SolarWinds

 

Want to learn more? Check out the upcoming webinar: https://launch.solarwinds.com/essential-monitoring-with-ipmonitor-re-broadcast.html

 

Share feedback or see how others are leveraging ipMonitor in the ipMonitor forum on THWACK.

 


Need to expand beyond the free edition? ipMonitor offers the ability to scale to help stay ahead of the next crisis, without emptying the pocketbook. Whether you run a small business or need dedicated monitoring for a particular project fast, ipMonitor is designed to simplify the day-to-day.

 

Check out the ipMonitor documentation in the SolarWinds Success Center

SolarWinds® Access Rights Manager (ARM) v9.1 is now available on the customer portal! For a broad overview of this release, the release notes are a great place to start: https://support.solarwinds.com/Success_Center/Access_Rights_Manager_ARM/Access_Rights_Manager_9_1_Release_Notes

 

Feature Summary

View and Manage Azure AD Accounts with ARM

Create Azure AD accounts with ARM

Identify shared directories and files on OneDrive

Create a report about directories and files shared on OneDrive

Identify users assigned to a transaction code in SAP R/3

Identify multiple authorizations for transaction codes in SAP R/3

Identify critical basic permissions in SAP R/3

Conclusion

Feature Summary

 

The primary changes you will see in this new release are designed to extend support for your critical applications and simplify integration with other systems and business processes, with explicit design to save you time on repetitive tasks.  

 

1.    Rebranded interface.  The legacy 8MAN branding has been removed and the UI now looks similar to other SolarWinds products.  This is a small change but the first step in making ARM an important part of the SolarWinds security portfolio.

 

2.    Microsoft Azure Active Directory.  SolarWinds ARM now provides the ability to see and change permissions within Azure Active Directory.  By extending ARM to Azure-based Active Directory deployments, organizations who are directly leveraging Azure or who have hybrid environments can now utilize ARM to get better visibility and control over both. 

 

3.    Microsoft OneDrive.  SolarWinds ARM has been extended to include permissions visibility and change for Microsoft OneDrive, complementing the existing access rights permission visibility with Active Directory, Exchange, and file servers. Gain visibility into key areas, such as which files an employee has shared externally, and who has shared what files and directories internally with which employees.

 

4.    SAP R/3.  With this release, SolarWinds ARM introduces support for SAP R/3, allowing you to search for security-critical transaction codes, find authorization paths, and recognize multiple authorizations.  See which Active Directory users are assigned to each SAP account through the Access Rights Manager interface.

 

 

5.    UI/UX Improvements.  The ARM UI now has a more modern look.  The loading indicators have been improved.  We’ve added user pictures next to the comment boxes.  The user experience has also been improved by introducing tables with persistence in areas such as the resource view.  There is no longer any need to reapply your changes to the order or size of columns; they stay as you set them.  Also, Analyze & Act scenarios can now be selected much more easily thanks to the new grouping and filtering functionality.  We heard you and made these improvements to make your job easier.

 

6.    Microsoft SQL Server Express Integration.  To make installation easier for smaller environments, ARM now supports the automatic installation and configuration of Microsoft SQL Server Express directly from the ARM configuration page.  Use this option out of the box, or use Microsoft SQL Server instead if you need a higher-performance database.

 

7.    ARM Sync!  Most companies have several systems in place to manage users and their data.  This includes Active Directory, HR systems, and ERP systems.  Without proper synchronization processes, the systems may have an inconsistent view of the user’s data, resulting in administrators and HR employees having a difficult time identifying the correct set of data. ARM Sync! helps automate the data exchange between third-party systems and a system administered with ARM. With ARM Sync!, you can automatically create, deactivate, or delete user accounts.

 

8.    Recurring Task Scripting. Scripts are often used by administrators to ease the execution of recurring or repetitive tasks.  ARM now allows you to make a script available to users via the cockpit in a safe way to allow those users to execute an action immediately without an approval workflow.  These scripts can be executed before or after user provisioning processes, making it flexible and easy to apply.

 

9.    Create SharePoint Permission Groups.  Industry best practice for SharePoint and file servers is not to grant permissions directly to users, but instead via group memberships in resource groups. With the Group Wizard for SharePoint, ARM relieves you of the many manual steps needed to do this.  ARM now lets you assign authorizations through a simple drag-and-drop procedure, and ARM will automatically create authorization groups and group memberships for both SharePoint Online and SharePoint on-premises.

 

The SolarWinds product team is excited to make this new set of features available to you.  We hope you enjoy them.  Of course, please be sure to create new feature requests for any additional functionality you would like to see with ARM in general.

 

To help get you going quickly with this new version, below is a quick walk-through of the new Azure Active Directory, OneDrive, and SAP R/3 features.

 

View and Manage Azure AD Accounts with ARM

ARM helps you to view, manage, and get control of your accounts in Azure AD and on-premises AD through a common interface.

 

1. Use the search box to find an Azure AD (AAD) account.  Use the search configuration (arrow) to ensure that Azure AD accounts are included in your search results.

 

 

2. Click on the desired entry. The icon with the cloud symbolizes an AAD account.

3. ARM focuses on the account. After right-clicking, select the appropriate action you want to perform.

 

Create Azure AD accounts with ARM

Create new Azure AD accounts or groups based on templates. Ensure the correct attributes and data are set.

 

1. On the start page, click "Create new user or group". 

2. Click on the desired template for a new user or new group in the AAD.

3. Enter the required information.

The information requested by the template can be fully customized.

 

4. Specify the logon information used to create the account in the AAD.

 

5. Enter a comment.

 

6. Start the execution.

 

Identify shared directories and files on OneDrive

OneDrive is an easy tool to let your employees share resources with each other and/or external users. ARM makes it easy for you to check which files an employee has shared externally, and who has shared what files and directories internally with which employees.

 

Option A: Browse through the OneDrive structure.

 

1. Select the resource view.

 

2. Expand OneDrive.

 

3. Browse the OneDrive structure.

 

4. ARM displays the permissions.

 

5. ARM shows you the authorized users.

 

"External" is used to identify files or folders shared externally. OneDrive creates a link (hence the symbol used). Anyone who owns the link can read or change it.

"Internal" identifies files or folders that are shared within the organization.

 

If a file or folder is shared with a specific user (email address) within the organization, this user is given permission (not a link).

 

Option B: Search for shared resources on OneDrive.

1. Search for "Internal" or "External" in …

 

2. OneDrive Accounts. 

 

3. This opens a scenario that displays all files and folders shared internally or externally with OneDrive.

 

Create a report about directories and files shared on OneDrive

Sometimes a report is easier to share, or you just want to follow up later on something you found. ARM allows you to easily generate a report about the files and folders your employees share on OneDrive.

1. Select the resource view.

 

2. Expand OneDrive and select a resource.

 

3. Select "Who has access where?".

4. The previously selected resource is preset.

 

5. Optional: Delete the preselected resources.

 

6. Use drag-and-drop to add resources.

 

7. Start report creation.

 

Identify users assigned to a transaction code in SAP R/3

Transaction codes are important entities of SAP permissions. ARM helps you identify which users are assigned to a specific transaction code, either directly or indirectly via membership in roles.

 

1. Use the search to find the transaction code you are looking for.

2. Click on the search result.

 

3. ARM automatically expands the tree view of the permission structure and focuses on the transaction code you are looking for.

 

4. ARM displays all permissions.

 

5. ARM displays all SAP users to whom the transaction code is assigned.

 

Identify multiple authorizations for transaction codes in SAP R/3

As with all permissions, there is often more than just one way a transaction code has been assigned to a user. ARM resolves all of these authorization paths and clearly visualizes these, leaving no room for ambiguity.

 

1. Use the search to find the transaction code you are looking for.

2. Click on the search result.

3. ARM automatically expands the tree view of the authorization structure and focuses on the transaction code you are searching for.

 

4. In the user list, ARM shows you how many authorization paths (arrows) have been set for the transaction code. Click on the user.

 

5. ARM shows you the authorization paths.
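To picture what "multiple authorization paths" means, here is a small Python sketch that enumerates every route from a user to a transaction code through nested roles. It is purely illustrative and unrelated to how ARM or SAP store this data; the `grants` mapping and all user, role, and transaction code names are invented.

```python
# Hypothetical assignments: each user or role maps to what it directly grants.
grants = {
    "JDOE": ["ROLE_CLERK", "ROLE_ALL"],
    "ROLE_CLERK": ["ZDEMO01"],
    "ROLE_ALL": ["ROLE_CLERK", "ZDEMO01"],
}

def authorization_paths(start, tcode, grants, _path=None):
    """Return every path from `start` to the transaction code `tcode`."""
    path = (_path or []) + [start]
    if start == tcode:
        return [path]
    found = []
    for nxt in grants.get(start, []):
        if nxt not in path:  # never walk in circles
            found.extend(authorization_paths(nxt, tcode, grants, path))
    return found

paths = authorization_paths("JDOE", "ZDEMO01", grants)
print(f"{len(paths)} authorization paths")
for p in paths:
    print(" -> ".join(p))
```

Removing one role from the user closes only the paths that run through it, so the code stays reachable until every path is gone.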

 

Identify critical basic permissions in SAP R/3

Use ARM to check regularly for critical basic authorizations following the principle of least privilege, and reduce the risk of data leakage.

 

1. Use the search box to find and select the critical basic authorization you are looking for. ARM opens the SAP authorization structure and focuses on the entry you are looking for.

 

2. Browse through the subordinate structure to analyze the use of the critical basic authorization.

 

Conclusion

That is all I have for now on this release.  I hope that this summary gives you a good understanding of the new features and how they can help you more effectively manage the permissions of your Azure AD, SharePoint, OneDrive, and SAP R/3 applications. 

I look forward to hearing your feedback once you have this new release up and running in your environment!

 

If you are reading this and not already using SolarWinds Access Rights Manager, we encourage you to check out the free download.  It’s free. It’s easy.  Give it a shot.

The Ghosts of Config Past, Present, and Future (Well, Sort Of)

 

The scene is set: the curtains open to a person in bed trying to get a good night’s sleep during a dark and windy night. The hair on the back of their neck is standing on end, and with one big gust their worst fears come true! In bursts a flurry of emails demanding proof for configs of old.

 

Okay, okay, while I’m no Hemingway, I can tell you that we’ve all experienced the nightmare of being visited by configs of old. Being bothered to prove an older configuration was in compliance is a real pain, and the thought of doing this manually makes your skin crawl. Enter SolarWinds® Network Configuration Manager (NCM) v7.9 and the Favorite config.

 

Being a “favorite” is always a good thing, and the same can be said for Favorite configs inside of Network Configuration Manager. Just as any favorite gets special handling, Favorite configs are granted special privileges within compliance policies. Compliance policies always evaluate the most recent version of a configuration file. If you’re trying to prove compliance of an old file, you need to tell NCM to use that file instead. You do that by setting the config as Favorite.
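As a mental model, the selection can be sketched in Python like this. It is a conceptual illustration only, not NCM's implementation or its SDK; the `Config` class, the function name, and the behavior when no Favorite exists are all assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Config:
    node: str
    downloaded: date
    favorite: bool = False

def config_to_evaluate(configs, use_favorite):
    """Pick the config a compliance policy would check for one node."""
    if use_favorite:
        favorites = [c for c in configs if c.favorite]
        # Assumption: with no Favorite marked there is nothing to evaluate.
        return favorites[0] if favorites else None
    # Default behavior described in the post: the most recent config wins.
    return max(configs, key=lambda c: c.downloaded)

# Hypothetical config history for a single node.
configs = [
    Config("core-sw1", date(2018, 1, 10), favorite=True),
    Config("core-sw1", date(2018, 6, 1)),
]
print(config_to_evaluate(configs, use_favorite=False).downloaded)
print(config_to_evaluate(configs, use_favorite=True).downloaded)
```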

 

If you set one config from each node as Favorite, then those Favorites will forever be treated as the most recent. This means that you, as the user, would be able to prove these configs’ compliance at any point in the future from that day without any extraordinary effort. The best part is that getting this set up can be fairly easy if you have established rules and policies.

 

Simply mark a config as Favorite either through the UI or, for the savvy user, through the SDK. This is done by navigating to the Configuration Management page and expanding the list of configs nested under a node.

 

Once this is done, you need to make sure to set up or modify your Policies to use this config type.

 

After the policies are set, just add these policies to a Compliance Report. 

 

 

After the Compliance Report is set up, update the report and click on it to see the output. You can verify that this is evaluating the correct config by drilling into any violation and clicking the “View Config” link.

 

If everything is set up correctly, you will see the details for the Favorite config. 


 

And there you have it! You’ll no longer be pressed to manually evaluate older configs for audit review or documentation. If you find this useful, have any comments, or would like to see how this can be done through the SDK, please let me know below!

The team continues to hammer away on enhanced and new application template content for Server & Application Monitor. The list below adds to what has been discussed in recent posts, which you can find here, here, and here.

 

In this update, we will walk through the latest updates, including:

 

  • Version 2 of Office 365 monitoring – We’ve reorganized the templates a bit, but more importantly, fixed an issue some customers were experiencing where components would randomly go into unknown status
  • Citrix XenServer – Net-new template support
  • Citrix PVS Accelerator for XenServer – Net-new template support
  • Oracle RAC (Real Application Clusters) – Net-new template support

 

As always, please let us know if you have any comments about these templates or requests to add to our list for new template creation. 

 

The info provided in this post is relatively high-level. Click on the links to see the complete detail for each new or updated template.

 

With that, let’s jump right in!

 

Office 365 Exchange Enhancements:

 

As mentioned above, besides reorganizing the templates a bit, the main update here is the fix to the issue some folks had reported where components would randomly go unknown. The issue was caused by Microsoft’s “Global Throttling Policy,” which limits simultaneous connections from one client to O365 to a maximum of three.

 

To overcome this concurrency issue, we have implemented a locking mechanism so that no more than three scripts establish a connection with Office 365 at the same time.
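The general idea of capping concurrent connections can be illustrated with a counting semaphore. The Python sketch below only demonstrates the technique; it is not the template's code, the actual scripts may coordinate differently (for example across separate processes), and the component names and the sleep standing in for a remote session are invented.

```python
import threading
import time

MAX_CONNECTIONS = 3  # the connection limit described in the post
connection_slots = threading.BoundedSemaphore(MAX_CONNECTIONS)

_counter_lock = threading.Lock()
active = 0  # sessions open right now
peak = 0    # highest number of sessions open at once

def run_component(name):
    """Simulate one monitoring script that needs a remote session."""
    global active, peak
    with connection_slots:      # waits while three sessions are already open
        with _counter_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)        # stand-in for the real remote work
        with _counter_lock:
            active -= 1

threads = [threading.Thread(target=run_component, args=(f"component-{i}",))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrent sessions: {peak}")
```

Eight components all get their turn, but the peak never exceeds the limit, so none of them is rejected by throttling.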

 

Oracle RAC:

Next up, following on from the previous Oracle template updates, we are also releasing a new template for Oracle RAC, which you can download and read more about at https://thwack.solarwinds.com/docs/DOC-203744

 

The list of metrics available for monitoring includes:

  • Average MTS response time
  • Average MTS wait time
  • Sort ratio
  • MTS UGA memory
  • Database file I/O reads
  • User locks
  • Locked users
  • Global cache service utilization
  • Global cache block lost
  • Global cache average block receive time
  • Long queries elapsed time
  • Redo logs contentions
  • Active users
  • Buffer cache hit ratio
  • Dictionary cache hit ratio
  • Average enqueue timeouts
  • Global cache block access latency
  • Nodes down
  • Long queries count
  • Database file I/O write operation
  • Global cache corrupt blocks

 

 

The thing to keep in mind about this template is, just like our other Oracle templates, it requires some prerequisites be set up on the Orion Server and/or Poller for it to work.

 

 

 

 

Citrix XenServer:

The third template we are releasing is for XenServer, which you can download and read more about here – https://thwack.solarwinds.com/docs/DOC-203745

 

The template monitors the host as well as the guest VMs running on that host, including the following metrics:

 

  • Host - Free Memory
  • Host - Average CPU
  • Host - Control Domain Load
  • Host - Reclaimed Memory
  • Host - Potential Reclaimed Memory
  • Host - Total Memory
  • Host - Total NIC Receive
  • Host - Total NIC Send
  • Host - Agent Memory Allocation
  • Host - Agent Memory Usage
  • Host - Agent Memory Free
  • Host - Agent Memory Live
  • Host - Physical Interface Receive
  • Host - Physical Interface Sent
  • Host - Physical Interface Receive Error
  • Host - Physical Interface Send Error
  • Host - Storage Repository Cache Size
  • Host - Storage Repository Cache Hits
  • Host - Storage Repository Cache Misses
  • Host - Storage Repository Inflight Requests
  • Host - Storage Repository Read Throughput
  • Host - Storage Repository Write Throughput
  • Host - Storage Repository Total Throughput
  • Host - Storage Repository Write IOPS
  • Host - Storage Repository Read IOPS
  • Host - Storage Repository Total IOPS
  • Host - Storage Repository I/O Wait
  • Host - Storage Repository Read Latency
  • Host - Storage Repository Write Latency
  • Host - Storage Repository Total Latency
  • Host - CPU C State
  • Host - CPU P State
  • Host - CPU Utilization
  • Host - HA Statefile Latency
  • Host - Tapdisks_in_low_memory_mode
  • Host - Storage Repository Write
  • Host - Storage Repository Read
  • Host - Xapi Open FDS
  • Host - Pool Task Count
  • Host - Pool Session Count
  • VM - CPU Utilization
  • VM - Total Memory
  • VM - Memory Target
  • VM - Free Memory
  • VM - vCPUs Full Run
  • VM - vCPUs Full Contention
  • VM - vCPUs Concurrency Hazard
  • VM - vCPUs Idle
  • VM - vCPUs Partial Run
  • VM - vCPUs Partial Contention
  • VM - Disk Write
  • VM - Disk Read
  • VM - Disk Write Latency
  • VM - Disk Read Latency
  • VM - Disk Read IOPs
  • VM - Disk Write IOPs
  • VM - Disk Total IOPs
  • VM - Disk IO Wait
  • VM - Disk Inflight Requests
  • VM - Disk IO Throughput Total
  • VM - Disk IO Throughput Write
  • VM - Disk IO Throughput Read
  • VM - VIF Receive
  • VM - VIF Send
  • VM - VIF Receive Errors
  • VM - VIF Send Errors

 

 

Citrix PVS Accelerator for XenServer

Last but not least, we added a net-new template for Citrix PVS Accelerator for XenServer, which you can read more about and download here - https://thwack.solarwinds.com/docs/DOC-203773

 

The template includes the following metrics available for collection:

 

  • PVS - Accelerator Eviction Rate
  • PVS - Accelerator Hit Rate
  • PVS - Accelerator Miss Rate
  • PVS - Accelerator Traffic Clients Sent
  • PVS - Accelerator Traffic Servers Sent
  • PVS - Accelerator Read Rate
  • PVS - Accelerator Saved Network Traffic
  • PVS - Accelerator Space Utilization

 

That’s it for this round of content updates! We have more in process and will post to let you all know as soon as they are ready. As always, you can suggest new templates or features for SAM by creating a Feature Request.

 

Network Configuration Manager (NCM) v7.9 is available today on the customer portal! For a broad overview of this release, the release notes are a great place to start. This is a particularly pleasing release as we are delivering a feature that has received over 470 votes: Multi-Device Baselines.

 

What are Configuration Baselines?

Baselines are often attached to the act of measuring and rating the performance of a given object (interface, device, or similar) in real time. In configuration management terms, baselines are used to provide a framework for change control and management. The configuration baselines measure and evaluate the content set within the config and indicate whether the content is aligned to the baseline or not.      

 

Given that configuration changes over time are more difficult to directly observe and more complex to manage, this means that baselines play a role in monitoring and preventing unwanted changes. I find that this definition of baselines from Techopedia is interesting and accurate:

“It is the center of an effective configuration management program whose purpose is to give a definite basis for change control in a project by controlling various configuration items like work, features, product performance and other measurable configuration.”

 

This means that manual monitoring may be possible for a small number of nodes, but it is neither practical nor reasonable to scale this type of monitoring framework. Actively monitoring each device’s config makes the validation of consistency and alignment to corporate or regulatory requirements reliable and possible.

 

Baselines

The great news is that NCM already helps with mitigating the challenges related to monitoring configuration drift by providing config change reports, Real Time Change Detection, rules and policies that monitor configurations based on a set of user-defined conditions, and a one-to-one configuration baselining. What we implemented in the latest version of NCM extends and improves configuration baselines to include:

  1. Creating new baseline(s) through
    1. Promoting an existing config to be a baseline, or
    2. Creating a new baseline by copy/paste or loading a file
  2. Ignoring unnecessary configuration lines (or lines unique to each device)
  3. Applying baseline(s) to a single node or multiple nodes

 

<New!> Baseline Management

In this release, there is a new list view of all baselines that have been created or migrated from an upgrade. From this new page, users can create new baselines, edit existing, apply or remove nodes for a given baseline, enable or disable a baseline, update the status of the baseline, or delete a baseline.

 

<New!> Updated Diff Viewer

A major improvement in this release is the implementation of a new diff viewer for baselines. This new diff viewer will collapse lines that are unchanged, highlight ignored lines as gray, and mark all changes as yellow.
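To get a feel for what "collapsing unchanged lines" means, here is a small Python sketch using the standard difflib module. It is only an analogy for the viewer, not NCM code: the function name and the sample config lines are assumptions, and it does not reproduce the gray and yellow highlighting.

```python
import difflib


def collapsed_diff(baseline_text, config_text, context=1):
    """Return unified-diff lines; unchanged runs beyond `context` lines are left out."""
    return list(difflib.unified_diff(
        baseline_text.splitlines(),
        config_text.splitlines(),
        fromfile="baseline",
        tofile="config",
        n=context,       # number of unchanged lines kept around each change
        lineterm="",
    ))


# Made-up sample data for illustration only.
baseline = "\n".join([
    "hostname core",
    "ntp server 10.0.0.1",
    "snmp-server community public",
    "line vty 0 4",
    " transport input ssh",
])
config = baseline.replace("ntp server 10.0.0.1", "ntp server 10.0.0.2")

for line in collapsed_diff(baseline, config):
    print(line)
```

With one line of context, only the changed NTP line and its immediate neighbors appear in the output; the vty lines further down are collapsed away.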

 

 

More Ways to Create a Baseline

The process of creating baselines should be easy—take an existing config and simply apply it against a set of nodes, right? In NCM, you can do just that by promoting an existing configuration, loading a config from file, or copying and pasting.

 

Promoting a config is now nested under the node and in the baseline column:

 

Creating a new baseline can be done through the new Baseline Management Page:

 

No matter the steps to create the baseline, each will ultimately lead to applying the baseline to the nodes and configs.

 

Ignoring Extraneous Config Lines

One of the key challenges with baselines is being able to get an accurate assessment of the config and not having false positives for config lines that are unique to a node or not relevant to the baseline. In NCM v7.9, we have introduced an ignore line capability that allows users to click through lines that are not relevant to the baseline to aid in reducing false positives. To read more on this, check out this link.
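The effect of ignoring lines can be sketched in a few lines of Python. In NCM you pick the lines by clicking them; the regular-expression patterns, the function name, and the sample lines below are assumptions used only to show why ignoring device-specific lines removes false positives.

```python
import re


def find_violations(baseline_lines, config_lines, ignore_patterns=()):
    """Return baseline lines that are missing from the config, skipping ignored lines."""
    ignored = [re.compile(p) for p in ignore_patterns]

    def is_ignored(line):
        return any(rx.search(line) for rx in ignored)

    present = {line.strip() for line in config_lines}
    return [line for line in baseline_lines
            if not is_ignored(line) and line.strip() not in present]


# Made-up sample data: the hostname is unique to each device.
baseline_lines = ["hostname core-sw-01", "ntp server 10.0.0.1", "service password-encryption"]
config_lines = ["hostname edge-sw-17", "ntp server 10.0.0.1"]

print(find_violations(baseline_lines, config_lines))
print(find_violations(baseline_lines, config_lines, ignore_patterns=[r"^hostname "]))
```

Without the ignore pattern, the hostname line is reported even though it is expected to differ per device. With it, only the line that is truly missing remains.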

 

Baseline Status Indicators

To monitor whether or not a node (config) is in compliance with a baseline or baselines, there needs to be a visual and written indication. Baseline Management, Configuration Management, and ‘Baseline vs. Config Conflicts’ report all now have visual and written indicators. On the Configuration Management page, there is a new baseline column that contains the visual and written indication of whether or not that node is in alignment with the baselines applied.

 

For each status, there is a hover that provides a list of all the baselines and their associated status for that node.

 

The new Baseline Management view provides a complete list view of all baselines that have been created. This view is meant to show the alignment of all the nodes that are applied against a single baseline.

 

Each baseline can be expanded to show the status for different nodes to which it is applied (similar to the hover for Configuration Management). Each one of the statuses is clickable and will load the diff of that baseline vs. the config selected.

 

Lastly, the “Baseline vs. Config Conflicts” report also inherits the visual indicators and now shows the status of a node to one or many baselines.

 

This is a major step forward for baselines and the monitoring of configuration drift within NCM. Of course, please be sure to create new feature requests for any additional functionality you would like to see with baselines or NCM in general.

 

Helpful Links:

NCM v7.9 Release Notes

NCM Support Documentation

Network Configuration Management Software

Network Performance Monitor (NPM) 12.4 and the Orion Platform 2018.4 are now generally available in your customer portal. For those of you subscribing to the updates in What We're Working on for NPM (Updated June 1st, 2018), you may have noticed a line item called "Centralized Upgrades." This update will give you the first chance to experience Centralized Upgrades on your environment.

 

Great news: this upgrade is going to be easier than ever!

 

 

Planning for Your Upgrade to 2018.4

 

Read the release notes and minimum system requirements prior to installation as you may be required to migrate to new server or database infrastructure. For quick reference, I have provided a consolidated list of release notes below.

Note: Customers running Windows Server 2012, 2012 R2, and SQL 2012 will be unable to upgrade to these latest releases prior to migrating to a newer Windows operating system or SQL database version. Check for the recommended Microsoft upgrade path through the upgrade center.

 

See more information about why these infrastructures are deprecated here: Preparing Your Upgrade to Orion Platform 2018.4 and Beyond - Deprecation & Other Important Items

 

SolarWinds strongly recommends that you update to Windows Server 2016 or higher and SQL Server 2016 or higher at your earliest convenience. 

 

 

 

 

 

Refresh your upgrade knowledge with the following upgrade planning references.

 

 

Always back up your database and, if possible, take a snapshot of your Orion environment.

 

 

Start Your Upgrade on the Main Polling Engine

 

Download any one of the latest release installers to your main polling engine.

 

For the screenshots that follow I'm upgrading my Orion deployment with the following setup:

  • Main Polling Engine is installed with Virtualization Manager (VMAN) 8.3 and will be upgraded to VMAN 8.3.1
    • Utilizes a SQL 2016 database
  • Three scalability engines
    • One Free Additional Polling Engine for VMAN on Windows 2012
    • One Free Additional Polling Engine for VMAN on Windows 2016
    • One HA Backup on Windows 2016

 

My first screen confirms my upgrade path to go from 8.3 to 8.3.1.

  • If I'm out of maintenance for a specific product, I would see indicators here first on the screen. Being out of active maintenance will prevent you from upgrading this installation to the latest, so please pay attention to the messaging here.
  • The SolarWinds installer will upgrade all of the products on this server to the versions of product that are compatible with this version of the Orion Platform for optimal stability. This may mean that you'll be upgrading more than just one product.
  • When in doubt, feel free to run the installer to see the upgrade path provided, so you can plan for your downtime. Cancelling out at the pre-flight check stage will give you all the information needed to plan ahead, without surprises and without changes to your environment.  This information can also be used for your change request before scheduling downtime for your organization.

The second step will run pre-flight checks to see if anything would prevent my upgrade from being successful on the main polling engine.

  • In case there are no blocking, warning, or informational pre-flight checks, we will proceed straight to the next step, accepting the EULA.
    • My main polling engine server and DB meet all infrastructure system requirements for the 2018.4 Orion Platform, so I am not shown any blocking pre-flight checks at this stage.
  • Pre-flight checks can block you from moving forward with your installation
    • You may need to confirm whether you meet new infrastructure requirements (e.g. NTA 4.2.3 -> 4.4 upgrade) to proceed. Blockers will prevent you from successfully installing or upgrading, so the installer will not allow you to proceed until those issues have been resolved.
    • Warning pre-flight checks give you important information that could affect the functionality of your install after upgrade but will not prevent you from successfully installing or upgrading.
    • Informational pre-flight checks give you helpful troubleshooting information for "what if" scenarios, in case we don't have enough information to determine whether this would be an active issue for your installation.
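The three kinds of pre-flight checks boil down to a simple rule: only blockers stop the installer. Here is a minimal Python sketch of that rule. The class names, fields, and sample messages are assumptions for illustration and do not reflect the installer's internals.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BLOCKER = "blocker"
    WARNING = "warning"
    INFO = "informational"


@dataclass
class PreflightResult:
    name: str
    severity: Severity
    message: str


def can_proceed(results):
    """Only blockers stop the upgrade; warnings and informational checks do not."""
    return not any(r.severity is Severity.BLOCKER for r in results)


def count_by_severity(results):
    """Count how many checks of each severity were reported."""
    counts = {severity: 0 for severity in Severity}
    for r in results:
        counts[r.severity] += 1
    return counts


# Made-up results for illustration.
checks = [
    PreflightResult("os-version", Severity.BLOCKER, "Operating system is not supported"),
    PreflightResult("disk-space", Severity.WARNING, "Free disk space is low"),
    PreflightResult("proxy", Severity.INFO, "Could not determine proxy settings"),
]
print(can_proceed(checks))      # False, because one blocker is present
print(can_proceed(checks[1:]))  # True, warnings and info do not block
```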

 

The online installer will start to download all installers needed from the internet.

  • SolarWinds recommends that you use the online installer because it will be able to auto-update and download exactly what's needed for the installation. Not only is it more efficient, but it will save you from downloading unnecessary or outdated bits.

 

This screen gives you an overview of next steps. The Configuration wizard will launch next, to allow you to configure database settings and website settings.

In this release, all scalability engines, including Additional Polling Engines, Additional Websites and HA Backup Servers, can be upgraded in parallel manually, using the scalability engine installer. Manual upgrades are still supported, but if you have scalability engines, please try our centralized upgrade workflow to save you time.

 

Follow the configuration wizard steps to completion. If you only have a main polling engine to upgrade, your installation is now complete. Log in to your SolarWinds deployment and enjoy the new features that have been built with care for your use cases.

 

Centralized Upgrades of the Scalability Engines

For those customers who have chosen to scale out their environment using scalability engines, such as Additional Polling Engines, HA Backup Servers, or Additional Websites, this is the section for you.

 

If you kept the "Launch Orion Web Console" checkbox checked in the final step of the Configuration Wizard, the launched web browser session will navigate you directly to the Updates Available page, where you can continue with the Centralized Upgrade workflow. If you want to open a new web browser session on a different system, you can quickly navigate to where you want to go by following these steps.

 

Launch the web browser and log in.

Navigate to 'My Orion Deployment' from the Settings drop-down.

 

Click the UPDATES AVAILABLE tab. If this tab is not showing, that means there are no updates available for you to deploy.

Click Start to begin the process of connecting to your scalability engines.

My environment is not experiencing any issues connecting to my scalability engines.

Bookmark this page, "Connection problems during an Orion Deployment upgrade" (SolarWinds Worldwide, LLC. Help and Support), for future guidance on common "gotcha" scenarios and how to handle them.

After contact with the scalability engines has been established, pre-flight checks will be run against all scalability engines.

Looking at my pre-flight checks, you can see that one server, PRODMGMT-49, has a blocker that would prevent upgrades from occurring, mainly that it does not meet infrastructure requirements for this version of the Orion Platform.

However, my "Start Upgrade" is enabled. This is because if at least one scalability engine is eligible for upgrade, we will allow you to proceed. Only when none of the scalability engines are eligible will this button be disabled. Pay attention to servers that have blocking pre-flight checks, as you will have to manually upgrade them or move items being monitored via this scalability engine to one that is upgraded.
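The rule for the "Start Upgrade" button can be written down in a few lines. This Python sketch is an illustration only: the function name and the server names other than PRODMGMT-49 are assumptions, and the input is simply a made-up mapping of each server to its blocking pre-flight messages.

```python
def upgrade_plan(blockers_by_server):
    """Split servers into centrally upgradable and manual ones.

    blockers_by_server maps a server name to its list of blocking
    pre-flight messages (an empty list means the server is eligible).
    """
    central = sorted(s for s, blockers in blockers_by_server.items() if not blockers)
    manual = sorted(s for s, blockers in blockers_by_server.items() if blockers)
    return {
        "start_enabled": bool(central),  # enabled if at least one engine is eligible
        "central": central,
        "manual": manual,                # upgrade by hand, or move the monitored items
    }


# PRODMGMT-49 is the blocked server from the screenshot; the other names are made up.
plan = upgrade_plan({
    "PRODMGMT-49": ["Does not meet infrastructure requirements"],
    "APE-2016": [],
    "HA-BACKUP": [],
})
print(plan)
```

If every server in the mapping has at least one blocker, start_enabled comes back False, which matches the disabled button.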

 

Clicking "Start Upgrade" begins the centralized upgrade process, first by downloading all the necessary bits to all the scalability engines in parallel. Notice how my scalability engine that was on incompatible 2012 infrastructure is not being upgraded.

Grab a coffee as the rest of your installation and configuration happens silently on each of the servers being centrally upgraded.

Oh no, an error occurred. What can you do at this point?

  • Click Retry download after troubleshooting (e.g. did the scalability engine lose connectivity to the main polling engine?)
  • RDP directly into the server using the convenient RDP link that is provided

 

Common scenarios to investigate:

  • Is this scalability engine set up inconsistently from the other servers? For instance, you may have Engineer's Toolset on the Web installed on this server and not on the others.
  • Do some of the installed products have dependencies on .NET 3.5? Engineer's Toolset on the Web has a dependency on .NET 3.5 to be able to upgrade. Ensure that you have enabled .NET 3.5 and try again.
  • Check the Customer Success Center for more scenarios to help while troubleshooting.

 

In my case, I clicked Retry and was able to get past the issue.

My upgrade is complete! Congratulations on an upgrade well done.

Click Finish to complete your Centralized Upgrade session.

 

Gotchas - What to do with Unreachable Servers

If your server isn't being blocked because of incompatible infrastructure, you have an opportunity to manually upgrade that server in parallel while the rest of your environment is being centrally upgraded.

 

In the installation example captured below, if I were to run the installer on the Additional Website that is currently being upgraded by Centralized Upgrades, I would be blocked from running the installer on that server. However, for the listed unreachable Additional Website, I can run that upgrade manually in parallel with no problem.

If you're blocked from proceeding on a manual upgrade, you would see the following. Only after you have finished the Centralized Upgrade process will you be allowed to proceed with a manual upgrade that is blocked in this fashion. For these scenarios, simply navigate to My Orion Deployment and exit out of the deployment wizard flow to cancel the centralized upgrade session.

 

Manual Upgrades

Manual upgrades of your deployment are still supported. If you have only one scalability engine, Centralized Upgrades may not be the fastest way to upgrade. However, if you have more, it is. This upgrade is still beneficial for those considering using manual upgrades for their deployment, and the reason is the installation and configuration wizard process can now be run in parallel. Existing customers have always known that there were some scenarios where you could run the configuration wizard in parallel across servers (e.g. same server type) and some that you could not. It took time and training to understand what scenarios those were. In this release, that limitation is lifted, and all server types can be configured in parallel.

 

There are times where you may need to consider falling back to manual upgrades in combination with your Centralized Upgrade. As an example, take this installation: two have completed, one has the configuration wizard in process.

If the download, installation, or configuration is taking a long time for one of your scalability engines, and you need to see more information that is only available in the client, you may consider canceling out of the Centralized Upgrade session to resume the rest of your upgrade manually. The servers that have been upgraded thus far will remain in a good spot, so you can cancel out with confidence. Proceed with this option carefully, as you will want to ensure that you have upgraded everything by the end of your scheduled downtime.

Check the My Orion Deployment page to ensure that all the servers in your Orion deployment are upgraded.

 

Support

We have all been there: despite all the best intentions and all the preparation in the world, something went wrong. No worries! File a support ticket (Submit a Ticket | SolarWinds Customer Portal) and start gathering diagnostics via our new web-based, centralized diagnostics.

 

Click the Diagnostics tab.

Select all the servers in your deployment,

and click "Collect Diagnostics."

Sit back and relax as your diagnostics are centrally gathered in preparation for your support call.

 

Customer Experience

 

Early adopters and those who have participated in our release candidates have already begun to enjoy the benefits of centralized upgrades. Check out our THWACK forums for testimonials from customers just like you as they experience the new and improved "Easy Button" upgrade experience. Here's a link to one from one of our very own THWACK MVPs: The "Easy Button" has arrived with the December 2018 install of NAM (and other Solarwinds modules). If you'd like to share your upgrades with me, I'm very interested, and we'd love to see screenshots and your feedback on this new way to upgrade your SolarWinds deployment.

 

More centralized upgrade success - Success with Centralized Upgrades 

IPAM 4.8 has arrived and is now generally available! You can find this latest release in your Customer Portal. In recent releases, we’ve brought you integration with VMware vRealize Automation and Orchestrator and monitoring support for Amazon Web Services (AWS) Route 53 and Azure DNS. In this release, we have extended our support (yet again) to additional platforms and bring you these goodies:

 

Monitoring Support for Infoblox

You asked for it, you got it! This is our #1 integration feature request on THWACK®, and I’ve spoken to many of you at tech conferences about wanting us to monitor your Infoblox DHCP and DNS environments. IPAM provides valuable resources, alerting, and reporting capabilities without having to purchase add-ons, as well as a centralized management console across heterogeneous environments.


 

Migration to Core Custom Properties
We have migrated from product-specific custom fields to the unified custom properties designed to be simple and powerful for you to use with other Orion® Platform products. Now you can add new custom properties the same way you would for other modules and use them for IPAM entities in Reports and Alerts.

Support for More Linux Versions
We have extended DHCP and DNS support to the following Linux distributions:

    • Ubuntu 14.04
    • Ubuntu 16.04
    • Debian 9.5
    • Debian 8.6 (DHCP only)

 

HELPFUL LINKS:

 

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates.  All other trademarks are the property of their respective owners.

  • IPAM IP address management software

We’re delighted to announce the release of version 4.5 of NetFlow Traffic Analyzer (NTA)!

 

The latest release of SolarWinds® NetFlow Traffic Analyzer is designed to help create alerts based on application flows. In past releases, we could alert on the overall utilization of an interface and provide a view of the top talkers when the configured threshold was exceeded. In this release, you can set a threshold on the volume of a specific application in order to trigger an alert. We're making use of the Orion Platform alerting framework, so that flexibility is available to you.

 

You’ve outlined a small set of critical problems in multiple requests, and in this release, we’re delivering on the five most popular of these.

 

  • Application traffic exceeds a threshold – Alert triggered when we observe a specific application rate exceeds a user-defined threshold
  • Application traffic falls below a threshold – Alert that can provide visibility when an application “goes off the air” and stops communicating
  • Application traffic appears in the “TopN” list of applications – This alert triggers when application traffic increases suddenly relative to other applications
  • Application traffic drops from the “TopN” list of applications – Likewise, alert triggers for a sudden reduction relative to other applications
  • Flow data stops from a configured flow source – Alerts on the loss of flow instrumentation, and prompts to take action to help restore visibility

 

Contextual Alerting

The approach we're using to create alerts is built to guide users into a particular context—a source of flow where we see the application traffic—and then offers a simple user experience to create the alert.

To create an alert based upon any of these triggers, we must first select a source of flow data as a point of reference. We can do this in one of two ways.

 

We can visit the NTA Summary Page, and navigate to a particular source of flow data:

 

 

If the application of interest is in the TopN, we can expand it to see where this application is visible and select that source. That will take us to a detail page, which is already filtered by both application and source of the flow data.

 

We can also select our source of flow data directly in the Flow Navigator. We can build our alert based upon a node that reports flow, or upon a specific interface:

 

 

Once we have a context for an alert, we can select an application. If we use the "TopN Applications" resource, we have already identified both the application and the node or interface where it's visible.

Another way to arrive at this context can make use of the Flow Navigator, where we can explicitly select the application we’re interested in:

 

 

 

We can select either Applications, or NBAR2 Applications, to help describe the traffic. With the context now fully described, we are able to open the "Create a Flow Alert" panel and create our first alert:

 

 

At the top of the panel, we'll see the source of the flow data that we'll evaluate, and a default alert name prefix. We can customize the alert name to help make searching simpler. The severity of the alert is configurable:

 

For the Trigger Condition, we'll select one of the options described above. In this case, we'll select "Application Traffic exceeds Threshold," and we'll set a threshold of 50MBps on the ingress. We'll evaluate the last five minutes of traffic; this is configurable. This threshold will trigger when our traffic rate averages greater than 50MBps over the five-minute period.

 

Finally, we can specify one or several protocols; if we specify more than one, we'll sum the traffic volumes for all the protocols.
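To make the evaluation concrete, here is a small Python sketch of the arithmetic described above: sum the selected protocols for each sample, average over the window, and compare against the threshold. It is not the SWQL that NTA generates. The function names, the one-sample-per-minute layout, and the numbers are assumptions, and the rates are simply in whatever unit the threshold uses.

```python
def average_rate(samples, protocols):
    """Average the summed rate of the selected protocols across all samples.

    samples is a list of dicts, one per polling interval in the window,
    mapping a protocol name to its observed rate.
    """
    if not samples:
        return 0.0
    totals = [sum(sample.get(p, 0.0) for p in protocols) for sample in samples]
    return sum(totals) / len(totals)


def should_trigger(samples, protocols, threshold, op=">"):
    """Evaluate the alert with the '>' or '<=' operation."""
    avg = average_rate(samples, protocols)
    if op == ">":
        return avg > threshold
    if op == "<=":
        return avg <= threshold
    raise ValueError("op must be '>' or '<='")


# Five made-up one-minute samples for a five-minute window.
window = [
    {"https": 40, "ssh": 5},
    {"https": 60, "ssh": 5},
    {"https": 55, "ssh": 5},
    {"https": 45, "ssh": 5},
    {"https": 50, "ssh": 5},
]
print(should_trigger(window, ["https"], threshold=50))         # average is exactly 50
print(should_trigger(window, ["https", "ssh"], threshold=50))  # summed average is 55
```

The first call does not trigger because the average equals the threshold rather than exceeding it. Adding the second protocol raises the summed average above 50, so the second call does.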

 

To create the alert, there are two options. We can select the "Create Alert" immediately, and this will simply log the alert when it triggers. Or, we can check the box to open the alert in the Advanced Alert Editor and then select "Create Alert." Selecting this option will redirect us to the last step in the "Add New Alert" wizard, where we can modify the trigger actions, reset actions, or time of day schedule.

 

 

The trigger condition is an advanced SWQL query, pre-populated with the contextual information on the source and application.

 

Before submitting this new alert, we'll see a message indicating whether the alert will trigger immediately.

 

Practical Alert Scenarios

Use the "exceeds threshold" alert for application traffic levels that average above or below the specified threshold.

Use the operation ">" (greater than) or "<=" (less than or equal to) to determine whether you alert above or below the threshold. For example:

  • To determine when backup application traffic is running out of schedule
  • To identify large file transfers in the middle of the day
  • To identify DDOS attacks, or when Port 0 traffic is present at all

Use the "<=" operation with the “exceeds threshold” alert to help detect when an application server process goes offline and stops sending traffic.

  • The application service may have crashed
  • An intermediate connectivity problem (firewall or outage) may have reduced traffic

Alerts related to applications appearing in, or dropping out of, the TopN can be useful for detecting sudden changes in traffic volume relative to other applications. Examples include:

  • Detecting streaming or peer-to-peer file sharing applications that are transient
  • Detecting changes in the mix of applications that usually traverse an interface
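The TopN alerts are relative by nature, which a short Python sketch can show: rank the applications in two consecutive intervals and compare the two lists. The function names, application names, and byte counts below are made up for illustration and say nothing about how NTA computes its TopN internally.

```python
def top_n(volumes, n):
    """Return the set of the n application names with the largest volume."""
    ranked = sorted(volumes.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:n]}


def top_n_changes(previous, current, n):
    """Report which applications entered or left the TopN between two intervals."""
    before, after = top_n(previous, n), top_n(current, n)
    return {
        "appeared": sorted(after - before),
        "dropped": sorted(before - after),
    }


# Made-up traffic volumes for two consecutive intervals.
previous = {"https": 900, "dns": 120, "smb": 300, "bittorrent": 10}
current = {"https": 850, "dns": 110, "smb": 90, "bittorrent": 700}

print(top_n_changes(previous, current, n=2))
# {'appeared': ['bittorrent'], 'dropped': ['smb']}
```

Note that https barely changed in absolute terms and still stays in the Top 2, while the peer-to-peer traffic shows up only because it grew relative to everything else.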

 

You can also set up an alert for each of your NetFlow sources to help take action if the configuration is modified, or firewall rules block flow traffic.

 

User Experience Improvements

This release of NTA also includes a number of small but significant improvements in the user interface to help enhance scalability and improve ease of use. Several long lists are now uniformly ordered, and we’ve changed how we label certain features to be clearer in the navigation.

 

Additional Resources

Check out the Release Notes, download the new release on the Customer Portal, and get additional help with the upgrade at the Success Center.

 

You can see these new features in action in the webcast, “Up, Down, and Gone: A Tale of Applications and Flow.”

 

This is an initial introduction of the traffic alerting feature. Be sure to enter additional feature requests and expanded functionality that you'd like to see with this capability!

 

jreves

NPM 12.4 is available today, December 4, on the Customer Portal! The release notes are a great place to get a broad overview of everything in the release. Here, I'd like to go into greater depth on the brand-new Cisco ACI support. Let’s talk a bit about how software-defined networks are different than traditional networks, what that means for monitoring, and how to get the most out of the new ACI monitoring feature.

 

What is SDN?

 

The first time I heard the term Software Defined Network, I thought it was stupid. All networks are defined by software. Software moves packets and frames, or programs the hardware that does it. Software is used to manually configure networks via CLI. Software is used to automatically configure networks with protocols like OSPF, STP, and LLDP. Networks were already software-defined!

 

Whether SDN is a good name or not, it is an important concept. There are a lot of people trying to define SDN, usually with some ulterior motive of placing themselves in a favorable position. For a slightly less biased view, check out the Wikipedia definition. The thing that stands out to me is:

SDN suggests to centralize network intelligence in one network component by disassociating the forwarding process of network packets (data plane) from the routing process (control plane).

 

This is a big change. In an SDN environment, network devices like routers and switches become simple devices that just move traffic at a high rate. All the intelligence is in a separate device called the controller. The controller learns how everything is connected, what connectivity applications need, and writes instructions to all of the network devices so they know how to forward traffic.
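A toy Python sketch can make the split tangible. Here a "controller" function knows the whole topology and computes a next-hop table for every switch, while the switches themselves would only look entries up. This is a deliberately simplified illustration, not how Cisco ACI or any real controller works; the function name, the shortest-path approach, and the spine and leaf names are all assumptions.

```python
from collections import deque


def compute_forwarding_tables(links):
    """Controller logic: build {switch: {destination: next_hop}} from a list of links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    tables = {node: {} for node in neighbors}
    for destination in neighbors:
        # Breadth-first search outward from the destination: the node we
        # reached a switch from is that switch's next hop toward the destination.
        queue = deque([destination])
        seen = {destination}
        while queue:
            node = queue.popleft()
            for neighbor in sorted(neighbors[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    tables[neighbor][destination] = node
                    queue.append(neighbor)
    return tables


# A made-up two-leaf, one-spine fabric.
tables = compute_forwarding_tables([("leaf1", "spine1"), ("leaf2", "spine1")])
print(tables["leaf1"])  # leaf1 reaches both spine1 and leaf2 through spine1
```

All of the path logic lives in one place, and each switch ends up with nothing more than a lookup table, which is the separation the definition above describes.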

 

There are a ton of SDN solutions available today. The two most popular commercial solutions seem to be Cisco ACI and VMware NSX. Cisco ACI is more commonly requested by our customers (see NPM Monitor Cisco ACI and Support of a Cisco ACI networks in Network Performance Monitor compared to Vmware NSX Support), so we’ve built support for it first.

 

How Do I Monitor SDN?

 

An SDN fabric consists of a data plane and a control plane. The data plane is comprised of physical devices (Nexus switches, in the case of Cisco ACI) and cabling. The control plane is comprised of many logical components that fit together to define what endpoints are allowed to send network traffic to each other. The modular nature of the configuration reminds me of Cisco’s MQC. To make sure your SDN environment is running well, you need to monitor both layers.

 

Data Plane (aka Underlay aka Infrastructure Layer)

 

AKA the boring stuff. This is not the glamorous part of SDN. It’s the stuff you’ve been doing for years: power supplies, fans, temperatures, CPU, RAM, and interface stats. The fact of the matter is, these things all need to function properly for your SDN environment to be performant and reliable.

 

The data plane for Cisco ACI environments is made up of the Cisco Nexus model line. Fortunately, NPM 12.3, the release before this one, introduced Network Insight for Nexus. This gave NPM better-than-ever support for this hardware.

 

It’s easy to set up. Navigate over to Settings (top menu bar) -> Manage Nodes -> Add Node. Add your spine switches and leaf switches as SNMP nodes. On the last step, make sure to check this box:

 

 

If you already have your switches in NPM, you can find the same checkbox when you edit a node.

 

You’ll be prompted for your CLI credentials. CLI is the only way some of this very important data is available, so that’s how NPM gets it. This will cover the basics like power supplies, fans, temperature sensors, CPU, RAM, and interface statistics, plus the advanced stuff like VPC. Those of you with NCM can also get access list version control and analysis. Those of you with NTA will get flow analysis. You can check all of that out on our demo site here.

 

Okay, let’s get to the new interesting stuff.

 

 

Control Plane (aka Overlay aka Control Layer)

 

In an SDN environment, the controller has all the intelligence. This has a big impact on monitoring. Instead of polling dozens or hundreds of devices that each have their own very narrow view of the network, we can poll the controller directly. It has to know where everything is or it couldn’t control it. This means we can learn a lot from monitoring it.

 

This part is also easy to set up. Navigate again to Settings (top menu bar) -> Manage Nodes -> Add Node. In a Cisco ACI environment, the controller is called an APIC. Add your controller as an SNMP node. At the bottom of the first screen you’ll see this checkbox:

 

 

Check it! If you’ve already got your APIC added, edit the node and you can find the same box to check.

 

Cisco strongly recommends each ACI fabric have three APICs. Since each APIC must be able to control the entire network if necessary, each APIC has a complete view of the network. Polling them all results in a lot of duplication of work and potentially duplicate alerts. You have a choice in how you approach monitoring of these devices:

  1) Add all three APICs to monitoring but enable API-based ACI polling (the checkbox) for only one controller.
    a. Pros: efficient for the APICs and efficient for NPM.
    b. Cons: if the controller you’re doing API-based polling on goes down, you’ll see the APIC is down, but you’ll lose visibility to the control plane until you fix it or enable API-based polling for another controller.
  2) Add all three APICs to monitoring and enable API-based ACI polling for all three controllers.
    a. Pros: control plane monitoring works, even if one or two of the three APICs go down.
    b. Cons: NPM has to poll the same data three times. APICs have to provide the same data three times. You will get duplicate alerts and reporting data unless you’re careful to write your alerts in consideration of the duplicate data. More on this in a future post.

 

Our recommendation is to do #1, but either way will work.
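If you do go with option 2, the duplicate data is the thing to plan for. As a rough illustration of the idea only (this is not how NPM alerting works, and the record layout, APIC names, and `dedupe_by_dn` helper are hypothetical), you can collapse records that describe the same ACI object, identified by its distinguished name, regardless of which APIC reported it:

```python
# Hypothetical health records, one per (reporting APIC, ACI object).
records = [
    {"apic": "apic1", "dn": "uni/tn-Tenant3", "health": 72},
    {"apic": "apic2", "dn": "uni/tn-Tenant3", "health": 72},
    {"apic": "apic3", "dn": "uni/tn-Tenant3", "health": 72},
    {"apic": "apic1", "dn": "uni/tn-Tenant1", "health": 100},
]

def dedupe_by_dn(records):
    """Keep the first record seen for each distinguished name."""
    unique = {}
    for record in records:
        unique.setdefault(record["dn"], record)
    return list(unique.values())

unique_records = dedupe_by_dn(records)
print(len(records), "->", len(unique_records))
```

Whatever mechanism you use, the key is the same: treat the distinguished name, not the reporting controller, as the identity of the thing you are alerting on.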

 

The API-based polling runs over TLS. If you have a valid cert on your controllers, everything will add fine and you’ll be good to go. If you have a self-signed cert, you will receive a warning about it and you’ll have to accept the risk or replace it with a properly signed cert before proceeding. You do have a real cert on your APIC, right?
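For anyone wondering what "accept the risk" means in practice, here is a generic Python sketch using the standard `ssl` module. It says nothing about how NPM implements its polling; the `build_tls_context` helper and its `accept_self_signed` flag are names invented for this example. With a properly signed cert, the default context verifies the chain and the hostname. Accepting a self-signed cert amounts to switching that verification off.

```python
import ssl

def build_tls_context(accept_self_signed=False):
    """Return an SSLContext for talking to a controller over TLS."""
    context = ssl.create_default_context()  # verifies the chain and hostname
    if accept_self_signed:
        # Accepting the risk: skip verification entirely. Order matters here,
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context

strict = build_tls_context()
relaxed = build_tls_context(accept_self_signed=True)
print(strict.verify_mode, relaxed.verify_mode)
```

The relaxed context still encrypts traffic, but it can no longer tell your APIC apart from something pretending to be your APIC, which is why a real cert is the better answer.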

 

Once you complete the add node wizard, navigate on over to Node Details for one of your APICs with API-based polling enabled. You can click along with me right now on the Online Demo. On the left side, you’ll see two new views: Members and Map. Let’s look at Members first.

 

 

The Members view shows all of the logical components we have discovered. This includes Tenants, Application Profiles, and EndPoint Groups. It also includes the APIC’s view of the physical components: leaf switches and spine switches.

 

 

This uses the framework’s List View, which is a polished way to deal with large lists. You can do multilevel filtering on the left, as well as sort and search. The list contains the name of the component (example: Tenant3), the type of component (example: Tenant), and the distinguished name (example: uni/tn-Tenant3). On the right, we see the health score. Let’s talk about that.

 

Since the controller has visibility into all components and their relationships, for the first time, part of the network infrastructure is in a position to accurately assess its health. Cisco ACI does this by assigning a health score. The health score is an integer from 0 to 100, where 100 is perfectly healthy and less than 100... isn’t. The health score takes into consideration both parents and descendants in the ACI model. You can check out the exact formula here. Since health scores represent status, they’re polled at the status interval in NPM. As always, you can adjust this interval. All of this data is polled via Cortex, incidentally, our new polling framework that you previously saw powering PerfStack Real-Time Polling.

 

Health scores will be colored red, yellow, or green according to thresholds. There are thresholds on the APIC already for this that determine what color that score is in the APIC GUI. To stay consistent, NPM learns the thresholds from the APIC and applies those. If you customize the thresholds on the APIC, NPM will learn and apply the new threshold settings.
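The coloring logic amounts to comparing the score against two cut-off points. A minimal sketch, assuming a simple "below X is red, below Y is yellow" scheme; the threshold names and values here are hypothetical stand-ins for whatever the APIC actually reports:

```python
def health_color(score, thresholds):
    """Map a health score to a color using thresholds learned elsewhere."""
    if score < thresholds["red_below"]:
        return "red"
    if score < thresholds["yellow_below"]:
        return "yellow"
    return "green"

# Hypothetical threshold values; in practice they would come from the APIC.
learned = {"red_below": 50, "yellow_below": 90}
for score in (100, 72, 31):
    print(score, health_color(score, learned))
```

Because the thresholds are passed in rather than hard-coded, changing them in one place changes the colors everywhere, which is the behavior described above when you customize thresholds on the APIC.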

 

You can click on a health score to get the history in the PerfStack dashboard:

 

 

Thanks to this being in PerfStack, it’s easy to start correlating other metrics about the APIC, leaf switches, and spine switches. It gets more interesting when you start correlating to end node availability, latency, and other data NPM has. If you own other modules on the Orion Platform, you can correlate that data too; for example, application counters, database wait time, IOPs, logs, and all the rest. Seeing all this data normalized on the same shared timeline is powerful for troubleshooting. If a health score is in bad shape and you think the issue is on the controller, it’s time to log in to the APIC itself. The APIC can tell you what is causing the score to be what it is and has a bunch of additional ways to troubleshoot.

 

Returning to the sub-view menu on the left, let’s check out the Map tab.

 

When you first open the map, you’re only going to see the APIC in the center. To get more on the map, select the APIC. On the right side, the inspector panel will open. Here you can check the box next to related entities and press Add at the bottom to add them to the map. You can use this method to continue to spider through your ACI environment. This works well for creating a map of a small ACI environment or of a specific section of a larger ACI environment, like a tenant or an app. Once you’ve got a map you like, you can select to Save as a group in the top right. From that point forward, you can navigate to that group and press the Map tab to see the map again. Here’s an example of one I saved in my lab:

 

 

Pretty slick! One important note: the APIC GUI already has some capability to map an ACI environment. In talking to NPM users who run ACI environments, I frequently heard that they would like to grant read-only access via a common platform for folks who don’t have access to the APIC directly, like NOC engineers. This accomplishes that goal and lets you correlate and visualize with all of the other data currently available in Orion Maps.

 

Next Steps

 

To upgrade now, customers with NPM under active maintenance can head over to the Customer Portal and download NPM 12.4. Thanks to the improved Orion Installer, upgrade is faster than ever with centralized upgrade of additional polling engines. Once you’re installed, add those ACI nodes and reply here to let us know how it’s working for you!

Applying SWQL Part 2

Posted by animelov Employee Nov 13, 2018

Hi, all! Welcome back to the continuation of our Primer posts on SWQL and the Orion® SDK. In the last post, we showed how to create reports using SWQL queries. Now we’re going to take it one step further with some other uses for SWQL:

 

Dashboards:

As with the reports, you can also add a custom SolarWinds Query Language (SWQL) query to a dashboard. If you aren’t familiar with customizing dashboards and widgets, check out these videos first:
Creating a New View
Adding and Customizing Resources

 

To get started, make sure you’re logged in as an admin to SolarWinds, or a user that has rights to make updates/changes to views. Once that’s confirmed, go to the page you wish to update, and go to the left-hand drawer and select “Customize Page.”

Search for “Custom Table” and drag and drop the widget onto your dashboard. There’s also “Custom Query,” and we’ll explore the advantages further down:

Select “Done Adding,” then “Done Editing” when complete. With your newly created widget, go ahead and select either “Edit” in the upper-right, or “Configure this resource” in the middle:

This should look familiar to the report writer’s interface at this point. Give this table a title, then click “Select Datasource.”

Change the Selection Method to “Advanced Database Query (SQL, SWQL)” and make sure the radio button is set to “SWQL.” Then copy/paste your query and preview results to make sure everything looks okay:

Select “Update Datasource” when complete. Just like the report writer above, you can select and format your columns. Once you’re finished, click submit, and you now have a custom table on your dashboard!

 

Note: The Host Name column being blank isn’t an error; these machines are not associated with a host. We’ll explore formatting in a later post to show these as “N/A” instead.

 

Now, let’s try the custom query instead. With the “Custom Query” widget, you don’t have as many options in formatting, but it gives you two distinct advantages: the ability to paginate, and the ability to add searches. Pagination will be very important for larger lists, not only for cleanliness, but also for load times on the page you’re viewing, by restricting to X number of results at a time.
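If pagination is new to you, the arithmetic behind it is simple. This Python sketch is only an illustration of the concept, not the widget's code; the `paginate` helper, the sample rows, and the page size of five are assumptions made for the example.

```python
import math

def paginate(rows, page_size, page_number):
    """Return (rows on the requested 1-based page, total page count)."""
    total_pages = max(1, math.ceil(len(rows) / page_size))
    start = (page_number - 1) * page_size
    return rows[start:start + page_size], total_pages

# Hypothetical application names standing in for query results.
apps = [f"App{n}" for n in range(1, 9)]  # 8 rows
page, total = paginate(apps, page_size=5, page_number=1)
print(f"Page 1 of {total}: {page}")
```

Only the rows for the current page need to be rendered, which is where the load-time benefit comes from.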

 

Again, go to the left-hand drawer and select “Customize Page,” then “Add Widget.” This time, search for “Custom Query” and drag/drop this widget to your dashboard:

Now, select “Edit” in the upper-right corner of the widget:

Notice here, you only get a box where “Select Datasource” would normally be. Go ahead and copy/paste your query in here, but since you don’t get the option of selecting the order of the columns, make sure your columns in the select statement are in the order you want them in. For example, with our query:
select OAA.Displayname, OAA.Status, OAA.Node.Caption, OAA.Node.VirtualMachine.Host.HostName from Orion.APM.Application OAA

“Displayname” will be the first column, “Status” will be the second column, and so forth. So now that we have this:

That will result in a widget that looks like this:

Notice the “Page 1 of 2” at the bottom? This will help reduce clutter on your dashboards by keeping the list neat and tidy, and at the same time help with page loads, since we’re restricting to only five results. Another cool feature is the “Search” function. Edit the widget again, and this time check the “Enable Search” box:

Now you have another box to insert your query, and a note about adding a where clause for the search string. When we’re finished, we’ll have a search box on the widget page, and whatever you put in that box will go into the ${SEARCH_STRING} variable. This will change our query to add the where clause. In this case, we’re going to search on the Application name, which is our first column:

 

select OAA.Displayname, OAA.Status, OAA.Node.Caption, OAA.Node.VirtualMachine.Host.HostName from Orion.APM.Application OAA WHERE OAA.DisplayName like '%${SEARCH_STRING}%'

 

The keen-eyed individuals will notice we added just a little bit more here. In SWQL, if you want to do a wildcard match instead of an exact match, you use the word “like” instead of “=”. Then, you use the percent character (%) to denote a wildcard, not an asterisk (*). Finally, in SWQL you always use single quotes for strings, never double quotes. Let’s put that in our search box:

And now we have a search box!

To test, let’s search on IIS and see what we get:

There we go! Remember, this is just an application example; you can use this for anything else that you’re collecting in the product. For more examples, check out my other post for searching on a Port Description in User Device Tracker (UDT): https://thwack.solarwinds.com/docs/DOC-192885
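If you want to reason about what the search box does with the query above, this Python sketch mimics the two steps: dropping the typed text into the ${SEARCH_STRING} placeholder, then matching with % as the wildcard. It is an approximation for illustration, not the widget's implementation. The `fill_search_string` and `like` helpers are invented here, the application names are made up, and the case-insensitive matching is an assumption.

```python
import re

QUERY_TEMPLATE = (
    "select OAA.Displayname, OAA.Status, OAA.Node.Caption, "
    "OAA.Node.VirtualMachine.Host.HostName from Orion.APM.Application OAA "
    "WHERE OAA.DisplayName like '%${SEARCH_STRING}%'"
)

def fill_search_string(template, text):
    """Substitute the search box text for the ${SEARCH_STRING} placeholder."""
    return template.replace("${SEARCH_STRING}", text)

def like(value, pattern):
    """Rough emulation of LIKE: % matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None

print(fill_search_string(QUERY_TEMPLATE, "IIS"))

# Hypothetical application names to match against.
names = ["Microsoft IIS", "Active Directory", "IIS Application Pool"]
print([name for name in names if like(name, "%IIS%")])
```

Note how the % on both sides is what turns the search into a "contains" match. Without them, `like` behaves the same as an exact comparison.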

 

 

That’s it for now! Stay tuned for future posts on formatting SWQL queries in these reports!

SQL Server upgrades are a pain, I know.

 

And boring, too. It’s not very exciting to watch a progress bar.

 

Many people put off upgrading SQL Server. They wait for a business reason or an important security patch. Or, as was the case historically, they wait for the first service pack. After all, if it ain’t broke, don’t touch it.

 

I’m here today to tell you those days are over.

 

No longer can you sit back and allow systems and applications to lag behind with regard to patches and upgrades. You must stay current. Allowing applications to be more than one major version behind puts you, and your systems, at greater risk for security threats than ever before.

 

Microsoft has made it easier to upgrade and patch SQL Server. They’ve removed service packs, opting instead for cumulative updates. By shifting to a model that is similar to continuous deployment, Microsoft is able to deliver features, performance improvements, and security enhancements at a faster rate than ever before.

 

So, if you are waiting for SQL Server 2017 SP1, you’ll be waiting forever.

 

Don’t wait. Get started on upgrading SQL Server to the latest version today.

 

Let me help you understand just a few of the reasons why upgrading SQL Server is right for you.

 

Reasons for Upgrading SQL Server

As I mentioned before, it’s just common sense to stay current with the latest version of SQL Server. Microsoft has built tools like the Database Migration Assistant to help make upgrades easy. Applying cumulative updates has also been simplified. And because Microsoft hosts millions of database workloads inside of Azure SQL Database, you can be assured that these updates have been tested thoroughly.

 

Here’s a handful of the features available, out of the box, when you upgrade to the latest version of SQL Server.

 

Automatic database tuning – The ability for the database engine to identify and fix performance problems.

 

Adaptive query processing – While processing the execution plan, SQL Server will adapt query plans as necessary, essentially tuning itself instead of reusing the same plan.

 

Data security and privacy features – Always Encrypted, Dynamic Data Masking, Row Level Security, Data Discovery and Classification, and Vulnerability Assessment are all new, and all awesome.

 

Those are just a handful of the improvements. You will also find things like faster DBCC CHECKDB, improved backup security, and a new cardinality estimator. All those are great features worth your time for upgrading.

 

But there’s one more thing: the Orion® Platform.

 

See, we’ve been busy refactoring the Orion Platform to take advantage of newer SQL Server features.

 

Reasons for Upgrading Your Orion Installation

When I’m at an event performing demos, I am surprised how many customers haven’t upgraded to the latest version of the Orion Platform. Of course, I understand the many reasons why upgrades are put on the back burner.

 

I’m here today to help you understand that there’s more to the latest Orion version than a few fancy screens.

 

By using columnstore indexes, we have reduced the size of the Orion database (up to 33% less space), the amount of time it takes to perform maintenance (up to 6x faster on average), and the amount of time to retrieve data (up to 10x faster). That’s a lot of performance gains.

 

Table partitioning allows Log Manager for Orion to scale, accommodating multiple log sources, and to quickly display all logs in time-sequential order. As anyone who has had to analyze logs will tell you, it’s important to be able to quickly see all events in the exact order they occurred.

 

Also, in-memory OLTP helps products that leverage the Orion Platform achieve a high rate of concurrency, accelerating performance and scalability.

 

Those features sound great, but don’t just take my word for it. You should read about the SQL Server features being used by NetFlow Traffic Analyzer (NTA) over at this FAQ page.

 

Now, at the bottom of that page, I want to call out something else that you will find interesting…

 

“You can install your NTA Flow Storage database and your Orion database in the same instance of MS SQL, provided that instance is an MS SQL 2016 SP1 or later version.”

 

That’s right, upgrading to the latest version of NTA allows you to consolidate your SolarWinds footprint. For customers paying by the core for SQL Server licensing, this alone should motivate you to upgrade.

 

I’ll make it easy for you: here’s a link to help you get started. Also, here’s the official upgrade guide located on our Customer Success Center.

 

I’ve also written some other in-depth posts about tips and tricks on upgrading SQL Server. Have a look—I believe you’ll find the information useful.

 

Summary

At the end of the day, we want the same thing that any company would want: happy customers.

 

By upgrading to the latest version of SQL Server, and then the Orion Platform, our customers can see benefits immediately. Not just in performance, but in your wallet.

 

Continuous improvement is the world in which we live now. Stop thinking of upgrades as a chore or a task to get past. Upgrade because you want to, not because you have to.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

If you have installed or upgraded any Orion® Platform product module over the course of the last six months and were running Orion product modules on either Windows Server 2012, Windows Server 2012 R2, or SQL Server 2012, you probably noticed an ominous warning message notifying you that these operating system and SQL database versions are deprecated and will no longer be supported in a forthcoming release.

 

Windows Server 2012 / R2 Deprecation Notice | Microsoft SQL Server 2012 Deprecation Notice

 

 

If you didn’t encounter this message during your latest upgrade or install, not to worry. The above message only appears if Orion product modules are being installed or upgraded on an operating system or SQL database version that has been deprecated. If you're running several versions behind but have been keeping tabs on the release notes, eyeing all the wonderful features that await you when you do your next upgrade, you will find similar deprecation verbiage there for every Orion module, letting you know that you should upgrade from Windows Server 2012, Server 2012 R2, and SQL 2012 at your earliest convenience to stay current with later releases.

 

 

So what exactly is the purpose of these deprecation notices and why should I care?

 

Deprecation notifications such as these serve as signposts to our customers of important impending changes in the matrix of operating systems and SQL database versions that will no longer be supported in future releases. These types of advance notices were introduced at the request of customers like you. Their intention is to allow you ample opportunity to upgrade your environment prior to the release of newer Orion product module versions where these operating systems and database versions may no longer be supported.

 

 

Life before Deprecation Notices

 

Prior to the inclusion of these deprecation notices, the only real way of knowing if an operating system or database version was no longer supported in the latest release of the Orion Platform was to download and attempt to install it. This was obviously much too late in the process, as by this point you likely only received approval from the change advisory board to upgrade your Orion install, and your window for downtime was narrow enough only to allow for the upgrade of your Orion product modules and not the operating system or database server that your Orion Platform resided upon. As you could imagine, this was a frustrating or even downright infuriating time to find out your upgrade was blocked. To prevent these types of mishaps from occurring, SolarWinds provides in-product deprecation notices one version in advance, warning customers that future releases are unlikely to support these older operating systems or SQL database versions.

 

 

My OS or SQL database version has been deprecated. How am I affected?

 

In short, you're probably not. These deprecation notices apply only to the absolute latest releases and are not applicable to previous versions of the product. There has always been zero requirement that customers upgrade to the latest version to continue receiving support. While we welcome and encourage all our customers to take full advantage of the latest enhancements and improvements included in newer versions of the product, this is not always possible or practical in every customer environment. Some organizations even have firm constraints that require them to stay at least one version behind the latest at all times.

 

For those reasons and more, we continue to fully support several previously released versions of Orion product modules at any given time. Suffice it to say, if you're currently running NPM 12.3 or any other Orion Platform 2018.2 module release on Windows Server 2012, 2012 R2, or SQL 2012, there is no immediate requirement to upgrade. The SolarWinds end-of-life policy helps ensure that these versions will remain fully supported, even when installed on Server 2012, 2012 R2, or SQL 2012.

 

 

Why are you deprecating my otherwise perfectly fine operating system or SQL database version?

 

Going forward, the Orion Platform and its related modules will begin to leverage new technologies only available in newer versions of SQL and Windows. New capabilities such as In-Memory OLTP, columnstore indexes, as well as partitioned tables and indexes aim to improve various aspects of performance and scalability for the entire Orion Platform, as well as the modules installed atop it. This will allow for accelerated website performance, shorter nightly database maintenance routines, reduced database size, and faster report generation, to name only a few areas of noticeable improvement.

 

Windows Server 2016 and 2019, as well as the version of IIS included with them, provide a host of important new security improvements that are critical to organizations of all sizes. These include things like supporting newer, stronger encryption ciphers, HTTP Strict Transport Security (HSTS) enabled websites, secure cookies, and more. While patches for specific critical security vulnerabilities will still be made available for Windows Server 2012 and 2012 R2, vital new security enhancements, bug fixes, and other notable improvements will only be available to later versions of the Windows operating system still under mainstream support.

 

 

How can I better plan for possible future OS and SQL deprecations?

 

While SolarWinds does everything reasonably possible to help ensure customers stay well informed of impending deprecations, some have asked for a longer-term outlook so they can plan their upgrade and server migration schedules accordingly. First, when selecting which operating system or database version to install Orion product modules on, we always recommend using the latest possible version of both. This decreases the likelihood of that operating system or database version being deprecated anytime in the foreseeable future, while also limiting the number of times you need to migrate your Orion installation to a newer server throughout its lifetime. To stay proactively ahead of any impending deprecation notices, however, you need only look to Microsoft's published product lifecycle for Windows and SQL Server.

 

Put simply, the Orion Platform will support Windows operating system and SQL database versions covered under Microsoft's Mainstream support that are available at the time of that version's GA release date.
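That rule is easy to turn into a quick planning check. The sketch below is a simplification of the statement above, not an official support calculator. The product names and every date in it are placeholders for illustration only; look up the real Mainstream support end dates in Microsoft's published lifecycle pages and substitute the GA date you are planning around.

```python
from datetime import date

def supported_at_ga(mainstream_end, ga_date):
    """True if Mainstream support is still active on the GA date."""
    return ga_date <= mainstream_end

# Placeholder dates for illustration only, not real lifecycle data.
lifecycle = {
    "older-os": date(2018, 10, 1),
    "newer-os": date(2022, 1, 1),
}
hypothetical_ga = date(2019, 3, 1)

for name, mainstream_end in lifecycle.items():
    print(name, supported_at_ga(mainstream_end, hypothetical_ga))
```

Running the same check against the versions you plan to deploy gives you a rough idea of how long a new server build is likely to stay on the supported list.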

 

 

I still have Windows and SQL 2012 in my environment. Can I continue monitoring those systems with Orion?

 

Absolutely! Monitoring Windows Server 2012 and SQL 2012 systems with the Orion Platform and its related modules remains fully supported, even in the latest releases. This support also extends to those systems monitored using the Orion Agent.

 

 

What Windows and SQL server versions exactly should I expect will be supported in the release following Orion Platform 2018.2?

 

The following table outlines those versions of SQL and Windows Server that will be supported in the Orion Platform release following version 2018.2:

 

Supported Operating System Versions | Supported Microsoft SQL Server Versions
Windows Server 2016                 | SQL 2014
Windows Server 2019                 | SQL 2016
                                    | SQL 2017
                                    | Amazon RDS

 

 

I'm currently on Windows or SQL Server 2012. How do I upgrade?

 

In recent years, Microsoft has made the in-place upgrade process easier and more reliable than ever. In-place upgrades are likely also the fastest method for getting your Orion server up to the latest operating system or SQL database version. If an in-place upgrade isn't for you, SolarWinds provides a wealth of documentation on migrating your Orion Platform to a new server.

 

 

Also in the Success Center, you will find documentation on migrating your Orion database to a new SQL Server.

 

 

 

I need assistance with my next Orion upgrade, what options do you provide?

 

If you need a refresher course on the upgrade process, or a confidence boost that you're on the right track, you will find on-demand training videos and instructor-led virtual online classes you can attend for free through the Customer Portal. As always, if at any time you encounter an issue during your upgrade, don't hesitate to contact SolarWinds support for assistance. We are here 24 hours a day, seven days a week, 365 days a year to help ensure you are successful using SolarWinds products.

 


If you've been using the SolarWinds® Orion® unified IT monitoring platform for more than a few years, it's likely that you've at least once migrated it to a new server. You may have migrated from a physical server to a virtual one, or perhaps you simply needed to migrate to a server running a more modern operating system. Regardless of the reason, you're likely aware that there's a litany of documentation and training videos on the subject. Below are just a few, in case you’re curious.

 

 

Index

 

Deprecation

 

Now, as many of you have likely discovered during your last upgrade to SolarWinds® Network Performance Monitor (NPM) 12.3, SolarWinds® Server & Application Monitor (SAM) 6.7, or any other product modules released in 2018, support for Windows Server 2012, Server 2012 R2, and SQL Server 2012 was officially deprecated in those releases. You may have stumbled upon that when reviewing the release notes, or during the pre-flight checklist when running the installer.

 

Deprecation does not mean that those versions aren’t supported with Network Performance Monitor 12.3, Server & Application Monitor 6.7, etc. Deprecation simply means that new versions released in the future are unlikely to support those older operating systems. These deprecation notices were added at the request of customers like yourself, who asked to be provided with advance notice when future versions of Orion product modules would no longer run on older operating systems or SQL database versions. Those deprecation notices serve to allow customers an opportunity to budget and plan for these changes accordingly, rather than find out during a 3 a.m. change window that the upgrade you planned doesn't support your current OS or database version.

 

While future product module versions may no longer support Windows Server 2012, Server 2012 R2, or SQL Server 2012, that doesn't mean all previous versions are no longer supported at all. In fact, the latest, currently shipping versions of NPM, SAM, and other SolarWinds products are planned to continue to support running on Windows Server 2012 and 2012 R2 for many more years to come. So, if you're happy with the versions of product modules you're running today, take your time and don't rush your OS upgrade or server migration. Plan it appropriately. We'll still be here, waiting with a boatload of awesome new features whenever you're ready to upgrade.

 

Preface

 

But I digress. Since many of you have planned, or will eventually be planning, to migrate your Orion Platform to Server 2016 or perhaps even Server 2019, this inevitably stirs up painful memories of migrations you have undoubtedly done in the past, whether that was the Orion Platform specifically or some other mission-critical system in your environment. Let's face it, these migrations are typically neither fun nor easy. Luckily, we’ll discuss how you can help change that.

 

Now, over the years, I have performed countless Orion Platform upgrades and migrations. And I, like many of you, have amassed a tremendous treasure trove of tips and tricks for streamlining the process down to an art form. The name of the game here is called downtime, and the less of it you can incur during your migration the sooner you get to go home, and the more you look like a rock star to your boss. What I'm about to show you here is how you can migrate your Orion Platform server with zero downtime!

 

As I stated above, there is a wealth of information on the subject of migrating the Orion Platform to a new server, and perhaps I overstated it a bit when I suggested that you're doing it wrong. There are multiple different (yet still correct) ways of migrating the Orion Platform from one server to another, and some ways may be faster, easier, or less error prone than others. These migrations typically vary depending upon the type of Orion server role the machine is hosting. This blog post focuses on the main Orion server, but the strategy can apply equally to Additional Polling Engines.

 

Orion Server Migration Made Easy

 

Going forward I'll assume you have at least one Orion server running NPM 12.0.1 or later—that one server being the main Orion server itself. If you're still running NPM 11.5.x, then chances are good you're not planning to migrate directly to Server 2016 or Server 2019 anyway, since 11.5.x isn't supported on either of those operating systems. I'm also going to assume that your Orion Platform server is currently running on Server 2012 or 2012 R2, though this process is equally applicable to those still rocking Server 2008 or Server 2008 R2. I'm also going to assume you have another freshly installed server ready for your Orion Platform migration. Lastly, this document won't be covering database server migrations. If that's what you were hoping for, there's an excellent document on the subject here.

 

First Things First

As with any good do-it-yourself project, the first order of business is to throw out, or otherwise lose, the instructions. I'm going to be walking you through what I’ve found to be the simplest, fastest, least error-prone manner of migrating the Orion Platform to a new server with absolutely zero downtime. None of those other documents or videos referenced above are going to show you how to do that, so let's just pretend they never existed.

 

Schedule Your Maintenance Window

While the Orion server should not be going down during the migration, it's always best to plan for the worst and hope for the best. I don't want anyone telling their boss that they decided to migrate their Orion server in the middle of the day because some guy on THWACK® named aLTeReGo told them to do it.

 

Back Up Your Orion Server

Conventional wisdom would tell you that if it can go wrong, it probably will, so be prepared. If your Orion server is running on a virtual machine, take a snapshot prior to the migration. While we won't be messing with that server at all during the migration, it's always good to have a safety net just in case.

 

 

Back Up Your Orion Database

I can't emphasize this enough. BACK UP YOUR DATABASE! Seriously, just do it. Not sure if the backup from last night completed successfully? Do another one. Everything important is in the database, and with a backup, you can restore from virtually any disaster. If the database is corrupted, though, and you don't have a good backup to restore from, you may be rebuilding your Orion Platform again from scratch. You don't need to shut down the Orion Platform to take a backup, so go ahead and take another just to be on the safe side. We'll wait.
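
If you script your backups, here is a minimal Python sketch that only builds the T-SQL statement for an ad hoc backup. It doesn't connect to anything. The function name, the database name, and the backup path in the usage note are all assumptions made for illustration, so substitute your own and run the resulting statement with whatever SQL tool you normally use.

```python
def build_backup_sql(database: str, backup_path: str) -> str:
    """Return a T-SQL statement for a copy-only full backup with checksums."""
    # Escape a closing bracket in the identifier and single quotes in the path.
    safe_db = database.replace("]", "]]")
    safe_path = backup_path.replace("'", "''")
    # COPY_ONLY keeps this extra backup from disturbing the regular backup chain;
    # CHECKSUM asks SQL Server to verify page checksums while writing the backup.
    return (
        f"BACKUP DATABASE [{safe_db}] "
        f"TO DISK = N'{safe_path}' "
        "WITH COPY_ONLY, CHECKSUM;"
    )
```

For example, `build_backup_sql("SolarWindsOrion", r"D:\Backups\orion_premigration.bak")` returns a single statement you can paste into a query window. Both arguments there are placeholders, not values from a real environment.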

 

Need a little extra insurance? Why not give SolarWinds cloud-based server backup a try?

 

 

Do Not Upgrade (Yet)

If you’re migrating as part of an upgrade, don't upgrade yet unless you’ll be migrating to Windows Server 2019. It's best to leave the original server fully intact/as-is in the event something goes wrong and you need to roll back. There will be plenty of time to upgrade and play with all the cool new features later. For now, just focus.

 

 

Have Faith and Take a Deep Breath

This is going to start off a bit odd, but stick with me and we'll all come out of this together. Start by going to [Settings > All Settings > High Availability Deployment Summary] in the Orion web interface from a web browser on the new machine where you plan to migrate the Orion Platform. Next, click [Setup a New HA Server > Get Started Setting Up a Server > Download Installer Now].

 

Download the High Availability Secondary Server Installer

 

 

That's right, we'll be using the power of Orion High Availability (HA) to perform this Orion server migration. If at this point you're worried that you can't take advantage of this awesome migration method because you don't own an Orion High Availability license, fret not. Every Orion Platform installation comes with a full 30-day evaluation of High Availability for use on an unlimited number of servers. That's more than enough time for us to complete this migration! If you have no need for Orion High Availability, don't worry. The final steps in this migration process include disabling High Availability, so there's no requirement to purchase anything. However, you might find yourself so smitten with Orion High Availability by the end of the migration that you may wonder how you ever managed to live without it. You've been warned!

 

Begin Installation On Your New Orion Server

Once downloaded, double-click on the Scalability Engines Installer. Depending on which version of the Orion Platform you're running, the Scalability Engines Installer may look significantly different, so I've included screenshots below from both versions. On the top row, you’ll see screenshots from the Scalability Engines Installer version 1.x and version 2.x below that. Regardless of which version you're running through, the end result should be identical.

 

Connect to Existing Orion Server on Original Server

Select Server Role to Install

 

Once the installation is complete, the installer will walk you through the Configuration Wizard process. Ensure that all settings entered in the Configuration Wizard are identical to those used by your existing Orion server.

 

Let The Fun Begin

 

Now that we've installed a secondary Orion server, it's time to join them together into a pool (aka cluster). To do this, we begin by logging into the Orion web interface on your original Orion server. From there go to [Settings > All Settings > High Availability Deployment Summary]. There, you should find listed your original Orion server that you're logged into now, as well as your new Orion server that you just installed in the steps above. Click the “Set up High Availability Pool” button next to the name of your new Orion server.

 

Now, depending on whether your existing Orion server and the new server you'll be migrating to are located on the same subnet, you'll be prompted to enter different information within the HA Pool creation wizard. It's also important to note that if you're currently running Orion Platform 2017.1 or earlier, this zero-downtime server migration is only possible when both your existing and new Orion servers are located on the same subnet.
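
If you'd like to know ahead of time which prompt to expect, the rule above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post: the function names are made up, both servers are assumed to use the same prefix length, and the version is passed as a simple (year, release) tuple.

```python
import ipaddress


def pool_address_type(old_ip: str, new_ip: str, prefix_len: int) -> str:
    """Return 'virtual IP' for same-subnet servers, else 'virtual hostname'."""
    # Assumption: both servers share the same prefix length.
    old_net = ipaddress.ip_interface(f"{old_ip}/{prefix_len}").network
    new_net = ipaddress.ip_interface(f"{new_ip}/{prefix_len}").network
    return "virtual IP" if old_net == new_net else "virtual hostname"


def ha_migration_possible(platform_version: tuple, same_subnet: bool) -> bool:
    """Per this post, Orion Platform 2017.1 or earlier needs both servers on one subnet."""
    # Compare only (year, release) so a hotfix level doesn't change the answer.
    return same_subnet or platform_version[:2] > (2017, 1)
```

For instance, `pool_address_type("10.0.1.10", "10.0.2.20", 24)` tells you to expect the virtual hostname prompt, and `ha_migration_possible((2017, 1), False)` tells you this method won't work until you upgrade or move the new server onto the same subnet.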

 

Same Subnet Migration

 

If both your existing and new Orion servers reside on the same subnet, you’ll be prompted to provide a new, unused IP address on the same subnet as your existing Orion server. This virtual IP (VIP) will be shared between these two Orion servers, as long as they remain in the same HA Pool. The purpose of the VIP is to route traffic to whichever member in the pool is active. If you don't have intentions of keeping HA running after the migration, this IP address will be used only briefly and can be reclaimed at the conclusion of the migration. When you're done entering the IP address, click “Next”.
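
Before typing a VIP into the wizard, it can be worth a quick sanity check. The sketch below uses Python's standard `ipaddress` module to confirm a candidate address sits inside the subnet and isn't on a list of addresses you already know are taken. The function name and the sample addresses are hypothetical, and the check only knows what you tell it, so it can't detect a device that is quietly squatting on the address.

```python
import ipaddress


def check_vip(vip: str, subnet: str, taken: set) -> list:
    """Return a list of problems with a candidate VIP; an empty list means none found."""
    problems = []
    network = ipaddress.ip_network(subnet)
    address = ipaddress.ip_address(vip)
    if address not in network:
        problems.append(f"{vip} is outside {subnet}")
    elif address in (network.network_address, network.broadcast_address):
        problems.append(f"{vip} is the network or broadcast address")
    # Normalize the known-in-use addresses before comparing.
    if address in {ipaddress.ip_address(t) for t in taken}:
        problems.append(f"{vip} is already assigned")
    return problems
```

Calling `check_vip("10.0.1.50", "10.0.1.0/24", {"10.0.1.10", "10.0.1.20"})` returns an empty list, while a VIP from another subnet or one that matches either Orion server comes back with a reason attached.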

 

Migrating to Server in Different Subnet

 

If you're migrating to an Orion server on a different subnet than your existing one, then the HA Pool creation wizard will prompt you to provide a virtual hostname rather than a virtual IP address. This name helps ensure users are directed to the “active” member in an HA pool when accessing the Orion web interface whenever failovers occur. If you don't have intentions of keeping HA running after the migration, you can enter anything you like into this field. Once you've populated the “Virtual Host Name” field, click “Next.”


On the “DNS Settings” step of the HA Pool creation wizard, select your DNS server type or choose “other” from the “DNS Type” drop-down menu if you don't intend on keeping HA running after the migration. If you choose “other,” you can populate any IP address and any DNS zone (even one that doesn't exist) into the fields provided—these values will not be used unless you plan to integrate Orion High Availability with a non-Microsoft and non-BIND DNS server in the future.

 

When complete, click “Next” to proceed, review the “Summary,” and click the “Create Pool” button to complete the HA Pool creation process.

 

Ready, Set, Cut-over!

 

From the Orion Deployment Summary [Settings > All Settings > High Availability Deployment Summary], select the HA pool you just created. On the right side, click the “Commands” drop-down menu and select “Force Failover.” This should initiate an immediate failover from your old Orion server to your new one. Note that while the cutover time for polling and alerting is typically just a couple of seconds, it may take the Orion web interface a minute or so before it's fully accessible. Unless you were accessing the Orion Web Console using the VIP you assigned earlier, you’ll need to change the URL in your browser to point to the IP address of the new Orion server or the VIP to regain access to the Orion web interface once you've initiated the failover.
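
Rather than hammering refresh in your browser while the web interface comes back, you could let a small script do the waiting. This is a rough sketch using only the Python standard library. It polls a TCP port until something answers, and the host, port, and timeout values are assumptions you'd replace with your own. A successful connection only proves that something is listening, not that the web console has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 120.0,
                  interval_s: float = 2.0) -> bool:
    """Poll a TCP port until it accepts a connection or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough for this check.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait briefly and try again.
            time.sleep(interval_s)
    return False
```

You would call it with the new server's address (or the VIP) and whichever port your Orion website is bound to, for example `wait_for_port("10.0.1.20", 443)`, where both values are placeholders.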

 

Force Failover

Verify you're cut over to the new server by looking at the pool members listed on the Orion Deployment Summary, specifically their state or role, shown just below their names. Your old Orion server should be listed as “Standby,” and your new Orion server should display as “Active.” Congratulations! You've just completed a successful Orion server migration with zero downtime!

 

Clean-up

 

The following steps should be completed within 30 days if you don't currently own or have plans to purchase Orion High Availability to provide continuous monitoring, redundancy, and near-instantaneous failover of your Orion server in the event of a failure. Don't forget that Orion High Availability also helps you maintain your Orion Platform’s 100% uptime every month when Patch Tuesday rolls around. (Patch Tuesday… or, you know, when Microsoft releases its latest round of operating system hotfixes, all of which inevitably require a reboot.)
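
If you're the type who needs a date on the calendar, the 30-day window is easy to work out. The sketch below assumes the evaluation clock starts on the day you installed the secondary server, which is my assumption for illustration and not something to rely on, so leave yourself some margin.

```python
from datetime import date, timedelta

EVALUATION_DAYS = 30  # length of the HA evaluation mentioned in this post


def cleanup_deadline(ha_installed_on: date) -> date:
    """Return the last day to finish clean-up, assuming the clock starts at install."""
    return ha_installed_on + timedelta(days=EVALUATION_DAYS)


def days_remaining(ha_installed_on: date, today: date) -> int:
    """Return how many days are left; a negative number means the window has passed."""
    return (cleanup_deadline(ha_installed_on) - today).days
```

With an install date of March 1, `cleanup_deadline(date(2019, 3, 1))` lands on March 31, and on March 21 `days_remaining` reports 10 days to go.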

 

Shut Down Old Server

 

You should start the cleanup process by shutting down your original Orion server. It served you well, and we all know how hard it is to bid a final farewell to such a loyal friend, but its time has come. If you're not immediately planning to destroy the virtual machine or de-rack your old Orion server, you may first want to consider changing its IP address if you plan to use it on your new Orion server. This will ensure that if the old Orion server is started back up, it won't cause an IP address conflict and wreak havoc on your network monitoring. Once you've changed the IP address, resume with shutting down the server before proceeding with the next steps.

 

 

Remove The Pool

 

From within the Orion web interface, navigate back to the High Availability Deployment Summary by going to [Settings > All Settings > High Availability Deployment Summary]. Click on the name of the Pool you created earlier in the steps above. From the “Commands” drop-down menu on the right select “Remove Pool.”

 

 

Reclaim The Original Orion Server's IP Address

 

Last, but certainly not least, you may want to reclaim the IP address of your original Orion server by assigning it to your new Orion server. This is simply a matter of logging into the Windows server via RDP (or similar) and opening the Network Control Panel. I prefer to open the “Run” dialog, type “ncpa.cpl,” and press <Enter> to open the Network Control Panel without needing to navigate around Windows. Once you've opened the Network Control Panel, right-click on your network interface and select “Properties.” Within the interface properties, select “Internet Protocol Version 4 (TCP/IPv4)” and click “Properties.”

 

Network Control Panel | Interface Properties

 

Update the “IP Address” field by entering the IP of your original Orion server, then click “Advanced.” In the “Advanced TCP/IP Settings,” you’ll find the Virtual IP Address you configured earlier in the steps above—you should be able to safely remove this now. To do so, simply select it by clicking on the IP address with your mouse and then click the “Remove” button. Then, click “OK” on each of the three windows to save your changes.

 

TCP/IP Properties | Advanced TCP/IP Settings

 

Update DNS/Machine Name

 

Do not rename the server itself in Windows. If you have users who access the Orion Web Console via the original server’s name, the best and easiest way to make sure they can reach the new server is to create a DNS CNAME record for your original Orion server's name that points to the new Orion server. It's always a good idea to have a layer of abstraction between the name end users type into their browser and the name of the server itself. This helps ensure that you can easily redirect those users later, should you want to add an Additional Web Server or re-enable High Availability. In this example, I'm using Windows DNS, but the same principle applies to really any type of DNS server.

 

From the DNS Control Panel on your DNS server, expand “Forward Lookup Zones,” right-click your domain name, and select “New Alias (CNAME).” In my example below, my previous server's fully qualified domain name (FQDN) was “solarwinds.sw.local” and my new Orion server's name is “pm-aus-jmor-04.sw.local.” In the “Alias name” field, enter “solarwinds.” The “Fully qualified domain name” field will automatically populate with the alias and domain name. In the “Fully qualified domain name (FQDN) for target host” field, enter the FQDN of your new Orion server and click “OK” to save your changes. Lastly, find and delete the “Host (A)” record for your old Orion server. If DNS refuses to create the alias because a record with that name already exists, delete the old “Host (A)” record first and then add the alias.
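
If your DNS server isn't Windows, the same alias can be expressed as a line in a standard zone file. Here is a small Python sketch that formats that record using the example names from this post. The function name and the TTL of 3600 seconds are assumptions for illustration, so pick whatever TTL your zone normally uses.

```python
def cname_record(alias: str, target_fqdn: str, ttl: int = 3600) -> str:
    """Format a CNAME resource record in standard zone-file syntax."""
    # A trailing dot marks the target as fully qualified, so the zone
    # origin is not appended to it a second time.
    target = target_fqdn if target_fqdn.endswith(".") else target_fqdn + "."
    return f"{alias}\t{ttl}\tIN\tCNAME\t{target}"
```

Calling `cname_record("solarwinds", "pm-aus-jmor-04.sw.local")` produces a tab-separated line for the "sw.local" zone that points the old name at the new server.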

 

DNS Control Panel | Add New Alias (CNAME)

 

While this undoubtedly looks like a lot of steps, the process is actually fairly straightforward, and I completed it in less than an hour. Now, obviously your mileage may vary, but regardless of how long it may take, there's no simpler way of which I'm aware that will allow you to migrate your Orion Platform to a new server with anywhere close to zero downtime. Hopefully, this process will save you a fair bit of time and frustration over the previous methods referenced above. If you have any tips and tricks of your own that have simplified your Orion server migrations, feel free to post them in the comments section below. We'd love to hear them!

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates.  All other trademarks are the property of their respective owners.
