
Troubleshooting NCM performance: failing jobs and device config downloads

This post will help you troubleshoot the most common issues with NCM jobs and device configuration downloads. It covers the common errors seen while downloading configurations or running NCM jobs, and how to check the NCM logs.

Please follow the steps and recommendations carefully, and let us know in the comments section whether this post helped to address your issue.

Please note: currently supported software versions of Network Configuration Manager (NCM)

Latest Version: 8.0

Oldest Supported Version: 7.6

EOL VERSION   EOL ANNOUNCEMENT    EOE EFFECTIVE DATE    EOL EFFECTIVE DATE
7.7           June 6, 2019        September 6, 2019     September 6, 2020
7.6           December 4, 2018    March 4, 2019         March 4, 2020
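
If you want to check where a given NCM version stands on a given date, the end-of-life dates listed above can be put into a small lookup. This is only a sketch: the dictionary layout and the function name are made up for illustration, and "EOE" is read here as end of engineering, which is an assumption about the abbreviation.

```python
from datetime import date

# Dates copied from the EOL list above; the structure itself is illustrative.
LIFECYCLE = {
    "7.7": {
        "eol_announcement": date(2019, 6, 6),
        "eoe_effective": date(2019, 9, 6),
        "eol_effective": date(2020, 9, 6),
    },
    "7.6": {
        "eol_announcement": date(2018, 12, 4),
        "eoe_effective": date(2019, 3, 4),
        "eol_effective": date(2020, 3, 4),
    },
}


def lifecycle_stage(version: str, today: date) -> str:
    """Return a short label for where `version` stands on `today`."""
    entry = LIFECYCLE.get(version)
    if entry is None:
        return "not listed in the EOL table"
    # Check the latest milestone first so the most advanced stage wins.
    if today >= entry["eol_effective"]:
        return "end of life"
    if today >= entry["eoe_effective"]:
        return "end of engineering"  # assumed meaning of EOE
    if today >= entry["eol_announcement"]:
        return "EOL announced"
    return "before EOL announcement"


print(lifecycle_stage("7.6", date.today()))
```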

SolarWinds strongly recommends upgrading to the latest version with the latest hotfix (HF) installed. More details can be found in the post linked below.

Fresh Orion deployment Vs upgrade older version

Your checklist

 NCM version, system hardware & latest HF installed

 Disable Session Trace

 Clear Temp folder

 Clear pending reboot

 Disable Config Archive

 Increase CLI timeout

 Reduce the number of configs retained

 Reduce the Simultaneous Downloads / Uploads

 Multiple APEs? Create a job for each poller / one job with one device from each poller

 A single node or a few nodes may be the culprit, failing the download in a loop

1- NCM System & Hardware Sockets

Make sure you are on NCM 7.9, or have upgraded to the latest release, NCM 8.0, with the latest HF installed.

Please audit your environment and make sure there is no bottleneck on your side that is causing the issue.

How to check the server hardware using Orion Platform diagnostics

On the Orion server, open Task Manager > Performance tab > CPU and make sure at least 4 sockets are listed.
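
If Python happens to be installed on the server, a rough version of the same check can be scripted. Note the assumption: `os.cpu_count()` reports logical processors, not sockets, so this is only a proxy and Task Manager remains the reference for the socket count.

```python
import os

MIN_CPUS = 4  # minimum taken from the step above


def meets_minimum(cpu_count, minimum=MIN_CPUS):
    """True when a known CPU count reaches the minimum; None means unknown."""
    return cpu_count is not None and cpu_count >= minimum


# Logical processors visible to the operating system (may be None).
count = os.cpu_count()
print(f"Logical processors: {count}; meets minimum of {MIN_CPUS}: {meets_minimum(count)}")
```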

Make sure you are on the recommended hardware.

For more details, please see my Thwack post below.

https://thwack.solarwinds.com/docs/DOC-190027

Check the hotfixes in the Customer Portal and make sure you have the latest HF installed.

https://customerportal.solarwinds.com/HotFixes

2- Disable Session Trace

When you run jobs against a large network, it is not recommended to keep Session Trace on: it consumes CPU and memory on the system and slows down config downloads. Keep the trace folder clear as well.

To disable Session Tracing:

  1. Open the Orion Web Console.
  2. Go to:
    7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
    7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
  3. Clear the Enable Session Tracing check box.
  4. Go to the trace log location:
    7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
    7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
  5. Delete the trace files.
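
Step 5 can be scripted if you clear the folder often. The following is a minimal sketch, assuming Python is available on the server; the function name and the dry-run switch are my own additions, and the path is the 7.7 and newer location from step 4. It only touches files directly inside the folder and does nothing unless you pass `dry_run=False`.

```python
from pathlib import Path

# 7.7 and newer location from step 4; on 7.6 and older use the NCM\Session-Trace path.
TRACE_DIR = Path(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")


def clear_trace_files(folder: Path, dry_run: bool = True) -> list:
    """List (dry run) or delete the files directly inside `folder`.

    Returns the paths that were, or would be, deleted.
    """
    if not folder.is_dir():
        return []
    files = sorted(p for p in folder.iterdir() if p.is_file())
    if not dry_run:
        for path in files:
            path.unlink()
    return files


# Preview first, then delete once the list looks right:
#   for path in clear_trace_files(TRACE_DIR):
#       print("would delete", path)
#   clear_trace_files(TRACE_DIR, dry_run=False)
```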

3- Clear Temp folder (Recommended if it is large)

Check the disk space on the NCM drives as well as on the SQL server where the NCM DB is stored.

Clear the Windows Temp directory on all polling engines in NPM:

  1. Log in to the Orion server hosting the Main Polling Engine.
  2. Stop all Orion services.
  3. Disable the Antivirus software running on the system.
  4. Navigate to:
    C:\windows\Temp (including the SolarWinds folder)
  5. Delete all files in the Temp folder.
  6. Restart all Orion services.
  7. Restart the system.
  8. Repeat step 1 through step 7 for all servers hosting an Additional Polling Engine (APE).
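
Since clearing the Temp folder is only recommended when it is large, it can help to measure it and the free disk space first. This is a minimal sketch using the Python standard library; the function names are illustrative, and what counts as "large" is left to you.

```python
import os
import shutil


def folder_size_bytes(folder: str) -> int:
    """Sum the sizes of all files under `folder`, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or access was denied; ignore it
    return total


def free_space_gb(path: str) -> float:
    """Free space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


# Example on a polling engine:
#   temp = r"C:\Windows\Temp"
#   print(f"Temp folder: {folder_size_bytes(temp) / 1024**2:.1f} MiB")
#   print(f"Free on drive: {free_space_gb(temp):.1f} GiB")
```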

4- Reboot the NCM Server (Recommended ONLY if a Windows update is pending a reboot, or if AV could be causing issues with NCM jobs)

5- Disable Config Archive

Settings > All Settings > Config Settings (disable Config Archive)

6- Increase CLI Timeout

Go to Settings > All Settings > CLI Settings, clear the Enable Session Tracing check box, and increase the timeout values slightly.

7- Reduce the number of configs retained in NCM (Recommended: improves NCM & SQL performance)

Offloading extra data from the NCM DB also helps NCM jobs run faster.

Configs > Jobs > edit the Default Purge Configs Job.
Follow the wizard and, under Add Job Specific Details, select Delete all configs except for the last 10 days.
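
To make the retention rule concrete: everything downloaded more than 10 days ago is removed, everything newer is kept. The sketch below only illustrates that rule on made-up records. It is not how the NCM purge job is implemented, and the record fields are hypothetical.

```python
from datetime import datetime, timedelta


def configs_to_purge(configs, now, keep_days=10):
    """Return the records downloaded more than `keep_days` before `now`."""
    cutoff = now - timedelta(days=keep_days)
    return [c for c in configs if c["downloaded"] < cutoff]


# Hypothetical records, for illustration only.
now = datetime(2020, 1, 20)
sample = [
    {"node": "core-sw-01", "downloaded": datetime(2020, 1, 5)},
    {"node": "core-sw-01", "downloaded": datetime(2020, 1, 15)},
]
for record in configs_to_purge(sample, now):
    print("would purge:", record["node"], record["downloaded"].date())
```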

For newer versions, please follow the KB:

NCM Default Database and Archive Maintenance job

Most common issues in the NCM Jobs area

A scheduled job is not running at all / is stuck at 99% or 100% / shows Running 100% - Now Post Processing. What should I try after the steps above?

I have a scheduled job that randomly fails for different nodes. What should I check?


Turn on the NCM job logs first.

If logging is already enabled, take a backup of the log folder and then delete all the old files from it.

Settings > NCM Settings > Advanced Settings > (Enable Scheduled Jobs)

pastedImage_1.png

You can also reduce the load on the poller CPU. If the CPU runs at less than 3.0 GHz, use the settings below.

pastedImage_0.png

Create a new NCM test job, add only one node (from the main poller), and run the job manually. Does it fail?

Now schedule the same job to run 10 minutes later and check the result.

What result do you get? A failure?

If it succeeds, add 10 more nodes to the same job and run it again, and so on up to 50 nodes; then test with up to 100 nodes in the same job.

It is possible that a single node or a few nodes are the culprit, failing the download in a loop; this can delay completion of the NCM job.

Please make sure that NCM can manually download the configuration, without long delays, from every node you add to the test jobs.
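
The growing-batch test above can be planned on paper before you touch NCM. Here is a rough sketch of the idea: it produces the batch sizes (1, then +10 up to 50, then 100) and, given the outcome of each run, narrows the suspects down to the nodes added in the first failing batch. `run_job` is a hypothetical stand-in for "run the test job in NCM and note whether it succeeded"; nothing here talks to NCM.

```python
def batch_sizes(total, start=1, step=10, small_limit=50, final_limit=100):
    """Sizes of the test job as it grows: 1, 11, 21, ... 50, then 100."""
    sizes = []
    size = start
    while size < small_limit:
        sizes.append(size)
        size += step
    sizes.extend([small_limit, final_limit])
    # Cap at the number of available nodes and drop duplicates, keeping order.
    capped = []
    for s in sizes:
        s = min(s, total)
        if s > 0 and s not in capped:
            capped.append(s)
    return capped


def first_suspects(nodes, run_job):
    """Return the nodes added by the first failing batch, or [] if all pass.

    `run_job(batch)` is a hypothetical callable that returns True on success.
    """
    previous = 0
    for size in batch_sizes(len(nodes)):
        if not run_job(nodes[:size]):
            return nodes[previous:size]
        previous = size
    return []


print(batch_sizes(200))
```

Once a batch fails, the suspects are the nodes that were added since the last successful run, so you only need to test manual downloads for that handful.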

I have multiple polling engines (APEs)

If you have multiple polling engines, create a separate job for each APE and test in the same way as above.

Create a job for each poller / one job with one device from each poller

Please create Nightly Config Backup jobs that contain nodes from only one poller.

Please create such a job for every poller and run them.

We need to know whether the problem is tied to a specific poller.
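
If you have a node list exported with the polling engine for each node, splitting it into one job per poller is a simple grouping. The sketch below assumes such an export; the node and poller names and the record fields are hypothetical.

```python
from collections import defaultdict


def group_by_poller(nodes):
    """Map each polling engine name to the list of node names it polls."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["poller"]].append(node["name"])
    return dict(groups)


# Hypothetical export: one record per node with its assigned polling engine.
nodes = [
    {"name": "core-sw-01", "poller": "MAIN"},
    {"name": "edge-rt-07", "poller": "APE-01"},
    {"name": "core-sw-02", "poller": "MAIN"},
]
for poller, names in group_by_poller(nodes).items():
    print(f"Job for {poller}: {', '.join(names)}")
```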

If the issue is on a specific poller

Does a manual config download work for this poller?
Can you ping between the main poller and the problematic poller?

Firewall rules may be blocking something. As a test, turn off the firewall and check whether the job works (note that turning off the firewall could itself be problematic).

On the Orion server, go to the following location and check the log files.

C:\ProgramData\SolarWinds\Logs\Orion\NCM\Logging

Check the log files for any errors.
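
With many log files it is quicker to search than to read. The following is a minimal sketch that lists every line containing a keyword. The `*.log` file pattern and the `ERROR` keyword are assumptions about how the logs are named and written, so adjust both to what you actually see in the folder.

```python
from pathlib import Path


def find_error_lines(log_dir, keyword="ERROR", pattern="*.log"):
    """Yield (file name, line number, text) for lines containing `keyword`."""
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword in line:
                    yield path.name, number, line.rstrip()


# Example:
#   log_dir = r"C:\ProgramData\SolarWinds\Logs\Orion\NCM\Logging"
#   for name, number, text in find_error_lines(log_dir):
#       print(f"{name}:{number}: {text}")
```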

Open Support Ticket

Tips and Tricks on opening a Support Ticket with SolarWinds

NCM Inventory Job is taking to much time / Which nodes are taking long time /  Where i can see real time inventory job logs

I have a few nodes failing to download config files with a (Connection Refused) / (Connection Timeout) error. What should I check?

Pick a single node and work through the steps below to confirm that the node in question has no connectivity issues.

(If you have multiple pollers, make sure you work on the polling engine to which the node is assigned.)
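
As a quick first check from that polling engine, a plain TCP probe tells you whether the two errors are reproducible outside NCM: "refused" usually means nothing is listening on the port or something actively rejects the connection, while "timeout" usually points to filtering or routing. This is a sketch only; the default port 22 assumes SSH (use 23 for Telnet), and it does not attempt a login.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Try a TCP connection and classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks and similar errors end up here.
        return f"error: {exc}"


# Example (replace the address with the affected node):
#   print(probe_tcp("192.0.2.10", 22))
```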

Checking NCM Profile
Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile and click "Test". Is it successful?

Checking NCM Profile with SSH Auto

Also, in the Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile

Select SSHAuto >

"Test". Is it successful?

If the connection fails (make sure you RDP to the poller to which the node is assigned):

Please try the "ConnectionTester.exe" tool from the NCM server and let me know whether this fails as well.

C:\Program Files (x86)\SolarWinds\Orion\NCM\Tools

ConnectionTester.exe


If you can connect to the node without any issue but NCM still fails to download the configuration

Or

You can connect to the device using PuTTY / SSH or the NCM Connection Tester, but the NCM Connection Profile "Test" fails

In either case we have to check the Session Trace for the node.

Enable Session Trace

  1. Open the Orion Web Console.
  2. Go to:
    7.6 and older: Settings > All Settings > NCM Settings > Advanced Settings
    7.7 and newer: Settings > All Settings > Product Specific Settings > CLI Settings
  3. Select the Enable Session Tracing check box.
  4. Go to the trace log location:
    7.6 and older: C:\ProgramData\SolarWinds\Logs\Orion\NCM\Session-Trace
    7.7 and newer: C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace
  5. Delete any old trace files so that the next test writes a fresh log.

Now run the tests below and check the Session Trace log file for the error (the error is listed at the bottom of the log file).

Checking NCM Profile
Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile and click "Test"

Checking NCM Profile with SSH Auto

Also, in the Orion Web Console > go to the affected node > Edit Node > NCM Properties > check the Connection Profile

Select SSHAuto >

"Test"

If you cannot work out from the error why it is failing, you can search for or post the issue on Thwack, or open a support ticket and provide us with the Session Trace file.
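
Because the error sits at the bottom of the trace, it is usually enough to look at the last lines of the newest file in the Session-Trace folder, and those lines are also the most useful part to quote in a post. A minimal sketch, assuming the trace files are plain text; the function names are illustrative.

```python
from pathlib import Path


def newest_file(folder):
    """Return the most recently modified file in `folder`, or None."""
    folder = Path(folder)
    if not folder.is_dir():
        return None
    files = [p for p in folder.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime) if files else None


def tail(path, lines=20):
    """Return the last `lines` lines of a text file as a list of strings."""
    if lines <= 0:
        return []
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()[-lines:]


# Example (7.7 and newer location):
#   trace = newest_file(r"C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace")
#   if trace:
#       print("\n".join(tail(trace)))
```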

Please do not forget to ZIP the Session Trace file
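
Any ZIP tool will do, including the built-in Windows one. If you prefer to script it, this is a minimal sketch with Python's standard `zipfile` module; the function name and the example file names are placeholders.

```python
import zipfile
from pathlib import Path


def zip_files(files, archive):
    """Write `files` into a compressed ZIP at `archive`; return its path."""
    archive = Path(archive)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for item in files:
            # Store only the file name, not the full folder path.
            bundle.write(item, arcname=Path(item).name)
    return archive


# Example (placeholder file names):
#   zip_files([r"C:\Temp\trace-for-node.log"], r"C:\Temp\session-trace.zip")
```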

Open Support Ticket

Tips and Tricks on opening a Support Ticket with SolarWinds

NCM nodes failing downloading Error message: connectivity issues, discarding configuration / "show running" on a Cisco switch ( % Invalid input detected at '^' marker. )

Troubleshooting downloading F5 devices configuration

NCM troubleshooting downloading F5 devices configuration

NCM Logs and data locations

Where can I see NCM job activity in detail?

The logs in the following location let you check and track NCM job activity, including any errors.

C:\ProgramData\SolarWinds\Logs\Orion\NCM

NcmBusinessLayerPlugin

NCM.Collector.Jobs

Default location for CLI and Session Trace Logs

C:\ProgramData\SolarWinds\Logs\Orion\CLI\Session-Trace

C:\ProgramData\SolarWinds\Logs\Orion\CLI

Default Location for NCM ASA Polling

C:\ProgramData\SolarWinds\Logs\Orion\ASA

Default location for NCM vulnerability data

C:\ProgramData\SolarWinds\NCM\Vuln

Default location for config archive

C:\ProgramData\SolarWinds\NCM\Config-Archive

I will add more details and case studies to this guide. Please feel free to send me your feedback and I will include it.

Related link:

NCM troubleshooting landing page