Has NPM 12.2 dropped polling support for ESXi 5.0?

Hello,

     I upgraded from NPM 12.1 to 12.2 to get the additional ASA polling features. Since the upgrade, my NPM instance no longer polls the hardware health of the 2 older ESXi Hosts in my environment, which are still running v5.0 and are standalone (not managed by vCenter), and I have not been able to work out why. The specific error message is: Hardware polling failed: Polling of chassis (CIM_Chassis class) failed. The remote server returned an error: (501) Not Implemented.

  • The hardware polling worked in 12.1
  • The ESXi credentials work fine when testing under the Edit Node page. It is the root login for the ESXi Host.
  • Both of these Hosts are Dell servers - one is a PE-1950 and the other is a PE-R510
  • Both are on the same directly attached network as the NPM instance

It seems like this is a change in the way NPM does ESXi polling (soapbox API) and not some minor technical issue, just like what happened in 11.x when ESX 3.5 and ESXi 4.0 Hosts stopped being supported. I have already gone through the ESXi logs searching for any issues, restarted the management servers on the ESXi Hosts, and verified that all necessary firewalls and services are correctly configured, even though no changes were made on the ESXi Hosts.

On a side note, I am also experiencing the very frustrating issue where NPM tells me that the ESX Status is in a critical state but gives zero information about which threshold has been exceeded to cause the critical level. Nothing in any of the panels helps to explain the status. This is only on one of the ESXi Hosts; the other does not have this false critical status.

If you are also experiencing this issue and have already opened a ticket with SolarWinds and gotten a response, please let me know so I can save myself the trouble of creating my own ticket.

Thank you.

  • Resolution: Not fixable

    I created a SolarWinds ticket for this and got some responses but all of the steps were basic items I already tried and they ended up not being able to help me. I gave up on trying to troubleshoot the issue and also closed the support ticket with SolarWinds.

    The two ESXi 5.0 Hosts are still showing their health status as undefined and cannot be polled for hardware health. All other VMware polling using SNMP and the VMware API works, and I see all the VM information.

    What I can't see (or get alerts on) without CIM polling working in SolarWinds is whether there is a hardware issue on the ESXi Host, such as with the RAID controller or one of the drives.

    I did a bunch of tests myself and discovered a few new tools I had not used before for troubleshooting - slptool (from OpenSLP) and CIM polling with PowerShell.

    My Troubleshooting

    1. I get the same 501 HTTP error from all Hosts (5.0 and 5.5) when I attempt to go to https://host_IP_address:5989 . I think this error is not the true reason for the issue.
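A note on why the browser test in step 1 is probably not diagnostic: a browser sends a plain GET, while a CIM-XML listener such as sfcb expects a POST carrying a CIM-XML payload, so a 501 on GET is likely the normal answer from any of these Hosts. Below is a minimal sketch in Python (standard library only) of what an EnumerateInstances request for CIM_Chassis looks like on the wire. The function names, the root/cimv2 namespace choice, the conventional /cimom path, and the disabled certificate checks are all assumptions for illustration; this is not a description of what NPM actually sends.

```python
import base64
import ssl
import urllib.request
import xml.etree.ElementTree as ET


def build_enumerate_instances(class_name, namespace="root/cimv2", message_id="1001"):
    """Return a CIM-XML EnumerateInstances request body as bytes."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    simple_req = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(simple_req, "IMETHODCALL", NAME="EnumerateInstances")
    # The namespace is sent as one NAMESPACE element per path component.
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace.split("/"):
        ET.SubElement(path, "NAMESPACE", NAME=part)
    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(cim, encoding="utf-8")


def build_http_request(host, body, user, password, namespace="root/cimv2", port=5989):
    """Wrap the body in an HTTP POST with the CIM operation headers."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": 'application/xml; charset="utf-8"',
        "CIMOperation": "MethodCall",
        "CIMMethod": "EnumerateInstances",
        "CIMObject": namespace,
        "Authorization": f"Basic {token}",
    }
    # "/cimom" is the conventional CIM-XML path (assumed here).
    url = f"https://{host}:{port}/cimom"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=10):
    """POST the request and return (status, body).

    Certificate checks are disabled on the assumption that the host
    presents a self-signed certificate; tighten this where possible.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return resp.status, resp.read()


if __name__ == "__main__":
    # Print the request body only; nothing is sent over the network here.
    print(build_enumerate_instances("CIM_Chassis").decode("utf-8"))
```

To try it against a Host, build the body, pass it to build_http_request with the Host address and root credentials, and hand the result to send. A current Python/OpenSSL build may refuse the old TLS versions an ESXi 5.0 Host offers, so a connection failure there does not by itself say anything about the CIM service.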

    2. I can query the SLP service on the network where the Hosts are and see that both 5.0 and 5.5 ESXi Hosts are listening for wbem (CIM service) and respond to queries.

    C:\Program Files\OpenSLP>slptool findsrvtypes

    service:VMwareInfrastructure

    service:wbem:https

    3. I can query the CIM attributes on the 5.0 Hosts and the CIM service responds

    C:\Program Files\OpenSLP>slptool findattrs service:wbem:https://esxi_host_IP:5989

    (template-type=wbem),(template-version=1.0),(template-description=This template describes the attributes used for advertising WBEM Servers.),(template-url-syntax=https://esxi_host_ip:5989),(service-hi-name=sfcb),(service-hi-description=Small Footprint CIM Broker 1.3.7),(service-id=76a5b1be-f9cd-4b98-8764-771b98ff382a),(CommunicationMechanism=CIM-XML),(InteropSchemaNamespace=root/interop),(ProtocolVersion=1.0),(FunctionalProfilesSupported=Basic Read,Basic Write,Instance Manipulation,Association Traversal,Query Execution,Indications),(MultipleOperationsSupported=false),(AuthenticationMechanismsSupported=Basic),(Namespace=root/interop,root/cimv2,vmware/esxv2,root/config),(Classinfo=0,0,0,0),(RegisteredProfilesSupported=SNIA:Job Control,DMTF:Software Update,DMTF:PCI Device,DMTF:Software Inventory,DMTF:Indications,Other:IPMI OEM Extension,DMTF:Sensors,DMTF:Profile Registration,DMTF:System Memory,DMTF:Record Log,DMTF:CPU,DMTF:Fan,DMTF:Base Server,DMTF:Battery,DMTF:Physical Asset,DMTF:Power Supply,DMTF:Power State Management)
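The findattrs output above is hard to read as one line. As a small sketch, the following Python turns that "(name=value),(name=value)" text into a dictionary so the 5.0 and 5.5 Hosts can be compared attribute by attribute. The function names are made up for illustration, and the sample string is a shortened copy of the output above; the regular expression assumes values contain no literal parentheses, which holds for this output.

```python
import re

# Matches one "(name=value)" attribute; values may contain commas but no parentheses.
_ATTR = re.compile(r"\(([^=()]+)=([^()]*)\)")


def parse_slp_attrs(text):
    """Turn '(a=1),(b=x,y)' into {'a': '1', 'b': 'x,y'}."""
    return {name.strip(): value.strip() for name, value in _ATTR.findall(text)}


def split_list(value):
    """Split a comma-separated attribute value into a list of items."""
    return [item.strip() for item in value.split(",") if item.strip()]


sample = (
    "(template-type=wbem),(service-hi-name=sfcb),"
    "(CommunicationMechanism=CIM-XML),"
    "(Namespace=root/interop,root/cimv2,vmware/esxv2,root/config),"
    "(AuthenticationMechanismsSupported=Basic)"
)

attrs = parse_slp_attrs(sample)
print(attrs["CommunicationMechanism"])   # CIM-XML
print(split_list(attrs["Namespace"]))    # ['root/interop', 'root/cimv2', 'vmware/esxv2', 'root/config']
```

Running the same parse on the output from a 5.5 Host and comparing the two dictionaries would show whether the Hosts advertise different namespaces or profiles.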

    4. The 5.0 Hosts will not poll using the SolarWinds.HardwareHealthTool.exe tool. It shows the same 501 error that the NPM web console shows in the hardware health panel.

    5. I can poll the 5.0 Hosts using PowerShell, though:

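    # Address and login of the ESXi Host to query
    $ipaddress = "172.31.xxx.xxx"
    $HostUsername = "root"

    # ESXi Hosts typically use self-signed certificates, so skip the certificate checks
    $CIOpt = New-CimSessionOption -SkipCACheck -SkipCNCheck -SkipRevocationCheck -Encoding Utf8 -UseSsl

    # Passing only a user name to -Credential prompts for the password
    $Session = New-CimSession -Authentication Basic -Credential $HostUsername -ComputerName $ipaddress -Port 443 -SessionOption $CIOpt

    # Query the same class that NPM reports as failing
    Get-CimInstance -CimSession $Session -ClassName CIM_Chassis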

    ....

    Caption : Chassis

    ChassisPackageType : 23

    ChassisTypeDescription : Rack Mount Chassis

    CommunicationStatus :

    CreationClassName : OMC_Chassis

    ...

    Manufacturer : Dell Inc.

    Model : PowerEdge R510

    ...

    PSComputerName : 172.31.xxx.xxx

    6. Hardware health appears correctly when connecting to the ESXi Hosts using the vSphere client, and when connecting to the vmk using a web browser at https://esxi_host_ip/mob and navigating through the Managed Object Browser web data.

    7. I have removed the Hosts from NPM and added them back; changed the polling between a) ICMP only with VMware polling, b) SNMP polling with VMware polling, and c) SNMP polling without VMware polling; restarted the management interfaces on the ESXi Hosts; and verified that all services are running correctly and that there are no firewall issues.
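For the firewall part of the last step, a quick reachability check from the NPM server can rule out blocked ports before digging further. The sketch below uses only Python's standard library; the function names are hypothetical, and the default ports are simply the two that appear in this thread (443 and 5989). An open port only proves that TCP connects, not that the CIM service answers correctly.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=(443, 5989)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}
```

Calling check_ports with the address of each ESXi Host returns a dictionary such as one entry for 443 and one for 5989, each True or False, which makes it easy to compare the 5.0 and 5.5 Hosts side by side.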