2 Replies Latest reply on Jun 15, 2015 8:13 AM by jbiggley

    Master.log showing 'sending results to VIM failed: Connection refused' on master collector

    jbiggley

      VMAN integration with NPM/SAM can be a little tricky when you get multiple polling engines and multiple federated collectors in the mix.  We had been struggling with data not showing up in the Orion console for a while and, after cleaning up a bunch of decommissioned hosts, we were finally able to cut through the clutter and find that two of our federated collectors in the DMZ were having issues.  Or so we thought.

       

      Once we dug into the master.log on our primary VMAN collector we found messages that said either:

       

      2015-06-10 14:02:52,152 [pool-27-thread-1]  INFO com.solarwinds.vman.orion.collector.JobResultDispatcher:250 - Sending data to Orion data processor on URL: https://[IP_address_here]:17778/orion/collector/remotedataprocessor

      2015-06-10 14:02:56,650 [pool-27-thread-1]  INFO com.solarwinds.vman.orion.collector.JobResultDispatcher:262 - Data to Orion data processor sent successfully.

       

      Or

       

      2015-06-10 14:02:59,362 [pool-27-thread-1]  INFO com.solarwinds.vman.orion.collector.JobResultDispatcher:250 - Sending data to Orion data processor on URL: https://[IP_address_here]:17778/orion/collector/remotedataprocessor

      2015-06-10 14:02:59,364 [pool-27-thread-1] ERROR com.solarwinds.vman.orion.collector.JobResultDispatcher:139 - sending results to VIM failed: Connection refused

       
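      If you want a quick read on how often the handoff is failing, you can grep master.log for both messages.  Here's a minimal sketch -- the two-line sample log is stubbed in purely for illustration; point LOG at your actual master.log on the primary collector:

```shell
#!/bin/sh
# Minimal sketch: count successful vs. refused handoffs to the Orion data
# processor.  The sample log written below is a stand-in for illustration;
# set LOG to your real master.log path on the VMAN primary collector.
LOG=/tmp/master.log.sample
cat > "$LOG" <<'EOF'
2015-06-10 14:02:56,650 [pool-27-thread-1]  INFO com.solarwinds.vman.orion.collector.JobResultDispatcher:262 - Data to Orion data processor sent successfully.
2015-06-10 14:02:59,364 [pool-27-thread-1] ERROR com.solarwinds.vman.orion.collector.JobResultDispatcher:139 - sending results to VIM failed: Connection refused
EOF

echo "successes: $(grep -c 'sent successfully' "$LOG")"
echo "failures: $(grep -c 'sending results to VIM failed' "$LOG")"
```

      With the stubbed sample above this reports one success and one failure; on a real log the failure count tells you how much data never made it to Orion.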

      That's odd.  The port requirements guide (http://www.solarwinds.com/documentation/Orion/docs/SolarWindsPortRequirements.pdf) has a line that says "17778: For communication with the Orion server if the integration with Orion is enabled" and since, in our case, our VMAN primary collector and primary Orion server are on the same VLAN, there shouldn't be a problem.  Why, then, were we seeing connections being attempted from our VMAN primary collector to our DMZ-based Orion polling engines and failing?  The secret is IVIM.

       

      Integrated Virtualization Infrastructure Manager (or IVIM for short) runs on your Orion servers.  It is like VMAN-lite: if you don't have VMAN, or don't have VMAN integrated with either NPM or SAM, it collects data from the vCenters you are monitoring.  When VMAN enters the picture, the primary collector sends data to the IVIM process via the remotedataprocessor WSDL (I think!) for insertion into your Orion DB, specifically the VIM_* tables.  In our case, our DMZ-based additional polling engines poll our DMZ-based vCenter servers.  This means that in order to get the data into the Orion DB, the primary collector (which uses the VMAN federated collectors in the DMZ to collect data from the vCenter API for VMAN) turns around and sends the data to the DMZ-based Orion polling engines, which in turn send it to the database.

       

      Yep, you read that right.  The data you collected from your DMZ-based collector crosses the DMZ boundary three times before it finally gets to the database: collector to primary collector (1), primary collector to DMZ poller (2), and DMZ poller to database server (3).

       

      What does this all mean?  It means that you need to open TCP 17778 from your primary collector to all of the polling engines (in the DMZ or other firewall-separated VLANs) that are being used to manage vCenter servers.  You are already familiar with ensuring that vCenters and all of the hosts they manage are assigned to the same polling engine for proper data collection.  Now you have one more thing to remember if you've enabled VMAN-to-NPM integration!
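      Before (and after) opening the firewall rule, it's worth confirming that TCP 17778 on each polling engine is actually reachable from the primary collector.  A minimal sketch using bash's /dev/tcp redirection -- the poller hostnames are placeholders I made up, not anything from our environment:

```shell
#!/bin/bash
# Hedged sketch: test whether TCP 17778 on each Orion polling engine is
# reachable from the VMAN primary collector.  Hostnames are placeholders;
# substitute your own firewall-separated polling engines.
check_port() {
  host=$1; port=$2
  # bash's /dev/tcp attempts a TCP connect; timeout guards against a
  # silently dropping (filtered) firewall rule.
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port open"
  else
    echo "$host:$port closed or filtered"
  fi
}

for poller in dmz-poller-01 dmz-poller-02; do  # replace with your pollers
  check_port "$poller" 17778
done
```

      "closed or filtered" from the primary collector against a DMZ poller is exactly the condition that produces the "Connection refused" errors in master.log.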

       

      In my opinion the documentation really ought to be clearer for those of us running multi-poller environments, especially where a DMZ or other firewall-separated VLAN is involved. chrispaap dsbalcau