TLS/SSL Error AppServer WebSphere

Hi,

     For some time I have had an issue whose root cause I have now decided to track down. SolarWinds APM reports errors from the HTTP/TLS monitor on one of our agents when it checks the HTTPS service: some of these checks fail with an error creating the SSL/TLS secure channel. The service is not actually down, because internal users are working with it at the same time.

     I enabled debug logging for this agent, and this is what I get when the error occurs:

2016-08-01 16:14:26,016 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HttpClientHelper - WebRequest created. Getting the response. Starting ResponseTime measurement.

2016-08-01 16:14:26,016 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HttpClientHelper - Using asynchronous call.

2016-08-01 16:14:26,016 [STP Pool:10 Thread #0] [C341] DEBUG SolarWinds.APM.Probes.HttpClientHelper - WebRequest created. Getting the response. Starting ResponseTime measurement.

2016-08-01 16:14:26,016 [STP Pool:10 Thread #0] [C341] DEBUG SolarWinds.APM.Probes.HttpClientHelper - Using asynchronous call.

2016-08-01 16:14:26,578 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HttpClientHelper - #1 GetWebResponse failed. The request was aborted: Could not create SSL/TLS secure channel.

2016-08-01 16:14:26,578 [STP Pool:10 Thread #1] [C338] ERROR SolarWinds.APM.Probes.HTTP.HttpProbeBase`1 - WebException caught.

System.Net.WebException: The request was aborted: Could not create SSL/TLS secure channel.

   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)

   at System.Net.WebClient.GetWebResponse(WebRequest request, IAsyncResult result)

   at SolarWinds.APM.Probes.HttpClientHelper.GetWebResponse(WebRequest request)

   at SolarWinds.APM.Probes.HttpClientHelper.GetWebResponseFromUri(Uri address, Stopwatch stopwatch)

   at SolarWinds.APM.Probes.HttpClientHelper.Download(Uri address, Stopwatch stopwatch, Encoding encoding)

   at SolarWinds.APM.Probes.HTTP.HttpProbeBase`1.ProbeInternal(ProbeInformation probeInfo, HttpMonitorResult result)

2016-08-01 16:14:26,594 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HTTP.HttpProbeBase`1 - Outcome - NotAvailable.

2016-08-01 16:14:26,594 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HTTP.HttpProbeBase`1 - Cleaning up the HTTP client.

2016-08-01 16:14:26,609 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.HTTP.HttpProbeBase`1 - Adding results to the result writer.

2016-08-01 16:14:26,609 [STP Pool:10 Thread #1] [C338] DEBUG SolarWinds.APM.Probes.ProbeExecutors.RegularBatchExecutor - Component probe finished: JobContext - NodeId: 64, NodeName: Higia, ApplicationId: 25, ApplicationName: UN-Suir, ComponentId: 338, ComponentName: SUIR PROD - WAS, CustomLogEnabled: True
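
When the failures are intermittent, it can help to count them per component over a longer stretch of the debug log. The following is a minimal sketch that assumes the log lines keep the layout shown in the excerpt above (timestamp, thread, `[C<component id>]`, level, logger, message); the function name `count_tls_failures` is made up for illustration and is not part of SolarWinds.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# timestamp [thread] [C<component id>] LEVEL logger - message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"\[C(?P<component>\d+)\] "
    r"(?P<level>[A-Z]+) +"
    r"(?P<logger>\S+) - (?P<message>.*)$"
)

FAILURE_TEXT = "Could not create SSL/TLS secure channel"


def count_tls_failures(lines):
    """Count secure-channel failures per component id.

    Only timestamped log lines are counted, so the bare
    'System.Net.WebException: ...' line that follows each failure
    is not counted a second time.
    """
    failures = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and FAILURE_TEXT in match["message"]:
            failures[int(match["component"])] += 1
    return dict(failures)
```

The function accepts any iterable of lines, so an open log file object can be passed directly.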

The agent monitors an HTTPS service that runs on our WebSphere Application Server. The port the agent checks is not the standard one, but it is configured correctly.
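
Because the error is raised while the secure channel is being negotiated, one thing worth checking from the polling server is which protocol versions the service on that port accepts. The sketch below pins one TLS version per attempt using Python's standard `ssl` module. The host name and port in the usage note are placeholders, and certificate validation is deliberately switched off because only the handshake is of interest here. Two caveats: most current Python/OpenSSL builds cannot offer SSLv3 at all, and a local OpenSSL policy may refuse TLS 1.0/1.1, so a failure for an old version can come from the client side rather than the server.

```python
import socket
import ssl

# Versions to try one at a time. SSLv3 is left out on purpose because
# most current OpenSSL builds no longer support it.
VERSIONS = [
    ssl.TLSVersion.TLSv1,
    ssl.TLSVersion.TLSv1_1,
    ssl.TLSVersion.TLSv1_2,
    ssl.TLSVersion.TLSv1_3,
]


def make_context(version):
    """Build a client context that offers exactly one protocol version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic use only: skip certificate checks, test the handshake.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def probe(host, port, version, timeout=5.0):
    """Attempt one handshake and return a short, human-readable result."""
    try:
        ctx = make_context(version)
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return f"ok ({tls.version()}, {tls.cipher()[0]})"
    except (ssl.SSLError, OSError, ValueError) as exc:
        return f"failed: {exc}"
```

For example, looping over `VERSIONS` and printing `probe("was.example.internal", 9443, v)` for each `v` (both values are hypothetical) shows which versions complete a handshake. Running the loop repeatedly over a few minutes may also show whether the failure is tied to one protocol version or is random.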

  • Do you know if that site only supports TLS 1.2? There was a bug in SAM where it had issues in that regard. Upgrading to the latest version fixed it. What version are you on now?

  • I have an open ticket, Case # 1001037, for this. We ping many dozens of HP iLO and Dell iDRAC cards, and have had thousands of errors over the last few weeks. Typically a card goes down for one ping cycle (5 mins). I've sent off the usual full log bundle, plus debug logs, but it's been quite a while since I used Wireshark and I need to refresh my memory of how to take a trace. The HP iLO cards have all been updated to the latest firmware, and for some reason all of those are now fine - most were fine anyway. The Dell iDRAC cards are another matter. I updated every last one to the very latest firmware - they all take the same firmware. None of them seems to have the errors more than the rest, i.e. any of them will randomly have an error, then it will poll OK.

    This is the error reported by the component part of the application template.

    Component Status          Unexpected error occurred. The request was aborted: Could not create SSL/TLS secure channel.

    Source                                  APM: Component Https

    Alert time                            03 August 2016 09:20

    Last Up Time                      03/08/2016 10:14:38

    The cards are grouped together in an application template, i.e. Site A ESX hosts, Site B ESX hosts, Site A VDI ESX Hosts, etc.

    Our environment is as below, plus Core 2016 hotfix 3 (overnight) and NTA hotfix 1 (otherwise it won't display properly in the web client).

    Orion Platform 2016.1.5300, NCM 7.5, IPAM 4.3.1, IVIM 2.1.2, NetPath 1.0, DPA 10.0.1, NPM 12.0, QoE 2.1.0, SAM 6.2.4, NTA 4.2.0
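
For the network trace mentioned above, a capture filter that keeps only the traffic between one poller and a set of management cards keeps the capture small. Below is a minimal sketch that builds such a filter in standard pcap filter (BPF) syntax. The function name, the example addresses (taken from the documentation range 192.0.2.0/24) and the default of TCP port 443 are all assumptions for illustration.

```python
import ipaddress


def build_capture_filter(poller, targets, port=443):
    """Return a pcap/BPF capture filter for poller <-> targets on one TCP port.

    Addresses are validated with ipaddress so that a typo fails here
    instead of producing a filter that silently matches nothing.
    """
    poller_ip = ipaddress.ip_address(poller)
    target_ips = [ipaddress.ip_address(t) for t in targets]
    if not target_ips:
        raise ValueError("at least one target address is required")
    hosts = " or ".join(f"host {ip}" for ip in target_ips)
    return f"host {poller_ip} and ({hosts}) and tcp port {port}"


# Hypothetical addresses: one poller and two iDRAC cards.
print(build_capture_filter("192.0.2.10", ["192.0.2.21", "192.0.2.22"]))
```

The resulting string can be pasted into Wireshark's capture filter field, or passed to `tshark` or `dumpcap` with the `-f` option.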

  • Hi Chad.

        Thanks for your answer. In our case the connection between the servers uses SSL3 (I know, I know....). The current version of SAM is 6.2.1, which I guess is not the latest one.

  • Hi Chad,

              Is there something I can do to solve this problem? I constantly receive this false alarm because the agent's monitor itself fails when it tries to check the secure site.

  • Honestly, I would try updating SAM to version 6.2.4 and see if anything changes. I'm all for making sure I'm on the latest version if I'm seeing weird anomalies.

  • Thanks Chad. I will do it ASAP. By the way, can you share the release notes for this version?

  • If you note my post - I already have 6.2.4 - it started after I upgraded from 6.2.3 to 6.2.4 - although we also went to NPM 12. Upgrading introduced it, didn't solve it. But if others are getting it on previous versions then perhaps it isn't the SolarWinds software, but the .NET Framework, Microsoft patches, browser upgrades, etc.

  • As support articles go that is a pretty bad example of one - it's got no date (to place it in time), nor any details of what versions of SAM it relates to.  We have the very latest SAM so you would think it would have been fixed in that version - hence if they said this article relates to SAM 6.2.4, and an expected fix is due in, say, hotfix 4, due <whenever>, that would make the article much more useful.

    It is a useful thing to check, and I really appreciate the link.  But as it happens I had already set them to x64 - I do typically use that option where I can - I mean, we've had 64-bit capable servers for a decade now and so it still baffles me why any software manufacturer (including Microsoft) are still producing any 32-bit software.

    I should have also mentioned we run our main poller on a physical box with a lot of power. The database is on a separate physical server with flash storage locally. The additional poller is a VM which is over-specced. Without a network trace, and despite my sending a ton of additional logs, the SolarWinds dev team aren't able to reproduce it or solve it. Unfortunately I'm knee-deep in work and don't have time to teach myself how to capture just the Wireshark data I need (one poller + multiple IP addresses of iDRAC cards).

    Typically we see one or two cards per cluster of machines, now and again, usually it's just one, sometimes two.  The clusters are about 16, 16, 6 and 6 (for those Dell-based ESX hosts).

    I created a report and it seems that basically every single iDRAC is affected: over 287 polls (24 hours), component availability ranges from 96.5% to 99.65%. That answers the question of whether specific cards are being affected but not others. Also, they were all updated by me to the very latest firmware - to try to cure this problem (+compliance/audit reason - future work brought forward).

    I've resorted to running PingPlotter on some iDRACs just in order to rule out packet loss (never seen any).
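
The availability figures quoted above can be read back as failed-poll counts, which makes the scale of the problem easier to see. This is a small sketch that assumes every poll carries equal weight in the reported percentage; the function names are made up for illustration.

```python
def failed_polls(availability_pct, total_polls):
    """Estimate how many polls failed, given an availability percentage."""
    return round(total_polls * (1 - availability_pct / 100))


def availability(failed, total_polls):
    """Availability in percent for a given number of failed polls."""
    return 100 * (total_polls - failed) / total_polls


TOTAL = 287  # polls per card in 24 hours, as stated in the report

print(failed_polls(99.65, TOTAL))  # best card in the report
print(failed_polls(96.5, TOTAL))   # worst card in the report
```

Under that assumption, 99.65% over 287 polls corresponds to a single failed poll and 96.5% to ten, so each card failed somewhere between 1 and 10 polls in the 24-hour window.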