This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.
You can no longer post new replies to this discussion. If you have a similar question you can start a new discussion in this forum.

Alerting Broken After 2023.2 Upgrade

We upgraded our instance from 2023.1.1 to 2023.2. After the upgrade 30% of our alerts are broken and not firing, mostly with our Component based alerts. The Alerting.Service Log shows a conversion failed SQL exception for the alert triggers: 

2023-04-28 15:04:43,276 [35] WARN SolarWinds.Orion.Core.Alerting.Plugins.Conditions.Swql.ConditionEvaluatorSwql - Condition evaluation failed : RunQuery failed, check fault information.
Conversion failed when converting the nvarchar value 'net-snmp' to data type int.
2023-04-28 15:04:43,276 [35] ERROR SolarWinds.Orion.Core.Alerting.Service.ConditionsStateEvaluator - Condition 'AlertId: 326, AlertLastEdit: 4/5/2023 1:32:36 PM, ConditionIndex: 0, Type: Trigger' Evaluator failed - Condition evaluation failed for query = (SELECT E0.[Uri], E0.[DisplayName]
FROM Orion.APM.Component AS E0
WHERE ( ( ( E0.[Application].[Node].[Status] = @p0*1 ) AND ( E0.[Status] != @p1*1 ) AND ( E0.[Status] != @p2*1 ) AND ( E0.[Status] != @p3*1 ) AND ( E0.[Status] != @p4*1 ) AND ( E0.[Application].[Node].[CustomProperties].[OPS_Targeted_Alert_Node] = @p5*1 ) AND ( E0.[ComponentAlert].[UserNotes] NOT LIKE @p6 ) AND ( E0.[ComponentAlert].[UserNotes] NOT LIKE @p7 ) AND ( E0.[Application].[Node].[CustomProperties].[OPS_Targeted_Non_Crt_Node] = @p8*1 ) AND ( E0.[Application].[ApplicationAlert].[ApplicationName] LIKE @p9 ) AND ( ( E0.[Application].[Node].[Vendor] = @p1*10 ) OR ( E0.[Application].[Node].[Vendor] = @p1*11 ) OR ( E0.[Application].[Node].[Vendor] = @p1*12 ) ) ) AND ( ( E0.[Status] != @p1*13 ) ) )), condition = (AlertConditionDynamic: scope=(
([Orion.Nodes|Status|Application.Node] = '1')
AND ([Orion.APM.Component|Status] != '27')
AND ([Orion.APM.Component|Status] != '9')
AND ([Orion.APM.Component|Status] != '3')
AND ([Orion.APM.Component|Status] != '0')
AND ([Orion.NodesCustomProperties|OPS_Targeted_Alert_Node|Application.Node.CustomProperties] = '1')
AND ([Orion.APM.ComponentAlert|UserNotes|ComponentAlert] NOTCONTAINS 'NonCritcal:')
AND ([Orion.APM.ComponentAlert|UserNotes|ComponentAlert] NOTCONTAINS 'Serious:')
AND ([Orion.NodesCustomProperties|OPS_Targeted_Non_Crt_Node|Application.Node.CustomProperties] = '0')
AND ([Orion.APM.ApplicationAlert|ApplicationName|Application.ApplicationAlert] CONTAINS 'OPS Telnet - EDI Proxy Ports')
AND (
([Orion.Nodes|Vendor|Application.Node] = 'net-snmp')
OR ([Orion.Nodes|Vendor|Application.Node] = 'Sun Microsystems')
OR ([Orion.Nodes|Vendor|Application.Node] = 'Unknown')
)
): (OR ([Orion.APM.Component|Status] != '1'))) - System.ServiceModel.FaultException`1[SolarWinds.InformationService.Contract2.InfoServiceFaultContract]: RunQuery failed, check fault information.
Conversion failed when converting the nvarchar value 'net-snmp' to data type int. (Fault Detail is equal to InfoServiceFaultContract [ System.Data.SqlClient.SqlException (0x80131904): Conversion failed when converting the nvarchar value 'net-snmp' to data type int.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlDataReader.TryHasMoreRows(Boolean& moreRows)
at System.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)
at System.Data.SqlClient.SqlDataReader.Read()
at SolarWinds.InformationService.DataProviders.SqlQueryRelation.<GetEnumerator>d__8.MoveNext()
at SolarWinds.Data.Query.PhysicalQueryPlan.Provider...). 

Debug for the Alerting Log shows missing entities when alerts are fired: 

2023-04-28 14:08:57,850 [48] DEBUG SolarWinds.Orion.Core.Alerting.Service.ConditionsStateEvaluator - EvaluateScheduled: nothing to evaluate, exiting

2023-04-28 14:08:57,850 [36] DEBUG SolarWinds.Orion.Core.Alerting.Service.ConditionsStateEvaluator - EvaluateScheduled: nothing to evaluate, exiting

2023-04-28 14:08:57,909 [46] DEBUG SolarWinds.Orion.Core.Common.ChannelProxy`1 - Invoking <Query>b__0 finished

2023-04-28 14:08:57,909 [46] DEBUG SolarWinds.Orion.Core.Alerting.Plugins.Conditions.Swql.ConditionEvaluatorSwql - } Start exited

2023-04-28 14:08:57,909 [46] DEBUG SolarWinds.Orion.Core.Alerting.Service.ConditionsStateEvaluator - Condition Evaluator OnNext (AlertId: 244, AlertLastEdit: 7/12/2019 6:30:24 PM, ConditionIndex: 0, Type: Trigger)

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstance

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstanceApplication

2023-04-28 14:08:57,910 [46] DEBUG SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider - Missing entity from navigation SolarWinds.Orion.Core.Common.InformationService.SwisSchemaProvider+RelationsSearchItem (Orion.APM.Application) -> Orion.DPA.DatabaseInstanceClientApplication

When creating new component based alerts and mirroring our old alerts, we can no longer select Application as a trigger condition, Error: "Missing field ApplicationName in Orion.APM.ApplicationAlert"

The only fix is to re-create alerts using Application instead of Component and re-writing the email alerts. Everything was running great on 2023.1.1 and we were pleased with the product. We have an open case with SolarWinds to look at this issue, support seems to be stumped at the moment. 

Also, after upgrading to 2023.2 WPM monitors started to flap, we lost our worker configuration on our players, and that module has become very noisy. Re-recording transactions, adding wait times, resolution, image match adjustments, etc. does not correct the issue. We have an open ticket for this issue as well. 

We thought this 2023.2 upgrade was going to be the same as the 2023.1 and 2023.1.1 upgrades that completed successfully without issue. The only reason we wanted to get to 2023.2 is to address the UTC Bug for last reboot that end users were complaining about, of course that led to the system being down with alerting broken. We have made a decision to wait to preform platform upgrades for at least 6 months due to these issues we are seeing. 

Parents
  • We had a similar issue, many of our most critical alerts stopped working. It was related to the bit in the error:

    Conversion failed when converting the nvarchar value 'net-snmp' to data type int. (Fault Detail is equal to InfoServiceFaultContract [ System.Data.SqlClient.SqlException (0x80131904): Conversion failed when converting the nvarchar value 'net-snmp' to data type int.

    In our case we were able to replace custom property that was causing that error with something that would parse as an integer rather than a string. I'd like to think we were lucky that we were able to do that. But ultimately nothing was coming up under after clicking "Only following set of object (Show List)" until we adjusted our custom property. 

    They definitely broke something in this release. 

  • Delved into my version of this issue over the last few days, there was also a conversion happening where it didnt need to, for me it was a "__00" string into a number or something simmilar.

Reply Children
No Data