11 Replies Latest reply on Mar 12, 2014 8:54 AM by Jan Pelousek

    Max number of groups in NPM

    Jason.Henson

      Does anybody know the maximum number of groups the software can handle?  I just built out 900+ groups with no contents yet, and it slammed the 4-core 3 GHz CPU and rendered the server almost unusable.  I've backed those groups out now and the server is back to where we can manage it, but we need to put those groups back in soon so that we can configure strategic dependencies for alerting.

       

      Thanks,

      Jason Henson

      Loop1 Systems

      www.Loop1Systems.com

        • Re: Max number of groups in NPM
          mavturner

          Jason,

           

          There is no hard-set maximum number of groups. As you pointed out, static groups have much less impact than dynamic groups, but even then there are still internal limits. One of our larger customers has about 600 groups, about half dynamic and half static. As we get more requests for larger numbers of groups, we will continue to improve the performance of these groups. Can you tell me a little more about what you are trying to do that requires this large number of groups?

           

          Mav

            • Re: Max number of groups in NPM
              Jason.Henson

              Hey Mav,

               

              Sure.  In this instance we are working with an environment where the network is designed with a large number of remote sites.  In each site, there are layers to the network.  The end user wants to receive alerts only for applicable devices.  As a result, each of their sites is being configured with a total of 12 groups.  Each group is going to have a dependency once we get them set up.  We were creating these groups by injecting them into SQL and found problems with doing that, so we have switched to building them with the SDK.  That seems to have helped a lot.  We did see the groups slam the CPU as soon as we built them, but we restarted the SWISv3 service and that brought the CPU back down to a manageable level.  The web interface is still struggling to show the groups, but they do exist in the database.  The All Groups resource on the Groups summary page won't populate no matter how long we let it run, and occasionally when we manage the groups we receive an error stating "Error: A query to the SolarWinds Information Service failed."

               

              This is a very important feature for the customer because it determines whether or not they receive redundant alerts that confuse the problem.

               

              The end user had a few groups built prior to my involvement but at this point we have a total of 901 groups.  We are building 708 groups.  The delta between those two numbers will be removed when all is said and done because the new strategy accomplishes what they want to do and the old strategy does not.

               

              Do you have any suggestions to help us work through this?

               

               

              Thanks,

              Jason Henson

              Loop1 Systems

              www.Loop1Systems.com
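
The per-site layout described in the post above (12 groups per site, created through the SDK rather than by SQL injection) can be sketched roughly as follows. This is only an illustration under several assumptions: the `invoke` callable stands in for an SDK client call such as `SwisClient.invoke` from the `orionsdk` Python package, the `CreateContainer` argument order follows the Orion SDK's published group samples and may differ in older SDK versions, the custom property `SiteLayer` and the site and layer names are hypothetical, and creating groups in small batches with pauses is one possible way to soften the CPU spike, not a confirmed fix.

```python
import time
from typing import Callable, List, Sequence, Tuple


def plan_site_groups(site: str, layers: Sequence[str]) -> List[Tuple]:
    """Build one CreateContainer argument tuple per network layer of a site."""
    plans = []
    for layer in layers:
        name = f"{site} - {layer}"
        members = [{
            "Name": name,
            # Dynamic query; 'SiteLayer' is a hypothetical custom property.
            "Definition": f"filter:/Orion.Nodes[CustomProperties.SiteLayer='{site}/{layer}']",
        }]
        plans.append((
            name,                          # group name
            "Core",                        # owner
            300,                           # refresh frequency in seconds (assumed value)
            0,                             # status rollup mode: 0 = mixed
            f"{layer} devices at {site}",  # description
            True,                          # polling enabled
            members,                       # member definitions
        ))
    return plans


def create_in_batches(invoke: Callable, plans: Sequence[Tuple],
                      batch_size: int = 25, pause_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> list:
    """Create groups a batch at a time, pausing between batches."""
    created = []
    for start in range(0, len(plans), batch_size):
        for args in plans[start:start + batch_size]:
            created.append(invoke("Orion.Container", "CreateContainer", *args))
        if start + batch_size < len(plans):
            sleep(pause_s)  # give the information service time to settle
    return created


if __name__ == "__main__":
    # Dry run with a fake invoke, so nothing is sent to a real server.
    layers = [f"layer{i:02d}" for i in range(1, 13)]   # 12 groups per site
    plans = [p for n in range(1, 60) for p in plan_site_groups(f"site{n:03d}", layers)]
    ids = create_in_batches(lambda entity, verb, *args: args[0], plans,
                            sleep=lambda seconds: None)
    print(len(ids), "groups planned; first:", ids[0])
```

With a real client, `invoke` would be the client's own invoke method; the dry run above only checks that 59 sites with 12 layers each produce the 708 groups mentioned in the post.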

                • Re: Max number of groups in NPM
                  Jason.Henson

                  Mav,

                   

                  Following up, we are getting the groups into the db with the SDK.  At this point though, we are consistently receiving the error

                   

                  Capture.JPG

                   

                   

                  ... when we try to view the groups management page.  The groups do contain contents.  The SDK is importing the contents as dynamic queries.  We tested with a subset of the groups and it worked fine.  When we load all of the groups, the page fails with this error.

                   

                   

                  Thanks,

                  Jason Henson

                  Loop1 Systems

                  www.Loop1Systems.com

                  • Re: Max number of groups in NPM
                    cshanep

                    Mav,

                     

                    Below is an exception we are seeing over and over in C:\ProgramData\SolarWinds\Logs\Orion\OrionWeb.log. It seems that every time we try to manage groups or go to the Groups page, we see this exception. Any thoughts?

                     

                    Thanks,

                    Shane

                    www.loop1systems.com

                     

                    2012-07-05 13:18:15,907 [67] WARN  SolarWinds.Orion.Web.InformationService.InformationServiceProxy - Caught CommunicationObjectFaultedException. Will try to reconnect to SWIS in 5s.

                    2012-07-05 13:18:20,907 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.

                    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

                     

                    Server stack trace:

                       at System.ServiceModel.Channels.CommunicationObject.Close(TimeSpan timeout)

                       at System.ServiceModel.Channels.CommunicationObject.Close()

                    Exception rethrown at [0]:

                       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

                       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

                       at System.ServiceModel.ICommunicationObject.Close()

                       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Close()

                    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error executing query:

                    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

                     

                    Server stack trace:

                       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

                       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

                       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

                       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

                     

                    Exception rethrown at [0]:

                       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

                       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

                       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

                       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

                    SELECT COUNT(DISTINCT m.MemberPrimaryID) AS TotalRows

                                                                FROM Orion.ContainerMembers m LEFT JOIN Metadata.Entity me

                                                                    ON me.FullName = m.MemberEntityType

                                                                 where m.MemberEntityType = 'Orion.Nodes' RETURN XML RAW

                    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.Orion.Web.InformationService.InformationServiceProxy - main exception:

                     

                    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

                     

                    Server stack trace:

                       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

                       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

                       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

                       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

                     

                    Exception rethrown at [0]:

                       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

                       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

                       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

                       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

                       at SolarWinds.Orion.Core.Common.OrionInfoServiceProxy.SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

                       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteReader(CommandBehavior behavior)

                       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteDbDataReader(CommandBehavior behavior)

                       at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables, Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)

                       at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.Query(String query, IDictionary`2 parameters, Boolean useSeparateExecutionContext)

                    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.Orion.Web.InformationService.InformationServiceProxy - The communication object, System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

                    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

                     

                    Server stack trace:

                       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

                       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

                       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

                       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

                       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

                     

                    Exception rethrown at [0]:

                       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

                       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

                       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

                       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

                       at SolarWinds.Orion.Core.Common.OrionInfoServiceProxy.SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

                       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteReader(CommandBehavior behavior)

                       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteDbDataReader(CommandBehavior behavior)

                       at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables, Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior behavior)

                       at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)

                       at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.Query(String query, IDictionary`2 parameters, Boolean useSeparateExecutionContext)

                    2012-07-05 13:18:26,564 [13] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.

                    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
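
To gauge how often this exception recurs, a log like the one above can be tallied with a short script. The sketch below assumes only the entry layout visible in the excerpt (timestamp, thread id in brackets, level, logger name, then " - " and the message); the function name `summarize` and the sample lines are illustrative, and the script diagnoses nothing beyond counting.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches entry headers such as:
# 2012-07-05 13:18:15,907 [67] WARN  Some.Logger.Name - message text
ENTRY = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>\d+)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def summarize(lines: Iterable[str]) -> Tuple[Counter, int]:
    """Count entries per (level, short logger name) and faulted-channel mentions."""
    entries: Counter = Counter()
    faulted = 0
    for line in lines:
        match = ENTRY.match(line)
        if match:
            short_logger = match["logger"].rsplit(".", 1)[-1]
            entries[(match["level"], short_logger)] += 1
        if "CommunicationObjectFaultedException" in line:
            faulted += 1
    return entries, faulted


SAMPLE = [
    "2012-07-05 13:18:15,907 [67] WARN  SolarWinds.Orion.Web.InformationService.InformationServiceProxy - Caught CommunicationObjectFaultedException. Will try to reconnect to SWIS in 5s.",
    "2012-07-05 13:18:20,907 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.",
    "System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.",
    "2012-07-05 13:18:20,954 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error executing query:",
]

if __name__ == "__main__":
    counts, faulted = summarize(SAMPLE)
    for (level, logger), n in sorted(counts.items()):
        print(level, logger, n)
    print("lines mentioning CommunicationObjectFaultedException:", faulted)
```

To run it against the real file, pass an open file handle for OrionWeb.log to `summarize` instead of `SAMPLE`, for example one opened with `errors="replace"` in case the log contains unexpected bytes.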

                • Re: Max number of groups in NPM
                  mgibson

                  I was wondering whether it was ever determined what the maximum number of groups is that NPM can handle?

                  I am up to 600+ and can't seem to open the manage groups resource even after changing the timeout from 60 to 120.

                  I am running NPM 10.6.1 and have been told by support that 10.7 should correct my issue. Yeah, right! Case #589137

                    • Re: Max number of groups in NPM
                      HerrDoktor

                      We currently have 26 active groups that run smoothly. However, if we want to apply an account limitation with a "group of groups", the web resource "All Groups" does not work. This might not be related to your issue.

                       

                      We do not have the NPM Module so I can only tell you the following versions: Orion Platform 2013.1.0, NCM 7.2.1

                      We have been told NCM 7.2.2 should fix this. We are looking forward to it.

                      • Re: Max number of groups in NPM
                        Jan Pelousek

                        Hello, the current implementation allows the creation of 2100 groups. This ceiling comes from a limit on the SQL Server side and will be improved in the future. Below that number, however, the biggest factor in performance is the server itself. Basically, groups and dependencies are among the most expensive features of the Orion Platform. I can imagine that 600+ complex groups (containing dynamic member definitions) could kill the performance of Orion servers and SQL servers that are not very powerful. It is not possible to estimate the requirements, since they depend heavily on group complexity; for example, one very complex group could load the server more than 100 simple groups with static definitions. I hope this helps you understand. Regarding the upgrade to 10.7: yes, it is possible that it helps. Some improvements were made there, but the result depends heavily on your configuration.

                        Regards,

                        H.
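
As a worked check of the figures quoted in this thread (a 2100-group ceiling, 901 groups in total of which 708 are new, and 12 groups per site), the arithmetic can be written out as below. The function and key names are illustrative, and the result says nothing about performance, which according to the reply above depends on group complexity and server capacity rather than on the count alone.

```python
MAX_GROUPS = 2100      # ceiling quoted in the last reply
GROUPS_PER_SITE = 12   # per-site layout described earlier in the thread


def group_budget(total_now: int, new_groups: int, limit: int = MAX_GROUPS) -> dict:
    """Work out how the thread's group counts relate to the quoted ceiling."""
    return {
        "legacy_to_remove": total_now - new_groups,       # old-strategy groups
        "sites_covered": new_groups // GROUPS_PER_SITE,   # sites in the new layout
        "headroom_now": limit - total_now,
        "headroom_after_cleanup": limit - new_groups,
        "max_sites_at_limit": limit // GROUPS_PER_SITE,
    }


if __name__ == "__main__":
    for key, value in group_budget(total_now=901, new_groups=708).items():
        print(f"{key}: {value}")
```

For the numbers in the thread this gives 193 legacy groups to remove, 59 sites covered, and room for at most 175 sites at 12 groups each before the quoted ceiling is reached.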