Max number of groups in NPM

Does anybody know the maximum number of groups the software can handle? I just built out 900+ groups with no contents yet, and it slammed the 4-core 3 GHz CPU and rendered the server almost unusable. I've backed those groups out now and the server is manageable again, but we need to put those groups back in soon so that we can configure strategic dependencies for alerting.

Thanks,

Jason Henson

Loop1 Systems

www.Loop1Systems.com

  • Jason,

    There is no hard set maximum number of groups. As you pointed out, having static groups has much less impact than dynamic groups, but even then, there will still be internal limits. One of our larger customers has about 600 groups, about half dynamic and half static. As we get more requests for larger numbers of groups we will continue to improve the performance of these groups. Can you tell me a little more about what you are trying to do that requires this large number of groups?

    Mav

  • Hey Mav,

    Sure. In this instance we are working with an environment where the network is designed with a very large number of remote sites. In each site, there are layers to the network. The end user wants to receive alerts only for applicable devices. As a result, each of their sites is being configured with a total of 12 groups. Each group is going to have a dependency once we get them set up. We were creating these groups by injecting them into SQL and found problems with doing that, so we've focused our attention on building them with the SDK. That seems to have helped a lot. We did see the groups slam the CPU as soon as we built them, but we restarted the SWISv3 service and that brought the CPU back down to a manageable level. The web interface is still struggling to show the groups, but they do exist in the database. The All Groups resource on the Groups summary page won't populate no matter how long we let it run, and occasionally when we manage the groups we receive an error stating "Error: A query to the SolarWinds Information Service failed."

    This is a very important feature for the customer because it determines whether or not they receive redundant alerts that confuse the problem.

    The end user had a few groups built prior to my involvement but at this point we have a total of 901 groups.  We are building 708 groups.  The delta between those two numbers will be removed when all is said and done because the new strategy accomplishes what they want to do and the old strategy does not.

    Do you have any suggestions to help us work through this?

    Thanks,

    Jason Henson

    Loop1 Systems

    www.Loop1Systems.com
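
The 12-groups-per-site layout described above can be planned on paper before anything is pushed through the SDK. Here is a minimal sketch, assuming made-up site and layer names (the thread does not give them) and assuming each layer depends on the one before it. If all 708 new groups follow the 12-per-site scheme, that would work out to 59 sites. The code only builds names and parent/child pairs in memory; it makes no Orion SDK calls.

```python
from dataclasses import dataclass

# Hypothetical layer names: the thread only says each site gets 12 groups.
LAYERS = [f"Layer{n:02d}" for n in range(1, 13)]


@dataclass(frozen=True)
class PlannedGroup:
    site: str
    layer: str

    @property
    def name(self) -> str:
        return f"{self.site} - {self.layer}"


def plan_groups(sites, layers=LAYERS):
    """Return one PlannedGroup per (site, layer) pair, grouped by site."""
    return [PlannedGroup(site, layer) for site in sites for layer in layers]


def plan_dependencies(groups):
    """Pair each group with the next layer at the same site.

    Assumes layer N is the parent of layer N+1 within a site. That
    ordering is an assumption made for this sketch.
    """
    pairs = []
    for parent, child in zip(groups, groups[1:]):
        if parent.site == child.site:
            pairs.append((parent.name, child.name))
    return pairs


if __name__ == "__main__":
    sites = [f"Site{n:03d}" for n in range(1, 60)]  # 59 hypothetical sites
    groups = plan_groups(sites)
    deps = plan_dependencies(groups)
    print(len(groups), "groups,", len(deps), "dependencies")
```

Having the full list up front makes it easy to load a small subset first and watch CPU and the Groups page before loading the rest.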

  • Mav,

    Following up, we are getting the groups into the db with the SDK.  At this point though, we are consistently receiving the error

    [Screenshot of the error: Capture.JPG]

    ... when we try to view the groups management page.  The groups do contain contents.  The SDK is importing the contents as dynamic queries.  We tested with a subset of the groups and it worked fine.  When we load all of the groups it works just fine. 

    Thanks,

    Jason Henson

    Loop1 Systems

    www.Loop1Systems.com

  • Mav,

    Below is an exception we are seeing over and over in C:\ProgramData\SolarWinds\Logs\Orion\OrionWeb.log. It seems that every time we try to manage groups or go to the Groups page, we see this exception. Any thoughts?

    Thanks,

    Shane

    www.loop1systems.com

    2012-07-05 13:18:15,907 [67] WARN  SolarWinds.Orion.Web.InformationService.InformationServiceProxy - Caught CommunicationObjectFaultedException. Will try to reconnect to SWIS in 5s.

    2012-07-05 13:18:20,907 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.

    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

    Server stack trace:

       at System.ServiceModel.Channels.CommunicationObject.Close(TimeSpan timeout)

       at System.ServiceModel.Channels.CommunicationObject.Close()

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at System.ServiceModel.ICommunicationObject.Close()

       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Close()

    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error executing query:

    System.ServiceModel.CommunicationObjectFaultedException: The communication object,

    System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

    Server stack trace:

       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

    SELECT COUNT(DISTINCT m.MemberPrimaryID) AS TotalRows

                                                FROM Orion.ContainerMembers m LEFT JOIN Metadata.Entity me

                                                    ON me.FullName = m.MemberEntityType

                                                 where m.MemberEntityType = 'Orion.Nodes' RETURN XML RAW

    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.Orion.Web.InformationService.InformationServiceProxy - main exception:

    System.ServiceModel.CommunicationObjectFaultedException: The communication object,

    System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

    Server stack trace:

       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

       at SolarWinds.Orion.Core.Common.OrionInfoServiceProxy.SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteReader(CommandBehavior behavior)

       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteDbDataReader(CommandBehavior behavior)

       at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables, Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)

       at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.Query(String query, IDictionary`2 parameters, Boolean useSeparateExecutionContext)

    2012-07-05 13:18:20,954 [67] ERROR SolarWinds.Orion.Web.InformationService.InformationServiceProxy - The communication object,

    System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

    System.ServiceModel.CommunicationObjectFaultedException: The communication object,

    System.ServiceModel.Security.SecuritySessionClientSettings`1+ClientSecurityDuplexSessionChannel[System.ServiceModel.Channels.IDuplexSessionChannel], cannot be used for communication because it is in the Faulted state.

    Server stack trace:

       at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()

       at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecurityDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)

       at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

       at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

       at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

    Exception rethrown at [0]:

       at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

       at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

       at SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

       at SolarWinds.InformationService.Contract2.InfoServiceProxy.Query(QueryXmlRequest query)

       at SolarWinds.Orion.Core.Common.OrionInfoServiceProxy.SolarWinds.InformationService.Contract2.IInformationService.Query(QueryXmlRequest query)

       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteReader(CommandBehavior behavior)

       at SolarWinds.InformationService.InformationServiceClient.InformationServiceCommand.ExecuteDbDataReader(CommandBehavior behavior)

       at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.Fill(DataTable[] dataTables, Int32 startRecord, Int32 maxRecords, IDbCommand command, CommandBehavior behavior)

       at System.Data.Common.DbDataAdapter.Fill(DataTable dataTable)

       at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.Query(String query, IDictionary`2 parameters, Boolean useSeparateExecutionContext)

    2012-07-05 13:18:26,564 [13] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.

    System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
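
To see how often the exception above really recurs, the log can be summarized rather than read by eye. This is a minimal sketch that assumes every entry starts with the `timestamp [thread] LEVEL logger - message` layout seen in the excerpt; stack-trace and query lines are skipped. The `summarize` helper and the sample lines are illustrative only and are not part of any SolarWinds tooling.

```python
import re
from collections import Counter

# Matches entry headers such as:
# 2012-07-05 13:18:15,907 [67] WARN  Some.Logger.Name - message text
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"\[(?P<thread>\d+)\]\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+-\s*(?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries per (minute, level, logger); ignore other lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            minute = m.group("ts")[:16]  # "YYYY-MM-DD HH:MM"
            counts[(minute, m.group("level"), m.group("logger"))] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the excerpt above; in practice, pass an
    # open file handle for OrionWeb.log instead of this list.
    sample = [
        "2012-07-05 13:18:15,907 [67] WARN  SolarWinds.Orion.Web.InformationService.InformationServiceProxy - Caught CommunicationObjectFaultedException. Will try to reconnect to SWIS in 5s.",
        "2012-07-05 13:18:20,907 [67] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.",
        "   at System.ServiceModel.Channels.CommunicationObject.Close()",
        "2012-07-05 13:18:26,564 [13] ERROR SolarWinds.InformationService.Contract2.InfoServiceProxy - Error closing exception.",
    ]
    for key, count in sorted(summarize(sample).items()):
        print(count, *key)
```

A steady count of ERROR entries per minute that drops to zero after the information service is restarted would support the idea that the web console is holding on to a faulted channel.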

  • Bounced the information services and the GUI is looking good now.

  • Hi Shane, what do you mean by "bounced the information services"? Did you restart IIS, or did you restart something else?

    I am also facing this issue at a customer site, and SolarWinds Support does not have any idea how to handle it.

    I hope this thread is still being read.

    Thanks,

    Holger

  • I had to restart the information service on the Orion server.
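
For anyone who wants to script that restart, here is a minimal sketch using the standard Windows `net stop` / `net start` commands. The service name is a placeholder: the thread only refers to "the information service" and "SWISv3", so look up the exact name in the Services console first. The script defaults to a dry run that only prints the commands; a real run needs an elevated prompt.

```python
import subprocess

# Placeholder, not a real service name. The thread does not give the exact
# Windows service name; check the Services console before using this.
SERVICE_NAME = "<information-service-name>"


def restart_commands(service_name):
    """Return the Windows commands that stop and then start a service."""
    return [["net", "stop", service_name], ["net", "start", service_name]]


def restart_service(service_name, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for cmd in restart_commands(service_name):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the command fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    restart_service(SERVICE_NAME)  # dry run: prints, changes nothing
```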

  • I was wondering if it was ever determined what the maximum number of Groups that can be handled by NPM?

    I am up to 600+ and can't seem to open the Manage Groups resource, even after changing the timeout from 60 to 120.

    I am running NPM 10.6.1 and am told by support that 10.7 should correct my issue. Yeah, right! Case #589137

  • We currently have 26 active groups that run smoothly. However, if we want to apply an account limitation with "group of groups", the web resource "All Groups" does not work. This might not be related to your issue.

    We do not have the NPM module, so I can only give you the following versions: Orion Platform 2013.1.0, NCM 7.2.1

    We have been told NCM 7.2.2 should fix this. We are looking forward to it ;)

  • Hello, the current implementation allows the creation of 2100 groups. This limit comes from the SQL Server side and will be improved in the future. Below that number, however, the biggest factor is server performance. Groups and dependencies are among the most expensive features of the Orion Platform. I can imagine that 600+ complex groups (containing dynamic member definitions) could kill the performance of Orion servers and SQL servers that are not very powerful. It's not possible to estimate the requirements, since they depend heavily on group complexity. For example, one very complex group could load the server more than 100 simple groups with static definitions. I hope this helps. Regarding the upgrade to 10.7: yes, it's possible that it will help. Some improvements were made there, but the result depends heavily on your configuration.

    Regards,

    H.
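
As a rough way to apply the figures in this reply, the arithmetic can be written down as below. This is a minimal sketch: the 2100 limit is the number quoted above and may differ between versions, and the list-of-dicts shape with a "dynamic" flag is invented for illustration, not an Orion data structure. It estimates headroom under the limit and the share of dynamic groups; it cannot predict server load.

```python
GROUP_LIMIT = 2100  # figure quoted in the reply above; may differ by version


def headroom(group_count, limit=GROUP_LIMIT):
    """Number of additional groups that fit under the limit (never negative)."""
    return max(limit - group_count, 0)


def dynamic_share(groups):
    """Fraction of groups flagged as dynamic, or 0.0 for an empty list.

    `groups` is a list of dicts with a boolean "dynamic" key. That shape
    is made up for this sketch and is not an Orion data structure.
    """
    if not groups:
        return 0.0
    return sum(1 for g in groups if g["dynamic"]) / len(groups)


if __name__ == "__main__":
    # Group counts mentioned earlier in the thread.
    for count in (901, 708):
        print(count, "groups leave room for", headroom(count), "more")

    # Hypothetical inventory: two dynamic groups and two static ones.
    inventory = [
        {"name": "A", "dynamic": True},
        {"name": "B", "dynamic": True},
        {"name": "C", "dynamic": False},
        {"name": "D", "dynamic": False},
    ]
    print("dynamic share:", dynamic_share(inventory))
```

Since a single complex dynamic group can cost more than many static ones, the dynamic share is probably the more useful number to track as the group count grows.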