This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.
You can no longer post new replies to this discussion. If you have a similar question you can start a new discussion in this forum.

Most CPU intensive SQL activities on NetPerfMon DB

FormerMember
FormerMember

Hello Thwackers,

We are having some performance issues on our NPM installation at the moment, and it seems to be coming down to NTA related queries.  this SQL query seems to be really causing some pain, but the next few down the list of "top CPU hoggers" are also NTA related.

I have posted in the NPM forum due to the overall affect being that NPM runs slow, or inability to drill into node details when it really plays naughty

I believe there is fundamentally something bad going on in NTA that is causing the rest of the platform to act up, does the below query mean anything to you?

cheers

Spongey toilet-pants

.

--EXEC [dbo].[swsp_NodesGetAllNodes] 1

CREATE PROCEDURE [dbo].[swsp_NodesGetAllNodes] ( @ntaManagedOnly bit)

AS

BEGIN

SET NOCOUNT ON;

    IF (@ntaManagedOnly = 1)

BEGIN

        SELECT DISTINCT n.NodeID AS NodeID,

   n.UnManaged AS UnManaged,

   (CASE WHEN Len(Isnull(n.DNS,''))=0 THEN n.Caption ELSE n.DNS END) as DNS,

   n.IP_Address AS IP_Address,

   convert(int, COALESCE(np.FlowVersion,0)) as FlowVersion,

   convert(int, COALESCE(np.FlowType,0)) AS FlowType,

            CAST(case when n.ObjectSubType='ICMP' then 1 else 0 end AS BIT) as IsICMP,

   CAST(1 AS BIT) AS IsPrimaryIpAddress,

   n.[Description] AS [Description],

   n.IOSImage AS IOSImage,

   n.IOSVersion AS IOSVersion,

   np.IsAARCapable AS IsAARCapable,

   np.IsAARMainDevice AS IsAARMainDevice

        FROM Nodes n WITH(NOLOCK)

            JOIN Interfaces i WITH(NOLOCK) ON i.NodeID = n.NodeID

            JOIN NetFlowSources ns WITH(NOLOCK) ON ns.InterfaceID = i.InterfaceID AND Enabled=1

            LEFT JOIN NetFlowNodeProperties np WITH(NOLOCK) ON np.NodeID=n.NodeID

  WHERE n.UnManaged = 0

END

    ELSE

BEGIN

        SELECT n.NodeID AS NodeID,

   n.UnManaged AS UnManaged,

   (CASE WHEN Len(Isnull(n.DNS,''))=0 THEN Caption ELSE n.DNS END) as DNS,

   n.IP_Address AS IP_Address,

   convert(int, COALESCE(np.FlowVersion,0)) as FlowVersion,

   convert(int, COALESCE(np.FlowType,0)) AS FlowType,

            CAST(case when n.ObjectSubType='ICMP' then 1 else 0 end AS BIT) as IsICMP,

   CAST(1 AS BIT) AS IsPrimaryIpAddress,

   n.[Description] AS [Description],

   n.IOSImage AS IOSImage,

   n.IOSVersion AS IOSVersion,

   np.IsAARCapable AS IsAARCapable,

   np.IsAARMainDevice AS IsAARMainDevice

        FROM Nodes n WITH(NOLOCK)

        LEFT JOIN NetFlowNodeProperties np WITH(NOLOCK) ON np.NodeID=n.NodeID

    END   

END

  • What kind of performance issues are you facing? Is the website loading slowly or do you have problems polling your systems. In case you have a slow responding website I'd consider getting an additional webserver. As I have stated many many many times, it is worth the money. Additional webserver is one of the most affordable licenses but increases usability a lot. If you are having polling issues I'd go with an additional polling engine.

    If you have other performance issues with your Orion setting ... please elaborate.

    Cheers!

  • FormerMember
    0 FormerMember in reply to HerrDoktor

    Hey there

    thank you for the reply!  the problems were overall site navigation but extreme poor performance with any view with NTA resources within it, or the NTA admin page.  the NTA admin page is a disaster and just had not had the work pout into it to allow users to filter on custom properties, but hopefully it will happen soon

    anyway, it seems to be that one of the additional pollers had a version mismatch of NTA that had thrown the whole thing into a spin.  the top queries were all NTA related which means the application must have got right messed up with the versions.

    we have 3 additional webservers, but have bought a few more too to load balance as I want to make sure that that layer will never bent he bottle neck.  however, I still want to keep an eye on busy queries

    cheers

  • It might help to activate "Hubble" so you can see what is taking a long time to load on your web page. From my experience there are a lot of resources that are painfully slow. We ended up removing those resources from any start- or summary- page/view and put the "slow" resources on a separate view, which we tagged with "please wait" or "patience"

    as you said, filtering resources can be slow, I don't know what you wat to filter in the NTA pag but for NCM we ended up keeping it as simple as possible and created Groups on which we filter.

  • Hiya,

    That does seem a bit odd and keeping version concurrent across the platform is always a good call. Can I ask what versions of what modules you are on?

    And as you have 3 additional web servers, how many users of the Solarwinds platform?

    I ask as NTA since version 4 have a preference for a separate flow storage server to remove the weight of the flows from the SQL database and just want to check that is not the cause.

    The other thing that springs to mind is how big is the database and what are the top tables?

    You can find this by hopping onto the applicaion server and loading the Solarwinds Database Manager, Add default Server and right clicking on the Solarwinds Database and clicking properties. Two tabs will then open and one is Tables. Open that and click the top of the amount of rows column to sort it to have the largest at the top. I also do this with the size column just to compare.

    I only mention this bit as I seem to have had a few ppl with cases like this recently. Before going any further, check you have a valid backup of the SW Database emoticons_happy.png!

    IF, IF, IF............. The top tables are Syslogs or Traps or TrapVarBinds, THEN it could be the case that you are over loading the database with syslog and traps (I mention this as I see it every now and then ...) IF that is the case, check to see if the syslogs are at a debug level, if so reduce it to information level, preferable at the device end or drop them in the Syslog viewer. Also check Traps to see what coming in on them, again IF you are getting alot of traps, investigate etc.

    IF that is the case then you may want to check the syslog and traps retention settings. Those are located in All Settings, Poller settings, 2/3 rd's of the way down the page Syslog and traps have a fields for number of days to keep them, this is by default 7 days, it can be changed, but if it's the case you need to reduce the size of the tables, then lower the syslogs / traps retention to 1 day  or what is appropriate. At that point you should be able to run the database maintenace application on the application server which will clear down the tables for you.

    So there are a few things to check:

    Module versions?

    Additionall pollers?

    Approx number of users?

    Total element count?

    Are the NTA flows kept in the storage database or in the SQL database?

    What size is the DB?

    What are the top tables?

    And If the top tables are traps or syslogs then how to check and clear them down is above emoticons_happy.png

    HTH