
Orion v9 Website Timeouts and NetPerfMonService Service Hangs

Is anyone else experiencing website performance timeouts with Orion v9? I'm seeing this happen in the strangest places, such as the node details page, or volume details chart. I have no idea what would be causing the issue.


I'm noticing issues where the NetPerfMonService service hangs and won't respond to a service stop command. Killing NetPerfMonService.exe in task manager and restarting the service seems to be the only way to bring it back. One symptom that the service is hung is System Manager won't open.

  • I should point out that the website timeouts don't affect all nodes or volumes, but the nodes and volumes that fail do so very consistently. The strange thing is that the web interface otherwise seems pretty peppy; I just can't get to some pages, for seemingly no reason at all. The website error is fairly generic:

    Orion Website Error

    An error has occurred with the Orion website.

    Additional Information

    System.Web.HttpException: Request timed out.
     
    As for the service hanging issue, that's a completely unrelated issue.
  • Let's start with the website timeout issues. In \Inetpub\Solarwinds\web.config, find the <threshold value="WARN"/> line and change the value to DEBUG. Save it. Click around the website until you hit a timeout or two. That will capture more information about what the website is busy doing when it gets bogged down. Capture diagnostics and put web.config back the way it was.

    If you already have a ticket open about this, just attach these diagnostics and let the rep know to get me involved. If not, opening one is probably the easiest way to get the big file to me.
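The web.config change described above could be scripted roughly as follows. This is only a sketch: it assumes the file contains a line matching `<threshold value="WARN"/>` as quoted in the post, and the helper names `set_threshold` and `update_file` are made up for illustration. The real file may differ in encoding or layout, so keep the backup.

```python
import re
from pathlib import Path

# Matches the threshold element quoted in the thread, e.g. <threshold value="WARN"/>
THRESHOLD_RE = re.compile(r'(<threshold\s+value=")([A-Za-z]+)("\s*/>)')


def set_threshold(config_text: str, level: str) -> str:
    """Return config_text with the first <threshold value="..."/> set to level."""
    new_text, count = THRESHOLD_RE.subn(
        lambda m: m.group(1) + level + m.group(3), config_text, count=1
    )
    if count == 0:
        raise ValueError("no <threshold value=.../> element found")
    return new_text


def update_file(path: Path, level: str) -> None:
    """Rewrite the file in place, keeping a .bak copy of the original next to it."""
    original = path.read_text(encoding="utf-8")  # encoding is an assumption
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    path.write_text(set_threshold(original, level), encoding="utf-8")
```

Calling `update_file(path, "DEBUG")` before reproducing the timeouts and `update_file(path, "WARN")` afterwards would mirror the manual steps, with `path` pointing at the web.config mentioned in the post.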

  • I have the logs - I'm looking at this now.

  • I found two website timeouts in the log during the period when the log level was turned up. In both cases, it was a couple of minutes into the execution of the same query:

    Select * From CPULoad  Where NodeID=49 AND DateTime >= '07/01/2008 00:00:00' AND DateTime <= '07/01/2008 16:06:53' Order By DateTime

    That's not a query that should normally be taking a long time to execute. The diagnostics are showing a healthy (not excessive) amount of data in the CPU history tables. Could you run that query using SQL Management Studio or the Orion Database Manager? Time how long it takes, roughly.
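If you would rather time the query from a script than by eye, something like the sketch below would do. It uses an in-memory SQLite table purely as a stand-in so the example runs anywhere; against the real Orion database you would pass in a connection from whatever SQL Server driver you use. The `AvgLoad` column and the ISO date strings are assumptions for the stand-in, not the actual schema.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run sql on a DB-API connection, fetch every row, return (row_count, seconds)."""
    start = time.perf_counter()
    cur = conn.cursor()
    cur.execute(sql, params)
    rows = cur.fetchall()  # include fetch time, since the website reads all rows too
    return len(rows), time.perf_counter() - start


# Stand-in data: SQLite instead of SQL Server, hypothetical AvgLoad column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CPULoad (NodeID INTEGER, DateTime TEXT, AvgLoad INTEGER)")
conn.executemany(
    "INSERT INTO CPULoad VALUES (?, ?, ?)",
    [
        (49, "2008-07-01 00:05:00", 12),
        (49, "2008-07-01 12:00:00", 30),
        (49, "2008-07-02 00:00:00", 25),  # outside the time window
        (50, "2008-07-01 08:00:00", 40),  # different node
    ],
)

count, seconds = time_query(
    conn,
    "SELECT * FROM CPULoad WHERE NodeID = ? AND DateTime >= ? AND DateTime <= ? "
    "ORDER BY DateTime",
    (49, "2008-07-01 00:00:00", "2008-07-01 16:06:53"),
)
print(f"{count} rows in {seconds:.3f} s")
```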

  • That query returned results almost instantly. Maybe 1 second?

  • Ah, I misread the log. It wasn't a couple of minutes into executing the query - it was a couple of minutes into what comes after that query: setting up the chart axes. The algorithm for auto-scaling the chart is getting confused and going into an infinite loop.

    I have scheduled this bug for SP1.

    As a workaround you could use List Resources in Web Node Management or System Manager to unassign the CPU&Memory poller from that node (since we apparently can't pull that info from that node anyway). Then run "DELETE FROM CPULoad WHERE NodeID=49" to get rid of the data.
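Before running the DELETE, it can be worth checking how many rows it will remove. A minimal sketch of that check-then-delete step is below; `purge_cpu_history` is a hypothetical helper, and the `?` parameter style assumes a DB-API driver that uses question-mark placeholders (SQLite does; check yours).

```python
import sqlite3


def purge_cpu_history(conn, node_id):
    """Delete all CPULoad rows for one node and return how many rows were counted."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM CPULoad WHERE NodeID = ?", (node_id,))
    expected = cur.fetchone()[0]
    cur.execute("DELETE FROM CPULoad WHERE NodeID = ?", (node_id,))
    conn.commit()
    return expected
```

Unassign the poller first, as described above, or new rows will simply be collected again after the delete.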

  • Wow, that worked like a charm! Thanks for the help. I look forward to SP1.

  • Not to sound ungrateful but how would you like me to troubleshoot the NetPerfMonService service hangs?

  • Oops - I forgot there were two problems in this thread!

    To debug the hang, I'm going to need a memory dump. You can create that using Microsoft's "ADPlus" tool, which is part of Debugging Tools for Windows. You can download this from www.microsoft.com/.../installx86.mspx

    Once that's installed, wait for NetPerfMonService to get stuck. Then open a command window to C:\Program Files\Debugging Tools for Windows (x86) and run this:

    ADPlus -hang -pn NetPerfMonService.exe

    That will probably warn you about a bunch of stuff like your default vbscript interpreter (doesn't matter) and not having debug symbols configured (also doesn't matter - I can add them after the fact). Just ignore all these warnings and it will create the memory dump directory. That directory will have a few smallish files and one .dmp file whose size will be equal to the memory size of the NetPerfMonService process.

    Zip up the Hang_Mode__Date_... directory and send it through support.

    If you want, you can uninstall Debugging Tools for Windows at this point.