SEUM.Worker question/issues and other weird WPM issues

Hey Guys,

Like most of you, I'm finding that our org is doing more and more web-based things with cloud computing and the like. So WPM is playing a more critical role in my environment, especially from an uptime reporting perspective.

So here are my issues with WPM. Maybe some of you long-term experts can point me in the right direction or shed some light on them, because right now I can't tell if my config is messed up or if WPM is just a bad product.

1.) So I noticed that 5 out of my 10 or more players were having issues with the player load being at 20k percent or more. Yes, you read that right, twenty thousand or higher!! So I sent the logs over to SolarWinds and I'm waiting to hear back. And this is with all those players in the local security policy with "Allow log on locally" etc. I did all the tweaks on these systems before I deployed WPM, so as far as I know these servers and WPM should be running top notch. Not the case thus far. So just for kicks I rebooted the boxes in question, and the loads have all dropped to less than 10% for the past two days. I have also now turned on the alert for an overloaded player; I had never experienced this before, so that rule had been turned off. So my question is: What would cause the load to ramp up that high? And why does a simple Windows reboot fix that issue?

2.) Are any of you guys running into these accounts eating a TON of disk space? I have small disks on my player boxes because I figured each box only does one function, so it shouldn't eat disk space. But I find that all 7 accounts for the player take up at least 1GB each in their profiles on the WPM Player box. And it never goes down, so whatever process is chewing up the space never purges that data or clears itself out. I have run into issues with this in the past, but I still haven't received a clear answer as to why these SEUM accounts chew up so much disk space and how you can offload whatever that content is to another location. Should I be doing some maintenance on those accounts or something?

3.) UNKNOWN. This is the BIGGEST issue of all, because trying to tell a manager "I don't know what happened, it just went into an unknown state because WPM doesn't know how to process blah blah blah" makes you sound like a complete idiot. I run into some of these unknowns during production hours from time to time, and there is NOTHING on said boxes (no network restriction, backup, or other process) that would prevent WPM from being up 100% of the time. Yet I run into a bunch of weird issues with WPM, and it seems a bit unstable to me at first glance.

So any feedback on things I could be doing wrong would be appreciated.

My SolarWinds server is a 2008 R2 box with 92GB of RAM, quad core, etc., and it ONLY runs SolarWinds (SAM, WPM, NPM)

My WPM version 2.1.0

Remote SQL Server 2005

Remote player boxes

Win 7 Professional with 4GB of RAM on brand new HP EliteDesk workstations, all freshly formatted and ONLY doing WPM: no other applications or processes outside of the OS.

Thanks,

Mac

  • I can't speak for your other issues, but your second issue is fixed in WPM 2.2 (uncontrolled growth of IE cache). Your other issues might be a symptom of that. You should think about upgrading.

  • As jasonsmith mentioned, the second issue is fixed in the newest WPM, 2.2. As for the first issue: I am not a WPM dev, but I think player load is somehow computed from how many transactions do not manage to start on time, and these transactions are buffered on a stack. When you restart your computer, you simply erase that stack. You may then get the third (Unknown) issue, because you lost the results of the transactions that were on that stack waiting for replay.

  • Hello Mac,

    Player load tells you whether the player is able to play transactions on time. If it's over 100%, some transactions are being delayed because the player is running at full capacity and they have to wait. How many transactions do you have assigned to a single player? We have seen a few cases where a relatively simple transaction was stuck on a player for a long time and occupied playback capacity. WPM tries to ensure that a page is fully loaded before it moves on. If, for example, the page makes a periodic AJAX request that pings the server every few seconds, WPM can wait up to 30 seconds because of that periodic request. Having this in multiple transactions can easily overload a player. There are ways to handle this, but for that we need more information. A support case is the best bet.

    Player load can also be related to your third issue. If a transaction is not played for more than 2x its frequency (for example, a transaction with a 5-minute frequency that is not played for more than 10 minutes), it goes to the unknown state. Solving the player load issue should also solve the unknown transactions.
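
The 2x rule described in the reply above can be written down as a small check. This is only a sketch of the rule as stated in this thread, not WPM's actual code; the function name `is_unknown` and its parameters are made up for illustration.

```python
from datetime import datetime, timedelta


def is_unknown(last_played: datetime, now: datetime, frequency: timedelta) -> bool:
    """Hypothetical helper: True when a transaction has not been played
    for more than 2x its configured frequency (the rule quoted above)."""
    return (now - last_played) > 2 * frequency


if __name__ == "__main__":
    now = datetime(2015, 1, 1, 12, 0)
    five_minutes = timedelta(minutes=5)
    # Last played 9 minutes ago: still inside the 10-minute window.
    print(is_unknown(now - timedelta(minutes=9), now, five_minutes))
    # Last played 11 minutes ago: past 2x the frequency, so unknown.
    print(is_unknown(now - timedelta(minutes=11), now, five_minutes))
```

So a player that is backed up long enough to delay a 5-minute transaction by more than 10 minutes would, by this rule, show that transaction as unknown even though nothing failed during playback.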

  • jiri.tomek, are there any specific settings that would help playback on a Windows 7 box? I have an instance that does not seem to be overloaded by any metric, but transactions often flap either between Unknown and Up or between Down and Up on sequential transaction plays.

    Would adjusting NumWorkerProcesses to 1 in SolarWinds.SEUM.Agent.Service.exe.config possibly help?

  • cahunt, if the player does not show high player load, then the flapping transaction is most likely caused by something else. Also, if a transaction goes down and not just unknown, it's definitely a failure during transaction playback. I suggest opening a support case. WPM default settings should suit most cases, and any change should be made for a good reason based on investigation of the particular issue.

  • jiri.tomek, not quite the response I was hoping for, but it is informative and does provide me some direction with this effort. Thank you for the reply.

    I will try a new recording with a few more specifics to see if we can't get past this failure in playback.

  • Without knowing more about your processing environment it's hard to give you any real direction. With that said here are some things you can look at in regards to items 1 and 3.

    • Check that the total of the user-configured wait times in your transactions does not exceed the playback interval.
      This can cause transaction playback requests to be dropped and cause unknown transactions.
      This can also cause all of your SEUM users to be tasked when new transactions need to play, causing high player load.
      User-configured wait times are NOT included in the Duration metric for a step or transaction. The time to complete playback is the sum of all step durations for that transaction plus the sum of its user-configured wait times. If your time to complete playback is greater than your playback interval, it could be causing both your first and third issues.
      This could be helpful: WPM Report: Transaction Wait Times

    • Player Load is a by-product of the number of transactions in queue to be played in relation to the number of SEUM users that are available to playback transactions
      You can control the number of users used by WPM on each player location in the file: SolarWinds.SEUM.Agent.Service.exe.config
      I would not recommend adjusting this file without direction from support
      I would not recommend increasing this number past 15
      A stop and restart of the SolarWinds WPM Playback Player service is required for any changes to take effect

    • Also, the amount of work your player is doing is based on the number of Actions it's taking.
      WPM does not include a good built-in way to determine how many actions you're performing per transaction step
      This could be helpful: WPM Report: Transaction Action Count

    Hopefully some of this information helps. WPM load balancing and control can be difficult with the limited amount of information and documentation there is on KPIs.
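
The wait-time check from the first bullet in the reply above is simple arithmetic, sketched here in Python. The function names and the sample numbers are made up for illustration; the real step durations and wait times would have to come from your own transactions or the report mentioned above.

```python
def playback_time(step_durations, wait_times):
    """Total seconds to complete one playback: step durations plus
    user-configured waits (which the Duration metric leaves out)."""
    return sum(step_durations) + sum(wait_times)


def exceeds_interval(step_durations, wait_times, interval_seconds):
    """True when one playback takes longer than the playback interval."""
    return playback_time(step_durations, wait_times) > interval_seconds


if __name__ == "__main__":
    # Illustrative values only, in seconds.
    steps = [12.0, 8.5, 20.0]        # reported step durations
    waits = [30, 30, 60, 120, 60]    # user-configured wait times
    interval = 300                   # 5-minute playback interval
    print(playback_time(steps, waits))
    print(exceeds_interval(steps, waits, interval))
```

In this made-up case the Duration metric would show about 40 seconds, which looks harmless, while the waits alone already fill the whole 5-minute interval.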

  • For disk space, try this. REMOVE any player local accounts from the local Administrators group. We had the same issue and this resolved it, despite the best practices documentation. And in actuality, we've never had so much stability in WPM as we have since we removed the accounts from local admin. Local admin is doing something gnarly and may have too many rights.
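
To see whether the player account profiles are actually growing before and after a change like this, a rough Python sketch for totalling profile sizes follows. It assumes the profiles live under one folder (for example `C:\Users`) and that the player account folders share a name prefix; both `users_root` and `prefix` are placeholders to adjust, not values taken from WPM documentation.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all files under root, skipping unreadable ones."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # locked or vanished file; ignore it
    return total


def profile_sizes(users_root, prefix):
    """Map each profile folder whose name starts with prefix to its size in bytes."""
    if not os.path.isdir(users_root):
        return {}
    sizes = {}
    for entry in os.listdir(users_root):
        path = os.path.join(users_root, entry)
        if entry.startswith(prefix) and os.path.isdir(path):
            sizes[entry] = dir_size_bytes(path)
    return sizes


if __name__ == "__main__":
    # Placeholder root and prefix; change them to match your player box.
    for name, size in sorted(profile_sizes(r"C:\Users", "SEUM").items()):
        print(f"{name}: {size / 1024 ** 2:.1f} MB")
```

Running it on a schedule and comparing the numbers over a few days would show whether the growth has really stopped.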