Player Load Percentage

Is the player load percentage based on # of CPUs?   So if I have 2 CPUs, then I can safely go up to 200%?

Asking because around 7pm the other night, for some unknown reason at this time, all 4 of my current players doubled in load according to WPM.

All of them are now reporting well over 100% load and none of them even have 25 transactional tests on them yet.

Oddly, this occurred on two different WPM installations, one in PRD and one in DEV.   Yet, the load on all players on each WPM installation doubled at the same time.

  • Hello,

    the player load is not related to number of CPUs and is computed by the following formula (simplified):

    player_load = number_of_running_playbacks/total_number_of_playback_workers*100 + transactions_waiting_for_playback

    The transactions_waiting_for_playback value is based on the sum of the wait times of the transactions on the player before they are played back. So basically, the longer the transactions wait for playback on the player, the higher this value gets. Based on this formula you will, for example, see 100% load when all playback workers are currently playing back transactions. So having a current load around 100% (even slightly above) is completely natural. But if the load in the player load chart is constantly over 100%, you should consider moving some of the transactions to a different player.

    In your case, if the load got over 100% on two different installations at the same time, something probably changed on the monitored side so that playback now takes longer (which causes more transactions to accumulate in the waiting queue, which in turn causes the load to rise). This can also happen when some of the transactions start failing, because in some cases the player by default tries to replay the transaction to make sure we don't report a false alert.
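To make the simplified formula above concrete, here is a minimal Python sketch. It is not the WPM implementation: the function name and the `waiting_component` parameter are made up for illustration, and since the thread does not say how the wait times are scaled into transactions_waiting_for_playback, that value is simply passed in as an already-computed number.

```python
def player_load(running_playbacks: int,
                total_workers: int,
                waiting_component: float = 0.0) -> float:
    """Simplified player load, following the formula quoted in the thread.

    waiting_component stands in for transactions_waiting_for_playback.
    How WPM derives it from the summed wait times is not stated, so it is
    treated here as an opaque, precomputed number (an assumption).
    """
    if total_workers <= 0:
        raise ValueError("total_workers must be positive")
    return running_playbacks / total_workers * 100 + waiting_component


# All workers busy, nothing queued: exactly 100%.
print(player_load(4, 4))
# All workers busy and transactions queuing up: load climbs past 100%.
print(player_load(4, 4, 35.0))
```

The point of the sketch is that the CPU count never appears: the load goes above 100% only through the waiting term, which grows when playbacks take longer and transactions back up in the queue.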

  • Thanks, that was helpful.   Was always curious why the player load percentage didn't seem to really line up with the actual load on the box.

    There was a Trendmicro OfficeScan push to all of my boxes around that same time.   So looking into that now to see if it is causing tests to take longer.

    The total time for a transaction doesn't seem to have changed, but I noticed that it also seems to be running fewer tests than it did before.

    For example, in some cases, a test scheduled to run every 3 mins seems to only really be showing data for one test in a 10 min poll in the graphs (no min or max bars).

  • It's important to note that Trendmicro's OfficeScan installs a transparent proxy and routes all browser traffic through it for purposes of picking up phishing sites and malware through the use of web reputation services. It's very probable that the removal of OfficeScan from the computer where the WPM agent resides will improve overall transaction playback performance.

  • Thanks.  I had them completely remove it on one of my players but it didn't seem to help at all.   In the recorder, a 9 step test that reports taking 24 seconds to run actually takes over 5 minutes to complete.   Between every step I'm seeing a 30 second pause at the end of the step and before the next step starts.   So for example, it highlights "Click on image" at the end of a step and then just sits for 30 seconds before the next step starts.   I see the actual page of the next step load in the recorder, but then it just hangs there.   If this is happening in the player, it would explain why each 3 minute poll test seems to really only be running once every 10 minutes.

    Since it seems to be exactly 30 seconds, I'm guessing it's one of these and will start changing them one at a time to see if I can narrow it down. 

        accessDeniedRetryTimeMs="30000"

        elementWaitTimeMs="30000"

        fileDownloadWaitTimeMs="30000"

        browserCompleteWaitTimeMs="30000"

        pendingRequestsWaitTimeMs="30000"

  • This is the one that helped greatly.   Any ideas as to why this would be?    In the recorder,  my 9 step test was  taking 5 minutes and 10 seconds on average to run.   After changing this one setting to 5 seconds from 30, the test now only takes 1 minute and 22 seconds to run.   After changing the player config to this, my player load % has dropped down from being well over 100% to well under 50%.


         pendingRequestsWaitTimeMs="5000"

    I don't see any issues in the recorder at all, I can pull up pages and they seem to complete loading very quickly.
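As a rough sanity check on the numbers in this thread, the sketch below assumes that every one of the 9 steps stalls for the full pendingRequestsWaitTimeMs on top of the 24 seconds the test reports. That per-step model is an assumption, not something taken from the WPM documentation, and the function name is made up.

```python
def estimated_run_seconds(reported_seconds: float,
                          steps: int,
                          pending_wait_ms: int) -> float:
    """Reported test time plus one full pending-request wait per step."""
    return reported_seconds + steps * (pending_wait_ms / 1000)


# 9 steps, 24 s reported, 30000 ms wait per step.
print(estimated_run_seconds(24, 9, 30000))  # 294.0
# Same test with the wait lowered to 5000 ms.
print(estimated_run_seconds(24, 9, 5000))   # 69.0
```

The estimates (294 s and 69 s) sit within about 15 seconds of the observed 5 minutes 10 seconds and 1 minute 22 seconds, which is consistent with each step waiting out the whole timeout on a request that never finishes.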

  • This timeout is the wait applied to each pending request. Simply put, if you have some long-running request on the page (long-poll requests, keep-alive requests), we wait the specified amount of time for that request to finish. You can check whether your page has this kind of request in, for example, the Internet Explorer developer console. Generally it is not recommended to lower this value if there are no issues with playback, because a value that is too low may mean the player does not wait for the page to fully load, and the measured step duration would then be lower than it actually is. But if you are not seeing any issues, I guess it should be OK in your case.

  • I forgot to mention something: if you are able to pinpoint a request that is causing this issue, let me know and I'll send you a message explaining how to set up the player configuration to ignore that one specific request. This approach should be significantly safer than lowering thresholds.

  • Thanks, I appreciate the input...

    Still scratching my head over this one.    I just now went and created a new local user on the box, switched to that user and loaded up the same URLs in IE with the developer tools and just don't see any issue at all.

    The pages all load almost as soon as I click on the links.   Logging into our app takes less than a second.   IE shows it waiting on nothing at all.


    I created a local account just in case something was getting configured under my corp account that needs to be setup on these local accounts that wpm uses.

    If I do figure out that it is a specific request that can be safely ignored, I'll get back with you...   Thanks again.

    (think i'm just going to totally remove the player and users from a node and re-install to see if maybe some local security routine ran monday night and changed/broke something with the wpm user accounts)

  • Eureka, guess I should have used the debug logs earlier and saved a couple of hours...    Seeing this in the debug logs for every step of every test:

    2013-03-14 14:26:05,080 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] DEBUG SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Waiting for pending requests

    2013-03-14 14:26:06,094 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] DEBUG SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Waiting for pending requests

    2013-03-14 14:26:06,094 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] WARN  SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Browser was stuck with pending requests for more than 30000ms.

    2013-03-14 14:26:06,094 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] WARN  SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Remaining pending request: Begin: +1.622 s, Blocked: 0ms, DNS: 0 ms, Connection: 0 ms, Send: 0 ms, TTFB: 0 ms, Download: 0 ms, Size: 0 Mime: text/html Status: 0 URL: https://***.tcliveus.com/i?siteID=.........

    So this tcliveus.com call seems to be causing my issues and started Monday night.       Learning how to ignore/skip requests to specific domains would definitely be appreciated.

    Thanks,

    Derek
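For anyone chasing the same symptom, a small Python sketch like the one below can tally which hosts show up in the "Remaining pending request" warnings. It assumes the log line layout is exactly as in the excerpt above. The function name is made up, and the sample line uses a placeholder host (tracker.example.com), not a real entry from these logs.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the WARN line shown above and captures the text after "URL:".
PENDING_RE = re.compile(r"Remaining pending request:.*?\bURL:\s*(\S+)")


def pending_request_hosts(lines):
    """Count how often each host appears in 'Remaining pending request' lines."""
    counts = Counter()
    for line in lines:
        match = PENDING_RE.search(line)
        if match:
            url = match.group(1)
            # Fall back to the raw URL text if no hostname can be parsed.
            counts[urlsplit(url).hostname or url] += 1
    return counts


# Placeholder line in the same layout as the excerpt (host is hypothetical).
sample = [
    "2013-03-14 14:26:06,094 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] "
    "WARN  SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Remaining pending request: "
    "Begin: +1.622 s, Blocked: 0ms, DNS: 0 ms, Connection: 0 ms, Send: 0 ms, "
    "TTFB: 0 ms, Download: 0 ms, Size: 0 Mime: text/html Status: 0 "
    "URL: https://tracker.example.com/i?siteID=1",
    "2013-03-14 14:26:06,094 [SolarWinds.SEUM.Agent.Worker.exe][Browser Thread] "
    "DEBUG SolarWinds.SEUM.Player.WatiN.WatiNPlayer - Waiting for pending requests",
]
for host, count in pending_request_hosts(sample).most_common():
    print(host, count)
```

To run it against a real log, pass an open file object instead of the sample list. A single host dominating the counts points at the request worth asking support about.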

  • Would appreciate that info on blocking specific domains when you get a chance.   I've been going through the config files but not finding it.   I could (?) update the hosts file to point them to local host but hoping there is a better method? 

    Thanks,

    Derek