Job Scheduler v2 Results Notify error skyrocketing

Hello,

I am running several APEs, all on 2016.2. On most of the devices on one particular APE, I am getting blotchy charts and missing data, and the Job Scheduler v2 Results Notify Error counter climbs at a rate of about 300 a second for hours. The Application monitor for Orion sits at around 50K.

I have rebooted 3 times (APE and primary poller).

I have verified that "Count stat as difference" is checked.
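
For context, "Count stat as difference" reports the change in a cumulative counter between polls rather than its raw value. A minimal sketch of that arithmetic follows. The sample values and the 60-second polling interval are hypothetical, chosen only to illustrate a counter climbing at about 300 a second; they are not taken from this environment.

```python
def count_as_difference(samples):
    """Return the change between consecutive cumulative counter samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def per_second_rate(samples, interval_seconds):
    """Convert per-poll differences into an average per-second rate."""
    return [diff / interval_seconds for diff in count_as_difference(samples)]


# Hypothetical raw counter readings taken 60 seconds apart.
raw = [100_000, 118_000, 136_000, 154_000]
print(count_as_difference(raw))   # [18000, 18000, 18000]
print(per_second_rate(raw, 60))   # [300.0, 300.0, 300.0]
```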

I have also validated connectivity and ports between the APE and the primary poller, and between the APE and SQL.

This started when I upgraded the APE from core 2016.1 to 2016.2.

I have been unable to find what is causing this and I don't see any documentation or discussions that point me in any direction.

When I run the configuration wizard on the APE it only runs through the program improvement 2.0. It's been a while since we've had to do anything with the APEs (they have just been working), so I can't remember if the config wizard went through the same process as the primary poller, i.e. Database, Services, Website, minus the website, or if this is normal operation.

Has anyone seen anything like this?

  • I've had a similar situation. Digging around, I found that in my case the VNQM module had failed to start and was causing these errors in Job Scheduler v2. Once I fixed VNQM and rebooted my server, the errors cleared. I would suggest looking at the logs to see whether any other modules could be the culprit.

    I hope this helps!

  • Good deal. I'll see what I can find. Thanks

  • My primary poller regularly shows over 44K in the statistic for this component (with count as difference turned on). Is this normal? Where would I begin to look to find the cause?

  • So, it looks like a version mismatch on the APE (NCM) was causing my issue. Once I upgraded, the errors dropped.

  • I don't think it's normal. I would verify your module versions. That seemed to fix it for me.

  • In our installation the error normally appears on the MPE first, and then on the APEs as a result. It seems to occur after a reboot of everything, for whatever reason.

    After a reboot we have to stop/start SWIS v3, a workaround on our version for a half-missing AppStack issue. Sometimes we have to stop/start the Module Engine afterwards as well.

    I tried those individually again and saw no improvement on the MPE.

    This time I tried: stop SWIS v3; while it is down, stop the Module Engine; then start the Module Engine and, once it is running, start SWIS v3. The error on the MPE quickly resolved as the MPE processed everything. Once the MPE was behaving, the APEs all caught up on their flagged stats as well, without needing any interaction.

  • Service bounce, SolarWinds TCP optimization reg hacks, and core Orion Services uninstall and re-install are the short- and long-term remedies.

    Statistic collection is dead when this one is high. I watch this counter very closely.

  • I'd recommend you call in a ticket to get support ASAP in case this is a further issue. Run diagnostics and go from there. There are a lot of possibilities but if you're paying for support you may as well use it.
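
The restart order described in the SWIS v3 / Module Engine reply above can be sketched as a small script. This is only an illustration: the two service names are assumptions, so check the actual names in the Windows Services console before running anything like it. It uses the standard Windows `net stop` and `net start` commands, and by default it only prints the commands (dry run) rather than executing them.

```python
import subprocess

# Assumed service display names, for illustration only; verify them locally.
SWIS_V3 = "SolarWinds Information Service V3"
MODULE_ENGINE = "SolarWinds Orion Module Engine"


def restart_plan(swis=SWIS_V3, module_engine=MODULE_ENGINE):
    """Return the stop/start commands in the order described in the reply."""
    return [
        ["net", "stop", swis],            # 1. stop SWIS v3
        ["net", "stop", module_engine],   # 2. stop Module Engine while SWIS is down
        ["net", "start", module_engine],  # 3. start Module Engine
        ["net", "start", swis],           # 4. start SWIS v3 after Module Engine
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in plan:
        print(" ".join(command))
        if not dry_run:
            # check=True aborts the sequence if any step fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_plan(restart_plan())  # dry run: prints the four commands only
```

Keeping the plan separate from the runner makes it easy to review the order before anything is executed, and to swap in the real service names for a given installation.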