7 Replies Latest reply on Jan 2, 2018 1:42 PM by cahunt

    Alert when NCM job fails to run

    mcconns

      Hi,

      Is there a way to alert when an NCM job fails to run? I have a job set up to run nightly, which works great and sends me an email report of all the configs that were backed up or failed to back up.

      However, every so often the job fails to run at all and I need to restart the Orion services. Is there a way to alert me that the job did not run?

      Thanks

      Stewart

        • Re: Alert when NCM job fails to run
          cahunt

          When the job does not run, do you still get the email?

            • Re: Alert when NCM job fails to run
              mcconns

              No, that's the problem. The job fails to run altogether, so it does not even attempt to send the email. As I say, restarting the services solves the issue and the job runs again the following night, but because I don't get the job completion email I may not notice for several days. I would like to be informed that day if last night's job failed.

            • Re: Alert when NCM job fails to run
              kange0010

              I'm also interested to know the answer to this issue.  Anyone?

              • Re: Alert when NCM job fails to run
                cahunt

                You can try renaming the SDF files, then creating new files for the system to use/rewrite, and see if your jobs will run.

                The Configuration Wizard also replaces these mini DB files that hold the job info, so either option will work. If the issue continues after running the Configuration Wizard, something else is causing the job to fail, and diagnostics might help find the root cause. It could be tied to a few different things, and the logs/errors will point you in one direction or another for a more concrete fix; beyond that, you may need to open a support case. It is best to find the root cause rather than implementing different fixes with your fingers crossed.

                • Re: Alert when NCM job fails to run
                  rschroeder

                  I look at the pie chart showing devices that have recently been backed up versus not backed up. I put that up front and high up in NPM so it's easy to see any yellow slice in that pie.

                  Also, when the Daily Config Change Report comes out, I look closely for the tiny date stamp beneath any before/after change to ensure things are keeping up with the calendar.

                  Are you running the latest version of NCM? I would expect the latest version to have a notification you can easily enable for a job that fails to start or complete.

                  • Re: Alert when NCM job fails to run
                    alphabits

                    I'm sure you can write some SWQL alert for this. Is there any event or other message you see indicating the job failed, so we know where to direct the query?
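
                    Since a job that never starts produces no event or email, an alert for it probably has to fire on the absence of a recent result rather than on an error message. Below is a minimal Python sketch of that "stale timestamp" logic. The function name, the 26-hour grace period, and the idea that the newest successful backup time has already been fetched (from a query, a report, or anywhere else) are all assumptions for illustration, not NCM features.

```python
from datetime import datetime, timedelta


def job_is_overdue(last_success, now, max_age=timedelta(hours=26)):
    """Return True when the nightly job looks like it did not run.

    last_success: datetime of the newest successful backup, or None if
                  nothing has ever been recorded (hypothetical input).
    now:          the time at which the check is performed.
    max_age:      how old the newest result may be before we alert; 26 hours
                  is an assumed grace period for a job scheduled every 24 hours.
    """
    if last_success is None:
        # No recorded run at all is treated as overdue.
        return True
    return now - last_success > max_age


if __name__ == "__main__":
    check_time = datetime(2018, 1, 2, 8, 0)
    # Ran last night: not overdue.
    print(job_is_overdue(datetime(2018, 1, 1, 23, 0), check_time))
    # Last ran two nights ago: overdue, so an alert should be raised.
    print(job_is_overdue(datetime(2017, 12, 31, 23, 0), check_time))
```

                    The same comparison could in principle be written as the trigger condition of an alert, as long as the newest backup time is available to the query.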

                      • Re: Alert when NCM job fails to run
                        cahunt

                        Your errors are going to be in the log files. In the DB tables there is a column for logs; it defaults to null until a log is written for a job issue/failure.

                        I cannot find any errors or error messages in the DB tables for the NCM jobs (like you find for SAM component errors), only a job status. And if the job runs properly, the status goes back to scheduled when done.
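
                        To make the status/log idea above concrete, here is a rough Python sketch that flags jobs whose status has not returned to scheduled or whose log column is no longer null. The row layout and the field names `name`, `status`, and `log` are hypothetical placeholders, not the real NCM table schema, and the rows are assumed to have been read from the database already.

```python
def suspicious_jobs(rows):
    """Return the names of jobs that look unhealthy.

    rows: iterable of dicts with hypothetical keys 'name', 'status', 'log'.
    A job is flagged when its status is anything other than 'scheduled'
    or when its log field has been populated (is not None).
    """
    flagged = []
    for row in rows:
        status = (row.get("status") or "").strip().lower()
        if status != "scheduled" or row.get("log") is not None:
            flagged.append(row["name"])
    return flagged


if __name__ == "__main__":
    sample = [
        {"name": "Nightly Backup", "status": "Scheduled", "log": None},
        {"name": "Inventory", "status": "Running", "log": None},
        {"name": "Weekly Report", "status": "Scheduled", "log": "job failed"},
    ]
    print(suspicious_jobs(sample))
```

                        A check like this would need to run well after the job window so that a job still in progress is not flagged. It also would not catch a job that never starts and simply stays at scheduled with a null log, so it would likely need to be paired with a check on how recent the last backup is.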