10 Replies Latest reply on Feb 21, 2012 2:13 AM by ET

    node has stop responding but does not down

    jliewch2003

      Hi all, a few days ago one of my nodes stopped responding (request timed out), but the node did not go down: I can't see any "node down" event in the event log. As a result, my alert for a node being down was not triggered. Afterwards I added my laptop as a node and disconnected it; the alert worked and the event log recorded the node going down after it stopped responding. Below is a screenshot of the event log for the first device, the one that did not go down.

       

      My question is: why does my first server stop responding but not change to Down? Is there a known issue here? I hope you can follow my poor English. Thanks.

        • Re: node has stop responding but does not down
          ET

          Do you have a parent defined for your first server? If so, and this parent is down permanently (or was down at the time), then your node goes to the UNREACHABLE state instead of Down because of this dependency.
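The dependency rule described above can be sketched roughly as follows. This is only an illustration of the idea, not Orion's actual code: the `Status` enum and the `resolve_status` function are hypothetical names, and the rule is simplified to a single parent.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    UP = "Up"
    DOWN = "Down"
    UNREACHABLE = "Unreachable"


def resolve_status(responding: bool, parent_status: Optional[Status]) -> Status:
    """Hypothetical rule: a silent node whose parent is Down is Unreachable."""
    if responding:
        return Status.UP
    # Assumption: the dependency only matters when the parent is Down.
    if parent_status is Status.DOWN:
        return Status.UNREACHABLE
    # No parent defined, or the parent is fine: the node itself is Down.
    return Status.DOWN
```

Under this sketch, a node with no parent (`parent_status=None`) that stops responding always resolves to Down, so an alert that triggers on Down would only be skipped when a parent dependency exists and that parent is itself Down.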

          • Re: node has stop responding but does not down
            cscoengineer

            True. I would check the group dependencies to see if the server is a child. Also see if the server has dual connections to the network, and check which interface the node is defined to 'ping'. Sometimes you may think it's being polled on one interface when it's actually looking at a different interface on the same node.

            • Re: node has stop responding but does not down
              mavturner

              jliewch2003,

              Is it just this one node or do you have this problem with all of your nodes? Did you change any of the standard polling intervals, thresholds (Settings - Orion Thresholds), or warning level (Settings - Polling Settings - Node Warning Level)? This is probably not the problem, but just curious.

              No reason to apologize for your English, it is great!

              Mav

                • Re: node has stop responding but does not down
                  jliewch2003

                  Hi all, sorry for the late reply. FYI, the node does not have any dependencies. It is standalone. I also tried using my laptop as a server to test the alert I wanted to trigger, and it worked fine.

                  Mav, I did not change any of the settings you mention; they all remain at their defaults. It is kind of weird that the server did not go down. Is there any way to troubleshoot this issue? Thanks in advance.

                    • Re: node has stop responding but does not down
                      ET

                      Well, investigating after the fact is very hard. But from the events you posted I can see that the node went Down.

                      Node has stopped responding (reason)

                      This event is fired when the status of the node is Down. Is the problem that the node was not down, or that the alert wasn't triggered?

                        • Re: node has stop responding but does not down
                          jliewch2003

                          Hi, but the event does not show that the node went down; it only mentions that the node stopped responding. My problem is that the alert only triggers when the node goes down, and according to the events the node did not go down. So I need to know how the node can go down and trigger the alert.

                          • Re: node has stop responding but does not down
                            jliewch2003

                            Hi ET, as you can see, the node stopped responding and was supposed to go down next. However, the event log does not show the node going down, and my first concern is why. Other devices go down once they stop responding. Is there an issue that keeps this node from going down? (Since it only stopped responding and there is no Down event, I assume the node did not go down.) My second concern is that if the node does not go down, it will not trigger my alert for a node going down.

                            Really thanks a lot for all of your opinion.

                              • Re: node has stop responding but does not down
                                ET

                                I see what you are trying to show now. You are missing the event "Node ... is Down." This event is suppressed when the previous status of the node was "Unknown". That's the only situation.

                                Are you able to reproduce this issue when the node was "Up" previously? If so, I would open a support ticket so we can look closer into this.
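As a rough illustration of the suppression rule just described, here is a minimal sketch. It is not the product's implementation: the function `events_for_transition`, the plain-string statuses, and the exact event wording are all assumptions made for the example.

```python
from typing import List


def events_for_transition(node: str, previous: str, current: str) -> List[str]:
    """Hypothetical event log entries written when a node changes status."""
    events: List[str] = []
    if current == "Down":
        # Assumption: this event is written whenever the node becomes Down.
        events.append(f"{node} has stopped responding")
        # The "is Down" event is skipped only if the old status was Unknown.
        if previous != "Unknown":
            events.append(f"Node {node} is Down")
    return events
```

With these assumptions, a transition from "Up" to "Down" yields both events, while a transition from "Unknown" to "Down" yields only the "stopped responding" entry, which matches the event log described in this thread.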

                                Alerts must work in any situation and I understand your concern here.

                                  • Re: node has stop responding but does not down
                                    jliewch2003

                                    Hi ET, thanks for your understanding. The node with this issue can't be used for testing, since it is a main server for the company. You say the event is suppressed when the previous status was Unknown, but as far as I know, Unknown only happens for nodes that have dependencies, and this node has no dependencies configured. Is it possible that the server hung and so the node never went down?

                                    FYI, this was one of my projects and it has already been handed over to the customer, so I might not be able to test the issue. But I will try to ask my customer to test it and will update this thread.

                                     

                                    Thanks