6 Replies Latest reply on May 17, 2011 11:56 AM by fcaron

    What I cannot wait for - The Grand Unification

    Donald_Francis

      Orion now has vision into just about every part of a datacenter, down to the SAN and into the virtual layer, from hard disk to switch port.

       

      What I am waiting for now, and can hardly wait for, is the day all the modules are tied together and the data is linked in such a way that I can, for example, see that one of my main SQL servers has alerted and now shows under the "Performance Impacted" column.  I click on it to see what the matter is and find that one of the physical drives is underperforming badly and there are errors on the switch port it is connected to.

      Or maybe

      I get an alert saying that one of my remote sites is suffering from severe latency.  I go to "The Board", see that it is listed under "Sites Performing Badly", click on it, and see that latency is over half a second and that this appears to be due to a file transfer from ABC PC by user XYZ.

       

      Just some thoughts on where I would love it all to go: basically, the ultimate troubleshooting machine, stamping out problems before they occur and taking a large IT department from being reactive to being proactive.

        • Re: What I cannot wait for - The Grand Unification
          fcaron

          Hi Donald,

          Interesting posting.

          I agree we could increase the value of the data that some of the Orion modules produce, by improving the integration with other data from other modules. Given the number of modules that Orion has, there can be many candidates for better integration.

          One that we have in mind relates to your second paragraph: a remote site performing badly (e.g. latency) because of unwanted PC traffic (e.g. NetFlow and user tracking).

          The types of use cases we have in mind are as follows; they could be delivered by a future, tighter integration between IP SLAM, NTA, and finding where devices are connected in your network.

          - Latency and traffic correlation for troubleshooting long network latency (Is high traffic the cause of latency issues? If yes, what traffic? Between whom?)
          - Latency and traffic correlation for assessing the impact of heavy traffic on user experience (I have high traffic; does this create bottlenecks and impact latency, or is my network correctly engineered to support this traffic?)
          - I've detected high network traffic, who are the users that will be impacted?
          - I've detected high network latency, who are the users that will be impacted?
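
A rough sketch of the first use case, not anything Orion ships: assume each polling interval yields one latency reading plus per-conversation byte counts. The sample layout, the names `pearson` and `top_talkers`, and the 500 ms threshold are all hypothetical, chosen only to illustrate the idea of correlating latency with traffic and then asking "what traffic, between whom?".

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def top_talkers(samples, latency_threshold_ms, limit=3):
    """Sum bytes per (source, destination) over the high-latency intervals only."""
    totals = Counter()
    for sample in samples:
        if sample["latency_ms"] >= latency_threshold_ms:
            for conversation, byte_count in sample["flows"].items():
                totals[conversation] += byte_count
    return totals.most_common(limit)


# Hypothetical data: one latency reading and flow byte counts per interval.
samples = [
    {"latency_ms": 40, "flows": {("pc-a", "srv-1"): 1_000, ("pc-b", "srv-2"): 2_000}},
    {"latency_ms": 45, "flows": {("pc-a", "srv-1"): 1_500, ("pc-b", "srv-2"): 1_500}},
    {"latency_ms": 520, "flows": {("pc-a", "srv-1"): 90_000, ("pc-b", "srv-2"): 2_000}},
    {"latency_ms": 610, "flows": {("pc-a", "srv-1"): 120_000, ("pc-b", "srv-2"): 1_000}},
]

latencies = [s["latency_ms"] for s in samples]
traffic = [sum(s["flows"].values()) for s in samples]

r = pearson(latencies, traffic)                          # is traffic tracking latency?
suspects = top_talkers(samples, latency_threshold_ms=500)  # if so, between whom?
print(f"correlation={r:.2f}, top talkers={suspects}")
```

A high coefficient only says that traffic and latency move together; it does not prove that traffic is the cause, which is why the second step narrows things down to the conversations active during the slow intervals.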

          Would love to hear your thoughts, and those of the community, here.

          Comments on other integrations are of course welcome, too.

            • Re: What I cannot wait for - The Grand Unification
              Donald_Francis

              That sort of thing is exactly what I was talking about and would indeed love to see.

              There is now so much data at your disposal that I think you guys could realistically come up with an additional module just for problem resolution, covering both proactive and reactive problem solving.

              I say an additional module because I think the current modules need to continue as they are, being very black and white about things, but a fuzzy logic module, especially for preventive maintenance, would be truly awesome.

              Something like:

              - I see high utilization on these host switch ports at the same time as on a trunk port on the same switch

                  - Has this been happening routinely?

                      - Yes

                          - Has the CPU been spiking during this time?

                              - Yes

                                  ==> Alert that the switch is in an OK health state but is routinely being overwhelmed.
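
As a minimal sketch, the chain of questions above could be written as a single rule over polled samples. Everything here is an assumption made for illustration: the `Sample` fields, the thresholds, and "routinely" being read as "at least `min_occurrences` times in the window". It is plain threshold logic, not real fuzzy logic and not an Orion API.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One polling interval for a single switch (hypothetical layout)."""
    host_port_util: float  # busiest host port, as a fraction 0.0-1.0
    trunk_util: float      # trunk port on the same switch, 0.0-1.0
    cpu: float             # switch CPU load, 0.0-1.0


def routinely_overwhelmed(samples, util_threshold=0.8, cpu_threshold=0.9,
                          min_occurrences=3):
    """Return True when host and trunk ports are busy together often enough,
    and the CPU spikes during those same intervals just as often."""
    congested = [
        s for s in samples
        if s.host_port_util >= util_threshold and s.trunk_util >= util_threshold
    ]
    if len(congested) < min_occurrences:
        return False  # not happening routinely
    cpu_spikes = sum(1 for s in congested if s.cpu >= cpu_threshold)
    return cpu_spikes >= min_occurrences


# Hypothetical history: three busy intervals with CPU spikes, two quiet ones.
history = [
    Sample(0.95, 0.90, 0.97),
    Sample(0.20, 0.15, 0.10),
    Sample(0.88, 0.92, 0.93),
    Sample(0.30, 0.25, 0.12),
    Sample(0.91, 0.85, 0.95),
]

if routinely_overwhelmed(history):
    print("Switch is in an OK health state but is routinely being overwhelmed.")
```

The same shape would work for other signals such as interface errors or log events: filter the intervals where the first condition holds, then count how often the second condition holds within them.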


              I could think of many more rules like the above, involving errors, logs, etc.