0 Replies Latest reply on Mar 4, 2014 4:55 PM by nicole pauls

    Hey, Listen! Alert Central Office Hours Chat Transcript from Thursday, December 5

    nicole pauls

      Here's the transcript from Thursday, December 5th's Alert Central Office Hours session. As always, check out the schedule here: Hey, Listen! Alert Central Office Hours Schedule & Transcripts. We'll put up a Thwack event over in the Community Events, Webcasts & Training space and you can sign up for a handy reminder email by filling out this survey: https://www.surveymonkey.com/s/9TNQKD3

       

      Highlights from this Office Hours include the finer points of what it means to be On Call.

       

      I've marked the AC Team members below with (AC Team), (PM), and (Dev), so they are easy to distinguish. If you didn't get your question answered or have a new question after reading through, feel free to post it here in the Alert Central forum!

       

      SenderText
      IanJHi Katie. Is the discussion open yet?
      Katie B (AC Team)Not technically (our product team isn't here yet), but you are more than welcome to chat.
      IanJOk thanks. I can wait till they get here.
      Katie B (AC Team) They will be here soon.
      Katie B (AC Team)Just a reminder, we will post the chat from this event once the meeting is completed.
      Katie B (AC Team)Who here has downloaded Alert Central?
      IanJI have.
      IanJIt is implemented as a test in our environment.
      Katie B (AC Team)How long have you had it up in your test environment?
      PeteCousinsJust put it 'live' today for a single source.
      Katie B (AC Team)What source do you have coming into AC?
      IanJWe have had it for a few weeks with a single direct Orion source.
      Katie B (AC Team)Is everyone able to see the chat?
      Nicole (AC PM)works for me
      PeteCousinsI have 4 HP Disk-to-disk appliances at the moment. VMware, Orion, Redgate SQL Monitor, SQL Agent to be added at a later date when I can figure out how to route alerts effectively for our multi-disciplined team.
      PeteCousinsI can see chat fine
      Katie B (AC Team)Well, we are officially in office hours, so please feel free to submit your questions
      JayMeathYes, I can see the chat
      IanJSo I found in the last few weeks there are three main points that will prevent AC from ever being used in production at our site.
      Nicole (AC PM)bring them on!
      IanJI found most of them as issues that are being voted on.
      IanJ1. There is no way to scope alerts to certain groups. Anyone can ack/close anyone's alerts.
      2. Alerts do not clear in AC when they clear in Orion. Makes it very cumbersome when things auto clear as you still get spammed.
      3. New alerts for the same node just make a note on the old AC alert. This is very problematic, as it forces the old alert owner to be responsible for new alerts, even when they are not on call.
      Nicole (AC PM)are you only using AC with Orion, or primarily with Orion?
      IanJOnly with Orion.
      Nicole (AC PM)got it - for the auto-clear, would you want that behavior always when trigger conditions are cleared, or are there some where you'd leave the alert open?
      IanJOur team really likes the way Orion does it.
      IanJIf that was mirrored in AC it would be great.
      IanJOverall AC really fills a role we desperately need to make a multi team transition.
      Nicole (AC PM)with scoping alerts for groups, would you want groups to not even have visibility of other groups' alerts, or just read-only access?
      IanJBoth options would be great, but I would be happy with read only access.
      IanJWe can't have a Tier 1 tech accidentally closing a priority 1 alert.
      Nicole (AC PM)the third one is interesting - do they show up as new alerts in orion?
      Nicole (AC PM)we do have a couple of nuances we're working on with orion alerting so it's something that has got a little more attention recently on our end, though that doesn't ring a bell.
      IanJIf someone owns an alert for, say, volume full, and they ack it, and the alert then clears in Orion and later retriggers in Orion, it just finds the acked alert in AC and notes that it retriggered.
      Nicole (AC PM)OH
      Nicole (AC PM)that makes more sense
      Nicole (AC PM)so either the alert in alert central should start over from the beginning, or a new one created for the new re-trigger.
      IanJYes. either re-queue old alert new or re-trigger as new alert.
      IanJEither would be fine and we could work with it.
      jwilsonThat's a double-edged sword. The tech would have to look at history for a volume alert if it was a recurring alert.
      IanJCorrect, which is what you do now for Orion.
      jwilsonIn a big shop you could have several different people clear space and not notice that it was an ongoing issue.
      IanJIf you see an alert as a "trouble ticket" then you would want it tracked in one alert.
      IanJBut if someone forgets to close it then it just lives assigned to them.
      Nicole (AC PM)on our end, I think we'd make it an option, to try to cover both cases flexibly. We've been leaning toward providing more options rather than making more defaults because of how widely used Alert Central tends to be.
      jwilsonYeah, swiss army knife is good....until it doesn't fit in your pocket
      jwilsonI'm all for options
      PeteCousinsI don't want to break the flow of conversation, however I need some advice on handling routing of alerts. We have a multi-disciplined team of 10 people. Although we all have our specialist areas, we are expected to be able to have a go at fixing anything if the specialists aren't available. We also have a rota with one person doing an early shift (7:30am to 9am), someone doing a late shift (5pm to 8pm), and someone covering the Saturday 8am-4pm. Everyone else is theoretically working Mon-Fri 9am to 5pm. So for alerts regarding backups, we'd ideally like to alert whoever is on call (if not 9am-5pm), then escalate to one of the two specialists (ideally at random), then to the other specialist, then to the rest of the team, and finally, if there's still no response, to our team leader. I don’t think this is possible with Alert Central without a lot of identical calendars for each specialism.
      Nicole (AC PM)a puzzle. thinking
      Nicole (AC PM)interestingly, we have heard the "on duty" vs. "on call" a few times lately.
      Nicole (AC PM)also, the ability to route an alert to different groups, which would make some of the redundancy unnecessary
      jwilsonalert to different or alert to multiple?
      Nicole (AC PM)so, i think you're right - I think you would need to create an on call calendar for each specialty that has the on call rotation, so you're at least duplicating that effort
      Nicole (AC PM)either or
      Nicole (AC PM)i think the most straightforward means creating an on call calendar for each specialty, routing alerts to that specialty, then doing a notify on call -> notify specialist -> notify everyone -> fallback to team lead
      Nicole (AC PM)also means everyone is in every group.
      IanJSo it's always going to be linear alerting? For example, Orion allows you to ping the owner while still allowing an escalation tangent to exist.
      Nicole (AC PM)linear is how the workflow is now - more or less the most straightforward.
      Nicole (AC PM)but, I could see that scenario where you want to notify someone but let the alert continue through escalation.
      IanJI also noticed that you do not have an option to say "continue escalation steps indefinitely for a group", which means I have to have it end up with someone eventually.
      IanJI got around this by setting the repeat to a super high number.
      scottrichthe ability to send notifications to other groups or individual users as tickets are worked would be huge for me!
      Nicole (AC PM)true, there is a max repeat of 99
      Nicole (AC PM)scott, something automated (when an alert in "backups" happens also let the "servers" group know as a cc) or something in alert central that would let you pass on information (I see an alert happened in backups, send a note to the "servers" group from AC so I don't have to figure out who those people are)?
      IanJRight.. and my only options after that are to ack, close, or assign to a person. None of which I want to do.
      Nicole (AC PM)how often do you run into that, or is it a "just in case" type thing?
      scottrichyes, automated would be best. for example, when a firewall goes down, the network team gets the alert, but the e-comm team gets notified so they know the website is not accessible and can see the progress.
      Nicole (AC PM)great, that's what I was thinking.
      IanJYeah I would solve that in the current system by adding both parties to the group or set up a mailing list that includes both.
      IanJBut including dependency notifications would be huge.
      Nicole (AC PM)we are adding a manual CC option so at least you could add someone to receive updates on an alert without taking responsibility for it, step in that direction.
      scottrichthat would be good to have anyway, in case you need to include someone who is not part of the group.
      IanJhow often do you run into that, or is it a "just in case" type thing? - It's more of a case that we don't want the alert to close until the on call person takes it.
      IanJI want it to ping them until they get out of bed.
      Nicole (AC PM)haha
      Nicole (AC PM)haven't quite got to the feature of "air horn notification"
      scottrichThat's what managers are for!
      IanJVery true, but also why we need the tangent alerts. The primary on call person still needs to continue to be notified. So the CC would help.
      IanJNicole, what is the timeline for some of these issues? I know permissions for alerts has been requested by others. I would really like to implement AC full time, but can't until we sort out some of the issues and flow. Is there a roadmap that will be posted?
      Nicole (AC PM)there is a roadmap posted, but i haven't updated it recently. we're working on SMS/paging notifications at the moment as our "major" item, but we are filling in with lots of smaller items - for example, some of these orion alerting issues are things we'd try to fit in over say the next 6 months
      Nicole (AC PM)more complex things like multiple groups are things we'd have to weigh a little more carefully
      Nicole (AC PM)(permissions)
      IanJOn the note of SMS/paging... Right now I know you can do it through email. How do you plan to implement that going forward?
      Nicole (AC PM)our initial support will be of third party gateways - we'll support protocols that submit/receive SMS/page notifications to a gateway that will handle the actual modemy bits
      Nicole (AC PM)e.g. PageGate
      IanJSo through a cell phone gate?
      IanJOr similar system?
      PeteCousinsSorry, been dealing with a live issue. These are the conclusions I've been coming to.
      PeteCousinsHave to go - live issue getting more critical sorry. Thanks for the ideas.
      IanJcell service gate I mean. We use NotePager Pro now.
      Nicole (AC PM)right
      Nicole (AC PM)thanks, Pete!
      Nicole (AC PM)most likely we'll support some direct APIs (PageGate is kind of a weird thing) and then things like SNPP/SMPP which are standard protocols
      IanJGotcha. Unfortunately for us that piece is covered and independent of AC. If AC mirrored the alert status in Orion it would allow us to easily integrate with it, as the triggers in Orion cover paging as well.
      Nicole (AC PM)yeah. using AC to extend an Orion-only environment is sometimes a unique bird.
      IanJUnderstood. Unfortunately, if AC can't meet the need of segmenting/routing alerts properly between groups with permissions, and handling escalations in a better way than the logic of triggered alerts, there is no win when you are coming from Orion.
      IanJYou actually lose more important functionality for a few smaller wins (on-call calendars, easy email acks).
      Nicole (AC PM)15 minutes left for questions! or comments, unsolicited or otherwise!
      IanJThanks for taking the time to listen Nicole. I'm excited to see the roadmap and what features are going to be top priority in the next dev cycle so we can plan accordingly.
      Nicole (AC PM)thanks very much for your feedback, Ian, it was very helpful.
      IanJI hope so. I haven't found a way to get support on AC yet, so I really appreciate this avenue being available.
      Nicole (AC PM)this and thwack are your best bets
      scottrichThanks, Nicole, appreciate your time!
      Nicole (AC PM)thanks, scott!
      Nicole (AC PM)thanks, everyone!
      Nicole (AC PM)see you again for office hours in January! - will post the transcript and schedule here as usual: http://thwack.solarwinds.com/message/202041
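
For readers following the routing discussion above, here is a rough sketch of the linear chain Pete and Nicole talked through (notify on call -> notify specialists in random order -> notify everyone else -> fall back to team lead). It is plain Python, not Alert Central configuration or any Alert Central API. The names, the rota layout, and the assumption that the early and late shifts run Monday to Friday are all hypothetical and only there for illustration.

```python
import random
from datetime import datetime, time

# Hypothetical team; none of these names come from the transcript.
TEAM = ["ann", "bob", "cat", "dan", "eve"]
TEAM_LEAD = "lee"
SPECIALISTS = {"backups": ["bob", "cat"]}

WEEKDAYS = {0, 1, 2, 3, 4}  # Monday..Friday (assumed for early/late shifts)
SATURDAY = {5}

# (days, shift start, shift end, person on call); shift times follow Pete's rota.
ROTA = [
    (WEEKDAYS, time(7, 30), time(9, 0), "ann"),   # early shift
    (WEEKDAYS, time(17, 0), time(20, 0), "dan"),  # late shift
    (SATURDAY, time(8, 0), time(16, 0), "eve"),   # Saturday cover
]


def on_call(when, rota):
    """Return whoever covers `when`, or None if no rota shift applies."""
    for days, start, end, person in rota:
        if when.weekday() in days and start <= when.time() < end:
            return person
    return None


def escalation_chain(alert_type, when, rota, rng=random):
    """Build the ordered list of notification steps for one alert."""
    steps = []
    person = on_call(when, rota)
    if person:
        steps.append([person])
    # Specialists are tried one at a time, in random order.
    specialists = list(SPECIALISTS.get(alert_type, []))
    rng.shuffle(specialists)
    for specialist in specialists:
        if [specialist] not in steps:
            steps.append([specialist])
    # Then everyone who has not been notified yet, as one step.
    notified = {p for step in steps for p in step}
    rest = [p for p in TEAM if p not in notified]
    if rest:
        steps.append(rest)
    # Final fallback.
    steps.append([TEAM_LEAD])
    return steps


if __name__ == "__main__":
    # Thursday evening, during the late shift.
    for step in escalation_chain("backups", datetime(2013, 12, 5, 18, 0), ROTA):
        print(step)
```

Because the chain is computed from a single rota plus a per-specialty list, the on-call rotation is only defined once, which is the duplication Pete was hoping to avoid. Whether Alert Central can express this without one calendar per specialty is exactly the open question from the chat.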