Level 8

Who's a full-time monitoring engineer?

A question for the people who do this full-time for one network: after you have everything singing and dancing (a fully resilient, high-availability, top-notch monitoring system, with everything onboarded and all alerts in place), how do you keep busy as a monitoring engineer?
0 Kudos
6 Replies

Might be moving that way when we staff up. I've been beating the database servers, NPM, and NCM into shape while upgrading to 2019.4.

I have at least 7-10 features on my agile Epic related to SolarWinds, but it's tough to get to them around my daily network engineering tasks and troubleshooting.

0 Kudos

I was doing consulting for several years and recently came on full time at a place with what most people would consider a pretty mature monitoring environment. That said, my team is in major flux because, as far as the business is concerned, basic infra monitoring and alerting is a solved problem. When oddball things come up, we already have well-defined processes for figuring out how to track them and alert the right people, and that side of things hums along smoothly with barely any attention. Now the dev teams want tighter integration with their world, so there are lots of new things to learn in the arena of APM, development pipelines, and such: the kinds of problems that can come up with Kubernetes clusters, the cloud services we depend on, etc.
A lot of my time is also spent improving the general user experience of accessing data that we have spread out across a half dozen platforms. Nobody wants to log into Orion, then log into the Meraki portal for other pieces of info, then log into the Viptela, then hit up the Palo. So we aggregate a lot of tools in custom databases and are building up observability tools that make it easier for the owner of a system to find out everything they could possibly want to know about that system and everything else it touches, while trying to make all of that appear simple.

- Marc Netterfield, Github

We might be at a place where we just send everything to a data lake and monitor from there.

I wouldn't want to be SolarWinds, as it's got to be hard to try to be all things to all people.

0 Kudos

I find that at the end of the day, user experience on a data lake is still pretty trash tier, so I end up having to design some kind of interface to lay over it to help people get what they need. If I'm building the overlay anyway, I can just pull the data directly from whatever tools it lives in and save myself the hassle of duplicating, standardizing, and consolidating the data in the lake. That's kind of the model Grafana takes: they just want to provide a visualization layer over the data, wherever it lives. I've dabbled there, but I still have a lot of work to do to bring it all together into something I'm happy with. Who would have known that UX was such a pain?
- Marc Netterfield, Github
0 Kudos
Level 12

I have been doing "monitoring engineering" for a long time, over 20 years, with several companies. Just when I thought I was golden, a new platform would come online and of course it would not be fully supported. Or a new application would come online and again things were no longer golden.

Or take the latest: with COVID-19, everyone (15,000 to 20,000 globally) is working from home and management wants reports (hourly). "Oh, nice report, but can you add this?" So what to do with the spare time? What time?

Monitoring and alerting are the type of things that are always changing.

Things change: application software changes, new unknown events happen, new reports are requested, you think of newer, better ways to accomplish the same task, event messages change, the cloud happens for some things, changing how everything else interacts, etc.

 

Keeping the lights on takes work as you have to check up on your tools too.