I'm in the process of setting up performance monitors for my Windows server team. My goal is to create meaningful alerts: an alert should mean there is a real problem, not just another CPU/memory spike. I thought I'd share what I'm doing and see if anyone wants to do the same.
Processor queue
- Monitor sampling time = 120 sec
- Queue length greater than 20 (this is for 1 x Xeon 5550 2.67 GHz)
- Alert triggers if the monitor stays critical for more than 5 minutes
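
To make the "critical for more than 5 minutes" rule concrete, here is a minimal sketch in Python. It is not what any particular monitoring tool runs internally; the class name, field names, and sample data are made up for illustration, and only the threshold of 20, the 120-second sampling, and the 5-minute hold come from the settings above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedAlert:
    """Fire only when a value stays above a threshold for longer than a hold time."""
    threshold: float = 20.0        # queue length limit from the list above
    hold_seconds: float = 300.0    # 5 minutes
    critical_since: Optional[float] = None  # time of the first critical sample

    def update(self, timestamp: float, value: float) -> bool:
        """Feed one sample; return True if the alert should fire now."""
        if value <= self.threshold:
            # Back to normal: forget any critical streak in progress.
            self.critical_since = None
            return False
        if self.critical_since is None:
            self.critical_since = timestamp
        return (timestamp - self.critical_since) > self.hold_seconds


# Hypothetical samples taken every 120 seconds: (seconds, queue length).
samples = [(0, 25), (120, 30), (240, 22), (360, 28), (480, 5)]
alert = SustainedAlert()
fired = [alert.update(t, v) for t, v in samples]
print(fired)
```

With 120-second sampling, the first sample that falls more than 5 minutes after the streak began is the one at 360 seconds, so in practice the alert fires about 6 minutes after the queue first goes critical. A single spike that clears by the next sample never fires.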
Process monitoring
- Monitor sampling time = 300 sec
- ASP worker process percent CPU > 90
- ASP worker process memory (disabled, since the value is a percentage of total memory, which is meaningless on a server with 40 GB)
- Process Handle Count > 10,000 (needs time to burn in; 10,000 might need to increase)
- Process Thread Count > 500 (same comment as above)
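
For anyone who wants to tune these numbers during burn-in, keeping the limits in one table makes them easy to adjust. The following is only a sketch in Python: the key names and the sample readings are hypothetical, and it does not collect counters from Windows itself. The limits are the ones listed above.

```python
# Limits from the list above; key names are illustrative, not real counter paths.
THRESHOLDS = {
    "cpu_percent": 90,       # ASP worker process percent CPU
    "handle_count": 10_000,  # may need to go up after burn-in
    "thread_count": 500,     # may need to go up after burn-in
}


def breached(readings):
    """Return the names of all readings that exceed their limit, sorted by name."""
    return sorted(
        name
        for name, limit in THRESHOLDS.items()
        if readings.get(name, 0) > limit
    )


# Hypothetical readings for one worker process at one 300-second sample.
sample = {"cpu_percent": 95.0, "handle_count": 8_200, "thread_count": 640}
result = breached(sample)
print(result)
```

A reading that is missing from the sample is treated as 0 here, so it never counts as a breach; whether a missing counter should itself raise an alert is a separate decision.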
(As I get more info I'll add to the list. Here is a link I've been using as a guide: http://technet.microsoft.com/en-us/magazine/2008.08.pulse.aspx?pr=blog#id0120047 )