What is the expected procedure if these devices are using high CPU?
Is there a scenario that would actually require an intervention for these boxes?
To me it sounds like no action is considered necessary. If you don't care during peak hours and you don't care during off hours, when would it ever actually matter? You could just come up with a scheme to exclude these devices from CPU alerting altogether. Assuming you have SAM or some other form of application monitoring in place, it might be more effective to alert on a synthetic transaction against the application this server supports and rely on that, since CPU load is ultimately just an indirect measure that we normally associate with slow application performance.
What I did was create a 'Highly_Utilized' custom property and add an exclusion condition for it to my alert rules. That way, nodes that tend to run hot as expected behavior are excluded from my default rules. I then created a separate, more lenient rule for the highly utilized nodes. You could also adjust the scheduled hours for those specific alert rules if the busy periods are consistent.
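For anyone who wants to see the two-rule idea spelled out, here is a rough Python sketch of the decision logic. It is not SolarWinds code: the Node class and both threshold values are made up for illustration, and only the 'Highly_Utilized' flag comes from the setup described above.

```python
from dataclasses import dataclass

# Hypothetical thresholds; pick values that fit your environment.
DEFAULT_THRESHOLD = 90.0   # percent CPU for ordinary nodes
LENIENT_THRESHOLD = 98.0   # percent CPU for nodes flagged as highly utilized


@dataclass
class Node:
    """Stand-in for a monitored node; not a real SolarWinds object."""
    name: str
    cpu_load: float                 # current CPU load in percent
    highly_utilized: bool = False   # mirrors the 'Highly_Utilized' custom property


def should_alert(node: Node) -> bool:
    """Apply the default rule unless the node is flagged, then use the lenient rule."""
    threshold = LENIENT_THRESHOLD if node.highly_utilized else DEFAULT_THRESHOLD
    return node.cpu_load >= threshold


if __name__ == "__main__":
    nodes = [
        Node("web01", 93.0),
        Node("batch01", 93.0, highly_utilized=True),
        Node("batch02", 99.0, highly_utilized=True),
    ]
    for n in nodes:
        print(n.name, should_alert(n))
```

The same load of 93% triggers the default rule on an ordinary node but not on a flagged one, which is the whole point of keeping the exclusion in a single property rather than in per-node rules.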
I like the option of creating an additional alert for the high-CPU offenders, but that can turn into a nightmare as the number of offenders grows and the times of the CPU peaks vary between devices.
I was toying with the idea of creating cpuPeakStart and cpuPeakStop custom properties and then adding a condition to my SWQL alert query to make sure the CPU load time is not between those values. One of many issues is that the date and time are stored together, and if I pull out just the hour and minute and concatenate them, 7:08 displays as 7:8. It would be nice if the mute function were more granular and allowed repeats.
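One way around the 7:8 problem is to stop comparing concatenated strings and compare minutes since midnight instead, or to zero-pad if a string is really needed. Below is a minimal Python sketch of that logic, assuming the cpuPeakStart and cpuPeakStop values can be read as times of day. The function names are hypothetical, and whether the same arithmetic can be expressed directly in a SWQL alert condition would need to be checked against the SWQL function reference.

```python
from datetime import datetime, time


def minutes_since_midnight(t) -> int:
    """Turn a time (or datetime) into a plain integer that compares correctly."""
    return t.hour * 60 + t.minute


def in_peak_window(sample: datetime, peak_start: time, peak_stop: time) -> bool:
    """Return True if the sample falls inside the node's expected peak window.

    Handles windows that cross midnight (for example 22:00 to 02:00).
    """
    now = minutes_since_midnight(sample)
    start = minutes_since_midnight(peak_start)
    stop = minutes_since_midnight(peak_stop)
    if start <= stop:
        return start <= now < stop
    # Window wraps past midnight.
    return now >= start or now < stop


def hhmm(t) -> str:
    """Zero-padded HH:MM string, so 7:08 does not collapse to 7:8."""
    return f"{t.hour:02d}:{t.minute:02d}"


if __name__ == "__main__":
    sample = datetime(2024, 1, 1, 7, 8)
    print(hhmm(sample), in_peak_window(sample, time(7, 0), time(9, 0)))
```

An alert would then fire only when the load is high and in_peak_window is False for that node, so each device carries its own quiet window instead of needing its own alert rule.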