SAM 5.0 feedback and enhancement suggestions

Finally managing to sink my teeth into SAM 5.0 and have now been using it for the past week or so. Have to say I'm impressed with what I'm seeing - in particular the hardware health monitoring is a big enhancement for us. I do, however, have a few comments on areas I'd like to see enhanced, and a couple of possible bugs. Apologies if they're covered elsewhere in earlier posts!

1. I created a new Monitor to check for services that are required as part of our LANDesk patch management software. One of the things that initially caught me out was that I didn't change the name of each component, so all my monitors showed up with the same name of "Windows Service Monitor". To change this name I see you have to click on the "Rename" button. Would it be possible to have the ability to change the component name from the expanded component details?

2. I am monitoring HP servers (DL380 of various generations). Before SAM I was monitoring hardware health using UnDP, and one of the things I was monitoring was the health of the PSUs rather than the power draw. Would it be possible to add this as a part of the HP health status, since it is probably more relevant than the power draw? The OID I was using was 1.3.6.1.4.1.232.6.2.9.3, which returns a table with the status of each PSU.

3. One of my servers appears to have a couple of broken sensors. I am seeing a power draw of 4194572 Watts and a CPU temperature of 255 °C! I assume that these are failure values that HP uses, particularly given that they look like the maximum values that a binary field of a certain number of bits can hold. However, SolarWinds is seeing the power consumption as "OK" and the temperatures as "Degraded". I'm not sure how feasible it would be to have a new status of "Sensor error?" in this sort of situation.

4. Before I assign nodes to an application template I sometimes want to check what components are covered before the assignment. Assuming I make no changes, it would be helpful if there were a link on the "Edit Application Template" page straight to the assignment of that template to nodes. I know this may cause confusion if someone makes changes on the page and doesn't save them before going to that link, but I'm sure a quick line of text or a popup as a warning would suffice!?!

5. I've noticed that on the Node Details page the controls for the graphs for power consumption and temperature are a little broken:

[Image: heathGraphError.jpg]

Also, those controls vanish altogether on the Temperature graph!

6. Some of our servers don't have all the memory slots in use. The feedback from SIM seems to put these slots in the "Other" state, which shows as a problem in some views. I'm not sure if there's an easy way to change this, but it would be helpful if unused slots could be marked as such with an associated "Up" status.

7. Is there any way to have the HP SIM version installed on a server displayed in the "Hardware Details" box?

8. Is there any way to have SolarWinds push the HP SIM agents to a server?

9. In my Active Alerts view I am only shown the component that has gone down, e.g. "Disk 0". This isn't helpful as I don't know what server this relates to. Is there any way to change the hardware health alert such that it also shows the server name, e.g. "Disk 0 on EMAIL_SERVER"?

Sorry there's a lot there, but hopefully some of this might be possible for a future update to SAM and of interest to other users.
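For anyone who wants to experiment with point 3 outside of SAM, here is a minimal sketch of a plausibility check on raw readings. It is not how SAM or HP's agents handle this; the function name, the reading kinds, and the ceilings are all made up for illustration. Note that 255 is the largest value an 8-bit field can hold, while 4194572 sits 268 above 2^22, so a simple "all bits set" test would only catch the temperature. A per-kind ceiling catches both.

```python
# Hypothetical plausibility ceilings; these are assumptions, tune for your hardware.
PLAUSIBLE_MAX = {
    "temperature_c": 150,  # assumption: no healthy CPU sensor reads above this
    "power_w": 5000,       # assumption: a single 2U server draws far less
}


def classify(kind, value):
    """Return 'sensor error' for implausible readings, otherwise 'plausible'."""
    if value < 0 or value > PLAUSIBLE_MAX[kind]:
        return "sensor error"
    return "plausible"


if __name__ == "__main__":
    # The first two readings are the ones from point 3; the third is a normal one.
    for kind, value in [("temperature_c", 255), ("power_w", 4194572), ("temperature_c", 42)]:
        print(kind, value, classify(kind, value))
```

A ceiling per reading kind is used instead of a single sentinel list because a value such as 255 is implausible for a temperature but perfectly reasonable for a power draw in Watts.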

  • I completely agree with #9! My current workaround is to click on "Disk 0", which takes me to the node details page, but it would be much nicer to see "Disk 0 on ${Caption}".

  • Bad form I know, but a couple more that have cropped up since I wrote the original post...

    10. Any chance of a "Components with Problems" resource under "SAM Application Summary Reports"? I have a couple of components that are "Critical", but the app itself doesn't show as being "Down". I know I could do a report to cover this, but it would be nice to see it as a default.

    11. Is there any chance that the Hardware Health Monitoring will be extended to non-server devices like Cisco switches? Even better... Any chance of an SDK or API or other TLA that would allow us to build our own health monitors like the server health ones? ;)

  • Just a quick bump to see if anyone from the SolarWinds SAM team has any comments on the above!

  • 1. I created a new Monitor to check for services that are required as part of our LANDesk patch management software. One of the things that initially caught me out was that I didn't change the name of each component, so all my monitors showed up with the same name of "Windows Service Monitor". To change this name I see you have to click on the "Rename" button. Would it be possible to have the ability to change the component name from the expanded component details?

    We're currently working on some fairly significant changes to the application template editing page but I'm curious why you'd like to see the component monitor name as an editable item under the expanded settings and not next to the component monitor name as it is today?

    2. I am monitoring HP servers (DL380 of various generations). Before SAM I was monitoring hardware health using UnDP, and one of the things I was monitoring was the health of the PSUs rather than the power draw. Would it be possible to add this as a part of the HP health status, since it is probably more relevant than the power draw? The OID I was using was 1.3.6.1.4.1.232.6.2.9.3, which returns a table with the status of each PSU.

    This is also one of the items we're working on. It's not called out specifically in my blog posting because this is more of a bug than a feature, but this falls under Additional Hardware Monitoring Support. We're tracking this internally as FB107941.

    3. One of my servers appears to have a couple of broken sensors. I am seeing a power draw of 4194572 Watts and a CPU temperature of 255 °C! I assume that these are failure values that HP uses, particularly given that they look like the maximum values that a binary field of a certain number of bits can hold. However, SolarWinds is seeing the power consumption as "OK" and the temperatures as "Degraded". I'm not sure how feasible it would be to have a new status of "Sensor error?" in this sort of situation.

    I can't say I've seen this situation before. I'd be interested to know how HP's SIM is reporting these values. I recommend opening a support case so we can gather a MIB walk from this device and a SolarWinds diagnostic to determine if we can improve how we handle this condition.

    4. Before I assign nodes to an application template I sometimes want to check what components are covered before the assignment. Assuming I make no changes, it would be helpful if there were a link on the "Edit Application Template" page straight to the assignment of that template to nodes. I know this may cause confusion if someone makes changes on the page and doesn't save them before going to that link, but I'm sure a quick line of text or a popup as a warning would suffice!?!

    I'll log this as a feature request. A possible workaround may be to execute a "test" before assignment. This will list all component monitor names and their status (pass/fail) before the template is actually assigned to a node.

    5. I've noticed that on the Node Details page the controls for the graphs for power consumption and temperature are a little broken:

    [Image: heathGraphError.jpg]

    Also, those controls vanish altogether on the Temperature graph!

    I can't say I've seen this issue before, and there have been no reported cases to support that I've been able to find. This may be specific to your particular browser/version. I would recommend trying to reproduce this from another machine using a different browser/version. If the problem persists please open a case with support.

    [Image: Hardware Health Resource Chart.png]

    6. Some of our servers don't have all the memory slots in use. The feedback from SIM seems to put these slots in the "Other" state, which shows as a problem in some views. I'm not sure if there's an easy way to change this, but it would be helpful if unused slots could be marked as such with an associated "Up" status.

    HP SIM should not be returning the status of unused memory slots as "Other". In fact, unused memory slots shouldn't be listed at all in SAM. I would recommend upgrading HP's SIM agent on those servers to the latest version to see if that helps to resolve the issue. If possible, I'd re-add the node as a WMI host to see if this condition is reproducible for both SNMP and WMI managed nodes. If it is, please open a case with support so we can investigate this further.

    7. Is there any way to have the HP SIM version installed on a server displayed in the "Hardware Details" box?

    Not currently, but this is something we're considering for a future release.

    8. Is there any way to have SolarWinds push the HP SIM agents to a server?

    SAM cannot deploy the vendor-specific hardware agent software, but this is something our Patch Manager product can perform.

    9. In my Active Alerts view I am only shown the component that has gone down, e.g. "Disk 0". This isn't helpful as I don't know what server this relates to. Is there any way to change the hardware health alert such that it also shows the server name, e.g. "Disk 0 on EMAIL_SERVER"?

    The alert is based on the sensor that's in failure, which is why you're seeing the specific volume. This can be changed to the hardware type, which will return hard drive, fan, etc., or to a general hardware alert, which will specify the node. This is no different than how application alerts are reported in the Active Alerts Resource, but your point is valid. I'll log this as a feature request. In the meantime you can use Events to display any required information that you'd like to see as part of the alert, using advanced alert macros.

  • I am curious if any other SAM users are having issues with HTTP and HTTPS content monitors going critical at regular 20 minute periods. We see this only on our content monitors. The dev folks have confirmed they can also see the issue in their lab and we are trying to get this fixed ASAP. I am just surprised that we would be the first client to notice this issue. Is this possible?

  • Hi stephen.black,

    I recommend opening a support ticket for your case if you haven't done so already. We need to take a closer look at what's going on.

  • Tomas is right. If you've already upgraded to SAM 5.0.1 and are experiencing this issue then please open a case with support so we can take a look at your diagnostics. I'm not aware of a systemic issue with the HTTP/HTTPS user experience monitors affecting multiple customers.

  • 2. I am monitoring HP servers (DL380 of various generations). Before SAM I was monitoring hardware health using UnDP, and one of the things I was monitoring was the health of the PSUs rather than the power draw. Would it be possible to add this as a part of the HP health status, since it is probably more relevant than the power draw? The OID I was using was 1.3.6.1.4.1.232.6.2.9.3, which returns a table with the status of each PSU.

    The latest SAM 5.2 beta includes hardware health monitoring improvements for HP servers, so you should now be able to properly monitor the health and status of your power supplies. If you're an existing APM/SAM customer under active maintenance you can sign up here to participate in the beta. We'd love to get your feedback on this feature.

    To learn more about what's new in the SAM 5.2 beta please read my blog postings below.

    SAM 5.2 Beta 2 - Soup's On

    SAM 5.2 Beta is Now Available
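
For readers still polling the PSU table from point 2 themselves, the following is a rough sketch of turning walked OID/value pairs into a per-PSU condition. It does no SNMP itself and is not what SAM does internally. The table OID comes from the thread; the column position (.1.4), the chassis/bay index, the condition names, and the sample values are assumptions based on my reading of HP's CPQHLTH-MIB and should be checked against the MIB file for your agent version.

```python
PSU_TABLE = "1.3.6.1.4.1.232.6.2.9.3"   # OID quoted in the thread
CONDITION_COLUMN = PSU_TABLE + ".1.4"   # assumption: entry .1, condition column .4

# Assumption: enumeration used by the condition column.
CONDITIONS = {1: "other", 2: "ok", 3: "degraded", 4: "failed"}


def psu_conditions(walk):
    """Map (chassis, bay) to a condition name from walked OID -> integer pairs."""
    prefix = CONDITION_COLUMN + "."
    result = {}
    for oid, value in walk.items():
        if not oid.startswith(prefix):
            continue  # some other column of the table
        chassis, bay = (int(part) for part in oid[len(prefix):].split("."))
        result[(chassis, bay)] = CONDITIONS.get(value, "unknown({})".format(value))
    return result


def failing(conditions):
    """Return the sorted (chassis, bay) keys whose condition is not 'ok'."""
    return sorted(key for key, name in conditions.items() if name != "ok")


# Made-up sample data standing in for real walk output.
SAMPLE_WALK = {
    PSU_TABLE + ".1.3.0.1": 3,  # a different column, ignored by psu_conditions
    PSU_TABLE + ".1.4.0.1": 2,
    PSU_TABLE + ".1.4.0.2": 4,
}

if __name__ == "__main__":
    conditions = psu_conditions(SAMPLE_WALK)
    print(conditions)
    print("Needs attention:", failing(conditions))
```

The walk is passed in as a plain dictionary so the mapping logic can be tried without a live device; in practice the pairs would come from whatever SNMP tool or poller you already use.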