Dell iDRAC Event Monitor
The Dell iDRAC Event Monitor connects to the iDRAC interface on Dell servers and monitors hardware-related metrics.
Overview
The Dell iDRAC Event Monitor connects to the iDRAC (Integrated Dell Remote Access Controller) found on many Dell servers. iDRAC reports hardware-related data about a system, including details about fans, temperatures, and power supplies.
This event monitor uses the "winrm" Windows command line tool to connect to the iDRAC interface and retrieve the data that it requires. Before using the event monitor for the first time, you may need to run "winrm quickconfig" on the monitoring server or remote node.
When configuring the event monitor, be sure to select the IP address or host name of the iDRAC interface, not that of the server's own network interface.
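To make the connection step more concrete, here is a minimal sketch of how a "winrm" command line targeting an iDRAC could be assembled. It is an illustration under assumptions, not the event monitor's actual implementation: the helper name `build_winrm_enumerate`, the `DCIM_SystemView` class, the example address, and the choice of basic authentication with certificate checks skipped are all assumed here, based on how iDRAC's WS-Management interface is commonly queried. Note that the host in the `-r:` argument is the iDRAC address, not the server's own address.

```python
from typing import List


def build_winrm_enumerate(idrac_host: str, username: str, password: str,
                          dcim_class: str = "DCIM_SystemView") -> List[str]:
    """Build a winrm argument list that enumerates one DCIM class on an iDRAC.

    Hypothetical helper: the class name and flag choices are assumptions,
    not settings confirmed by the event monitor's documentation.
    """
    return [
        "winrm", "enumerate",
        # "cimv2" is winrm's alias for the DMTF CIM schema URI prefix.
        f"cimv2/root/dcim/{dcim_class}",
        # Address of the iDRAC interface itself.
        f"-r:https://{idrac_host}/wsman",
        f"-u:{username}",
        f"-p:{password}",
        "-a:basic",
        "-encoding:utf-8",
        # Assumes the iDRAC presents a self-signed certificate.
        "-SkipCNcheck",
        "-SkipCAcheck",
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only example address.
    args = build_winrm_enumerate("192.0.2.10", "monitor-user", "example-password")
    print(" ".join(a for a in args if not a.startswith("-p:")))
```

The list form can be passed to `subprocess.run` on a Windows machine where "winrm" is available. Passing a password on the command line is done here only to keep the sketch short.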
Use Cases
- Monitoring the hardware health status of your Dell servers
- Alerting based on high power consumption
- Warning about fan and temperature status
Monitoring Options
This event monitor provides the following options:
- Alert with [Info/Warning/Error/Critical] if the iDRAC system is unreachable: Alerts if the iDRAC system cannot be contacted.
- Alert with [Info/Warning/Error/Critical] if the system's health status is not OK: Checks the system's overall rollup status. A value of "Degraded" indicates that one or more system components are in a failed state but the system is still operational. A value of "Error" indicates a critical failure of one or more system components.
- Alert with [Info/Warning/Error/Critical] if the chassis health status is not OK: Checks the chassis health and warns about intrusion status and CPU health status.
- Alert with [Info/Warning/Error/Critical] if fan health status is not OK: Checks the status of each fan and warns if the fan's primary status is not "OK".
- Alert with [Info/Warning/Error/Critical] if temperature probe health status is not OK: Checks the system board inlet temperature and CPU temperatures. Alerts if any temperature is outside the system-configured thresholds. Records a data point for each temperature.
- Alert with [Info/Warning/Error/Critical] if power supply health status is not OK: Checks power supply status. Alerts about failed power supplies and alerts if the redundancy status is not "OK".
- Alert with [Info/Warning/Error/Critical] if RAID and/or drive status is not OK: Checks primary and RAID status for all physical and virtual disks. Alerts if the status is not "OK" or if the RAID status is not "Online".
- Alert if the average consumed power for a chassis is greater than specified thresholds: Checks power consumption of the chassis and alerts if it exceeds thresholds that you define. Records current power consumption as a graph data point.
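The options above follow two patterns: a status check that raises the selected alert level when a value is not "OK", and a threshold check on average power. The sketch below illustrates both. It assumes one watt threshold per alert level, which the option list does not spell out, and the function names `rollup_alert` and `power_alert` are hypothetical rather than part of the product.

```python
from typing import Dict, Optional

# Alert levels from the option list, most severe first.
LEVELS = ("Critical", "Error", "Warning", "Info")


def rollup_alert(status: str, configured_level: str = "Error") -> Optional[str]:
    """Return the configured alert level when the status is not OK, else None."""
    if status.strip().upper() == "OK":
        return None
    return configured_level


def power_alert(avg_watts: float, thresholds: Dict[str, float]) -> Optional[str]:
    """Return the most severe level whose threshold avg_watts exceeds, else None.

    Assumption: thresholds maps an alert level name to a limit in watts.
    """
    for level in LEVELS:
        limit = thresholds.get(level)
        if limit is not None and avg_watts > limit:
            return level
    return None


if __name__ == "__main__":
    print(rollup_alert("Degraded", configured_level="Warning"))
    print(power_alert(450.0, {"Warning": 400.0, "Critical": 600.0}))
```

Checking levels from most to least severe means a reading that exceeds several thresholds is reported once, at the highest applicable level.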
Authentication and Security
The account used for authentication must have access to the iDRAC interface.
Protocols
Data Points
Data Point | Description |
---|---|
Temp | The temperature reported by each monitored temperature probe. |
Power Consumption | The current power consumption of the chassis. |