ASM Health Checker Found 1 New Failures

The most frequent culprit is an offline disk: one disk in a disk group has been taken offline.

| ID | Requirement |
|----|-------------|
| FR1 | System must track the state of each ASM health check item across runs |
| FR2 | Detect the difference between current_failures and previous_failures |
| FR3 | If new_failures_count > 0, trigger a notification |
| FR4 | Include in alert: failure name, timestamp, component, severity (if available) |
| FR5 | Suppress duplicate alerts for the same failure unless it re-occurs after being resolved |
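The requirements FR1 to FR5 can be read as a set difference between two runs. The following is a minimal Python sketch under stated assumptions, not the health checker's actual implementation: the `Failure` class, `find_new_failures`, `format_alert`, `run_check`, and the `notify` callback are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Failure:
    # Hypothetical record for one failed health check item.
    name: str                       # failure name (FR4)
    component: str                  # affected component (FR4)
    severity: Optional[str] = None  # reported only if available (FR4)


def find_new_failures(current_failures, previous_failures):
    """FR2: failures present in this run but absent from the previous run."""
    previous_names = {f.name for f in previous_failures}
    return [f for f in current_failures if f.name not in previous_names]


def format_alert(failure, timestamp):
    """FR4: failure name, timestamp, component, and severity when known."""
    parts = [
        f"failure={failure.name}",
        f"time={timestamp.isoformat()}",
        f"component={failure.component}",
    ]
    if failure.severity is not None:
        parts.append(f"severity={failure.severity}")
    return " ".join(parts)


def run_check(current_failures, previous_failures, notify, now=None):
    """Compare two runs, notify on new failures, return the state to keep (FR1)."""
    now = now or datetime.now(timezone.utc)
    new_failures = find_new_failures(current_failures, previous_failures)
    if len(new_failures) > 0:  # FR3: new_failures_count > 0
        for failure in new_failures:
            notify(format_alert(failure, now))
    # FR5: a failure that persists stays in the stored state, so it is not
    # reported again; once resolved it drops out and a re-occurrence is new.
    return list(current_failures)


# Usage: four consecutive runs for one hypothetical failure.
sent = []
state = []
disk_offline = Failure("disk_offline", "DATA_0001", "critical")
state = run_check([disk_offline], state, sent.append)  # first occurrence: alert
state = run_check([disk_offline], state, sent.append)  # still failing: suppressed
state = run_check([], state, sent.append)              # resolved
state = run_check([disk_offline], state, sent.append)  # re-occurred: alert again
```

Storing only the previous run's failures is enough for FR5 in this sketch. A real checker would also need to persist that state between runs, which is omitted here.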


  • Check port listening:
  • Check disk and memory:
  • Check network path:
  • Re-run health check (example CLI):
  • View recent logs:
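As a rough illustration of the first two checks, the sketch below tests whether a TCP port accepts connections and how much disk space is free, using only the Python standard library. The function names are hypothetical, and the host, port, and path are assumptions to adjust: 1521 is the default Oracle listener port, but your listener may use another. Memory, log, and health check commands are site-specific and are not covered.

```python
import shutil
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def disk_free_fraction(path):
    """Return the fraction of free space on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    # Assumed values for illustration only; adjust host, port, and path.
    print("listener reachable:", port_is_listening("localhost", 1521))
    print("free space:", f"{disk_free_fraction('.'):.0%}")
```

A successful connection also confirms that the network path to that host and port is open, which partly covers the network check.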

  • Error example: Disk DATA_0001 is offline

    Fix:

    ALTER DISKGROUP DATA ONLINE DISK DATA_0001 POWER 3;
    -- Wait for the operation to complete; no rows returned means nothing is still running
    SELECT * FROM v$asm_operation;
    

    If the disk remains offline, drop it and add a replacement:

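    -- Disk names are unquoted identifiers; only the device path is a string literal.
    -- An offline disk may require the FORCE option (DROP DISK DATA_0001 FORCE).
    ALTER DISKGROUP DATA DROP DISK DATA_0001;
    ALTER DISKGROUP DATA ADD DISK '/dev/mapper/asm_data_new' NAME DATA_0001;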