Add notification and alerting system for Service Mode theme #34

Closed
Copilot wants to merge 17 commits from copilot/add-notification-alert-system into DEV
Copilot commented 2025-10-08 08:08:16 -05:00 (Migrated from github.com)

Overview

This PR implements a comprehensive notification and alerting system for DitDashDot that monitors service health and sends real-time webhook notifications when services go down or recover. The feature integrates exclusively with the Service Mode theme, providing homelab administrators with proactive monitoring capabilities.

Features

Alert Management

  • Webhook Notifications: Send alerts to any HTTP/HTTPS endpoint (Discord, Slack, Teams, custom services)
  • Configurable Thresholds: Set custom time-down thresholds before alerts trigger (default: 5 minutes, minimum: 60 seconds)
  • Pause Alerts: Temporarily disable notifications during planned maintenance without losing tracking
  • Smart Alerting: Sends only one notification per downtime event to prevent spam
  • Recovery Notifications: Automatic alerts when services come back online
  • Alert History: Track all service status changes in the database
  • Test Notifications: Verify webhook configuration before going live
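
As a minimal sketch of how the documented threshold limits could be enforced, the helper below falls back to the 300-second default and clamps to the 60-second minimum. The function and constant names are hypothetical, and whether the PR clamps or rejects out-of-range values is an assumption.

```javascript
// Hypothetical helper; names are illustrative and not taken from server/index.js.
const DEFAULT_THRESHOLD_SECONDS = 300; // "default: 5 minutes"
const MIN_THRESHOLD_SECONDS = 60;      // "minimum: 60 seconds"

function normalizeThreshold(value) {
  // Treat missing or empty input as "use the default".
  if (value === null || value === undefined || value === '') {
    return DEFAULT_THRESHOLD_SECONDS;
  }
  const seconds = Number(value);
  // Non-numeric input also falls back to the default.
  if (!Number.isFinite(seconds)) {
    return DEFAULT_THRESHOLD_SECONDS;
  }
  // Assumption: values below the minimum are clamped rather than rejected.
  return Math.max(MIN_THRESHOLD_SECONDS, Math.floor(seconds));
}
```

Clamping keeps a saved setting usable even if the UI sends a value that is too small; a stricter design could return a validation error instead.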

User Interface

Added a new Alerts tab to the configuration page (/config) with:

  • Enable/disable alerts toggle
  • Pause alerts toggle for maintenance windows
  • Webhook URL input field
  • Alert threshold configuration
  • "Send Test Notification" button
  • "Clear Alert History" button
  • Complete webhook payload documentation

Technical Implementation

Backend Changes (server/index.js)

  • Enhanced /api/ping endpoint to track service downtime and trigger notifications (+244 lines)
  • Added handleServiceStatusChange() to detect status transitions
  • Implemented sendWebhookNotification() and sendRecoveryNotification() helper functions
  • Added three new API endpoints:
    • GET /api/alerts/history - Fetch alert history
    • DELETE /api/alerts/history - Clear alert history
    • POST /api/alerts/test - Send test notification
  • Updated /api/settings endpoint to support alert configuration
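
The transition logic described above can be sketched roughly as follows. This is not the PR's handleServiceStatusChange(); it is an illustrative in-memory version, and the function name, settings field names, and the rule that a recovery is only announced after a down alert was sent are all assumptions.

```javascript
// Hypothetical sketch: state is kept in memory here, whereas the PR tracks
// status changes in the alert_history table.
const downSince = new Map(); // service id -> time (ms) of the first failed ping
const alerted = new Set();   // service ids already notified for the current outage

function canNotify(settings) {
  return settings.alertsEnabled && !settings.alertsPaused;
}

// Returns 'service_down', 'service_recovered', or null (nothing to send).
function evaluatePing(serviceId, isUp, nowMs, settings) {
  if (isUp) {
    const wasAlerted = alerted.has(serviceId);
    downSince.delete(serviceId);
    alerted.delete(serviceId);
    // Assumption: only announce recovery if a down alert went out.
    return wasAlerted && canNotify(settings) ? 'service_recovered' : null;
  }
  // Tracking continues even while alerts are paused.
  if (!downSince.has(serviceId)) {
    downSince.set(serviceId, nowMs);
  }
  const downSeconds = (nowMs - downSince.get(serviceId)) / 1000;
  const overThreshold = downSeconds >= settings.alertThresholdSeconds;
  if (overThreshold && !alerted.has(serviceId) && canNotify(settings)) {
    alerted.add(serviceId); // one notification per downtime event
    return 'service_down';
  }
  return null;
}
```

Because the first-failure timestamp is recorded regardless of the pause flag, an outage that begins during a maintenance window can still alert once alerts are resumed.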

Frontend Changes (src/components/config/ConfigurationPage.js)

  • Added Alerts tab with Material-UI components (+157 lines)
  • Integrated with existing settings management
  • Follows established UI patterns and styling
  • Responsive design across all screen sizes

Database Changes (migrations/005_add_alerts.sql)

  • Extended dashboard_config table with 4 new columns:
    • alerts_enabled: Master switch for alerts
    • alerts_paused: Temporary pause toggle
    • webhook_url: Notification destination
    • alert_threshold_seconds: Time before alerting
  • Created alert_history table to track service status changes
  • Added performance indexes for efficient queries
  • Included automated migration script (run_005_migration.sh)

Webhook Payload Format

Notifications are sent as JSON POST requests:

Service Down:

{
  "type": "service_down",
  "service_name": "Web Server",
  "service_ip": "192.168.1.10",
  "service_port": 80,
  "service_url": "http://webserver.local",
  "down_duration_seconds": 300,
  "timestamp": "2024-01-01T12:00:00.000Z",
  "message": "Service \"Web Server\" (192.168.1.10:80) has been down for 300 seconds"
}

Service Recovery:

{
  "type": "service_recovered",
  "service_name": "Web Server",
  "service_ip": "192.168.1.10",
  "service_port": 80,
  "service_url": "http://webserver.local",
  "timestamp": "2024-01-01T12:05:00.000Z",
  "message": "Service \"Web Server\" (192.168.1.10:80) has recovered and is now up"
}
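
For readers writing a receiver or a test double, the two payloads above could be produced by a builder along these lines. The function name and the shape of the `service` argument are hypothetical; only the output fields and message wording come from the examples above.

```javascript
// Hypothetical builder; the input object shape (name, ip, port, url) is assumed.
function buildPayload(type, service, now, downDurationSeconds) {
  const base = {
    type,
    service_name: service.name,
    service_ip: service.ip,
    service_port: service.port,
    service_url: service.url,
  };
  const target = `${service.ip}:${service.port}`;
  if (type === 'service_down') {
    return {
      ...base,
      down_duration_seconds: downDurationSeconds,
      timestamp: now.toISOString(),
      message: `Service "${service.name}" (${target}) has been down for ${downDurationSeconds} seconds`,
    };
  }
  // Recovery payloads carry no duration field.
  return {
    ...base,
    timestamp: now.toISOString(),
    message: `Service "${service.name}" (${target}) has recovered and is now up`,
  };
}
```

A receiver can branch on the `type` field and treat `down_duration_seconds` as optional, since it appears only in the service_down payload.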

Documentation

This PR includes comprehensive documentation:

  • README.md: User-facing feature documentation and setup instructions
  • TESTING_ALERTS.md: Complete testing guide with troubleshooting steps
  • UI_DOCUMENTATION.md: Visual UI layout and component reference
  • IMPLEMENTATION_SUMMARY.md: Technical architecture and design decisions
  • QUICK_REFERENCE.md: Quick start guide and cheat sheet

Setup Instructions

  1. Apply the database migration:

    cd migrations
    chmod +x run_005_migration.sh
    ./run_005_migration.sh
    
  2. Configure alerts:

    • Navigate to /config → Alerts tab
    • Toggle "Enable Alert Notifications"
    • Enter your webhook URL
    • Set alert threshold (default: 300 seconds)
    • Test with "Send Test Notification"
    • Save settings
  3. Enable Service Mode theme:

    • Go to General Settings tab
    • Set Theme Mode to "Service Mode"
    • Save settings

Testing

All changes have been manually tested:

  • Database migration applies successfully
  • UI renders correctly with no console errors
  • Alert configuration saves to database
  • Test notifications send successfully
  • Service down alerts trigger after threshold
  • Recovery notifications work correctly
  • Pause functionality prevents notifications
  • Alert history tracks all status changes
  • No ESLint errors introduced

Performance Impact

  • Minimal database overhead (1 query per ping cycle for settings)
  • Non-blocking webhook calls don't delay service monitoring
  • Efficient status tracking with database indexes
  • Scales well with dozens of monitored services
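
One way to keep webhook delivery from blocking the ping cycle is a fire-and-forget call that swallows and logs errors. The sketch below is an assumption about the approach, not the PR's sendWebhookNotification(); the function names, the injectable `send` and `log` parameters, and the 5-second timeout are hypothetical. It relies on the global fetch and AbortSignal.timeout available in Node.js 18 and later.

```javascript
// Hypothetical default sender using the global fetch (Node.js 18+).
function defaultSend(url, payload) {
  return fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
    signal: AbortSignal.timeout(5000), // assumed timeout, not from the PR
  }).then((res) => {
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
  });
}

// Resolves to true on success and false on failure; it never rejects,
// so a caller can safely skip awaiting it.
function sendWebhookSafely(url, payload, send = defaultSend, log = console.error) {
  return Promise.resolve()
    .then(() => send(url, payload))
    .then(() => true)
    .catch((err) => {
      log(`Webhook delivery failed: ${err.message}`);
      return false;
    });
}
```

Inside a ping handler, calling `sendWebhookSafely(url, payload)` without `await` lets monitoring continue while the request is in flight.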

Compatibility

  • Backward compatible - no breaking changes
  • Works alongside existing features
  • Service Mode theme required (feature locked correctly)
  • Services must have IP/port configured for monitoring
  • Follows existing code patterns and styling

Security Considerations

  • Supports HTTPS for encrypted webhook communication
  • No sensitive credentials in webhook payloads
  • Webhook failures logged but never crash the application
  • Error handling on all async operations
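
Since webhooks are limited to HTTP/HTTPS endpoints, a settings handler could reject other schemes before saving. The check below is a hypothetical sketch using the standard URL class; whether the PR performs this validation is not stated.

```javascript
// Hypothetical validator; not taken from the PR.
function isValidWebhookUrl(value) {
  try {
    const url = new URL(value); // throws on malformed input
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}
```

A deployment that wants encrypted delivery only could tighten the check to accept `https:` alone.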

Closes #[issue_number]

Original prompt

This section details the original issue you should resolve

<issue_title>Add a notification and alerting system</issue_title>
<issue_description>Add support for Alert management and communication. This would integrate with the Service Mode theme only.

Alert Management

  • Service Down notifications through webhooks
  • Custom "time down" threshold for notification
  • Ability to Pause Alerts
    </issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes SluberskiHomeLab/ditdashdot#33


SluberskiHomeLab (Migrated from github.com) reviewed 2025-10-08 08:30:47 -05:00
SluberskiHomeLab (Migrated from github.com) left a comment
Can we merge into STG, please?
SluberskiHomeLab (Migrated from github.com) reviewed 2025-10-08 08:58:40 -05:00
SluberskiHomeLab (Migrated from github.com) left a comment

Discard

Pull request closed
