System Health Monitoring
Learn how to use the system health monitoring features in the Shifts platform to track system performance, detect issues early, and ensure optimal operation of your deployment.
Overview
The System Health Monitoring feature provides system administrators with comprehensive tools to monitor the health and performance of the Shifts platform. This article explains how to access these tools, interpret system health metrics, and respond to potential issues to maintain optimal system performance.
Accessing System Health Monitoring
- Log in to the Shifts platform using a system administrator account
- In the System Admin portal, navigate to System Health in the left sidebar
- The main dashboard displays a comprehensive overview of system health metrics
- Use the tabs to navigate to detailed monitoring sections for specific components
Key Health Metrics Dashboard
The System Health dashboard displays critical status indicators for core system components:
Database Health
- Connection Status: Shows if the database is available and responding
- Performance Metrics: Current database load and query response times
- Record Counts: Overview of record counts for key tables
Redis Cache Health
- Connection Status: Indicates if Redis is operational
- Memory Usage: Current memory consumption
- Connected Clients: Number of active connections
- Version Information: Current Redis version
Server Resource Utilization
- Disk Space: Percentage of free disk space, with a warning once usage reaches 90% (10% free)
- Memory Usage: RAM utilization percentage and total available memory
- CPU Load: Average system load over various time periods
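To cross-check these readings directly on the server, the same three figures can be approximated with Python's standard library. This is a minimal sketch, not the dashboard's own implementation: the function names are illustrative, and only the 90% disk threshold comes from this article.

```python
import os
import shutil

DISK_WARN_PERCENT = 90.0  # warning threshold named in this article


def disk_used_percent(path="/"):
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_status(used_percent, warn_at=DISK_WARN_PERCENT):
    """Classify a usage percentage as 'warning' or 'ok'."""
    return "warning" if used_percent >= warn_at else "ok"


def load_averages():
    """Return the 1, 5 and 15 minute load averages, or None if unsupported."""
    try:
        return os.getloadavg()
    except (AttributeError, OSError):
        # os.getloadavg is not available on every platform (e.g. Windows).
        return None


if __name__ == "__main__":
    used = disk_used_percent("/")
    print(f"disk: {used:.1f}% used ({disk_status(used)})")
    print("load averages:", load_averages())
```

Load averages are easiest to read relative to the number of CPU cores; a value that stays above the core count usually means work is queuing.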
Application Status
- Recent Errors: Count of errors logged in the last 24 hours
- API Performance: Average response time and success rate
- Background Jobs: Status of scheduled and pending jobs
Detailed Monitoring Sections
Database Monitoring
- Click on the Database tab in the System Health dashboard
- View detailed metrics including:
  - Query performance statistics
  - Connection pool status
  - Slow query logs
  - Record growth trends
Redis Monitoring
- Navigate to the Redis tab
- Test Redis connection to specific endpoints using the connection tester
- View detailed Redis metrics including:
  - Memory usage trends
  - Hit/miss ratio
  - Expiration statistics
  - Connection history
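The hit/miss ratio can also be derived by hand from the output of the Redis `INFO` command, whose stats section reports `keyspace_hits` and `keyspace_misses`. The sketch below assumes those counters have already been parsed into a dictionary; how the Shifts dashboard computes its own figure is not documented here, so treat this as an illustration only.

```python
def cache_hit_ratio(info):
    """Return hits / (hits + misses), or None when there were no lookups.

    `info` is a mapping holding the `keyspace_hits` and `keyspace_misses`
    counters from the Redis INFO stats section.
    """
    hits = int(info.get("keyspace_hits", 0))
    misses = int(info.get("keyspace_misses", 0))
    total = hits + misses
    if total == 0:
        return None  # no lookups yet, so a ratio would be meaningless
    return hits / total


# Example values, not real measurements.
sample = {"keyspace_hits": 900, "keyspace_misses": 100}
print(cache_hit_ratio(sample))  # 0.9
```

A ratio that drops over time can indicate keys expiring too early or a cache that is too small for the working set.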
Error Tracking
- Go to the Errors tab
- View a chronological log of system errors
- Filter errors by:
  - Error type
  - Time period
  - Severity level
  - Component
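If error data is processed outside the portal, the same four filters can be applied in a script. The following is a rough sketch: the `ErrorEntry` record and its field values are hypothetical and do not reflect the actual Shifts error schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ErrorEntry:
    """Hypothetical error record; not the actual Shifts schema."""
    timestamp: datetime
    error_type: str
    severity: str
    component: str


def filter_errors(entries, *, error_type=None, severity=None,
                  component=None, since=None):
    """Return matching entries in chronological order.

    Any filter left as None is ignored; `since` keeps entries whose
    timestamp is at or after the given datetime.
    """
    matches = []
    for entry in entries:
        if error_type is not None and entry.error_type != error_type:
            continue
        if severity is not None and entry.severity != severity:
            continue
        if component is not None and entry.component != component:
            continue
        if since is not None and entry.timestamp < since:
            continue
        matches.append(entry)
    return sorted(matches, key=lambda entry: entry.timestamp)


if __name__ == "__main__":
    now = datetime.now()
    log = [ErrorEntry(now - timedelta(hours=3), "Timeout", "critical", "api")]
    # Critical errors from the last 24 hours.
    print(filter_errors(log, severity="critical", since=now - timedelta(hours=24)))
```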
Performance Monitoring
- Access the Performance tab
- Monitor key performance indicators:
  - Page load times
  - API response times
  - Database query speed
  - Background job processing rates
System Analytics Dashboard
For more comprehensive analytics on system usage and performance:
- Navigate to System Analytics in the left sidebar
- Use the tabs to access different analytics views:
  - Overview: General system usage statistics
  - Operations: Server and application performance
  - Performance: Detailed performance metrics
  - Security: Authentication and security analytics
  - Engagement: User activity statistics
Key Analytics Features
- Usage Trends: Charts showing usage patterns over time
- Performance Trends: Graphs of response times and error rates
- Resource Utilization: CPU, memory, and disk usage over time
- User Activity: Patterns of user engagement with the system
- Export Capabilities: Download analytics data in CSV format
Testing System Connections
The system health dashboard provides tools to test connections to critical services:
- Redis Connection Test:
  - Navigate to the Redis tab
  - Enter the endpoint to test
  - Click "Test Connection"
  - View detailed connection results and server information
- Database Verification:
  - Check database connection status on the main dashboard
  - Use the database tab to run verification queries
  - View connection pool statistics
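When the portal itself is unreachable, a basic Redis reachability check can be run from a shell on the server. The sketch below is an independent illustration, not the dashboard's connection tester: it opens a plain TCP connection, sends the Redis inline `PING` command, and expects a `+PONG` reply. The function name and the default timeout are assumptions.

```python
import socket


def redis_ping(host, port, timeout=2.0):
    """Return True if the endpoint answers an inline PING with +PONG.

    Returns False on any connection failure, timeout, or unexpected reply.
    Endpoints that require authentication or TLS will also report False,
    because this sketch sends neither credentials nor a TLS handshake.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    # 127.0.0.1:6379 is the Redis default; replace with the endpoint to test.
    print(redis_ping("127.0.0.1", 6379))
```

A `False` result only shows that the simple check failed; the Redis logs and network path still need to be examined to find the cause.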
Responding to Health Issues
When the system health monitoring detects issues:
- Database Problems:
  - Check database server status
  - Verify connection settings
  - Examine database logs for errors
  - Consider increasing resources if performance issues persist
- Redis Issues:
  - Verify Redis server is running
  - Check network connectivity
  - Examine Redis logs for errors
  - Clear cache if corruption is suspected
- Disk Space Warnings:
  - Review log file sizes and rotation settings
  - Check for large export files that can be archived
  - Remove temporary files and old backups
  - Consider adding storage if approaching capacity
- Memory Usage Alerts:
  - Review application memory settings
  - Check for memory leaks in logs
  - Consider increasing server resources
  - Restart services if memory fragmentation is detected
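For disk space warnings in particular, finding the largest files is usually the quickest way to see whether logs, exports, or backups are responsible. The following sketch walks a directory tree and reports the biggest files. The function name is illustrative and the directory to scan is an assumption, since this article does not state where Shifts stores its logs or exports.

```python
import heapq
import os


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
    return heapq.nlargest(limit, sizes)


if __name__ == "__main__":
    # "." is a placeholder; point this at the log or export directory.
    for size, path in largest_files(".", limit=5):
        print(f"{size / 1_048_576:8.1f} MiB  {path}")
```

The script only reports sizes and deletes nothing, so it is safe to run before deciding what to archive or remove.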
Regular Maintenance Tasks
For optimal system health, perform these regular maintenance tasks:
- Weekly Reviews:
  - Check system health dashboard for warnings
  - Review error logs for recurring issues
  - Monitor disk space usage trends
  - Verify backup processes are functioning
- Monthly Maintenance:
  - Review performance trends for degradation
  - Check database and cache optimization
  - Update health monitoring settings as needed
  - Archive old logs and reports
Best Practices
For optimal results when using System Health Monitoring:
- Set Up Alerts: Configure notifications for critical health metrics
- Regular Checks: Schedule routine reviews of the health dashboard
- Trend Analysis: Monitor metrics over time to identify patterns
- Proactive Maintenance: Address warnings before they become critical
- Documentation: Keep records of system changes and their impact on health metrics
Troubleshooting Common Issues
- Slow API Response: Check database performance and connection pool settings
- High Memory Usage: Review Redis cache configuration and object storage
- Database Connection Errors: Verify database credentials and network connectivity
- Disk Space Warnings: Clean up temporary files and optimize log rotation
Related Resources
This article should be updated when:
- New health monitoring features are added
- The monitoring interface changes
- Additional metrics are tracked
- System requirements or recommended thresholds change