Degraded performance
Incident Report for Sympa
Postmortem

On Wednesday, November 20th, between 14:00 and 16:50 Finnish time, our system experienced significant performance degradation. Users encountered slower operations, delays, and higher error rates. Sympa Classic customers experienced even greater problems during this period. The system stabilized after 16:50 following a mitigation action.

Root Cause

The issue stemmed from a database problem: CPU utilization on our primary database stayed at peak levels for over two hours. The underlying cause was identified as broken database index statistics. This disrupted query optimization, causing a system-wide slowdown for all database queries.

Resolution

Once the root cause was identified, we rebuilt the table statistics in the database. The system recovered within minutes of completing this operation.
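
For readers unfamiliar with this kind of operation, the following is a minimal sketch of what rebuilding statistics looks like. This report does not name the database engine involved, so the sketch uses SQLite (via Python's built-in sqlite3 module) purely as a stand-in. The table name, index name, and rebuild_statistics helper are hypothetical and are not taken from Sympa's system.

```python
import sqlite3


def rebuild_statistics(conn):
    """Recompute the query planner's statistics and return what was stored.

    In SQLite, ANALYZE gathers statistics about tables and indexes and
    writes them to the internal sqlite_stat1 table, which the query
    planner consults when choosing between indexes.
    """
    conn.execute("ANALYZE")
    return conn.execute(
        "SELECT tbl, idx, stat FROM sqlite_stat1 ORDER BY tbl, idx"
    ).fetchall()


# Hypothetical table and index, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, department TEXT)")
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")
conn.executemany(
    "INSERT INTO employee (department) VALUES (?)",
    [(f"dept-{i % 10}",) for i in range(1000)],
)
conn.commit()

# Each row is (table name, index name, statistics string).
stats = rebuild_statistics(conn)
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

Other database engines expose the same idea through their own commands, and the exact procedure used during this incident may have differed from this sketch.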

Next Steps

  1. Improve Query Efficiency: Refactor the code to prevent future database jams by optimizing the database query that had the issue.
  2. Enhance Monitoring: Improve monitoring of the database.
  3. Faster Recovery Protocols: Apply the lessons from this incident to ensure quicker recovery in case of recurrence.

Apologies and Acknowledgments

We deeply regret the issue. This was an unfortunate situation, and we sincerely apologize for the disruption caused to customers. Rest assured, Sympa Engineers are committed to implementing the necessary safeguards to prevent similar occurrences in the future.

Posted Nov 22, 2024 - 08:55 EET

Resolved
The issue that caused slowness in Sympa has now been resolved. All services should be fully accessible. If you continue to experience any problems, please contact our support service. Thank you for your patience.
Posted Nov 20, 2024 - 18:02 EET
Identified
We have identified an issue in Sympa that may cause slowness. We are working to resolve the problem and will provide an update when it is confirmed to be fully resolved.
Posted Nov 20, 2024 - 14:51 EET
This incident affected: System availability.