Ongoing connectivity issues

Incident Report for Keeping

Postmortem

Postmortem: Service Outage on July 16, 2024

Incident Summary:

At 9:05 am EDT on July 16, 2024, Keeping began experiencing intermittent connection issues that lasted approximately 2 hours. The outage affected both the Keeping Chrome Extension and the main application website at https://app.keeping.com/

Timeline:

  • 9:05 am EDT: Initial reports of the website being inaccessible were received.
  • 9:07 am EDT: Incident response team was alerted and began investigation.
  • 9:25 am EDT: Root cause identified as a denial-of-service attack on Keeping’s hosting provider, Gigalixir.
  • 9:30 am EDT: Mitigation steps initiated.
  • 10:25 am EDT: Website functionality partially restored.
  • 11:00 am EDT: Full service restored and confirmed stable.
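
For readers who want to check the "approximately 2 hours" figure against the timeline, the elapsed time at each step can be worked out with a short sketch like the one below. The timestamps are copied from the timeline above; the dictionary keys and the function name are illustrative labels only, not part of any Keeping or Gigalixir tooling.

```python
from datetime import datetime

# Timestamps taken from the timeline above (all EDT, July 16, 2024).
# The keys are illustrative labels chosen for this sketch.
TIMELINE = {
    "reported": datetime(2024, 7, 16, 9, 5),
    "team_alerted": datetime(2024, 7, 16, 9, 7),
    "root_cause_identified": datetime(2024, 7, 16, 9, 25),
    "mitigation_started": datetime(2024, 7, 16, 9, 30),
    "partially_restored": datetime(2024, 7, 16, 10, 25),
    "fully_restored": datetime(2024, 7, 16, 11, 0),
}


def minutes_since_report(label: str) -> int:
    """Return whole minutes between the first report and the named event."""
    elapsed = TIMELINE[label] - TIMELINE["reported"]
    return int(elapsed.total_seconds() // 60)


if __name__ == "__main__":
    for label in TIMELINE:
        print(f"{label}: +{minutes_since_report(label)} min")
```

By this calculation, full restoration came 115 minutes after the first report, which matches the summary's figure of approximately 2 hours.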

Root Cause:

The outage was caused by a SYN flood attack on Gigalixir’s load balancers. Gigalixir is Keeping’s primary web hosting provider.

Impact:

During the outage, access to Keeping’s main web application was severely limited or entirely unavailable, and syncing between agent accounts and shared mailboxes was paused. No data was lost.

Posted Jul 16, 2024 - 15:27 EDT

Resolved

This incident has been resolved.
Posted Jul 16, 2024 - 14:49 EDT

Monitoring

The Keeping Chrome Extension and web application are experiencing connectivity issues due to an outage with our primary hosting provider. We've remediated the issue and are monitoring.
Posted Jul 16, 2024 - 11:39 EDT
This incident affected: Chrome Extension and Keeping App.