Unknown Service Error in Dev and Prod
Incident Report for Blameless
Resolved
Restarting key services appears to have mitigated the issue. We will perform a root cause analysis (RCA).
Posted May 17, 2019 - 19:35 UTC
Update
We are attempting a restart of key services to mitigate the problem.
Posted May 17, 2019 - 17:46 UTC
Update
The initial theory was that the large queues for deprecated services were contributing to the message publishing failures. Deleting these queues has not resolved the issue. The engineering team is continuing to investigate.
Posted May 17, 2019 - 16:13 UTC
Update
Some dev and prod instances are in a degraded state due to an ongoing RabbitMQ issue. The affected instances are experiencing a partial to full service outage. We are applying several mitigation strategies.
Posted May 17, 2019 - 15:41 UTC
Update
Some dev and prod instances are in a degraded state due to an ongoing RabbitMQ issue. The affected instances are experiencing a partial to full service outage. The engineering team is investigating.
Posted May 17, 2019 - 15:10 UTC
Investigating
Some customers may be experiencing problems logging in to the Blameless Platform.
Posted May 17, 2019 - 15:04 UTC