Incident Report: GCP Infrastructure Disruption on Feb 20 (Fully Resolved)

On February 20 at 05:17 KST (20:17 UTC, Feb 19), some of our infrastructure and services experienced an interruption due to an issue on Google Cloud Platform (GCP).

This affected a minor subset of nodes and endpoints on both Mainnet and Kairos, as well as Kaia Safe, Kaiascan, and FeeDelegator.

Important to note: this was not a network outage. Kaia is a decentralized network and continued operating without disruption throughout the incident.

All affected components have been fully recovered as of 09:12 KST (00:12 UTC, Feb 20).

For full details, see our blog post below:


This kind of incident reinforces the urgent need to decentralize node-running infrastructure beyond centralized providers such as GCP. It would be worth discussing cloud-agnostic load-balancing strategies that keep RPC and FeeDelegator highly available even when a single provider has an isolated failure.

This outage reinforces the need to diversify node infrastructure further, beyond centralized cloud providers. It would be worth discussing whether a more robust failover mechanism across different regions or providers would help mitigate these single points of failure at the service layer.
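
To make the failover idea concrete, here is a minimal client-side sketch, not a description of how Kaia's own infrastructure works. It tries a list of RPC endpoints in order and returns the first successful response. The names `call_with_failover`, `AllEndpointsFailed`, and `json_rpc_sender` are hypothetical, the endpoint URLs are placeholders, and the use of the `eth_blockNumber` method assumes the endpoint exposes an Ethereum-compatible JSON-RPC namespace.

```python
import json
import urllib.request
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when every endpoint in the list has failed."""


def call_with_failover(endpoints: Sequence[str], send: Callable[[str], object]) -> object:
    """Try each endpoint in order and return the first successful result."""
    errors = []
    for url in endpoints:
        try:
            return send(url)
        except Exception as exc:  # any error means "move on to the next endpoint"
            errors.append((url, exc))
    raise AllEndpointsFailed(errors)


def json_rpc_sender(method: str, params: list, timeout: float = 3.0) -> Callable[[str], object]:
    """Build a sender that POSTs one JSON-RPC 2.0 request to a given URL."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    ).encode()

    def send(url: str) -> object:
        request = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.load(response)
        if "error" in body:
            raise RuntimeError(body["error"])
        return body["result"]

    return send


# Hypothetical usage: endpoints hosted with different providers (placeholder URLs).
# endpoints = ["https://rpc-provider-a.example.invalid", "https://rpc-provider-b.example.invalid"]
# latest_block = call_with_failover(endpoints, json_rpc_sender("eth_blockNumber", []))
```

Ordering the list so that consecutive endpoints sit with different providers or regions is what gives this any value: a provider-wide outage then costs one timeout instead of taking the client down. Server-side options such as health-checked load balancers or DNS failover are a separate discussion.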

It’s good to see the consensus layer remained resilient despite the GCP instability affecting RPC endpoints. This highlights the importance of diversifying node infrastructure across multiple cloud providers and bare-metal setups to truly mitigate single points of failure for decentralized networks.