AT2k Design BBS Message Area

From: VRSS
To: All
Subject: A Single Point of Failure Triggered the Amazon Outage Affecting
Date/Time: October 24, 2025 5:40 PM

Feed: Slashdot
Feed Link: https://slashdot.org/
---

Title: A Single Point of Failure Triggered the Amazon Outage Affecting Million

Link: https://slashdot.org/story/25/10/24/2212255/a...

An anonymous reader quotes a report from Ars Technica: The outage that hit
Amazon Web Services and took out vital services worldwide was the result of a
single failure that cascaded from system to system within Amazon's sprawling
network, according to a post-mortem from company engineers. [...] Amazon said
the root cause of the outage was a bug in the software running the
DynamoDB DNS management system. The system monitors the stability of load
balancers by, among other things, periodically creating new DNS
configurations for endpoints within the AWS network. A race condition is an
error that makes a process dependent on the timing or sequence of events that
are variable and outside the developers' control. The result can be
unexpected behavior and potentially harmful failures. In this case, the race
condition resided in the DNS Enactor, a DynamoDB component that constantly
updates domain lookup tables in individual AWS endpoints to optimize load
balancing as conditions change. As the enactor operated, it "experienced
unusually high delays needing to retry its update on several of the DNS
endpoints." While the enactor was playing catch-up, a second DynamoDB
component, the DNS Planner, continued to generate new plans. Then, a separate
DNS Enactor began to implement them. The timing of these two enactors
triggered the race condition, which ended up taking out the entire DynamoDB.
[...] The failure caused systems that relied on the DynamoDB in Amazon's
US-East-1 regional endpoint to experience errors that prevented them from
connecting. Both customer traffic and internal AWS services were affected.
The damage resulting from the DynamoDB failure then put a strain on Amazon's
EC2 services located in the US-East-1 region. The strain persisted even after
DynamoDB was restored, as EC2 in this region worked through a "significant
backlog of network state propagations needed to be processed." The engineers
went on to say: "While new EC2 instances could be launched successfully, they
would not have the necessary network connectivity due to the delays in
network state propagation." In turn, the delay in network state propagations
spilled over to a network load balancer that AWS services rely on for
stability. As a result, AWS customers experienced connection errors from the
US-East-1 region. AWS network functions affected included creating and
modifying Redshift clusters, Lambda invocations, and Fargate task launches
such as Managed Workflows for Apache Airflow, Outposts lifecycle operations,
and the AWS Support Center. Amazon has temporarily disabled its DynamoDB DNS
Planner and DNS Enactor automation globally while it fixes the race condition
and adds safeguards against incorrect DNS plans. Engineers are also updating
EC2 and its network load balancer. Further reading: Amazon's AWS Shows Signs
of Weakness as Competitors Charge Ahead
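
The summary does not say exactly how the two enactors interfered with each
other, so the following is only a minimal sketch of one common shape of such
a race: a delayed writer applying an older plan after a newer one is already
in place. Every name here (Endpoint, apply_unguarded, apply_guarded,
simulate, the version numbers and addresses) is hypothetical and is not taken
from Amazon's post-mortem. The guarded variant shows one possible safeguard,
a version check, and should not be read as Amazon's actual fix.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical stand-in for one DNS endpoint's active configuration."""
    applied_version: int = 0
    records: list = field(default_factory=list)


def apply_unguarded(endpoint, version, records):
    # Last writer wins: nothing stops an old plan from replacing a newer one.
    endpoint.applied_version = version
    endpoint.records = list(records)
    return True


def apply_guarded(endpoint, version, records):
    # Assumed safeguard: refuse any plan that is not strictly newer than
    # the one already in place.
    if version <= endpoint.applied_version:
        return False
    endpoint.applied_version = version
    endpoint.records = list(records)
    return True


def simulate(apply):
    """Replay one fixed interleaving in which the delayed enactor writes last."""
    endpoint = Endpoint()
    old_plan = (1, ["10.0.0.1"])
    new_plan = (2, ["10.0.0.2", "10.0.0.3"])
    # The second enactor picks up the newer plan and finishes first.
    apply(endpoint, *new_plan)
    # The first enactor, delayed by retries, now applies the plan it started with.
    apply(endpoint, *old_plan)
    return endpoint


if __name__ == "__main__":
    print(simulate(apply_unguarded).applied_version)  # 1: stale plan won
    print(simulate(apply_guarded).applied_version)    # 2: stale plan rejected
```

The interleaving is replayed in a fixed order rather than with real threads
so the outcome is reproducible. In a live system the same ordering would only
appear occasionally, which is what makes this class of bug hard to catch.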

