Cloudflare Outage Technical Analysis: The Lua Bug That Stopped 28% of the Internet
25 Minutes of "500 Internal Server Error"
On December 5, 2025, at 08:47 UTC, a significant portion of the internet held its breath. For approximately 25 minutes, users worldwide trying to access services protected by Cloudflare were greeted with HTTP 500 errors. The outage affected about 28% of all HTTP traffic served by the CDN giant.
Interestingly, and worth emphasizing from the start: this was not a cyberattack. The cause was a configuration change intended to... increase security.
Context: Patching the React Vulnerability (CVE-2025-55182)
It all started with good intentions. Cloudflare was working on mitigating a critical vulnerability in React Server Components (CVE-2025-55182). To reliably detect malicious payloads associated with this flaw, engineers needed to increase the content analysis buffer in their Web Application Firewall (WAF) from the default 128KB to 1MB (matching the limit used by Next.js).
Sequence of Events: Deployment and Rollback
The buffer size change was being rolled out gradually. During the rollout, however, engineers noticed that an internal WAF testing tool could not handle the larger buffer. Since this tool was not critical for customer traffic, they decided to disable it.
This was the critical moment. The testing tool was disabled via the global configuration system, which propagates changes to the entire server fleet almost instantly, unlike the gradual rollout used for code changes.
The Logic Bug in Lua
The change reached servers running the older proxy version (FL1), based on NGINX and Lua. Disabling the test rule (the "killswitch" action) made the rule engine skip execution of one section of code, but subsequent logic still attempted to reference its results.
This led to a classic error:
[lua] ... attempt to index field 'execute' (a nil value)
Simply put: the code expected a rule_result.execute object, which did not exist because the rule had been "killed" before execution. This unguarded attempt to index a nil (null) value caused 500 errors to spew out for every customer using Managed Rulesets on the FL1 proxy.
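To make the failure mode concrete, here is a minimal Lua sketch of the pattern. The rule structure, field names, and killswitch flag are illustrative assumptions, not Cloudflare's actual FL1 code:

```lua
-- Minimal sketch of the failure pattern in plain Lua (hypothetical names,
-- not Cloudflare's actual FL1 code).

-- A killswitched rule is skipped, so its result table has no 'execute' field.
local function run_rule(rule)
  if rule.killswitch then
    return { id = rule.id }            -- no 'execute' field at all
  end
  return { id = rule.id, execute = { matched = false } }
end

local function evaluate(rules)
  for _, rule in ipairs(rules) do
    local rule_result = run_rule(rule)
    -- BUG: assumes 'execute' always exists. When the rule was killswitched,
    -- rule_result.execute is nil, so indexing it raises
    -- "attempt to index field 'execute' (a nil value)" and the request
    -- ends in an HTTP 500.
    if rule_result.execute.matched then
      return "block"
    end
  end
  return "pass"
end

-- Triggers the error:
evaluate({ { id = "waf_test_rule", killswitch = true } })
```

In dynamically typed Lua, nothing forces the caller to check whether execute is present before dereferencing it, which is how a bug like this can sit dormant until the killswitch path is actually exercised in production.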
Conclusions and a Future in Rust
Cloudflare quickly identified the issue and reverted the change at 09:12 UTC, fully restoring network functionality.
It is worth noting that Cloudflare's newer proxy (FL2), written in Rust, was completely immune to this error: its strict type system forces a possibly missing value to be handled explicitly, so this class of bug is caught at compile time. The company has announced it will accelerate the migration to Rust and implement "Fail-Open" mechanisms, so that a configuration error results in traffic being passed (potentially without scanning) rather than being blocked outright.
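The fail-open idea can be sketched in a few lines of Lua as well. This is a conceptual illustration only; the function names and the simulated error are assumptions, not Cloudflare's announced implementation:

```lua
-- Illustrative "fail-open" guard (conceptual sketch, not Cloudflare's code).
-- 'evaluate' stands in for a rule-evaluation routine that may raise a
-- runtime error such as the nil-index bug described above.
local function evaluate(rules)
  error("attempt to index field 'execute' (a nil value)")
end

local function evaluate_with_fail_open(rules)
  -- pcall traps the runtime error instead of letting it bubble up as a 500.
  local ok, verdict = pcall(evaluate, rules)
  if not ok then
    -- 'verdict' holds the error message when pcall fails; under OpenResty
    -- this would typically be logged with ngx.log(ngx.ERR, ...).
    print("WAF evaluation failed, failing open: " .. tostring(verdict))
    return "pass"  -- fail open: let traffic through (unscanned) rather than block it
  end
  return verdict
end

print(evaluate_with_fail_open({}))  --> prints the warning, then "pass"
```

The design trade-off is explicit: a failure in the scanning layer temporarily weakens protection instead of taking customer sites offline.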
The irony: an attempt to protect against one vulnerability (in React) ended up causing a global outage because of a code bug that had sat unnoticed for years.
Aleksander
About the Author

Director of Technology at SecurHub.pl
PhD candidate in cognitive neuroscience. Psychologist and IT expert specializing in cybersecurity.