Intel DOWNFALL: New Vulnerability Affecting AVX2/AVX-512 With Big Performance Implications - Intel CPUs from Tiger Lake / Ice Lake back to Skylake are confirmed to be impacted. Performance impact in certain workloads of up to 50%.

Link Archive (Wayback Machine)

This Patch Tuesday brings a new and potentially painful processor speculative execution vulnerability... Downfall, or as Intel prefers to call it, GDS: Gather Data Sampling. GDS/Downfall affects the gather instructions on AVX2 and AVX-512 enabled processors. The latest-generation Intel CPUs are not affected, but Tiger Lake / Ice Lake back to Skylake are confirmed to be impacted. A microcode mitigation is available, but it will be costly for AVX2/AVX-512 workloads with gather instructions in hot code paths. Software exposure is therefore widespread, particularly for HPC and other compute-intensive workloads that have relied on AVX2/AVX-512 for better performance.

Downfall is characterized as a vulnerability due to a memory optimization feature that unintentionally reveals internal hardware registers to software. With Downfall, untrusted software can access data stored by other programs that typically should be off-limits: the AVX GATHER instruction can leak the contents of the internal vector register file during speculative execution. Downfall was discovered by security researcher Daniel Moghimi of Google. Moghimi has written demo code for Downfall to show 128-bit and 256-bit AES keys being stolen from other users on the local system as well as the ability to steal arbitrary data from the Linux kernel.
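For readers unfamiliar with the instruction: a gather loads several elements from non-contiguous memory locations into the lanes of one vector register in a single operation. The sketch below models only that architectural behaviour in plain Python. It is not the exploit, and the function name `gather` and the sample data are made up for illustration.

```python
def gather(memory, base, indices):
    """Model a vector gather: lane i receives memory[base + indices[i]]."""
    return [memory[base + offset] for offset in indices]


# Pretend memory holding the values 100, 101, 102, ...
memory = list(range(100, 200))

# Eight scattered offsets, loaded "at once" into eight lanes.
lanes = gather(memory, 10, [0, 7, 3, 42, 1, 1, 20, 5])
print(lanes)
```

The vulnerability described above concerns what the hardware does internally while servicing such a load during speculative execution, not the visible result modelled here.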

Processors from Skylake through Tiger Lake on the client side, and through Xeon Scalable Ice Lake on the server side, are confirmed to be affected. At least the latest Intel Alder Lake / Raptor Lake and Intel Xeon Scalable Sapphire Rapids are not vulnerable to Downfall. But for all the affected generations, CPU microcode is being released today to address this issue.

Intel acknowledges that its microcode mitigation for Downfall can hurt performance where gather instructions are in an application's hot path. Given the AVX2/AVX-512 connection, vectorization-heavy HPC workloads are likely to be most impacted, but we've also seen a lot of AVX use in video encoding/transcoding, AI, and other areas. Intel has not relayed any estimated performance impact from this mitigation. Well, not to the press. To other partners Intel has reportedly communicated a performance impact of up to 50%. That is for workloads with heavy gather instruction use as part of AVX2/AVX-512. Intel is being quite proactive in letting customers know they can disable the microcode change if they feel they are not going to be impacted by Downfall. Intel also believes pulling off a Downfall attack in the real world would be a very difficult undertaking. However, those matters are subject to debate.

Intel's general statement on Downfall is:

"The security researcher, working within the controlled conditions of a research environment, demonstrated the GDS issue which relies on software using Gather instructions. While this attack would be very complex to pull off outside of such controlled conditions, affected platforms have an available mitigation via a microcode update. Recent Intel processors, including Alder Lake, Raptor Lake and Sapphire Rapids, are not affected. Many customers, after reviewing Intel's risk assessment guidance, may determine to disable the mitigation via switches made available through Windows and Linux operating systems as well as VMMs. In public cloud environments, customers should check with their provider on the feasibility of these switches."

Meanwhile Daniel Moghimi with his Downfall site characterizes Downfall's impact as:

"GDS is highly practical. It took me 2 weeks to develop an end-to-end attack against encryption keys. It only requires the attacker and victim to share the same physical processor core, which frequently happens on modern-day shared computing infrastructure, implementing preemptive multitasking and simultaneous multithreading."

And then in regards to disabling the forthcoming Downfall mitigations:

"This is a terrible idea. Even if your workload does not use vector instructions, modern processors rely on vector registers to optimize common operations, such as copying memory and switching register content, which leaks data to untrusted code exploiting Gather."

Raising more alarm bells is that Daniel reported this issue to Intel all the way back in August 2022... Yes, it has taken about a year from the initial report for this vulnerability to be made public.

The updated Intel CPU microcode should be posted in the coming minutes as well as the Linux kernel patch(es) that will allow optionally disabling the mitigation on systems running this forthcoming CPU microcode. Intel's official security disclosure should be available here. The Downfall website is downfall.page.
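Once a kernel carrying the GDS patches is installed, Linux reports the state of the mitigation through sysfs. The following is a minimal sketch for reading it, assuming the file `/sys/devices/system/cpu/vulnerabilities/gather_data_sampling` (the path used by the upstream kernel, not something stated in the article); on older kernels the file does not exist, and the helper name `read_gds_status` is hypothetical.

```python
from pathlib import Path

# Assumed location: present only on kernels that include the GDS patches.
GDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def read_gds_status(path=GDS_STATUS):
    """Return the kernel's GDS status string, or None if it is not reported."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_gds_status()
    print(status if status is not None else "kernel does not report GDS status")
```

The kernel returns a short human-readable string such as `Not affected` or `Mitigation: Microcode`; treat the exact wording as kernel-version dependent.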

Intel was quite proactive in their outreach ahead of today's Downfall embargo lift. While they provided some insight and their public responses to this vulnerability, they hadn't provided any benchmark result expectations. I also requested early access to the CPU microcode updates to allow time to independently verify the performance impact of the mitigation. Unfortunately, they were not able to provide the CPU microcode in advance. However, I've already spent days preparing fresh AVX benchmarks on the current microcode to look at the performance implications. With the microcode release today, I will now be running the post-mitigation benchmarks and as soon as this evening should have some preliminary results to share...

Gather Data Sampling / Downfall is perhaps the most concerning CPU security vulnerability we've seen in a few years now if there are indeed performance penalties of up to 50% for AVX workloads with heavy gather instruction use... The year-long delay in disclosing GDS to the public, and Intel's communications prominently bringing up the fact that the mitigation can be disabled with upcoming Linux and Windows patches, lend further weight to this mitigation being quite costly. Stay tuned to Phoronix for initial benchmark results shortly.
 
If this is like the last speculative execution vuln, it will be a bunch of speculative garbage FUD that does not work outside of a lab. Either way, more cheap servers on the used market.
 
Exploit webpage. This has some videos. The exploit would affect virtual machines, VPSes, and other sandboxing when programs have access to the same core.

Important excerpts from the webpage:
[Q] What can a hacker do with this?

[A] A hacker can target high-value credentials such as passwords and encryption keys. Recovering such credentials can lead to other attacks that violate the availability and integrity of computers in addition to confidentiality.

[Q] How practical are these attacks?

[A] GDS is highly practical. It took me 2 weeks to develop an end-to-end attack stealing encryption keys from OpenSSL. It only requires the attacker and victim to share the same physical processor core, which frequently happens on modern-day computers, implementing preemptive multitasking and simultaneous multithreading.

[Q] Is Intel SGX also affected?

[A] In addition to normal isolation boundaries, e.g., virtual machines, processes, user-kernel isolation, Intel SGX is also affected. Intel SGX is a hardware security feature available on Intel CPUs to protect user’s data against all forms of malicious software.

[Q] What about web browsers?

[A] In theory, remotely exploiting this vulnerability from the web browser is possible. In practice, demonstrating successful attacks via web browsers requires additional research and engineering efforts.

[Q] How long have users been exposed to this vulnerability?

[A] At least nine years. The affected processors have been around since 2014.

[Q] Is there a way to detect Downfall attacks?

[A] It is not easy. Downfall execution looks mostly like benign applications. Theoretically, one could develop a detection system that uses hardware performance counters to detect abnormal behaviors like excessive cache misses. However, off-the-shelf antivirus software cannot detect this attack.

You should take the potential impact estimated by these disclosure websites with a grain of salt. What works in a lab is not necessarily going to work in the real world.

Red Hat Article (important part bolded)

Performance Impact

The performance impact of the microcode mitigation is limited to applications that use the gather instructions provided by Intel Advanced Vector Extensions (AVX2 and AVX-512) and the CLWB instruction. Actual performance impact will depend on how heavily an application uses those instructions. Red Hat’s internal performance testing of a worst-case microbenchmark showed a significant slowdown. However, more realistic applications that utilize vector gathering showed only low single-digit percentage slowdowns.
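The gap between a worst-case microbenchmark and realistic applications follows from how small a share of a typical program's runtime is spent in gather instructions. Here is a rough Amdahl's-law-style estimate; the fractions and the 2x gather slowdown are illustrative assumptions, not measurements from Red Hat or Intel, and `overall_slowdown` is a hypothetical helper.

```python
def overall_slowdown(gather_fraction, gather_slowdown):
    """Runtime after mitigation, relative to runtime before (1.0 = unchanged).

    gather_fraction: share of the original runtime spent in gather-bound code (0..1)
    gather_slowdown: factor by which that share gets slower (>= 1)
    """
    return (1.0 - gather_fraction) + gather_fraction * gather_slowdown


# Assumed: gather-bound code takes twice as long once the mitigation is active.
for fraction in (0.02, 0.25, 1.0):
    relative = overall_slowdown(fraction, 2.0)
    print(f"{fraction:.0%} gather-bound -> {100 * (relative - 1):.0f}% longer runtime")
```

Under these assumptions, only a program that does nothing but gathers sees its runtime double (throughput halved); a program spending a couple of percent of its time in gathers slows down by a couple of percent.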

If the user decides to disable the mitigation after doing a thorough risk analysis (for example the system isn’t multi-tenant and doesn’t execute untrusted code), the user can disable the mitigation. After applying the microcode and kernel updates, the user can disable the mitigation by adding gather_data_sampling=off to the kernel command line.
Alternatively, to disable all CPU speculative execution mitigations, including GDS, use mitigations=off.
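To audit whether a machine was booted with either switch, the kernel command line can be inspected. This is a minimal sketch: the helper name `gds_mitigation_disabled` and the sample command line are hypothetical, and it only checks for the two parameters named above (`gather_data_sampling=off` and `mitigations=off`).

```python
def gds_mitigation_disabled(cmdline):
    """True if the kernel command line asks to turn the GDS mitigation off."""
    params = cmdline.split()
    return "gather_data_sampling=off" in params or "mitigations=off" in params


# Made-up command line for illustration only.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet gather_data_sampling=off"
print(gds_mitigation_disabled(example))
```

On a running Linux system the string to pass in is the contents of `/proc/cmdline`. Note that this shows what was requested at boot, not whether the microcode update is actually loaded.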

For more information about this issue, please consult the following Intel material:
* Gather Data Sampling Technical Paper
* Intel Security Advisory INTEL-SA-00828
 
Don't CPUs already have backdoors anyway? If you are going to let the CIA steal your shit, may as well let everyone have a go.
 