Security researchers have discovered a simple and troubling way for attackers to distribute malicious payloads via the PyPI package repository.
The technique simply involves registering a malicious package on PyPI under the name of a legitimate package that was previously registered but has since been removed from the repository, and then waiting for organizations to download it. Since PyPI does not prohibit the reuse of names of removed packages, it's easy for adversaries to pass off rogue packages as the legitimate ones that were once available on the registry.
Revival Hijack
"The 'Revival Hijack' method can be used by attackers as an easy supply chain attack, targeting organizations and infiltrating a wide variety of environments," researchers at JFrog warned in a report this week. "PyPI users should stay vigilant and make sure their CI/CD machines are not trying to install packages that were already removed from PyPI," they noted, after recently discovering a threat actor using the tactic in an apparent attempt to distribute malware.
The attack method that JFrog discovered is one of several that adversaries have used in recent years to try to sneak malware into enterprise environments via public code repositories such as PyPI, npm, Maven Central, NuGet, and RubyGems. Common tactics have included cloning and infecting popular repositories, poisoning artifacts, and hunting for leaked secrets such as private keys and database certificates to use in attacks.
Threat actors have also attempted to trick developers into accidentally installing malicious packages by exploiting common typing errors or using slight variations in the name of a legitimate package ("g00gle" instead of "google," for instance). Such typosquatting attacks continue unabated, despite efforts by organizations and the maintainers of PyPI and other registries to protect against them.
The challenge with Revival Hijack is that the technique does not rely on a victim making a mistake, as is typically the case with typosquatting and some of the other attack methods. "Updating a 'once safe' package to its latest version is viewed as a safe operation by many users (although it shouldn't!)," JFrog noted. "Many CI/CD machines are already set up to install these packages automatically."
Reusing Abandoned Package Names
According to JFrog, when a developer removes a project from PyPI, the associated package names become immediately available for anyone else to use. This means an attacker can easily hijack the package names and infect any user of the original packages who tries to update to the latest version. Anyone installing a package for the first time on the assumption that it is the original would be similarly affected.
To test the effectiveness of the attack vector, JFrog researchers first created an empty project and published it to PyPI as "revival-package version 1.0.0," using a test "origin_author" account. After publishing the project, the researchers removed it from PyPI and almost immediately published another empty package with the same name to PyPI, but from a different "new_author" account and with a different version number, 4.0.0.
The exercise showed PyPI displaying JFrog's second empty package simply as a new version of the company's original "revival-package" with no indication that it contained very different code. Had JFrog's original package actually been legitimate code that developers were using, a CI/CD system would have downloaded the "new" version on the assumption it was an update.
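Why an automated job would pick up the replacement can be shown with a minimal Python sketch. It is a simplified model, not how pip is implemented: it assumes versions are plain dotted integers (real installers follow the fuller PEP 440 rules), and the function names and release list are hypothetical.

```python
# Simplified model of "install the newest release that matches".
# Assumption: versions are plain dotted integers such as "4.0.0".
# Real installers apply the more detailed PEP 440 ordering rules.


def parse_version(text):
    """Turn '4.0.0' into (4, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_upgrade(installed, candidate):
    """True if `candidate` sorts strictly above the installed version."""
    return parse_version(candidate) > parse_version(installed)


def pick_release(available, pinned=None):
    """Return the version a job would choose from `available`.

    Without a pin, the highest version wins. With a pin, only an exact
    match is accepted, and None is returned if that release is gone.
    """
    if pinned is not None:
        return pinned if pinned in available else None
    return max(available, key=parse_version) if available else None


# Hypothetical index contents after the name was re-registered:
# the original 1.0.0 was removed, so only the new 4.0.0 upload is listed.
revived_index = ["4.0.0"]

print(is_upgrade("1.0.0", "4.0.0"))           # True
print(pick_release(revived_index))            # 4.0.0
print(pick_release(revived_index, "1.0.0"))   # None
```

In this model, a job that simply asks for the latest release receives the new owner's 4.0.0, while a job pinned to 1.0.0 finds no match and fails instead of silently installing different code.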
"After demonstrating that hijacking removed legitimate packages can be easily done, [we] decided to analyze how many packages on PyPI were susceptible to 'Revival Hijack,' meaning that they were previously removed and can now be replaced/hijacked," JFrog said.
A Clear and Present Threat
The JFrog researchers' search showed a staggering 120,000 removed packages that attackers could potentially hijack to sneak malware onto PyPI. When the researchers filtered the results to include only packages that had been active for at least six months or that users had previously downloaded more than 100,000 times, that number dropped to around 22,000 packages.
To prevent adversaries from misusing these abandoned package names, JFrog researchers "hijacked" the most popular of these packages and replaced them with empty ones. They also set the version number on all the empty packages to 0.0.0.1, so that no one using the original packages would accidentally download the empty package as an update.
Despite this precaution, JFrog's empty packages racked up nearly 200,000 automatic and manual downloads over a three-month period, showing that the Revival Hijack threat is very real, the security vendor said. "This seems to indicate that there are outdated jobs and scripts out there which are still looking for the deleted packages, or users that manually downloaded these packages due to typosquatting," JFrog said.
In an actual attack scenario, an adversary would likely have attached a high version number to each hijacked package so that CI/CD systems would automatically download them, believing them to be updates, JFrog said. The company has recommended that PyPI completely prohibit the reuse of abandoned package names. Organizations using PyPI also need to be aware of this attack vector when upgrading to new package versions, JFrog warned.
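One way a team might act on JFrog's advice to make sure CI/CD machines are not requesting removed packages is sketched below in Python. This is an illustration rather than a tool from the JFrog report: it assumes PyPI's public JSON endpoint (https://pypi.org/pypi/<name>/json) answers with HTTP 404 for a project that is not currently registered, and the function names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: PyPI's JSON endpoint returns 404 for names that are not
# currently registered on the index.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def exists_on_pypi(name, timeout=10):
    """Return True if PyPI currently serves a project called `name`."""
    url = PYPI_JSON_URL.format(name=name)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other HTTP errors are not evidence either way


def find_missing(names, exists=exists_on_pypi):
    """Return the dependency names that the index no longer lists.

    `exists` can be swapped out, for example to query a private mirror
    or to test the function without network access.
    """
    return [name for name in names if not exists(name)]
```

A build step could call `find_missing` with the names from its requirements file and stop if the returned list is not empty. The check only catches names that are removed and not yet re-registered. Once someone claims the name again, the lookup succeeds, so it would need to be paired with other controls such as pinned versions and hash verification.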