On the 16th of February 2021, late in the evening, one of our developers reported that one of our build pipelines was failing for no apparent reason while fetching our own internal libraries. “That’s strange, nothing should have changed in those libraries in the last week,” I remember thinking.
Investigating further, we noticed that someone had published packages with the same names on PyPI. Three of the main libraries we use in Qentinel Pace (QWeb, QVision and QMobile) had all been created in the public package repository by an unknown account. Because pip consults PyPI by default, the build fetched those public packages instead of the actual libraries from our private repositories. The newly created public packages did not contain our source code, so the builds failed on their dependencies.
At this point we were unsure how to respond to this kind of threat, and we wondered what the account that uploaded those libraries to PyPI was aiming for. Our first guess was that someone had crawled through libraries, noticed that we were using those names and reserved them on PyPI in order to sell them to us.
Incidentally, the article about Dependency Confusion by Alex Birsan had just been circulating on our Slack and Hacker News, and at that point we understood what was going on. We had become the target of a Dependency Confusion attack.
As described by Alex, a Dependency Confusion attack exploits misconfigured build scripts and one-off developer mistakes so that a malicious library is pulled from the public repository instead of the actual library from a private one. The publicly released package then contains malicious code which phones home and may even allow remote code execution.
We promptly contacted Python security to help us remove the malicious libraries from PyPI. We also registered the domains that the packages were supposedly registered from, to prevent the malicious actor from using them to spoof emails, and sent our emails to Python security both from Qentinel and from those formerly malicious domains, now under our control.
With the help of the Python team the malicious packages were removed, and the names of our libraries are now blacklisted on PyPI. We recommend that all our clients purge any caches in their build pipelines that might contain the fake packages and check that their build scripts are configured correctly. If you updated or installed Pace Connect between the 15th and the 17th of February, we recommend that you reinstall the packages.
Going forward, the blacklisting of our library names on PyPI should mitigate the risk of the attack being repeated, but the incident highlights a larger problem in the way code is shared and reused through npm (node package manager), PyPI and other online repositories. Mitigating this kind of attack would require changes in the default behavior of pip and other tools. Currently, when the --extra-index-url parameter is used, pip works as follows:
- Check if the library exists in the specified package index
- Check if the library exists in the public package index
- Install whichever version is found; if the package is found in both, install the one with the higher version number (in this case, the malicious one)
This default behavior opens you up to a Dependency Confusion attack, where an attacker uploads a library with an impossibly high version number to the public repository, for example library 69420.0.0, causing that version to be downloaded. The problem has to be solved at the build pipeline level by changing this default behavior.
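To make the selection rule concrete, here is a minimal sketch of the "highest version wins" logic described above. It is a simplified model for illustration, not pip's actual resolver: the index names, the package name `internal-lib` and the version lists are all made up, and only plain dotted numeric versions are handled.

```python
# Simplified model of "highest version across all indexes wins".
# This is NOT pip's real resolver; all names and versions are hypothetical.


def parse_version(text):
    """Turn a plain dotted version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(name, indexes):
    """Return (index_name, version) for the highest version found in any index."""
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any index")
    # Tuples compare element by element, so the parsed version decides first.
    _, index_name, version = max(candidates)
    return index_name, version


# Hypothetical index contents for illustration only.
private_index = {"internal-lib": ["1.3.0", "1.4.2"]}
public_index = {"internal-lib": ["69420.0.0"]}

both = {"private": private_index, "public": public_index}
private_only = {"private": private_index}

print(pick_candidate("internal-lib", both))          # ('public', '69420.0.0')
print(pick_candidate("internal-lib", private_only))  # ('private', '1.4.2')
```

With both indexes visible, the attacker's inflated version number wins. With only the private index visible, the genuine library is selected.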
If you want to mitigate the problem, you can use --index-url to point pip only at your custom repository. pip will then look for packages only in the custom repository, not in the public one.
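As a rough sketch of what this could look like in a build script, the helpers below build a pip command that uses only --index-url and flag any --extra-index-url lines in a requirements file. The repository URL, the package name and both function names are hypothetical placeholders, not part of our actual pipeline. Note that with this setup the private repository must also be able to serve any third-party dependencies you need, since pip no longer consults the public index.

```python
import sys

# Hypothetical private repository URL, for illustration only.
PRIVATE_INDEX_URL = "https://pypi.example.internal/simple"


def pip_install_command(packages, index_url):
    """Build a pip command that looks for packages only in index_url."""
    return [sys.executable, "-m", "pip", "install",
            "--index-url", index_url, *packages]


def find_extra_index_lines(requirements_text):
    """Return 1-based line numbers that declare --extra-index-url."""
    hits = []
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        if line.strip().startswith("--extra-index-url"):
            hits.append(number)
    return hits


# Hypothetical requirements file content.
sample_requirements = (
    "--index-url https://pypi.example.internal/simple\n"
    "--extra-index-url https://pypi.org/simple\n"
    "internal-lib==1.4.2\n"
)

command = pip_install_command(["internal-lib"], PRIVATE_INDEX_URL)
print(" ".join(command[1:]))
print(find_extra_index_lines(sample_requirements))  # [2]
```

The command is only built and printed here; a pipeline would pass the list to `subprocess.run`. A check like `find_extra_index_lines` could run as an early build step and fail the pipeline when it returns a non-empty list.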
Qentinel will keep looking at this problem and ways to mitigate it on our end, but the answer lies in changing the default behavior of pip and other tools. To our knowledge the packages did not contain any malicious code, but were empty placeholder libraries.
Timeline of events:
2021-02-16T18:23Z Build pipeline problem detected
2021-02-16T18:23Z Packages on PyPI noticed
2021-02-16T20:09Z Content of packages on PyPI checked, no malicious code detected
2021-02-17T09:09Z Removal of malicious packages from PyPI started
2021-02-17T16:55Z PyPI confirms packages removed