Latest PyPI attacks show crooks are getting smarter

February 17, 2023

The latest sign that supply-chain attacks on software developers are not a passing trend is the upload of more than 400 malicious packages to PyPI (Python Package Index), the official code repository for the Python programming language.

All 451 of the packages, recently discovered by security company Phylum, carried nearly identical malicious payloads and were uploaded in closely spaced bursts. Once installed, the packages create a malicious JavaScript browser extension that loads every time a browser is opened on the infected device, giving the malware persistence across reboots.

The JavaScript monitors the infected developer’s clipboard for cryptocurrency addresses. When it finds one, the malware replaces it with an address that belongs to the attacker. The goal is to intercept payments the developer intended to send to a different party.

In November, Phylum discovered dozens of packages, downloaded hundreds of times, that secretly performed the same action using heavily encoded JavaScript. Specifically, the JavaScript:

Created a textarea on the page
Pasted any clipboard contents into it
Used a series of regular expressions to search for common cryptocurrency address formats
Replaced any detected addresses in the textarea with attacker-controlled addresses
Copied the contents of the textarea back to the clipboard
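
The regular-expression step in the list above amounts to a find-and-replace over a string. The following Python sketch illustrates that idea on an ordinary string and never touches the clipboard. The two patterns (rough shapes of legacy Bitcoin and Ethereum addresses), the function name, and the placeholder value are assumptions made for illustration; they are not taken from the JavaScript Phylum analysed.

```python
import re

# Assumed, simplified patterns for two common address shapes.
ADDRESS_PATTERNS = {
    "bitcoin_legacy": re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b"),
    "ethereum": re.compile(r"\b0x[a-fA-F0-9]{40}\b"),
}


def swap_addresses(text, replacement="<ATTACKER_ADDRESS>"):
    """Return text with anything shaped like a wallet address replaced."""
    for pattern in ADDRESS_PATTERNS.values():
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = "send funds to 0x" + "ab" * 20
    print(swap_addresses(sample))
```

The same patterns can be used defensively, for example to compare an address after pasting with the one that was originally copied.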

If a compromised developer copies a wallet address at any point, the malware replaces it with an attacker-controlled address, Phylum Chief Technical Officer Louis Lang wrote in a post in November. This covert find-and-replace causes the user to unknowingly send funds to the attacker.

Novel obfuscation technique

The latest campaign not only greatly increases the number of malicious packages uploaded, it also significantly changes how it covers its tracks. Whereas the packages disclosed in November used encoding to hide the behaviour of the JavaScript, the new packages write function and variable identifiers in what appear to be random 16-bit combinations of the Chinese-language ideographs shown in the following table:

UNICODE CODE POINT   IDEOGRAPH   DEFINITION
0x4eba               人          man; people; mankind; someone else
0x5200               刀          knife; old coin; measure
0x53e3               口          mouth; open end; entrance, gate
0x5973               女          woman, girl; feminine
0x5b50               子          child; fruit, seed of
0x5c71               山          mountain, hill, peak
0x65e5               日          sun; day; daytime
0x6708               月          moon; month
0x6728               木          tree; wood, lumber; wooden
0x6c34               水          water, liquid, lotion, juice
0x76ee               目          eye; look, see; division, topic
0x99ac               馬          horse; surname
0x9a6c               马          horse; surname
0x9ce5               鳥          bird
0x9e1f               鸟          bird

Using this table, the line of code

''.join(map(getattr(__builtins__, oct.__str__()[-3 << 0] + hex.__str__()[-1 << 2] + copyright.__str__()[4 << 0]), [(((1 << 4) - 1) << 3) - 1, ((((3 << 2) + 1)) << 3) + 1, (7 << 4) - (1 << 1), ((((3 << 2) + 1)) << 2) - 1, (((3 << 3) + 1) << 1)]))

resolves to the built-in function chr and maps it over the list of integers [119, 105, 110, 51, 50]. It then joins the resulting characters into the string 'win32'.

Phylum researchers explained:

We can see a series of these kinds of calls oct.__str__()[-3 << 0]. The [-3 << 0] evaluates to [-3] and oct.__str__() evaluates to the string '<built-in function oct>'. Using Python’s index operator [] on a string with a -3 will grab the 3rd character from the end of the string, in this case '<built-in function oct>'[-3] will evaluate to 'c'. Continuing with this on the other 2 here gives us 'c' + 'h' + 'r' and simply evaluating the complex bitwise arithmetic tacked on to the end leaves us with:

''.join(map(getattr(__builtins__, 'c' + 'h' + 'r'), [119, 105, 110, 51, 50]))

The getattr(__builtins__, 'c' + 'h' + 'r') just gives us the built-in function chr and then it maps chr to the list of ints [119, 105, 110, 51, 50] and then joins it all together into a string ultimately giving us 'win32'. This technique is continued throughout the entirety of the code.
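
The researchers' walkthrough can be reproduced step by step. The sketch below evaluates each piece of the expression separately. Two substitutions are assumptions made for portability and are not part of the original line: the builtins module is imported explicitly instead of relying on __builtins__, and a literal "Copyright" stands in for copyright.__str__(), since the copyright object is only defined when Python's site module is loaded.

```python
import builtins

# The index expressions are plain integer arithmetic.
assert (-3 << 0) == -3 and (-1 << 2) == -4 and (4 << 0) == 4

# Stand-in (assumption): str(copyright) begins with "Copyright".
copyright_text = "Copyright"

# '<built-in function oct>'[-3] is 'c', '<built-in function hex>'[-4] is 'h',
# and 'Copyright'[4] is 'r'.
name = oct.__str__()[-3 << 0] + hex.__str__()[-1 << 2] + copyright_text[4 << 0]

# The shift-and-add expressions reduce to small integers.
codes = [
    (((1 << 4) - 1) << 3) - 1,
    (((3 << 2) + 1) << 3) + 1,
    (7 << 4) - (1 << 1),
    (((3 << 2) + 1) << 2) - 1,
    ((3 << 3) + 1) << 1,
]

# Look up the function by its assembled name and decode the integers.
decoded = "".join(map(getattr(builtins, name), codes))
print(name, codes, decoded)  # chr [119, 105, 110, 51, 50] win32
```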

The researchers said that although the technique appears to produce highly obfuscated code, it is ultimately easy to defeat by simply observing what the code does when it executes.
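
The ideograph identifiers described in this section are legal because Python accepts Unicode letters in names. That also makes them easy to spot statically. The following sketch, which is one possible check and not Phylum's detection method, parses source code with the standard ast module and reports identifiers containing non-ASCII characters. The function name and the sample source are made up for illustration.

```python
import ast


def non_ascii_identifiers(source):
    """Return the set of identifiers in source that contain non-ASCII characters."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)  # variable reads and writes
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)  # function and class names
        elif isinstance(node, ast.arg):
            names.add(node.arg)  # parameter names
    return {name for name in names if not name.isascii()}


if __name__ == "__main__":
    # Hypothetical sample using ideographs from the table as identifiers.
    sample = "def 人山(月木):\n    水目 = 月木 + 1\n    return 水目\n"
    print(sorted(non_ascii_identifiers(sample)))
```

Non-ASCII identifiers are not malicious in themselves, so a hit from a check like this is only a prompt for closer review.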

The latest batch of malicious packages tries to profit from the typos developers make when installing one of these legitimate packages:

bitcoinlib
ccxt
cryptocompare
cryptofeed
freqtrade
selenium
solana
vyper
websockets
yfinance
pandas
matplotlib
aiohttp
beautifulsoup
tensorflow
scrapy
colorama
scikit-learn
pytorch
pygame
pyinstaller

Packages that target the legitimate vyper package, for instance, used 13 names, each of which omitted or duplicated a single character or transposed two adjacent characters of the correct name:

yper
vper
vyer
vype
vvyper
vyyper
vypper
vypeer
vyperr
yvper
vpyer
vyepr
vypre

The researchers noted: “This method is trivially simple to automate using a script (we leave this as an exercise for the reader), and as the length of the legitimate package’s name grows, so do the potential typosquats. For instance, our system detected 38 typosquats of the cryptocompare package that were published almost simultaneously by the user pinigin.9494.”
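
As a rough take on the exercise the researchers mention, the sketch below generates candidate names using the three edits seen in the vyper list above: omitting a character, duplicating a character, and transposing two adjacent characters. The function name is hypothetical, and there is no claim that the attackers used these exact rules.

```python
def typo_variants(name):
    """Return sorted single-edit typo candidates for a package name."""
    variants = set()
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])        # omit character i
        variants.add(name[:i] + name[i] + name[i:])  # duplicate character i
    for i in range(len(name) - 1):
        # transpose characters i and i + 1
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)  # transposing equal neighbours yields the original
    return sorted(variants)


if __name__ == "__main__":
    print(typo_variants("vyper"))
```

For vyper these rules yield 14 candidates: the 13 names listed above plus vypr, which does not appear in the list.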

Malicious packages whose names closely resemble those of legitimate packages have appeared in legitimate code repositories since at least 2016, when a college student uploaded 214 booby-trapped packages to the PyPI, RubyGems, and NPM repositories under slightly altered names of legitimate packages. The result: the imposter code was executed more than 45,000 times on more than 17,000 distinct domains, and more than half of those executions were granted full administrative rights. Typosquatting attacks, as they are known, have increased since then.

Anyone who intended to install one of the legitimate packages targeted should check that they did not inadvertently install a malicious lookalike.
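
One way to run such a check is to compare installed package names against the targeted names and flag near misses. The sketch below is a minimal example under stated assumptions: the helper names are hypothetical, the target list is the one given in this article, and the 0.75 similarity cutoff is an arbitrary choice. It uses difflib for fuzzy matching and importlib.metadata to list installed distributions.

```python
import difflib
import re
from importlib import metadata

# Package names listed in this article as typosquatting targets.
TARGETED = [
    "bitcoinlib", "ccxt", "cryptocompare", "cryptofeed", "freqtrade",
    "selenium", "solana", "vyper", "websockets", "yfinance", "pandas",
    "matplotlib", "aiohttp", "beautifulsoup", "tensorflow", "scrapy",
    "colorama", "scikit-learn", "pytorch", "pygame", "pyinstaller",
]


def normalize(name):
    """Lowercase and collapse runs of '-', '_' and '.' as PyPI does."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_lookalikes(names, targets=TARGETED, cutoff=0.75):
    """Map each name that is close to, but not equal to, a target name."""
    suspects = {}
    for raw in names:
        name = normalize(raw)
        if name in targets:
            continue  # exact match: the legitimate package
        close = difflib.get_close_matches(name, targets, n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects


def installed_names():
    """Names of distributions installed in the current environment."""
    found = (dist.metadata["Name"] for dist in metadata.distributions())
    return [name for name in found if name]


if __name__ == "__main__":
    for suspect, target in find_lookalikes(installed_names()).items():
        print(f"{suspect} resembles {target}")
```

A fuzzy match like this produces false positives (the legitimate beautifulsoup4 would be flagged as resembling beautifulsoup, for example), so any hit needs manual confirmation.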
