⚠️ Zip-Domains: Why They Were a Bad Idea ⚠️

On May 3, 2023, Google introduced eight new Top-Level Domains (TLDs), including .zip, a move that was both innovative and controversial. But why has .zip caused such a stir in the cybersecurity world?

The .zip Challenge

For many of us, ".zip" immediately brings compressed archive files to mind. Herein lies the problem: with the new .zip TLD, users can easily mistake a malicious link for a harmless download. This opens the door for phishing attacks, where cybercriminals try to make us click on malicious links disguised as legitimate downloads.

TLDs vs. File Extensions: A Brief Explanation

To understand why .zip as a TLD is problematic, we should clarify two fundamental terms: TLDs and file extensions.

For those not juggling technical terms every day, imagine TLDs as the internet's postal codes: they help categorize websites into certain "areas." File extensions, on the other hand, resemble the types of buildings in these areas – whether it's a house (.doc) or a business (.pdf). Both are essential for navigation, yet their confusion can lead to misunderstandings. Now, in a technical context:

Top-Level Domains (TLDs) are used to structure and categorize internet addresses. They serve to group websites thematically or geographically. For example, country-code TLDs such as .de and .it are assigned to Germany and Italy, .com is a generic TLD that is not tied to any country, and specialized TLDs like .edu stand for educational institutions or .gov for government agencies. There are also other regional TLDs, generic TLDs, and special TLDs for various industries and purposes.

File extensions, on the other hand, are the combination of letters at the end of a file name - e.g., .doc or .pdf.

These serve to indicate the file type or format of a digital file. They allow the operating system or applications to correctly identify and process the file. For instance, ".docx" represents a Microsoft Word document, while ".jpg" indicates an image in JPEG format.

TLDs and file extensions are by no means the same, not even similar, but both play an important role in modern attacks.

A Dangerous Potential for Confusion

The mixing of these two concepts opens a dangerous potential for confusion and deception. Cybercriminals are masters at exploiting our expectations to trap us.

Email attachments are a familiar example. Attackers commonly send a file named "attachment.pdf.exe": the user expects a harmless PDF but instead opens an .exe file, an executable that can contain malicious code and runs directly on the computer.
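The double-extension trick can be checked for mechanically. The following is a minimal sketch, assuming a hypothetical and deliberately short list of executable extensions; the function name and the list are illustrative and not taken from any particular mail filter.

```python
from pathlib import PurePath

# Hypothetical, incomplete list of extensions that can run code on Windows.
EXECUTABLE_SUFFIXES = {".exe", ".scr", ".bat", ".cmd", ".js", ".vbs"}


def has_disguised_extension(filename: str) -> bool:
    """Return True if an executable extension follows another extension."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-1] in EXECUTABLE_SUFFIXES


print(has_disguised_extension("attachment.pdf.exe"))  # True
print(has_disguised_extension("attachment.pdf"))      # False
```

This is only a heuristic: a harmless file name that happens to contain two dots, such as "setup.v2.exe", would be flagged as well.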

With domain names, attackers exploit visual similarity or embed well-known words: www.securitywho.com and www.securitywh0.com look alike at a first, fleeting glance at the browser's address bar.

A Practical Example: Small Difference, Big Impact!

What do you get when attackers mix these up?

Security researcher Bobby Rauch gives an excellent example in his article "The Dangers of Google's .zip TLD", a must-read that covers the technical details behind the issue.

https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.27.1.zip

https://github.com∕kubernetes∕kubernetes∕archive∕refs∕tags∕@v1.27.1.zip

Which of these two links from Bobby's article is the phishing link?

The second one.

The first link would download a file from the domain github.com, while the second would open a web page on the host v1.27.1.zip, which is controlled by the attacker.
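This can be checked with a URL parser. The sketch below uses Python's standard urllib.parse, which is not the parser browsers use but agrees with them on these two links. The second link is written with "\u2215" escapes so that the unusual character it contains is visible in the code.

```python
from urllib.parse import urlsplit

links = [
    # Link 1: ordinary slashes (U+002F) throughout.
    "https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.27.1.zip",
    # Link 2: the slashes after "github.com" are U+2215, and there is an "@".
    "https://github.com\u2215kubernetes\u2215kubernetes\u2215archive"
    "\u2215refs\u2215tags\u2215@v1.27.1.zip",
]

# The hostname is the part of the URL the browser actually connects to.
hosts = [urlsplit(link).hostname for link in links]

for host in hosts:
    print(host)
# github.com
# v1.27.1.zip
```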

Having established this - what is the difference between the two links?

@

The inconspicuous @ sign.

In a URL, the "@" sign traditionally separates login information (a username and, optionally, a password) from the host when accessing a protected resource. This format is used less and less as more modern authentication methods are preferred, but browsers still understand it. Thus, everything between "https://" and the @ is treated as user information rather than as part of the address, and the actual host is whatever follows the @.
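As a small illustration of this legacy syntax, the sketch below parses a made-up URL with Python's urllib.parse. The username, password, and example.com host are hypothetical and exist only for this example.

```python
from urllib.parse import urlsplit

# Hypothetical URL carrying credentials in front of the "@".
parts = urlsplit("https://alice:secret@example.com/private/report")

print(parts.username)  # alice
print(parts.password)  # secret
print(parts.hostname)  # example.com
```

Only the hostname decides where the request goes; the text before the @ is just data attached to the request.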

But that's only half the truth - the following characters look confusingly similar, don't they?

∕ /

Even side by side, they are hardly distinguishable. The second character is the ordinary slash, U+002F, while the first is U+2215, the division slash, which merely looks like it. Phishing URLs often rely on such Unicode look-alikes: to the user they pass for the legitimate character, but to the browser they are something else entirely.
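The difference becomes obvious once the characters are inspected instead of looked at. A minimal sketch using Python's standard unicodedata module:

```python
import unicodedata

# The look-alike first, then the ordinary slash.
characters = ["\u2215", "/"]

for ch in characters:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
# U+2215  DIVISION SLASH
# U+002F  SOLIDUS
```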

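A legitimate slash before the @ sign would not work: it ends the host portion of the URL, so the browser would treat github.com as the host and everything after it, including the @, as part of the path. The look-alike character, however, is not a separator, so browsers accept it as part of the authentication portion in front of the @. This is the workaround attackers have found.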
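The effect of swapping the slash can be reproduced with a parser. The sketch below again uses Python's urllib.parse as a stand-in for a browser; browsers follow their own URL rules, so treat this as an illustration rather than proof of browser behavior. The variable names are illustrative.

```python
from urllib.parse import urlsplit

prefix = "github.com/kubernetes/kubernetes/archive/refs/tags/"

# Variant A: real slashes before the "@". The first slash ends the host.
real = urlsplit("https://" + prefix + "@v1.27.1.zip")

# Variant B: every slash in the prefix replaced by U+2215 (division slash).
fake = urlsplit("https://" + prefix.replace("/", "\u2215") + "@v1.27.1.zip")

print(real.hostname)  # github.com
print(real.path)      # /kubernetes/kubernetes/archive/refs/tags/@v1.27.1.zip
print(fake.hostname)  # v1.27.1.zip
```

With real slashes the @ ends up harmlessly inside the path on github.com. Only the look-alike version moves the host to v1.27.1.zip.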

What do we learn from this?

Yes, the decision to introduce .zip as a Top-Level Domain may not have been the wisest in hindsight. It brings a number of security risks that may not have been fully considered before its introduction. However, it's important to recognize that businesses are not defenseless. By proactively blocking web access and DNS resolution for the new TLDs, they can protect themselves against the threats outlined above. This measure is a first step toward protection from unwanted access and reduces the risk of falling victim to attacks based on these domains.

The challenge becomes more complex when it comes to the attack with the @ sign. This technique is subtler and therefore more difficult to detect, not least because attackers are able to obtain valid HTTPS certificates for their fraudulent websites. This makes the URLs appear legitimate in the browser, which makes phishing attempts considerably harder for the user to spot. Although technical aids can offer support here, a comprehensive solution to this problem would ultimately require browser vendors to stop accepting these specific Unicode characters in URLs.
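As an idea of what such a technical aid could look like, here is a minimal sketch of a link check that flags the three signals discussed in this article. The function name, the set of look-alike characters, and the watched TLD list are assumptions made for illustration and are far from complete. A real mail or web filter would need much more.

```python
from urllib.parse import urlsplit

# Illustrative assumptions, not complete lists.
SLASH_LOOKALIKES = {"\u2044", "\u2215"}  # FRACTION SLASH, DIVISION SLASH
WATCHED_TLDS = {"zip"}


def phishing_hints(url: str) -> list[str]:
    """Return reasons why a link deserves a closer look (empty if none)."""
    parts = urlsplit(url)
    hints = []
    if parts.username is not None:
        hints.append("has text before an @ in front of the host")
    if any(ch in url for ch in SLASH_LOOKALIKES):
        hints.append("contains a slash look-alike character")
    host = parts.hostname or ""
    if host.rsplit(".", 1)[-1] in WATCHED_TLDS:
        hints.append("host ends in a watched TLD")
    return hints


genuine = "https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.27.1.zip"
suspicious = (
    "https://github.com\u2215kubernetes\u2215kubernetes\u2215archive"
    "\u2215refs\u2215tags\u2215@v1.27.1.zip"
)

print(phishing_hints(genuine))          # []
print(len(phishing_hints(suspicious)))  # 3
```

The genuine GitHub link passes even though its path ends in ".zip", because only the host name is compared against the TLD list.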