This guide describes the implications of running a link shortener, and how and why rs-short has been developed to suit sysadmins' needs.
It is based on my personal experience maintaining 42l's link shortener for a year now.
A public link shortener is not a service you can leave unmonitored.
Hosting it is easy, but maintaining it will cost you time. Be prepared for that.
We launched our link shortener service with different software: Lstu by Luc Didry (Framasoft), written in Perl and released under the WTFPL license.
It lasted 5 months. In September, we were forced to shut down the service because we were receiving too much spam: about 11,000 phishing URLs were created during those five months. By September, about 1,000 phishing links were being created per day.
By contrast, only around fifty links were legitimate (this is a rough estimate).
Banning IP addresses
At first, we tried to ban IP addresses that posted a phishing link, or that posted more than 5 links within 3 days. We used fail2ban for that purpose.
But it was no use: we banned dozens of IPs, and the spambots simply kept changing their IPs. That solution might have prevented about 20% of the phishing links from being created, but that was far from enough.
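To make the "more than 5 links per 3 days" rule concrete, here is a minimal sketch of a per-IP sliding-window counter. It is not how fail2ban works (fail2ban parses log files), and `LinkRateLimiter` and its methods are hypothetical names made up for this illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical per-IP limiter: at most `max_links` creations per `window`.
pub struct LinkRateLimiter {
    max_links: usize,
    window: Duration,
    history: HashMap<IpAddr, VecDeque<Instant>>,
}

impl LinkRateLimiter {
    pub fn new(max_links: usize, window: Duration) -> Self {
        Self { max_links, window, history: HashMap::new() }
    }

    /// Records a link creation at `now` and returns `true` when the IP
    /// has exceeded the limit and should be banned.
    pub fn record(&mut self, ip: IpAddr, now: Instant) -> bool {
        let events = self.history.entry(ip).or_default();
        // Forget creations that fell out of the window.
        while let Some(&oldest) = events.front() {
            if now.duration_since(oldest) > self.window {
                events.pop_front();
            } else {
                break;
            }
        }
        events.push_back(now);
        events.len() > self.max_links
    }
}
```

With `LinkRateLimiter::new(5, Duration::from_secs(3 * 24 * 3600))`, the sixth creation from the same IP within three days returns `true`. A bot that rotates its IP never reaches the limit, which is exactly why this approach failed for us.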
Analysis of the spamming wave
How could we differentiate real links from phishing ones?
The links were posted by spambots, so it was easy to guess. They always used the same pattern in their shortcut names. Here are some examples:
- mobilelegendshackcom44183
- httpdocswakulcomauindexphpuseredwin802318285728511
- clashofclanshack85554
- fortnitecheat95132
- clashofclanscheat61047
- cheapjerseys45438
- casino83129
- washingclotheswithoutdetergent57572
- xboxcodes24988
Blacklisting by shortcut name
By matching shortcut names against a regex, we could have blocked about 90% of that spam. It would obviously have led to false positives, but it could have worked for some time.
But we didn't apply this solution for two reasons:
- The shortcut name is a user input.
  - Once we started blocking them, the spammers would probably adapt their script to avoid being flagged.
  - The easiest solution for them would be to not enter a shortcut name and let the link shortener generate one for them.
  - Then, we would have to detect spam another way, like looking at the URL they enter. But this is much more complicated (see GSB section).
- Lstu doesn't have that feature and it's written in Perl. And I don't feel like learning Perl at all to implement it.
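For illustration only, the pattern visible in the spambot examples (lowercase letters followed by a run of digits) can be checked without any regex library. This is a sketch of the idea we decided not to deploy; the function name and the five-digit minimum are assumptions drawn from the examples, not something Lstu or rs-short implements.

```rust
/// Returns `true` when `name` looks like the spambot shortcuts:
/// only lowercase ASCII letters, followed by at least `MIN_DIGITS` digits.
/// Roughly equivalent to the regex `^[a-z]+[0-9]{5,}$`.
pub fn looks_like_spam_shortcut(name: &str) -> bool {
    const MIN_DIGITS: usize = 5; // assumption drawn from the examples

    // Strip the trailing digits; what remains must be letters only.
    let letters = name.trim_end_matches(|c: char| c.is_ascii_digit());
    let digit_count = name.len() - letters.len();

    digit_count >= MIN_DIGITS
        && !letters.is_empty()
        && letters.chars().all(|c| c.is_ascii_lowercase())
}
```

A legitimate name such as `photos12345` would also match, which is the kind of false positive mentioned above.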
The Google SafeBrowsing database (GSB)
Lstu has a built-in feature that uses an offline GSB database to detect phishing in submitted links.
But GSB is hopelessly ineffective here: it doesn't catch even half of the links. One reason might be that phishing is short-lived and relies on hacked or exploited websites.
A lot of the created links are also not especially harmful and are not phishing links. They are just spam, created by spambots using the same syntax, but pointing to random pages from random websites, just polluting your database for free.
I couldn't find any clear motive for that, but the flooding might be meant to drown the real spam links among those inoffensive ones, so it's harder to notice them and figure out how to block them. Some kind of obfuscation.
Lstu was also eating about 200MB of RAM, which I considered too much for such a simple service. That last argument convinced me to develop a handmade solution.
Creating rs-short: needs and goals
We needed:
- Lightweight software that doesn't eat much RAM
- No third-party calls
- As few unused features as possible
- Ideally without JS and without too much CSS so we can customize it
- Effective blocking of spambots and all kinds of phishing links.
We also chose to use a captcha to block spambots. We had two choices:
- Handmade captcha: unless it's poorly made (and therefore ineffective), you end up preventing visually impaired people from using your service.
- Google reCAPTCHA: you end up compromising your users' privacy and sometimes even the accessibility of your service.
We had to choose the handmade captcha, using the old and unmaintained captcha crate. We hope to solve the accessibility problem at a later stage with a handmade, effective and accessible captcha solution. Unfortunately, there has still been no progress on this.
The results of migrating from Lstu
The spamming waves stopped immediately and the RAM usage dropped from 200MB to 10MB.
But we had yet to face a greater threat: serious phishing cases, carried out by real people.
It all started with an email from our VPS provider, Proxgroup, who had received a report that a phishing link was hosted on one of their VPS.
They contacted us and asked us to remove the link within 24 hours, or they would have to shut down our server.
At the same time, we received an email on our abuse@ address, from Amazon, who asked us to remove the link immediately.
The link redirected to an Amazon phishing form (fake website), asking the user to enter their credentials.
It was a link chain: they used multiple link shorteners to create the link.
The link was created in the evening at 20:38, we received the email from our provider at 21:45 and deleted the link at 23:05. During that short period of time, 59 people clicked on the link.
We had multiple cases like that, notably with Apple, Société Générale (French bank) and Orange (French ISP).
For most of them, we managed to notice and detect the link before receiving a takedown request, by watching the access logs regularly and checking any link with unusual activity.
Phishing cases like that work in the short term: the link is sent to hundreds of people and generally gets shut down in a few days at most.
So when a link gets more than 40 clicks in an hour, it might indicate that the link is used for malicious purposes. That's how we managed to spot the next cases, sometimes by checking multiple times per day.
That's the whole point: hosting a link shortener means checking the logs regularly if you don't want to get polluted by spammers or find bad surprises in your mailbox.
One tip: be sure to monitor the email@example.com email address. Set up an alias for it and check it daily. rs-short also allows you to enter your contact details for that purpose.
Domain name reputation
A bad side effect of being subject to phishing attacks is that your domain's reputation suffers. Your domain might even get blocked by antiviruses and other protection mechanisms, as the VirusTotal reports for Lstu and Framalink show.
At the time of writing, the main instance of rs-short is still marked as safe by VirusTotal reports.
Being blocked might prevent regular users from accessing your website, all its domains and subdomains, and all the links sent through your link shortener.
This would obviously be fatal to any email service running on the same domain or subdomain as the link shortener.
Blacklisting shortcut names
Since we migrated to rs-short, there has been no need to blacklist shortcut names yet because spambots were unable to create new links.
However, we added basic support for this feature in rs-short 2.0.0.
We introduced the URL blacklisting feature in version 1.1.0 because of phishing attempts.
We started by listing some known URL shorteners, since phishing attacks sometimes rely on a link chain.
rs-short's link blocking works with a list of keywords instead of domain blocking, because phishing links tend to have similar patterns. It gives more flexibility.
When a user submits a link containing a blacklisted keyword, the service returns a 403 Forbidden error. On the main instance, this error shows up in access logs and is intercepted by a Fail2ban instance that instantly bans the IP for a few days.
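The keyword check described above can be sketched as follows. This is a simplified illustration, not rs-short's actual code: the function names are hypothetical, and the case-insensitive matching and the `200` status for accepted links are assumptions of this sketch.

```rust
/// HTTP status codes used by this sketch.
const HTTP_OK: u16 = 200; // assumption: status returned for an accepted link
const HTTP_FORBIDDEN: u16 = 403;

/// Returns the first blacklisted keyword contained in `url`, if any.
/// Matching is case-insensitive (an assumption of this sketch).
pub fn find_blacklisted_keyword<'a>(url: &str, keywords: &'a [&'a str]) -> Option<&'a str> {
    let url = url.to_lowercase();
    keywords
        .iter()
        .copied()
        .find(|keyword| url.contains(keyword.to_lowercase().as_str()))
}

/// Status code to answer a link submission with.
pub fn status_for_submission(url: &str, keywords: &[&str]) -> u16 {
    if find_blacklisted_keyword(url, keywords).is_some() {
        HTTP_FORBIDDEN
    } else {
        HTTP_OK
    }
}
```

Because a keyword is matched anywhere in the URL, a single entry such as a known shortener's domain also catches it when it appears deeper in a link chain. The 403 is what a log-watching tool like Fail2ban can then pick up in the access logs.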
Active detection of malicious links
Phishing attacks are short-term attacks. They are quickly shut down by antivirus software, mail providers, hosting providers or the police. A successful phishing attack doesn't last more than a few days.
That's why rs-short 2.0.0 introduces a new feature: now, when a link gets "too popular" within a defined period, the software prints a warning in the terminal (
By default, rs-short will warn if a link gets more than 25 clicks within a timeframe of 12 hours. This way, the sysadmin doesn't have to check all links, only the links that generate potentially unusual traffic.
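As an illustration of that kind of threshold, here is a minimal in-memory sketch for a single link. It is not rs-short's implementation (which may well store counters differently); `ClickWatcher` and its methods are hypothetical names, and only the default values of 25 clicks and 12 hours come from the text above.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical watcher for one link: remembers recent click times.
pub struct ClickWatcher {
    max_clicks: usize,
    timeframe: Duration,
    clicks: VecDeque<Instant>,
}

impl ClickWatcher {
    pub fn new(max_clicks: usize, timeframe: Duration) -> Self {
        Self { max_clicks, timeframe, clicks: VecDeque::new() }
    }

    /// Registers a click at `now`. Returns `true` when the link received
    /// more than `max_clicks` clicks within `timeframe`, i.e. when a
    /// warning should be printed for the sysadmin.
    pub fn register_click(&mut self, now: Instant) -> bool {
        // Drop clicks that are older than the timeframe.
        while let Some(&oldest) = self.clicks.front() {
            if now.duration_since(oldest) > self.timeframe {
                self.clicks.pop_front();
            } else {
                break;
            }
        }
        self.clicks.push_back(now);
        self.clicks.len() > self.max_clicks
    }
}
```

With `ClickWatcher::new(25, Duration::from_secs(12 * 3600))`, the 26th click inside a 12-hour span returns `true`, while a link that gets a handful of clicks per day never triggers a warning.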
Raising awareness about phishing
Simply deleting phishing links isn't a good solution, for multiple reasons:
- Search engines will crawl the address and record a 404 Not Found error, whereas the address should return a 410 Gone error; otherwise the engine will keep checking it on a regular basis to see if it comes back;
- The scammed users get a 404 page saying something like "link not found", which isn't very user-friendly;
- Even if this is quite unlikely, the person who created the phishing link can always recreate it after you remove it (unless you blacklist their shortcut name or link).
With rs-short, you can flag a link as phishing so that the shortcut name can't be used again and displays a "You have clicked on a fraudulent link" message, forever, so people who get redirected to it don't make the same mistake again (well, actually they will, but they might think twice next time).
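The difference between deleting and flagging can be sketched like this. The enum, the function names, the `302` redirect status and the `410` status for flagged links are all assumptions made for this illustration; rs-short's real responses may differ.

```rust
/// Hypothetical states a shortcut can be in.
pub enum LinkState {
    /// The shortcut redirects to this URL.
    Active(String),
    /// The shortcut was flagged as phishing and is permanently retired.
    Phishing,
    /// The shortcut never existed (or was simply deleted).
    Unknown,
}

/// Maps a link state to an HTTP status code and a body
/// (for an active link, the body is the redirect target).
pub fn respond(state: &LinkState) -> (u16, String) {
    match state {
        LinkState::Active(url) => (302, url.clone()),
        LinkState::Phishing => (410, "You have clicked on a fraudulent link".to_string()),
        LinkState::Unknown => (404, "Link not found".to_string()),
    }
}

/// A flagged shortcut name stays reserved: only unknown names can be created.
pub fn can_create(existing: &LinkState) -> bool {
    matches!(existing, LinkState::Unknown)
}
```

Keeping a flagged entry around, instead of deleting the row, is what makes both the warning message and the "name can't be used again" rule possible.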
"Phishing defense mode"
If phishing starts to go wild, you can enable the verbose_console option to print ALL links in the console as they are created, including the administration links and phishing flag links associated with them.
It allows you to flag bad links easily in case of a phishing or spam wave.
However, this mode is bad for users' privacy, since all created links are displayed. That's why this feature is meant to be used only if your service is under attack or if you're operating rs-short on a sensitive domain, and you need to respond to those attacks as fast as possible.
By using your service, your users entrust you with their personal data. Don't betray them and respect their privacy.