Or at least make it kinda difficult to scrape...
Recently, I was sent a phishing email at work. This one was not from a malicious hacker, though. It was sent internally to teach employees the scary dangers of opening “untrustworthy” emails. Oh the humanity! However, this email didn’t really have to do with phishing – it just contained a link to a supposed tracking number of a package that I didn’t order.
If you didn’t know, phishing is the act of trying to “fish” for information from people by posing as a credible source. Most of us know how to avoid Nigerian Princes and Russian Brides, but there are far more convincing scams out there. The important thing to keep in mind is to always check links for misspelled or questionable URLs and to never give out any personal information. Back to the story, because this is where things get interesting.
It seems that the company that sent me this email also sent emails to my coworkers (this makes sense, and isn’t too weird). Yet because it sent out multiple emails, we could compare them and see that the emails’ URLs follow a distinct pattern:
With the irony mounting, let’s just say that it’s fairly easy to deduce that the query takes base 16 values: 0-9 or a-f. This means that there are 16 possible values per position, and with six positions that gives 16^6 = 16,777,216 possible codes. While this sounds like a lot of combinations to most people, it’s a fairly easy number for a computer to enumerate. Even worse, these query variations don’t only lead to our internal personalized webpages; they also lead to pages for other companies!
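To put that number in perspective, here is a rough back-of-the-envelope sketch of the arithmetic. The request rate of 1,000 per second is purely my own assumption for illustration, not something I measured, and the function names are made up for this example.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct codes of the given length over the given alphabet."""
    return alphabet_size ** length


def hours_to_enumerate(total_codes: int, requests_per_second: float) -> float:
    """Hours needed to try every code at a fixed (hypothetical) request rate."""
    return total_codes / requests_per_second / 3600


total = keyspace(16, 6)  # six hex characters
print(f"{total:,} possible codes")

# Assumed rate: 1,000 requests per second (hypothetical).
print(f"{hours_to_enumerate(total, 1000):.1f} hours to try them all")
```

At that assumed rate, the whole space is covered in under five hours, which is why a six-character hex code offers so little protection.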
From the banking industry, to the entertainment industry, to consultants, we’re able to see which companies are using this service – and by extension, how many emails are being sent to each company. For a security company, this doesn’t seem like a very secure practice, especially because it can be easily remedied. For example, changing to an 8-character code adds approximately 4.3 billion combinations. A 16-character code (abcd12345678efef for example) allows for 18,446,744,073,709,551,616 combinations! Clearly, it doesn’t take that much additional effort to generate a random number that’s hidden within a larger code, thus protecting customer privacy. Though it should go without saying, do your part when protecting the identity of your customers: don’t group them all together in tiny clusters.
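As a minimal sketch of the fix, here is one way to generate a longer, unpredictable hex code using Python’s standard `secrets` module. I have no idea how the vendor actually builds its codes, so the helper name `new_tracking_code` and the 16-character default are assumptions of mine.

```python
import secrets


def new_tracking_code(num_hex_chars: int = 16) -> str:
    """Return a random hex string from a cryptographically secure source.

    Hypothetical helper: the name and the default length are illustrative only.
    """
    if num_hex_chars <= 0 or num_hex_chars % 2 != 0:
        raise ValueError("num_hex_chars must be a positive even number")
    # token_hex(n) returns 2 * n hex characters (n random bytes).
    return secrets.token_hex(num_hex_chars // 2)


# Compare the size of the search space for each code length.
for length in (6, 8, 16):
    print(f"{length:>2} hex chars: {16 ** length:,} combinations")

print("example code:", new_tracking_code())
```

The point of using `secrets` rather than a counter or the `random` module is that the codes are not sequential or easily predictable, so knowing one customer’s code tells you nothing about anyone else’s.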