Privacy: August 2015

Saturday, August 22, 2015

Ashley Madison

Ashley Madison dump - no database necessary

Ashley Madison is a "dating" site marketed toward married people seeking an affair. They were recently hacked and a dump of around 9 GB was posted to an Onion site. This was quickly mirrored to BitTorrent and will probably be around until the heat death of the universe.

Aside from the actual information, which is titillating, it's a fascinating amount of data. I've downloaded the dump and used it to practice my text parsing and scripting skills.

In this post, I'll share some methods of parsing huge text data sets such as this one. These methods are somewhat rudimentary. There are better, more complicated methods, but these are good basic methods that anyone can use. No database is necessary.

Linux

I've chosen to use a Bash shell on Linux. Manipulating large text files is much quicker in the Linux console. There are also mature tools that allow you to search and manipulate the data without first decompressing it, which is useful since these data sets are gzipped, saving a significant amount of space. If you don't have Linux installed, there are several versions you can boot from a CD. Knoppix is simple, but I'm a big fan of System Rescue CD.

The Dump

As mentioned above, this dump consists of several large compressed files. They are mostly gzipped CSV (comma-separated values) files, with some collections of files compressed into 7z archives.

The 7z archives must be extracted in order to search them. The credit card transaction log archive, CreditCardTransactions.7z, is 278 MB compressed and over 2.5 GB uncompressed, so make sure you have space.
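One way to check for space before extracting is to ask df how much room is left in the current directory. This is only a rough sketch: the 3 GB threshold is my own assumption, chosen to leave some margin over the 2.5 GB figure above, and the variable names are made up for illustration.

```shell
# Free space, in kilobytes, on the filesystem holding the current directory.
# -P forces one line per filesystem, -k reports sizes in 1024-byte blocks.
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')

# Assumed threshold: 3 GB expressed in kilobytes, a margin over the 2.5 GB quoted above.
needed_kb=$((3 * 1024 * 1024))

if [ "$free_kb" -ge "$needed_kb" ]; then
  echo "enough room to extract"
else
  echo "not enough room: ${free_kb} KB free, ${needed_kb} KB wanted"
fi
```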

Here are the commands I used to extract:

mkdir cc_data
7z x -occ_data/ CreditCardTransactions.7z

This creates a directory and extracts the data to that directory.
To search the data, I use grep. For example, this will find all records from Indiana:

grep \"IN cc_data/*

grep searches for "IN. The \ is necessary before the " because the quote is a special character to the shell.

You can pipe your output to additional greps. For example, this will find all Indiana records with carmel in them:

grep \"IN cc_data/* | grep -i carmel

This uses grep to search for "IN, then searches the results for carmel. The -i makes the search NOT case sensitive. By default, grep searches are case sensitive.

These searches can take a few seconds to complete.
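If you want to try these greps without touching the real data, here is a small sketch on a throwaway file. The sample rows and their column layout are made up for illustration and are not taken from the dump. It also shows two things not used above: single quotes as another way to get the " past the shell, and -c, which counts matching lines instead of printing them.

```shell
tmpdir=$(mktemp -d)

# Hypothetical sample rows; the real dump's column layout may differ.
printf '%s\n' \
  '"1001","Person One","Carmel","IN"' \
  '"1002","Person Two","Gary","IN"' \
  '"1003","Person Three","CARMEL","CA"' > "$tmpdir/sample.csv"

# Single quotes pass the double quote to grep untouched, same effect as \"IN.
in_records=$(grep '"IN' "$tmpdir/sample.csv")

# Chain a second, case-insensitive grep to narrow the results.
carmel_in=$(printf '%s\n' "$in_records" | grep -i carmel)

# -c counts matching lines instead of printing them.
in_count=$(grep -c '"IN' "$tmpdir/sample.csv")

printf '%s\n' "$carmel_in"
rm -r "$tmpdir"
```

Only the first row survives both filters, since the third row mentions CARMEL but is not an "IN record.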

email addresses

One of the gzipped dumps is a list of email addresses. You can search for an email address you know, or an address attached to a CC transaction log. Either way, you will find a user number that can be searched in all the remaining gzipped files to piece together a user's record.

To search a gzipped file, you need to use the z commands. Here is a great post on using z commands in Linux.

If you zgrep, you will get a huge block of text. I pipe zcat to tr, replacing "(" with "\n". This puts each record on its own line, for readability.
Then I pipe the output to grep to find what I want. Here is an example:

zcat am_am.dump.gz | tr '(' '\n' | grep -i obama

zcat is the equivalent of cat for compressed files. I could use zgrep, but this data set does not have good, readable line breaks. By zcat'ing the file first, I can break it into readable chunks before I grep.
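Here is a small sketch of the whole idea on throwaway files: split records onto their own lines, find a user number by email, then look that number up in a second gzipped file. The table names, columns, and addresses are all made up, and I'm assuming each record is wrapped in parentheses with the user number as the first field, which may not match the real dump.

```shell
tmpdir=$(mktemp -d)

# Hypothetical one-line dumps; the table names and columns are made up.
printf '%s\n' "INSERT INTO emails VALUES (1,'alice@example.com'),(2,'bob@example.com');" \
  | gzip > "$tmpdir/emails.dump.gz"
printf '%s\n' "INSERT INTO members VALUES (1,'Carmel','IN'),(2,'Gary','IN');" \
  | gzip > "$tmpdir/members.dump.gz"

# The whole INSERT statement is one line, so a plain grep match would print all of it.
line_count=$(zcat "$tmpdir/emails.dump.gz" | wc -l | tr -d ' ')

# Splitting on "(" puts each record on its own line before grep sees it.
email_record=$(zcat "$tmpdir/emails.dump.gz" | tr '(' '\n' | grep -i alice)

# The first comma-separated field is the user number in this made-up layout.
user_id=$(printf '%s\n' "$email_record" | cut -d, -f1)

# Look for records that start with that number in another gzipped file.
member_record=$(zcat "$tmpdir/members.dump.gz" | tr '(' '\n' | grep "^$user_id,")

printf '%s\n' "$email_record" "$member_record"
rm -r "$tmpdir"
```

Anchoring the second grep with ^ and a trailing comma keeps user number 1 from also matching 10, 21, and so on.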

Good luck! You're on your own in procuring this data dump, but it's not difficult.

Tuesday, August 18, 2015

Protect yourself from your ISP

Recent news confirms that AT&T rolls over for the NSA and probably complies with even the flimsiest "law enforcement" requests. This comes as no surprise to those who have been paying attention to security, encryption, and digital freedom.

This post will give you a rough outline of the ways you can protect your privacy online. Different tactics are necessary if you want to protect yourself on a shared PC, on a shared network, or from your ISP. We're going to start with protecting yourself from your ISP. The techniques in future posts can be layered on top of these or used alone, depending on your desire for privacy.

Bear in mind that the techniques listed here will not protect you if someone seizes your PC, or if you log into online services that track you, like Facebook. The techniques outlined on this page only serve to hide your location and ISP and hide your traffic from your ISP.

Reasons for this might include, but are not limited to:

Downloading legal but embarrassing porn. Who wants to get their IP linked to a download of "Ass Bandits 44"? Remember, the internet is forever.
Hiding your location from online predators. Did you know your IP can be used to locate your city? Geo-location services make it much easier to locate you and harass you with potentially dangerous techniques, like swatting.
Using BitTorrent on a network that restricts peer-to-peer downloading. You'll find it curious how much quicker your torrents download when your ISP can't read your traffic.
Circumventing country-specific content locks, like Netflix or NFL content. With the right VPN service, you can control where your traffic appears to come from.
etc... (leave your suggestions in the comments)

Friday, August 7, 2015

This blog will be used to educate people on privacy and security methods for regular users on the internet.

COMING SOON!