CC3000 Smart Config and keyphrase recovery ~ Depletion Region

Having previously described how the SSID and keyphrase are transmitted to a CC3000 enabled device, I thought I should put my money where my mouth is and prove that it's possible to create an application capable of recovering such information.

This proved a bit more difficult than expected, but ultimately the code required turned out to be fairly simple.

Capturing wifi packets

While sniffing ethernet packets on a wired network is something pretty much any computer can do, the same is not true for wifi packets. To see all packets, not just those involving the machine doing the sniffing, one has to enable what's called monitor mode. How easily this can be done seems to depend on the wifi chipset, the OS and other factors. Even with monitor mode enabled, one may see the headers for packets without being able to see the data portion; again this seems to depend on the chipset and other factors.

After much unsuccessful experimentation on Linux I eventually found it was actually easier to get things working without any issues or special tricks on my Mac. I did eventually get things to work on my Linux box and this is described later, but as it's rather more involved I'll stick with describing the Mac setup initially.

I just downloaded and installed the latest Mac version of Wireshark (the de-facto standard packet analysis tool). After installation the command line version, called tshark, and other tools could be found in /usr/local/bin. Note: when I installed Wireshark it created /usr/local/bin such that it belonged to a userid that did not exist and with 0700 permissions, so I did:

$ sudo chmod 755 /usr/local/bin

First I found the wifi device like so:

$ tshark -D

It was en0, then I tested that I could capture packets including the data portion like so:

$ tshark -i en0 -I -V

The options tell tshark to capture packets from en0 (-i en0), using monitor mode (-I) and to produce verbose output (-V). Verbose output will show the binary contents of the data portions of any packets that have data. The fact that the data is encrypted isn't important.

Filtering for relevant packets and outputting relevant information

The -V option shows far more detail than is actually needed, and without filters one sees information about many packets that aren't of interest.

After some experimentation I came up with the following:

$ tshark -o 'wlan.enable_decryption:FALSE' \
    -i en0 -I -f 'subtype qos-data' \
    -Y 'wlan.fc.retry==0' -T fields \
    -e wlan.bssid -e radiotap.channel.freq -e wlan.sa -e wlan.da -e data.len

As we can't decrypt the packet data we can't look at the higher level protocol information, so we can't simply filter for UDP traffic. But we can ignore all packets that can't possibly contain the UDP traffic we're interested in. We do this by excluding all packets that are not of subtype QoS data (-f 'subtype qos-data'). We also ignore all retransmitted packets (-Y 'wlan.fc.retry==0'). This may sound counterintuitive, but handling them in a meaningful manner is difficult, and on the whole they tend to duplicate data that we already have rather than provide data that has somehow been missed (which was my initial assumption).

The -T fields and subsequent -e arguments are our replacement for -V and only output the very limited set of field values that we are interested in:

BSSID - the numeric address behind the human readable SSID - see Basic service set identification.
Channel frequency - see WLAN channels (and more on channel hopping later).
Source address - the address of the sender of a given packet.
Destination address - the destination address of a given packet.
Data length - the length of the encrypted data portion of a given packet.

BSSID and channel frequency are just output for reference - they are not actually required by any of the SSID or keyphrase recovery logic.
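As a rough illustration of what a consumer of this output has to do, here is a minimal Python sketch that parses one line of the tshark output above. It assumes tshark's default tab separator for -T fields and the five -e fields in the order given. The names Packet and parse_line are made up for this example and are not taken from the application described below.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    """One line of tshark output (hypothetical container for this sketch)."""
    bssid: str
    freq_mhz: Optional[int]
    source: str
    destination: str
    data_len: int


def parse_line(line: str) -> Optional[Packet]:
    """Parse one tab-separated line; return None if it is not usable."""
    # Strip only the line ending so that empty trailing fields survive.
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) != 5:
        return None
    bssid, freq, source, destination, data_len = parts
    # Packets without a data portion have an empty data.len column.
    if not data_len.isdigit():
        return None
    freq_mhz = int(freq) if freq.isdigit() else None
    return Packet(bssid, freq_mhz, source, destination, int(data_len))
```

Lines that lack a data length are simply dropped, since only the length of the encrypted data portion matters for the recovery logic.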

Note: Wireshark and tshark can actually decrypt wifi packets if you provide the necessary information. If you've already configured and enabled decryption then tshark will pick this up from your ~/.wireshark preferences and automatically decrypt packets. Above I've actively disabled this behavior with the -o 'wlan.enable_decryption:FALSE' option; if you don't have decryption already configured you don't need this.

SSID and keyphrase recovery application

I've written an application in Java that parses the output of tshark and recovers the SSID and keyphrase information from this data. If you've got git installed you can just clone the relevant repository from GitHub like so:

$ git clone https://github.com/george-hawkins/betaengine

If you don't have git you can just download the repository contents as a zip file from here:

https://github.com/george-hawkins/betaengine

The code is fairly simple and short (700 lines in total) and just consists of the following classes:

Consumer - contains the main method and reads and parses the output from tshark.
Analyzer - maintains a LinkManager per source/destination pair seen.
LinkManager - looks for data length differences that might indicate Smart Config data.
LengthDecoder - finds SSID and keyphrase sequences.
Solver - attempts to combine partial SSID and keyphrase sequences to generate and decode complete sequences.
EncodedData and Link - trivial support classes.
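To make the division of labour between the per-link bookkeeping and the length-difference check a little more concrete, here is a small Python sketch of the general idea. It is not the Java code from the repository: the function names are invented for illustration, and the delta value is left as a parameter because the actual separator lengths are not given here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # (source address, destination address)


def group_lengths_by_link(
    packets: Iterable[Tuple[str, str, int]]
) -> Dict[Link, List[int]]:
    """Collect data lengths per source/destination pair, in arrival order."""
    links: Dict[Link, List[int]] = defaultdict(list)
    for source, destination, data_len in packets:
        links[(source, destination)].append(data_len)
    return dict(links)


def has_length_delta(lengths: Iterable[int], delta: int) -> bool:
    """True if two lengths seen on a link differ by exactly `delta`.

    `delta` stands in for the difference between the two separator
    lengths; its value is an assumption the caller has to supply.
    """
    seen = set(lengths)
    return any(length + delta in seen for length in seen)
```

A link for which has_length_delta is true is only a candidate. Unrelated traffic can produce the same difference by coincidence.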

The files come with a README.md that briefly outlines how to compile and run the application (for the run instructions look at the "Decoder" section). Basically you just run tshark and pipe its output directly into the application. If you then use a Smart Config application to communicate an SSID and keyphrase to a CC3000 enabled device you should soon see something like:

Solved SSID: [MyPlace]

Solved keyphrase: [LetMeIn]

Scan succeeded

This shows that we succeeded in recovering the SSID, in this case "MyPlace", and the keyphrase, in this case "LetMeIn". Note that it may find the SSID or keyphrase long before the other.

Any characters that are not printable characters in the Unicode range 0x20 to 0xFF are printed as Unicode escapes, e.g. the € symbol would appear as "\u20AC".
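The escaping rule can be sketched in a few lines of Python. This is an approximation for illustration only: it uses Python's own notion of "printable" (str.isprintable), which may differ in edge cases from whatever check the Java application uses, and the function name is made up.

```python
def escape_unprintable(text: str) -> str:
    """Keep printable characters in 0x20..0xFF, escape everything else."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x20 <= code <= 0xFF and ch.isprintable():
            out.append(ch)
        else:
            # Four hex digits for the usual case; code points above
            # U+FFFF would print with more digits in this sketch.
            out.append(f"\\u{code:04X}")
    return "".join(out)
```

For example, escape_unprintable("€") yields the six characters \u20AC, while a plain ASCII string such as "MyPlace" is returned unchanged.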

If you don't succeed in recovering the keyphrase then it may be that you are not listening on the right wifi channel - see the channel hopping section later. However, generally you will already be on the right channel as a result of having previously been connected to the relevant wifi access point (AP).

Note that while tshark is running in monitor mode your machine will be disassociated from your AP and other applications on the machine will not be able to access the network.

If you used AES encryption then the keyphrase displayed will be the still encrypted version and will probably appear largely as Unicode escapes due to non-printing characters etc. I haven't added AES decryption logic; this is left as an exercise for the reader. It's actually simple if you create a cipher using the AES Electronic Codebook (ECB) transformation with no padding (AES/ECB/NoPadding in Java), as described briefly in the middle of this post. Obviously any such logic will need the relevant AES key to decrypt a given keyphrase.

Important update Dec 8th, 2014: please see this comment from Mark and my reply. I have not updated my code to reflect any recent changes such as this and do not plan to do so.

Implementation issues

So what were the main difficulties encountered in creating the application?

When I started I thought it would be easier to filter the packets I was interested in from all the other packets and I assumed I would see cleaner runs of packets corresponding to the SSID and keyphrase.

However while one can group packets by source and destination, when one cannot decrypt the packets one can only do so much to distinguish between Smart Config related packets and other similar traffic between a given source and destination.

Using what we know about Smart Config it's possible to filter out many packets but we still end up with a combination of extra invalid values and missing values between the packets that delimit the SSID and keyphrase. I refer to the invalid extra values as spam and the missing values as holes in my code. The holes are presumably the product of packet collision, the spam the product of unrelated traffic that can't be distinguished from Smart Config traffic due to encryption. And remember that the packets that have an appropriate data length such that they appear to be Smart Config tags, separators etc. may themselves just be the result of unrelated traffic that coincidentally involves packets that have the lengths being looked for.

Note: packet collisions shouldn't be the issue on wifi networks that they are on wired ethernet networks, as wifi has to use CSMA/CA (collision avoidance).

The Smart Config application transmits the same sequences over and over again. The Solver class takes multiple received sequences, each probably containing spam and holes, and tries to construct a clean sequence of the required length that obeys the upper nibble rules etc., described in my previous post, that we know have to apply.

The current Solver is just one possible implementation; one could imagine taking completely different approaches with different pros and cons. It should certainly be possible to come up with more complex logic that can recover the SSID and keyphrase from fewer repeats of the underlying sequences in the face of greater amounts of spam and collisions.
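As an example of a different (and much simpler) approach, here is a Python sketch of a position-wise majority vote across repeats. It is deliberately naive and is not how the current Solver works: it assumes the repeats are already aligned and of equal length, with holes marked as None, so it copes with holes and the odd wrong value but not with spam that shifts the sequence. The name merge_repeats is invented for this example.

```python
from collections import Counter
from typing import List, Optional, Sequence


def merge_repeats(
    repeats: Sequence[Sequence[Optional[int]]]
) -> List[Optional[int]]:
    """Merge aligned repeats of one sequence by per-position majority vote."""
    if not repeats:
        return []
    merged: List[Optional[int]] = []
    for i in range(len(repeats[0])):
        # Ignore holes; count the values actually received at position i.
        votes = Counter(r[i] for r in repeats if r[i] is not None)
        # A position that is a hole in every repeat stays a hole.
        merged.append(votes.most_common(1)[0][0] if votes else None)
    return merged
```

With three repeats [1, None, 3, 4], [1, 2, 9, 4] and [None, 2, 3, 4], the holes are filled from the other repeats and the stray 9 is outvoted. Ties are resolved in favour of the value seen first, which is an arbitrary choice.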

Note: the current solver tries hard to patch pieces from multiple sequences together to create a complete clean sequence. Sometimes it will actually produce multiple valid solutions and you'll see output like this:

Solved SSID: [MyPlacf]
Solved SSID: [MyPlace]

Obviously only one solution is the right one - with a little extra effort it would be possible to generate statistics for each solution, e.g. how much patching was involved, to give some indication as to how likely a given solution is to be the right one. If an SSID or keyphrase tag gets lost in transmission the current solver can occasionally produce a largish number of very poor solutions.

Channel hopping

The tshark logic described above will only listen on whatever channel your wifi device is currently configured for, typically channel 1, 6 or 11. The CC3000 must presumably do channel hopping to find the relevant channel. Tshark and related utilities don't directly support channel hopping - but it's relatively easy to set up channel hopping - see the channel hopping section of the Wireshark wiki page on capture setup.
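The radiotap.channel.freq field output by the tshark command earlier is a frequency in MHz rather than a channel number. The following Python sketch converts between the two for the 2.4 GHz band, using the standard channel layout (channel 1 at 2412 MHz, 5 MHz spacing, channel 14 at 2484 MHz). The function name is made up, and other bands are deliberately not handled.

```python
def channel_for_frequency(freq_mhz: int) -> int:
    """Map a 2.4 GHz centre frequency in MHz to its wifi channel number."""
    if freq_mhz == 2484:
        return 14  # channel 14 sits outside the regular 5 MHz spacing
    if 2412 <= freq_mhz <= 2472 and (freq_mhz - 2407) % 5 == 0:
        return (freq_mhz - 2407) // 5
    raise ValueError(f"not a 2.4 GHz wifi channel frequency: {freq_mhz}")
```

So the commonly used channels 1, 6 and 11 correspond to 2412, 2437 and 2462 MHz respectively.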

On Mac things are even easier - one can use the standard, but well hidden, airport application that can be found here:

/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport

With this command you can scan for nearby networks and see what channels they're using, disassociate from your current network, and change the current channel of your wifi device. See e.g. CNET's overview of various Mac network related CLI commands for more details.

Setting up aircrack and tshark on Linux

As outlined above it proved to be easier getting tshark working on Mac. I did eventually get it working on my Ubuntu 12.04 machine. The main issue was enabling monitor mode. To do this I required airmon-ng, a tool that's part of Aircrack-ng. Aircrack wasn't available via apt-get so I had to download and compile it. On doing make install it installed airmon-ng to /usr/local/sbin.

Then I was able to enable monitor mode like so, where wlan0 is the name of my wifi device as reported by ifconfig (it may have a different name on your system):

$ sudo /usr/local/sbin/airmon-ng start wlan0 11

It output "monitor mode enabled on mon0" - mon0 is a pseudo device created by airmon-ng that tshark will listen to rather than wlan0. However the command also outputs a warning about processes that may interfere with its operation. They do indeed interfere but it's not as simple as killing the suggested PIDs as some of them are related to services that will simply restart them if they're seen to die. So I had to stop the relevant services like so:

$ sudo service network-manager stop
$ sudo service avahi-daemon stop
$ sudo service upstart-udev-bridge stop

I then stopped monitor mode - note that this needs to be done on mon0 rather than wlan0:

$ sudo /usr/local/sbin/airmon-ng stop mon0

Then I started monitor mode, as above, again and this time it only warned about one process and I killed the listed PID with a normal kill (using sudo).

Note that stopping the above services will disconnect you from your wifi network, even before you use tshark with the monitor mode enabled pseudo device mon0. If you need to reconnect, e.g. if you find initially as I did that you don't have tshark installed and need to apt-get it, then just redo the service commands above with start instead of stop.

OK - now we're ready to start tshark almost as above on the Mac:

$ tshark -o 'wlan.enable_decryption:FALSE' \
    -i mon0 -f 'subtype qos-data' \
    -R 'wlan.fc.retry==0' -T fields \
    -e wlan.bssid -e radiotap.channel.freq -e wlan.sa -e wlan.da -e data.len

Note that I use the mon0 pseudo device and use -R rather than -Y as the version of tshark available via apt-get for Ubuntu 12.04 is older than the version I have on my Mac and doesn't support the -Y flag. And I don't use -I as mon0 is already in monitor mode (and trying to use -I will cause tshark to fail).
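The differences between the Mac and Linux invocations can be captured in one place. The Python sketch below only builds the argument list, using exactly the flags shown in the two commands above. The function and parameter names are invented for this example.

```python
from typing import List

# The fields output per packet, in the order used in the commands above.
FIELDS = ["wlan.bssid", "radiotap.channel.freq", "wlan.sa", "wlan.da", "data.len"]


def tshark_args(interface: str, monitor_flag: bool, legacy_filter: bool) -> List[str]:
    """Build the tshark command line for the capture described above.

    monitor_flag:  pass -I (needed on en0, must be omitted for mon0).
    legacy_filter: use -R instead of -Y for older tshark versions.
    """
    args = ["tshark", "-o", "wlan.enable_decryption:FALSE", "-i", interface]
    if monitor_flag:
        args.append("-I")
    args += ["-f", "subtype qos-data"]
    args += ["-R" if legacy_filter else "-Y", "wlan.fc.retry==0"]
    args += ["-T", "fields"]
    for field in FIELDS:
        args += ["-e", field]
    return args
```

tshark_args("en0", True, False) corresponds to the Mac command and tshark_args("mon0", False, True) to the Ubuntu 12.04 one. The resulting list could be handed to something like subprocess.Popen to read the output line by line.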

Unlike on the Mac, where no special steps need to be taken once you've finished capturing packets with tshark, on Linux one should stop the pseudo device as shown above and restart the various services (also as described above).

Note that in the airmon-ng start command above we explicitly specify what channel we want to monitor, in the example it's channel 11. If you wanted to do channel hopping see the Wireshark wiki page (also mentioned above).

Extra features of the Smart Config library

In a previous post I covered details of the TI Smart Config library that are most likely historical leftovers from TI's development process, or that may possibly be usable in combination with some non-default configuration of a CC3000 device. The only one of these that could affect the ability of my code to recover SSIDs and keyphrases is being able to set the lengths of the two separator values. Currently my code looks for packets that differ in length by the difference between these two values, so this logic would no longer work if these values were changed. However, it would be simple to adjust the logic to look for values that reoccur frequently and deduce that they were the separator values being used in this particular situation.
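A first stab at that adjustment might look like the Python sketch below, which simply picks the most frequently seen data lengths on a link as separator candidates. This is an untested idea rather than something the current code does, and the function name is invented. On a busy link, unrelated traffic could easily produce other frequently recurring lengths, so the candidates would still need to be validated.

```python
from collections import Counter
from typing import Iterable, List


def likely_separators(lengths: Iterable[int], count: int = 2) -> List[int]:
    """Return the `count` most frequently seen lengths, sorted ascending."""
    most_common = Counter(lengths).most_common(count)
    return sorted(value for value, _ in most_common)
```

The difference between the two candidates could then be used in place of the currently hard-wired difference between the default separator values.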

Depletion Region

Thursday, October 10, 2013