Sunday, January 05, 2014

The Board Room Hour

Back in October last year I was out in New York for the Hardware Innovation Workshop and Maker Faire New York, where I took part in a panel discussion with Massimo Banzi and Jason Kridner, chaired by MAKE's Dale Dougherty, on what's in store for micro-controllers and what the next generation of boards could bring.

Hacking the CES Scavenger Hunt

This post was originally published on the MAKE Blog and co-authored with Sandeep Mistry.

It has just been announced that this year's Consumer Electronics Show (CES) will feature a promotional scavenger hunt based around Apple's iBeacon technology. What if you could win the hunt without ever having to go to CES?
Quietly introduced by Apple at WWDC last year, iBeacon is a technology that allows you to add real world context to smart phone applications. Based around Bluetooth LE—part of the new Bluetooth 4.0 standard—it’s a way to provide basic indoor navigation and proximity detection. As we talked about when we reverse engineered the Estimote beacons, there are three properties of an iBeacon that work together to create the beacon’s identity. These are:
  • UUID: A property which is unique to each company. In most use cases the same UUID would be given to all beacons deployed by a company (or group).
  • Major: The property that you use to specify a related set of beacons. In a retail setting, for example, all the beacons in one store would share the same Major value.
  • Minor: The property that you use to specify a particular beacon in a location.
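To make the identity concrete, here is a minimal Python sketch of how those three properties sit inside the manufacturer-specific data of an iBeacon advertisement. It follows the commonly documented frame layout: Apple's company identifier, a type byte and a length byte, then the UUID, Major, Minor and a calibrated power byte. The BeaconIdentity name, the two helper functions, and the example UUID, Major and Minor values are illustrative assumptions, not values from any real deployment.

```python
import struct
from typing import NamedTuple, Tuple
from uuid import UUID

# Fixed 4-byte prefix: Apple's company identifier (0x004C, little-endian),
# then the iBeacon type (0x02) and the length of what follows (0x15 = 21).
HEADER = struct.pack("<HBB", 0x004C, 0x02, 0x15)
FRAME_LENGTH = 25  # 4 header + 16 UUID + 2 Major + 2 Minor + 1 power


class BeaconIdentity(NamedTuple):
    """The three properties that together identify a beacon (hypothetical type)."""
    proximity_uuid: UUID
    major: int  # 0..65535
    minor: int  # 0..65535


def encode_manufacturer_data(identity: BeaconIdentity, measured_power: int = -59) -> bytes:
    """Pack an identity into the 25-byte manufacturer-specific data field."""
    # Major and Minor are big-endian unsigned 16-bit values; the measured
    # power is a signed byte (-59 is just a placeholder for this sketch).
    tail = struct.pack(">HHb", identity.major, identity.minor, measured_power)
    return HEADER + identity.proximity_uuid.bytes + tail


def decode_manufacturer_data(data: bytes) -> Tuple[BeaconIdentity, int]:
    """Recover the identity and measured power from a 25-byte frame."""
    if len(data) != FRAME_LENGTH or data[:4] != HEADER:
        raise ValueError("not an iBeacon frame")
    major, minor, power = struct.unpack(">HHb", data[20:25])
    return BeaconIdentity(UUID(bytes=data[4:20]), major, minor), power


# Example with made-up values.
example = BeaconIdentity(UUID("01234567-89ab-cdef-0123-456789abcdef"), 1, 42)
frame = encode_manufacturer_data(example)
print(frame.hex())
```

Nothing in the frame is secret or signed, which is why knowing the UUID, Major and Minor is all it takes to recognise a beacon, or to impersonate one.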
The scavenger hunt is therefore a hunt for a number of beacons that will probably all share the same UUID and Major numbers, but will have different Minor numbers. Effectively, we're looking for a set of beacons. However, wandering the hallways at CES hoping to get within range (approximately 100 feet) of all of the iBeacons they've scattered across the show floor sounds like a lot of work. CES has teamed up with Radius Networks, who are providing the iBeacon hardware, and Marc Wallace, CEO and cofounder of Radius Networks, has this to say about the hunt:
This is one of the coolest proximity-aware apps we have worked on. This is also one of the first, tangible applications that leverages iBeacon technology. And it is a great example of how iBeacon technology is not just about advertising as it is about bringing new and innovative solutions to the marketplace. We are very excited to be a part of it.
Since they're using hardware from Radius Networks we can't just assume, as we could with the Estimote hardware, that we know the UUID of the beacons. However, the identities of all of the beacons are somewhere we can easily get our hands on them: the CES mobile app. Sure enough, looking at the CES Android application (it's fairly easy to download the APK without having to install it) there are some hints there for us, and using a decompiler it was fairly easy to find the details of the target beacons.
The Minor numbers of the nine target beacons in the code of the CES mobile application.
The iBeacon UUID we're looking for is 842AF9C4-08F5-11E3-9282-F23C91AEC05E and the Major number is 65000, although interestingly the Major isn't actually needed and is simply ignored by the Android application. The nine beacons scattered throughout the CES venue have Minor numbers from 65001 to 65009.
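As a rough illustration of what that means for the app's logic, the sketch below tracks hunt progress by matching sightings on the UUID and the Minor number only, leaving the Major unchecked as described above. It is not the decompiled application code: the HuntProgress class and its method names are hypothetical, and the UUID is built from the same 32 hex digits quoted above.

```python
from uuid import UUID

# The 32 hex digits of the hunt UUID quoted in the text, without separators.
HUNT_UUID = UUID(hex="842AF9C408F511E39282F23C91AEC05E")
# The nine target beacons: Minor numbers 65001 to 65009 inclusive.
HUNT_MINORS = frozenset(range(65001, 65010))


class HuntProgress:
    """Hypothetical tracker for which target beacons have been seen."""

    def __init__(self):
        self.found = set()

    def sighting(self, proximity_uuid, major, minor):
        """Record a sighting; return True only if it is a new target beacon."""
        # The Major value is accepted but deliberately never checked,
        # mirroring the behaviour described for the Android app.
        if proximity_uuid != HUNT_UUID or minor not in HUNT_MINORS:
            return False
        if minor in self.found:
            return False
        self.found.add(minor)
        return True

    @property
    def complete(self):
        """True once all nine target Minor numbers have been seen."""
        return self.found == HUNT_MINORS


progress = HuntProgress()
progress.sighting(HUNT_UUID, 65000, 65001)
print(len(progress.found), "of", len(HUNT_MINORS), "found")
```

Because a sighting is nothing more than three numbers read from an unauthenticated broadcast, the tracker has no way to tell a beacon on the show floor from one sitting on your desk.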
The completed scavenger hunt—all nine beacons.
An almost completed scavenger hunt—with eight of the nine beacons already "found."
Since we now know the identities of the beacons, it's trivial to finish the scavenger hunt without ever going to CES: it's actually fairly simple to build your own iBeacon hardware and "fake" the app into thinking you've found the beacons. To do that you can use either a Raspberry Pi or a Bluetooth LE board like the Red Bear Labs BLE Mini board. Radius Networks, the people supplying the hardware to CES, is even selling an "iBeacon Development Kit" which would work just fine for our purposes.

At that point, now that you have your own iBeacon hardware, you can just go ahead and set the UUID, Major and Minor numbers of your beacon to each of the CES scavenger hunt beacon identities in turn, and then bring your beacon into range of your cell phone, which should be running the CES mobile app. Once you've shown the app all of the beacons, you'll have "finished" the scavenger hunt and can claim your prize. Of course doing that isn't legal. It's called fraud and will probably land you in serious trouble.

Of course it could have been worse. If they had been using Estimote hardware it would have been easy for someone to make the hunt impossible to complete because, as we've shown, anyone with the Estimote SDK can modify the UUID, Major and Minor numbers of Estimote beacons in the field. That would have meant that the beacons deployed across the CES floor no longer worked for the scavenger hunt.

We talked about both the ability to configure "fake" beacons and the ability to disable beacons in the field in our discussion of reverse engineering the Estimote iBeacon hardware. However, we didn't think we'd see something like this quite so soon.

The Snapchat Leak

This post was first published on O'Reilly Radar.
The number of Snapchat users by area code.
The number of Snapchat users by geographic location. Users are predominantly located in New York, San Francisco, and the surrounding greater New York and Bay Areas.
While the site crumbled quickly under the weight of so many people trying to get to the leaked data, and has now been suspended, there isn't really such a thing as putting the genie back in the bottle on the Internet. Just before Christmas the Australian-based Gibson Security published a report highlighting two exploits in the Snapchat API, claiming that hackers could easily gain access to users' personal data. Snapchat dismissed the report, responding that:

Theoretically, if someone were able to upload a huge set of phone numbers, like every number in an area code, or every possible number in the U.S., they could create a database of the results and match usernames to phone numbers that way.

They added that they had various "safeguards" in place to make it difficult to do that. However, it seems likely that none of these safeguards included rate limiting requests to their server, despite this being explicitly mentioned in the initial report four months previously, because someone seems to have taken them up on their offer.
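For anyone unfamiliar with the term, rate limiting simply means capping how many requests a single client can make in a given period, which turns a bulk enumeration of phone numbers from minutes of work into something far slower. The sketch below is a generic token bucket, one common way of doing it. It says nothing about how Snapchat's servers actually work: the TokenBucket class, its parameters and the example limits are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """A generic token-bucket limiter (illustrative, not any real service's)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added back per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True if a request may proceed, consuming `cost` tokens."""
        now = self.clock()
        # Refill in proportion to the time elapsed, never beyond capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Example: one bucket per client, allowing a burst of 5 lookups and then
# 1 lookup per second. These limits are made up for the sketch.
bucket = TokenBucket(rate=1.0, capacity=5)
results = [bucket.allow() for _ in range(10)]
print(results.count(True), "of", len(results), "requests allowed")
```

In practice a server would keep one bucket per account or per IP address and reject, or delay, any request for which allow() returns False.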

Data Release

Earlier today the creators of the now defunct SnapchatDB site released 4.6 million records, both as an SQL dump and as a CSV file. With an estimated 8 million users of the app (as of May 2013), this represents around half the Snapchat user base. Each record consists of a Snapchat user name, a geographical location for the user, and a partially anonymised phone number, with the last two digits of the phone number having been obscured. While Gibson Security's find_friends exploit has been patched by Snapchat, minor variations on the exploit are reported to still function. If this data did come from the exploit uncovered by Gibson, or a minor variation on it, then the dataset published by SnapchatDB is only part of the data the hackers now hold. In addition to the data already released they would have the full phone number of each user and, as well as the user name, they should also have the perhaps more revealing screen name.

Data Analysis

Taking an initial look at the data, there are no international numbers in the leaked database. All entries are US numbers, with the bulk of the users from, as you might expect, the greater New York and San Francisco Bay areas. However, I'd assume that the absence of international numbers is an indication of laziness rather than of any technical limitation. For US-based hackers it would be easy to iterate rapidly through the fairly predictable US number space, while "foreign" number formats might present more of a challenge when writing a script to exploit the hole in Snapchat's security.

Only 76 of the 322 area codes in the United States appear in the leaked database, alongside another two Canadian area codes, mapping to 67 discrete geographic locations. Not all the area codes and locations match, however, suggesting that perhaps the locations aren't derived directly from the area code data.

Despite some initial scepticism about the provenance of the data I've confirmed that this is a real data set. A quick trawl through the data has got multiple hits amongst my own friend group, including some I didn't know were on Snapchat (sorry guys).

Since the last two digits were obscured in the leaked dataset, the partial phone number string might, and frequently does, generate multiple matches amongst the 4.6 million records against a comparison number. I compared the several hundred US phone numbers amongst my own contacts against the database (you might want to do that yourself) and generated several spurious hits where the returned user names didn't really seem to map in any way to my contact. That said, as I already mentioned, I found several of my own friends amongst the leaked records, although I only knew it was them for sure because I knew both their phone number and typical choices of user names.
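The reason for the spurious hits is easy to see in a few lines of Python. With the last two digits hidden, every leaked record could belong to any of 100 full phone numbers, so looking up a contact returns everyone who shares the same first eight digits. The sketch below is a rough illustration only: the record layout (an obscured number ending in "XX" plus a user name), the helper names, and the sample records, which use fictional 555 numbers, are all assumptions rather than the format of the released files.

```python
from collections import defaultdict


def digits_only(text):
    """Keep just the digits, dropping punctuation, spaces and 'X' padding."""
    return "".join(ch for ch in text if ch.isdigit())


def contact_prefix(number, hidden=2):
    """Reduce a full US number to the part that survives in a leaked record."""
    digits = digits_only(number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading US country code
    return digits[:-hidden]


def build_index(records):
    """Map each visible number prefix to every user name that shares it."""
    index = defaultdict(list)
    for obscured_number, username in records:
        index[digits_only(obscured_number)].append(username)
    return index


def candidates(index, contact_number):
    """All user names whose obscured number is consistent with the contact."""
    return index.get(contact_prefix(contact_number), [])


# Entirely made-up sample records for illustration.
sample_records = [
    ("21255501XX", "alice_nyc"),
    ("21255501XX", "bob1987"),
    ("41555502XX", "carol"),
]
sample_index = build_index(sample_records)
print(candidates(sample_index, "+1 (212) 555-0142"))
```

In this made-up sample the first contact comes back with two candidate user names, and nothing in the data says which, if either, is the right one. That is why knowing a friend's typical user name was what turned a possible match into a confirmed one.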


As it stands, therefore, this data release is not yet critical, although it is certainly concerning, and for some individuals it might well be unfortunate. However, if the SnapchatDB creators choose to release their full dataset things might well get a lot more interesting. If the full data set were released to the public, or obtained by a malicious third party, then the username, geographic location, phone number, and screen name (which might, for a lot of people, be their actual full name) would be available.

This eventuality would be bad enough. However, by taking this data and cross-correlating it with another large corpus of data, say from Twitter or Gravatar, and trying to find matching user or real names on those services (people tend to reuse usernames on multiple services after all), you might end up with a much larger aggregated data set including email addresses, photographs, and personal information. While there would be enough false positives, if matching solely against user names, that you'd have an interesting data cleaning task afterwards, it wouldn't be impossible. Possibly not even that difficult. I'm not interested in doing that correlation myself. But others will.