Right now, a famously secretive organization is amassing personal information about citizens the world over at an unprecedented level. The material they are gathering ranges from contact information to employment history, known personal relationships to specific financial information, sexual orientations to biometric data (one’s unique physical appearance). Their goal is to know as much about every living person as possible, with or without their consent—and this massive documentation effort was revealed only by accident thanks to a flaw in the code behind one of their systems. What is this organization? The FBI? NSA? Russian intelligence? Chinese hackers?
It’s Facebook. And they aren’t discriminating between those who use their services and those who don’t. Even if you have never signed up for Facebook or any of their affiliated companies, their database nonetheless includes an unnervingly meaty and constantly growing file on you.
If you are a willing participant in Facebook’s business, you likely have no recourse to protest that they are preserving records of the information you share with or through them. After all, you volunteered as much. As a non-user, however, you may have never given a second thought to their data collection practices. And as it turns out, Facebook probably knows as much about you as they do about many of their active users. In addition to profiles of active users, Facebook maintains significant data on non-users—in what are ominously known as “shadow profiles.” How they build these shadow profiles may not only erode whatever trust you may have in Facebook, but may also cause you to question how you interact with everyone from new acquaintances to your closest friends and family.
The first building block of shadow profiles is information you have volunteered to your associates and colleagues. Information like the phone number you gave your best friend, the email address you gave your coworker, or the photo you gave to your ex. When all of those people agreed to Facebook’s data policy, they consented to the company’s view that any data you share with them is theirs to do with as they please. In essence, Facebook’s stance is that any information that you share with any other person becomes the property of the person you shared it with. And Facebook demands access to that data in exchange for their products. Thus, Facebook gains access to any information any of your contacts have about you the instant they consent to using Facebook. Of note, active users have shadow profiles too; while a given user may not have added their cell phone number to their profile, for example, if one of their “friends” uploads a contact list containing said number, Facebook automatically associates the number with the correct user. Thus, active users have a two-layered profile on Facebook—the info they add and share manually, and the additional cache of data that only Facebook can see.
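To make the mechanism described above concrete, here is a minimal sketch of how uploaded contact lists could be merged into one record per person. It is purely illustrative: the `ShadowIndex` class, its field names, and the phone-number matching rule are hypothetical, and nothing here reflects Facebook's actual systems.

```python
def normalize_phone(raw):
    """Keep digits only, so differently formatted copies of a number match."""
    return "".join(ch for ch in raw if ch.isdigit())


class ShadowIndex:
    """Hypothetical store that keys everything it learns by phone number."""

    def __init__(self):
        self.by_phone = {}

    def ingest_contacts(self, uploader, contacts):
        """Merge one uploader's address book into the per-person records."""
        for entry in contacts:
            key = normalize_phone(entry["phone"])
            record = self.by_phone.setdefault(
                key, {"names": set(), "emails": set(), "known_by": set()}
            )
            record["names"].add(entry["name"])
            if entry.get("email"):
                record["emails"].add(entry["email"])
            # Each upload also reveals a relationship: who knows this person.
            record["known_by"].add(uploader)


# Two people upload address books that both contain the same third person.
index = ShadowIndex()
index.ingest_contacts("alice", [{"name": "Dana R.", "phone": "+1 (555) 010-0000"}])
index.ingest_contacts(
    "bob",
    [{"name": "Dana", "phone": "1-555-010-0000", "email": "dana@example.com"}],
)
record = index.by_phone[normalize_phone("+1 (555) 010-0000")]
print(sorted(record["known_by"]), sorted(record["emails"]))
```

In this sketch, the person behind the number never interacts with the system, yet the record accumulates names, an email address, and a list of acquaintances from other people's uploads alone.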
For the record, Facebook is not the only company that maintains such a policy. For example, LinkedIn may suggest that you connect with someone from whom you bought an item on Craigslist seven years ago. How did LinkedIn know you had interacted if you never shared your contacts? Because that person did. And you emailed them. Once. Seven years ago. Sound creepy? This example is about as innocuous as it gets.
Facebook has historically denied compiling information about non-users, and for most people aware of their practices during the company’s early days, this was reassurance enough. That is, until a flaw in Facebook’s programming revealed a darker truth. In 2013, cybersecurity firm Packet Storm disclosed a critical bug in one of Facebook’s features that allows users to download their own information. As it turned out, information from Facebook’s shadow profiles of active users was unintentionally being included in these downloads, revealing a glimpse into Facebook’s data collection practices. Some six million accounts were said to have been affected, and the bug had been active and unchecked for the better part of a year by the time it was discovered. Users who suddenly realized that information they had deliberately kept off of Facebook was there all along were none too pleased, to say the least. And Facebook’s galling response was that all of the data they held was given to them voluntarily, and that they could do whatever they pleased with it, regardless of how it endangered or outraged their users.
When we share our personal information with others, we typically consider it a good-faith exchange in which violation of that faith can be met with some consequence; essentially, the equivalent of renting a VHS tape with its accompanying FBI warning that this is for your eyes only. Facebook takes a different stance; they view such an exchange as a grant of unlimited license to distribute the information to any entity, at any time, for any reason. Under current law, this is entirely permissible so long as it is disclosed in a terms of service agreement, which, in this case, it is. Whether such agreements get read in full by most users is not really in question. Most people simply click a button and accept legally binding terms of a relationship they may not fully or even partially understand.
If that were the extent of Facebook’s shadow profiling, it would already seem invasive to many people, but this data collection juggernaut doesn’t stop there. Facebook has invested tremendous sums of money in facial recognition software, the technology they employ to help users tag themselves and others in photos. Once a person is tagged in a photo, Facebook matches the biometric markers of their physical identity to their name (and the rest of their personally identifying information) and files it away in that person’s shadow profile. This has not gone uncontested; for example, a recent class action lawsuit in Illinois contends that collecting this data without express consent is illegal under state law. Such activity is nonetheless presently legal in the majority of the world. In the United States, Texas is the only other state with laws specifically pertaining to biometric data collection at the time of this writing.
Once Facebook matches a person’s “faceprint” to the rest of their data, it is a trivial matter for them to associate any image uploaded thereafter with the owner of said face. It would be similarly straightforward for Facebook to do the same with images from public cameras, security footage, or any other photos or videos. As with the rest of their data, Facebook claims that it will never share this information with any person, company, or government agency, but the best intentions have their limits (does PRISM sound familiar?). Subpoena power may very well evolve into a tool for discovering user profile information and more, not to mention the growing threat of hacking. And while it would in theory be more difficult for Facebook to associate biometric data with the shadow profile of a non-user, since non-users would not have shared what they look like with Facebook, it is far from impossible. After all, Facebook users can tag non-users in photographs manually, and if enough users upload contact lists containing the same non-user and then tag that person in multiple images, what’s stopping Facebook from deciding that they have enough data to definitively identify said non-user’s biometric information?
This is hardly the full extent of the data collection. Facebook also actively purchases material from third-party data brokers in order to fortify shadow profiles with additional information. As reported recently by ProPublica, this data goes beyond seemingly innocent things like the types of food a person enjoys: it includes financial information such as how many credit cards an individual has, what their income range is, even the typical price range of their frequent purchases. Oracle’s Datalogix, one of many firms that Facebook contracts with, provides some 350 types of such consumer information. This data is bought and sold in bulk, containing facts about Facebook users and non-users alike. Facebook contends that the opt-out process for users who don’t want this information sold to Facebook and others should be performed via the data broker, and that Facebook itself does not need to disclose this information to their users on the basis that it is publicly available through other means.
ProPublica indicated in their report that removing one’s data from third-party brokers’ databases ranges from challenging to impossible, offering little comfort to anyone who values their privacy. Further, it is hardly a stretch to assume, given Facebook’s stance on the data being publicly available, that shadow profiles of non-users contain the same level of this commoditized information as do those of registered users. Given that purchases of this bulk data include material about users and non-users alike, it would be a simple task for Facebook to store records they already bought in what, from a programming standpoint, would be the most logical place.
Facebook first started contracting with outside data vendors in 2012—the year they implemented the flawed code that ultimately led to the discovery of shadow profiles in the first place.
The level of personal information Facebook harvests grows continually as the company invests more and more in their primary revenue model, targeted advertising. According to technology publication Tom’s Hardware, Facebook currently tracks individuals’ online behavior by recording what websites they visit, regardless of whether they are logged into their Facebook account, or are even a user in the first place. This is accomplished by tracking visits to sites that display Facebook “like” buttons, of which at this point there are a staggering number. Thanks to the convenience of offering Facebook login tools and the SEO benefit to sites that offer such social sharing tools, the “like” button and the option to log into third-party websites via Facebook have become commonplace. It is entirely possible that any given non-user’s shadow profile contains comprehensive documentation of their every online move. One’s only recourse is to avoid websites that feature “like” buttons, which, of course, cannot be identified without visiting those websites in the first place.
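As a rough illustration of why an embedded button can double as a tracker, the sketch below models what any third-party widget host could log when a browser fetches its button. It assumes the general behavior of the web, in which such a request typically carries a `Referer` header naming the page and any cookie the host set earlier. The function `handle_widget_request` and the log layout are hypothetical and are not taken from Facebook's code.

```python
import uuid
from urllib.parse import urlsplit


def handle_widget_request(headers, visit_log):
    """Simulate a widget host receiving a request for its embedded button."""
    # Reuse the identifier from an earlier visit, or mint one for a new browser.
    visitor_id = headers.get("Cookie") or "visitor-" + uuid.uuid4().hex
    # The Referer header tells the host which page embedded the button.
    site = urlsplit(headers.get("Referer", "")).hostname or "unknown"
    visit_log.setdefault(visitor_id, []).append(site)
    # A real host would send this value back as a cookie for future requests.
    return visitor_id


# The same browser loads two unrelated sites that both embed the button.
log = {}
vid = handle_widget_request({"Referer": "https://news.example/story"}, log)
handle_widget_request({"Referer": "https://shop.example/cart", "Cookie": vid}, log)
print(log[vid])
```

Note that no account or login appears anywhere in this sketch: the identifier alone is enough to string separate page visits into a single browsing history.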
Facebook has long claimed that its data collection methods are completely legal, leaving non-users who do not want the company gathering their personally identifying information with no choice but to let their discontent smolder and accept this non-consensual private surveillance. As of this writing, no sufficient legal challenge in the US has taken Facebook to task on this issue. Thankfully, in the EU there have been many such rebukes due to that region’s far more consumer-friendly laws regarding online privacy and data collection. The US, however, remains a legal environment far friendlier to corporations and government agencies that view individual privacy as an outdated concept. For them, anything that stands in the way of their goals is to be circumvented, no matter the consequences.