Facebook and Cambridge Analytica

This has been a tumultuous week for Facebook (FB), marked most notably by the Cambridge Analytica (CA) revelations and subsequent fallout. We explore what exactly happened, why, and the potential ramifications. 


The CA/FB scandal is best understood first with an overview of the pertinent facts. CA was bankrolled by the Mercer family, an exceedingly rich, very conservative, highly political father/daughter tandem (the father co-founded Renaissance Technologies, a highly successful quantitative hedge fund), in collaboration with Steve Bannon (of Breitbart and Donald Trump fame) and the SCL Group, a British PR firm (SCL is the entity that actually did much of the CA work through subcontractor agreements). CA worked with a number of campaigns around the globe, including Ted Cruz's ultimately failed presidential bid and, most notably, Donald Trump's campaign. Some CA employees worked with the Trump campaign itself; they appear to have been engaged primarily in traditional political work like TV ad buying, which did not involve the data in question (discussed below). Other employees worked for a PAC that supported Trump ("Make America Number 1"), but not for the campaign itself; this effort appears to have leveraged the more dubiously obtained data. Chris Wylie, a former CA employee, revealed data practices consisting of the following steps:

  1. Aleksandr Kogan, a researcher at the University of Cambridge, used FB to obtain data through an app he created, "thisisyourdigitallife" - a personality quiz. The quiz gave Kogan access to all of a user's friends and their profiles (this kind of harvesting of user data was permitted through FB's APIs at the time, and the data was relatively easy to download)
  2. 270,000 people used the app and consented to the data requested, meaning more than 50 million profiles were downloaded in compliance with FB's terms of service
  3. CA paid $7M for the data and then used it for commercial/political purposes, even though Kogan had obtained FB's permission to use the data only for academic purposes
  4. FB subsequently received assurances from Kogan and CA that they had deleted the data - but this was not true. 
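The amplification from consenting users to harvested profiles can be illustrated with a back-of-the-envelope calculation. The average friend count below is a hypothetical figure chosen so the totals line up with the reported numbers, not a published statistic:

```python
# Sketch of the friend-graph amplification described in the steps above.
# avg_friends is an assumption; it also ignores overlap between friend
# lists, so it overstates the number of *unique* profiles reached.
consenting_users = 270_000   # people who installed the quiz app
avg_friends = 185            # assumed average friends per consenting user
profiles_exposed = consenting_users * avg_friends
print(f"{profiles_exposed:,}")  # 49,950,000 - roughly the 50M+ reported
```

The point of the arithmetic is that only a tiny fraction of the affected users ever saw a consent screen; everyone else was swept in by the friends-data permission.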

The above facts give rise to three separate scandals. The media has, at times, not clearly delineated the three, which may confuse matters somewhat. They are discussed below:

1) CA's Misuse of FB Data - CA misused the data it obtained from FB. Kogan should not have transferred it to them, and they used it in violation of FB's terms of service and lied (or at least meaningfully misled) about it. CA was effectively a PR and campaign strategy firm, and it was boastful about its data strategy - and it also represented Donald Trump. Trump is an extremely controversial president, so the degree to which a misuse of data contributed to his election, especially when compounded by possible Russian influence, Wikileaks, and other outside factors, has created significant angst about both FB and CA and the role this data played in getting Trump elected. Certainly CA should not have leveraged this data, as it was not rightfully acquired or used. But, in reality, CA was likely not terribly important in Trump's election.

The Obama campaign was particularly effective in leveraging data - including Facebook data - to influence the vote. Its strategies included having users upload their contacts' data, identifying which of their friends were likely swing voters, and having those users target those friends with appropriate messaging. To the degree that one is upset about the use of FB data for political purposes per se, the Obama campaigns leveraged such data in an even more aggressive and effective fashion (albeit in an approved way). There is actually a litany of anecdotes about CA being decidedly ineffective. In fact, the Trump campaign opted not to use the CA data, instead leveraging the Republican National Committee's data as its primary source.  

It is almost certainly not the case that Trump is president because of CA (statistically, the outcome is most directly attributable to James Comey). CA may also have colluded with "the Russians" - there is a lot to that story, but it is not about the FB data. Indeed, there may be many ways CA broke the law, and it will have to face the consequences. This part of the scandal will likely result in legal action against CA, or in a more general set of legislation around how data can be used for political campaigns and/or the type of disclosure that would be required.


2) FB's providing access via APIs to a large number of third parties - the amount of access that FB provided to third parties was, and remains, unprecedented. It was generally granted without meaningful "consent" from its users. While an individual might consent to giving an app their info, that consent may or may not have been truly "knowing" or "informed," and the API at the time also allowed that user to effectively consent to the conveyance of all of their friends' likes and interests - something those friends never consented to. This API functionality was curtailed in 2014, but before then it was a feature, not a bug: it was announced at developer conferences, and FB boasted about it. FB was trying to leverage apps and the APIs to create an ecosystem with the FB ID at its epicenter. It is assuredly the case that similar - likely larger - troves of data still exist across any number of apps. 

An important part of Mark Zuckerberg's sorry-not-sorry tour this week was his declaration that FB would audit apps that downloaded data before the 2014 API changes. FB will simply not succeed in ensuring that data that has been in the wild for four years is cut off. It is on thumb drives, in the cloud, etc. One should assume that data similar to what CA had is broadly available - including to bad actors and subcontractors not bound by any contract with FB - and that it will continue to be leveraged for some time. 

FB's decision to make this data available is a significant issue, and is much more salient than CA's misuse of it. Indeed, focusing on potential misuse of data by apps - which was Zuckerberg's focus in his interviews - substantially misses the point and deflects from the underlying problem. Indeed, when The Guardian originally reached out to FB about this story, FB's first reaction was to threaten to sue the paper, suggesting the sanctity of user data was not high on its list of priorities. 

FB's sharing of this data likely violates a consent decree it entered into with the FTC. The decree stemmed from a 2011 claim by the FTC that FB had engaged in unfair and deceptive practices by making public data that its users considered private. Under the decree, FB agreed not to share user data without consent. The level of data sharing with third-party apps that enabled the CA scandal probably was not permitted under it. 

That said, this data sharing occurred only through 2014, when the company released v2.0 of its API. Other than the notion of an audit - which assuredly won't fix the underlying problem of the massive data leakage that has already happened - there isn't much FB can do at the moment. FB will probably face penalties, possible regulation, and financial fallout (e.g. shareholder suits, class-action suits, etc.) on this point, but what's done is done. 


3) FB's general level of data collection - FB is nearly unique in the world in terms of data collection (though Google is similar). While there may have been some vague sense that these companies were compiling enormous dossiers on each individual, the public is now beginning to understand the scope of this information. Individuals - and notably legislators - are taking notice. 

On this point, the concern is not about advertisers changing how they spend, but about a confluence of other factors: consumers beginning to learn how their data is collected and used - and altering their behaviors accordingly - and the significantly increased regulatory risk that comes from regulators learning the same. This is the challenge that FB does not want to address - and thus it has focused its response substantially on point 2, steering the discussion toward that much-more-addressable issue. 

FB was already the big tech company that people trusted the least, and this certainly did not help. It will likely lose some users, but probably not enough to move the needle (the FB app actually moved up in the app store this week, despite the scandal), especially given the company's ownership of two primary alternatives - Instagram and WhatsApp. While a very small number of advertisers may cut back on their FB spend, the platform remains highly relevant for those seeking consumer attention and accurate targeting. Unlike the YouTube scandal, this is not about brands being adjacent to content that undermines their message - this is about the very data that FB has sold brands on using. But the scandal and outrage likely mark the beginning of the end of Facebook's mostly regulation- and litigation-free journey. It will face user data restrictions (particularly in Europe), and it may face antitrust, unfair-trade-practice, and similar regulatory hurdles. The timing of these effects will be on the order of years, and their nature is largely unknowable.