
USING DATA AND ANALYTICS POWERS FOR GOOD

RAYID GHANI, TYPED BY TOM LAGATTA

Any errors, inconsistencies, and unclear rambles in the notes are entirely the fault of the typist.

Affiliations: University of Chicago, Edgeflip.

Objective function: maximize the probability of winning 270 electoral votes. Emphasis: winner takes all.

2-2.5 million volunteers (approximately 0.75% of the U.S. population).

Main data source: the voter file, a database of every registered voter in the country. More precisely, every state has its own voter file. The Obama campaign consolidated these voter files into a single database. Most records in the database (e.g., email addresses) are not matched to an entry in the voter file.

Essential quantities:
Support: how likely is somebody to support our side?
Turnout: how likely is somebody to turn out and vote?
Persuasion: how likely is somebody to vote for each side?

Central theme: better than random. Use the data to make estimates that are better than random and, using these estimates, take actions to influence and affect the outcome. Key point: justify the costs of those actions.

Support model. You have some data on who supports whom (e.g., party registration). Augment these data with polling. The central use of polling was to prime the priors for the model.

Inputs to the model: demographics, voting history, email history, fundraising history, calling history.

Constraints on the model. Accuracy: get a good ranked list of supportive people, in order to target actions. Probabilities need to line up with frequencies: if I am a 40% Obama supporter, then 40 out of 100 people like me should be Obama supporters.

Number of features for each person: roughly hundreds. Total database: 10-20 terabytes, very manageable.

Interesting: the Narwhal backend was for web apps, and had little to do with data or analytics. Also: investment for the future.

To be data-driven is to be rational: change actions based on available data. Most organizations are not rational in this sense: they still make decisions based on their guts.[1]

Several channels of communication: direct mail, TV ads, knocking on doors, 5 billion emails.

Persuasion scripts for volunteers: Are you going to vote? Where's your polling place? How are you going to get there? When are you going?
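The requirement in the support model that probabilities line up with frequencies can be checked with a simple binned comparison. The following is a minimal sketch, not the campaign's actual method: the function name `calibration_table` is hypothetical, and the scores and labels are synthetic stand-ins generated so that the "model" is calibrated by construction.

```python
import random


def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins and compare the mean
    predicted probability with the observed frequency in each bin.

    Returns a list of (mean_predicted, observed_frequency, count) tuples,
    one per non-empty bin, in order of increasing predicted probability.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows


# Synthetic data (an assumption for illustration only): each person has a
# support score, and actually supports with exactly that probability.
rng = random.Random(0)
scores = [rng.random() for _ in range(20000)]
labels = [1 if rng.random() < s else 0 for s in scores]

for mean_pred, observed, count in calibration_table(scores, labels):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

In a well-calibrated model the two columns track each other in every bin. A model can rank people well and still fail this check, which is why the notes list the two constraints separately.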
Date: October 2013.

[1] N.B.: this is a particular definition of rationality, and not agreed upon by the whole community.

Goal: identify that small number of people who are persuadable.

Different channels have different purposes: emails and online ads are for fundraising; TV ads are for persuasion.

Primary variables for support: saying "yes, I support Obama", evidence of past support, donations to the campaign.

Fundamentally: this is a ranking problem. Identify supporters by degree.

In an ideal world, we would do something more game-theoretic with regard to persuasion. This is not that world (yet).

There is always a tension between people who are comfortable with data and people who are not. The way to settle this tension is by experiments.

Surprising: Facebook Pages do not have access to full lists of the users who've liked them. Built a tool called Targeted Sharing. Supporters authorize our Facebook app, giving access to their social graph and certain attributes. Try to match them to the voter database (30-40%).

Influence model. How likely is your friend to take an action given that you do?

They had approximately 1 million people authorize the app, and used this to get data on 200 million people. Small-world phenomenon.

There was a sharp correlation between the level of personalization of emails and the level of engagement. Lots of A/B tests. A mistake that nonprofits and campaigns make is that they don't send enough emails!

Clever: "give us $23", but the option is $25. These numbers are carefully optimized.

Prediction is great, but how can we influence behavior?
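The notes mention "lots of A/B tests" without saying how they were evaluated. One common way to compare two email variants is a two-proportion z-test, sketched below under that assumption. The function name `two_proportion_z` and all counts are hypothetical and chosen only for illustration; they are not figures from the talk.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, conv_b: number of recipients who took the action (e.g., donated)
    n_a, n_b:       number of recipients who were sent each variant
    Returns (z, p_value), where z > 0 means variant B converted better.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A is a generic email, variant B a personalized one.
z, p = two_proportion_z(conv_a=200, n_a=10000, conv_b=260, n_b=10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be noise. When many variants are tested at once, as the notes imply, the threshold for significance would need to be adjusted for multiple comparisons.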
