Published on November 5th, 2013 | by Joseph Iacoviello
Facebook and Big Data
Facebook is constantly getting in trouble for privacy violations. To generate revenue, Facebook needs to sell the data you give it to advertisers. The more hits Facebook can bring to advertisers, the more money Facebook gets. Because of this, Facebook has a vested interest in targeted marketing, meaning it crunches large sets of data in order to show you ads you are more likely to click on. Facebook has designed algorithms to estimate lots of different things about you: where you are most likely to live, where you probably work, and so on. Now a new research paper claims that Facebook can predict who is in a relationship with whom and when they are going to break up.
This paper, written by Jon Kleinberg, a computer scientist at Cornell University, and Lars Backstrom, a senior engineer at Facebook, will be presented at a social computing conference in February. The study used 1.3 million anonymous Facebook users, each at least 20 years of age and with between 50 and 2,000 Facebook friends. It tracked these users' relationships every two months over two years. This is a huge amount of data to analyze: about 379 million nodes and 8.6 billion links. Using this data, the researchers found that “embeddedness”, or how many mutual friends you and your significant other have, is actually not a very good indicator of relationship status. They found that a different measure, called “dispersion”, is much more effective. Dispersion measures how poorly connected a couple’s mutual friends are to one another. A couple with high dispersion will have mutual friends from separate groups that are not connected to each other. The algorithm that used dispersion to discover relationships was able to guess relationships with 60% accuracy. What is more interesting, however, is that when the algorithm failed, it was often a sign that the couple was about to break up. A couple that declared a relationship but had low dispersion was 50% more likely to break up in the next two months than a couple with high dispersion.
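To make the two measures concrete, here is a minimal Python sketch on a toy friendship graph. The names, the graph, and the helper functions are all made up for illustration. The `dispersion` function is a simplification: it just counts pairs of mutual friends who are not friends with each other, which matches the description above but is not necessarily the exact formula used in the paper.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected friendship graph as a dict of name -> set of friends."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def mutual_friends(graph, u, v):
    """Friends that u and v have in common (excluding u and v themselves)."""
    return (graph[u] & graph[v]) - {u, v}


def embeddedness(graph, u, v):
    """Number of mutual friends shared by u and v."""
    return len(mutual_friends(graph, u, v))


def dispersion(graph, u, v):
    """Simplified dispersion (an assumption, not the paper's exact formula):
    the number of pairs of mutual friends who are NOT friends with each other."""
    common = sorted(mutual_friends(graph, u, v))
    return sum(1 for s, t in combinations(common, 2) if t not in graph[s])


# Hypothetical toy data: ann and bob share three friends.
# cat and dan know each other; eve knows neither of them.
edges = [
    ("ann", "bob"),
    ("ann", "cat"), ("bob", "cat"),
    ("ann", "dan"), ("bob", "dan"),
    ("ann", "eve"), ("bob", "eve"),
    ("cat", "dan"),
]
graph = build_graph(edges)

print("embeddedness:", embeddedness(graph, "ann", "bob"))
print("dispersion:", dispersion(graph, "ann", "bob"))
```

In this toy graph, ann and bob have three mutual friends, and two of the three possible pairs of those friends do not know each other. Two people can share many mutual friends (high embeddedness) who all belong to one tight group (low dispersion), which is why the two measures can disagree.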
Large amounts of data, commonly referred to as “Big Data”, are routinely used to predict human behavior. For example, Twitter can use tweet timestamps to detect when most people in a certain region are asleep. Using nothing but the amount of traffic from a given area, Twitter has discovered that people in Istanbul don’t sleep much in August, and that the citizens of São Paulo, Brazil take a siesta after lunch. Interestingly enough, this type of data isn’t violating any privacy laws, because “Big Data” doesn’t need your name. Personal information is often referred to as “white noise” by people who gather Big Data. Whether or not Big Data infringes on your privacy is still a matter of opinion. Big Data is most useful for gathering information about a certain population, like Twitter’s discovery of sleeping hours, or for finding patterns in social circles, like Facebook predicting when couples will break up. On the other hand, this is exactly the type of data that the NSA is collecting with its PRISM program, which many people argue does violate privacy. Unfortunately, whether you think it violates your privacy or not, there is very little you can do about data collection. Everything is tracked in this information age, from census data to browsing history. In today’s world, there is practically no way to avoid ending up in some database.