Tuesday, April 5, 2011

Facebook Places vs Foursquare Checkins

In this post I'll be analyzing Facebook Places and Foursquare data to better understand their relative competitive positions by looking at the number of venues and checkins they have. While it may seem obvious that Foursquare is clearly winning in this space, I wanted to quantify the extent to which this is true and also see if there are any situations (either by geography or by category of business) where Facebook may have a chance.


So, where's the data from?

In order to collect the data for this analysis, I used the Facebook Graph API and the Foursquare API. The first step was to collect a list of venues in Facebook's database and another list of venues in Foursquare's database for a particular geographic region. Since there is no easy way to query the APIs for a list of all the places/venues and their checkin counts, I had to split up the geographic region into a grid and get the venues for each of these smaller regions. I then combined and normalized the data to get a list of venues for each service.
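
To make the grid step concrete, here is a minimal Python sketch. It is not the code used for this analysis: the function names, the bounding-box convention (south, west, north, east) and the "id" field are all assumptions made for illustration, and the actual API calls are left out.

```python
def grid_cells(south, west, north, east, rows, cols):
    """Yield (south, west, north, east) boxes that tile the region."""
    lat_step = (north - south) / rows
    lng_step = (east - west) / cols
    for r in range(rows):
        for c in range(cols):
            cell_south = south + r * lat_step
            cell_west = west + c * lng_step
            yield (cell_south, cell_west,
                   cell_south + lat_step, cell_west + lng_step)


def cell_center(cell):
    """Center point of a cell, usable as the lat/lng of a nearby search."""
    south, west, north, east = cell
    return ((south + north) / 2, (west + east) / 2)


def dedupe_by_id(venue_batches):
    """Combine per-cell results, keeping one record per venue id.

    Neighboring cells can return the same venue, so the batches are
    merged on a hypothetical "id" field.
    """
    seen = {}
    for batch in venue_batches:
        for venue in batch:
            seen.setdefault(venue["id"], venue)
    return list(seen.values())
```

Each cell (or its center plus a radius) would be passed to one venue search request, and the per-cell results merged with dedupe_by_id. A finer grid means more requests but fewer venues lost to per-request result caps.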

Next, to compare Facebook with Foursquare data I had to match places in the Facebook data set with venues in the Foursquare one. Unfortunately, the data sets aren't exactly clean and there can be differences between how a place is listed in Facebook and Foursquare. To get around this limitation, I wrote an algorithm that fuzzy matches places in both data sets by using a combination of name, address and geo location. The algorithm can match places even if they aren't listed in exactly the same way. By design, it is overly conservative in its matching - so that only places that really are the same end up in the "matched" data set. Once the matching was complete, the APIs were used to get the checkin counts for each matched venue. The aggregate statistics are presented below.
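
The matching algorithm itself isn't published here, so the following is only a sketch of what a conservative name + address + location matcher could look like. The thresholds, field names and the use of Python's difflib are assumptions, not the method actually used.

```python
import math
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(a, b):
    """String similarity in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def distance_m(lat1, lng1, lat2, lng2):
    """Haversine great-circle distance in meters."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def is_match(fb, fsq, name_min=0.85, addr_min=0.7, max_m=150.0):
    """Conservative match: location, name and address must all agree.

    The three thresholds are illustrative guesses, not tuned values.
    """
    if distance_m(fb["lat"], fb["lng"], fsq["lat"], fsq["lng"]) > max_m:
        return False
    if similarity(fb["name"], fsq["name"]) < name_min:
        return False
    return similarity(fb["address"], fsq["address"]) >= addr_min
```

Requiring all three signals to pass is what keeps false positives low, at the cost of missing real matches whose listings differ a lot.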

The Results

Unfortunately, trying to do the analysis for an entire country would have taken far too long computationally and would certainly have run into API limit issues. Instead, below are the raw results for 6 different North American cities (San Francisco, Cambridge, Manhattan, Toronto, Orlando and Cleveland).

The first obvious conclusion here is that Foursquare has more venues and more checkins than Facebook Places. However, this does vary by geography and venue category. For example, in Orlando Facebook is actually very close to Foursquare in terms of venue count, but in San Francisco Foursquare has almost double the venues (see chart 1 and chart 2 below).





The graphs also show how many of the venues in both data sets have been "matched" (we think they are the same venue in both data sets). The actual overlap in venues is likely to be higher than what is shown in the data because our algorithm tries to be overly conservative and minimize false positives to get accurate comparison data.

From these matched venues we can compare checkin counts from Facebook and Foursquare. The following graphs show the percentage of these matched venues for which either Foursquare or Facebook has more checkins than the other. For example, in Manhattan, only 8% of matched venues had more checkins on Facebook than on Foursquare:
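
For reference, the percentages in these graphs boil down to a simple count over the matched venues. A minimal sketch, assuming each matched venue is reduced to a (Facebook checkins, Foursquare checkins) pair and that ties are reported separately (the post doesn't say how ties were handled):

```python
def win_shares(matched):
    """Percentage of matched venues won by each service.

    matched: iterable of (facebook_checkins, foursquare_checkins) pairs.
    """
    fb_wins = fsq_wins = ties = 0
    for fb_count, fsq_count in matched:
        if fb_count > fsq_count:
            fb_wins += 1
        elif fsq_count > fb_count:
            fsq_wins += 1
        else:
            ties += 1
    total = fb_wins + fsq_wins + ties
    if total == 0:
        return {"facebook": 0.0, "foursquare": 0.0, "tie": 0.0}
    return {
        "facebook": 100.0 * fb_wins / total,
        "foursquare": 100.0 * fsq_wins / total,
        "tie": 100.0 * ties / total,
    }
```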



Whereas in Cleveland, Facebook wins almost twice as often, at 14.2% of venues:



It seems that outside of major tech hubs (NY, SF and Toronto), Facebook Places does much better (though still nowhere near Foursquare). In Orlando, Facebook Places wins a whopping 23.9% of matched venues. This seems reasonable: given Facebook's existing penetration throughout the US, it gets more usage where the population isn't as drawn to new tech services like Foursquare.

Finally, we can look at how each service fares when broken down by venue category. For example, in Cambridge, we can see that Facebook Places "wins" (has more checkins than Foursquare does at the same place) when looking at travel spots. This seems consistent with the view that visitors from outside Cambridge are more likely to use Facebook, since Foursquare is very popular within Cambridge itself.
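
The category breakdown is the same comparison, grouped by category. A small sketch, assuming each matched venue is a dict with hypothetical "category", "facebook_checkins" and "foursquare_checkins" fields:

```python
from collections import defaultdict


def facebook_win_rate_by_category(venues):
    """Map each category to the % of its venues where Facebook leads."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for venue in venues:
        category = venue["category"]
        totals[category] += 1
        if venue["facebook_checkins"] > venue["foursquare_checkins"]:
            wins[category] += 1
    return {cat: 100.0 * wins[cat] / totals[cat] for cat in totals}
```

Categories with only a handful of matched venues will swing wildly, so it is worth looking at the per-category totals before reading much into any single percentage.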


Looking at other cities - one interesting trend is that for the most part, Facebook Places does best at College & University locations. Any guesses why (please comment)?













The above are only a few snapshots of all the data collected. You can find the full analysis + data sets here:

Additionally, and more for fun, I created a few visualizations of the data. Below is Facebook and Foursquare checkin data plotted on Google Earth. Pink lines represent Foursquare, green lines represent Facebook. The height of each line represents the number of checkins. This only shows checkin data for venues where I could find a match on Facebook and Foursquare.













What doesn't this show?

This data is merely a snapshot (as of April 1, 2011); it doesn't show how things are changing over time. Is Facebook catching up in venues/checkins or is Foursquare growing faster? This analysis would have to be repeated over a long enough time period to draw any meaningful conclusions (no historic data is available from the APIs).

Assumptions

One critical assumption in this analysis is that the places and venues that were matched are a representative sample of all the places and venues that Facebook and Foursquare have in common. I can't think of a reason why it wouldn't be representative (i.e. why would spelling errors large enough for my algorithm to discard the possible match be limited to one specific segment of venues/places?).

Further Work

Improve the matching algorithm - it currently matches very few of the overall venues. Is this algorithm too conservative, or is the overlap of places not actually very high? If the overlap isn't very high - what could this mean?

Any gotchas?

While the Facebook API has some fairly generous limits, Foursquare on the other hand hobbles you with a limit of 5,000 requests per hour. When their results only return 50 items at a time, this can be limiting. One way to get around the limit is to register multiple API clients and cycle through them when making requests so you don't hit the API limit on any single client.
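
A rough sketch of the client-cycling idea, using a round-robin pool that also tracks how much of each client's hourly budget has been spent. The ClientPool class and its method names are made up for illustration, the client ids are placeholders, and the actual request code is omitted.

```python
from itertools import cycle


class ClientPool:
    """Round-robin over API client ids with a per-client request budget."""

    def __init__(self, client_ids, per_client_limit=5000):
        ids = list(client_ids)
        self._limit = per_client_limit
        self._used = {cid: 0 for cid in ids}
        self._order = cycle(ids)
        self._count = len(ids)

    def next_client(self):
        """Return the next client id that still has budget left."""
        for _ in range(self._count):
            cid = next(self._order)
            if self._used[cid] < self._limit:
                self._used[cid] += 1
                return cid
        raise RuntimeError("all clients have used their hourly budget")

    def reset(self):
        """Call once the rate-limit window rolls over."""
        for cid in self._used:
            self._used[cid] = 0
```

Before each request you would call next_client() and sign the request with that client's credentials. When it raises, every client is exhausted and the only option is to wait for the next hour.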

Social Strategy Implications

Implications for Local Businesses: As a local business implementing a social strategy, you must be engaging with your customers on both platforms. While Foursquare has a lot of the buzz now (rightfully so), Facebook Places usage is not insignificant, especially outside of SF and NYC. Local businesses just starting to experiment with offering deals in location based services clearly need to do the basics: 1) make sure their venue/place is on both services with up to date information and 2) check stats to see whether Facebook or Foursquare has more usage for their particular venue. It is likely worth offering the same deals on both services as the multihoming cost isn't very high (just check activity on two websites) and you are likely to attract different sets of customers on each service (the high variation in checkins per category seems to confirm this).

The implications for Facebook and Foursquare are less clear. Only having a snapshot of this data at one particular point in time just tells us that Facebook is currently smaller (almost 10x smaller on average) than Foursquare. This seems obvious as Facebook Places launched well after Foursquare. To get a real sense of which of the two is growing faster, you would need to look at aggregate user data (i.e. checkins per active user and how it has changed over time).

Assuming Facebook is currently losing this battle (i.e. not growing faster than Foursquare), there are several ways they could compete. First, they should leverage their success in photos + mobile. Currently in Facebook, you can tag other people in photos - why not be able to tag the place they are at too? I could imagine a pretty slick interface too - since mobile photos contain GPS coordinates in the EXIF data, Facebook could automatically suggest nearby locations when tagging a photo with a place. It could also automatically check in all the users tagged in the photo to the place when the photo is uploaded.

Second, Facebook has the benefit of its users using the system for free-form status updates. If Facebook could extract or match places mentioned in a status update with the actual Facebook place, it could offer to automatically check in that user. More interesting, though, would be if Facebook alerted you when any of your friends have been to the same location so you can ask them questions. For example, if I posted an update saying "Going to Cafe of India, food always smells great when I walk by....", Facebook should be able to use my current location + my update text to check me into Cafe of India in Cambridge. It should also alert me to all my friends who have been there so I can ask them what to order.


5 comments:

  1. This comment has been removed by the author.

  2. Awesome analysis Aleem! You acknowledge this, but I'd be most interested to see the trend in FB vs. 4SQ share over time. Also wondering if FB success with college/university locations is related to Facebook's roots and prominence on campuses. I would guess that a higher % of college-aged acquaintances and friends are likely on FB so broadcasting your whereabouts seems more helpful on FB (if you want all your friends to see). - Kelsey P.

  3. hey Aleem, this is great analysis. I was about to do this kind of thing myself and thank god I googled.
    The data you present is very interesting. A few other datapoints might be interesting (1) comparison of Checkin rate per sq mi in these areas that you got data on, for matching businesses and just absolute numbers (2) user stickiness - you mention this in your blog. It would be interesting to see how many users are really checking in many times vs one-shot users

    Would love to link to this article from my blog. Is that OK? My blog is at http://www.yogatailor.com/WorkLifeYogaBlog/

  4. @Ramesh - thanks and go ahead in linking!

  5. Hello Aleem, I find this very interesting. Maybe this project could help in improving your algorithm:

    http://www.factual.com/products/places-crosswalk

    By the way, do you have your algorithm published in github or similar?
