We’ve been working on analysis of social media for a while now, and I thought it would be interesting to try out a comparative analysis.  As I watched my own Twitter stream this morning I saw two conversations kicking around, one on OccupyWallStreet and one on the unrest in Egypt.  It made me curious how conversations on Twitter diverge between a domestic and an international protest.  I had a small window of time to work with, so I kept my Twitter collection to just 1,000 geo-located tweets for each topic.  Since #OccupyWallStreet has been going for quite a while, there was a defined hashtag to tap into.  The unrest in Egypt did not appear to have defined trending terms, so I tested a few, and “Egypt” looked to be used across several countries and languages, all discussing the unrest/protests.  As a side note, it is interesting how widespread the practice is of writing a post in one’s native language but hashtagging it in English.

It turned out that both topics were very popular today: I hit the 1,000 tweet limit in 15 minutes for “OccupyWallStreet” and 14 minutes for “Egypt”.  “Egypt” also had the higher peak at 91 tweets per minute, closely followed by “OccupyWallStreet” at 87 tweets per minute.  That is not surprising, since protests make for popular topics.  What did surprise me was the self-similarity between two different protests on opposite ends of the globe.  I took the data from both sets of tweets and did a scatter plot of sentiment vs. Klout score.  The idea was to see who was saying positive or negative things about the protest vs. how influential they were in the network.  Check out how similar the distributions are between the two protests:

Scatter Plot for OccupyWallStreet

The scatter plot for OccupyWallStreet shows the most influential people sitting in the neutral middle, with the overall trend being negative.  The same pattern can be seen for the unrest in Egypt:

Scatter Plot for Egypt Unrest
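
To put a number on what the two scatter plots suggest, one could bucket users by Klout score and compare how far sentiment strays from neutral in each bucket.  The sketch below is only illustrative: the `(klout, sentiment)` pairs are made-up sample values rather than the collected tweets, and the band edges of 40 and 60 are an arbitrary assumption, not something used in the original analysis.

```python
from statistics import mean

def sentiment_by_influence(points, edges=(40, 60)):
    """Group (klout, sentiment) pairs into low/mid/high influence bands.

    Returns, per non-empty band, the tweet count, the mean sentiment,
    and the mean absolute sentiment (distance from neutral).
    The band edges are an illustrative assumption.
    """
    low_edge, high_edge = edges
    bands = {"low": [], "mid": [], "high": []}
    for klout, sentiment in points:
        if klout < low_edge:
            bands["low"].append(sentiment)
        elif klout < high_edge:
            bands["mid"].append(sentiment)
        else:
            bands["high"].append(sentiment)

    summary = {}
    for name, scores in bands.items():
        if scores:  # skip bands with no tweets
            summary[name] = {
                "n": len(scores),
                "mean": mean(scores),
                "mean_abs": mean([abs(s) for s in scores]),
            }
    return summary

# Hypothetical sample data, for illustration only.
sample = [(15, -3), (22, 2), (30, -2), (45, -1), (52, 1), (70, 0), (81, 0)]
summary = sentiment_by_influence(sample)
for band, stats in summary.items():
    print(band, stats)
```

If the pattern in the plots holds, the mean absolute sentiment should shrink as the influence band rises, while the overall mean stays below zero.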

While it is easy to look at these two scatter plots and deduce that folks are saying lots of negative things about both protests, it is good to keep in mind that sentiment is a tricky thing.  To start with, the algorithms themselves are hard to get right and are far from a perfect science in their current state.  For the record, we are using Repustate’s sentiment engine for this analysis.  The second challenge is that it is difficult to know whether a tweet is being negative about the protest or negative about what is being protested.  Let’s take an example from the OccupyWallStreet tweets:

If #OccupyWallStreet is callin 4 Gov 2 get more N2 my life I’m against U & will GLADLY condone  #Violence 2U  #TeaPartyTerrorist #Bama #RTR

Here we see a tweet that is against the protests and says negative things about the protestors, and it gets a score of “-1”.  Alternatively we have tweets like:

RT @SenatorSanders: How do we change the system to work for all Americans, not just the top 1 percent? 2) Cap credit card interest rates …

Here is a tweet in support of the protests but saying negative things about the status quo, and it gets a score of “-2”.  This is a challenge across the board that will be tough to sort out: even as sentiment analysis gets better, we will still need to work out the directionality of that sentiment.  Keeping these limitations in mind, I think this analysis can at best be looked at as a rough leading indicator.  With that caveat, I dove into some geographic aggregation of the data to see what the trends looked like.

For OccupyWallStreet I aggregated the data by both country boundaries and US state boundaries.  Since it is a domestic protest, I was interested in seeing what both local and global sentiment were around the protests.  I went with a diverging color scheme and standard deviation class breaks.  This generally sets up a situation where negative opinions are red and positive opinions are blue:

In the United States, sentiment often looks to follow political lines, but there are outliers.  The mix of negative and positive sentiment in the Southeast and New England is not what many would predict.  This could be a case where sentiment analysis is not doing a great job, or it could be a leading indicator of a political shift.  At a minimum, these are the locations where I’d start digging into the data and potentially running bigger collections.  Internationally we see a mixed response as well, with many parts of Europe being supportive, which makes sense from a political economy perspective, although Ireland and Spain are intriguing outliers.
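
The aggregation step behind a map like this can be sketched in a few lines.  This is a minimal illustration, assuming each tweet has already been matched to a region (the point-in-polygon step against country or state boundaries is not shown) and using made-up region names and scores rather than the collected data.

```python
from collections import defaultdict

def mean_sentiment_by_region(tweets):
    """Average sentiment per region.

    `tweets` is an iterable of (region, sentiment_score) pairs.
    Returns {region: (mean_score, tweet_count)}; the count is kept
    because averages over very few tweets are not very reliable.
    """
    totals = defaultdict(lambda: [0.0, 0])  # region -> [score sum, count]
    for region, score in tweets:
        totals[region][0] += score
        totals[region][1] += 1
    return {region: (s / n, n) for region, (s, n) in totals.items()}

# Hypothetical sample data, for illustration only.
sample = [
    ("Region A", -2), ("Region A", -1),
    ("Region B", 1), ("Region B", 2),
    ("Region C", -1),
]
averages = mean_sentiment_by_region(sample)
for region, (avg, count) in sorted(averages.items()):
    print(f"{region}: mean sentiment {avg:+.2f} from {count} tweet(s)")
```

With only 1,000 tweets spread across the globe, many regions will have counts as low as "Region C" here, which is one reason to treat the outliers as places to dig further rather than as findings.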

Let’s now compare the aggregation of tweets for the unrest in Egypt, with average sentiment again mapped using a diverging color scheme and standard deviation class breaks:

In this case the break between positive and negative was not in the middle of the color ramp, so light blue is the neutral break point instead of white, and the white areas are negative.  Twitter activity in the Middle East is, not surprisingly, much higher for “Egypt” than for “OccupyWallStreet”.  That is interesting only from the standpoint that the United States appears to be far more interested in unrest in the Middle East than the Middle East is in unrest in the United States, at least on Twitter.  Further, the reaction in Jordan, UAE, Kuwait, Saudi Arabia, Tunisia, and Morocco is negative, where it had been positive for many during the Arab Spring in Egypt back in January.  Perhaps this is a result of the Muslim/Christian friction driving parts of the protest.  Europe is also more mixed on the Egypt unrest than on OccupyWallStreet, but generally sees it more negatively.  Twitter users in the Philippines show strong sentiment in both directions, taking a very strong positive stance on OccupyWallStreet and a very strong negative stance on the Egyptian unrest.
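
The reason the neutral point can drift away from the middle of the ramp is that standard deviation class breaks are centered on the mean of the data, not on zero.  Here is a minimal sketch of that effect, assuming five classes with breaks at the mean ± 0.5 and ± 1.5 standard deviations (one common convention; the mapping software used for the maps may place them differently) and made-up regional averages.

```python
from bisect import bisect_right
from statistics import mean, pstdev

def std_dev_breaks(values):
    """Four class breaks at mean ± 0.5 and ± 1.5 standard deviations.

    This yields five classes; the middle one (index 2) straddles the mean.
    The multipliers are an assumed convention, not taken from the maps.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [mu + k * sigma for k in (-1.5, -0.5, 0.5, 1.5)]

def class_index(breaks, value):
    """Index of the class a value falls into (0 = most negative class)."""
    return bisect_right(breaks, value)

# Hypothetical regional sentiment averages, skewed negative.
regional_means = [-2.0, -1.5, -1.0, -1.0, -0.5, 0.0, 0.5]
breaks = std_dev_breaks(regional_means)

middle_class = class_index(breaks, mean(regional_means))
neutral_class = class_index(breaks, 0.0)
print("breaks:", [round(b, 2) for b in breaks])
print("class holding the mean:", middle_class)
print("class holding neutral (0):", neutral_class)
```

Because the sample skews negative, a score of zero lands one class above the middle of the ramp, so the middle color ends up representing mildly negative sentiment, much as described for the Egypt map.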

Taking a step back, this really small sample of data highlights a seeming similarity in protests universally, but meaningful differences at a local level.  Overall, protests tend to produce primarily negative conversations, with the most influential users typically keeping to the middle, while those with the most polarizing views tend to be less influential.  Yet this self-similarity quickly evaporates when we slice the data by geography.  These geographic proclivities can also be quite dynamic as events shift during the course of a protest or movement.  For the most part this is all common sense, but it sounds better when you write it in academic style ;-)

I did find the similarities of the data in the scatter plots surprising at first, but after giving it more thought, it makes sense from both a macro and a micro perspective.  Sociologists and political scientists have identified universal dimensions of protest for quite some time.  Scientists have also found that patterns often repeat themselves across social networks from a macroscopic perspective, including protests.  Previous spatial analysis of crowd behavior on Twitter found regularities, though, whereas here we find deviations when it comes to protest.  It is really just conjecture with such small data samples, but it is a good first step toward where it might be worthwhile doing some real research.

 
