R You Ready for DIY Statistics and Social Media Marketing Analytics?

This post originally appeared in Social Media Marketing Magazine, Issue 1, Number 6

Statistics. For many current and former students, just the sound of this word evokes nightmares. However, once you learn to master some basic tasks using statistical analysis software, the world becomes your data playground. Thus, the purpose of this column is to introduce you to one of the most powerful statistical analysis software packages (open source!) and to provide examples of how it can be used to build your social media marketing analytics capabilities.

R is one of the most popular statistical analysis software programs available. It is used by statisticians, financial analysts, marketing researchers and social media researchers. To download and install R, go to the Comprehensive R Archive Network (CRAN) site.

You’ll find versions of R built for Linux, Mac OS X and Windows. In my opinion, the versions for Linux and Windows work best. Mac users are known to run into problems when installing and updating add-on packages (3,311 are currently available and, like R, all are free). Those experiencing problems are encouraged to consider installing VirtualBox and running the Linux version of R inside it. After selecting, downloading and installing the base program, visit the Quick-R site for examples of how to use R for statistical analysis.

Two recent add-on packages for social media analysis can be installed in R: RGoogleTrends and twitteR. The former is one of the most difficult R add-ons to install (it took me a couple of hours to make it work), and the latter performs many of the same functions as NodeXL, one of the software packages recommended earlier in this column.


RGoogleTrends is not available through the “install packages” option within R. To install RGoogleTrends, one must visit this site to download the .tar.gz file (I use 7-zip for Windows to unzip the file). Hyunyoung Choi and Hal Varian provide a unique perspective on the power of RGoogleTrends in their 2009 white paper “Predicting the Present with Google Trends”. Joe Rothermich developed an interesting presentation (2011) that illustrates how RGoogleTrends can be used to measure market sentiment and events. Download and read both to get an idea of the power of R from a social media marketing perspective.
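For readers who prefer the R console to the menus, a source package can usually be installed straight from the downloaded archive, without unzipping it first. The sketch below assumes the file was saved to R's working directory; the file name is a placeholder, not the actual name of the download.

```r
# Hypothetical file name: replace it with the name of the .tar.gz file you downloaded.
tarball <- "RGoogleTrends.tar.gz"

if (file.exists(tarball)) {
  # repos = NULL tells R to install from a local file rather than from CRAN;
  # type = "source" is needed because the archive holds source code, not a binary.
  install.packages(tarball, repos = NULL, type = "source")
} else {
  message("Download the archive first, then point 'tarball' at it: ", tarball)
}
```

When installing from a local file this way, R does not fetch the packages that the add-on depends on, so those may need to be installed separately beforehand.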


The easier of the two R add-on packages to download and install (it can be done with the “install packages” option included in the R base package), twitteR provides users with access to the networks of businesses or people who are on Twitter. Using twitteR, one can perform network analysis tasks that include basic statistics. Jeffrey Breen recently presented (2011) an incredible example of how to use R and twitteR to mine Twitter for consumer attitudes. His presentation includes the R code needed to replicate his research.
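To give a feel for what mining text for attitudes can look like in R, here is a minimal sketch of one simple approach: count the positive words in each message and subtract the count of negative words. It is not Breen's code and does not call twitteR; the word lists, the sample messages and the function name score_sentiment are all made up for illustration, and a real analysis would use a published opinion lexicon and actual tweets.

```r
# Tiny, hypothetical word lists (illustration only).
positive_words <- c("good", "great", "love", "happy")
negative_words <- c("bad", "awful", "hate", "late")

# Score each text as (number of positive words) - (number of negative words).
score_sentiment <- function(texts, pos, neg) {
  vapply(texts, function(txt) {
    # Replace punctuation and digits with spaces, then lower-case the text.
    cleaned <- tolower(gsub("[^[:alpha:][:space:]]", " ", txt))
    words <- strsplit(cleaned, "[[:space:]]+")[[1]]
    words <- words[nzchar(words)]  # drop empty strings left by the split
    sum(words %in% pos) - sum(words %in% neg)
  }, numeric(1), USE.NAMES = FALSE)
}

# Made-up messages standing in for tweets.
tweets <- c("Love the new app, great job!",
            "Flight was late again. Awful service.",
            "Boarding now")

scores <- score_sentiment(tweets, positive_words, negative_words)
print(data.frame(tweet = tweets, score = scores))
```

A positive score suggests a favorable message, a negative score an unfavorable one, and zero a neutral or unscored one. Word counting of this kind misses sarcasm and negation, so the scores are best read in aggregate rather than tweet by tweet.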

Admittedly, none of this is easy. But spending the time to master R, RGoogleTrends and twitteR will make you a better social media marketing researcher. R you up to the challenge?


Social Media Marketing Still Generating Interest

Is interest in social media marketing waning, as reported by some researchers and pundits? The short answer is no. Using some creative code written for R, as highlighted in a blog post entitled “Visualizing Wikipedia search statistics with R”, I developed a graph of the number of daily searches for the term “Social Media Marketing” on Wikipedia, presented below.

[Figure: Wikipedia Search Traffic for Social Media Marketing]

The results are interesting from a couple of perspectives. First, little to no search traffic on Wikipedia exists for Social Media Marketing (SMM) prior to the third quarter of 2008. Next, growth in the number of searches for the term SMM approximates a linear trendline with a positive slope for the period beginning with the third quarter of 2008 through the end of 2010. And finally, although the number of searches on Wikipedia for SMM in 2011 does not exhibit consistent growth, it is not declining either.
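Checking whether a series approximates a linear trend with a positive slope takes only a few lines in R. The sketch below fits a straight line with lm() to synthetic daily counts generated for the same date range; the numbers are simulated for illustration and are not the Wikipedia figures discussed above.

```r
set.seed(42)  # make the simulated data reproducible

# Daily dates from the third quarter of 2008 through the end of 2010.
days <- seq(as.Date("2008-07-01"), as.Date("2010-12-31"), by = "day")
day_index <- seq_along(days)

# Synthetic search counts: a rising line plus random noise (NOT real data).
searches <- 100 + 1.0 * day_index + rnorm(length(days), sd = 50)

# Fit a linear trend: searches = intercept + slope * day_index.
fit <- lm(searches ~ day_index)
slope <- coef(fit)[["day_index"]]

cat("Estimated change in daily searches per day:", round(slope, 2), "\n")
print(summary(fit)$coefficients)
```

A positive, statistically significant slope in the coefficient table supports the growth reading; plotting the series with plot(days, searches) and overlaying the fit with lines(days, fitted(fit)) shows how closely the points follow the line.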

Additional research is needed in order to correlate the spikes in SMM search activity with events that may have caused these anomalies. More spikes are noticeable in 2011 than in any other time period. Overall, evidence suggests that interest in SMM is stable at about 1,000 searches per day. And, if you’re an optimist, based on the results for the past couple of weeks, interest in SMM may be entering another growth phase.

Do you think that interest in Social Media Marketing has peaked?


Price, Add-ons and Graphics: In My Opinion the Statistical Program R Kicks SPSS’ SAS

Tired of paying an initial cost, an annual licensing fee and a charge for each add-on package for your statistical analysis software? It’s time for you to switch to R, an incredible open-source program for statistical analysis and graphics. R was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand in 1993. It has become the statistical analysis software of choice for statisticians, financial analysts and economists. Marketing researchers, and business schools in general, have been slow to adopt R. Thankfully, this is changing.

As an avid user of open source (Linux Mint operating system, OpenOffice, Firefox, etc.), I made the switch to R two years ago when I started teaching marketing research again. Both my undergraduate and graduate marketing research classes utilize the R software package for data analysis. Five reasons for you to make the switch to R are presented below.

Reason 1: Price

R is free. I know that my students will be able to afford to use the software after they graduate. In addition, each add-on module for specialized statistical analysis is free. To date, there are more than 2,437 add-on packages available, including structural equation modelling, model-based cluster analysis and lattice graphics.

Reason 2: Multi-Platform Usability

R works in Windows, Mac and Linux. This removes any excuses that students can offer about software compatibility.

Reason 3: Graphics

R’s capability for generating graphics is unparalleled. The ability to incorporate colors, graph in three dimensions and, in some packages, grab and rotate the graphic for different views makes R the king of statistical analysis software in this category. Two examples are provided: a three-dimensional rabbit (in color) with individual data points highlighted and a three-dimensional graph from a model-based cluster analysis.
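As a small taste of three-dimensional graphics, the sketch below draws a colored, shaded surface using only base R and the volcano elevation matrix that ships with it. It is not one of the two examples mentioned above. Grabbing and rotating a plot with the mouse relies on add-on packages (rgl is one I have in mind), whereas persp() produces a static view from the angles you choose.

```r
# 'volcano' is a matrix of elevations included with R (datasets package).
# When run as a script, draw to a null device so no file is written.
if (!interactive()) pdf(NULL)

view <- persp(volcano,
              theta = 135, phi = 30,            # viewing angles in degrees
              col = "lightgreen", shade = 0.5,  # surface color and shading
              border = NA,                      # hide the grid lines
              xlab = "x", ylab = "y", zlab = "Elevation",
              main = "Base R perspective plot")

if (!interactive()) invisible(dev.off())
```

Changing theta and phi and re-running the call gives a different viewpoint, which is the static equivalent of rotating the graphic.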

Reason 4: Ease of Data Import and Export

R makes it simple to import data in multiple formats. My students enter data in Microsoft Excel or in OpenOffice Spreadsheet and import what they need by using the copy and paste functions. My preference is to import data from a spreadsheet as a .csv file. R allows you to import SPSS or SAS datasets and export to these same formats.
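The .csv route can be sketched in a few lines. The example below writes a small made-up survey table to a temporary .csv file, as a spreadsheet export would, and reads it back with read.csv(); the column names and values are hypothetical.

```r
# Hypothetical survey data standing in for a spreadsheet export.
survey <- data.frame(
  respondent   = 1:3,
  channel      = c("Twitter", "Facebook", "Blog"),
  satisfaction = c(4, 5, 3)
)

# Export to a temporary .csv file (row.names = FALSE avoids an extra column).
csv_path <- tempfile(fileext = ".csv")
write.csv(survey, csv_path, row.names = FALSE)

# Import the file back into R as a data frame.
imported <- read.csv(csv_path, stringsAsFactors = FALSE)

str(imported)                # check column names and types
mean(imported$satisfaction)  # a first summary statistic
```

For the copy-and-paste route on Windows, read.delim("clipboard") reads cells copied from a spreadsheet. For SPSS files, the foreign package that ships with R provides read.spss(), for example foreign::read.spss(path, to.data.frame = TRUE), where path is the location of your own .sav file.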

Reason 5: As Part of the Open Source Community, R is Continuously Improving and Expanding

Since 1997, updates to R have been managed by the R Development Core Team, whose members work collaboratively from all over the globe to contribute code, debug, provide documentation and develop add-ons.

It is this open source approach, managed by some of the top statistical and scientific talent available, that makes R so robust and so appealing.


Make no mistake about it, R is old-school cool. Users have to learn to use the command line for their statistical analysis. Seasoned marketing researchers, like me, were taught how to do this in SAS, BMDP and SPSS before they became menu-driven packages. New users face a steep learning curve, but the effort pays off in the end. Once you understand the commands in R, switching to SPSS or SAS is a walk in the park. And the Python extension in SPSS allows users to run the plethora of statistical add-ons available in R.

So how do you get started? Visit the R-Project website and learn as much as you can about the R statistical software. Then go to the Comprehensive R Archive Network to download the latest release (currently R 2.11.1). Install R on your computer and begin the relationship. In my opinion, the best source of information for adapting to R as a former SPSS or SAS user is the Quick-R website by Robert I. Kabacoff.

No excuses remain. Join us in using R as your statistical analysis platform or become obsolete in marketing research. The choice is yours.