*"Some companies have built their very business on their ability to collect, analyze, and act on data"*

– ‘Competing on Analytics’, Harvard Business Review, January 2006.

Many businesses make some use of their customer and market data. For some, it’s just a matter of storing and accessing customer data for record keeping and transactional purposes. Others, like Google, Netflix, IBM, or the Oakland Athletics, make data analysis and analytics a major part of their business model.

By analyzing past business records, data mining and analytics can help identify patterns that support more cost-effective and efficient decisions. This is the specialty of what has recently been dubbed the *data scientist*.

**Do you really have a need for Data Mining and Predictive Analytics?**

There’s only one way to find out how much potential value is buried in your data, and you have to start somewhere. With just a few data mining techniques you can begin to extract insight from your data that you might not otherwise achieve, even after hours or years of poring over lists, row after row and column after column, in Excel.

What’s important is that you have someone who can identify the best tool for the task at hand, whether it’s a traditional experimental design involving analysis of variance, a forecast or time series analysis, a predictive model using logistic regression or decision trees, or one of the many other data mining tools available to a data scientist.

There are several aspects of data mining and predictive analytics that may be useful to you or your organization, including *Data Visualization, Predictive Modeling, Text Mining, Social Network Analysis, and Causal Inference.* I discuss each of these below.

**Data Visualization**

There are more ways to gain insight from your data than just fancy models or algorithms. Data visualization allows you to transmit information to end users without the sometimes distracting statistical terminology, complicated equations, or never-ending Excel sheets.

[Interactive chart: __Revenues and Outlays 2003-2009__, created using the R googleVis package]

**Predictive Modeling**

My most successful analytics accomplishment to this point involves the development of predictive models that we use to identify students at high risk of dropping out at WKU. Working with my team, we’ve incorporated my model metrics into our database/reporting/decision support system so that administrators have access to these high-level analytical tools for strategic decision making. We won an honorable mention from SAS at the recent SAS Global Forum for our presentation (see here for the paper with screenshots). We've since extended this model to predict the probability of enrollment and retention at the application stage, as presented at the 2013 SAS Global Forum.
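
To make the idea of a risk score concrete, here is a minimal Python sketch of how a fitted logistic regression (one of the model types mentioned earlier) turns student attributes into a dropout probability. Everything specific in it is an assumption made for illustration: the feature names, the intercept, the coefficients, and the cutoff are all hypothetical, and none of them come from the WKU models, whose form and inputs are not described on this page.

```python
import math

# Hypothetical intercept and coefficients, invented for illustration only.
# They are NOT the WKU model; its form and inputs are not described here.
INTERCEPT = -1.0
COEFFICIENTS = {"gpa": -0.8, "first_generation": 0.6, "works_full_time": 0.9}


def dropout_probability(student):
    """Logistic score: 1 / (1 + exp(-(intercept + sum(weight * value))))."""
    linear = INTERCEPT + sum(
        weight * student[name] for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def flag_high_risk(students, threshold=0.5):
    """Return the ids of students whose predicted risk meets the threshold."""
    return sorted(
        sid for sid, s in students.items() if dropout_probability(s) >= threshold
    )


# Hypothetical student records keyed by a made-up id.
students = {
    "s1": {"gpa": 3.8, "first_generation": 0, "works_full_time": 0},
    "s2": {"gpa": 1.0, "first_generation": 1, "works_full_time": 1},
}

for sid, record in students.items():
    print(sid, round(dropout_probability(record), 3))

# The cutoff is a policy choice, not something the model dictates.
print(flag_high_risk(students, threshold=0.4))
```

In practice the coefficients would be estimated from historical records rather than typed in, and the cutoff would be chosen by weighing the cost of missing an at-risk student against the cost of an unnecessary intervention.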

**Text Mining**

Twitter, Facebook, email, online forums, open response surveys, and customer and reader comments on web pages and news articles give companies and organizations a lot of information in the form of text. Rather than hiring experts to read through thousands of pages of text and make subjective claims about its meaning, we can use text mining to convert otherwise unusable 'qualitative' data into quantitative measures for various types of reporting and modeling.
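
As a rough illustration of what converting text into quantitative measures can look like, the Python sketch below builds a small document-term matrix, where each row is a comment and each column counts how often a term appears in it. The sample comments are invented for this example, and the simple tokenizer is an assumption; real projects typically add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter

# Hypothetical sample comments, invented for illustration only.
comments = [
    "Great service and fast shipping",
    "Slow shipping, poor service",
    "Great price, great service",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Count how often each term occurs in each comment.
counts = [Counter(tokenize(c)) for c in comments]

# The vocabulary is every distinct term, sorted for a stable column order.
vocabulary = sorted(set().union(*counts))

# Document-term matrix: one row per comment, one column per term.
dtm = [[doc[term] for term in vocabulary] for doc in counts]

print(vocabulary)
for comment, row in zip(comments, dtm):
    print(row, "<-", comment)
```

Once text is in this numeric form, the columns can be used like any other variables in reporting or in a predictive model.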

*Examples:*

Mining Political Tweets

Mining Tweets related to the term 'Factory Farm'

The Mathematics Behind Text Mining

Tools like SAS Text Miner, used in conjunction with SAS Enterprise Miner, are designed specifically to do this type of analysis on a much larger scale. I have used both of these tools in predictive modeling applications. R also offers open source tools for text mining.

**Social Network Analysis**

With the rise in the use of social media, data related to social networks is ripe for analysis using techniques from social network analysis and graph theory. According to the International Network for Social Network Analysis, *‘Social network analysis is focused on uncovering the patterning of people's interaction’.*

Social network analysis (SNA) allows us to answer questions such as: Who are the key actors in a network? Who are the most influential members? Who seems to be acting on the periphery? Which connections in the network are most important? Are there key players bridging connections or information between otherwise disconnected groups? Have policies or other forces changed the overall dynamics of interaction between people in the network (i.e., has the network structure changed in any meaningful way), and does that relate to some other performance outcome or goal?
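
Two of these questions, who the key actors are and who bridges otherwise disconnected groups, can be sketched in a few lines of Python. The network below is hypothetical and invented for illustration; the sketch uses degree centrality as one simple measure of importance and treats an actor as a bridge if removing them disconnects the rest of the network. It assumes the network is undirected and connected to begin with.

```python
from collections import defaultdict

# Hypothetical undirected network, invented for illustration:
# two tight groups (A, B, C) and (E, F, G) linked only through D.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
]

neighbors = defaultdict(set)
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)


def degree_centrality(graph):
    """Share of the other actors that each actor is directly tied to."""
    n = len(graph)
    return {node: len(adj) / (n - 1) for node, adj in graph.items()}


def reachable(graph, start, removed):
    """Actors reachable from start when the actor `removed` is taken out."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bridging_actors(graph):
    """Actors whose removal disconnects the remaining network."""
    bridges = []
    for node in graph:
        others = [n for n in graph if n != node]
        if len(reachable(graph, others[0], node)) < len(others):
            bridges.append(node)
    return sorted(bridges)


centrality = degree_centrality(neighbors)
print(centrality)
print(bridging_actors(neighbors))
```

Note that D has few direct ties yet still holds the two groups together, which is why degree alone rarely tells the whole story. Dedicated SNA software offers richer measures such as betweenness and closeness centrality.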

More specific applications of SNA may include Student Integration and Persistence, Business to Business Supply Chains, Seeding Strategies for Viral Marketing, and Predicting Customer Churn. The open source software R and the NetDraw program provide many tools for conducting social network analysis.

*Examples:*

‘Using SNA in Predictive Modeling’

‘Using Twitter to Demonstrate Basic Concepts from Social Network Analysis’

‘An Introduction to Social Network Analysis Using R and Netdraw.’

**Causal Inference**

Sometimes we want to do more than just predict outcomes or identify key customer segments. Sometimes we want to know if a current practice or promotion is really having an impact on our business. In the case of an applied research setting, we want to know if a given 'treatment' has a statistically significant impact on an outcome of interest. We know that correlation does not always imply causation. In all of these cases we need statistical methodologies that will allow us to infer causation when appropriate, such as quasi-experimental designs.
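
One widely used quasi-experimental design is difference-in-differences, which compares the change in an outcome for a group that received a treatment with the change for a comparison group that did not. The Python sketch below shows only the arithmetic; the group labels and outcome values are hypothetical, the estimate is credible only if the two groups would have followed parallel trends without the treatment, and no test of statistical significance is included.

```python
from statistics import mean

# Hypothetical outcomes (e.g. weekly sales per store), invented for illustration.
outcomes = {
    ("treated", "before"): [10.0, 12.0, 11.0],
    ("treated", "after"): [15.0, 16.0, 14.0],
    ("control", "before"): [9.0, 10.0, 11.0],
    ("control", "after"): [11.0, 12.0, 13.0],
}


def difference_in_differences(data):
    """Change in the treated group minus change in the control group."""
    change_treated = mean(data[("treated", "after")]) - mean(data[("treated", "before")])
    change_control = mean(data[("control", "after")]) - mean(data[("control", "before")])
    return change_treated - change_control


effect = difference_in_differences(outcomes)
print(effect)
```

Here the treated group rises by 4 on average and the control group by 2, so the estimated effect of the treatment is 2. A naive before-and-after comparison of the treated group alone would have attributed the full rise of 4 to the promotion.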

For a very technical look at these methodologies see: Causal Inference Roundup and Quasi-Experimental Design Roundup

**For more information:**

If you feel you can benefit from the services of a data scientist, or have further questions about applied econometrics and analytics, please contact me for more information, or feel free to visit my blog or selected works, where you can find a copy of my CV.

LinkedIn Profile: (link)

Selected Works Profile: http://works.bepress.com/matt_bogard/
