
What’s a Data Scientist?

After the bigdata.sg meetup, I went looking for a clearer definition of “data scientist”, the buzzword of the moment. I like the one in the Yahoo article that interviewed EMC Greenplum’s Steven Hillion. Here is his take (the rest of the article can be found here):

To Hillion, data scientists are “analytically-minded, statistically and mathematically sophisticated data engineers who can infer insights into business and other complex systems out of large quantities of data.”

The skill set of the data scientist goes beyond the capabilities of what many would call “traditional business intelligence (BI).” Traditional BI is interested in the “what and the where,” while data scientists are interested in the “how and why,” Hillion says. “They’re interested in inferring things that are not already present in the data.”

I like the part where he mentions that they are “equal parts engineer, statistician and investigative journalist / forensic reporter”. I can relate to those, but something is missing: what about programming/hacker skills? And, of course, there is the need to understand the business. Data scientists need to listen to people, understand what questions they’re asking, and then read between the lines. Skills in mathematics, statistics, modeling and data mining are of course essential.

Can’t wait to jump into Kaggle and have fun!





