Data Mining
When I was a student, a long, long time ago, the term data mining described a frowned-upon practice. Only bad people did data mining. The term described running a series of significance tests on a set of data until a statistically significant effect turned up. It was a form of sequential testing, but a very unscientific, atheoretical one. Bad practice.
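To see why that older practice earned its reputation, here is a minimal sketch in Python. It is an illustration, not anything from the original lecture: it assumes a simple z-test on samples of pure noise with known unit variance, and the function names (`two_sided_p`, `family_wise_error`, `count_false_positives`) are made up for this example. Every "effect" it finds is a false positive by construction.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return 1.0 - math.erf(abs(z) / math.sqrt(2.0))


def family_wise_error(alpha, n_tests):
    """Chance of at least one false positive in n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def count_false_positives(n_tests, sample_size, alpha, seed=0):
    """Test the mean of pure-noise samples against zero, over and over.

    Assumption for this sketch: each sample is drawn from N(0, 1), so the
    null hypothesis is true every time and any significant result is spurious.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        # z statistic for the sample mean when the variance is known to be 1
        z = (sum(sample) / sample_size) * math.sqrt(sample_size)
        if two_sided_p(z) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    print("P(at least one 'finding' in 20 tests):",
          round(family_wise_error(0.05, 20), 3))
    print("Spurious findings in 200 tests on noise:",
          count_false_positives(200, 30, 0.05))
```

With a 0.05 threshold, twenty independent tests on noise give roughly a 64% chance of at least one "significant" result, which is why testing until something turns up was considered bad practice.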
All that's changed. Data mining today is a generally accepted practice, a tool that any good data analyst should have in their toolbox. But today's version of data mining isn't really the same as yesterday's version. There are some subtle but important distinctions to be made.
Wikipedia gives a simple definition for the term data mining:
Data mining is the process of extracting patterns from data.
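As a small illustration of what "extracting patterns" can look like, here is a hedged sketch in Python. It counts item pairs that appear together in several records, which is only one simple kind of pattern and is not taken from the post or the lecture. The `baskets` data, the `frequent_pairs` name, and the `min_support` threshold are all invented for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that co-occur in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy data; real data mining works on far larger collections.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

print(frequent_pairs(baskets, min_support=2))
# -> {('bread', 'milk'): 3}
```

Nobody told the code which pair to look for. It scans every pair and reports the ones that recur.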
That's as good a place as any to start defining the concept, but it's just a start. When we use the term data mining we're talking about more than just data; we're talking about a lot of data, a whole lot.
That's one of the ways that data mining differs from statistics. The field of statistics was developed as a tool to extract information from small amounts of data. Small-sample statistics is the backbone of statistical theory and practice. We don't do that with data mining. We're looking at the extraction of information from huge collections of data.
Statistics is based on using probability models to fit data or to test hypotheses. Data mining is a process of uncovering models, not fitting models. A process of forming hypotheses, not testing them.
Here's a video of the first lecture of a course on data mining given for first-year grad students at Stanford and given simultaneously at the Google headquarters.
In this lecture the lecturer is mostly just defining the concept. I'll follow along in subsequent posts.
Labels: Data mining