Synopsis
To an extent, data science is synonymous with or related to terms like business analytics, operations research, business intelligence, competitive intelligence, data analysis and modeling, and knowledge extraction (also called knowledge discovery in databases or KDD). It’s just a new spin on something that people have been doing for a long time.
There’s been a shift in technology since the heyday of those other terms. Advancements in hardware and software have made it easy and inexpensive to collect, store, and analyze large amounts of data, whether that be sales and marketing data, HTTP requests from your website, customer support data, and so on. Small businesses and nonprofits can now engage in the kind of analytics that was previously the purview of large enterprises. Of course, while data science is used as a catch-all buzzword for analytics today, data science is most often associated with data mining techniques such as artificial intelligence, clustering, and outlier detection. Thanks to the cheap technology-enabled proliferation of transactional business data, these computational techniques have gained a foothold in business in recent years where previously they were too cumbersome to use in production settings.
In this book, I’m going to take a broad view of data science. Here’s the definition I’ll work from:
Data science is the transformation of data using mathematics and statistics into valuable insights, decisions, and products.
This is a business-centric definition. It’s about a usable and valuable end product derived from data. Why? Because I’m not in this for research purposes or because I think data has aesthetic merit. I do data science to help my organization function better and create value; if you’re reading this, I suspect you’re after something similar.
With that definition in mind, this book will cover mainstay analytics techniques such as optimization, forecasting, and simulation, as well as more “hot” topics such as artificial intelligence, network graphs, clustering, and outlier detection.
Some of these techniques are as old as World War II. Others were introduced in the last 5 years. And you’ll see that age has no bearing on difficulty or usefulness. All these techniques, whether or not they’re currently the rage, are equally useful in the right business context. And that’s why you need to understand how they work, how to choose the right technique for the right problem, and how to prototype with them. There are a lot of folks out there who understand one or two of these techniques, but the rest aren’t on their radar. If all I had in my toolbox was a hammer, I’d probably try to solve every problem by smacking it real hard. Not unlike my two-year-old. Better to have a few other tools at your disposal.
Contents
- Everything You Ever Needed to Know about Spreadsheets but Were Too Afraid to Ask
- Cluster Analysis Part I: Using K-Means to Segment Your Customer Base
- Naïve Bayes and the Incredible Lightness of Being an Idiot
- Optimization Modeling: Because That “Fresh Squeezed” Orange Juice Ain’t Gonna Blend Itself
- Cluster Analysis Part II: Network Graphs and Community Detection
- The Granddaddy of Supervised Artificial Intelligence—Regression
- Ensemble Models: A Whole Lot of Bad Pizza
- Forecasting: Breathe Easy; You Can’t Win
- Outlier Detection: Just Because They’re Odd Doesn’t Mean They’re Unimportant
- Moving from Spreadsheets into R