Data Mining and Analytics are the foundation technologies for the new knowledge-based world where we build models from data and databases to understand and explore our world. Data mining can improve our business, improve our government, and improve our lives, and with the right tools anyone can begin to explore this new technology on the path to becoming a data mining professional. This book aims to get you into data mining quickly. Load some data (e.g., from a database) into the Rattle toolkit and within minutes you will have the data visualised and some models built. This is the first step in a journey to data mining and analytics. The book encourages the concept of programming by example and programming with data - more than just pushing data through tools, but learning to live and breathe the data, and sharing the experience so others can copy and build on what has gone before. It is accessible to many readers, not just those with strong backgrounds in computer science or statistics. Details of some of the more popular algorithms for data mining are very simply and, more importantly, clearly explained. Technology for transforming a database through data mining and machine learning into knowledge is now readily accessible.
There are several GUIs for R - R Commander, an R Excel plug-in, etc. Rattle is an interface for doing data mining. The book explains things in a straightforward and practical way, and the interface is pretty cool. My favorite thing is the log, which shows you all of the R code, so you can see which libraries and functions are used to generate your results.
This is a decent intro to data mining with Rattle, but it really is about the tool. There is some R in it, but I wouldn't consider this a textbook for that. It does walk you through the standard workflow and go over the different kinds of analysis you can do, but it's not for a complete noob. I found Rattle itself buggy, and my review may have been influenced by that (although it probably shouldn't have been).
Data Mining with Rattle and R is an excellent book. The author has put a graphical shell on top of the R language, and structured it around the main steps of the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology. If you have a set of data that is reasonably well prepared (Rattle does provide some useful options to transform data, but it's not the answer to this thorny problem) then you can be up-and-running and experimenting with data almost immediately.
One important element of Rattle is that it is an 'R' code snippet generator. You express what you want to do and it produces the code (and there are a good number of configuration options to customise it). However, you can then take that code back into 'R' and modify and improve it to your heart's content: you are not constrained (or limited) by Rattle's high-level approach. This is a great way of learning a new language, especially 'R' which has over 4000 independently developed packages running on top of it (Williams also does an excellent job of highlighting and including the analytics and visualisation packages which have proved their worth to the community, and uses them from within Rattle).
His style of writing is clear, and his section on Building Models (including cluster analysis, association analysis, decision trees, random forests, boosting and support vector machines) is particularly well structured.
The worked examples do have a number of errors in them. I was particularly frustrated by the occasional inability to take the Rattle-generated R code and run it natively inside RStudio. Further, some of the database connectivity doesn't work (I wanted to use SQL Server, which is currently broken), but you can't reasonably expect one individual - in their spare time - to support myriad storage products.
This book is excellent, and Rattle should be considered by any commercial organisation that wants to get a Data Mining team up-and-running very quickly and effectively, and to produce models for taking into production. It is not a substitute for subject-matter expertise and does not address handling "Big Data" volumes of information, but it is a highly effective way of getting a team started.
I started reading this because I was getting a bit fed up with Witten's book that covers data mining using Weka; I felt that Witten et al. were being extremely long-winded and not doing that great a job of explaining, which is a bad combination.
This book was an excellent antidote. Good explanations, and a good tutorial for Rattle (an alternative to Weka, especially in the R environment). I'm giving it only four stars because it could have covered some additional techniques that are in Rattle but not in the book.