Text Mining with R: A Tidy Approach

Name: Text Mining with R: A Tidy Approach
Rating: 4.33 (13 reviews)
ISBN: 9789352135769

Julia Silge

Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.

Learn how to apply the tidy text format to NLP
Use sentiment analysis to mine the emotional content of text
Identify a document's most important terms with frequency measurements
Explore relationships and connections between words with the ggraph and widyr packages
Convert back and forth between R's tidy and non-tidy text formats
Use topic modeling to classify document collections into natural groups
Examine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages

Genres: Programming, Nonfiction, Coding, Technical, Technology, Computer Science, Science

196 pages, Paperback

Published January 1, 2017

76 people are currently reading

217 people want to read

About the author

Julia Silge

4 books, 30 followers

Community Reviews

5 stars: 81 (50%)
4 stars: 55 (34%)
3 stars: 18 (11%)
2 stars: 5 (3%)
1 star: 0 (0%)

Displaying 1 - 13 of 13 reviews

Joanna

130 reviews

December 19, 2023

Did I read this for work? Yes. Was it useful? Also yes. Am I going to count it towards my total for the year? You bet!!

2023 nonfiction

Justohidalgo

79 reviews, 3 followers

August 10, 2020

A really comprehensive book about text mining with R and the tidy tools. While it is understood that some knowledge of R and the tidy tools is required to work through the book's examples, at around the TF-IDF chapter I started to feel that I was spending more time searching Google to see what a specific R function was doing than grasping the theoretical concepts applied to the cases. That made me lose interest and want to find other references. But I finally found the time to finish it, and I have to say that, all in all, this is a good book for seeing how to handle basic-to-medium use cases of text mining with R. I believe it may become a reference book for me when I work on my own datasets.

technical

Sahelanth

45 reviews, 6 followers

Read

July 21, 2018

Great code examples! Easy to emulate, shows the necessary data cleaning and preprocessing and gives good tips for what to do in other contexts. You'll need to be already familiar with R and the dplyr package to get anything out of this book, though.

If you don't know R or dplyr and want to jump straight into natural language processing, I'd instead recommend starting with the vignettes for the tm or quanteda packages.

262 reviews, 35 followers

September 6, 2020

It covers the basics (sentiment analysis, tf-idf, n-gram, topic modelling, and visualization) well and the chapters on case studies are pretty helpful. The use of literature (Jane Austen's novels and more) as data also makes it more engaging to a literary minded reader.

It's just that when the author says "slightly familiar with dplyr and ggplot2" in the preface, she means she is not going to explain any code relating to these two packages. Compared to the code annotated line by line in other online tutorials, this book may not be that accessible to a beginner.

On topic modelling, you may want to google how to determine the number of topics as more systematic approaches to such determination are not covered.

Emil O. W. Kirkegaard

183 reviews, 395 followers

May 10, 2020

Disclaimer: I am not an expert on text mining, but I do have ~8 years of data science experience.

This was a very nice introduction to doing it in R, and the examples were very interesting too. In general, I recommend books by these authors.

My only complaint is that they did not go into details about how they scraped Twitter posts. The API is quite annoying and limited, so one might have to do some regular webscraping. Guess I should read a textbook on that next.

Book is free at https://www.tidytextmining.com/

Tony Murray

7 reviews

August 6, 2020

I enjoyed working through the book, but it is a bit dated at this point and has some areas that no longer work because of outdated packages. At times I had to go to the website and review what they had updated there. There are also times when they don't have the code set up for a user to actually execute it. For example, the code related to the Twitter files was a bit confusing. I had to go to the GitHub repository to actually download the data, and this should have been explained since the book really is a mixture of coding and commentary.

Yuan

31 reviews, 1 follower

July 24, 2022

Although this book is no longer the most up-to-date book on text mining, I really like a lot of the plots (ggplots) in it, in particular for exploratory analysis. You can inspect the outputs at each stage, and visuals are a great way to make sense of the text data and communicate your findings. I will definitely reuse the plots in this book for further work.

Thien Dong

8 reviews

March 9, 2019

A great primer for text mining

The examples are interesting and very easy to follow. If you have any problem applying the techniques to your data set, just a quick search would lead you to the solutions!

Robert Campbell

Author, 9 books, 17 followers

June 20, 2019

Excellent coverage of taking a tidy approach to text analysis, with a generous number of worked examples. The one drawback is that much of the code used requires at least an intermediate-level working knowledge of R.

Pritesh Shrivastava

80 reviews, 6 followers

September 5, 2018

I found this book fairly useful for doing sentiment analysis and topic modelling using the familiar tidyverse tools.

Luis Amigo

7 reviews

May 17, 2019

Nicely written, deep concepts, easy read. IMHO a must-read for anyone interested in text mining, no matter which language he plans to use.

Anthony

154 reviews

May 22, 2019

Good overview of the tidytext library in R. Not the end-all of text analyses, but a good place to begin. I now need to get something to do some analyses on...

coding

Oconnor

76 reviews, 2 followers

April 30, 2021

Awesome book, with great step-by-step code to follow. The authors clearly explain analytical questions and walk through their analysis. It's so good I read it twice.
