Making a ‘wordcloud’ in R
Wordclouds (I like to write it as one word) provide a great visual representation of texts that are otherwise hard to make sense of. For busy researchers, what can be more fatiguing than having to go through endless pages to extract the key themes? A wordcloud is a sleek answer to that, and it can be made in R in a matter of minutes.
Here is a quick demo, with the help of the following packages:
install.packages(c("tm", "SnowballC", "wordcloud", "RColorBrewer", "RCurl", "XML"))
We will use a text file of Jane Austen’s Pride and Prejudice, available for free from Project Gutenberg.
filePath <- "https://www.gutenberg.org/files/1342/1342-0.txt"
text <- readLines(filePath)
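The wordcloud() call further down reads from a data frame d with word and freq columns, which this post does not show being built. Here is a minimal sketch of one way to get there with tm. The sample_text vector is a made-up stand-in for the lines read from Gutenberg, and the intermediate names (docs, tdm, v) are my own choice, not necessarily what was used for the figures.

```r
library(tm)

# Toy stand-in for the novel; swap in `text` from readLines() for the real thing
sample_text <- c("pride and prejudice", "pride and vanity", "vanity and pride")

# Collapse all lines into a single document so the term matrix stays small
docs <- Corpus(VectorSource(paste(sample_text, collapse = " ")))

# Count how often each term occurs
tdm <- TermDocumentMatrix(docs)
v <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

# One row per word, most frequent first
d <- data.frame(word = names(v), freq = v)
head(d, 10)
```

Note that TermDocumentMatrix() drops words shorter than three characters by default, so very short words never reach d.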
After a few lines of code:
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
The wordcloud package does a great job of producing an eye-catching figure…
We get a shorter list of words by raising the min.freq of the keywords to five:
or ten:
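Both variants differ from the first call only in min.freq. Here is a small sketch of that, where the data frame d is a hypothetical toy table (the counts are invented for illustration, not taken from the novel) and set.seed() is my addition to make the layout repeatable.

```r
library(wordcloud)
library(RColorBrewer)

# Hypothetical word/frequency table standing in for the real one
d <- data.frame(word = c("elizabeth", "darcy", "bennet", "letter", "ball"),
                freq = c(30, 20, 12, 7, 3))

# Words that survive each threshold
kept_at_5  <- d$word[d$freq >= 5]
kept_at_10 <- d$word[d$freq >= 10]

set.seed(1234)  # fixes the random placement of words
# Only words occurring at least five times are drawn
wordcloud(words = d$word, freq = d$freq, min.freq = 5,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))

# Same call with the stricter threshold of ten
wordcloud(words = d$word, freq = d$freq, min.freq = 10,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
```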
Right now the text contains apostrophes and many commonly occurring words (e.g. am/is/are/me/too), aka stop words. The tm_map function makes it easy to remove special characters and stop words, which produces a much neater chart.
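A rough sketch of that cleaning step, assuming the standard tm transformations. The sample sentence is made up for the demo, and the exact set and order of steps is my guess at a sensible default rather than a fixed recipe.

```r
library(tm)

# Made-up sentence standing in for the novel's text
docs <- Corpus(VectorSource("Am I too late? It's a truth, universally acknowledged!"))

# Lower-case everything so "Am" and "am" count as the same word
docs <- tm_map(docs, content_transformer(tolower))

# Strip punctuation; ucp = TRUE also covers Unicode marks such as curly apostrophes
docs <- tm_map(docs, removePunctuation, ucp = TRUE)

# Drop common English stop words (am, is, are, me, too, ...)
docs <- tm_map(docs, removeWords, stopwords("english"))

# Squeeze the gaps left behind by the removed words
docs <- tm_map(docs, stripWhitespace)

cleaned <- content(docs)
cleaned
```

The cleaned corpus can then be turned into a word/frequency table and passed to wordcloud() as before.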