R Text mining - how to change texts in R data frame column into several columns with bigram frequencies?


In addition to the question "R Text mining - how to change texts in R data frame column into several columns with word frequencies?", I am wondering how I can make columns with bigram frequencies instead of just word frequencies. Again, many thanks in advance!

This is the example data frame (thanks to Tyler Rinker).

       person sex adult                             state code
1         sam   m     0            computer fun. not fun.   k1
2        greg   m     0           no it's not, it's dumb.   k2
3     teacher   m     1                        should do?   k3
4         sam   m     0                     liar, stinks!   k4
5        greg   m     0                    telling truth!   k5
6       sally   f     0                  how can certain?   k6
7        greg   m     0                     there no way.   k7
8         sam   m     0                     distrust you.   k8
9       sally   f     0                    talking about?   k9
10 researcher   f     1            shall move on?  then.   k10
11       greg   m     0 i'm hungry.  let's eat.  already?  k11

And here is the data set above (it ships with qdap as DATA):

library(qdap); DATA

The dev version of qdap (which should go to CRAN within the next few days) does ngrams, so you'll need to use the dev version. On this toy data set it's fast, but on a larger data set such as qdap's mraja1 data set it requires ~5 minutes to complete. You could:

  1. Select your bigrams more wisely (i.e., don't use them all, as there's going to be a ton)
  2. Wait the time
  3. Run it in parallel
  4. Figure out another way to do this
  5. Get a faster computer
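
As a rough illustration of option 1, one way to select bigrams more wisely is to keep only those that occur at least some minimum number of times before running the search. This is a minimal base R sketch, not part of qdap; the vector `all_bigrams` and the threshold `min_count` are made-up names and values for illustration.

```r
# Hypothetical vector holding every bigram occurrence found in a corpus
all_bigrams <- c("it's not", "no way", "it's not",
                 "not fun", "no way", "it's not")

# Count how often each bigram occurs, most frequent first
freq <- sort(table(all_bigrams), decreasing = TRUE)

# Keep only bigrams seen at least min_count times (assumed threshold)
min_count <- 2
keep <- names(freq)[freq >= min_count]

keep
```

The shorter `keep` vector could then be passed on as the list of terms to search for, which reduces the work on larger data sets.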

Here's the code to get the dev version of qdap and run the bigram search:

    ## installs the dev version from GitHub (current devtools expects "user/repo")
    library(devtools)
    install_github("trinker/qdap")
    library(qdap)

    ## gets the bigrams
    bigrams <- sapply(ngrams(DATA$state)[[c("all_n", "n_2")]], paste, collapse=" ")

    ## searches by grouping variable for bigram use
    termco(DATA$state, DATA$person, bigrams)

    ## to get just the raw values
    termco(DATA$state, DATA$person, bigrams)[["raw"]]
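
If installing the dev version is not an option, the same idea (one column per bigram, one row per person) can be sketched in base R. This is only an illustration under simple assumptions, not qdap's implementation: the helper `get_bigrams` is hypothetical, the toy data frame reuses a few rows from the example above, and bigrams are formed across sentence boundaries within a text.

```r
# Toy data: a few rows borrowed from the example data frame
df <- data.frame(
  person = c("sam", "greg", "sam", "greg"),
  state  = c("computer fun. not fun.", "no it's not, it's dumb.",
             "liar, stinks!", "there no way."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip punctuation (keeping apostrophes), split the
# text into lower-case words, and paste each word to its successor
get_bigrams <- function(text) {
  cleaned <- tolower(gsub("[^[:alpha:]' ]", " ", text))
  words <- unlist(strsplit(cleaned, "\\s+"))
  words <- words[nzchar(words)]
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}

# Build one row per (person, bigram) occurrence
pairs <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
  bg <- get_bigrams(df$state[i])
  data.frame(person = rep(df$person[i], length(bg)), bigram = bg,
             stringsAsFactors = FALSE)
}))

# Cross-tabulate: rows are persons, columns are bigrams, cells are counts
bigram_counts <- as.data.frame.matrix(table(pairs$person, pairs$bigram))

bigram_counts
```

The result can be merged back onto other per-person columns by row name if a single wide data frame is needed.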
