get_diag() is a helper function that computes the mean and median semantic coherence and exclusivity for a list of stm models. It does not work for models with content covariates.

get_diag(models, outobj)

Arguments

models

A list of stm models.

outobj

The out object containing the documents used to fit all stm models in the list.

Value

Returns model diagnostics in a data frame.
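As a rough illustration of what such diagnostics involve, the sketch below averages the per-topic scores returned by stm's semanticCoherence() and exclusivity() for each model. The function name sketch_diag and the column layout (name, statistic, coherence, exclusivity) are assumptions, inferred from the plotting code in the Examples section. This is not the implementation of get_diag() and its output may differ.

```r
library(stm)

# Hypothetical helper for illustration only; not the package's get_diag().
sketch_diag <- function(models, outobj) {
  # Give unnamed models placeholder names so rows can be told apart.
  if (is.null(names(models))) {
    names(models) <- paste0("model_", seq_along(models))
  }
  rows <- lapply(names(models), function(nm) {
    mod <- models[[nm]]
    # Both stm functions return one score per topic.
    coh <- semanticCoherence(model = mod, documents = outobj$documents)
    # exclusivity() is only defined for models without content covariates.
    exc <- exclusivity(mod)
    data.frame(
      name = nm,
      statistic = c("mean", "median"),
      coherence = c(mean(coh), median(coh)),
      exclusivity = c(mean(exc), median(exc)),
      stringsAsFactors = FALSE
    )
  })
  # One row per model and statistic.
  do.call(rbind, rows)
}
```

Under these assumptions the result has two rows per model, one for the mean and one for the median, which is a convenient long format for comparing models with different K in a single plot.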

Examples

# load packages (example output was generated with stm v1.3.6 and quanteda 2.1.2)
library(stminsights) # provides get_diag()
library(stm)
library(dplyr)
library(ggplot2)
library(quanteda)

# prepare data
data <- corpus(gadarian, text_field = 'open.ended.response')
docvars(data)$text <- as.character(data)
data <- dfm(data, stem = TRUE, remove = stopwords('english'),
            remove_punct = TRUE)
out <- convert(data, to = 'stm')

# fit models
gadarian_3 <- stm(documents = out$documents,
                  vocab = out$vocab,
                  data = out$meta,
                  prevalence = ~ treatment + s(pid_rep),
                  K = 3,
                  max.em.its = 1, # reduce computation time for example
                  verbose = FALSE)
gadarian_5 <- stm(documents = out$documents,
                  vocab = out$vocab,
                  data = out$meta,
                  prevalence = ~ treatment + s(pid_rep),
                  K = 5,
                  max.em.its = 1, # reduce computation time for example
                  verbose = FALSE)

# get diagnostics
diag <- get_diag(models = list(
                   model_3 = gadarian_3,
                   model_5 = gadarian_5),
                 outobj = out)

# \dontrun{
# plot diagnostics
diag %>%
  ggplot(aes(x = coherence, y = exclusivity, color = statistic)) +
  geom_text(aes(label = name), nudge_x = 5) +
  geom_point() +
  labs(x = 'Semantic Coherence', y = 'Exclusivity') +
  theme_light()
# }