This SCCL work will be presented at the Computer-Supported Cooperative Work (CSCW) 2013 conference [1]. It is the product of a diverse, interdisciplinary group of people working to understand the role of affect expression in distributed scientific collaboration.
But there are many more chat messages (half a million!) than can reasonably be labelled manually, so we decided to try to automate the identification of affect expression. Our CSCW 2013 paper describes our approach in detail, including the trade-offs of various decisions in the machine learning pipeline [1]. The automation is neither expected nor intended to replace the human process of interpretation; it provides an analytic lens that makes a large dataset accessible for analysis of the role of affect. Automated affect labels can enable large-scale analysis of social media data, including but not limited to chat logs [3].
We have carried three lessons from our experiments on applying machine learning to affect detection in chat [1]:
- Use specialized features: supplement word counts with features particular to the medium (in chat, after all, it pays to distinguish “what.” from “WHAAAAAAAT”) or particular to the context (such as acronyms or conversation participant names, where known)
- Different features benefit different codes: we were inclusive in adding features to the set, and trained a separate classifier for each code, because features differ in effectiveness across codes (e.g., swear words and “frustration”)
- Use an interpretable classifier: this helped us improve the feature sets, because we could reason about which features the classifier deemed important
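To make the three lessons concrete, here is a minimal Python sketch. It is not ALOE's code: the function names, the acronym list, the "gratitude" code, and the scoring scheme (a smoothed log-ratio of how often a feature appears in messages with versus without a code) are all assumptions chosen for illustration, standing in for whatever classifier a real pipeline would use.

```python
import math
import re
from collections import Counter

# Hypothetical acronym list, for illustration only.
ACRONYMS = {"lol", "brb", "omg"}


def chat_features(message, participants=()):
    """Word counts plus a few chat-specific features for one message."""
    names = {name.lower() for name in participants}
    features = Counter()
    for token in re.findall(r"[A-Za-z']+", message):
        lowered = token.lower()
        features["word=" + lowered] += 1
        # A word of two or more characters typed entirely in capitals.
        if len(token) > 1 and token.isupper():
            features["all_caps"] += 1
        # The same letter three or more times in a row ("WHAAAAT").
        if re.search(r"([a-z])\1{2,}", lowered):
            features["elongated"] += 1
        if lowered in ACRONYMS:
            features["acronym"] += 1
        if lowered in names:
            features["mentions_participant"] += 1
    if "!" in message:
        features["exclamations"] = message.count("!")
    return features


def train_per_code(examples, codes):
    """Build one weight table per code from (features, labels) pairs.

    Each weight is the log-ratio of add-one-smoothed rates at which a
    feature is present in messages with and without the code, so the
    table can be read directly: positive means "suggests this code".
    """
    models = {}
    for code in codes:
        pos, neg = Counter(), Counter()
        n_pos = n_neg = 0
        for feats, labels in examples:
            present = {name for name, value in feats.items() if value > 0}
            if code in labels:
                pos.update(present)
                n_pos += 1
            else:
                neg.update(present)
                n_neg += 1
        weights = {}
        for name in set(pos) | set(neg):
            p = (pos[name] + 1) / (n_pos + 2)
            q = (neg[name] + 1) / (n_neg + 2)
            weights[name] = math.log(p / q)
        models[code] = weights
    return models


def top_features(weights, k=3):
    """Names of the k features that most strongly suggest the code."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Toy data: "frustration" comes from the post, "gratitude" is made up.
examples = [
    (chat_features("WHAAAAAAAT"), {"frustration"}),
    (chat_features("what."), set()),
    (chat_features("ok, thanks alice", participants=["Alice"]), {"gratitude"}),
]
models = train_per_code(examples, codes=["frustration", "gratitude"])
for code, weights in models.items():
    print(code, top_features(weights))
```

Because each code gets its own weight table, a feature such as `all_caps` can matter for one code and be ignored by another, and printing the top features per code is the kind of inspection that makes it possible to reason about, and then revise, the feature set.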
Our resulting pipeline is available on GitHub as a command-line tool, ALOE. In ongoing work, we are incorporating automation provided by ALOE into a web-based tool for large-scale analysis of social media data, TextPrizm [3].
[1] M. Brooks, K. Kuksenok, M. K. Torkildson, D. Perry, J. J. Robinson, T. J. Scott, O. Anicello, A. Zukowski, P. Harris, C. Aragon. Statistical Affect Detection in Collaborative Chat. CSCW 2013.
[2] T. J. Scott, K. Kuksenok, D. Perry, M. Brooks, O. Anicello, C. Aragon. Adapting Grounded Theory to Construct a Taxonomy of Affect in Collaborative Online Chat. SIGDOC 2012.
[3] K. Kuksenok, M. Brooks, J. J. Robinson, D. Perry, M. K. Torkildson, C. Aragon. Automating Large-Scale Annotation for Analysis of Social Media Content. Poster at 2nd Workshop on Interactive Visual Text Analytics, IEEE VisWeek 2012.