Megan Torkildson Presenting at CHI Student Research Competition

Posted by Daniel Perry on February 15, 2013

Lab member and HCDE undergraduate Megan Torkildson has been accepted to the CHI Student Research Competition (36% acceptance rate) for her work on “Visualizing Performance of Classification Algorithms with Additional Re-Annotated Data”. The next round of the competition involves a poster presentation during the conference. Currently, she is working with PhD students Katie Kuksenok and Sean Mitchell to run additional user studies on the visualization.

Michael Brooks & Katie Kuksenok Win Shobe Prize

Posted by Daniel Perry on February 12, 2013

Lab members Katie Kuksenok and Michael Brooks have won the 2012-2013 Shobe Prize with their proposal for Feedback Sandwich, an “app for collecting real-time feedback from friends and colleagues in a non-awkward way.” Competing teams submitted a pitch for a technology design project, and two winning teams were selected by a panel of judges to receive $5000, office space, and one-on-one mentoring to develop their product idea. The other winning team was Go-Go-Games, a startup founded by HCDE PhD student Alexis Hiniker and Stanford University Graduate School of Education alumni Joy Wong Daniels and Heidi Williamson.

Statistical Affect Detection in Collaborative Chat (CSCW 2013)

Posted by Katie Kuksenok on December 22, 2012

SCCL work will be presented at the Computer-Supported Cooperative Work (CSCW) 2013 conference [1]. The paper is the result of work by a diverse, interdisciplinary group seeking to understand the role of affect expression in distributed scientific collaboration.

Our affect taxonomy contains 44 codes [2]

We have been working with a tremendous, rich dataset: half a million chat messages sent over four years by several dozen scientists, some in the US and some in France, while sharing a remote telescope. To understand the role of affect, we first developed a method for manually labeling the presence of affect in chat messages [2]. We are interested in high-granularity labels; our taxonomy includes codes such as “excitement” and “frustration.”

But there are many more chat messages (half a million!) than can reasonably be labeled manually, so we set out to automate identification of affect expression; our CSCW 2013 paper gives a detailed description of our approach, including the trade-offs of various decisions in the machine learning pipeline [1]. The automation is not expected or intended to replace the human process of interpretation, but to provide an analytic lens that makes a large dataset accessible for analysis of the role of affect. Automated affect labels can enable large-scale analysis of social media data, including but not limited to chat logs [3].
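
As a rough illustration of the idea (not the pipeline from the paper), detecting a single affect code can be framed as binary text classification. A minimal sketch in Python with scikit-learn, using invented messages and labels:

    # Minimal sketch: a binary classifier for one affect code
    # (e.g., "excitement"). Messages and labels are invented.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    messages = ["we got the spectrum!!!", "telescope is down again", "ok"]
    labels = [1, 0, 0]  # 1 = code present, 0 = absent

    # Bag-of-words features feeding a linear, interpretable classifier.
    clf = make_pipeline(CountVectorizer(), LogisticRegression())
    clf.fit(messages, labels)
    print(clf.predict(["we got it!!!"]))  # predicted label for a new message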

Different codes benefit from different kinds of features [1]

Our experiments yielded three lessons for applying machine learning to affect detection in chat [1] (a code sketch follows the list):

  1. Use specialized features: supplement word counts with features particular to the medium (in chat, after all, it pays to distinguish “what.” from “WHAAAAAAAT”) or particular to the context (such as acronyms or the names of conversation participants, where known).
  2. Different features benefit different codes: we were inclusive in adding features to the set, and trained a separate classifier for each code, because features vary in effectiveness across codes (e.g., swear words and “frustration”).
  3. Use an interpretable classifier: this helped us improve feature sets by reasoning about which features the classifier deemed important.
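
A sketch of these three lessons in Python with scikit-learn; the features, messages, and per-code labels are invented for illustration, and this is not ALOE’s actual code:

    import re
    import numpy as np
    from scipy.sparse import csr_matrix, hstack
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    def chat_features(msg):
        # Lesson 1: medium-specific features alongside word counts,
        # e.g. distinguishing "what." from "WHAAAAAAAT".
        return [
            float(msg.isupper() and len(msg) > 2),      # shouting (all caps)
            float(bool(re.search(r"(.)\1{2,}", msg))),  # stretched letters
            float(msg.count("!")),                      # exclamation marks
        ]

    messages = ["WHAAAAAAAT", "what.", "this is so frustrating, damn it", "ok"]
    labels_by_code = {            # invented per-code labels (1 = code present)
        "excitement":  [1, 0, 0, 0],
        "frustration": [0, 0, 1, 0],
    }

    vectorizer = CountVectorizer()
    words = vectorizer.fit_transform(messages)
    extras = csr_matrix([chat_features(m) for m in messages])
    X = hstack([words, extras])
    names = list(vectorizer.get_feature_names_out()) + ["CAPS", "STRETCH", "BANGS"]

    # Lesson 2: a separate classifier per code, since features help codes unequally.
    for code, y in labels_by_code.items():
        clf = LogisticRegression().fit(X, y)
        # Lesson 3: interpretable weights show which features mattered per code.
        top = np.argsort(clf.coef_[0])[-3:][::-1]
        print(code, "->", [names[i] for i in top])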

Our resulting pipeline is available on GitHub as a command-line tool, ALOE. In ongoing work, we are incorporating the automation ALOE provides into TextPrizm, a web-based tool for large-scale analysis of social media data [3].

[1] M. Brooks, K. Kuksenok, M. K. Torkildson, D. Perry, J. J. Robinson, T. J. Scott, O. Anicello, A. Zukowski, P. Harris, C. Aragon. Statistical Affect Detection in Collaborative Chat. CSCW 2013.

[2] T. J. Scott, K. Kuksenok, D. Perry, M. Brooks, O. Anicello, C. Aragon. Adapting Grounded Theory to Construct a Taxonomy of Affect in Collaborative Online Chat. SIGDOC 2012.

[3] K. Kuksenok, M. Brooks, J. J. Robinson, D. Perry, M. K. Torkildson, C. Aragon. Automating Large-Scale Annotation for Analysis of Social Media Content. Poster at the 2nd Workshop on Interactive Visual Text Analytics, IEEE VisWeek, 2012.

“Gender Browser” Analyzes Gender Composition of Scholarly Articles

Posted by Daniel Perry on October 24, 2012

A recent article in The Chronicle, titled “Scholarly Publishing’s Gender Gap”, discussed the Eigenfactor Project, a University of Washington research effort spearheaded by biologists Jevin West and Carl Bergstrom. The goal of the project is to “use recent advances in network analysis and information theory to develop novel methods for evaluating the influence of scholarly periodicals and for mapping the structure of academic research.” As part of the project, the Bergstrom Lab analyzed two million scholarly articles to determine the gender of the authors and calculate the gender gap in scholarly publishing.

SCCL member Michael Brooks and lab director Cecilia Aragon helped develop the gender browser referenced in the article, in collaboration with Jevin West, Carl Bergstrom, and Jennifer Jacquet. The browser allows readers to view the gender composition of scholarly articles from 1665 to 2011. Aragon and Brooks developed the hoptree visual navigation used in the browser.