Human-Centered Data Science Lab (HDSL) Blog

How I Tracked Down a Peculiar Problem in a Fanfic Dataset Using Visualizations – Travis Neils

Posted by Human-Centered Data Science Lab on October 29, 2022
Fanfiction Data Science, Human-Centered Data Science Lab (HDSL) Blog / Comments Off on How I Tracked Down a Peculiar Problem in a Fanfic Dataset Using Visualizations – Travis Neils

Originally posted on June 27, 2021 at

Every review on the site has an associated timestamp telling us exactly when it was posted, or so we thought. When trying to find the hours of peak review activity across different fandoms, I saw that some fandoms had very uneven review count distributions (shown below).


What made this even more confusing was that some fandoms had much more pronounced spikes at 7 and 8 UTC than others. I compared the fandoms with large spikes to those without and noticed that the ones with spikes tended to be fandoms with many reviews in the early 2000s. To see how the distribution of review times changed from year to year, I made a heat map of the average daily distribution by year. I did some data wrangling so that I could put year on the Y-axis and hour of review on the X-axis. Below is the result.
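For readers who want to reproduce the wrangling behind this heat map, here is a minimal sketch. It assumes the review timestamps are available as Unix epoch seconds. The function name `hourly_share_by_year` and the sample values are hypothetical and are not taken from the original analysis code.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def hourly_share_by_year(epoch_seconds):
    """Return {year: [share of that year's reviews in UTC hour 0..23]}."""
    counts = defaultdict(Counter)
    for ts in epoch_seconds:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[dt.year][dt.hour] += 1

    shares = {}
    for year, hours in counts.items():
        total = sum(hours.values())
        # A Counter returns 0 for hours with no reviews.
        shares[year] = [hours[h] / total for h in range(24)]
    return shares


def _utc(year, month, day, hour):
    """Helper for building made-up sample timestamps."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


# Made-up sample: three "old" reviews stuck at 7/8 UTC, three spread out.
sample = [
    _utc(2005, 1, 10, 8), _utc(2005, 7, 4, 7), _utc(2005, 8, 1, 7),
    _utc(2015, 3, 2, 1), _utc(2015, 3, 2, 14), _utc(2015, 6, 9, 22),
]
table = hourly_share_by_year(sample)
for year in sorted(table):
    print(year, " ".join(f"{share:.2f}" for share in table[year]))
```

Each row of `table` is one row of the heat map (year on the Y-axis, hour on the X-axis). Normalizing by the yearly total keeps years with very different review volumes comparable.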

First Exploratory Visualization:


The resulting visualization made the situation much clearer. Every year should look like the ones between 2012 and 2017, where reviews are spread relatively evenly across the day with small variations at peak hours. Before 2012 we see very different behavior. Around 60% of reviews have a timestamp of 7 UTC, and the other 40% have a timestamp of 8 UTC, with no reviews at any other hour. The one exception is 2011, where almost all reviews fall at 7 or 8 UTC but less than 1% are posted at other hours. To get a closer look at 2011, I filtered the data to just that year and used months instead of years on the Y-axis.

Second Exploratory Visualization:


This graph reveals two important clues about what is wrong with the dataset. The first is where the split between the 7 and 8 UTC values comes from. Each value has a specific part of the year where it is the only review time, and the switch happens in March and November. I realized that something else important happens to clocks in March and November: daylight saving time. I looked up the daylight saving dates for 2011, which ran from March 14th to November 6th, and we see those dates reflected exactly in the data. March is split almost evenly because the 14th is close to the middle of the month, while November is split unevenly because the 6th is close to the beginning. The second clue is that at some point in December the timestamps started to match the expected values. To pin down the date of that change, I switched from months to days and found that on December 27th all timestamps are 0 UTC, and after that they seem to be accurate to the minute.
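The daylight saving pattern can be checked with a short sketch. It assumes that the old reviews were stored as a calendar date only and later rendered as midnight US Pacific time. The zone name `America/Los_Angeles` and the function name `utc_hour_of_local_midnight` are assumptions for illustration, not details confirmed by the site. Under that assumption, dates from March 14 through November 6, 2011 map to 7 UTC and all other 2011 dates map to 8 UTC.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; relies on the system's tz database

# Assumption: the site's servers used US Pacific time.
PACIFIC = ZoneInfo("America/Los_Angeles")


def utc_hour_of_local_midnight(day):
    """Return the UTC hour at which `day` begins in Pacific time."""
    local_midnight = datetime(day.year, day.month, day.day, tzinfo=PACIFIC)
    return local_midnight.astimezone(timezone.utc).hour


for day in (date(2011, 1, 15), date(2011, 3, 14), date(2011, 7, 1),
            date(2011, 11, 6), date(2011, 11, 7)):
    print(day.isoformat(), utc_hour_of_local_midnight(day))
```

Because the clock change happens at 2 am local time, midnight on the changeover day still carries the old offset, which is why the boundary dates in the data need not be the same as the official changeover dates.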

Now that I had the issue clearly defined, I had to figure out why it was in our data in the first place, and hopefully fix it. Instead of exploring our collected data further, I saved a lot of time by going right to the source. I went to the site, found some old reviews, inspected the webpage to find the UTC timestamp, and converted the timestamps into datetimes. I found that all the old reviews on the site were either 7 or 8 pm. I wasn’t able to find an exact reason for the site’s inaccuracy, but I believe that when the backend was built in 2000 they decided to save some hard drive space by storing review dates only to the day.
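One way to act on this finding in later analyses is to flag reviews whose timestamps carry no real time-of-day information. The sketch below is not the method used in the original analysis. It assumes timestamps are Unix epoch seconds and that date-only reviews appear as exactly midnight US Pacific time. The name `looks_date_only` and the sample values are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; relies on the system's tz database

# Assumption: date-only reviews were rendered as midnight US Pacific time.
PACIFIC = ZoneInfo("America/Los_Angeles")


def looks_date_only(epoch_seconds):
    """True if the timestamp falls exactly on midnight Pacific time."""
    local = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).astimezone(PACIFIC)
    return (local.hour, local.minute, local.second) == (0, 0, 0)


# Made-up examples: an old review stamped 7 UTC and a recent one with a real time.
old_review = datetime(2005, 7, 4, 7, 0, tzinfo=timezone.utc).timestamp()
new_review = datetime(2015, 7, 4, 19, 42, 10, tzinfo=timezone.utc).timestamp()
print(looks_date_only(old_review), looks_date_only(new_review))
```

A review genuinely posted at exactly midnight Pacific would also be flagged, so a check like this is best combined with the date ranges identified in the heat maps.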


Here is 2015, a typical year, showing what the review distribution should look like. We can see people reviewing later in the day during summer and winter break. 8 UTC is midnight Pacific and 3 am Eastern, and we see the lowest usage during the three following hours, when many of the US reviewers are sleeping.

While creating these visualizations I learned:

1. Visualize both before and after processing data. Before calculating the month from the date, the visualizations didn’t reveal the inaccuracies. After splitting the date into year and hour-of-day variables, the visualizations showed the problems with the dates.

2. Look deeper if something seems weird. When I first saw the problem by accident I almost dismissed it. Going off on a tangent led to a discovery that will be helpful for future research with this dataset.

3. Creating a presentation can lead to new findings. When preparing a presentation for the group I built an interactive version of the graph (linked below). The interactive version showed that the data was missing a lot of reviews from when we were scraping the site in late 2016 to early 2017, another important thing to know when using this dataset.

You can see the code to create the visualizations here:

You can find an interactive version here:


Why Fanfiction Authors Should Find a Tiny Corner of Fandom – Jenna Frens

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on October 29, 2019 at


Fandom is a giant online space made up of thousands of tiny online places, and in this article, we’re going to talk about why it’s important for each author to find a tiny place of their own. Fandom corners are internet places where people connect in small numbers over highly niche interests. In our analysis of 29 hour-long interviews with authors, three central themes around fandom corners emerged: connection, encouragement and feedback. A fandom corner could take the form of a Discord server, a chat group on Facebook or Skype, a board on a less-traveled forum, or the right intersection of tags on AO3. Finding your corner means connecting with relatable people who make you feel comfortable, encourage you, and can give you feedback on your ideas and writing.

Connection and Comfort

Authors feel more connected in fandom corners because of the niche intersections of interests and identity that bring people together. For example, one author spoke with us about a set of Discord servers around a ship and fandom that brought together queer women:

“I particularly like that there’s sort of these little communities of queer women or mostly queer women or queer aligned groups… It’s just nice to talk to people who get it, who get why you’re so excited” (P4).

People in this group had common ground because they shared a traditionally underrepresented identity and they were into the same fandom. The term affinity has been used by internet researchers to describe shared interests and identities. In P4’s experience, affinity in the fandom corner created a safe environment to talk about writing queer sexuality in fanfiction.

“I’ve seen how friendly and nonjudgmental everyone is in responding [to others]. That makes me feel quite safe to go and ask them, ‘how do I write this thing?’ And it’s something that’s quite sort of deeply personal and intimate” (P4).

The community Discord provided authors with a safe place where they could connect with each other in a carefully moderated and curated group. But the feeling of small community extended into public spaces as well, where P4 noticed the same usernames coming up repeatedly in AO3 comments and on Tumblr. Although the overall community is large, the group of people interested in a few specific tags can be very small. P4 began connecting with others over private message.

“When I get the same people commenting on things that I’ve written, that makes me feel like I’m part of a little group… I’m part of the gang that does this. And privately talking to people who’s stuff I read who are other fans, it’s a quite nice feeling of belonging” (P4).

Finding a space where a small group of people connect over niche interests can help authors to feel like they belong, give them a comfortable place to talk about sensitive topics, and help them find people to connect with over private chat. As we discuss next, these connections and comfortable places also support writing, as authors receive encouragement and feedback from their communities.

Overcoming Writer’s Block

Fandom corners are helpful spaces for authors experiencing writer’s block. Almost everybody we interviewed mentioned a time where they felt stuck, unable to make forward progress, or that a scene was just not working. A common strategy was to take a short break from writing and come back to it, but sooner or later authors would turn to others to vent their frustrations or get help. A small community provides the perfect setting for authors to feel comfortable talking about their frustrations and supported by people who are close to them.

“There was a chat that I used to visit a lot… We will be very encouraging toward one another and to encourage others to just continue writing even though we were complaining” (P3).

Small groups provide a space where authors can commiserate about the sheer difficulty of writing, letting them vent out frustrations to an audience that really understands. The encouragement they receive helps them keep going. In addition, authors might send a snippet of their writing and receive fresh ideas that help them to get unstuck.

“I’d copy out a section and paste it into discord… they read the part where I was stuck and said just keep doing this, I think what you’ve done here is cool and maybe try doing this as well” (P12).

“Sometimes just, even though I’m very introverted, turning to my trusted group of friends and having them help me troubleshoot is very, um, it turns a problem into something that’s really fun and silly” (P10).

Because they were comfortable enough to share rough writing when they were stuck, authors could get encouragement and new ideas to help them move forward. Communities also organized ‘sprint’ events, where writers held each other accountable for writing as much as they could for a short period of time.

“We will set time and be like ‘in the next 30 minutes, we are going to write as much as you can and when we come back, share the sentences’. So kinds of being forced to write. Some people come back be like ‘I wrote a thousand words’ and I will be like ‘I got 10’. I will be like I didn’t come up with anything but they will be like ‘well those 10 words you didn’t have them before’. So overall it’s a positive thing” (P3).

The closeness people feel in fandom corners creates an environment where they can commiserate, give each other encouragement, and be accountable for writing. This does a great deal to help authors break through and make progress when they’re feeling frustrated or stuck.


A fandom corner is the perfect place for authors to get feedback on their writing. Since these small communities are places where people share interests and build relationships, authors felt that there was a high likelihood that others would be interested and respond to requests for feedback.

“There’s a couple of different communities where there’s people that I trust, and I might post a general message saying ‘Anyone willing to give a look at this, and tell me what you think?’ And then usually then somebody will reply” (P1).

Authors preferred to get feedback from people they already knew and trusted, especially when seeking feedback on unfinished work. So an online space frequented by those folks was a great place to ask for help. Authors would often get immediate responses when they posted to group chats, allowing them to ask for help as they were in-process with their writing.

“I’ll post a snippet of a scene and it will be hey how do you guys think about this part I am working on it right now. With the discord group they are very immediate. They’re really good for in the moment help…We have a very close friendly relationship” (P15).

Having an ongoing relationship with feedback providers also helped authors to get deeper, more thoughtful feedback. They felt understood by their in-group because of their shared context. If someone has read all of your prior fic, they start with a lot of common ground for giving feedback.

“They’ve all read my fic pretty in depth. So I can be like remember when this happened, or where should I go for this part of, you know, my next venture into this universe or whatever. They know what’s up there so I don’t have to reexplain everything or force them to watch the show or something, so they can understand what I’m thinking all the time” (P17).

The benefit of fandom corners boils down to being understood by others. Authors in these tight-knit communities mutually understand each other’s interests, their writing contexts, and the experience of writing fanfiction. That’s why people in a fandom corner offer each other encouragement and comments in public spaces like Tumblr and AO3.

“I feel like there’s a sort of comment exchanging between writers in fanfic, you know, I’ll comment on yours and you’ll comment on mine, cause we all know how much we love it.” (P4).

People maintain the same connection whether they’re in a small group chat or a public internet space. This raises the question: is the fandom corner the small group chat, or is it the tightly-connected people who are there? We’d love to hear your thoughts on this in the comments.


For every combination of tags, there is an opportunity for people to connect and form a small, close community around their shared interests. Wherever fandom corners emerge, people feel comfortable, build relationships and find support. These little groups are great places for authors to connect with each other, get encouragement when they’re stuck, and get helpful feedback on their in-progress writing. So go find your corner!

About This Series

This series is a breakdown of findings from an interview study run by a fanfiction research group within the department of Human Centered Design & Engineering at the University of Washington. In January and February 2019, we interviewed 29 fanfiction authors to understand how they connect with each other, build relationships, and seek out writing feedback. We learned profound lessons about the importance of building connections, the reciprocality of relationships and feedback, and the intersection of fandom with real life identity.

Authored by Regina Cheng and John Frens. 

This work was first posted to tumblr in August & September 2019.


Four Benefits to Participating in Fanfiction Events – Regina Cheng

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on September 7, 2019 at


It is challenging for writers to get feedback for in-progress works. Fanfiction authors, especially those who recently started writing fanfiction, report facing various barriers connecting with feedback providers. Some frequently-mentioned barriers in our interviews include the anxiety of reaching out to people and requesting feedback, the difficulty identifying people with the right knowledge, skills and interests that they are looking for, and the lack of a community where they can ask for feedback. Fortunately, we’ve found fanfiction authors use multiple strategies to overcome these barriers. One important strategy we are going to talk about in this blog post is participating in fanfiction-related events.

During our interviews, 16 out of 29 authors mentioned events as an important part of their experience in fanfiction communities. These events include a range of experiences, such as “bangs” where artists and fanfiction authors create based on one another’s works, exchanges where authors share story prompts and write stories based on those, and informal activities such as writing sprints and roleplays. Participating in these events can have positive effects on fanfiction authors in many different ways.

Exposure to the community

Getting exposure to new stories and authors is a direct benefit of participating in events. Many of the authors told us that they get to know more people and works in the community. “Secret santa for example is something that makes you possibly read something from someone you otherwise may not read before…” one author said. “There might be good works but I just don’t know because I just don’t read them. So in the whole thing you definitely get more exposure to people you haven’t read their work before.” – p9

Especially for authors who newly entered the community, contributing to events is a good way to get their work read by more people and become recognized in the community: “People aren’t going to see your story if you’re starting out. But if it is for a fest people will check the fest stories on AO3. The fest will be promoting your work. It’s a good way to get involved, a good way to get your work seen by people”. -p13

Connecting with beta readers and feedback providers

Authors can find beta readers and feedback providers by participating in events. In many formal events, organizers will pair participating authors with beta readers who have similar interests: “They do a good job of making sure if you can’t find a beta, they’ll have a list where people are ready.” – p13 “At some point they’re going to ask the writers to send a rough draft of what they have, and send that to the beta readers.” -p24

For authors who’ve had difficulty finding good beta readers, it’s a great opportunity to try out working with one during an event. We heard stories from authors that they first met their long-term beta readers when they were paired with them in an event. After they worked together, they discovered they had matching tastes and could work well with each other, so they continued beta reading and feedback exchange for later works. “That was another sort of Big Bang event where they had originally been assigned to me as a beta to sort of critique and give feedback on my story as I was working,” one author said to us as they were describing how they first met their beta reader, “so I’d found that connection with them as someone who was really good at giving feedback and someone whose feedback I trusted.” -p1

Writing together and building community

Writing has long been regarded as an individual activity, but events provide fanfiction authors opportunities to write with others, and to build bonds with people in the community. Many events provide their own group chat for discussion such as a discord channel, where people carry out informal writing activities and chat about their in-progress stories.

Activities such as writing sprints may help authors with writer’s block. Authors encourage each other while writing:

“I was always on there (a discord channel for National Novel Writing Month) doing writing sprints with other writers, encouraging other writers being like hey even if you can’t write today, or like even if you didn’t hit your word count today, you still wrote something, which is better than zero…Because we’re also in the middle of a sprint together, we’d celebrate being like oh my god, it was ten minutes of hell but it was worth it.” -p20

Authors also discuss the writing process in event group chats. Compared to beta reading, this form of feedback exchange is less formal and structured, but equally beneficial: 

“We would just throw ideas at each other, oh this character did that? It would be really cool if this person reacted in that way. Or I think that’s kind of out of character, maybe this should happen instead, kind of thing. Because it was an online group, just a group chat, it was just like immediate responses.” -p20

Many authors told us that they appreciate this kind of dynamic feedback exchange during an event. By actively reading about others’ progress and giving each other timely suggestions, authors feel a sense of community and close bond with people in the chat: 

“It’s building something with my friends…I know these people better than some of the people who leave a wow I like that sort of comment on a story because there’s just more information shared, there’s more communication going on. It’s not just a simple exchange of a compliment and thank you. It’s not shallow.” -p12

Relationship beyond fanfiction

Many relationships between authors in the community stem from events. Giving feedback on fanfiction works is definitely an important part of these relationships, but they often develop into more personal connections and friendships. Working on fanfiction together during an event makes the bond between people stronger: 

“You know something about them, you’ve written something for them, you end up spending a little bit of time… I followed some people that I did exchanges with for the Sherlock fandom. I follow them on Twitter and Tumblr still. I mean I didn’t rock to them all the time, but I know who they are and they know who I am, it’s a mutual sort of thing.” -p3 

Working with others tends to increase mutual trust, and thus people become more open to discussing real life in addition to fandom topics. One author told us about people he had worked with during a roleplay event. “I spend a lot of time just talking about our days, venting our frustrations about whatever has happened in our lives or work.” – p12 

Another author shared their story of meeting someone who later became a good friend through participating in a bang event: “They paired me with an artist that then I had conversations with over email. I got to know her better because we had a back and forth about making the art… One of the things we talked about, if I was ever in her area I would let her know so we could meet in person.” They then added, “And I don’t know if we followed each other on tumblr before but we definitely followed each other after that. I read her posts sort of more than I read about other people’s on tumblr. We will sometimes comment on each other’s posts, we have more of a connection than usual on tumblr.” -p18

Our findings suggest that participating in events can be beneficial to fanfiction authors, both with their writing and in many ways beyond writing. Authors may wish to consider participating in events to practice writing and become more connected to others.

We are actively posting blog posts about other findings from our interview study. Check out our blog later for more findings!


Three Reasons It’s Difficult to Connect in the Fanfiction Community – Jenna Frens

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on August 26, 2019 at


Online connection changes lives. Authors we interviewed found support, feedback, friendship and even lifelong partnership from people they met online through fandom. People online can help authors develop their writing by providing feedback both in public spaces like AO3 and in private chat. But getting connected is not always easy. Several authors encountered difficulties with making new connections, turned away from their communities, or never connected much with others online at all. Feelings of social anxiety stopped people from reaching out, communities that didn’t feel like safe spaces turned people away, and a culture of fear towards internet strangers made others difficult to trust. In this post, we’ll describe the barriers that can make it difficult to connect in the fanfiction community. 

Social Anxiety

Talking to people is hard, and talking to people you can’t see on the other side of the computer screen can be even harder. An experience shared by many of our interviewees was a feeling of social anxiety. Social anxiety is a fear of judgment from others that manifests in many ways, for instance, as the perception (without any particular evidence) that the person you are considering reaching out to is not interested in or welcoming towards being contacted.

“I’m a shy person, so usually I may not [reach out]. I just say I feel embarrassed in speaking to people that seem cooler than me, more experienced… maybe I’m bothering them. Maybe they don’t see me as a friend, they have just been polite, this kind of thing” (P11).

Contacting someone with a higher profile, such as a prolific author or fan celebrity, amplified social anxiety. Interviewees held the perception that these folks are likely overwhelmed with contact already and therefore would be unwelcoming.

“I wouldn’t want to reach out to them because they’re on a different level than me in terms of popularity and probably get hundreds of messages all the time” (P18).

Authors who were reluctant to reach out oftentimes waited for others to contact them first. Or, instead of reaching out online, they relied on irl friends outside of the fandom for writing feedback. As a result of social anxiety, people don’t take that first step of reaching out to an online stranger, and therefore they don’t receive the benefits of a potential connection.

Unsafe Spaces

Writing and posting fanfiction in a public space is, in a way, baring your soul to complete strangers, and one sure way to stifle the soul-baring process was the institution of discriminatory rules that disproportionately affected a subset of the fan community. Restrictions on free expression created the feeling of an unsafe space, and it is this feeling that drove authors away from the community. During our interviews, authors discussed situations where they left communities because they felt the space was unsafe. The ban of NSFW content from Tumblr after its acquisition by Verizon, widely viewed as an attack on sexual expression that disproportionately affected queer people, was a recent example.

“When Tumblr banned not safe for work, it was really distressing for a bunch of us who don’t really fit on the very heteronormative sexual scale. So there was a lot of trying to figure out where we were going to go now, where we were, how would we stay connected, how would we continue to figure out and find stuff that we enjoyed” (P25).

Long-time fandom authors told us of similar exoduses from LiveJournal and other sites after similar content bans and mass content deletions. These actions by platform owners divided fan communities and forced authors to find connections elsewhere.

In addition to institutional actions, individuals who made personal attacks or used hate speech also made authors feel unsafe. These antisocial actions happened in prominent fandom spaces.

“P27: There are people who write things that a fandom may consider controversial. This could cause them to get unhelpful criticism, rude and discouraging comments, so they will be constantly discouraged from writing.

Interviewer: That would happen on AO3?

P27: Both AO3 and tumblr, that’s where I know it happens.

Interviewer: And you’ve seen other people treated like that?

P27: Yes.” (P27).

One author spoke of an experience where she was berated for years by a reader because she wrote about drug addiction.

“Because [my fic is] about drug addiction, that brought a reaction that I really didn’t expect… sometimes [readers] impose their views. I got bullied for a couple of years, and even when I went on hiatus that person came back… they were imposing what they thought on me” (P7).

A single bully discouraged and pushed away this writer, even as they received an outpouring of messages from readers who connected with the story.

Another author discussed how controversy over a gay character sent them elsewhere:

“At the time I was writing about neon genesis evangelion. And it turns out at the end of the show, the main character Shinji Ikari is gay. Well, it’s revealed he has a thing for this guy. And I was writing this kind of thing too, and people got super angry. So they’d leave because, oh my God, he’s gay and that was terrible at the time. So, uh, yeah, I quickly left” (P8).

Fanfiction authors tackle difficult, important and controversial topics, and they need a space where they can find others to relate to without fear of harassment. If personal attacks, hate speech, or discriminatory rules are present, they may feel they can’t stay connected and will be forced to go elsewhere.

Personal Disclosure

Navigating personal disclosure can be difficult for authors when they’re interacting with internet strangers. Although authors disclosed deeply personal facts about themselves through fanfiction writing itself, some preferred to avoid connecting their fanfiction identity to their offline identity. Others wanted to protect themselves and their families from potential exposure on the internet. Identity and safety concerns associated with online personal disclosure slowed relationship building between authors and online friends.

Is this faceless individual actually a murderer, merely feigning deep interest in Star Trek and waiting for an opportunity to strike at unsuspecting fans?

Being raised to fear internet strangers was a shared experience among several participants in our interviews. Authors chose different degrees of disclosure they were comfortable with, and made nuanced decisions about who to reveal information about themselves to, where and when.

“There’s not any set guidelines. I think it really depends on who you talk to… how long have you been with the person? What type of things do you talk about? Do you feel like it’s safe to give that information?” (P21).

Oftentimes, the decision of whether to reveal a piece of identifying information had to be made on the spot. P21 had to decide whether a person who wanted to be a co-author was someone who could be trusted.

“… You kind of have to sometimes make a snap judgment and ultimately it worked out fine in this one case… it really does have to come down to instinct, gut, sometimes, there’s no kind of set formula to be sure” (P21).

Personal disclosure and relationship-building go hand-in-hand. But in an environment where personal disclosure requires caution and nuance, building connection and relationships becomes much more difficult.

To Be Continued…

There are powerful isolating elements like social anxiety, unsafe spaces and stranger danger keeping fanfiction authors apart. How do authors overcome these barriers to make connections, build relationships, exchange feedback and change lives? Stay tuned for the next part of our series on connection and feedback in fanfiction communities.

About This Series

This series is a breakdown of findings from an interview study run by a fanfiction research group within the department of Human Centered Design & Engineering at the University of Washington. In January and February 2019, we interviewed 29 fanfiction authors to understand how they connect with each other, build relationships, and seek out writing feedback. We learned profound lessons about the importance of building connections, the reciprocality of relationships and feedback, and the intersection of fandom with real life identity.


All I Want for Christmas is Comments, and Specific Comments Are the Best – Regina Cheng

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on August 17, 2019 at


In January and February 2019, our research group did an interview study on how fanfiction authors seek feedback for their fiction. We interviewed 29 fanfiction authors and learned about their insights on feedback and relationship with feedback providers. In this blog post, we are going to talk about our findings on the impact of public comments, and the particular positive outcomes of comments that contain specific thoughts and insights.

Comments Are Generally Valued

Fanfiction authors appreciated public comments on their works. “All I want for Christmas is comments, if you liked it please let me know,” (P2) one of the authors we interviewed once wrote in her author’s note. All kinds of comments are welcomed as long as they are conveyed in a friendly and respectful tone.

Concise positive comments, such as “wow”, “amazing”, “awwww this is so cute”, though simple and maybe not that informative, are still valuable to authors. Those comments “tell you that you’re hitting the right emotional chords, that you’ve been on the right track” (P21). “That’s really helpful.” One author said:

“If there’s a session where you’re getting absolutely none of that, that might prompt you. Like okay, I meant that to be really eliciting a certain emotional response and it wasn’t getting it, so that might also be a sign to go in and kind of work on that” (P21).

Comments Help New Authors Enter the Community

Comments were especially valuable to new writers in the fandom. They made authors feel welcomed in the community, and helped authors learn about community norms and writing styles. One author told us that when they posted their first work in a new fandom, comments helped them get connected in the new community:

“You enter into new fandoms, you’re writing for a new audience and you don’t really know anyone… I don’t really know the rules of this particular fandom and I don’t really know what people are going to think of my stories is going to fit. And that initial bit of support and positive feedback to get that on early works, and to feel, okay, I’ve just sort of arrived in this in this fandom, and in this community, but people are making me feel welcome, and making me feel like what I’m writing is valued and appreciated by people” (P1).

Specific Comments Are Particularly Appreciated

While in general all kinds of comments were welcomed, almost all of the authors we talked to expressed particular appreciation for comments in which the reader expressed opinions and thoughts about particular aspects of the story. One author talked about how she formed a personal practice to write substantive comments when reading others’ fictions:

“I try to copy certain lines while I’m reading and try to leave a substantial comment… and say I really liked your story because of ZYX…I do both because I’ve been writing fanfiction for a long time and I know it’s fun to have substantial comments. I do it because that’s what I like and I know it can make someone’s day” (P29).

Authors recognized the effort that readers put into substantial comments, so they regarded receiving those long specific comments as an honor:

“You don’t write a big long comment like that, if you’re not affected by something. Because it’s hard enough to get readers to click the Kudos button and just give me a little heart, let alone write a comment, let alone write a long detailed comment whenever that happens” (P14).

Specific Comments Recognize Authors’ Effort

Many authors mentioned that receiving long and specific comments made them feel that their effort had been recognized. When they were proud of a part of their story in particular, they liked to hear about whether the emotion and thoughts that they tried to convey impacted readers in the way they expected.

“It just feels great when I spent so much time working on something and working on a particular detail, I absolutely love hearing someone’s reaction to it, like, what specifically they liked about it… It’s great getting those compliments but I want to know about their experience living in the story that I’ve created” (P2).

“The most interesting and in depth sort of feedback, people really seem to connect with the characters and the characterization of the story and the writing of the story, which I really, really like. It’s very satisfying when you put a lot of effort into something and they actually noticed and they’re like, and they comment on it like, oh my God, the way you wrote this, the flow of it, the thing… that’s my own kink, hearing people say that they understood what I was writing and that they understood what I was going for” (P14).

Specific Comments Teach Authors about Writing

Being able to hear what the audience thought about their fiction was not only a joyful experience to fanfiction authors, but also a valuable learning opportunity. Authors learned from specific comments about whether their writing style and the direction of the story worked for their readers. Some regarded specific comments as feedback for parts of their writing that they were not sure about. One author told us they endured writer’s block when writing a certain character in their story. When they received positive comments on that character, 

“they were commenting on my characterization for the character. And I was like ‘oh, thank you god,’ because I struggle with this particular character a lot” (P20). 

The author was able to validate their writing from comments that specifically pointed to a characterization.

Specific Comments Connect Readers and Authors

Another important benefit of specific comments was that they fueled connection with readers. Specific comments elicited discussion between authors and readers. As one author said in the interview:

“I like having comments that are thoughtful and trying to analyze what I wrote, and are picking up what I put down basically… I usually respond to the comment and say thank you, and if they left analysis I talk back and forth” (P29).

These back-and-forth discussions led to further connections outside the specific story. “I’ve made friends with a lot of people who started out just commenting on my fics a lot. You end up commenting back, and start talking” (P13). In some cases, these connections developed into later beta reading relationships and friendships. “Most of the people that I sent work to… they’ve made comments that are the right sort of comments I suppose…” one author said while telling us the story of meeting her beta readers,

“and so I sent work to them after that sort of built up a bit of a relationship by then so that I know what sort of person they are and what sort of comments they might make… I know that if I sent it to them, they will be looking for the sorts of things that I’m looking for them to look for. The comment that they’ve made tells me that they’re reading it the way I want a beta reader to read it”  (P26).


Our findings about public comments suggest that receiving comments is highly valuable for fanfiction authors, especially comments that provide thoughts and insights on specific aspects of the story and writing. We suggest that encouraging the exchange of specific public comments will be beneficial to fanfiction communities.

We are actively posting blog posts about other findings from our interview study. Check out our blog later for more findings! 


Exploration of Fan Fiction Community – Kush Tekriwal

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on December 10, 2018 at


At the University of Washington, I am working in a research group that uses data science techniques to explore informal learning in the fan fiction community. To conduct my analysis, I used data collected by past participants in this research group. Specifically, I explored the Doctor Who fan fiction, as it is an accurate sample and is relatively easy to manage.


In this blog, I explore how the popularity of a story and the timing of new chapters affect the running total of reviews received. In the plots, the x-axis represents the review date while the y-axis depicts the cumulative sum of reviews. Every point on the plot is a review received. The red circles indicate the release of a new chapter.

The Doctor Who dataset I explored contains 53,621 stories. I created three categories to distinguish the popularity of a story: the top 5 percent, the next 20 percent, and the bottom 75 percent, a split chosen because of the long tail in review counts. From the top 5 percent, I sampled the following three stories: Dear Whovian Authors (5,432 reviews), Weathering the Storm (792 reviews), and The Time That We Love Best (549 reviews).


From the next 20 percent, I sampled the following three stories: Archetype (250 reviews), Misadventures with the Doctor (109 reviews), and Five Times the Sonic Screwdriver was Useless (45 reviews).


From the bottom 75 percent, I sampled the following three stories: Being Human (29 reviews), Centenary (10 reviews), and Make The Day Go Faster, Please? (5 reviews).
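For readers who want to try a similar split on their own data, here is a minimal Python sketch of one way to assign the three popularity tiers. The function name `popularity_tier` and the rank-based cutoffs are assumptions made for illustration; the post does not say exactly how ties or boundaries were handled.

```python
def popularity_tier(review_counts):
    """Label each story 'top 5%', 'next 20%', or 'bottom 75%' by review-count rank.

    Hypothetical helper: the cutoffs are applied to rank positions, which is
    only one of several reasonable ways to define the tiers.
    """
    n = len(review_counts)
    # Story indices ordered from most reviewed to least reviewed.
    order = sorted(range(n), key=lambda i: review_counts[i], reverse=True)
    tiers = [None] * n
    for rank, idx in enumerate(order):
        # Integer comparisons avoid floating-point surprises at the cutoffs.
        if rank * 100 < 5 * n:
            tiers[idx] = "top 5%"
        elif rank * 100 < 25 * n:
            tiers[idx] = "next 20%"
        else:
            tiers[idx] = "bottom 75%"
    return tiers


# Synthetic example: 20 stories with 1..20 reviews each.
example_tiers = popularity_tier(list(range(1, 21)))
```

With 20 synthetic stories, this labels one story as top 5%, four as next 20%, and the remaining fifteen as bottom 75%.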



• The number of reviews increases when a chapter releases, as the new content attracts readers. 

• The number of reviews stabilizes after one year from publishing the story. 

• Popular stories entice readers soon after the story is published. This may be because 1) readers receive author alerts or story alerts, 2) authors provided a preview to excite readers, or 3) the story is easily accessible.

Furthermore, I sought to determine the relationship between the number of reviews and various metrics in the dataset by computing the correlation coefficients. A correlation coefficient is a statistical measure of the strength of a relationship between two variables. A correlation of 0.0 shows no relationship between the variables, while 1.0 shows a perfect positive correlation. As depicted in the figure below, favorites have the highest correlation with the number of reviews. This may be because readers who favorite a story closely track its updates, which leads to more reviews. A higher number of chapters, however, does not guarantee more reviews. For example, Archetype has 250 reviews but only 6 chapters.
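To make the idea of a correlation coefficient concrete, here is a small Python sketch of the Pearson coefficient. The post does not say which coefficient was computed, so Pearson is an assumption here, and the function name `pearson` is just illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Illustrative only: assumes neither sequence is constant, since a constant
    sequence has zero variance and would cause a division by zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

A value near 1.0 means the two metrics rise together, a value near 0.0 means no linear relationship, and a value near -1.0 means one falls as the other rises.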


Limitations and Future Work

For this analysis, I used the first review of the new chapter as the chapter’s published date. In the future, I want to investigate how the structure of subcommunities affects reviewing in a fanfiction network. Additionally, I want to provide authors a guideline to increase followers and favorites and, consequently, increase the number of reviews.


How Do Fanfiction Authors Find Favorites? – Ajinkya Sheth & Apurva Saksena

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on July 6, 2019 at



The fanfiction community is huge and growing. It’s an intricate network of authors, reviewers, and readers contributing to the creation of some form of contemporary culture. 

At the University of Washington, we are a group of researchers studying the fanfiction community and exploring the informal learning taking place there. 

We were particularly interested in authors who are user-favorites on Fanfiction.Net. When a user favorites an author, there are certain characteristics of the author that the user finds intriguing. It could be that the story is very interesting or the style of writing of the author fascinates the reader. We aim to find which authors have been favorited the most and what factors correlate with a user favoriting an author. 

This blog post explores the connection between users (in a particular fandom) and the authors that they have favorited on Fanfiction.Net. We use a metric to measure this relationship and try to find out how it correlates with other factors such as:

  1. Number of stories and chapters published by the authors 
  2. Number of reviews received for the published stories
  3. Total number of words written by the author
  4. Number of favorites received

In our analysis below, we have used the PageRank algorithm on authors in the “Game of Thrones” fandom on Fanfiction.Net. Each author has at least one favorite author, and we have exploited this detail for our analysis.


Both of us are huge Game of Thrones fans. The exciting season 8 finale and the massive popularity of GoT on social media motivated us to explore this fandom. Our current goal is to analyze authors that have been favorited by users, and which features might have earned them favorites in the GoT fandom on Fanfiction.Net. This analysis can pave the way for building a recommendation engine for users on Fanfiction.Net.


Our dataset was scraped from Fanfiction.Net. For the analysis, we used two primary tables: Story and Author_favorites. The ‘Story’ table contains data about the stories, including but not limited to a unique story identifier, user id, fandom id, number of reviews, number of followers, and so on. The ‘Author_favorites’ table contains data about the users and their favorited authors. Because these tables are very large, we limited our scope to the Game of Thrones fandom, retrieving only the data for stories written in that fandom.

The dataset we used was formed by combining the User Favorite table, Fandom table, and Stories table. This gave us a table consisting of User IDs and their Favorited Author IDs, both belonging to the Game of Thrones fandom on Fanfiction.Net.

Method and Process

PageRank is a billion-dollar algorithm which made Google what it is. Whilst the most popular application of PageRank is web search, it can be exploited in other areas as well. The web is a gigantic graph interconnected by the web links. And PageRank assigns a score of importance by calculating the ‘inlinks’ to a website. In our case, we have considered the dataset of users and their favorited authors as a form of a graph: Many users favorite authors and these users could be authors themselves who have been favorited by other users. Hence, every author will have none, one or more users who favorite them. And thus we can assign a score of ‘connectedness’ to the authors by using PageRank.
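To give a feel for how PageRank works on a favorites graph, here is a minimal Python sketch using power iteration. It is not the code used for the analysis: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy input are all assumptions chosen for illustration.

```python
def pagerank(favorites, damping=0.85, iterations=50):
    """Score users by who favorites them.

    `favorites` maps each user to the list of authors they have favorited.
    Hypothetical sketch: damping and iteration count are conventional
    defaults, not values taken from the post.
    """
    nodes = set(favorites)
    for authors in favorites.values():
        nodes.update(authors)
    nodes = sorted(nodes)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Users who favorite nobody spread their score evenly over everyone.
        dangling = sum(scores[u] for u in nodes if not favorites.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in nodes}
        # Every other user splits their score equally among their favorites.
        for user, authors in favorites.items():
            if authors:
                share = damping * scores[user] / len(authors)
                for author in authors:
                    new_scores[author] += share
        scores = new_scores
    return scores


# Toy graph: users "a" and "b" both favorite author "c".
toy_scores = pagerank({"a": ["c"], "b": ["c"], "c": []})
```

In the toy graph, "c" ends up with the highest score because both other users point to it, which is the same intuition as counting ‘inlinks’ to a website.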

A visual representation of the graph is shown below. The blue dot at the center represents a user and the yellow dots represent the favorited users as well as the favorited users who have favorited other favorited users! When there are no out-links, the graph stops traversing.


Fig 1. Network of a user (of highest PageRank) and his/her favorite authors. Blue dot represents the user with highest PageRank and yellow dots are favorited authors

This graph shows the ‘connectedness’ amongst fanfiction authors. Now we attempt to determine which characteristics (features) have a good correlation with the PageRank score that we obtained. In simple words, we try to find out how closely related the PageRank score is with characteristics such as ‘number of reviews’, ‘total words written by the author’ and so on. How can this be done?

A simple way to do this is through Linear Regression. In linear regression, we plot the features against a single response and try to explain the relationship through a straight line. We are conducting our regression analysis by using metrics which depict an author’s output (quantity):

  1. Total words written by authors
  2. Number of stories and chapters published by the author

And those depicting the recognition received (quality) by the author in the form of:

  1. Number of reviews received
  2. Number of times the author’s stories have been favorited

The intuition behind our analysis is to discern if there is any correlation between the PageRank scores, which are obtained through network analysis, and the above-mentioned metrics.
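As a rough illustration of fitting a straight line to one feature and one response, here is a short Python sketch of ordinary least squares for a single predictor. The function name `fit_line` and the toy numbers are hypothetical, and the actual analysis may well have used a statistics library instead.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept.

    Illustrative only: assumes xs is not constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

A positive slope means the response tends to rise as the feature rises, which is what we look for when plotting PageRank against each metric.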

Findings and Results

PageRank Distribution

The histogram below shows the distribution of PageRank scores. As expected, the histogram follows a power law, which means a small number of authors have high PageRank scores while the majority are concentrated near the minimum scores.


Due to the nature of our distribution, we decided to strip off all the authors having a score above 0.5, as clearly they are outliers and may not represent how the majority of the community behaves. In fact, there is a possibility of authors with high page rank scores skewing our results.

Regression Analysis

The graph below shows a plot of PageRank score against total words, and the line is the fitted regression line. A positive slope indicates a positive correlation. Please note that even a slight increase in the PageRank makes a big difference. The plot suggests that as authors increase their output, their score improves.


The second metric we used to measure an author’s output is the Story-Chapter product, which is the number of stories multiplied by the number of chapters. The reasoning behind multiplying both is that authors adopt different styles for structuring their content. One author may have a story with multiple chapters, while another may write multiple stories with one chapter each. The plot below depicts a positive correlation yet again.


Running regression over other variables yields the following output:


Our initial assumptions were correct and the features we selected are all indicators of a good PageRank score. However, which one of these is the best predictor?

Enter the p-value. The p-value helps determine the level of significance of our results. In statistics, a p-value < 0.05 typically indicates that the trend is statistically significant. The p-value only helps us infer significance; here, it means that all the variables we included in our study are significant predictors of the PageRank score. What the p-value does not tell us is how important these variables are. To know which feature is the better predictor, we use another metric called R-squared. R-squared measures how closely the expected output matches the actual output, and it is usually expressed as a percentage.
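For the curious, R-squared can be computed directly from the actual values and the values predicted by the fitted line. The Python sketch below uses the standard definition (one minus the ratio of residual to total sum of squares); the function name `r_squared` is an illustrative choice, and this is not the exact code behind the results in this post.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions explain.

    Illustrative only: assumes `actual` is not constant.
    """
    mean_actual = sum(actual) / len(actual)
    # Variation left unexplained by the predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation around the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A value of 1.0 (100%) means the predictions match the actual values exactly, while 0.0 means the line does no better than always predicting the mean.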

The p-values obtained for the above features are as follows:


Based on our analysis, it’s safe to conclude the number of reviews received by the author indicates a higher probability of that author being favorited often. 


In our analysis, we used four features. Two of them, Total Words and the Story-Chapter product, indicate the output (quantity) of an author, while the other two, the number of reviews received and the number of times the author’s works have been favorited, indicate the quality of an author’s work. These features were plotted against the PageRank score, which indicates the degree of an author’s presence in the community. Through data science and statistical analysis, we were able to discern that the quality of works and feedback received by an author is a better indicator than the output.

Future Work

Our analysis can help pave the way for a recommendation engine for new users. This recommendation engine would leverage the PageRank algorithm to recommend authors to a user which he/she would most likely favorite. Just like Google and Amazon recommend products to users, our recommendation engine would suggest authors for users depending on the fandom they like. To build a recommendation engine as effective as Google or Amazon would require tons of optimization and fine-tuning, hence we have kept this as future work for this project.

As fanfiction enthusiasts ourselves, we want to connect with the community so that they can help us in our analysis. Input from the community is always encouraged, as this would help us make a better recommendation engine. So please comment with your views on the following questions:

  • What would you like to see recommended? We aim to recommend the Authors, but are open to suggestions!
  • What parameters do you think would affect the action of a user favoriting an author? Do you think it’s just the story or could it be the number of reviews, genre or style of writing? Comment below! Our analysis points to the number of reviews, but it will be interesting to see whether that aligns with what the community thinks.
  • Which other fandoms do you want us to explore?


How to Get More Reviews: A Data Analysis – Arthur Liu

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on December 13, 2020 at

A time-shifted serial correlation analysis of reviewing and being reviewed.

Acknowledgements: Investigation by Arthur Liu with thanks to Dr. Cecilia Aragon and Jenna Frens for feedback and editing and also to team lead Niamh Froelich.

Is it true that giving someone a review will make that person more likely to write reviews as well? Conversely, is it true instead that writing more reviews yourself will help you get more reviews from others?

In this post, we explore one avenue of reciprocity by analyzing the time series of reviews given vs. reviews received. 

Of course, you have to be careful with this technique. The inspiration for our analysis comes partly from Tyler Vigen’s Spurious Correlations site, where he shows interesting correlations between clearly unrelated events. With a humorous perspective, he reminds us that correlation is not evidence of causation (since sociology doctorates and rocket launches are totally coincidental), but the approach is still an interesting way to investigate potential relationships between two different time series.


Back to our topic of reciprocity, we wanted to investigate the relationship between reviews given and reviews received. We had two hypotheses that we were interested in testing: first, we were curious if users who received more reviews would be more inclined to give reviews themselves. Second, we were curious if giving reviews would help increase the number of reviews you personally received.

To get into specifics, here is an example plot of a real user’s review activity.


Let’s break it down. This plot follows the activity of a single user over the course of several years. It plots the total amount of reviews that they gave (in red) and also the total number of reviews that they had received on their fan fictions (in blue). What this chart shows us is that this is a user who has had a very consistent amount of activity in terms of giving out reviews. It also captures spikes in the number of reviews received (blue) which may correspond to having released a new chapter.

If there were a strong link between reviews given and reviews received in either direction, we would expect to see that increases in one are followed by increases in the other. Here is an example where we witness such a relationship:


Since it is harder to analyze the change in activity level from these cumulative plots, we then looked at the total number of reviews given each month. Here’s what that looks like for the same person:


This time, it is more apparent that the reviews given and reviews received follow a similar activity pattern. For this example, both series show the same kind of spikes.

Following Vigen’s website, we could naively apply a correlation calculation here, but there is a glaring flaw: one of the time series is clearly ahead of the other. So, what if we just shifted one of the time series so they overlapped and then computed the correlation? This is the basic intuition of serial correlation: we apply a range of possible shifts and then compute the correlation between these shifted graphs. The shift with the highest correlation is the best match.
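Here is a minimal Python sketch of that shift-and-correlate idea. It is a simplified stand-in for our actual pipeline: the function names `pearson` and `best_lag` are hypothetical, the two series are assumed to have the same length, and a positive lag here means that received reviews trail given reviews, which may not match the sign convention used in the plots.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, or None when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def best_lag(given, received, max_lag):
    """Try every shift in [-max_lag, max_lag]; return (lag, correlation).

    Assumed convention: at a positive lag, given[t] is paired with
    received[t + lag].
    """
    n = len(given)
    best = None
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = given[:n - lag], received[lag:]
        else:
            a, b = given[-lag:], received[:n + lag]
        if len(a) < 3:
            continue  # too few overlapping points to say anything
        r = pearson(a, b)
        if r is not None and (best is None or r > best[1]):
            best = (lag, r)
    return best


# Synthetic monthly counts: "received" repeats "given" two months later.
given = [0, 5, 0, 0, 1, 0, 0, 0, 3, 0, 0, 0]
received = [0, 0, 0, 5, 0, 0, 1, 0, 0, 0, 3, 0]
lag, corr = best_lag(given, received, max_lag=4)
```

On this synthetic data, the best shift is the two-month offset that was built into it. Note that larger shifts leave fewer overlapping points, so their correlations are based on less data.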

The results for different shifts:


The best shift of “11 frames”:


In other words, for this person, giving a lot of reviews correlates well with receiving a lot of reviews roughly 11 months later. Of course, this doesn’t prove any sort of causation, but we can speculate that the increased amount of reviews this user gave helped boost the amount of reviews they got later!

From this analysis of an individual person, we were curious whether the same trends existed in the larger community. The short answer is “eh, not really,” but it is interesting to see why this cool pattern might not generalize.

1. Not all individuals get reviews and give reviews at the same scale

Some users just like to give reviews and some users just like to write stories!

For instance, here is someone who gave a lot of reviews and didn’t get many themselves.


Here is someone who gave some reviews, but then focused on writing stories and received a lot more reviews instead!


For graphs like these, it is hard to apply the analysis we did earlier because the relationship is likely a lot weaker or there might just not be enough data points to capture it anyway.

We can summarize these examples for the overall population by looking at the ratio of reviews received to total reviews (received plus given).


For this sample of 10k users, we see that those who primarily receive reviews will have a larger ratio (right), and users who primarily give reviews will have a smaller ratio (left). In more detail, a ratio of 1.0 means that they only received reviews. For example: 10 reviews received / (10 reviews received + 0 reviews given) = 1. For a ratio of 0.0, it means they received no reviews. For each ratio, the graph shows the total count of the 10k users who had that ratio.
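The ratio described above is simple enough to write down in a few lines of Python. The function name `received_ratio` is just for illustration, and returning `None` for users with no reviews at all is an assumption about how to handle that edge case.

```python
def received_ratio(received, given):
    """Reviews received divided by all reviews (received plus given).

    Returns None for users with no review activity (an assumed convention).
    """
    total = received + given
    if total == 0:
        return None
    return received / total


only_received = received_ratio(10, 0)  # the worked example from the text
balanced = received_ratio(3, 3)        # equal amounts given and received
```

Users who give and receive the same number of reviews land exactly at 0.5, which matters when reading the histogram.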

To address issue (1), we reduced the scope down to users who had a relatively equal ratio of reviews given vs. reviews received.

Additionally, we kept only users who had received at least 10 reviews. This way, we would have enough data points to use for our analysis. This also addresses the large spike at the 0.5 ratio, which consisted largely of users who had written one or two reviews and received an equal number.

With this cleaned up, we also computed the lags on a finer scale–weeks–instead of months since we noticed that months were not granular enough. We computed the most common lags, and here is a plot of the results. This lag is the shift applied to received reviews, and the correlation is how well the two series correlated with each other after the shift. A correlation of 1 means that as one increased, the other increased as well, a correlation of -1 means that as one decreased, the other increased, and smaller values such as 0.8 mean that the correlation was positive, but less strong.


So the result here is a little messier and less structured than we had hoped from our hypothesis, but that’s part of the research process!

To elaborate, in the X dimension, the lag, there isn’t a particular range that was significantly denser than the rest. In fact, if we looked at the histogram, we see something like this:


So we lied a little. It looks like that last lag of +20 weeks is really popular, but this is actually an artifact caused by the serial correlation process. If you recall this graph:


The red line is the chosen lag at the peak. In this case, the correlation actually peaked within the range of shifts we tried, but if we had truncated the graph at 5, the process would simply have picked the largest shift available.

Not convinced? Here’s the same analysis, but now we calculated up to a lag of 40.


Looks like the 20 bucket wasn’t particularly special after all.

So ignoring this last bucket (and the first bucket for a similar reason), we notice that our histogram matches this noisiness that we observed for the lags.


What does this mean? It suggests that there is no general pattern that can succinctly summarize the larger population, and that we are unable to conclude that there is a common average positive or negative lag relationship between the number of reviews someone has given and the number of reviews that they have received. Some authors sent more reviews after receiving more reviews (positive lags), some authors received more reviews after giving reviews (negative lags), and some authors did not exhibit much of a relationship either way (the first and last buckets which didn’t find a reasonable shift). Although these relationships do exist, the timing was not consistent overall so we can’t say anything about authors in general.


2. Looking across users, we do not see consistent behavior in a time-shifted relationship between a person’s received and given review count

Even when we look at the lags with the highest correlation (r > 0.7), we see that this even distribution of lags still holds.


In summary, this isn’t a dead end! (With research, it rarely is!) But it helps paint a better picture of the users in the community and why this approach may not be well suited to capturing their behavior. We see that the relationship between reviews received and given doesn’t necessarily follow a consistent time shift, and that in fact this shift can go in either direction. Try taking a look at your own reviewing trends, and see where you would be located within these graphs! Are you someone who has a positive shift or a negative time shift… or no strong correlation at all?

In the meantime, we’re still exploring some other interesting approaches to reciprocity! Stay tuned :)


Do longer chapters get more reviews? – John Fowler

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on December 6, 2020 at


One of the questions we occasionally get from authors is: “What kinds of submissions get the most reviews?” We think this is a really interesting question and we’ve started doing some exploratory analyses related to the quantity of reviews that authors receive based on a variety of factors. One of the factors that we decided to check out was the number of words in a chapter. We were curious: Would shorter chapters get more reviews because they might take less time to read? Or longer chapters because there is more for reviewers to dig into? Or maybe there’s a sweet spot somewhere in between?


To look into this we took a random subset of 10,000 authors from with chapter publications over a 20-year period from 1997 to 2017. We then created a scatterplot with each point being one of these 10,000 authors, the x-axis showing the median number of words across their published chapters, and the y-axis showing the median number of reviews received on those chapters. The points are segmented into six groups based on percentile of the total number of reviews received on all chapters they have ever published. We then put trendlines in for each of these segments, so we can more easily observe if there are any relationships between chapter length and reviews received across each of these groups. We also performed this analysis at the chapter level, with similar findings. The results are preliminary and warrant further exploration, but we’ll share what we’ve found so far.
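If you want to try something similar on your own chapters, here is a small Python sketch of the per-author summary behind each scatterplot point. The function name `author_medians` and the tuple layout of the input are assumptions for illustration; our dataset is structured differently.

```python
from collections import defaultdict
from statistics import median


def author_medians(chapters):
    """Map each author to (median words per chapter, median reviews per chapter).

    `chapters` is an iterable of (author_id, word_count, review_count)
    tuples, a hypothetical layout chosen for this sketch.
    """
    words = defaultdict(list)
    reviews = defaultdict(list)
    for author, word_count, review_count in chapters:
        words[author].append(word_count)
        reviews[author].append(review_count)
    return {a: (median(words[a]), median(reviews[a])) for a in words}


# Made-up chapters for two authors.
sample = [("a", 1000, 2), ("a", 3000, 4), ("a", 2000, 9), ("b", 500, 0)]
points = author_medians(sample)
```

Each resulting pair is one point on the scatterplot: the first number goes on the x-axis and the second on the y-axis.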


It turns out that the small number of most highly reviewed authors in the top 1% saw an increase in reviews received up until chapters of almost 5,000 words in length, at which point their chapters began to receive fewer reviews on average.


For those authors whose works are in the top 25% of reviews received (excluding the top 1%), as chapter length increases, the number of reviews received on those chapters does as well. Interestingly, there does not appear to be the same drop off in reviews received for longer stories for these authors as there was for the authors in the top 1% of reviews received.

On the other hand, the remaining authors whose chapters are less highly reviewed saw little change in the number of reviews received as chapter length changed.


These preliminary results point to some interesting potential implications on how an author might be able to get the most reviews. For the most highly reviewed authors, shooting for a chapter of around 5,000 words in length is most likely to result in the highest levels of engagement. However, for the vast majority of authors, writing longer chapters is not likely to have a negative impact on engagement from reviewers, and may even result in more reviews. 

How about you?

What are your experiences with receiving or providing reviews based on chapter length? We’d love to hear whether this is a factor that motivates you or something that you consider when writing or reviewing!


Can it wait? Can THEY wait? – Serene Gao

Posted by Human-Centered Data Science Lab on October 29, 2022

Originally posted on December 28, 2020 at

A study on fanfiction stories’ update frequency and number of reviews received

As a grad student, I often find myself debating whether to finish tasks all at once or to space them out over a reasonable time period. In the fanfiction community, we have seen stories where multiple chapters are posted on the same day, while others are updated every few months or even years. As authors, if our goal is to attract readers and reviews, how long should we wait between chapters? Is it better to satisfy our readers with content all at once, or to keep them hooked by posting a bit at a time?

Our Approach

To answer these questions, we defined the “frequency” of updates in a story as the average number of days between chapters posted, and looked at stories with more than one chapter and at least one review from during the period of 1997 to 2017. In this study, we consider only the first story posted by each author, to avoid miscounting the accumulated reviews on their subsequent stories. Note that the original chapter publish date/time was not available in the dataset, so researchers estimated it from either the story publish time or the time of the first review. As a result, this dataset is more representative of stories with more reviews.
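To make the “frequency” definition concrete, here is a minimal sketch of how the average number of days between chapters could be computed. The function name and the input format (a list of estimated chapter dates for one story) are assumptions for illustration, not the code behind this study.

```python
from datetime import date


def average_days_between_chapters(chapter_dates):
    """Average gap, in days, between consecutive chapters of one story.

    chapter_dates: list of datetime.date objects (estimated publish dates).
    Requires at least two chapters, matching the "more than one chapter" filter.
    """
    if len(chapter_dates) < 2:
        raise ValueError("need at least two chapters to measure a gap")
    ordered = sorted(chapter_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical story with chapters on Jan 1, Jan 4, and Jan 10: gaps of 3 and 6 days.
example = [date(2015, 1, 1), date(2015, 1, 4), date(2015, 1, 10)]
print(average_days_between_chapters(example))  # 4.5
```

Note that a lower value here means more frequent updates, so the measure is really an average interval.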

What We Found

In this graph, each data point is a story mapped to the total number of reviews received (y-axis) and average days between chapters posted (x-axis). The x-axis is then divided into 14 bins, to represent “buckets” of stories where chapters were posted every 1, 2, 3… 14 days on average. While there are quite a few stories with up to hundreds of reviews, the median line plotted for each bin indicates that the review counts are skewed to the right: most stories in a bin receive far fewer reviews than the most-reviewed ones.


My initial guess was that stories with chapters posted 3-4 days apart might receive the most reviews, as readers are likely to revisit the same story for updates every few days. The graph is roughly consistent with this speculation: the first peak in the median line is at five days, with a median of 9 reviews. This means that half of the stories with chapters published five days apart received at least 9 reviews. Other peaks are observed at ten and thirteen days.
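The binning and per-bin median described above can be sketched as follows. This is only an illustration: the post does not say how averages were assigned to the 14 bins, so rounding to the nearest whole day is an assumption, and the function name and sample numbers are made up.

```python
from statistics import median


def median_reviews_by_bin(stories, max_days=14):
    """Group stories into whole-day bins and return the median reviews per bin.

    stories: iterable of (average_days_between_chapters, total_reviews) pairs.
    Assumption: each average is rounded to the nearest whole day (half rounds up),
    and stories falling outside bins 1..max_days are left out.
    """
    bins = {}
    for avg_days, reviews in stories:
        day = int(avg_days + 0.5)  # round half up to the nearest whole day
        if 1 <= day <= max_days:
            bins.setdefault(day, []).append(reviews)
    return {day: median(counts) for day, counts in sorted(bins.items())}


# Made-up sample: one heavily reviewed story barely moves the median of its bin.
sample = [(1.0, 4), (1.2, 6), (4.6, 2), (5.2, 9), (5.4, 300), (20.0, 7)]
print(median_reviews_by_bin(sample))
```

In the made-up five-day bin the review counts are 2, 9, and 300. The median is 9 while the mean is over 100, which is the kind of right skew that makes the median a better summary than the average here.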

Your Thoughts?

How often do YOU update a story? What factors do you consider when planning to post a new chapter? As a reader, would you prefer coming back every few days to read the new chapter and review, or reading them all at once? We look forward to seeing your comments and learning more on this topic!