This summer I had the opportunity to focus solely on my research thanks to a research fellowship through my department. I set out to learn various skills that I felt would qualify me as a digital sociologist. A leading scholar in the field, Deborah Lupton, describes digital sociology in four ways:
- The use of digital technology to practice sociology
- Research conducted on digital technology and its effect on people
- The critical examination of digital technology
- The relationship between humans and digital technology
I started to venture into digital sociology when I reached out to the Maryland Institute for Technology in the Humanities (MITH) to use a database of tweets they collected on Ferguson. I had no idea what I was doing, but the events that unfolded around Mike Brown’s death solidified for me the importance of research on Black lives. Since then, however, I have built a skill set that has expanded my understanding of digital technology and social media to the point that I feel comfortable reflecting on my growth and what this means for the type of sociology I aspire to do.
Using Python to Collect Twitter Data
Ed Summers at MITH created the first Twitter dataset I used with his software Twarc, which relies on the Python programming language to run tasks through the command line (think Terminal on Mac or the Command Prompt on Windows). The dataset included tweets that contained the word Ferguson, collected at four points in time over the course of a year. The dataset ultimately included over 31 million tweets. Documentation for Twarc can be found on GitHub, a website that makes software building accessible to professional coders and novices alike.
Twarc pulls data from Twitter by communicating with Twitter’s Application Programming Interface (API). Using a few lines of code, Twarc allows researchers to pull tweets that contain specific words. However, Twarc, like any program that communicates with Twitter, can only access about 1% of tweets at any given moment. This limit protects Twitter users overall, but it can also distort the interpretations a researcher draws from the tweets they collect. Nevertheless, social media data has become increasingly significant in research in the digital humanities, digital sociology, and social movements scholarship.
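To give a sense of what "a few lines of code" can look like, here is a minimal sketch of keyword collection. It is not the script Summers used. The helper `collect_tweets` and the `FakeClient` class are hypothetical names I made up so the example runs offline; the only real Twarc call assumed is the `search` method of the version 1 `Twarc` class, which yields one tweet at a time. Twitter's API has changed since 2016, so treat the commented-out credential lines as illustrative.

```python
from itertools import islice


def collect_tweets(client, keyword, limit=100):
    """Return up to `limit` tweets matching `keyword`.

    `client` is assumed to be any object with a search(query) generator,
    which is how twarc's Twarc class (version 1) behaves.
    """
    return list(islice(client.search(keyword), limit))


class FakeClient:
    """Hypothetical offline stand-in so the sketch runs without credentials."""

    def __init__(self, tweets):
        self._tweets = tweets

    def search(self, query):
        # Yield only the tweets whose text contains the query, ignoring case.
        needle = query.lower()
        for tweet in self._tweets:
            if needle in tweet["text"].lower():
                yield tweet


if __name__ == "__main__":
    sample = [
        {"text": "Thinking about Ferguson tonight"},
        {"text": "An unrelated tweet"},
    ]
    client = FakeClient(sample)
    # With real credentials, the client would instead be something like:
    #   from twarc import Twarc
    #   client = Twarc(consumer_key, consumer_secret,
    #                  access_token, access_token_secret)
    for tweet in collect_tweets(client, "ferguson", limit=10):
        print(tweet["text"])
```

Capping the number of tweets with `limit` keeps a test run small before committing to a long collection.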
For instance, earlier this year, researchers Deen Freelon, Charlton D. McIlwain, and Meredith D. Clark published a report on #BlackLivesMatter. Like Summers, Freelon provides documentation on GitHub for the tool used to collect the data. Their research spanned the period before Eric Garner’s death in Staten Island, New York in July 2014 until Freddie Gray’s death in Baltimore, Maryland in April 2015. They found that a wide range of people on Twitter discussed incidents of police brutality, some using social media to educate, amplify marginalized voices, and call for police reform. Thus, social media data gives insight into how online activism contributed to the growth of #BlackLivesMatter.
#SayHerName: Black Cyberfeminism and the Movement for Black Lives
My second venture into social media data analysis started with #SayHerName data also collected by MITH. Kimberlé Crenshaw, known for her work on intersectionality, started #SayHerName due to the lack of focus on violence against Black women in the current movement for Black lives. #SayHerName represents a form of ‘Black Cyberfeminism,’ which combines intersectionality with digital tools. Like #BlackLivesMatter, #SayHerName combines social media activism with traditional forms of activism like political lobbying, marches, and protests.
Summers pulled the #SayHerName data from Twitter through Twarc and saved it as JSON. For now, to make this data accessible to those interested in learning a bit about #SayHerName, I used Python to convert the JSON files into the word clouds below. The word cloud operates similarly to content analysis, relying on the frequency of words to provide a sense of the major themes in a text. The larger the word in the word cloud, the more frequently the word appears within that dataset. Above each image, I provide the total number of tweets and the period of data collection.
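The counting step behind a word cloud can be sketched as follows. This is an illustration, not the exact script I ran. It assumes the JSON file holds one tweet per line with the tweet's words stored under `text` (or `full_text`), and the function name `word_frequencies` and the short stopword list are my own placeholders.

```python
import json
import re
from collections import Counter

# A deliberately short, illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "rt", "amp"}


def word_frequencies(json_lines):
    """Count how often each word appears across tweets stored one JSON object per line."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        text = tweet.get("full_text") or tweet.get("text", "")
        # Lowercase the text and drop links so URLs do not crowd out real words.
        text = re.sub(r"https?://\S+", " ", text.lower())
        # Keep hashtags and mentions intact as single tokens.
        words = re.findall(r"[#@]?\w+", text)
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


if __name__ == "__main__":
    sample = [
        '{"text": "Say her name #SayHerName"}',
        '{"text": "RT #SayHerName Sandra Bland http://t.co/x"}',
    ]
    for word, count in word_frequencies(sample).most_common(3):
        print(word, count)
```

A rendering library can then scale each word by its count; the `wordcloud` package, for example, accepts a dictionary of frequencies through `WordCloud().generate_from_frequencies(counts)`.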
1/29/16 – 3/19/16 – 114,493 tweets

Image Credit: Melissa Brown, University of Maryland
4/22/16 – 6/26/16 – 60,925 tweets


7/7/16 – 8/5/16 – 217,924 tweets


Total: 393,342 tweets from 1/29/16 to 8/5/16


A quick glance at the largest words from the final count suggests police violence against Black women remained central to the conversation in #SayHerName during the first half of 2016. Beyond that, this conversation revolved around a number of victims of violence including Black transwomen, but appeared to focus on Sandra Bland in particular, which reflects that Bland is a significant inspiration behind #SayHerName:
In honor of Bland, and to continue to call attention to violence against Black women in the U.S., the African American Policy Forum, the Center for Intersectionality and Social Policy Studies at Columbia Law School, and Andrea Ritchie, Soros Justice Fellow and expert on policing of women and LGBT people of color, have updated a report first issued in May, 2015, “Say Her Name: Resisting Police Brutality Against Black Women.”1
At first glance, #SayHerName remains true to its mission in the digital sphere. However, a good sociological analysis does not stop at this descriptive level. I also used Python to convert the JSON files into a CSV, which turns the dataset into a spreadsheet that familiar programs like Microsoft Excel or Apple Numbers can open. This format will serve as the basis for the data used in content analysis, “A method for identifying the meaning of a text by reducing it to a much smaller summary or representation of its principal themes or ideas.”2 Stay tuned for the results of the content analysis.
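For readers curious about the JSON-to-CSV step, here is a minimal sketch using only Python's standard library. It assumes one tweet per line and picks four columns for illustration; the column choice and the function name `tweets_to_csv` are my own assumptions, not a fixed format.

```python
import csv
import io
import json

# Illustrative column choice; a tweet carries many more fields than these.
FIELDS = ["id_str", "created_at", "screen_name", "text"]


def tweets_to_csv(json_lines, out_file):
    """Write one CSV row per tweet and return the number of rows written."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    rows = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        writer.writerow({
            "id_str": tweet.get("id_str", ""),
            "created_at": tweet.get("created_at", ""),
            # The author's handle is nested inside the tweet's "user" object.
            "screen_name": tweet.get("user", {}).get("screen_name", ""),
            "text": tweet.get("full_text") or tweet.get("text", ""),
        })
        rows += 1
    return rows


if __name__ == "__main__":
    sample = ['{"id_str": "1", "user": {"screen_name": "example"}, "text": "#SayHerName"}']
    buffer = io.StringIO()  # stands in for a file opened with open(path, "w", newline="")
    tweets_to_csv(sample, buffer)
    print(buffer.getvalue())
```

Flattening each tweet to a handful of columns loses detail, but it makes the data easy to sort, filter, and hand-code in a spreadsheet.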
- http://www.aapf.org/sayhernamereport ↩
- Excerpt From: John Scott. “A Dictionary of Sociology.” iBooks. https://itun.es/us/wU6S3.l ↩