This lesson introduces students to the basic methods and tools of text analysis. It uses Voyant, an open-source, web-based platform, to analyze “Goblin Market,” Christina Rossetti’s 1862 poem.
This lesson teaches digital literacy and the Bryn Mawr Digital Competency 5.4, Digital Research and Scholarship, which involves learning how “traditional and emerging processes of knowledge creation” are interacting within current scholarship. Students gain this competency by using two methodologies to form research questions about literary texts, the “traditional” method of close reading and the “emerging” process of computer-assisted text analysis, and comparing what they can learn from both methods. In addition, they gain a disciplinary understanding of how to apply textual analysis in humanities research and hands-on experience using a digital tool as part of their research methodology.
The original version of this lesson was taught in an undergraduate classroom. However, with adaptations, it is appropriate for faculty, graduate students, or anyone interested in text analysis.
This synchronous eighty-minute workshop was taught virtually in an upper-level, semester-long undergraduate English literature course. The course introduces students to research methods and tools they can use when working on their senior thesis projects. In addition, multiple librarians visit the class over the semester to provide instruction on topics such as library databases, citation management, and archival research.
I worked with an English professor, Adela Pinch, who specializes in eighteenth- and nineteenth-century British literature, to adapt this lesson for her class context. I chose the poem “Goblin Market” because it connected to the course and because the subject liaison librarian, Sigrid Anderson, had used it for an earlier session on finding primary and secondary sources.
Students analyze the poem using Voyant to explore its many intersecting themes of feminism, capitalism, sexuality, and trauma and examine its use of language. For instance, students have graphed where the word “like” appears to understand how similes function and whether they occur at moments of trauma or calm. They have also compared the two main characters, Laura and Lizzie, to see how they are represented differently. For example, is one character associated with more active verbs than another, and, thus, is that character portrayed as more commanding or stronger? (See slides for screenshots of these examples.)
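As a rough illustration of what a trend graph like this computes, the sketch below splits a text into equal segments and counts a word in each one. It is a minimal Python approximation, not Voyant's own code; the function names `tokenize` and `term_trend`, the default segment count, and the sample text are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def term_trend(text, term, segments=10):
    """Count how often `term` occurs in each of `segments` equal slices of the text."""
    tokens = tokenize(text)
    counts = [0] * segments
    target = term.lower()
    for position, token in enumerate(tokens):
        if token == target:
            # Map the token's position to a segment index from 0 to segments - 1.
            counts[position * segments // len(tokens)] += 1
    return counts


if __name__ == "__main__":
    # Placeholder text invented for this example, not a quotation from the poem.
    sample = "like a swan she stood and waited and then ran like the wind"
    print(term_trend(sample, "like", segments=2))
```

Plotting the returned counts against segment number gives a simple picture of where the word clusters, which students can then check against a close reading of those passages.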
Miranda Marraccini, Digital Pedagogy Librarian
Adela Pinch, professor of English
Voyant, an open-source, web-based platform with a low instructional barrier
Laptop or desktop computer (Voyant is difficult to navigate on a tablet or phone)
This lesson can be adapted for any text, literary or otherwise. For example, students could use Voyant to analyze their writing by uploading one of their papers. Likewise, they could use the tool to explore academic jargon by uploading journal articles or explore “fake news” by uploading news articles.
The lesson can also be adapted to different audiences. For example, I have taught it as a workshop open to the public and as a one-shot for graduate students enrolled in a graduate certificate program. For graduate students, I emphasized how they can use the tool to develop research questions and how they can use it in the classroom to help their students understand literary analysis.
It can be taught in person with little adaptation. In an in-person environment, students would need access to computers (they can work in groups of two or three), and the instructor would be able to move around the room to help those having trouble navigating the Voyant interface.
Students will understand:
What text analysis or text mining tools can and cannot do;
How to be critical of text analysis tools;
How to find and clean texts for text analysis purposes;
The basics of Optical Character Recognition (OCR);
How to connect text analysis tools with traditional methods of literary scholarship, including archival research and close reading;
How to apply the critical framework of literary close reading to other forms of analysis, including evaluating the reliability of online sources.
Voyant does not require users to make an account or use a powerful computer. However, it is a good idea to confirm that the site is functioning before the session, since it can be down for maintenance. The instructor should also test the text they intend to use in Voyant to make sure it will serve their teaching and demonstration needs.
The version of the poem “Goblin Market” used for this lesson
After learning how to use Voyant, students apply the tool to investigate questions, including:
What does the application reveal in the text that you didn’t notice? What do you see that the application couldn’t?
How would this experience be different if you were looking at text on a different scale, such as a large corpus?
From there, we move on to broader ideas:
How do scholars develop research questions?
How do you test a humanities hypothesis?
In this lesson, students learn how to ask the right questions about digital tools and to be critical of technology while understanding its transformative capabilities for scholarship.
Another question that this lesson raises is how to get usable electronic versions of texts. I found that when I demonstrated different ways of uploading a text to Voyant, for instance, copying and pasting plain text versus inputting a URL, there were variations in the accuracy of the results. (The URL version included the website title and other unrelated text, such as the navigation menu, so the resulting analysis included more than the poem.) This demonstration helped students understand the limits of text analysis tools and the importance of cleaning data.
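The difference between the two upload methods can be sketched in code. The example below is a minimal Python illustration, not a description of how Voyant processes URLs: it uses the standard library's `html.parser` to keep a page's body text while skipping navigation, header, footer, and similar elements. The class name `MainTextExtractor`, the list of skipped tags, and the sample page are assumptions; real pages often need a different list.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collect visible text while ignoring common non-content elements."""

    # Assumed list of elements that usually hold site furniture, not the text itself.
    SKIP_TAGS = {"nav", "header", "footer", "script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_main_text(html):
    """Return the page's text with menus, titles, and footers removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    # Hypothetical page wrapping the poem's opening lines in site navigation.
    page = (
        "<html><head><title>Poetry Site</title></head><body>"
        "<nav>Home | Poems | About</nav>"
        "<p>Morning and evening</p><p>Maids heard the goblins cry</p>"
        "<footer>Contact us</footer></body></html>"
    )
    print(extract_main_text(page))
```

Comparing word counts before and after a cleaning step like this one gives students a concrete measure of how much unrelated text a URL upload can add to an analysis.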
After completing the session, students in the undergraduate English course wrote short papers using Voyant to analyze different texts. For example, one student chose to track the usage of the word “thought” in Virginia Woolf’s novels and noticed how the word ebbed and flowed depending on whether characters were alone or grouped or speaking in present or past tense. Another student compared the complexity of the diction in Anne of Green Gables and Jane Eyre. The students included screenshots to show how they used Voyant and ended their papers with research questions that they wanted to investigate further.
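One simple proxy for the complexity of diction is vocabulary density, the ratio of unique words to total words. The sketch below computes it in Python; it is an assumption about how such a comparison could be made, not a record of the student's method, and the function name `vocabulary_density` and the sample sentence are invented for this example.

```python
import re


def vocabulary_density(text):
    """Return the ratio of unique word forms to total words (0.0 for empty text)."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    # Placeholder sentence, not drawn from either novel.
    print(vocabulary_density("the cat and the dog"))
```

Because this ratio tends to fall as a text gets longer, a comparison between two novels is fairer when it uses samples of equal length.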
There was no formal assessment of the lesson. Sample student papers, as described above, demonstrated that students had learned to use Voyant for literary analysis purposes. However, a more formal assessment could be used to improve the lesson. Other digital scholarship workshops at my institution have administered Qualtrics surveys before and after a session that ask questions about participants’ level of comfort with digital tools. Enhanced assessment of this lesson would provide further opportunities to adjust and adapt it for different audiences. For example, it would show whether participants want an introduction to literary analysis or to spend more time experimenting with Voyant.
I found that this lesson works best when students have already read the text being analyzed. They can better understand the tool’s limitations when they compare the Voyant results with their own insights from reading. In the case of “Goblin Market,” familiarity with the text also helped students get past its strangeness and focus on the analysis in Voyant. “Goblin Market” is a nineteenth-century British poem with many uncommon words and figures of speech. That complexity makes it an interesting subject for literary analysis but a less than ideal choice for an open workshop. For an open workshop, I would choose a more accessible text, perhaps prose instead of poetry, to lessen the chances that a participant would find the content a barrier to learning.
This lesson is challenging if Voyant is not functioning well. It relies heavily on the platform, though screenshots and videos can serve as a semi-replacement. Voyant has a version that can run on a local server, but it requires students to install a specific version of Java on their computers. Because I believe Voyant is the best tool for this kind of lesson, I recommend rescheduling any synchronous sessions if it is down.
Presentation (see slides in Materials) (20 minutes)
What is Voyant?
Why use Voyant?
Datasets, OCR, copyright
Resources for learning text analysis
Hands-on activity with Voyant (40 minutes)
Screen share using Voyant
Input text by copy-pasting
Tour the Voyant interface
Introduce main functions in Voyant
Students experiment with Voyant
They try out at least two new tools
They take notes on what they find
Discussion (40 minutes)
Each student or group reports out
More general discussion based on broad questions in slides