This lesson introduces students to metadata development and reuse, using the DPLA digital collections and the DPLA API to harvest metadata. It includes handouts.
Three-hour workshop with two ten-minute breaks
Discipline-specific teaching faculty
Mid-track undergraduates majoring in fields requiring a high level of visual literacy about the cultural or physical world (anthropology, art, history, biology, geology, pre-medicine, etc.); graduate students; MLIS students learning about metadata development.
This course was designed using a crosswalking method shared by Marcia Lei Zeng (Post-It note paper to tag, label, and simulate the crosswalking experience) and expanded upon in Metadata workshops held by the author.
The course supports the identification of metadata by individuals with local knowledge or subject-specific knowledge.
The primary goal of this lesson is to familiarize students with the methods of metadata development and reuse, and to instill confidence in their ability to contribute to curated knowledge. This is accomplished in four steps:
Introduction to the effort required to create simple and robust metadata.
Use of the DPLA API to harvest metadata.
Practice of empathy in the assessment and use of digital collections, identification of bias and how it may be addressed, and identification of gaps in access.
Enhancement of metadata for use by a specific user group, through the identification of keywords, enhanced description, coverage, or additional fields.
Prior to the workshop, students will need to identify a digital collection on the Digital Public Library of America (DPLA) that includes items related to their area of special interest.
Students should read “Queering the Catalog” before the workshop.
Prepare printed images from the DPLA:
Images for students to discuss: the instructor can select any image shared with the DPLA. Note that these images will be discussed in section 1.0, where cognitive biases will be assessed, so the instructor should select images that could challenge students’ identification of the content.
Maps to illustrate standards development, e.g.:
hand-drawn Florida map (few standards)
engineer-drawn map (some standards)
Shell Oil Company map (highly standardized)
API practice
Note: it is very important for the instructor to gain some experience working with the API prior to teaching the course; the DPLA documentation is excellent. A minimal request sketch follows this preparation list.
Create sample datasets (as a backup).
Prepared bookmarks/open tabs/software:
Excel
Computer with World Wide Web access
Document software (Microsoft, Google, etc.)
Paper, pencils, Post-Its of various colors
Supplementary handouts (see Additional Instructional Materials below)
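The sketch below, in Python, shows one minimal DPLA API request the instructor can practice with before class. It assumes the requests library and a DPLA API key; the placeholder API_KEY value and the sample search term are hypothetical, and the endpoint and field names should be verified against the current DPLA API documentation.

```python
# Minimal sketch of a DPLA API request for instructor practice.
# Assumes a DPLA API key (request one per the DPLA documentation) and the
# `requests` library; check the endpoint and field names before class.
import requests

API_KEY = "YOUR_DPLA_API_KEY"  # placeholder; obtain a key from DPLA

response = requests.get(
    "https://api.dp.la/v2/items",
    params={"q": "orange groves", "page_size": 10, "api_key": API_KEY},
)
response.raise_for_status()
results = response.json()

print("Total matches:", results.get("count"))
for doc in results.get("docs", []):
    # sourceResource holds the descriptive metadata contributed by the provider
    print(doc.get("sourceResource", {}).get("title"))
```

Running this once with a search term from the instructor's own area of interest is usually enough to understand the shape of the JSON that students will work with in section 2.0.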
Introduction to the effort required to create simple and robust metadata
Students —
Organize into small groups; each group selects a single image to discuss.
Instructor —
Introduce the Panofsky-Shatford matrix (generics, specifics, abstracts)—see handout in Additional Instructional Materials below.
Discuss cognitive biases and their effect on the transfer of information—see handout.
Students —
Discuss two or three cognitive biases and the potential effect of these biases on the process of creating metadata for the image in hand.
Discuss in small groups how the image could be “tagged,” what “labels” could apply to the “tags,” what research may be required to identify more information, etc.
Record the labels and the tags on Post-Its.
Students —
Reconvene as a large group, and crosswalk/map all of the groups’ labels/concepts.
Discuss difficulties mapping labels.
Instructor —
Introduce the idea of standardized information (examples: hand-drawn map, 19th-century map, Rand-McNally map, Google map).
Discuss the ideas communicated through standardized information.
Share and discuss the standards funnel — see handout in Additional Instructional Materials below.
How is information created by humans?
How is information discovered and used by humans?
What information can be contributed?
How is information crawled and used by machines?
Students —
Use a pro & con grid (on paper or whiteboard) to list the benefits and drawbacks of standardized or centrally managed metadata — see handout in Additional Instructional Materials below.
Use the DPLA API to harvest metadata
Instructor —
Briefly introduce the idea of exposing metadata for use and reuse.
Students —
Discuss possible ethical and legal issues to consider when using metadata created by others.
Instructor —
Demonstrate harvesting metadata and constructing API calls.
Students —
Construct queries and harvest metadata (see the DPLA API instructions).
Collect the associated images from the JSON results with an image scraper (e.g., the Tab Save extension); a scripted alternative is sketched below.
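The Python sketch below is one way to script this step instead of using a browser extension. It assumes a DPLA API key; the search term, output filenames, and reliance on the "object" field for thumbnail URLs are illustrative and should be checked against the current DPLA API documentation.

```python
# Sketch of a scripted metadata harvest plus image collection, offered as an
# alternative to a browser image-scraper extension. Assumes a DPLA API key;
# the search term and output filenames are hypothetical.
import csv
import json
import requests

API_KEY = "YOUR_DPLA_API_KEY"   # placeholder
QUERY = "sponge fishing"        # hypothetical area of special interest

# Harvest the first few pages of results.
docs = []
for page in range(1, 4):
    resp = requests.get(
        "https://api.dp.la/v2/items",
        params={"q": QUERY, "page_size": 100, "page": page, "api_key": API_KEY},
    )
    resp.raise_for_status()
    docs.extend(resp.json().get("docs", []))

# Save the raw records for the assessment and refinement steps later on.
with open("harvest.json", "w", encoding="utf-8") as f:
    json.dump(docs, f, indent=2)

# Write a simple CSV of titles and thumbnail URLs, then download the thumbnails.
with open("harvest.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["title", "thumbnail_url"])
    for doc in docs:
        title = doc.get("sourceResource", {}).get("title")
        thumb = doc.get("object")  # usually a thumbnail URL, when present
        writer.writerow([title, thumb])
        if thumb:
            image = requests.get(thumb)
            if image.ok:
                with open(doc.get("id", "item") + ".jpg", "wb") as img_file:
                    img_file.write(image.content)
```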
Students —
Use a concept mapping approach to describe the systems (machines, protocols, and people) used to harvest metadata.
Practice empathy in the assessment and use of digital collections, identify bias and how bias may be addressed, and identify gaps in access.
Students —
Explore the Mukurtu CMS.
Discuss communities and attributes of communities (cultural, professional, knowledge domains, subreddits, Twitter cultures, etc.).
Instructor —
Demonstrate various tools and methods of metadata assessment (word cloud guessing, sorting, visualization, etc.); a word-frequency sketch follows below.
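As one possible demonstration, the sketch below produces a plain-text word-frequency count of subject terms, a simple stand-in for a word cloud. It assumes the harvested records were saved as harvest.json (as in the earlier harvesting sketch) and that subjects appear under sourceResource as a list of {"name": ...} objects; the field shapes should be verified against the actual harvest.

```python
# Sketch of one assessment method: a word-frequency count over subject terms,
# a text-only stand-in for a word cloud. Assumes a harvest.json file of DPLA
# docs (see the earlier harvesting sketch); adjust to the actual field shapes.
import json
from collections import Counter

with open("harvest.json", encoding="utf-8") as f:
    docs = json.load(f)

counts = Counter()
for doc in docs:
    for subject in doc.get("sourceResource", {}).get("subject", []):
        # Subjects may be plain strings or {"name": ...} objects.
        name = subject.get("name") if isinstance(subject, dict) else subject
        if name:
            counts[name.lower()] += 1

# Print the twenty most frequent subject terms for group discussion.
for term, n in counts.most_common(20):
    print(f"{n:4d}  {term}")
```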
Students —
Use various tools and methods to investigate the metadata.
Instructor —
Demonstrate the assessment rubric — see handout.
Students —
Investigate a few of the original collections that the metadata is drawn from, and use the rubric (available as part of Handouts in Additional Instructional Materials below) to assess the standards and policies related to metadata development.
Students —
Discuss and identify possible cultural bias, knowledge gaps, and inclusive/exclusive approaches to the metadata.
Students —
Identify gaps in access — Who may not discover this collection because some information is missing? Who is the primary contributor? Who is the primary audience of this collection? Who is excluded from sharing their knowledge? Are the labels useful for finding information?
Enhance metadata for use by a specific user group: keywords, enhanced description, coverage, or additional fields.
Instructor —
Discuss sharing metadata with the collection creators and working collaboratively to enhance metadata.
Students —
Use Criterion 3 of the assessment rubric to identify methods of making knowledge about the dataset more robust.
Instructor —
Demonstrate various tools and methods of refining and enhancing data (Excel, OpenRefine, or regular expressions, depending on the instructor’s experience and comfort level); a regular-expression sketch follows below.
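As a minimal sketch of the regular-expression approach (shown here in Python rather than OpenRefine), the functions below trim stray punctuation and whitespace from subject terms and pull a four-digit year out of free-text dates. The cleanup rules are illustrative assumptions, not DPLA requirements.

```python
# Minimal sketch of regular-expression cleanup, one possible alternative to
# Excel or OpenRefine. The rules below (collapsing whitespace, stripping
# trailing punctuation, extracting a four-digit year) are illustrative.
import re
from typing import Optional

def clean_term(term: str) -> str:
    """Collapse internal whitespace and strip trailing punctuation."""
    term = re.sub(r"\s+", " ", term).strip()
    return re.sub(r"[.,;]+$", "", term)

def extract_year(date_text: str) -> Optional[str]:
    """Return the first four-digit year found in a free-text date, if any."""
    match = re.search(r"\b(1[5-9]\d{2}|20\d{2})\b", date_text)
    return match.group(1) if match else None

print(clean_term("Sponge  fishing -- Florida.,"))  # Sponge fishing -- Florida
print(extract_year("circa 1925-1930"))             # 1925
```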
Students —
Identify methods of making access to the collection more robust; identify schemas, local elements, controlled vocabularies, and local terms which might be used.
Students identify issues which surround the curation of information by discussing biases and knowledge gaps.
Students successfully use tools to collect and refine data.
Students are empowered to research and to contribute to record-keeping.
Metadata records are living records, which should be revisited and revised when new information is acquired or new perspectives require re-cataloging. As the myriad communities responsible for metadata creation, curation, and use continue to evolve, they need to consider best practices which allow knowledge experts to collaborate and contribute. Perhaps we need interfaces which allow experts to fork and create enhanced collections. Whatever solutions are devised, subject experts should be involved as thinkers, contributors, and collaborators.