Critiquing Generative LLM Output in Linguistics

Nathan Sanders teaches in the Department of Linguistics, with a focus on phonology, phonetics, and innovative pedagogy. He has recently worked on ways to help students better understand various issues concerning LLMs, such as unreliability and ethical considerations.

Objectives

As part of the first-year course, Introduction to Linguistics: Sound Structure (LIN101), this assignment is designed to deepen students’ understanding of both linguistic concepts and the capabilities and limitations of generative AI. By evaluating an AI-generated diagram of the human vocal tract, students practice critical analysis and apply foundational anatomical knowledge to identify errors and omissions.

The manual drawing component reinforces visual communication and precision, while the generative AI experimentation introduces students to prompt engineering and iterative design. Through comparative reflection, students consider the reliability, efficiency, and broader implications of AI-assisted work in academic and professional contexts. Importantly, the assignment also includes an ethical opt-out pathway, allowing students to reflect on the reasons they might choose not to use AI tools and to articulate the conditions under which they might find such tools appropriate.  

Learn more in Professor Sanders’s detailed assignment guide.

Process

Students engage with the assignment through a multi-step process:

  1. AI Critique: Students are shown a midsagittal diagram of the human vocal tract generated by ChatGPT. They must:
    • Evaluate each of the ten labels for correctness
    • Identify missing labels that are essential for introductory linguistics
  2. Manual Diagram Creation: Students draw their own accurate version of the diagram by hand. They:
    • Record how long the drawing took
    • Photograph and submit the drawing
  3. Generative AI Experimentation: Using a generative tool of their choice (e.g., ChatGPT, Copilot), students attempt to recreate their hand-drawn diagram through iterative prompt engineering. They:
    • Document all prompts used
    • Submit the best or final AI-generated image
    • Describe their process and reasoning
  4. Comparative Reflection: Students compare the manual and AI-assisted methods in terms of:
    • Time and effort
    • Output quality
    • Trustworthiness
    • Broader implications in academic and professional contexts

Future-Focused Skill Development

The assignment aligns with the University of Calgary’s STRIVE model for integrating generative AI into assessment design. It promotes Responsibility by asking students to critically evaluate the accuracy and limitations of AI-generated diagrams, reinforcing the need for verification rather than blind trust in technology.

In addition, it supports Validity by ensuring that core learning outcomes – such as understanding the anatomy of the vocal tract and developing visual communication skills – are authentically demonstrated through both manual and AI-assisted methods. Importantly, it also advances Equity by offering an ethical opt-out option, allowing students to participate in ways that respect their values and comfort with AI use. 

Student Feedback

Professor Sanders shares: “Students report immense frustration with the process of trying to get LLMs to create useful output. No matter how much detail they provide or how much time they spend, the generated images simply do not improve very much. The output may look superficially more impressive, but the knowledge of linguistics they have gained in just a few weeks is enough for them to see the many serious inaccuracies.

“It is also encouraging that some students do take the opt-out option and provide nuanced and well-argued justifications, indicating that they are coming to university already equipped with some of the AI literacy skills this assignment is designed to support.”
