Carnegie Mellon’s campuses, lecture halls, and grading rubrics are now the front lines in a debate every university is quietly having: can generative AI be taught as a skill without letting it hollow out the learning it’s supposed to support? The answer CMU’s School of Computer Science is arriving at is complex and pragmatic — a patchwork of course-by-course rules, redesigned assessments, mandatory process artifacts, and experiments in teaching with AI rather than simply policing it — even as instructors watch students slide toward dependence on instant model outputs.
Background / Overview
The rapid arrival of ChatGPT, Microsoft Copilot, Google Gemini and other large language model tools transformed student workflows practically overnight. Where previous generations learned to draft, debug and reason through problems by iteration and human feedback, many students today reach for a chat window or a desktop copilot that can generate a first-pass solution in seconds. That shift has produced two simultaneous realities: powerful gains in productivity and accessibility when tools are used thoughtfully, and glaring learning gaps when they’re used as shortcuts.
Across institutions, three recurring patterns appear in the emerging evidence and pilot reports:
- High adoption rates among college students for drafting, coding, summarizing and test preparation.
- Mixed learning outcomes: efficiency and personalization gains on the one hand; lower retention and “cognitive shortcutting” where AI substitutes for encoding on the other.
- Institutional responses that favor managed adoption — procurement, literacy training, and assessment redesign — over blanket bans that are hard to enforce and often counterproductive.
What’s happening at CMU: three courses, three strategies
Carnegie Mellon’s approaches are illustrative because the school is both a producer of AI talent and a testing ground for pedagogy in a discipline where AI literacy is quickly becoming an employability requirement. The SCS response is not uniform; instead it is tailored to the learning objectives of different courses.
15-112: Fundamentals, iterative redesign, human scaffolding
15-112 (Fundamentals of Programming and Computer Science) is CMU’s large introductory course. Faced with a wave of AI-driven homework submissions and fewer students attending TA office hours, instructors adjusted the course structure to emphasize formative checks, real-time rewrite exercises, and a flipped classroom model. The goal: make in-class and quiz performance — observable measures of individual understanding — carry more weight, while preserving homework as a learning scaffold rather than a sole indicator of mastery.
That redesign included:
- Increasing quiz weight and frequency so students gain immediate feedback and lecturers can see mastery gaps earlier.
- Requiring in-class or live rewrite sessions where students must reproduce or adapt their submitted work in a monitored environment.
- Mandatory recitations and TA interactions to re-center human explanation and dialogue as primary feedback loops.
15-113: teaching with AI as a professional skill
15-113 (Effective Coding with AI) is CMU’s experimental counterpoint: instead of banning AI, it teaches students how to use it responsibly in engineering practice. The course is framed as a collaborative search for best practices — an acknowledgment that the field’s norms are still evolving and that graduates will need to explain how they incorporate AI in workplace flows. Instructors treat the moment as both a pedagogical challenge and an opportunity to cultivate industry-facing literacies.
Key emphases in 15-113:
- Prompt engineering, iterative code construction with model suggestions, and critical verification of outputs.
- Distinguishing between “Ask” conversational modalities (like ChatGPT) and “Edit” or IDE-integrated modes (like Copilot’s live edit), and teaching the affordances and risks of each.
- Assignments that require students to document their AI use, critique outputs, and show incremental development rather than a single final artifact (a sketch of this verify-and-document loop follows below).
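To make that loop concrete, here is a minimal, hypothetical sketch of the kind of exercise such an assignment might ask for: a model-suggested function, the prompt that produced it recorded as a comment, and student-written checks that run before the output is accepted. The function, prompt, and test cases are illustrative assumptions, not course materials.

```python
# Hypothetical example of a verify-and-document workflow: a model-suggested
# function, the prompt recorded for the AI-use log, and student-written checks
# exercised before the code is accepted into a submission.

# Prompt used (recorded for the AI-use log):
#   "Write a Python function that returns the longest run of identical
#    characters in a string."

def longest_run(s: str) -> str:
    """Model-suggested implementation, kept only after the checks below pass."""
    if not s:
        return ""
    best_start, best_len = 0, 1
    run_start, run_len = 0, 1
    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            run_len += 1
        else:
            run_start, run_len = i, 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return s[best_start:best_start + best_len]

if __name__ == "__main__":
    # Student-written verification: edge cases the prompt did not mention.
    assert longest_run("") == ""
    assert longest_run("a") == "a"
    assert longest_run("aabbbcc") == "bbb"
    assert longest_run("abc") in {"a", "b", "c"}  # all runs have length 1
    print("all checks passed")
```

The point is not the particular function but the habit: the prompt, the output, and the evidence of verification all travel with the code.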
15-122: no-AI zones where the learning goal requires pure reasoning
By contrast, Principles of Imperative Computation (15-122) operates with a clear no-AI policy. The instructors’ rationale is instructional fidelity: the course prioritizes theoretical proofs and reasoning skills that are compromised if AI produces the code or formal arguments for students. Where the learning goals are explicitly about mastering reasoning processes, the course enforces human-only work and uses probabilistic detection tools as part of enforcement and deterrence.
The results there are stark: instructors report that students who used AI inappropriately tended to earn markedly lower grades, indicating that delegation to AI in such a course produces genuine learning deficits. That empirical signal — a large negative effect on learning measured in course outcomes — is what drives strict prohibitions in reasoning-heavy classes.
Student behavior: instant answers, less office hour time, and the addiction problem
TAs and instructors at CMU described a behavioral shift familiar to instructors across the country: students increasingly prioritized AI over human help, trading slower, noisier human guidance for immediate, polished answers. This has two connected consequences.
First, TA office hour attendance fell as students turned to chatbots for instant explanations. TAs note that human help is slower but often more pedagogically precise and diagnostic; AI is instant but can provide plausible — not necessarily correct — reasoning. When students accept model outputs without interrogation they may complete homework but not internalize the reasoning needed for exams.
Second, some students self-reported what instructors called an “addiction” to AI: a pattern of over-reliance that persisted even after encountering poor quiz or exam outcomes. Faculty reported cases where students recognized the problem but struggled to abandon the shortcut, describing an inability to revert to slower learning strategies that had previously produced competence. This phenomenon — a behavioral lock-in to easy outputs — raises questions about long-term study habits and self-regulation.
National context: surveys, pilots, and what the evidence says about learning
CMU’s experience aligns with a growing body of institutional pilots and surveys. Across districts and universities, AI delivers measurable operational gains — time savings for faculty, scalable differentiation, accessibility improvements — when paired with training and governance. But randomized trials and controlled studies show important caveats: relying solely on AI for learning tasks can reduce retention compared with active note-taking and other cognitive encoding strategies. The Cambridge–Microsoft classroom experiment is one example that found written note-taking produced stronger delayed recall than LLM-only conditions, and a hybrid approach preserved benefits.
Two notable survey findings cited in contemporary reporting:
- A sector report indicated that college students often use generative AI to produce coursework, but are more likely to use it constructively when professors guide that use. This pattern — guidance increases constructive use — recurs across multiple higher-education surveys.
- A study of faculty attitudes found a large majority expecting their teaching models to be affected and many reporting academic integrity incidents related to generative AI. These faculty concerns reflect the practical realities instructors face when assessing learning in an era of accessible generative tools. Because specific survey numbers (for example, 79% and 73% figures circulating in November reports) are tied to particular institutional studies, it is important to treat them as time-bound and context-specific; readers should consult the original reports for methodology and sampling details.
Pedagogical responses that actually work
The best practice emerging from pilots and CMU’s experiments is to treat assessment as the lever that preserves learning validity. If models can produce polished final artifacts, then assessments must pivot to evaluate process, judgment and verification. Across campuses, instructors are converging on a menu of durable tactics.
Process-first assessment (what to require)
- Staged submissions: outlines, annotated drafts, and final submissions show development over time.
- Oral defenses and in-class demonstrations for summative work that must be explained under observation.
- Prompt logs or tool-annotation artifacts documenting how AI was used and what checks the student performed (one possible shape is sketched after this list).
- In-class timed assessments measuring on-the-spot reasoning for high-stakes evaluation.
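A prompt-log artifact could take many forms; the sketch below shows one assumed shape, with field names and the JSON serialization chosen for illustration rather than prescribed by any course. It captures the minimum a grader would need: the tool, the prompt, what came back, and what the student checked.

```python
# Illustrative only: one possible shape for a prompt-log artifact.
# Field names are assumptions, not a mandated format.
import json
from dataclasses import dataclass, asdict, field

@dataclass
class PromptLogEntry:
    tool: str                      # e.g. "ChatGPT", "Copilot (edit mode)"
    prompt: str                    # what the student asked
    output_summary: str            # what the model produced, in the student's words
    checks_performed: list[str] = field(default_factory=list)
    kept: bool = False             # whether the output made it into the submission

log = [
    PromptLogEntry(
        tool="ChatGPT",
        prompt="Explain why my binary search loops forever on an empty list.",
        output_summary="Pointed at the hi/lo update; suggested hi = mid - 1.",
        checks_performed=["re-ran unit tests", "traced the loop by hand for n=0"],
        kept=True,
    )
]

# Serialize the log so it can be attached to the submission.
print(json.dumps([asdict(entry) for entry in log], indent=2))
```

Whatever the format, the value is that the log makes the student's judgment visible, not just the final artifact.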
Teach verification and prompt literacy
Faculty should treat source verification and prompt evaluation as core literacies. Students must learn to:
- Ask models for sources and then cross-check primary materials,
- Recognize hallucinations and statistical fluency versus factual accuracy,
- Edit and refactor AI-generated code or prose, documenting the reasoning behind changes (see the brief refactor sketch below).
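As a small illustration of that last habit, the sketch below refactors a hypothetical model-suggested function and records the reasoning inline; the snippet, its flaw, and the names involved are assumptions chosen for brevity.

```python
# Illustrative refactor of a hypothetical model-suggested snippet, with the
# student's reasoning recorded as comments rather than left implicit.

# Model-suggested version (kept for reference):
# def average(nums):
#     return sum(nums) / len(nums)

def average(nums: list[float]) -> float:
    """Refactored by the student: the model's version divides by zero on []."""
    if not nums:
        # Reasoning: an empty input is a caller error here; failing loudly
        # beats silently returning 0 and hiding the bug.
        raise ValueError("average() requires at least one value")
    return sum(nums) / len(nums)

if __name__ == "__main__":
    assert average([2.0, 4.0]) == 3.0
    try:
        average([])
    except ValueError:
        print("empty-input case handled as intended")
```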
Institutional governance and procurement
Institutions should centralize procurement to secure enterprise or education SKUs that offer stronger data protections (non-training clauses, deletion rights, tenant isolation), rather than allowing ad-hoc consumer tool use that can expose student data. Centralized licensing also helps address equity by ensuring campus-wide access. These procurement choices are not merely contractual; they shape whether student inputs are used to train public models and the legal privacy surface the university must manage.
Risks that require urgent attention
While there are clear benefits, several concrete hazards deserve prioritization.
- Hallucinations and factual errors: models confidently produce plausible but incorrect content. This is especially dangerous in domains that require precision. Verification routines must be taught and enforced.
- Academic integrity and fairness: polished AI output can conceal absence of process and produce unfair grading dynamics unless assessment design changes.
- Data privacy and vendor governance: uploading exams, proprietary code, or private data to consumer models can create compliance and IP risks. Enterprise contracts with clear non-training clauses mitigate but do not eliminate governance work.
- Equity and access: premium tools, newer multimodal models, and reliable broadband are uneven across students; without institutional support, AI adoption risks widening achievement gaps.
- Emotional dependence and social impacts: some students treat chatbots as companions, raising mental health and social-skill development concerns that require monitoring and guidance.
Practical recommendations for instructors and institutions
Below are actionable recommendations distilled from CMU’s trial-and-error and a cross-section of high-quality institutional pilots.
- Redesign assessments to value process: require draft histories, annotated revisions, and oral explanations for summative tasks.
- Teach AI literacy explicitly: verification practices, prompt design, and critique of outputs should be part of coursework and syllabi.
- Use enterprise/education contracts: secure non-training clauses and deletion rights where possible and standardize campus access to reduce inequity.
- Pilot, evaluate, and scale: run controlled pilots before broad rollouts and publish measurable outcomes (time saved, student performance changes).
- Make transparency routine: require students to declare AI assistance and attach short reflections assessing the model’s reliability and their verification steps.
What employers are already asking — and why CMU is pragmatic
Industry interviews increasingly incorporate questions about AI use. Firms hiring software engineers and data scientists want to know how candidates use tools: whether they can systematize prompt patterns, verify outputs, and integrate model-assisted code into reliable pipelines. Students who refuse to learn these skills risk appearing out of step with practical expectations; yet students who over-rely on tools risk lacking foundational problem-solving resilience. CMU’s pragmatic middle path — teaching AI-literate coding in some courses while preserving AI-free zones for theoretical depth — seeks to produce graduates who can both use tools and explain their limitations.
A note on evidence: what’s verified, what needs more study
Many claims in the public conversation are supported by converging institutional pilots and surveys, especially about adoption rates, time-savings and the need for assessment redesign. Randomized trials on long-term retention and transfer remain limited, and specific survey numbers should be read with attention to sampling and methodology. Where single-study claims state precise percentages or effects, readers should consult the original reports for context; policy decisions should be based on replicated, transparent findings whenever feasible. CMU’s internal signals — drops in course enrollment after heavy AI use, lower exam performance for AI-dependent students in theory-heavy classes, shifts in office-hours attendance — are concrete institutional observations, but they are best understood as part of a broader evidence mosaic that includes controlled trials and multi-site replication.
Conclusion: a design problem, not just a policing problem
The arrival of generative AI in higher education is not a single cultural shock but a sustained design problem for pedagogy, assessment, procurement and equity. CMU’s approach — differential policies by learning goal, an experimental course that teaches how to use AI, and pragmatic redesigns that privilege process and human interaction — is an instructive model for other institutions wrestling with the same tension. The central lesson is simple but hard: AI should be treated as an augmentation of human learning workflows, not a shortcut around them. When institutions redesign assessments, teach verification, and secure data governance, they can capture AI’s productivity and accessibility benefits while guarding against the measurable learning losses that occur when students outsource core cognitive work.
CMU’s experiments are still evolving, and the broader research community must prioritize robust, replicated studies of long-term retention and transfer in AI-augmented learning environments. But the university’s combination of course-level nuance, pedagogy-first interventions, and candid reporting of student outcomes offers a practical, evidence-aligned path forward: teach the tools, preserve the reasoning, and grade the process.
Source: "AI's Impact on Education at CMU and Beyond" - The Tartan