Rethinking Grading in Higher Ed: Mastery, Equity, and AI in the Post-Pandemic Classroom

A growing national debate over grading, accelerated by pandemic-era policies and reshaped classroom practices, has been reignited by a new Harvard College report and coverage in The Chronicle of Higher Education. The debate lands squarely in the classroom, where faculty, students, and institutional leaders must now decide whether to revert to older assessment models, adapt them for Gen Z learners, or invent something better that preserves rigor without sacrificing equity.

Background

The conversation Jessica Johnson raised in her Creators Syndicate column — that post‑COVID expectations, active‑learning shifts, and pandemic-era grading leniency have altered students’ ideas about what merits an A — follows a pattern now visible across higher education. Her classroom vignette, which describes requiring peer tutoring, explicitly banning AI for grading, and encouraging responsible AI use for preparation, positions the instructor as both coach and gatekeeper in a landscape of changing assessment norms. That account is consistent with a broader institutional reckoning over grades, workload, and learning design reported by national outlets and by Harvard’s Office of Undergraduate Education. This feature examines the evidence behind the headlines, evaluates competing proposals (from returning to in‑person high‑stakes exams to alternative‑grading models), weighs the pedagogical research on active learning and generational learning preferences, and offers practical options colleges and instructors can use to align assessment with learning goals while guarding against unintended harms.

What Harvard’s grading report actually says — and why it matters

The headline numbers

Harvard’s October update on grading and workload argues that undergraduate grading has become compressed and inflated, with the share of A‑range grades rising dramatically over recent decades and now accounting for a majority of recorded course grades. The report cites a median GPA climb (Class of 2015: ~3.64; Class of 2025: ~3.83) and finds that roughly 60% of undergraduate grades now fall in the A range. Harvard’s Office of Undergraduate Education frames the issue as one that undermines grading’s core function: to differentiate levels of mastery and guide learning improvement. These figures matter beyond Harvard: they crystallize a national anxiety about how employers, graduate programs, and students themselves interpret transcript signals in an era when many elite institutions award high marks en masse. Multiple independent outlets corroborated Harvard’s statistics when the report was distributed to faculty and the campus community.
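To make the arithmetic behind those headline statistics concrete, the sketch below shows how the two figures could be computed from a flat list of course grades. The 4.0‑scale mapping, the A‑range cut-off (A‑ and above), and the toy sample are illustrative assumptions, not Harvard’s actual methodology.

```python
# Minimal sketch: computing the two statistics highlighted in the report
# (median grade points and share of A-range grades) from a hypothetical
# list of course grades. The 4.0-scale mapping and the A-range cut-off
# (A- and above) are illustrative assumptions, not Harvard's methodology.
from statistics import median

GRADE_POINTS = {"A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0,
                "B-": 2.7, "C+": 2.3, "C": 2.0}

def grade_signal_stats(grades: list[str]) -> tuple[float, float]:
    """Return (median grade points, share of grades in the A range)."""
    points = [GRADE_POINTS[g] for g in grades]
    a_share = sum(g in ("A", "A-") for g in grades) / len(grades)
    return median(points), a_share

# Toy sample of 100 grades, compressed toward the top of the scale.
sample = ["A"] * 40 + ["A-"] * 20 + ["B+"] * 20 + ["B"] * 15 + ["C"] * 5
med, share = grade_signal_stats(sample)
print(f"median grade points: {med:.2f}, A-range share: {share:.0%}")
```

On this toy sample the script reports a 3.70 median and a 60% A‑range share, which is the shape of distribution the report describes: most of the scale unused, with the top band doing the sorting work.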

The deeper diagnosis

Harvard’s analysis points to several drivers beyond student effort: faculty concerns about receiving poor course evaluations; departmental and institutional incentives that reward student satisfaction; pandemic disruptions that prompted temporary grading shifts; and structural pressures that make it socially awkward for faculty to assign low marks when peers in other departments avoid them. The report does not simply blame students; it focuses on systemic alignment problems between grading practices and academic goals.

What changed during COVID‑19: the policies and practices that reshaped expectations

Emergency grading and the “no-fail” era

When the pandemic forced instruction online in spring 2020, many K‑12 districts and colleges adopted emergency grading policies — universal pass/fail options, deferred failing marks, or “no‑fail” directives — to avoid punishing students who lacked reliable internet access or faced illness and family crises. Large public systems, including Los Angeles Unified, temporarily suspended failing grades; many top universities offered pass/fail schemes for one or more terms. These policies were intended as stop‑gap measures of equity and compassion, but they also normalized exceptions that some students and families now recall as precedent.
  • Examples: LAUSD’s 2020 “no‑fail” extension; Ivy League and peer institutions offering universal or optional pass/fail grading during spring 2020.

Pedagogical adaptations and active learning’s accelerated rollout

At the same time, instructors were forced to reimagine courses for remote delivery. Many pivoted toward active‑learning strategies, project‑based tasks, open‑book assessments, and alternative demonstrations of mastery that emphasized formative feedback and iterative improvement rather than single high‑stakes summative exams. Those approaches — often rooted in sound learning science — can produce stronger learning outcomes but may also complicate external signals (like letter grades) that stakeholders expect to be crisp and comparable across courses.

Active learning, grade inflation, and the tension between mastery and signaling

Active learning improves outcomes — but changes grade distributions

A large meta‑analysis in STEM education showed that active learning improves exam performance by about six percentage points and reduces failure rates substantially compared with traditional lecture formats. Those gains are real and repeatable across class sizes and disciplines, and they argue for wider adoption of student‑centered instruction. But when instruction becomes better at elevating baseline performance (and when assignments are restructured to prioritize iterative projects), grade distributions can compress toward the top — a positive educational outcome viewed by some administrators as “grade inflation” and by others as genuine mastery.

Two separate functions of grades

It helps to separate two purposes grades have historically served:
  • Formative function (learning): Provide feedback to help students improve; motivate sustained engagement through scaffolding and revision; reward demonstrated mastery of course outcomes.
  • Summative/signal function (sorting): Communicate relative standing to external parties — graduate schools, employers, and scholarship committees — who expect a distribution they can interpret.
Pandemic-era shifts prioritized formative flexibility; institutional leaders now worry the signal function has weakened. The dilemma is real but solvable: assessment design can preserve both functions with intentional mapping of assignments to outcomes, clear rubrics, and transparency about what a grade represents.

Gen Z learning preferences: “digital native” expectations and the myth of fixed learning styles

What students say they want

Jessica Johnson’s classroom anecdote — students asking for more interactive, hands‑on evaluation and wanting assignments that reflect Gen Z concerns — aligns with multiple surveys and studies showing younger learners often prefer applied, participatory learning, blended digital/physical materials, frequent formative feedback, and opportunities for creative work. Many describe themselves as digital natives, value immediate, actionable feedback, and favor a mix of independent and collaborative learning activities.

Learning styles vs. learning strategies

What modern cognitive science warns against is rigidly matching instruction to static “learning styles.” The evidence does not support the idea that a “visual learner” learns best only from visuals; instead, learners benefit from varied strategies and task‑appropriate modalities. The practical takeaway is to offer multiple pathways and varied assessments, not to lock students into one purported style. That approach increases accessibility and helps prevent the label “learning style” from becoming an excuse for shallow instruction.

AI in the classroom: Copilot, academic integrity, and instructor choices

Copilot and other AI tools are already in campus workflows

Microsoft’s Copilot family (Microsoft 365 Copilot, Copilot Chat, GitHub Copilot) has been explicitly positioned for education: Microsoft has expanded availability to students 13+ (with enterprise data protections), produced educator guidance, and showcased pilot implementations where AI augments feedback, ideation, and project design. Institutions are adopting AI for tutoring, assignment scaffolding, and productivity tasks — which raises both opportunity (scaffolding higher‑order work, scaling feedback) and risk (misuse for ghostwriting, unacknowledged assistance).

Practical classroom policies that work

  • Transparency first: Explicitly state in syllabi what constitutes allowed AI use (outline vs. drafting vs. citation) and how you will evaluate originality and learning.
  • Design around AI: Favor assessments that require process documentation, in‑class demonstrations, oral defense, curated portfolios, or iterative drafts — tasks that make ghostwriting difficult and learning visible.
  • Teach AI literacy: Spend classroom time on prompting, verifying sources, and ethical AI use so students can use tools responsibly rather than hiding use out of fear. Jessica Johnson’s approach — encouraging responsible Copilot use for slide design and research scaffolding but disallowing AI for final drafts — is an example many instructors are adopting.

Alternative grading models: promises, pitfalls, and a pragmatic middle path

Popular alternatives and why educators consider them

  • Contract grading / mastery contracts: Students earn grades by meeting predefined milestones; this can motivate deliberate practice.
  • Specifications grading: Work is evaluated pass/fail against explicit specs, and final grades are built from bundles of passed tasks.
  • Narrative evaluations: Qualitative comments replace letter grades entirely; better for feedback but worse for external comparability.
  • Ungrading approaches: Shift the focus to self‑assessment and revision; they can increase motivation but often demand more instructor time.
Proponents argue these models emphasize learning, equity, and motivation. Critics worry they complicate transcripts, demand more labor from faculty, and leave students and institutions to explain unconventional grades to external evaluators.

Risks and systemic effects

  • Signaling erosion: If a transcript no longer translates into an expected university‑wide scale, students may be disadvantaged in the job market or graduate admissions.
  • Resource intensiveness: Contracts, portfolios, and narrative evaluation require faculty time and institutional support that many departments lack.
  • Equity paradox: Well‑resourced students often navigate nontraditional assessment more successfully, while some disadvantaged students may prefer straightforward criteria and clear paths to credit.

A pragmatic “best of both worlds” (a hybrid approach)

Jessica Johnson’s claim that instructors can “have the best of both worlds” — preserving rigorous grading while offering Gen Z‑friendly assignments — is achievable with design discipline:
  • Map every assignment to a small set of measurable competencies.
  • Use rubrics that translate qualitative mastery into consistent grade bands.
  • Reserve some assessments for controlled, in‑person, or proctored demonstration of skills (where signal clarity is crucial).
  • Offer frequent formative tasks (peer tutoring, iterative drafts) that feed into a smaller set of summative artifacts used for high‑stakes ranking.
  • Consider transcript annotations (where allowed) that explain alternative assessment methods, or provide departmental statements about grading norms.
These steps preserve the external comparability of grades while honoring active‑learning and Gen Z preferences for interactive, meaningful work.
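As a concrete illustration of the rubric step above, here is a minimal sketch of how scores on a small set of competencies might translate into consistent grade bands. The competency names, the 0–4 scale, and the band cut-offs are illustrative assumptions, not a prescribed scheme.

```python
# Minimal sketch: translating per-competency rubric scores into grade
# bands. The competencies, the 0-4 scale, and the cut-offs below are
# illustrative assumptions, not a prescribed institutional scheme.

COMPETENCIES = ("analysis", "evidence", "communication")

# Bands defined on the mean rubric score, highest threshold first.
GRADE_BANDS = [(3.5, "A"), (3.0, "B"), (2.5, "C"), (2.0, "D")]

def band_for(scores: dict[str, float]) -> str:
    """Map rubric scores (0-4 per competency) to a letter-grade band."""
    missing = [c for c in COMPETENCIES if c not in scores]
    if missing:
        raise ValueError(f"unscored competencies: {missing}")
    mean_score = sum(scores[c] for c in COMPETENCIES) / len(COMPETENCIES)
    for threshold, letter in GRADE_BANDS:
        if mean_score >= threshold:
            return letter
    return "F"

# Example: strong analysis, weaker communication lands in the B band.
print(band_for({"analysis": 4.0, "evidence": 3.5, "communication": 2.5}))
```

Publishing the cut-offs alongside the rubric is what keeps the translation from qualitative mastery to letter grades consistent across sections and graders.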

Practical recommendations for instructors and departments

For individual instructors

  • Be explicit in the syllabus: State grading philosophy, allowed AI uses, and how formative activities translate into the final grade.
  • Make learning visible: Require one or more in‑person or synchronous demonstrations (presentations, brief oral exams, lab practicals) that validate mastery.
  • Use clear rubrics: Publish rubrics that explain what an A, B, or C represents in terms of skill and evidence.
  • Embed process evidence: Ask students to submit work logs, drafts, reflections, or version history that show individual contribution.
  • Model AI use: Demonstrate how to prompt tools and verify outputs; teach students to cite AI assistance the way they cite sources.

For departments and institutions

  • Provide faculty time and training for rubric creation, grading calibration, and anti‑bias moderation.
  • Publish departmental medians or grade‑range guidelines to reduce cross‑course inconsistency while preserving academic freedom.
  • Invest in AI governance (privacy protections, training, tool procurement) so faculty need not rely on ad‑hoc “bring your own AI” (BYOAI) approaches.
  • Pilot transcript annotations or supplemental narrative statements for nontraditional programs, and liaise with graduate‑admissions and HR communities to explain assessment methods.

Risks, trade‑offs, and red flags

  • Backsliding to high‑stakes testing: Moving abruptly to a regime of frequent in‑person, high‑stakes exams to “fix” inflation risks excluding students with disabilities, caretaking responsibilities, or jobs — unless accommodations are thoughtfully provided.
  • Overreliance on technology: Delegating formative feedback to AI without oversight can mislead students and erode learning if instructors don’t validate AI outputs.
  • Equity blind spots: Whatever model is adopted, ensure that flexibility doesn’t become lower expectations for historically marginalized students or, conversely, that stricter norms don’t disproportionately penalize those juggling external responsibilities.
When public discussion focuses narrowly on “easy grades” or “entitlement,” it misses the more structural issues Harvard’s report raised: assessment incentives, cross‑departmental comparability, and institutional governance. That’s where sustainable reform begins.

How to measure success after reform

Set clear, measurable goals before changing practice:
  • Track distributional changes (median, variance) and relate them to course‑level learning outcomes rather than raw grade percentiles.
  • Survey alumni, employers, and graduate programs about graduates’ preparedness for advanced work.
  • Monitor student workload, mental‑health metrics, retention in majors, and equity of outcomes across demographics.
  • Run small controlled pilots (e.g., specs grading in one course section) and compare learning outcomes vs. traditional sections.
This iterative, data‑driven approach helps avoid crude fixes and supports evidence‑based policy.
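As a sketch of the distributional tracking and pilot comparison suggested above, the snippet below summarizes a pilot section against a traditional section. The score lists and the 0–100 outcome scale are invented for illustration; a real analysis would use outcome-aligned assessments and proper inferential statistics.

```python
# Minimal sketch: comparing outcome distributions between a pilot section
# (e.g., specifications grading) and a traditional section. The scores
# and the 0-100 scale are invented for illustration only.
from statistics import mean, median, pstdev

def summarize(label: str, scores: list[float]) -> None:
    """Print the distributional measures the text suggests tracking."""
    print(f"{label}: n={len(scores)}, median={median(scores):.1f}, "
          f"mean={mean(scores):.1f}, sd={pstdev(scores):.1f}")

pilot = [88, 91, 84, 79, 93, 85, 90, 87, 82, 89]
traditional = [92, 68, 75, 88, 59, 81, 95, 72, 84, 66]

summarize("pilot", pilot)
summarize("traditional", traditional)

# A smaller spread with a stable or higher median can indicate genuine
# mastery rather than leniency, but pair it with workload, retention,
# and demographic breakdowns before drawing policy conclusions.
```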

Conclusion

The debate over grading is not a simple contest between old‑school exams and pandemic leniency. It is a deeper question about what grades are for — are they primarily feedback mechanisms, sorting signals, or both? Jessica Johnson’s classroom practice — combining peer tutoring, explicit policies on AI, and assignments that connect to Gen Z concerns — is a practical microcosm of the balanced path many instructors should consider.
Institutions should heed Harvard’s warning about compressed grading distributions, but they should also respect the pedagogical gains of active learning and the legitimate needs of a generation raised with digital tools. The best reforms will be those that: (1) clarify the purpose of grades, (2) design assessments that make learning visible, (3) integrate AI literacy rather than banish the technology, and (4) create cross‑departmental norms that protect both rigor and equity.
If higher education can align assessment with clear learning outcomes, communicate transparently to students and external stakeholders, and invest in the training and time faculty need to do this work well, the result will not be a return to an inaccessible past or a capitulation to pandemic-era exceptions. Instead, it will be an intentional, evidence‑based advancement of assessment that serves learners, preserves academic standards, and recognizes that the classroom — post‑COVID and AI‑enabled — must evolve to meet both students’ needs and society’s expectations.
Source: Jessica Johnson, “Accounting for Different Learning Styles Post-COVID-19,” Creators Syndicate