
Microsoft’s push to have employees lean on AI for routine work—including the way performance reviews are written—has rippled through the company and beyond. In June 2025 a division leader’s internal note made clear that AI adoption would be treated as a core expectation, and reporters and commentators have since connected that directive to a broader industry trend of tying AI fluency to performance and career outcomes. The result: a debate that mixes practical productivity questions, concerns about quality (what people call “AI slop”), fairness in evaluation, and basic questions of trust and governance.
Why this matters now (short version)
- A June 2025 internal message from Julia Liuson — who runs Microsoft’s developer tools / GitHub-related organization — asked managers to consider employees’ use of internal AI tooling as part of “holistic reflections” on performance. That memo included the line that “using AI is no longer optional,” and it explicitly encouraged managers to factor AI adoption into how they evaluate people.
- Several outlets have reported that some teams are considering or piloting formal metrics that would measure AI tooling usage as part of reviews; other companies in Big Tech are moving in the same direction.
- That shift has practical consequences for everyday work (what tools you use and how you document output), for how managers assess staff, and for the culture inside organizations that now see AI as central to “how we work.”
- The core message attributed to Julia Liuson in reporting was: “AI is now a fundamental part of how we work. Just like collaboration, data-driven thinking, and effective communication, using AI is no longer optional — it’s core to every role and every level.” Managers were asked to make AI usage “part of your holistic reflections on an individual’s performance and impact.” That line is what sparked headlines about mandatory AI adoption and the idea that AI could affect reviews and promotions.
- Reporters framing the story noted the memo is one signal among many: product teams want higher internal adoption of Copilot and other Microsoft AI services; investors and analysts are watching adoption metrics; and leadership is signalling that the company’s strategy depends on the workforce actually using the tools it sells. Those dynamics help explain the urgency in the language.
- When people call model output “AI slop” they’re pointing to the common experience of receiving plausible-sounding but low-quality, weakly specific, or factually shaky text from generative models—boilerplate that needs heavy editing to actually be useful. This term has moved from forums into journalism because employees and managers often describe AI drafts of reviews, emails, or design documents as “slop” when they’re repetitive, generic, or obviously machine-generated.
- That quality problem matters here because the very thing leaders want—faster drafting and consistent language—can backfire if the output is impersonal, inaccurate, or misapplied in decisions that affect careers. An AI-written paragraph about someone’s “impact” that contains errors or obvious generic phrasing can feel insulting to the employee and risky for a manager who later has to justify ratings or promotions.
- Business Insider obtained the internal note and published the key quotes and context around managers being asked to consider AI use when reflecting on employee performance. That article is the primary source most other writeups cite.
- Forbes and other business outlets framed the same development as part of an industry trend: companies are increasingly asking staff to show “AI-driven impact” and sometimes offering internal tools (or assistants) to help staff write reviews. That trend is visible across multiple large employers.
- The manager who uses AI to draft 20 reviews in an afternoon
- A manager might feed bullet notes into Copilot, get first-draft paragraphs back, edit, and finalize. That saves time—but if editing is cursory, the voice becomes generic and employees notice the boilerplate. Over-reliance without human personalization produces feedback that undermines morale.
- The engineer told to use Copilot every day
- Developer teams are being asked to experiment with and use internal coding assistants in their workflow. For some engineers, this genuinely speeds debugging and scaffolding; for others, it’s an added review burden because the generated code needs careful inspection and fixes.
- The employee who pastes a generic AI self-evaluation into the form
- Workers tempted to paste AI output into self-assessments risk misrepresenting facts or losing the authenticity of their voice. HR practices and calibration meetings may catch some errors, but inconsistent editing and the presence of “AI disclaimers” (or even obvious template text) have surfaced in real internal examples and public anecdotes.
- Quality and accuracy: Generative models can hallucinate facts, misstate dates, or invent metrics. Using them to draft review language that feeds compensation decisions raises clear accuracy and fairness issues.
- Dehumanization: Boilerplate AI language reduces the personal tone of feedback, making recognition feel transactional. That damages morale and trust if employees feel their manager didn’t meaningfully evaluate them.
- Incentive misalignment: If adoption metrics are tracked (raw Copilot calls, number of queries, etc.), teams may gamify the measures, treating “use of AI” as a proxy for performance rather than a demonstrable impact on outcomes.
- Legal and compliance exposure: Depending on data handling and model training, using AI to process proprietary documents or sensitive performance notes can create privacy or IP risk if not governed properly.
- Unequal access and skill gaps: Not everyone is equally skilled at prompting or evaluating AI output. Turning AI fluency into a performance criterion without training or remediation risks penalizing those who lack access or prompt-engineering experience.
- Product-market alignment: Microsoft (and other Big Tech companies) earn revenue from AI products; higher internal adoption helps product teams find bugs, improve UX, and demonstrate enterprise ROI to customers. There’s a strategic carrot-and-stick element: the company wants its own employees to be flagship users.
- Productivity hypothesis: Leaders argue that AI frees time from routine tasks to concentrate on higher-value work. Where that happens, it can boost throughput and allow people to focus on judgment-intensive tasks.
- Competitive posture: In a market where rivals loudly say “you must use AI,” not using it can be framed internally as falling behind on the firm’s core strategic capabilities.
- Saying “using AI is no longer optional” is rhetorical pressure to adopt tools, not necessarily a hard rule that an employee must submit evidence of a specific number of Copilot actions. The nuance matters: managers were told to include AI adoption in their reflections about performance, not to grade people solely on raw usage counters. How teams operationalize that guidance will determine whether it becomes punitive, developmental, or merely descriptive.
- Human-in-the-loop requirement
- Never accept an AI draft as final. Require managers to edit and personalize every review paragraph and to document what was changed.
- Source-provenance checks
- If the AI produced a fact (e.g., “you increased conversion by 32%”), ensure the manager attaches or references the data source.
- Train-and-test
- Provide mandatory, role-specific AI-skilling sessions that include safe prompting, bias detection, and validation techniques before usage affects formal assessments.
- Transparency with employees
- If a manager used AI to draft feedback, the review meeting should include a human explanation of the reasoning behind the rating and specifics that the AI could not produce (context, trade-offs, subjective judgments).
- Don’t turn adoption into a raw metric
- Measure outcomes (time saved, features shipped, customer impact), not raw Copilot query counts. Raw usage is a poor proxy for impact.
- Audit trails: Implement logging that records AI outputs, prompts used, and edits made for performance-critical documents (with employee notice); a minimal sketch of such a log record appears after this list.
- Privacy guardrails: Block high-risk data from being sent to external models; vet allowed models for training-data provenance and retention policies.
- Dispute-resolution mechanisms: Create a clear path for employees to contest review language that appears inaccurate or machine-generated without human context.
- Calibration and bias mitigation: Use diverse calibration panels to spot patterns where AI-assisted reviews systematically under- or over-value certain groups.
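To make the audit-trail recommendation above concrete, here is a minimal sketch of what a log record for an AI-assisted review draft could look like. Everything in it (the AuditRecord fields, the append_audit_record helper, the log path) is a hypothetical illustration rather than any existing Microsoft or Copilot API, and a real deployment would add employee notice, retention limits, and access controls as the guardrails above describe.

```python
# Minimal sketch of an audit-trail record for AI-assisted review drafts.
# All names here (AuditRecord, append_audit_record, the log path) are
# illustrative assumptions, not an existing Microsoft or Copilot API.
import json
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class AuditRecord:
    document_id: str          # e.g., the review form's internal ID
    manager_alias: str        # who ran the tool and edited the draft
    model_name: str           # which model or assistant produced the draft
    prompt: str               # the prompt the manager used
    ai_output_sha256: str     # hash of the raw draft (store the text elsewhere if policy allows)
    final_text_sha256: str    # hash of the text actually submitted
    human_edited: bool        # True if the manager changed the draft before submitting
    employee_notified: bool   # whether the employee was told AI was used
    timestamp_utc: str


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_audit_record(record: AuditRecord, log_path: Path = Path("review_audit.jsonl")) -> None:
    """Append one record as a JSON line; a real system would use an access-controlled store."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")


# Example usage with placeholder values:
draft = "Alex shipped the billing migration ahead of schedule..."
final = "Alex led the billing migration, landing it two weeks early..."
append_audit_record(AuditRecord(
    document_id="FY25-review-0042",
    manager_alias="mgr-jdoe",
    model_name="copilot-internal",   # placeholder, not a real model identifier
    prompt="Summarize these bullet notes into a review paragraph: ...",
    ai_output_sha256=sha256(draft),
    final_text_sha256=sha256(final),
    human_edited=(draft != final),
    employee_notified=True,
    timestamp_utc=datetime.now(timezone.utc).isoformat(),
))
```

Hashing both the raw draft and the submitted text keeps a verifiable record of whether a human actually edited the AI output, without storing sensitive review prose in the log itself.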
- Own your narrative: Keep notes throughout the year (achievements, metrics, feedback) so you can feed accurate inputs to any AI tool and then verify the output.
- Use AI as a drafting partner, not an autopilot: Use prompts to produce drafts, then edit heavily to add voice and specificity.
- Document sources: When you paste AI-crafted claims into a self-evaluation, attach links or evidence (dashboards, PRs, customer quotes).
- Push for transparency: Ask HR or managers how they expect AI to be used in your review cycle and whether there are support resources or training.
- Fairness and auditability: If AI adoption influences pay and promotion, regulators and labor advocates may demand auditability and a clearer definition of what metrics are used and why.
- Worker upskilling and access: Firms should fund equitable training programs; otherwise, tying outcomes to AI fluency will advantage already-resourced employees.
- Disclosure and consent: Employers should disclose if parts of reviews will be AI-assisted and should obtain employee consent for how their data is used in the generation/editing pipeline.
- Public anecdotes—forum threads and internal leak reporting—show managers sometimes copy-pasting AI drafts with minimal edits; employees notice canned language and sometimes find factual errors or embarrassing boilerplate. That phenomenon is what people often refer to as “AI slop.”
- Commentary from analysts and business outlets contextualizes the memo as part of a larger push across major tech firms to make AI fluency a career currency. Some see this as inevitable; others warn it’s premature without proper guardrails and training.
- Pilot phase with opt-in teams (3–6 months)
- Select teams, provide training, instrument tools to measure both usage and the quality of outputs.
- Measurement design (concurrent)
- Define outcome-based KPIs (e.g., time saved on administrative tasks, defect reduction in code generated with AI, customer-facing metrics) instead of raw usage counters; one way to encode such KPIs is sketched after this list.
- Transparency and consent (policy)
- Publish an internal AI use policy that clarifies permitted models, data handling, and disclosure requirements for performance documentation.
- Calibration and audit (quarterly)
- Run blind calibration sessions to check for biasing patterns and adjust guidance.
- Company-wide rollout with remediation
- Scale only after pilots show meaningful productivity or quality gains and after training and governance structures are in place.
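As one illustration of the measurement-design step above, a pilot team could write its KPIs down as explicit definitions tied to a data source, a baseline, and a target, so calibration sessions discuss evidence rather than Copilot query counts. The structure, field names, and numbers below are assumptions for illustration only, not Microsoft's actual metrics.

```python
# Illustrative sketch of outcome-based KPI definitions for an AI pilot team.
# Names, metrics, and numbers are placeholders, not any company's real measures.
from dataclasses import dataclass


@dataclass
class OutcomeKPI:
    name: str
    description: str
    data_source: str       # where the evidence lives (dashboard, tracker, survey)
    baseline: float        # pre-pilot value
    target: float          # value the pilot aims for
    higher_is_better: bool


PILOT_KPIS = [
    OutcomeKPI(
        name="admin_hours_per_week",
        description="Self-reported hours spent on routine drafting and admin tasks",
        data_source="quarterly time-use survey",
        baseline=6.0, target=4.0, higher_is_better=False,
    ),
    OutcomeKPI(
        name="escaped_defects_per_release",
        description="Defects found after release in code written with AI assistance",
        data_source="bug tracker",
        baseline=3.5, target=3.0, higher_is_better=False,
    ),
    OutcomeKPI(
        name="customer_satisfaction",
        description="CSAT for features the pilot team shipped during the period",
        data_source="customer feedback dashboard",
        baseline=4.1, target=4.2, higher_is_better=True,
    ),
]


def kpi_met(kpi: OutcomeKPI, observed: float) -> bool:
    """Check whether an observed value meets the KPI's target."""
    return observed >= kpi.target if kpi.higher_is_better else observed <= kpi.target
```

Keeping the data source inside each definition also supports the source-provenance check described earlier: any KPI-based claim that ends up in a review can point back to where the number came from.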
- The technology’s promise is real: AI can reduce grunt work, surface forgotten accomplishments for reviews, and help managers synthesize feedback. But the promise is conditional on human oversight, governance, and a careful focus on outcomes rather than raw usage tallies. Without those precautions, the risk is a culture that rewards quantity of AI queries rather than quality of work, or a workplace where people feel their careers hinge on their ability to “prompt-engineer” rather than on delivering sustained impact.
- Managers: Use AI to help organize facts and craft first drafts, but always add human judgment, context, and specificity before a review is final.
- Employees: Treat AI as a drafting assistant—keep evidence, own your story, and ask for clarity on how AI will be used in any evaluation that affects pay or promotion.
- HR and leaders: If you plan to embed AI use in reviews, design outcome-focused measures, provide training, ensure auditability, and be transparent with employees.
- This feature is grounded in contemporaneous reporting of an internal Microsoft memo first reported in late June 2025 and widely discussed thereafter. Business Insider reported the memo and provided the direct quote attributed to Julia Liuson; analysis and context are available in business coverage that followed.
- Community and internal discussion fragments reflecting the phrase “AI slop” and examples of low-quality AI output are captured in public forum archives and internal commentary that we reviewed as part of this piece. Those excerpts illustrate the lived experience that drives employee skepticism about boilerplate AI reviews.
The debate is not whether AI will be used at work—AI is already here—but how it will be integrated into managerial practice in a way that protects fairness, accuracy, and the human relationships that make feedback — and careers — meaningful. Microsoft’s memo signalled a clear strategic direction; how that direction is operationalized will determine whether the outcome is better, more consistent reviews or a proliferation of impersonal “AI slop” that destroys trust. The choice between those futures is not automated—it's managerial.
Source: Neowin https://www.neowin.net/news/microso...te-ai-slop-for-performance-reviews-this-year/


