Botsitting and AI Cleanup: The Hidden Labor Behind Office AI Productivity

A June 2026 Glean Work AI Institute report found that office workers in the United States, United Kingdom, and Australia spend an average of 6.4 hours a week supervising AI systems, even as many say those tools make individual tasks faster. The finding should puncture the cleanest version of the AI productivity story. The issue is no longer whether generative AI can draft, summarize, code, classify, or search; it is whether organizations understand the new layer of human labor required to make those outputs usable. AI is not just automating office work. It is quietly reorganizing office work around inspection, correction, context feeding, and blame avoidance.

The Productivity Miracle Has a Labor Shadow

The first wave of enterprise AI messaging sold time. Copilots would shorten meetings, agents would clear queues, chatbots would turn messy prompts into polished work, and office employees would be freed from low-value toil. That claim was never wholly false; millions of workers have already discovered that a decent model can produce a rough draft, summarize a thread, or translate a pile of notes faster than a human starting from zero.
But time saved at the task level is not the same thing as time saved at the organizational level. The new reporting around “botsitting” captures the difference: workers are not merely using AI, they are managing it. They prompt, wait, check, correct, rerun, paste between systems, explain the obvious, and then absorb the reputational risk if the machine’s work makes it into a client deck, a legal memo, a support response, or a codebase.
That is why the 6.4-hour figure matters. It is not a stray anecdote from a cranky knowledge worker who hates change. It is a measurement of a hidden operating cost that many AI business cases still treat as a rounding error. Nearly a working day each week is not “friction.” It is a new job function, often performed by people whose job descriptions, performance reviews, and calendars have not caught up.
The Chicago Tribune framing — AI cutting hours of office work while creating a new kind of busywork — gets at the contradiction. AI can make one task feel magical and the day feel heavier. The worker who used to write the first draft now supervises the first draft, audits its assumptions, converts it into the organization’s tone, and explains why the result cannot simply be trusted.

Botsitting Is What Happens When AI Becomes Everyone’s Intern​

The term botsitting sounds comic, but it is useful because it restores the human to the center of the workflow. The fantasy of the autonomous agent is a system that receives a goal and returns a result. The reality, in many offices, is closer to an overconfident intern with instant recall, shaky judgment, and no institutional memory unless someone painstakingly supplies it.
That dynamic creates a peculiar kind of fatigue. Traditional busywork is often boring because it is repetitive. AI busywork is exhausting because it demands vigilance. The worker must stay alert enough to catch subtle errors, hallucinated details, missing context, formatting drift, policy violations, and confident nonsense dressed in executive prose.
This is not the same as learning a new keyboard shortcut or adopting a better search tool. Generative AI introduces probabilistic output into environments that still demand accountability. A spreadsheet formula either calculates or fails; a model may produce something plausible, incomplete, legally risky, or simply not aligned with how the company actually works.
The human therefore becomes the quality gate. In many organizations, that quality gate is not formally recognized. It is wedged into existing roles under the assumption that AI assistance is a net reduction in work. That assumption may hold for narrow, well-scoped tasks. It breaks down when AI is sprayed across departments without redesigned processes, training, data access, and governance.

The Personal Productivity Gap Is Now the Enterprise Problem​

One of the most important tensions in the Glean findings is the gap between individual and organizational benefit. A large majority of surveyed workers reportedly said they use AI at work, and many said it makes them more productive. Yet far fewer said their organizations are performing significantly better because of it. That is the productivity paradox in miniature.
The explanation is not mysterious. A worker may save 20 minutes drafting a memo and then spend 15 minutes checking it, five minutes reformatting it, and another 10 minutes clarifying the source material the model mangled. Even in cases where the individual does come out ahead, the organization may not. The saved time may disappear into review loops, tool switching, redundant approvals, and meetings about how to use AI “responsibly.”
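The memo arithmetic is worth writing out, because the cleanup steps in that example add up to more than the time saved. A minimal sketch follows; the function name is illustrative, and the second call uses made-up figures to show a case where the worker does come out ahead.

```python
def net_minutes_saved(saved, *cleanup_steps):
    """Return minutes saved after subtracting every cleanup step."""
    return saved - sum(cleanup_steps)


# Figures from the memo example: 20 minutes saved on drafting, then
# 15 checking, 5 reformatting, and 10 clarifying source material.
memo_net = net_minutes_saved(20, 15, 5, 10)

# Hypothetical lighter cleanup (not from the report): 5 checking, 5 reformatting.
light_net = net_minutes_saved(20, 5, 5)

print(f"memo example: {memo_net} min")     # memo example: -10 min
print(f"lighter cleanup: {light_net} min")  # lighter cleanup: 10 min
```

With the article's own numbers the worker ends up 10 minutes behind, which is the point: the saving is real at the drafting step and gone by the time the memo is usable.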
Worse, productivity may become more uneven. The employee who already understands the domain, the company, and the risks can use AI as acceleration. The employee who lacks that context may use AI to generate more work for reviewers. In that world, the productivity gains accrue to the already capable, while the cleanup burden often lands on managers, senior analysts, security teams, legal reviewers, and IT.
That has a familiar ring for anyone who lived through earlier enterprise technology cycles. Email was supposed to reduce meetings; it created inbox work. Collaboration suites were supposed to centralize knowledge; they multiplied channels. Low-code tools were supposed to empower departments; they created shadow systems that IT later had to secure and rationalize. AI is following the same arc, only faster and with higher stakes.

The Real Tax Is Context​

The phrase “AI cleanup” makes the problem sound like error correction, but the deeper tax is context. Most office work is not hard because sentences are hard to write. It is hard because the worker knows which sentence can be said, which number is safe to use, which customer history matters, which policy applies, and which unstated constraint will blow up the plan.
AI systems do not automatically inherit that context. They need access to the right documents, permissions, metadata, workflow states, and business rules. When they do not have it, employees become the connective tissue. They paste snippets from one system into another, summarize the backstory, restate constraints, and clean up the output when the model optimizes for generic plausibility rather than local truth.
This is where WindowsForum readers should see the enterprise IT angle clearly. The model is only the visible surface. Beneath it are identity, access control, endpoint management, data classification, records retention, auditing, network boundaries, and integration architecture. If those foundations are weak, AI does not eliminate complexity. It exposes it.
That is why a worker can be using four or more AI tools and still feel less supported. More tools can mean more interfaces to babysit, more policy ambiguity, more copy-and-paste risk, and more output formats to reconcile. If the “agentic” future depends on humans moving data between agents, the future has already failed its own automation pitch.

Microsoft’s Copilot Ambition Runs Into the Same Office Physics​

Microsoft has been unusually well positioned to sell AI into the workplace because it already owns so much of the work surface: Windows, Microsoft 365, Teams, Outlook, SharePoint, OneDrive, Edge, Entra, Intune, Defender, Power Platform, and Azure. The Copilot strategy is built on a compelling idea: AI becomes more useful when it lives where the work already happens.
That is the right direction, but it does not repeal office physics. A Copilot that can summarize a Teams meeting is useful. A Copilot that summarizes the wrong meeting, misses the decision that happened in chat, or cannot distinguish a tentative idea from an approved action item still needs a human editor. The more authoritative the AI output appears, the more dangerous a quiet error becomes.
Microsoft’s challenge is therefore not just model quality. It is trust calibration. Users need to know when Copilot is drafting, when it is retrieving, when it is inferring, and when it is acting. Administrators need to know what data it can touch, what it logs, how permissions are inherited, and how mistakes can be investigated. Security teams need to understand whether AI expands the blast radius of overshared files and stale access rights.
For Windows administrators, this is not an abstract productivity debate. AI assistants on managed endpoints will touch the same old enterprise problems: least privilege, device posture, data leakage, retention, compliance, and user training. A bad AI rollout is not just a morale issue. It is a governance issue wearing a productivity badge.

The Cleanup Crew Is Often the Most Valuable Staff​

The National CIO Review’s headline — the AI cleanup crew is getting tired — points to a management problem hiding in plain sight. The people who can clean up AI work are often the people with the most institutional knowledge. They know the customer, the system, the exception, the security policy, the executive preference, and the history behind the numbers.
Those employees are valuable precisely because they can tell when AI output is wrong. But that ability makes them magnets for invisible labor. They become reviewers of AI-generated drafts, explainers of AI mistakes, fixers of agentic workflows, and informal trainers for colleagues who have been told to “use AI more” without being given much guidance.
That is a recipe for resentment. When employees are asked to absorb the cognitive load of AI supervision while leadership celebrates automation savings, the bargain starts to look dishonest. The worker experiences AI as another layer of responsibility; the organization reports it as productivity improvement.
Retention risk follows naturally. If the most competent people are assigned the burden of turning machine output into accountable work, they may begin to look for employers that recognize that labor. The irony is sharp: companies may deploy AI to reduce dependence on scarce expertise, only to burn out the experts required to make the AI useful.

AI Fatigue Is Not Anti-Technology Sentiment​

It is tempting for vendors and executives to dismiss complaints about botsitting as resistance to change. That would be a mistake. Office workers are not rejecting AI because they love manual drudgery. Many are using it enthusiastically. The fatigue comes from the gap between the marketed experience and the lived workflow.
In demos, AI operates in a clean room. The source documents are available, the prompt is well formed, the permissions are aligned, the output is polished, and the user accepts the result. In real offices, the data is messy, the files are duplicated, the policies are ambiguous, the output needs review, and the user is interrupted by five other systems demanding attention.
This gap is especially painful because AI tools often encourage overproduction. A system that can generate five drafts, 20 ideas, or 200 lines of code in seconds creates a new burden of selection. Someone has to decide what is good, what is safe, what is redundant, and what is subtly wrong. The bottleneck moves from production to judgment.
That shift matters culturally. Knowledge workers derive status not only from finishing tasks but from owning judgment. If AI turns more employees into supervisors of machine work while reducing space for original thought, organizations may gain throughput and lose engagement. The office may become faster and more alienating at the same time.

The ROI Spreadsheet Has Been Too Polite​

AI business cases often count subscription costs, infrastructure costs, implementation costs, and projected labor savings. They are less likely to count review time, rework, prompt tinkering, compliance checks, duplicate tool usage, and the opportunity cost of senior employees correcting junior employees’ AI-assisted work. That omission makes the ROI look cleaner than the workplace.
This is not a call to stop deploying AI. It is a call to stop pretending that deployment is adoption and adoption is value. The number of active users tells executives that people are touching the tool. It does not tell them whether the tool improved a workflow, reduced cycle time, improved quality, lowered risk, or merely shifted effort into unmeasured cleanup.
A more honest AI scorecard would measure the full path from request to accepted result. How many steps did the AI remove? How many did it add? How often did humans redo the work? Which roles absorbed the review burden? Which outputs could be trusted with light editing, and which required expert reconstruction?
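One way such a scorecard might be tallied is sketched below. Everything here is an assumption for illustration: the record fields, the role names, and the sample data are invented, and a real program would define its own measures.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """One AI-assisted request, tracked until a human accepted the result.

    All fields are hypothetical placeholders.
    """
    steps_removed: int      # manual steps the AI eliminated
    steps_added: int        # prompting, checking, pasting steps it introduced
    redone_by_human: bool   # True if a person rebuilt the output
    reviewer_role: str      # who absorbed the review burden
    review_minutes: float   # time spent reviewing before acceptance


def scorecard(records):
    """Summarize request-to-accepted-result metrics for a list of records."""
    if not records:
        raise ValueError("need at least one record")
    review_by_role = defaultdict(float)
    for r in records:
        review_by_role[r.reviewer_role] += r.review_minutes
    return {
        "net_steps_removed": sum(r.steps_removed - r.steps_added for r in records),
        "redo_rate": sum(r.redone_by_human for r in records) / len(records),
        "review_minutes_by_role": dict(review_by_role),
    }


# Made-up sample data, for illustration only.
sample = [
    TaskRecord(3, 1, False, "analyst", 5.0),
    TaskRecord(2, 4, True, "senior analyst", 25.0),
    TaskRecord(4, 2, False, "senior analyst", 10.0),
    TaskRecord(1, 1, False, "analyst", 3.0),
]
result = scorecard(sample)
print(result)
```

In this invented sample the AI removes a net two steps across four tasks, one task in four is redone by a person, and most of the review time lands on the senior role. An active-user count would show none of that.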
Without that accounting, companies risk optimizing for visible activity. Employees will use AI because they are expected to use AI. They will generate more artifacts because generation is easy. The organization will then drown in plausible drafts, summaries, and recommendations that still require human sorting.

Disconnected Tools Are Turning Workers Into Middleware​

The most revealing complaints about workplace AI often sound less like complaints about intelligence and more like complaints about integration. The model may be clever, but it does not know where the latest contract lives. The assistant can summarize a document, but not connect that summary to the ticketing system. The chatbot can answer a policy question, but not open the HR workflow that resolves it.
When systems are disconnected, humans become middleware. They copy data, translate formats, reconcile contradictory records, and make judgment calls across software boundaries. AI can make that work faster in places, but it can also make the handoffs more numerous. A worker who once completed a task in three familiar systems may now consult two AI assistants, verify output in another tool, and paste the final version back into the original workflow.
For IT leaders, this should shift the adoption conversation away from “Which AI tool should we buy?” and toward “Which workflow are we redesigning?” A chatbot bolted onto a broken process produces a faster broken process. An agent without reliable permissions, APIs, logging, and rollback is not a labor-saving device; it is an operational liability.
The most successful AI deployments are likely to be boring in exactly this way. They will have fewer splashy prompts and more disciplined plumbing. They will connect to governed data, respect identity boundaries, surface uncertainty, and make human review easier rather than pretending it can be eliminated.

Security Teams Will Be Asked to Clean Up, Too​

Every wave of workplace technology creates a shadow backlog for security. AI is no different. Employees experimenting with public tools may paste sensitive data into systems the company has not approved. Departments may procure AI services outside IT review. Generated content may carry confidential fragments, licensing concerns, or policy violations. Agents may be granted broader access than they need because narrow access is harder to configure.
The botsitting burden therefore has a security twin. Someone has to classify what AI can see, decide which tools are approved, monitor usage, investigate incidents, and educate users about safe behavior. If that work is not funded and staffed, it lands on already stretched security and IT operations teams.
There is also a subtler risk: AI can launder uncertainty into confidence. A polished answer may hide weak sourcing. A generated script may run with privileges the user does not fully understand. A summarized incident may omit the anomaly that mattered. The cleanup crew in security is not just correcting prose; it is protecting the organization from automation that makes bad assumptions look official.
This is why governance cannot be treated as a brake on innovation. Good governance is what allows AI to scale without forcing every worker to become a full-time skeptic. The goal is not to smother experimentation. It is to make safe, useful paths easier than risky workarounds.

The Agentic Office Still Needs Adult Supervision​

The next phase of AI marketing is already centered on agents: systems that do not merely answer but act. They will file tickets, update records, schedule follow-ups, draft responses, query databases, and execute multistep workflows. In theory, this should reduce botsitting. In practice, it could intensify it if organizations skip the hard design work.
An agent that acts incorrectly is more dangerous than a chatbot that writes incorrectly. A flawed paragraph can be edited. A flawed action may change a record, notify a customer, trigger a workflow, or expose data. The more autonomy a system receives, the more important observability, permissions, approval gates, and rollback become.
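To make approval gates, logging, and rollback concrete, here is a minimal sketch of a wrapper around agent actions. It is a toy under stated assumptions, not any vendor's API: the `GatedAgent` class, its methods, and the dictionary standing in for a system of record are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GatedAgent:
    """Hypothetical wrapper: every action is approved, logged, and undoable."""
    approve: Callable[[str], bool]            # human or policy decision
    log: list = field(default_factory=list)   # audit trail of what happened
    _undo_stack: list = field(default_factory=list)

    def act(self, description, do, undo):
        """Run `do` only if approved; remember `undo` so it can be reversed."""
        if not self.approve(description):
            self.log.append(("denied", description))
            return False
        do()
        self._undo_stack.append((description, undo))
        self.log.append(("done", description))
        return True

    def rollback(self):
        """Reverse approved actions, most recent first."""
        while self._undo_stack:
            description, undo = self._undo_stack.pop()
            undo()
            self.log.append(("rolled_back", description))


# Illustration: a dict stands in for a system of record, and the approval
# policy (made up here) blocks anything that would notify a customer.
record = {"status": "open"}
agent = GatedAgent(approve=lambda d: "notify customer" not in d)

agent.act("close ticket",
          lambda: record.update(status="closed"),
          lambda: record.update(status="open"))
agent.act("notify customer",
          lambda: record.update(notified=True),
          lambda: record.pop("notified", None))
status_after_actions = record["status"]

agent.rollback()
print(record)
print(agent.log)
```

The design choice worth noting is that an action cannot be registered without its undo, so rollback is part of the contract rather than an afterthought. Actions that cannot be undone, such as a message already sent to a customer, are exactly the ones that belong behind the approval gate.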
This is where the “AI will replace busywork” narrative needs revision. AI may replace some manual steps, but it will also create supervisory work. The question is whether that supervisory work is designed, supported, measured, and compensated — or whether it is pushed onto employees as an invisible expectation.
The best agentic systems will not ask users to babysit every move. They will make clear what they did, why they did it, what data they used, and where uncertainty remains. They will escalate appropriately. They will fail safely. They will reduce the user’s mental load rather than converting the user into a nervous air-traffic controller for software robots.

CIOs Need to Treat Botsitting as a Design Defect​

The practical lesson for CIOs is blunt: if employees are spending nearly a day a week managing AI, the organization has not automated work so much as redistributed it. That may still be worthwhile in some cases, but it must be understood as a design choice. Hidden labor is still labor.
A mature AI program should begin with workflow inventory. Which tasks are AI-assisted today? Which ones produce accepted results faster? Which ones create rework? Which teams are acting as informal reviewers? Which data gaps force users to supply context manually? Which tools overlap? Which outputs are too risky to trust without expert review?
That analysis will often reveal that the right answer is not another model. It may be better retrieval, cleaner permissions, fewer redundant tools, stronger templates, clearer usage rules, better training, or a narrower deployment. In enterprise IT, boring improvements frequently beat glamorous procurement.
The other necessary move is managerial honesty. If AI supervision is part of the job, say so. Train for it. Measure it. Reward it. Do not call a worker “resistant” because they are doing the quality-control work that keeps the company out of trouble.

The Windows Desktop Is Becoming the AI Control Room​

For Windows users, this shift will be felt most directly at the desktop. AI is moving into operating systems, browsers, office suites, development environments, security consoles, and collaboration tools. The old desktop was a place where users operated applications. The new desktop is becoming a place where users supervise software that operates other software.
That changes what usability means. A good interface is no longer just one that exposes features cleanly. It must expose agency cleanly. Users need to see what the AI knows, what it can do, what it is about to do, and what it has already done. Administrators need controls that are granular enough to prevent oversharing but simple enough to maintain.
This also changes endpoint management. AI features will arrive through monthly app updates, cloud-side changes, browser integrations, and licensing toggles. Organizations that once worried about standardizing Office macros will now worry about standardizing AI behavior. Policy drift could become a major source of confusion: one worker’s assistant can access a repository, another’s cannot, and a third is using an unsanctioned external tool because it gives better answers.
Windows has always been the messy middle of enterprise work. That makes it the natural proving ground for AI’s real impact. If AI can reduce context switching, respect enterprise controls, and make work easier on managed Windows endpoints, the productivity story strengthens. If it adds another pane, another prompt, and another review burden, users will notice.

The Useful Lesson Hidden in the Busywork​

The botsitting debate should not be read as proof that AI is a bust. It is proof that AI is entering the hard part of enterprise adoption. The easy part was showing that models can generate useful artifacts. The hard part is embedding them into accountable systems where quality, security, compliance, and morale matter.
That distinction is important because it separates vendor promise from operational reality. Vendors are incentivized to highlight capability. Enterprises must manage consequence. A model that can do 80 percent of a task is impressive in a demo; in production, the remaining 20 percent may contain nearly all the risk.
Organizations that understand this will stop treating human review as a temporary flaw that disappears with the next model. Some review burden will shrink as models improve and integrations mature. But judgment, accountability, and institutional context will remain central. The future of AI work is not human-free. It is human-shaped, whether executives admit it or not.
The goal should be to make that human role better. AI should give workers leverage, not a second shift of invisible supervision. That requires disciplined architecture, clear governance, realistic metrics, and respect for the people doing the cleanup.

The Office AI Bill Is Coming Due​

The concrete message for Windows shops, CIOs, and department leaders is not that AI should be paused. It is that AI’s hidden labor must be brought onto the balance sheet before the next wave of agents expands it.
  • Organizations should measure AI work from request to accepted output, not merely count tool usage or generated drafts.
  • IT leaders should treat disconnected AI tools as a workflow problem, because employees become the integration layer when systems do not talk to each other.
  • Managers should recognize AI review and correction as real labor, especially when it falls on senior employees with scarce institutional knowledge.
  • Security teams should be involved before AI tools spread across departments, because cleanup after data exposure or unsafe automation is far costlier than early governance.
  • Microsoft 365 and Windows AI deployments should be judged by whether they reduce context switching and rework, not by whether they add another visible assistant to the desktop.
  • Agentic AI should earn autonomy gradually, with logging, permissions, approval gates, and rollback mechanisms treated as core features rather than enterprise add-ons.
The next year of workplace AI will be less about astonishment and more about accounting. The winners will not be the organizations that generate the most AI-assisted output, but the ones that redesign work so the machines remove burdens instead of disguising them. If the industry gets that right, AI may still deliver the productivity gains promised in the slide decks; if it gets it wrong, the office of the future will be staffed by tired humans quietly cleaning up after software that was supposed to set them free.

References​

  1. Primary source: Chicago Tribune
    Published: Mon, 15 Jun 2026 19:22:50 GMT
  2. Independent coverage: The National CIO Review
    Published: Mon, 15 Jun 2026 14:43:09 GMT
 
