Corrections NZ Tightens AI Use After Copilot Chat Misuse and Privacy Review

Corrections has quietly moved from piloting generative tools to policing them: after a small number of staff were found to have used Microsoft Copilot Chat to help draft formal casework — including Extended Supervision Order reports — the department has labelled that behaviour “unacceptable,” launched a privacy risk assessment, and reiterated strict boundaries around what staff may and may not put into AI chat interfaces.

Background

Corrections’ announcement is the latest example of a public-sector organisation confronting the messy gap between enthusiasm for productivity-enhancing AI and the hard legal, ethical, and operational risks that follow when those tools are used around sensitive personal data. The Department of Corrections in New Zealand says its authorised AI footprint is deliberately narrow: staff may access a standalone Copilot Chat feature that sits under the organisation’s Microsoft 365 licence, while other third‑party AI apps are blocked on the Corrections network. That restriction is intended to keep AI interactions inside an enterprise-controlled environment with established privacy and security controls.
At the same time, the department’s guidance is explicit: personal information — names, identifiers, health or medical information, and details relating to people under Corrections’ management — must not be entered into Copilot Chat, and the tool must not be used to draft, structure, analyse, or generate content for reports or assessments that contain personal information. Corrections also reports that since Copilot was introduced on managed devices in November 2025, roughly 30 percent of staff have engaged with the tool. A privacy risk assessment has already been completed where misuse was identified, and the department says auditing of prompts is possible because prompts and responses are searchable and exportable for review.

What happened — a concise account​

  • Corrections discovered a small number of incidents in which staff used Copilot Chat in ways that contravened the AI policy — specifically, assisting with the drafting of formal reports that contain personal information.
  • The department restricted access to Copilot so that only the free Copilot Chat feature available via Microsoft 365 is permitted on managed devices; other public AI applications are blocked on the Corrections network.
  • The agency completed a privacy risk assessment in response to the incidents and has reminded staff that misuse is “unacceptable”; it also stated that it has an AI assurance function within cybersecurity and participates in the All‑of‑Government community of practice on AI.
These are short, factual points. The deeper story is not only the immediate breach of policy but how Corrections’ approach exposes the practical trade-offs all governments face when allowing modern AI tools into day‑to‑day public service workflows.

Why this matters: legal, safety and ethical stakes​

The Privacy Act is the baseline​

Under New Zealand’s Privacy Act, the collection, use and disclosure of personal information — including through AI tools — remain regulated activities. Agencies are responsible for ensuring that their use of technology complies with privacy principles: that personal information is collected lawfully, is securely held, is only used for authorized purposes, and that individuals can exercise their rights to access and correct information about themselves. The Office of the Privacy Commissioner has been explicit: the Privacy Act applies to AI-driven uses and agencies must understand the technologies they deploy and ensure that usages meet privacy requirements.
Putting case-sensitive corrections material — criminal-history details, mental‑health notes, rehabilitation progress and supervision conditions — into a generative AI chat is not a theoretical risk. Even if prompts and responses are kept within a Microsoft tenant, mistakes in prompts, misconfiguration, or unauthorized copying of outputs can create real and lasting privacy harms. The legal and reputational consequences for agencies that fail to protect this data are significant.

Accuracy, explainability and downstream effects​

Generative AI is not a neutral drafting assistant; it can hallucinate, omit salient context, or reframe narratives in ways that alter meaning. For frontline probation officers and community corrections staff, a report that misstates a risk factor or misattributes a behavioural sign can materially affect an individual’s liberty and supervision conditions.
  • AI outputs require human oversight. Corrections’ policy that Copilot should be used only for non‑sensitive, assistive tasks is aligned with best practice: where outputs affect decisions about people, a trained human must review, correct and take responsibility for the final document.
  • Because corrections reports are often used in courts and to inform ministerial or judicial decisions, their provenance and accuracy must be defensible. Accepting AI-drafted paragraphs without robust verification is a pathway to error and liability.

Security and data residency questions​

Microsoft emphasises enterprise data protection for Copilot Chat: prompts and responses for enterprise users are logged subject to tenant policies and commitments, and are not used by Microsoft to train its foundation models under enterprise terms. However, that model still requires trust in cloud controls, the tenant configuration, and the surrounding operational practices. Agencies must consider where LLM calls are actually processed, what metadata is created, and how logs are stored and protected. Relying on a vendor’s enterprise safeguards reduces some risks but does not eliminate them.

Corrections’ controls: what they’ve done and where the gaps remain​

Controls in place​

  • Network-level blocking of public AI apps outside the approved Microsoft Copilot Chat deployment.
  • A written AI policy aligned to Government Chief Digital Officer guidance, explicitly forbidding entry of personal information into Copilot Chat and the drafting of personal-data reports.
  • Auditability: prompts and outputs are searchable and exportable for review, enabling retrospective investigations where misuse is suspected.
  • An organisational AI governance structure: an AI assurance officer sits inside the directorate for cybersecurity and an AI working group provides governance and guidance.

Where practical problems remain​

  • Policy adherence vs. frontline realities: staff under time pressure or with limited digital literacy may still be tempted to paste sensitive material into a chat for drafting efficiency. Policies are only as effective as training, enforcement, and the usability of approved tools and processes.
  • Culture and incentives: if staff perceive AI as a faster route to completing paperwork and risk little accountability, policy alone won’t stop careless behaviours. Corrections’ statement that it will audit and has issued reminders is necessary but not sufficient to change day‑to‑day practice.
  • Technical nuance: the difference between Copilot Chat as a free feature and the full Microsoft 365 Copilot enterprise product has operational implications. If staff sign into Copilot Chat with their enterprise credentials, enterprise data protections can apply — but that depends on correct tenant configuration and the particular Copilot feature set in use. Misunderstanding these differences can create a false sense of security.

The political and public-administration angle​

Public servants working with highly sensitive citizen information are held to high standards of accountability. When an agency like Corrections (which manages people under supervision and those in custody) reports misuse of AI, it raises questions about procurement, training, and governance across the wider public sector.
New Zealand’s Government Chief Digital Officer (GCDO) has been leading efforts to create an All‑of‑Government approach to AI, including communities of practice and public service guidance on GenAI. That central leadership is important: consistency across agencies reduces the risk that one agency’s mistake becomes a systemic problem. Corrections’ participation in the All‑of‑Government community of practice is therefore an important compliance signal — but participation does not obviate the need for local competence and enforcement.

What Corrections and similar agencies should do — operational recommendations​

Below are practical actions that public agencies handling sensitive personal information should adopt immediately. These are ranked in order of priority.
  • Reinforce “no personal data” rules with mandatory, scenario-based training for all staff who might interact with AI tools.
  • Implement technical controls at the tenant and endpoint level to prevent copy‑paste of classified fields into chatboxes and to warn or block when certain data classes are detected.
  • Require that any AI‑assisted draft of operational casework include an explicit, auditable human review checklist before submission, with name, role and timestamp captured.
  • Enforce logging and periodic audits of AI prompts and responses, with automated red-flags for prompts that include personally identifiable information (PII); a minimal example of such a scan is sketched below.
  • Use data‑classification policies to create whitelists and blacklists: identify what information is absolutely off‑limits for generative models and ensure those controls are enforced by tooling.
  • Consider a staged deployment strategy: start with low‑risk pilots, evaluate outcomes, and expand only when compliance and safety metrics are consistently met.
  • Maintain a clear breach response playbook that includes timely notification to privacy authorities, affected individuals, and transparent internal reporting.
All of these steps are consistent with the Privacy Commissioner’s expectation that agencies carry out privacy impact assessments, get senior leadership approvals for AI deployments, and maintain human oversight where outputs affect individuals.
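To make the automated red-flag auditing item above concrete, the sketch below shows one simple way such a scan over exported prompts could work. It is illustrative only, written under stated assumptions: the regular expressions (including the simplified NHI-style identifier pattern), category names and field names are invented for the example, not a vetted DLP rule set or any Microsoft Purview feature.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only. The NHI-style pattern is a simplified assumption
# (three letters followed by four digits), not an authoritative definition; a
# production rule set would come from a vetted data-classification catalogue.
PII_PATTERNS = {
    "nhi_like_identifier": re.compile(r"\b[A-Z]{3}\d{4}\b"),
    "date_of_birth": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "health_keyword": re.compile(r"\b(?:diagnosis|medication|psychiatric|mental health)\b", re.IGNORECASE),
}

@dataclass
class RedFlag:
    prompt_id: str
    category: str
    matched_text: str

def scan_prompt(prompt_id: str, prompt_text: str) -> list[RedFlag]:
    """Return one red flag per pattern match found in an exported prompt."""
    flags: list[RedFlag] = []
    for category, pattern in PII_PATTERNS.items():
        for match in pattern.findall(prompt_text):
            flags.append(RedFlag(prompt_id, category, match))
    return flags

if __name__ == "__main__":
    sample = "Draft an ESO report for J. Doe, NHI ABC1234, DOB 03/07/1985."
    for flag in scan_prompt("prompt-0001", sample):
        print(f"ALERT [{flag.category}] in {flag.prompt_id}: {flag.matched_text}")
```

In a real deployment this logic would normally live inside existing DLP tooling rather than custom code; the point is simply that red-flagging exported prompts is a straightforward, automatable check.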

Technical mitigations and choices​

  • Enterprise data protection (EDP) and tenant isolation: configure Copilot so prompts and responses are retained within the Microsoft 365 service boundary, and confirm contractual and technical guarantees that enterprise data is not used to improve vendor models. But note: EDP relies on correct configuration and active enforcement. Regular validation by internal auditors is essential.
  • Data loss prevention (DLP) integrated with Copilot: modern DLP tools can detect PII and prevent it from being pasted into chat UIs. This should be paired with contextual user warnings and mandatory supervisor approval for edge cases.
  • Role-based access controls: limit the features available to frontline staff. For instance, allow Copilot to suggest wording for generic administrative communications but disable file upload and document grounding where sensitive files could be referenced.
  • Private LLM instances for high-risk workflows: for truly sensitive workloads some agencies may choose dedicated, on‑premises or contractually segregated cloud LLMs that offer stronger contractual data residency and non‑training guarantees. This is more expensive but the right choice where the risk profile demands it.
  • Tamper-evident audit trails: prompts used in decision‑relevant documents should be captured in immutable logs, linked to the final human‑approved document, and retained under a clear retention policy.
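As a rough illustration of the tamper-evident idea in the last bullet, the sketch below hash-chains each audit record to its predecessor so that any later edit or deletion breaks verification. The record fields (prompt_id, document_ref, reviewer) are assumptions made for the example, not a schema used by Corrections or Microsoft.

```python
import hashlib
import json
from datetime import datetime, timezone

def _entry_hash(record: dict, previous_hash: str) -> str:
    """Commit the record and the previous hash into a single digest."""
    payload = json.dumps(record, sort_keys=True) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class AuditChain:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, prompt_id: str, document_ref: str, reviewer: str) -> dict:
        previous_hash = self.entries[-1]["hash"] if self.entries else "GENESIS"
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prompt_id": prompt_id,
            "document_ref": document_ref,
            "reviewer": reviewer,
        }
        record["hash"] = _entry_hash(record, previous_hash)
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every link; a single altered or missing entry fails the check."""
        previous_hash = "GENESIS"
        for record in self.entries:
            body = {k: v for k, v in record.items() if k != "hash"}
            if _entry_hash(body, previous_hash) != record["hash"]:
                return False
            previous_hash = record["hash"]
        return True

if __name__ == "__main__":
    chain = AuditChain()
    chain.append("prompt-0001", "ESO-report-draft-v2", "reviewer-042")
    print("chain intact:", chain.verify())
```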

Training, culture and the human factor​

Technology controls alone won’t prevent misuse. Corrections’ emphasis on ongoing conversations with Community Corrections staff and regular reminders of its AI policy is the right cultural move — but more is required.
  • Training should be practical and role-based: short modules that show exactly what is and isn’t allowed, with examples drawn from real reporting tasks.
  • Supervisors must be capable of identifying AI artifacts in prose and challenging staff drafts when AI has been used. This requires training for managers as much as frontline users.
  • Reward structures should not unintentionally incentivise cutting corners. If performance metrics prioritise throughput without safeguarding quality, staff will seek shortcuts.
The Privacy Commissioner has urged agencies to engage staff, be transparent about AI use, and ensure there is human review prior to acting on AI outputs — guidance that aligns perfectly with what Corrections says it is trying to do.

Accountability: audits, sanctions and reporting​

Corrections makes two consequential points: prompts and responses are auditable, and misuse is treated “extremely seriously.” That creates a path for accountability, but public agencies must create a balanced enforcement regime that focuses on remediation and learning rather than only punishment.
  • For inadvertent or low‑harm incidents: mandated refresher training, documented remediation plans, and supervised rework of affected reports.
  • For willful or repeated breaches: formal disciplinary processes, escalation internally and — where required by privacy law — notification to the Office of the Privacy Commissioner and affected individuals.
  • For systemic failures (e.g., misconfigured tenant, inadequate DLP, no human review): commissioning an independent review, public reporting of findings, and concrete timelines for remediation.
Corrections has said that as of the most recent update no notifications had been made to the Privacy Commissioner, but that the department’s privacy team was working with relevant work groups to provide further guidance in Community Corrections. Transparent, timely reporting of outcomes from the privacy risk assessment will help rebuild public trust.

Broader lessons for government: what other agencies should take from this​

  • Central guidance is necessary but not sufficient. The GCDO’s All‑of‑Government work and public‑service AI guidance are essential backstops; agencies must operationalise those principles locally.
  • Vendor guarantees matter — but so does independent verification. Microsoft’s enterprise promises for Copilot’s data protections reduce risk, but agencies must validate configuration and monitor telemetry.
  • Low‑risk pilots will surface cultural and training gaps. Use them to fix process and governance before rolling tools into higher‑risk casework.
  • Don’t treat AI as a productivity plug‑in only: treat it as an organisational change that touches policy, procurement, audit, legal, privacy, HR and frontline operations.

Risks to watch beyond privacy and accuracy​

  • Re-identification risks from “de‑identified” text: even partial personal details can be combined to re‑identify individuals, particularly in small communities or where cases are high‑profile.
  • Differential impact and bias: AI outputs may contain subtle cultural, gender or ethnic biases that can skew assessment language or the framing of risk.
  • Chaining and provenance: if an AI‑assisted report is used to justify further automated decisions, errors compound and become harder to reverse.
  • Vendor dependency and supply‑chain risk: heavy reliance on a single cloud vendor for both productivity and AI increases systemic exposure to outages, policy changes, or contractual disputes.
These are not academic worries; they are plausible, foreseeable harms that should be actively mitigated.

Conclusion​

Corrections’ decision to call staff use of AI outside approved parameters “unacceptable,” and to conduct a privacy risk assessment, is the right immediate response to identified misuse. The agency’s mix of technical controls, policy clarity, auditability and engagement with All‑of‑Government AI governance gives it a structured path to safer adoption.
But the incident also illustrates an uncomfortable truth for public administration: introducing advanced AI into high‑stakes workflows exposes latent gaps in training, culture, technical configuration and governance. Enterprise features such as Microsoft’s Copilot Chat and EDP provide important protections, yet they are not substitutes for rigorous human oversight, role‑based limits, and ongoing audit and training regimes. Agencies that treat AI as a simple productivity upgrade rather than an organisational change risk harming the very people they are mandated to protect.
For Corrections and other agencies working with sensitive personal information, the calculus must remain conservative: use AI to reduce mundane administrative friction, but never to shortcut the human judgement, accountability, and legal safeguards that underpin public trust.

Source: Otago Daily Times Corrections labels staff's AI use as 'unacceptable'
 

Corrections in New Zealand has labelled staff use of generative AI to draft formal casework “unacceptable,” after an internal review found a small number of probation and community‑corrections staff had used Microsoft Copilot Chat to assist with reports that contain personal and health information.

Background

The Department of Corrections introduced Microsoft Copilot on managed devices in November 2025 as a narrow, controlled way to give staff access to an enterprise AI assistant. Corrections says only the standalone Copilot Chat feature available through its Microsoft 365 tenant is permitted on the Corrections network; other public AI services are blocked. The department’s AI policy — explicitly aligned to guidance from the Government Chief Digital Officer — prohibits entering personal identifying information, health details, or other case-sensitive material into Copilot Chat, and forbids using Copilot to draft or generate content for reports and assessments that contain personal information. Where breaches were found, the department conducted a privacy risk assessment and reiterated that misuse is “unacceptable.”
This incident is a compact case study in the wider tension public‑sector organisations face: how to harness productivity tools that promise time savings while managing legal, privacy, evidentiary and safety risks when those tools touch highly sensitive human‑services data.

What the department says happened​

  • Corrections restricted AI access on its network to Microsoft Copilot Chat and blocked consumer AI web apps to keep AI interactions inside a controllable enterprise boundary.
  • About 30% of staff had engaged with Copilot since rollout on Corrections devices in November 2025; uptake remains “relatively low,” according to Corrections leadership.
  • The policy makes clear: personal information (names, identifiers, medical and health details, or details relating to people in Corrections’ management) must not be entered into Copilot Chat; Copilot must not be used to draft reports or assessments containing personal information.
  • Where staff contravened policy, Corrections ran a privacy risk assessment, audited prompts (which are searchable/exportable under enterprise logging) and took internal action while reminding staff and managers of the boundaries.
Those are the factual claims Corrections has made publicly; they shape how we evaluate the department’s approach and the residual risks.

The technical reality: what Copilot Chat does — and what it doesn’t​

Understanding the technology is essential before judging policy compliance or the severity of the privacy exposure. Microsoft’s enterprise Copilot offerings include built‑in controls that make managed use safer than consumer chatbots — but they are not a panacea.
  • Enterprise Data Protection and tenant boundaries. Microsoft implements Enterprise Data Protection (EDP) and tenant isolation for Copilot Chat used under a Microsoft 365 organisational account. These protections mean prompts and responses can be logged, retained, and subjected to eDiscovery and auditing inside the customer’s tenant boundary. Microsoft states that prompts and responses from commercial tenants are not used to train its foundational models.
  • Prompt and response logging. Copilot Chat interactions under an organisational account are recorded as metadata and may be retained in tenant systems for compliance and auditing. That logging is a double‑edged sword: it produces an audit trail for incident detection and remediation, but also creates another repository of sensitive content if staff violate policy and include personal information.
  • No absolute guarantee against disclosure. Microsoft’s contractual and technical commitments reduce the risk of provider‑side reuse or model training, but they do not eliminate other disclosure pathways: staff can copy outputs into documents, upload stored logs, or inadvertently reveal sensitive facts. Enterprise protections also depend on correct configuration and secure operational practices.
  • Hallucination and factual risk. Large language models can and do generate plausible‑sounding but incorrect information. When caseworkers rely on AI‑generated phrasing, summaries, or synthesis without robust verification, the resulting report can contain factual errors with legal or safety ramifications. The technology’s current failure mode is not intentional malice but confident misinformation — a particularly bad fit for forensic, clinical, or risk‑assessment writing.
In short: Copilot Chat, when configured for enterprise use, provides meaningful technical controls that consumer tools lack. But those protections are conditional: they require disciplined human behaviour, correct tenant configuration, and organisational governance to be effective. Microsoft’s documentation and product pages confirm the core technical protections Corrections is relying on, while also making clear these tools are meant to complement — not replace — human review.

Legal, privacy and regulatory framework​

The Office of the Privacy Commissioner (OPC) has long been clear: the Privacy Act applies regardless of the tools used, and agencies must satisfy the law when they adopt generative AI. Public organisations are expected to:
  • Obtain senior leadership approval and assess whether a generative AI tool is necessary and proportionate.
  • Conduct a Privacy Impact Assessment or Algorithmic Impact Assessment before deployment.
  • Be transparent with affected people when their personal information may be involved.
  • Ensure human review of AI outputs before action is taken.
  • Avoid inputting identifiable personal information into external AI tools unless explicit controls and privacy assurances are in place.
Corrections’ stated approach—narrow tenant‑scoped Copilot access, a formal AI policy, privacy risk assessment when breaches were discovered, and participation in all‑of‑government AI governance forums—maps onto OPC expectations. But the regulator has been clear that compliance depends on practice, not just policy: where staff breach policies by inputting personal data, agencies must remediate and, where appropriate, notify affected individuals. The OPC further notes that the Privacy Act grants individuals the right to access and correct personal information, a right that must be preserved even when information has flowed through AI tools.

Where the approach worked — and where it did not​

Corrections made several positive choices that align with best practice:
  • Narrow, approved footprint. Permitting only the tenant‑bound Copilot Chat and blocking unvetted consumer services limits the attack surface and centralises control. This is the recommended pattern in many public‑sector AI playbooks.
  • Explicit policy language. Corrections’ policy explicitly prohibited entering personal, identifying, or health information into Copilot Chat and forbade using Copilot to draft reports that contain personal data — a clear, enforceable line.
  • Audit and response. The department conducted a privacy risk assessment, audited prompts, and reminded staff; auditing capabilities exist because Copilot Chat under EDP logs interactions. Those are essential first steps after discovery.
But the incident also exposed gaps:
  • Policy vs practice gap. The fact that staff used Copilot to assist with Extended Supervision Order reports shows the behavioural gap between written policy and day‑to‑day practice. Policies that are not embedded through training, supervision, and continuous monitoring become paper safeguards only.
  • Training and supervision shortfall. Corrections acknowledged ongoing work to make AI use “an ongoing conversation,” which implicitly admits training, supervision, and change‑management need strengthening. Short, high‑pressure workflows in community corrections create incentives to seek shortcuts; without practical alternatives or streamlined workflows, staff may default to using tools they can access.
  • Insufficient technical containment for document drafting. Corrections allowed the free Copilot Chat feature that does not integrate with system data. That reduces the risk of Copilot accessing internal case files, but it does not prevent staff from pasting sensitive text into the chat window or uploading attachments to the chat. Adult learning and operational controls must account for that behaviour.

The real harms at stake​

When AI is misused in frontline casework, harms fall into multiple, compounding categories:
  • Privacy breach and re‑identification risk. Personal and medical details entered into chat windows become part of recorded logs — even if retained inside the tenant — and may be accessible during incident review. That increases the chance of improper exposure or future misuse.
  • Evidentiary weakness and legal risk. AI‑generated text that contains inaccuracies, invented details, or improperly attributed claims can weaken reports used in parole decisions, supervision orders, or risk assessments. Courts, tribunals, or oversight bodies could challenge the reliability of evidence if AI involvement is discovered.
  • Erosion of professional integrity. Reliance on AI to draft evaluative material risks diluting practitioner judgement and professional accountability — especially in sensitive areas like risk assessments and clinical summaries. Corrections’ statement emphasises maintaining "professional integrity" for this reason.
  • Regulatory exposure and reputational damage. If the OPC or another oversight body finds systemic failures, the organisation could face mandated remedial actions, public censure, and loss of trust among people they supervise and the wider public.

Practical recommendations for Corrections and similar agencies​

The problem exposed by this episode is solvable — but it requires organisational muscle: policy, technology, training, and oversight working in lockstep.
  • Operationalise the policy with role‑based restrictions.
  • Turn a high‑level prohibition into specific, enforceable rules tied to roles and systems (for example, block paste/upload to Copilot Chat for accounts used by community‑corrections caseworkers).
  • Use conditional access and device management to enforce who can sign into Copilot Chat and from which devices.
  • Technical hardening: stop the paste button.
  • Use endpoint controls, browser policies or DLP rules to prevent pasting or uploading of Protected Health Information (PHI) and personally identifiable information (PII) into web chat interfaces.
  • Enable Purview or equivalent sensitivity labelling to detect and block sensitive material in prompts.
  • Embed continuous auditing and alerting.
  • Keep prompt and response logs, pair them with automated scanning for sensitive patterns (NHI numbers, dates of birth, health keywords), and generate alerts that trigger supervisory review and retraining.
  • Invest in training that reflects real work.
  • Short, scenario‑based training for Community Corrections staff that shows what is and is not permissible, with examples drawn from actual report tasks. Training should include what to do instead — approved templates and time‑saving macros that do not expose PII.
  • Human‑in‑the‑loop verification as policy.
  • Require explicit checklist verification steps when any AI‑assisted drafting occurs: date, source checks, confirm attribution, confirm no personal identifiers were included, and sign‑off by a qualified manager (a possible record structure is sketched below).
  • Transparent incident response and notification plans.
  • Define thresholds for OPC notification and staff notification; run tabletop exercises for a small-breach scenario to build muscle memory.
  • Governance and independent review.
  • Maintain an AI assurance function (as Corrections has), but also commission periodic independent audits and publish an anonymised summary of lessons learned to preserve public trust.
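To show one way the human-in-the-loop verification step above could be captured as an auditable artefact, here is a minimal sketch of a sign-off record. The checklist fields mirror the bullet's wording; the class name, field names and example values are illustrative assumptions, not an existing Corrections form or system.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class AIDraftSignOff:
    """Checklist captured whenever AI assistance touched an official draft."""
    document_ref: str
    reviewer_name: str
    reviewer_role: str
    sources_checked: bool
    attribution_confirmed: bool
    no_personal_identifiers_in_prompts: bool
    manager_sign_off: str
    reviewed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def is_complete(self) -> bool:
        """A draft should not progress unless every check passed and a manager signed."""
        return (
            self.sources_checked
            and self.attribution_confirmed
            and self.no_personal_identifiers_in_prompts
            and bool(self.manager_sign_off)
        )

if __name__ == "__main__":
    record = AIDraftSignOff(
        document_ref="report-2026-0117",
        reviewer_name="A. Practitioner",
        reviewer_role="Probation Officer",
        sources_checked=True,
        attribution_confirmed=True,
        no_personal_identifiers_in_prompts=True,
        manager_sign_off="B. Manager",
    )
    print(json.dumps(asdict(record), indent=2))
    print("complete:", record.is_complete())
```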

What other organisations do — useful comparators​

Universities and large enterprises that have deployed Microsoft Copilot offer instructive lessons:
  • Many universities permit Copilot Chat under enterprise sign‑in only and emphasise that Copilot Chat does not access Microsoft 365 content unless the higher‑privilege Copilot experience is licensed and enabled; they also warn users against entering sensitive data. These institutions combine policy, visual UI cues (a green shield indicating Enterprise Data Protection is active), and training to reduce misuse. (its.uiowa.edu)
  • Microsoft’s public guidance and community posts stress that prompts and responses for enterprise accounts are not used to train foundational models and point to Purview, DLP, and tenant isolation as core technical mitigations — but they also repeatedly stress human governance and configuration as necessary. Those vendor assurances are important but must be treated as part of a broader risk management program, not a substitute for it.

How to assess severity and decide next steps​

When an agency finds staff have used AI contrary to policy, decision‑making should be guided by a small set of concrete facts and proportionality.
  • Was personal or health information included in prompts or uploaded to the chat? If yes, this is a privacy incident requiring at least internal remediation and likely OPC notification, depending on scale and sensitivity.
  • Did AI output meaningfully change the substance of the report (for example, adding new assertions or altered judgments) or merely assist with neutral tasks (grammar, plain‑English rephrasing)? Substantive changes require more urgent, formal review.
  • Were mitigating technical controls in place (EDP enabled, tenant‑scoped access, prompt logging)? If so, auditing may be able to show scope and recipients of exposure; if not, risk is higher.
  • Did the incident reveal a pattern of risky behaviour or a small number of isolated mistakes? Patterns demand stronger organisational remediation and possibly external reporting. Corrections has described the problem here as a small number of incidents and has run a privacy risk assessment, but independent oversight or periodic external audits would increase confidence.
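The four questions above lend themselves to a simple, repeatable triage step. The sketch below encodes them as booleans and maps each answer to a proportionate action; the actions are a plain-language summary of the surrounding text, assumed for illustration, not an official escalation matrix.

```python
from dataclasses import dataclass

@dataclass
class IncidentFacts:
    """Answers to the four triage questions; field names are illustrative."""
    personal_or_health_info_in_prompt: bool
    substantive_change_to_report: bool
    enterprise_controls_in_place: bool  # e.g. EDP, tenant scoping, prompt logging
    pattern_of_behaviour: bool

def triage(facts: IncidentFacts) -> list[str]:
    """Map the answers to proportionate next steps."""
    actions = ["document the incident and brief the privacy team"]
    if facts.personal_or_health_info_in_prompt:
        actions.append("treat as a privacy incident; assess whether notification thresholds are met")
    if facts.substantive_change_to_report:
        actions.append("formally review and rework the affected report before further use")
    if not facts.enterprise_controls_in_place:
        actions.append("escalate: the scope of exposure cannot be reliably audited")
    if facts.pattern_of_behaviour:
        actions.append("organisational remediation: retraining, supervision changes, consider external review")
    return actions

if __name__ == "__main__":
    facts = IncidentFacts(True, False, True, False)
    for step in triage(facts):
        print("-", step)
```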

Conclusions — practical, not punitive​

Corrections’ response — explicitly calling the misuse “unacceptable,” running a privacy risk assessment, auditing prompts, and reminding staff about the policy — is appropriate as an immediate containment step. Those actions reflect a sensible mix of technical controls and governance.
But containment is the start, not the end. The real test is whether Corrections converts this wake‑up call into measurable change: stronger technical enforcement of no‑paste/data‑block rules, practical training for staff facing real‑world time pressures, routine auditing with automated detection of PII in prompts, and a clear public‑sector governance trail that includes independent review. The Office of the Privacy Commissioner’s expectations remain firm: the Privacy Act applies, and agencies must do the work required to prove their use of AI is necessary, proportionate, and safe.
For other public‑sector organisations watching this episode, the lesson is immediate: narrowing your AI footprint to a single vendor and buying enterprise protections is the right starting point — but it is not sufficient. True safety requires aligned policy, enforced technical constraints, continuous staff training, and transparent oversight. Do that, and organisations can safely gain the productivity benefits of AI while protecting the people they serve; fail to do that, and even a small number of well‑intentioned mistakes can produce outsized harm.

Corrections’ episode is a reminder that AI governance is an operational problem as much as a technical one: tools like Copilot can be made acceptably safe, but only if the organisation does the unglamorous work of change management, configuration hardening, auditing, and continual staff support.

Source: Otago Daily Times Corrections labels staff's AI use as 'unacceptable'
 

The Department of Corrections in New Zealand has publicly reprimanded a small number of staff for using Microsoft Copilot Chat to assist with formal casework — including Extended Supervision Order reports — calling the behaviour “unacceptable,” launching a privacy risk assessment, and tightening controls to keep AI use strictly within a narrow, enterprise-managed footprint. (https://www.rnz.co.nz/news/national/586932/corrections-takes-action-against-staff-s-unacceptable-use-of-artificial-intelligence)

Background

Corrections introduced Microsoft Copilot Chat on managed devices in November 2025 as an assistive capability intended to improve non‑sensitive productivity tasks. The department’s policy explicitly forbids entering personal information — names, identifiers, health or medical details, or case-management specifics relating to people under Corrections’ management — into Copilot Chat, and it prohibits using the tool to draft or generate content for reports or assessments that contain personal information. Corrections reports that roughly 30 percent of staff engaged with the tool since rollout, and that other public AI applications are blocked on the Corrections network to keep interactions inside enterprise controls.
These immediate facts frame a common public-sector conundrum: organisations want the productivity gains of generative AI while needing to contain legal, privacy and safety risks when those tools encounter sensitive case data. The Corrections move — swift, public, and prescriptive — is an early example of how agencies test the boundaries of permissive, governed AI adoption and then respond when staff behaviour diverges from policy.

What actually happened​

Corrections discovered a “small number” of incidents where staff used Copilot Chat to help draft formal casework documents, including reports connected to Extended Supervision Orders. In response, the agency performed a privacy risk assessment, flagged the incidents internally, reiterated prohibitions to staff, and emphasised that prompts and responses are auditable and exportable for review. The department also has an AI assurance function (a responsibility held by the director of cybersecurity), participates in the All‑of‑Government community of practice on AI, and has created an AI working group to govern adoption.
Corrections has not (as of its public statement) made a formal notification to the Office of the Privacy Commissioner, but the OPC has been clear that privacy obligations — including the Information Privacy Principles under the Privacy Act — apply to uses of personal information in AI tools. That means the department’s response will be judged not only on internal controls but on whether it has identified, contained, and remedied any privacy risks to affected individuals.

Why this matters: legal, ethical and operational stakes​

There are three intertwined stakes when frontline public servants place case-sensitive material into a generative AI chat:
  • Privacy and legal compliance. Under the Privacy Act and OPC guidance, agencies must ensure that the collection, use and disclosure of personal information — including processing through AI — is lawful and limited to authorized purposes. Inputs to external or even enterprise-managed generative AI tools constitute processing that triggers legal obligations.
  • Safety and fairness. Corrections handles people with complex histories, mental-health needs, and legal conditions. Mistakes, mischaracterisations, or exposure of sensitive details can harm those people and erode public trust in institutions that must show discretion and duty of care.
  • Evidentiary and operational integrity. Generative models can produce inaccurate content — so-called hallucinations — and, if unchecked, those errors have the potential to enter official documents. Where those documents are used in statutory or judicial contexts, the introduction of AI‑generated, unchecked content raises real evidentiary risk.
Put simply: productivity benefits matter, but in corrections contexts the cost of a single mistake can be high. That asymmetry explains why the department labelled the misuse “unacceptable” and why it framed Copilot as an "assistive tool" that must not touch sensitive content.

The technical reality: what Microsoft’s enterprise Copilot can — and cannot — promise​

Understanding the technology and its enterprise protections is essential to judging whether Corrections’ approach is proportionate.
  • Microsoft states that Microsoft 365 Copilot Chat processes prompts and responses inside the Microsoft 365 service boundary, offering enterprise data protection and logging. Prompts and responses are logged for auditing and eDiscovery, and tenant‑scoped controls can limit which organisational content Copilot can access. Microsoft explicitly documents that prompts and responses processed by Microsoft 365 Copilot Chat are not used to train the underlying foundation models.
  • Microsoft’s privacy FAQ reiterates that organisational users signed in with an enterprise identity are excluded from model training, that files uploaded to Copilot are stored under enterprise controls (with a stated retention window) and that organisations can configure whether conversations are used for model training. Those technical features are important safeguards — but they are not a substitute for governance: if staff type personal identifiers or case details into the chat, that content still flows through the Copilot service and will be subject to whatever enterprise protections the tenant configures.
  • Independent coverage and vendor guidance point to operational features organisations can use — Purview Data Loss Prevention (DLP), sensitivity labels, restricted search scopes, and audit export — to reduce accidental exposure and to evidence compliance after the fact. But those controls need correct configuration and continuous oversight to be effective.
Taken together, the platform-level protections reduce one category of risk (aggregation of tenant content into model training), but they do not eliminate the problems that arise when employees put third‑party personal or case-sensitive information into conversational prompts. That human‑in-the‑loop failure remains the primary vector for the incidents Corrections reported.

Cross‑checking the department’s claims​

Corrections’ public statements — about the scope of allowed use, the fact of a privacy risk assessment, and the 30 percent adoption figure — align with contemporaneous reporting in national media and with the department’s own summary messaging. RNZ reported the same timeline (Copilot introduced on managed devices in November 2025), the same uptake estimate, and described the internal privacy risk assessment and the department’s statement that staff who use Copilot may be audited.
To validate the technical claim that enterprise Copilot doesn’t use prompts to train foundational models, I cross‑checked Microsoft’s documentation (Microsoft Learn) and the Microsoft Support privacy FAQ; both make the exclusion explicit for enterprise-authenticated users and describe controls for organisations to manage training and data visibility. That confirms Corrections’ ability to keep prompts within a controllable enterprise domain — provided the organisation configures those controls and prevents use of unsanctioned AI applications on its network.
Where precise numbers or internal disciplinary outcomes are concerned — for example, the exact count of incidents, the identities of affected reports, or whether individuals’ privacy was concretely harmed — public reporting is limited. Corrections described the incidents as “small” and has not published a full incident report publicly. That absence of detailed public facts means readers should treat the scale and impact assessments with caution until further disclosure is available.

Analysis: strengths in Corrections’ approach — and where the gaps remain​

Corrections did a number of things well from an AI‑governance perspective:
  • Narrow, enterprise‑first posture. The department limited permitted AI access to the Copilot Chat feature available under the organisation’s existing Microsoft licence, and blocked consumer AI apps on its network — a sensible approach that reduces uncontrolled data egress.
  • Clear policy prohibitions. The policy explicitly forbids entry of personal or health data into the Copilot chat and forbids using Copilot to generate content for personally identifying reports — clarity that helps line managers enforce boundaries.
  • Auditability and assurance. By using a tenant-bound Copilot Chat with logging and exportable prompts, Corrections retains an audit trail — useful for both compliance and remediation. Microsoft’s own documentation supports these capabilities.
But significant gaps remain that public-sector organisations frequently under‑estimate:
  • Human error and culture. Policies do not enforce themselves. Where staff are under time pressure and dealing with repetitive reporting tasks, the temptation to use an “assistive” chat to speed drafting is real; this is a cultural and training challenge more than a purely technical one.
  • Technical configuration and DLP precision. Enabling Copilot under a tenant does not automatically prevent all leaks; organisations must configure Purview DLP, sensitivity labels, restricted search scopes, and endpoint controls to reduce the risk of accidental prompt content. Misconfiguration or incomplete policy coverage will leave gaps.
  • Detection vs. prevention. Audit logs and prompt export are valuable for after‑the‑fact review, but they do not stop the initial leak. Preventive controls (automated blocking of disallowed prompt patterns, inline redaction, or mandatory templates that scrub identifiers) are technically possible but operationally complex and rarely implemented at scale.
  • Evidentiary risks. Even if the prompt traffic is auditable, the use of model‑generated language in statutory reports risks introducing inaccuracies. Without rigorous human verification, AI-assisted content could inadvertently distort case facts. This remains an underappreciated risk in many AI adoption plans.

Practical recommendations for Corrections and similar public-sector agencies​

Public-sector bodies that handle sensitive personal information should treat this incident as a learning moment. The following checklist combines governance, technical, and human measures that reduce risk while preserving legitimate productivity gains:
  • Clarify and publish acceptable use cases. Define narrow, explicit categories of tasks where Copilot is permissible (e.g., generic administrative drafting) and explicitly list prohibited uses (case reports, clinical summaries, identifying information). Reiterate consequences for breaches.
  • Enforce with technical controls. Implement Purview DLP policies and sensitivity labels for case files, and restrict search scopes to prevent Copilot from ingesting sensitive repositories. Configure Copilot conversation visibility in line with tenant policies and disable model training options for organisational accounts.
  • Introduce preventive prompt filtering. Where possible, deploy inline prevention that blocks or warns when prompts contain patterns that match identifiers, health terms, or judicial references. This can be accomplished with DLP or custom middleware.
  • Mandatory human verification. Require that any AI‑assisted text used in official reports be reviewed and signed off by a trained staff member who confirms factual accuracy, sources, and sensitive redaction.
  • Training and regular refreshers. Deliver scenario-based training for staff, focusing on examples (safe/unsafe prompts), DLP behaviours, and the limitations of generative AI (hallucinations, biases).
  • Incident response and notification playbook. Maintain a clear process: identify affected records, assess harm, remediate, document actions, and notify individuals and regulators where thresholds are met.
  • Audit and metrics. Use audit logs to generate regular compliance dashboards — prompt volumes, blocked prompt attempts, and DLP incidents — and tie those metrics to leadership oversight (a simple aggregation is sketched below).
  • Community of practice and external review. Continue engagement with the All‑of‑Government AI community and seek external privacy and legal review for high‑risk uses.
These measures are incremental and operational. They recognise that organisations rarely achieve perfect prevention overnight, but they also map to practical, implementable steps that materially reduce the main vectors of risk.
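To illustrate the audit-and-metrics item in the checklist above, the sketch below rolls exported audit events up into the dashboard figures it names. The event names and record shape are assumptions made for the example; real inputs would follow whatever schema the tenant's audit export actually produces.

```python
from collections import Counter

# Assumed event names; a real export would use the tenant's own schema.
PROMPT_SUBMITTED = "prompt_submitted"
PROMPT_BLOCKED = "prompt_blocked"
DLP_INCIDENT = "dlp_incident"

def compliance_summary(records: list[dict]) -> dict:
    """Aggregate exported audit events into simple dashboard metrics."""
    events = Counter(r["event"] for r in records)
    teams_with_incidents = sorted(
        {r["team"] for r in records if r["event"] == DLP_INCIDENT}
    )
    return {
        "prompt_volume": events[PROMPT_SUBMITTED],
        "blocked_prompt_attempts": events[PROMPT_BLOCKED],
        "dlp_incidents": events[DLP_INCIDENT],
        "teams_with_dlp_incidents": teams_with_incidents,
    }

if __name__ == "__main__":
    sample = [
        {"event": PROMPT_SUBMITTED, "team": "Community Corrections"},
        {"event": PROMPT_BLOCKED, "team": "Community Corrections"},
        {"event": DLP_INCIDENT, "team": "Case Management"},
    ]
    print(compliance_summary(sample))
```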

What vendors offer — and what they don’t​

Vendors such as Microsoft provide enterprise-grade features intended to support safe adoption:
  • Tenant-scoped processing and audit logging. Copilot for Microsoft 365 processes within the tenant boundary and provides logs that can be exported for eDiscovery and auditing.
  • Controls over model training and personalization. Organisations can opt out of using conversation data for model training; enterprise accounts are excluded from training of Copilot’s base models under Microsoft’s public guidance.
  • Integration with Purview and DLP. Microsoft’s compliance stack can block or redact sensitive content from being returned in summaries or from being included in Copilot responses when configured with sensitivity labels and policies.
However, vendors cannot replace governance or human judgement. Product controls require correct configuration, maintenance, and integration into organisational processes. Vendor statements that tenant data are not used to train foundation models are important but do not mean prompts containing sensitive information are risk‑free. The risk that staff will place sensitive content into a chat remains a human and process problem rather than a purely technical one.

Broader implications for AI governance in public services​

The Corrections episode illuminates systemic themes that will recur across governments:
  • Policy lag vs. technology pace. Tools roll out faster than governance frameworks and staff training programs. Agencies must choose between disabling helpful tools or investing in rapid policy, tooling, and capability upgrades.
  • Trust and transparency. Public trust in corrections and justice systems depends on demonstrable safeguards. Transparent communication about what happened, what was affected, and how risks were mitigated matters — both legally and politically.
  • Harmonised guidance and standards. The Office of the Privacy Commissioner’s AI guidance provides an early national benchmark for privacy expectations. Governments should aim for harmonised, sector-specific standards for AI use in sensitive domains, including mandatory impact assessments and minimum technical controls.
  • Human‑centred AI adoption. The strongest defence against misuse is a combination of intuitive tooling (that prevents obvious mistakes), realistic process design (that doesn’t push staff toward risky shortcuts), and continual, scenario-driven training.

What we still don’t know — and why transparency matters​

Public statements leave several important factual gaps:
  • The exact number of incidents and the precise nature of the content entered into Copilot remain undisclosed publicly. Without those details it is difficult to evaluate the real‑world harm or the systemic risk.
  • Whether affected individuals were notified or whether records were amended remains unclear. RNZ reported that no notifications had been made to the Office of the Privacy Commissioner at the time of reporting, but that OPC expects agencies to take appropriate remedial action where policy breaches occur.
  • The long‑term governance changes Corrections will implement — beyond reiterating rules and performing audits — will determine whether this event becomes an isolated correction or a trigger for broader policy reform across the public sector.
Agencies that proactively disclose lessons learned (redacted where necessary), publish compliance metrics, and show sustained governance actions will better preserve public trust than those that treat incidents as internal, unreported operational matters.

Conclusion​

Corrections’ public rebuke of staff who used Microsoft Copilot Chat inappropriately is a cautionary but instructive episode for any organisation moving generative AI into sensitive workflows. The department’s decision to limit AI access to a tenant-bound Copilot feature, to block consumer AI apps, and to rely on audit logs and a privacy risk assessment are defensible steps — and they sit on top of documented vendor capabilities that reduce one class of risk by excluding enterprise prompts from model training.
Yet the core lesson is organisational: technology alone does not solve the human behaviours, time pressures, and operational incentives that lead well-intentioned staff to seek shortcuts. Robust DLP and Purview configurations, preventive prompt filtering, mandatory human verification, scenario-based training, and clear incident playbooks are the practical controls that translate policy into safer practice. Agencies that commit to those measures — and that pursue open, evidence‑based disclosure when things go wrong — will navigate the trade‑offs of generative AI more successfully than those that rely on vendor promises or broad policy statements alone.
Corrections’ incident should be read not as a final verdict on the value of AI in public services, but as a practical reminder: adopt fast, govern faster, and train constantly.

Source: hrmonline.co.nz 'Unacceptable': Department of Corrections slams misuse of AI at work
Source: hcamag.com 'Unacceptable': Department of Corrections slams misuse of AI at work
 
