When the conversation moves from keyboard to vocal cords, the difference is more than convenience — it changes how chatbots behave, how users engage with them, and how those systems reveal their strengths and weaknesses in real time. In a short hands‑on sweep, five major consumer chatbots — ChatGPT, Google Gemini Live, Microsoft Copilot, Meta AI, and xAI’s Grok — were tested in voice mode and produced markedly different experiences: all useful, but only a few genuinely felt like a back‑and‑forth conversation rather than a sequence of polished monologues. The PCMag Australia write‑up that sparked this re‑examination documented those hands‑on impressions and singled out Gemini Live as the most conversational in that session.
Background
Voice is the missing dimension that finally makes chatbots feel like companions rather than search boxes. Recent product rollouts from OpenAI, Google, Microsoft, Meta and xAI all pushed voice chat out of labs and into mobile apps, and the result is a mix of technologies: real‑time speech recognition, expressive text‑to‑speech, conversation management and in some cases on‑device wake words. Those features collate dozens of design choices — interruption handling, conversational prompting, transcript visibility, voice selection and privacy defaults — into one live experience. The PCMag tests capture the consumer angle: how these systems listen, respond and follow up in a real dialogue about a common human problem — carving out time for a creative side project.
In parallel, the legal and business context has hardened: publishers and content owners are increasingly litigating how models were trained, and that dispute changes the calculus for enterprises and publishers alike. In April 2025, Ziff Davis sued OpenAI alleging that the company used copyrighted content in its training data — a case covered by major outlets and still active.
Overview: what the testers did and why it matters
The test scenario
- Each AI was launched in its mobile app (iOS in the PCMag test).
- The same personal prompt — how to balance a busy writing life to make time for a book or play — was read aloud to each assistant.
- The tester noted tone, follow‑through, interactivity, whether the AI asked follow‑up questions, and whether a transcript was available during or after the session.
The evaluation focused on:
- the assistant’s ability to ask clarifying questions and maintain context;
- the pacing and length of responses (short, conversational vs long, monologue);
- UI affordances (live text, transcript, voice selection);
- and the perceived empathy or tone — elements that make a conversation feel “alive.”
Why voice changes the evaluation
Voice reduces the friction for follow‑up questions and often forces AIs to manage turn‑taking and interruptions. It also exposes differences in how vendors design conversational flows: some aim for concise, actionable replies; others attempt a more Socratic back‑and‑forth. For many users, that feel matters as much as raw accuracy.
Hands‑on findings: what each assistant did well (and not)
ChatGPT — solid advice, less back‑and‑forth
ChatGPT’s iOS app supports voice mode and provides multiple voice choices; the Advanced Voice rollout introduced named voices such as Vale alongside a series of earlier options. In the PCMag test, the assistant delivered useful, concrete productivity advice — scheduling blocks of time, batching tasks — but the conversation felt more like polished monologues than an active dialogue. The responses were closed‑ended, with fewer probing questions to keep the exchange going.
Cross‑checks with recent reporting confirm OpenAI’s rollout of multiple voices and its Advanced Voice Mode, even as the company iterates on the distinction between Standard and Advanced voice behaviours. Independent coverage tracked both the new voices and the push/pull over which voice modes remain available to users.
Strengths
- Broad, accurate advice; strong in drafting and planning tasks.
- Multiple voice options and a polished mobile UI.
Limitations
- Less insistently conversational — tends toward complete, self‑contained replies instead of asking follow‑ups.
Google Gemini Live — the best talkative partner in this round
Gemini Live stood out for its conversational drive: it regularly asked follow‑ups and nudged the tester to refine aims and next steps. Gemini’s voice selection (including the British‑accented Capella) and the Live interface make it easy to start a spoken back‑and‑forth, and reviewers reported that Gemini Live intentionally uses pauses and conversational prosody to sound less robotic. PCMag’s tester said Gemini “felt less like a chat with a robotic AI and more like one with a sympathetic friend.”
Independent reporting confirms Gemini Live’s rollout of multiple voices and that Google has been expanding Live to broader user sets; coverage from multiple outlets documented the voice names and the step to make Live more widely available on mobile.
Strengths
- Proactive dialogue style and frequent clarifying questions.
- Realistic prosody and choice of voices; easy to enable Gemini Live in the app.
Limitations
- Some advanced Live features (extensions, app integrations) were initially limited and have been rolled out in phases.
Microsoft Copilot — an empathetic sounding board tied to the Microsoft ecosystem
Copilot’s voice mode presented a friendly, encouraging conversation, asked the tester useful clarifying questions, and offered positive reinforcement. The Copilot app offers multiple voice styles (reporting mentions Wave among other voice names) and Microsoft has been rolling out voice activation gestures such as “Hey, Copilot!” in preview builds. The assistant’s ecosystem advantages (calendar, Outlook, Microsoft 365 context) give it an edge when the conversation touches on scheduling or integrating tasks with calendars.
Strengths
- Practical, agenda‑aware suggestions when the user is embedded in Microsoft 365.
- Voice mode that supports follow‑ups and empathetic confirmations.
Limitations
- Best experience requires being within Microsoft’s ecosystem; otherwise some of the context advantages fall away.
Meta AI — lively, sometimes tangential
Meta AI’s voice mode impressed for its spontaneity and its tendency to open new avenues (for example, brainstorming different book ideas and suggesting related media). That energy is a two‑edged sword: it delivered unexpected, useful inspiration, but it also wandered off the original thread and required the tester to steer the conversation back. The app’s UI displays live text during the speech and keeps a transcript after the call, enhancing usability for follow‑up.
Strengths
- Lively, creative ideation; strong for brainstorming and lateral thinking.
- Live transcript visible during the conversation.
Limitations
- Can deviate from the user’s initial goal; needs more guardrails for focused tasks.
Grok (xAI) — packed with practical ideas, sometimes too dense
Grok produced a wealth of practical suggestions and asked follow‑ups, but its responses were often dense and information‑heavy — which made them harder to absorb during a spoken conversation. The app exposes a transcript and allows customization of response style (concise, socratic, formal, or custom), and that flexibility can help tune the flow. PCMag’s tester found Grok helpful but occasionally overwhelming in the amount of content it delivered per reply.
Strengths
- Rich, pragmatic suggestions and a high information density.
- Response style customization in the app.
Limitations
- Dense replies can feel like drinking from a firehose in a live conversation.
Comparative analysis: what makes voice chat feel like “conversation”?
1) Turn management and clarifying questions
The strongest conversational experiences weren’t necessarily the most eloquent voices — they were the ones that asked for input and then built on it. Gemini Live and Copilot were notable for steering the conversation with follow‑ups; ChatGPT and Grok tended to reply with complete answers that required explicit user prompts to continue. PCMag’s notes underline this contrast: Gemini repeatedly asked whether it should search for writing groups and offered specific options; ChatGPT provided solid tips but didn’t push the dialogue forward as consistently.
2) Response length, pacing and interruption
- Shorter, conversational replies make it easier to interrupt and steer a session in real time.
- Long, compressed replies (Grok) are rich in content but harder to digest when spoken.
Apps that allow the user to stop a reply mid‑stream or that surface a live transcript make it easier to reference or interrupt. Meta and Grok both present text as they speak; that visibility aids comprehension.
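The pacing point above can be made concrete. The sketch below is a minimal, purely illustrative example (not any vendor’s actual implementation) of how a voice client might split a long reply into short, sentence‑sized chunks, so a listener can interrupt between chunks instead of sitting through a monologue; the function name and the character budget are assumptions for the demo.

```python
import re

def chunk_reply(text, max_chars=120):
    """Split a long reply into short, speakable chunks.

    Splits on sentence boundaries, then greedily packs sentences into
    chunks no longer than max_chars, so a listener gets natural pause
    points where they can interrupt or redirect the conversation.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

# A dense, Grok-style reply becomes three short utterances at max_chars=80.
reply = ("Block out two mornings a week for drafting. "
         "Batch errands into a single afternoon. "
         "Review progress every Sunday and adjust the plan.")
for chunk in chunk_reply(reply, max_chars=80):
    print(chunk)
```

Tuning `max_chars` is effectively the same trade‑off the testers observed: a small budget yields interruptible back‑and‑forth, a large one yields self‑contained monologues.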
3) Voice personality vs functional clarity
Named voices (Vale, Capella, Wave) help brand a service and can increase engagement — but personality should not get in the way of clarity. Several outlets confirmed OpenAI’s rollout of named voices and Google’s list (including Capella). Users reported strong preferences for particular accents and tones, which is unsurprising: voice identity influences perceived empathy and usefulness.
Privacy, data handling, and legal risks — what the voice era exposes
Voice interactions create extra telemetry — raw audio, transcripts, timing data — and that raises both privacy and legal questions. The high‑profile litigation landscape reinforces why buyers and enterprise admins must be deliberate.
- Ziff Davis filed a copyright infringement suit against OpenAI in April 2025 alleging training on proprietary content without permission; that case is part of a larger wave of publisher litigation. Major news outlets covered the complaint and its implications.
- Vendor data policies differ. Microsoft emphasizes tenant‑level protections for customer data when Copilot runs under an enterprise contract; Google and OpenAI expose different default behaviours in consumer tiers. Windows‑centric guidance suggests Copilot as the safer default for enterprise tenant data because of integrated governance controls. Those points are echoed in recent evaluations aimed at enterprise and Windows audiences.
- Assume consumer voice sessions may be used to improve models unless you are under an enterprise contract that explicitly forbids training uses.
- For regulated or sensitive audio, prefer enterprise plans with contractual non‑training clauses or on‑device solutions.
- Keep transcripts and recordings under control; if the app saves voice transcripts by default, audit retention policies and deletion controls.
Cross‑referenced verification of key claims
- The Ziff Davis lawsuit against OpenAI (April 2025) is confirmed by Reuters and The Verge, both reporting on the filing and the allegations that OpenAI used publisher content in training.
- Google’s Gemini Live rollout and its voice list (including Capella) have been reported by multiple outlets, including 9to5Google and PhoneArena/IndiaToday, which document the voice names and the staged rollout to Android and iOS. These independent reports align with PCMag’s observation that Gemini’s Capella voice is an accessible option in the mobile app.
- OpenAI’s ChatGPT voice updates — the introduction of new named voices and Advanced Voice Mode — are widely covered; TechRadar and other outlets tracked the company’s changes to voice options and the subsequent user feedback about Standard vs Advanced voice behaviour. That matches the PCMag tester’s description of ChatGPT’s available voices and the perception that ChatGPT’s voice responses were competent but less probing.
- Microsoft’s Copilot voice features and the introduction of voice activation tests (e.g., “Hey, Copilot!”) are documented by outlets like The Verge; Microsoft’s enterprise posture around Copilot is also described in Microsoft materials and product coverage. These corroborate PCMag’s point that Copilot plays well to calendar‑aware scheduling and the Microsoft ecosystem.
- Grok’s customization options and the app’s transcript behaviour are reported by several hands‑on writeups and app notes, which align with PCMag’s description that Grok shows text during speech and offers style customization.
Practical recommendations: which assistant should you pick for spoken conversations?
The short answer: match the assistant to your goals and environment.
- If you want the most conversational, dialogue‑oriented experience (follow‑ups, clarifying questions, a “friend” feel): Google Gemini Live impressed in these tests for driving the conversation and sustaining context.
- If you need enterprise governance and calendar/file integration in a Windows or Microsoft 365 environment: Microsoft Copilot is the pragmatic choice, with tenant controls, in‑ecosystem advantages and voice features tuned for productivity scenarios.
- If you want a generalist that performs solidly across tasks (writing, coding, drafting) and a polished multi‑platform app: ChatGPT remains the go‑to for broad capability, though its voice mode can feel less interrogative than Gemini in some sessions.
- If your priority is creative ideation and lateral brainstorming in voice mode: Meta AI and Grok produce lively, idea‑rich conversations; be ready to re‑focus the session if the AI wanders or overloads you with information.
How to get better voice conversations: practical settings and habits
- Pick the right voice and speed
- Demo available voices and select one that matches your listening preference; British accents (Capella, Wave) were cited as favorites by testers.
- Favor short prompts and invite follow‑ups
- Ask the assistant to “suggest three next steps” or “ask me two clarifying questions” to encourage back‑and‑forth.
- Use transcript and playback
- If available, watch live text while the assistant speaks or review the post‑session transcript to capture details you missed.
- Customize response style where supported
- Grok’s style presets (concise, socratic, formal, custom) can dramatically change pacing — use them to match the context (brainstorm vs execution).
- Control data sharing for sensitive sessions
- For confidential content, use enterprise plans with non‑training clauses or opt for on‑device solutions where available. Treat consumer tiers as not privacy‑safe by default.
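The habits above — short prompts, invited follow‑ups, style presets — can be combined in a simple prompt‑wrapping helper. The sketch below is an assumption‑laden illustration: the preset names mirror the styles mentioned in the article (concise, socratic), but the wording of each instruction and the helper itself are hypothetical, not any vendor’s actual system prompt or API.

```python
# Hypothetical prompt-wrapping helper. Preset names echo the style
# options the article mentions; the instruction text is an assumption.
STYLE_PRESETS = {
    "concise": "Answer in two or three short sentences.",
    "socratic": "Reply briefly, then ask me two clarifying questions.",
}

def build_spoken_prompt(user_prompt, style="socratic"):
    """Prefix a spoken request with a style instruction that keeps
    replies short and invites back-and-forth; unknown styles fall
    back to the concise preset."""
    instruction = STYLE_PRESETS.get(style, STYLE_PRESETS["concise"])
    return f"{instruction} {user_prompt}"

print(build_spoken_prompt("How do I make time to write a book?"))
```

The same idea works verbally: saying the instruction aloud at the start of a session steers pacing just as a preset does in apps that support one.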
Risks, caveats and things the tests don’t settle
- Reliability vs fluency: conversational fluency can mask factual errors. Natural‑sounding voice is persuasive; verify any factual or legal claims the AI makes. Independent evaluations (consumer groups and newsrooms) show variance in reliability across assistants for advice on legal/financial questions. Treat voice chat as a first draft, not a legal or professional authority.
- Training‑data litigation and IP exposure: lawsuits like the one filed by Ziff Davis against OpenAI heighten the risk profile for vendors that trained on third‑party content without licenses. That affects enterprise procurement, licensing terms and future model availability.
- Celebrity voice imitations and legal/ethical limits: some apps offer voices in the style of well‑known actors, which raises rights and reputational issues. Verify vendor claims and licensing; if a specific celebrity voice is important for your use case, ask for documentation and consent models. Public reporting confirms these kinds of voices exist in some apps, but the legal boundary remains fluid. Flag celebrity‑imitation claims as potentially contentious unless the vendor provides clear licensing.
- Regional variability and staged rollouts: many voice features were rolled out regionally or first on Android, then iOS, or behind subscription tiers. Confirm availability in your locale and app version before benchmarking. Gemini Live, for instance, expanded to free Android users first and then broadened platform availability.
Final verdict and takeaways
Voice turns chatbots from typed tools into conversational partners — but “being conversational” is a compound, design‑driven property. In the live test reported by PCMag Australia, all five assistants offered helpful and concrete advice; the difference was how they earned the right to be heard. Gemini Live’s steady stream of clarifying questions and its engaging prosody made it the most natural conversational partner in that round. Copilot delivered a strong, empathetic, context‑aware experience for Microsoft users; ChatGPT gave solid, broadly applicable counsel but felt more one‑sided; Meta AI’s liveliness produced unexpected inspiration; and Grok’s high information density served users who want many practical actions at once.
For Windows users and IT buyers, the decision is practical: match the assistant to your ecosystem and governance needs. For public or casual voice use, all five offer free entry points; for sensitive or regulated interactions, prefer enterprise contracts that explicitly define data use and training guarantees. The voice era is here, and it rewards deliberate choices: choose the voice that fits the job, tune response styles, and always treat spoken output as draft rather than definitive.
The adoption of voice answers a long‑standing human itch: to have a partner rather than a tool. But the partnership depends on design choices — who asks the questions, how follow‑ups are handled, what’s saved and why — and those choices vary meaningfully across vendors. The PCMag observations are a useful consumer snapshot of that variation; combining those experiential notes with vendor policy checks and independent reporting provides a clearer map for anyone deciding which assistant to talk to, and which to trust with their most important conversations.
Source: PCMag Australia, “Talk, Don't Type: Which Chatbot Is Best at Actual Conversation?”