The challenge of choosing the right AI assistant is growing more pressing as more products surge into the mainstream, touting productivity gains and intelligent support. It is no longer enough to simply trust brand names or flashy marketing—it takes hands-on trials and scrutiny to uncover the real capabilities of platforms like Perplexity and Copilot. I dove into results from two detailed investigations: a general-purpose, prompt-driven head-to-head review from Techpoint Africa, and a clinical academic study published in Nature evaluating AI performance in the nuanced field of obstetric ultrasound. Together, these snapshots form a revealing composite of where next-gen AI stands—both in broad digital utility and in specialist domains.
Testing Perplexity and Copilot: The Generalist’s Duel
In the Techpoint Africa analysis, the reviewer designed a series of 10 prompts aimed at common and practical use cases for digital assistants—ranging from research summaries to creative tasks and technical troubleshooting. This trial mirrored the typical session most users would have with AI: fluid, not restricted to any single domain, but expecting clarity, correctness, and efficiency. It’s the kind of real-world test that is required in a world saturated with promise but in need of proof.

Use Cases Under the Microscope
Across categories—academic summarization, coding help, travel recommendations, fact-finding, business insights, and even the generation of poetry—the two platforms were given identical queries and their responses were measured for speed, accuracy, and depth. Some prompts were straightforward, testing factual recall or synthesis. Others were more open-ended, requiring creative thinking or nuanced advice.

Results That Matter: Speed and Substance
A key takeaway stood out almost immediately: Perplexity generally delivered results faster than Copilot, often in about half the time. For users in the flow of work, this difference is far from trivial—waiting ten or fifteen seconds rather than thirty strongly influences user satisfaction. But speed is never the sole arbiter.

When examining depth and reliability, the platforms diverged. On academic and technical prompts—summaries of research articles or explanations of programming errors—Copilot occasionally provided more detailed, citation-backed answers, but tended to over-extend or produce generic information if not directly cited. Perplexity, meanwhile, shone in answers that required up-to-date information or synthesis of recent web sources, a testament to its search-driven architecture.
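Latency comparisons like the one described here are easy to reproduce at home. The sketch below is a minimal timing harness, not the reviewer's actual methodology; the `stub_assistant` function is a hypothetical stand-in for whichever assistant API you actually call:

```python
import time
from statistics import mean

def time_responses(ask, prompts, runs=3):
    """Measure the average response latency per prompt.

    `ask` is any callable taking a prompt string and returning an
    answer string -- swap in a real assistant call in practice.
    """
    results = {}
    for prompt in prompts:
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            ask(prompt)
            timings.append(time.perf_counter() - start)
        results[prompt] = mean(timings)
    return results

# Hypothetical stand-in for a real assistant API call:
def stub_assistant(prompt):
    return f"Answer to: {prompt}"

latencies = time_responses(stub_assistant, ["Summarize X", "Explain error Y"])
for prompt, seconds in latencies.items():
    print(f"{prompt!r}: {seconds:.4f}s")
```

Averaging over several runs matters here: network jitter and server load can easily swamp a single measurement, which is one reason casual "it felt faster" impressions deserve this kind of check.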
On creative and less-structured prompts (like generating poems or offering subjective advice), both platforms showed strong performance, but sometimes sprinkled in bias or assumptions unless the prompt required strict factuality. This exposes a common AI weakness: the blending of fact with plausible-sounding, but ungrounded, speculation.
The Value of Citations and Sources
One pronounced strength of Perplexity was its consistent citation of sources. Each answer—whether factual, explanatory, or advisory—was linked to specific web references. For users who want to verify information, contextualize it further, or simply double-check an AI’s claim, this habit is invaluable. Copilot, though backed by the breadth of Microsoft’s datasets and intelligence, did not always surface links or source material as transparently unless directly asked.

This difference is more than cosmetic. In a landscape plagued by misinformation and deepfakes, the ability to “show your work” with sources is fast becoming a trust benchmark for AI. Users and enterprises are growing wary of black-box responses, especially when meaningful decisions depend on them.
Bias, Hallucination, and the Human Element
Both tests uncovered moments where the platforms generated plausible-sounding but incorrect or misleading information—a phenomenon known as AI “hallucination.” These ranged from mild misquoting of research to outright fabricated statistics. It is a sobering reminder: no matter the interface or brand, today’s AI systems are still fundamentally pattern-based; they repeat, remix, and occasionally confabulate.

Yet there was a subtle, human-like difference in the way each platform dealt with uncertainty. Perplexity, perhaps due to its sourcing method, occasionally admitted when it could not confirm a detail or specified ambiguity in its answers. Copilot was more prone to blurring its sources together, resulting in fluid but sometimes unfalsifiable explanations. For professional environments, especially research or journalism, that distinction is critical.
Extending the Lens: AI in a Medical Context
While the first review focused on broad use cases, the Nature study took an explicitly clinical lens: How do ChatGPT and Microsoft Copilot in Bing fare when tasked with answering questions about obstetric ultrasound, or interpreting medical reports?

Methodology Anchored in Science
The study’s rigor sets it apart from casual reviews. Both AI systems were systematically tested with standardized exam questions, simulated clinical scenarios, and real de-identified ultrasound reports. Their answers were assessed against gold-standard clinician judgments for correctness and completeness.

The Middle Ground of Competence
Results revealed that both ChatGPT and Copilot perform at a useful—but not yet expert—level. Many correct answers emerged, particularly for general knowledge or interpreting textbook reports. However, neither platform demonstrated the robustness or precision required for frontline decision-making in obstetrics. There were notable omissions, occasional misinterpretation of clinical data, and instances of confidently worded but subtly wrong conclusions.

These gaps can be dangerous in sensitive contexts. For example, AI might overlook rare warning signs in ultrasound imaging, or offer advice inconsistent with clinical best practices. The study points out that although AIs like Copilot and ChatGPT could become helpful tools for continuous education, quick reference, or even preliminary data mining, the systems are not yet reliable as standalone diagnostic aides.
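The study's evaluation approach, comparing AI answers to gold-standard clinician judgments, can be sketched in miniature. The example below is not the paper's actual scoring protocol, and the answer data is invented; it simply shows the shape of such a comparison as crude exact-match accuracy:

```python
def score_against_gold(model_answers, gold_answers):
    """Return the fraction of questions where the model's answer
    exactly matches the gold-standard answer (case-insensitive)."""
    if len(model_answers) != len(gold_answers):
        raise ValueError("answer lists must be the same length")
    matches = sum(
        m.strip().lower() == g.strip().lower()
        for m, g in zip(model_answers, gold_answers)
    )
    return matches / len(gold_answers)

# Hypothetical example data, not drawn from the study:
model = ["normal growth", "placenta previa", "no anomaly"]
gold = ["normal growth", "placenta previa", "neural tube defect"]
print(score_against_gold(model, gold))  # 2 of 3 correct -> 0.6666666666666666
```

Real clinical grading is far richer than exact matching: the study also assessed completeness, and partially correct or confidently wrong answers need their own categories. That gap between a toy metric and expert judgment is precisely why these tools are not yet standalone diagnostic aides.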
Critical Analysis: The Double-Edged Sword of Direct Access
What’s profound about current AI progress, including Microsoft Copilot’s rapid evolution, is the sheer scope of their knowledge and the speed at which they synthesize input data. Integrating up-to-date internet information with large language models offers users direct access to a kind of living encyclopedia.

But this capability is also a double-edged sword.
Notable Strengths
- Real-time Information: For use cases where freshness is critical (breaking news, quickly-evolving technical domains, regulatory changes), web-augmented AIs like Perplexity and Copilot have a compelling edge.
- Synthesizing Multiple Perspectives: Rather than relying on a single source, AI can summarize and synthesize dozens of viewpoints—useful in both the humanities and the sciences.
- Accessibility and Productivity: These AIs lower the bar for research, learning, and creative work, offering meaningful drafts or summaries in seconds, rather than hours.
Hidden Risks
- Hallucinated Authority: Even when wrong, AIs often write with confidence. In areas where users lack domain knowledge, there's a real risk of uncritical acceptance.
- Citations ≠ Truth: Perplexity’s habit of citing sources bolsters transparency, but users must still evaluate the credibility of those references—citation does not equate to factuality.
- Domain Limitations: In medicine or legal contexts, subtle distinctions or rare “edge cases” can trip up even the most sophisticated AIs. As seen in the Nature study, these lapses are not always obvious to non-experts.
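The “citations ≠ truth” risk above can be partially operationalized: before trusting an answer, at least confirm that every inline citation marker corresponds to a listed source. The sketch below is a hypothetical helper; the `[n]` marker format is an assumption about how citations appear, not a guarantee about any particular assistant's output:

```python
import re

def unmatched_citations(answer, sources):
    """Return citation numbers referenced in `answer` (as [1], [2], ...)
    that have no corresponding entry in the `sources` list."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    available = set(range(1, len(sources) + 1))
    return sorted(cited - available)

# Hypothetical answer and source list:
answer = "The trial enrolled 500 patients [1] and reported a 12% effect [3]."
sources = ["https://example.org/trial-report"]  # only source [1] exists
print(unmatched_citations(answer, sources))  # -> [3]
```

A check like this catches only dangling references, not bad ones: a citation can exist, resolve, and still fail to support the claim. Reading the source remains the user's job.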
Practical Advice: How Should You Choose?
For Windows enthusiasts, students, professionals, or digital workers choosing between platforms like Perplexity and Copilot, the optimum choice depends on context.

If you need speed, source transparency, and up-to-date coverage—especially in areas where facts are in flux—Perplexity is highly compelling. Its citation habits reinforce digital literacy, encouraging users to dig deeper rather than simply consume.
Copilot, meanwhile, leverages Microsoft’s strengths: breadth of coverage, seamless Office integration, and often deeper—if more generalized—responses for workplace and productivity use. Its utility grows in environments where the AI’s suggestions are but one layer in a deeper workflow, to be checked and confirmed by human expertise.
For medical or other high-stakes expert domains, neither should supplant qualified professionals. Used wisely, though—with awareness of their edges and blind spots—these tools can amplify human capability rather than undermine it.
The Future: Evolving From Assistants to Colleagues
Both studies show that AI assistants are evolving rapidly but remain fundamentally assistive rather than authoritative. They excel in pattern recognition, rapid research, and cross-domain synthesis. They stumble when domain expertise, deep experience, or emotional nuance play central roles. Transparency—a willingness to show sources, admit uncertainty, and encourage human verification—emerges as a pivotal factor in trust.

As AI continues to work its way into office suites, search engines, and even clinical practice, the smart user will not only ask, “What can Copilot or Perplexity do for me?” but also, “Where should I step in to guide, verify, and improve the answers I receive?” The best AI is not simply the fastest or most encyclopedic—but the easiest to collaborate with, question, and learn from responsibly.
The most valuable skill, in a world of ever better digital assistants, might just be the wisdom to know what they do well, what they miss, and when to slow down and look twice. For now, the AI arms race is far from settled—and that’s good news for users who want smarter, more transparent, and ultimately more accountable digital partners.
Source: Techpoint Africa I tested Perplexity vs Copilot with 10 prompts for different use cases — Here’s what I found
Source: Nature Performance of ChatGPT and Microsoft Copilot in Bing in answering obstetric ultrasound questions and analyzing obstetric ultrasound reports - Scientific Reports