Google’s Gemini is quietly testing a set of new experimental modes — Agent Mode, Gemini Go, and an Immersive View — that together signal a deliberate shift from single‑turn chat toward agentic, creative, and visually driven workflows inside the Gemini workspace. Early UI discoveries reported by TestingCatalog show descriptive labels and distinct icons in the mode selector; the most consequential of these, Agent Mode, is explicitly described as capable of autonomous exploration, planning, and execution, while Gemini Go and Immersive View appear aimed at collaborative ideation and richer visual responses respectively.
Background / Overview
Google has been evolving Gemini beyond a conversational assistant into a platform that blends multimodal generation, developer tooling, and agentic automation. At Google I/O 2025 the company publicly teased an agentic direction — Project Mariner and related Agent Mode demos — and since then Google’s developer documentation and product blog entries have steadily expanded how and where agentic features will appear, from developer IDEs to the Gemini app and search integrations. This context makes the TestingCatalog discovery less an isolated UI experiment and more a visible step in a broader product strategy. (android-developers.googleblog.com, blog.google)
Gemini’s roadmap over 2024–2025 has emphasized three parallel ambitions:
- Deep integration with Google Workspace and developer tools to make AI a native productivity layer.
- Expanded multimodal and creative tooling (Canvas, image/video editing, Audio/Video Overviews).
- Introduction of agentic features that can perform multi‑step tasks and interact with web content autonomously.
Agent Mode: what the label and leaks actually mean
What the testing UI shows
TestingCatalog captured new mode descriptions in Gemini’s mode selector that explicitly describe Agent Mode as a tool that will “perform autonomous exploration, planning and execution.” The listing also shows Agent Mode with a dedicated, unique icon — a UI signal that Google may intend this mode to remain a distinct, persistent entry rather than a temporary toggle.
What Agent Mode is built to do
Public announcements from Google and developer documentation make clear that Agent Mode is intended to do more than offer scripted responses. In practice it aims to (see the sketch after this list):
- Accept a high‑level objective from a user.
- Formulate a multi‑step plan to achieve that objective.
- Use integrated tools (web browsing, Google apps, IDE tools in developer contexts) to carry out steps with varying degrees of autonomy and human oversight.
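That objective-in, plan-out, tool-mediated-execution pattern is easy to prototype in miniature. The following Python sketch is purely illustrative: every type and function in it is hypothetical, invented for this example, and not part of any Gemini API.

```python
# Illustrative sketch of the objective -> plan -> execute -> oversee loop
# described above. All names here are hypothetical, not a Gemini API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    description: str
    action: Callable[[], str]  # stand-in for a tool call (browser, app, IDE)
    high_impact: bool = False  # e.g. sending mail, purchases, public posts

def plan(objective: str) -> list[Step]:
    # Stand-in for the model's planning pass: decompose the objective into
    # tool-executable steps. A real agent would ask the model to do this.
    return [
        Step("search listings that match the brief", lambda: "3 candidates"),
        Step("draft a summary for the user", lambda: "draft saved"),
        Step("email the summary", lambda: "email sent", high_impact=True),
    ]

def run_agent(objective: str, confirm: Callable[[Step], bool]) -> list[str]:
    results = []
    for step in plan(objective):
        # Human-in-the-loop: high-impact steps require explicit approval.
        if step.high_impact and not confirm(step):
            results.append(f"skipped: {step.description}")
            continue
        results.append(step.action())
    return results

if __name__ == "__main__":
    print(run_agent("find a two-bedroom apartment under budget",
                    confirm=lambda s: input(f"Run '{s.description}'? [y/N] ") == "y"))
```

The design point the sketch makes is that "degrees of autonomy" reduce to policy: which steps run unattended and which block on a human decision.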
Android Studio’s Agent Mode preview is already described as a capability that can “formulate an execution plan that spans multiple files” and then make edits, run builds, and iteratively fix issues — a clear, concrete instance of the same agentic pattern. (android-developers.googleblog.com)
How this matches Google I/O and Project Mariner
Agent Mode is not a surprise: Google demoed Project Mariner (the company’s internal name for web‑interacting agents) at Google I/O and outlined agentic features for Search, Chrome, and the Gemini app. Those demos and announcements framed agents as capable of navigating websites, interacting with page elements, and orchestrating multi‑step tasks such as apartment search workflows or travel planning. The current testing descriptions confirm that Google is pushing those capabilities into consumer‑visible UX. (blog.google, hereandnowai.com)
Why Agent Mode matters
Agent Mode is the most consequential of the three tested modes because it changes the product category. Gemini moves from being a knowledge and creativity assistant into being an execution layer that can act as a delegate on the user’s behalf — a transition that brings major productivity gains but also new operational and safety challenges.
- Productivity upside: Agents can automate repetitive web chores, coordinate data across Google apps, and reduce the cognitive load of planning and execution.
- Behavioral shift: Users will need to learn how to specify goals that an agent can execute safely and reliably.
- Safety and risk: Allowing software to take autonomous actions on the web or in user accounts raises concerns about authentication, consent, error handling, and the potential for damaging automated behavior.
Gemini Go: ideation, Canvas, and prototyping
The testing label and likely UX
TestingCatalog’s discovery shows Gemini Go described with the short phrase “explore ideas together,” which strongly suggests a focus on collaborative ideation rather than direct automation. The language and early UI placement indicate Gemini Go could be a Canvas‑centric mode for sketching, brainstorming, and rapid prototyping inside the Gemini workspace.
How Canvas and Gemini Go may fit together
Google has been expanding Canvas as a shared multimodal workspace where users can sketch, drop images, and co‑edit — with Gemini powering generation and transformations inside that space. A dedicated Gemini Go mode would logically:
- Provide tools and prompts tailored to brainstorming, moodboarding, rapid design iteration, and early product prototyping.
- Offer quick ways to switch between text prompts, sketches, and generated assets.
- Surface templates and collaborative features that make ideation with teams fast and low friction.
Use cases and value
- Product design teams quickly generating style options and wireframes.
- Marketers sketching campaign concepts and producing sample images or short videos.
- Developers or data teams drafting proof‑of‑concept flows with embedded diagrams and code snippets.
Immersive View: visual answers and “visual explanations”
The testing label and possibilities
The Immersive View mode’s short description — promising “visual answers to your questions” — suggests an expansion of Gemini’s visual capabilities into interactive, contextual visuals. That could mean any of the following (a developer‑side sketch follows the list):
- On‑demand generation of explanatory illustrations, annotated images, or step‑by‑step visual walkthroughs.
- A fusion of Google’s Video Overview / Gemini Live visual guidance features with image generation and scene synthesis.
- Visual overlays or “answer cards” that illustrate complex concepts rather than relying solely on text.
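Whether any of this ships in the consumer UI is unconfirmed, but the developer-facing primitive already exists: the Gemini API can return interleaved text and image parts in a single response, which is the raw material a "visual answers" mode would need. A minimal sketch using the google-genai Python SDK; the model id and its availability are assumptions, not confirmation of how Immersive View works:

```python
# Minimal sketch of a mixed text + image "visual answer" via the Gemini API.
# The model id below is an assumption and may differ by account and region.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",  # assumed model id
    contents="Explain how a bicycle derailleur works, with a labeled diagram.",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],  # interleave prose and images
    ),
)

# Response parts mix text and inline image data; save images, print prose.
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text:
        print(part.text)
    elif part.inline_data:
        with open(f"visual_answer_{i}.png", "wb") as f:
            f.write(part.inline_data.data)
```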
Where Immersive View draws from existing tech
Google’s recent Gemini Live enhancements enable visual guidance by analyzing camera input and highlighting items on screen; Immersive View looks like an extension of that idea to generated visuals and in‑app explanations. In effect, Immersive View could provide visual answers — diagrams, annotated screenshots, and synthesized images — to complement or replace prose in situations where a picture is more effective. (techradar.com, theverge.com)
Practical implications
- Education and training: step‑by‑step visual explanations for complex procedures.
- Troubleshooting: annotated screenshots that highlight the precise controls the user must adjust.
- Creative workflows: auto‑generated concept visuals that speed iteration.
Product strategy: testing toggles vs. permanent modes
Google’s internal practice — as seen repeatedly across Gemini’s feature rollouts — is to expose experimental capabilities as named toggles or modes in early tests, then decide whether to fold those experiences into the core product or keep them as distinct entries. TestingCatalog points out that many of the “mode” toggles historically serve as temporary containers before being merged into Gemini’s primary UX; Agent Mode’s dedicated icon, however, suggests a higher probability that it will remain a separate, intentionally discoverable feature. That pattern matches Google’s documented approach for developer‑grade features (e.g., Agent Mode in Android Studio) versus user convenience features (which often become contextually surfaced). (android-developers.googleblog.com)
This hybrid strategy allows Google to:
- Pilot focused experiences with a subset of users.
- Measure engagement and failure modes in the wild.
- Iterate on safety, permission models, and admin controls before making features broadly available.
Security, privacy, and governance — the hard part of agentic AI
Agent Mode introduces concrete security and governance challenges that differ in degree and kind from traditional conversational assistants:
- Account access and delegated actions: Agents that log into services, fill forms, or place orders must operate within strict authentication and consent frameworks to avoid abuse and accidental transactions.
- Error handling and fallbacks: Autonomous actions can yield irreversible consequences (financial transactions, publicly visible posts). UX must make rollback and human‑in‑the‑loop confirmation trivial.
- Data residency and training telemetry: Enterprises will demand clarity about whether and how agent interactions are used to improve models, especially for sensitive or proprietary workflows.
- Admin control and rollout: Workspace features such as Gemini‑powered image editing were rolled out with specific availability tiers and varying admin control; early Workspace image tools initially lacked centralized admin toggles, a detail that raised governance concerns inside enterprises. Organizations evaluating Agent Mode for business use must validate available admin controls and data protection agreements before enabling the feature widely.
Competitive landscape and market implications
Agentic functionality is the current battleground for AI platform competition. OpenAI, Anthropic, Microsoft, and several other players have pushed agent frameworks and automation features in recent releases; Google’s Agent Mode and Project Mariner directly position Gemini as a competitor in the multi‑step task automation space. For users and enterprises the choice will hinge on:
- Integration depth: Gemini’s advantage is deep linkage with Google services and Workspace.
- Safety and governance: Enterprise customers care more about policy controls than headline capabilities.
- Developer ecosystem: Tools such as Gemini Code Assist and a Gemini CLI that surface agent features in developer workflows lower barriers for adoption.
How to prepare (practical steps for users, admins, and dev teams)
- Inventory integrations: identify Google services, third‑party sites, and internal tools the agent might access.
- Tighten account and authentication policies: review OAuth scopes, enable robust 2FA, and consider just‑in‑time permissions for agent actions.
- Define human‑in‑the‑loop rules: for high‑impact tasks (purchases, public posts), require explicit user confirmation before execution (see the sketch after this list).
- Test in controlled environments: use preview accounts or developer sandboxes to exercise Agent Mode workflows and capture failure cases.
- Update data handling policies: clarify whether agent logs, prompts, and artifacts are retained and how they may be used for model improvement.
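To make the confirmation rule concrete, here is a hypothetical policy gate in Python. Nothing in it is a Google or Workspace API; it only illustrates how scope allowlists, confirmation for high-impact actions, and audit logging compose:

```python
# Hypothetical policy gate for agent actions. Not a Google or Workspace API;
# it illustrates the scoping, confirmation, and audit rules listed above.
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent.audit")

ALLOWED_SCOPES = {"calendar.readonly", "drive.file"}  # just-in-time grants
REQUIRE_CONFIRMATION = {"purchase", "public_post", "send_email"}

def authorize(action: str, scope: str,
              user_confirms: Callable[[str], bool]) -> bool:
    """Allow an agent action only if it passes scope and confirmation policy."""
    if scope not in ALLOWED_SCOPES:
        audit.warning("denied %s: scope %s not granted", action, scope)
        return False
    if action in REQUIRE_CONFIRMATION and not user_confirms(action):
        audit.info("user declined %s", action)
        return False
    audit.info("approved %s under scope %s", action, scope)
    return True

# Example: even under an allowed scope, send_email still needs an explicit yes.
ok = authorize("send_email", "drive.file",
               lambda a: input(f"Allow {a}? [y/N] ") == "y")
```

The useful property of gating actions this way is that the audit log doubles as the failure-case record the testing step above asks you to capture.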
Risks, unknowns, and unverifiable claims
- Timeline uncertainty: TestingCatalog’s UI capture and various Google announcements indicate active development, but exact rollout windows remain unconfirmed. Multiple public writeups talk about staggered or subscriber‑gated releases, but no public definitive date for general availability of the three modes has been published. Treat timing claims as provisional. (analyticsvidhya.com)
- Persistence of modes: While Agent Mode’s distinct icon suggests permanence, Google has a history of merging experimental toggles into broader interfaces. The final product taxonomy could differ when features go to stable release. This is a contextual, product‑management decision rather than a technical one.
- Feature scope and limits: The testing descriptions are intentionally high level. The precise abilities (e.g., whether agents will write emails, make purchases, or post on social networks) and the safeguards around those abilities remain to be confirmed in official docs and release notes.
- Commercial gating and quotas: Early agentic features have been announced for paid tiers or developer previews, but pricing, quotas, and licensing terms are subject to change and should be verified against official product announcements before procurement decisions. (blog.google)
Short checklist for readers who want to track and evaluate these modes
- Watch Google’s official product posts and Android Developers blog for Agent Mode docs and developer previews. (android-developers.googleblog.com)
- Test Canvas and Gemini Live features to understand how visual and collaborative primitives will map into Gemini Go and Immersive View. (cincodias.elpais.com, techradar.com)
- Evaluate admin and data controls in your Workspace environment before enabling previews — recent Workspace image tools revealed subtle admin control gaps that enterprises flagged.
- For developer teams, experiment with Gemini Code Assist and the CLI to see how agentic workflows are surfaced in IDEs and automation scripts.
Conclusion
The appearance of Agent Mode, Gemini Go, and Immersive View in Gemini’s testing UI is more than cosmetic tinkering — it reflects a deliberate shift in Google’s strategy: move Gemini from a passive assistant to a platform that helps users think, prototype, and act. Agent Mode is the most disruptive of the three, promising autonomous planning and execution that could reframe routine digital work. Gemini Go and Immersive View look set to accelerate ideation and lift visual communication inside the same workspace.
Those capabilities promise clear productivity gains, particularly for integrated Workspace users and developer workflows, but they also raise non‑trivial questions about who controls the agent, how errors are prevented, and what data is captured and used. The testing evidence and public Google announcements make the direction plain, but the precise UX, admin controls, and availability remain in flux. Organizations and users should prepare by tightening governance, testing agent workflows in controlled settings, and keeping an eye on Google’s evolving documentation as these experimental modes move toward wider availability. (android-developers.googleblog.com, theverge.com)
Source: TestingCatalog Google tests new Gemini modes - Agent, Go and Immersive View