Why Android and iOS can’t be enough for agentic AI

0 6 minutes read

Why Android and iOS can’t be enough for agentic AI

agentic AI – Smartphone AI tools can summarize, draft, and trigger tasks—but they still run into limits when you ask them to do more than what the OS and apps were designed to expose. The path forward, developers say, runs through tool frameworks like MCP and Android’s App

Over the past year, the same frustration has followed a lot of people who try “agentic AI” on their phones: it feels powerful—until you push it a little past what it was built to do.

An assistant can summarize emails, set reminders, even help write code. But those wins tend to sit inside a narrow lane of built-in abilities. The moment you ask for something slightly more ambitious or personalized, the assistant hits a ceiling. It isn’t malfunctioning. It’s simply not designed for the kind of expanded, real-world working life people keep imagining.

That gap is where two different ideas start showing up. One is structured tool exposure, using frameworks like the Model Context Protocol (MCP), which lets external cloud services hook into an agent. The source material notes that Anthropic, OpenAI, and most other platforms support MCP.

The other route is more experimental: giving agents broader, system-level access to personal machines. This is where names like OpenClaw come into the conversation—an attempt to move beyond “assistant that talks” into “agent that can act.”

The human appeal is obvious: let an AI read your emails, manage your smart home, or trigger real-world actions. The problem is just as immediate. What happens when the AI needs a new kind of computing—and the phone’s operating system still thinks in apps?

A smartphone assistant doesn’t “see” your world the way people assume

Before getting to the future, the source makes a key clarification about how agentic AI works today. When you ask an AI to set a calendar event or generate a text document. it isn’t executing tasks directly inside your phone the way a human would. It’s still just a language model reading text input and producing text output.

The twist is that the text output can include structured “tool calls.” Instead of “acting” itself. the AI requests that external systems do something on its behalf. A separate platform layer intercepts those requests. runs the necessary software. and then returns the result to the model. so the assistant can respond with an actual outcome.

The example given is simple: ask Gemini to summarize a webpage. The source says Gemini doesn’t browse the internet on its own like a little computer roaming the web in the background. Instead, it calls a fetch tool, receives the page’s raw HTML, and then generates the summary.

In practice, the source argues, every capability is an external function the model has to ask someone else to run—whether that’s on-device code snippets, a call to another service, or MCP.

The orchestration layer matters because it’s what turns a chatbot into something closer to a personal assistant.

And now the real question: are we forcing AI into the wrong shape?

The source pushes back on the idea that the next leap will simply come from bolting more AI features onto existing mobile operating systems.

Google and Apple have been moving toward a broader approach to AI tools, but the story keeps circling one worry: smartphone OS design still follows an app-first model, while agentic AI increasingly needs a way to access tools—and those tools need to be scoped, controlled, and permissioned.

The source frames this as a mismatch of assumptions. It suggests the biggest problem isn’t just which AI model you use. It’s deciding what data and tools the AI can access, how those accesses are controlled, and what the scope of those actions should be.

It points to existing code-focused tools—Claude Code and Codex—as an example of how far “tool use” can be stretched when you can connect beyond their original code-first purpose. With effort, they can branch out into talking to other services and even running small tasks.

Then it pivots to a more radical prototype: Microsoft’s Project Solara, described as an OS built specifically to run specialized AI agents for specific tasks. The source says the concept is intriguing even if it isn’t fully convinced by an AI-led UI experience.

Android’s answer: AppFunctions, aimed at making Gemini more agentic

Google’s own approach for Android is identified in the source as a new AppFunctions framework, “whispered about at I/O 2026.” AppFunctions, as described, is an experimental API that allows apps to expose tools to Android’s Gemini AI agent—so apps can communicate directly with the agent.

The source also compares AppFunctions to an on-device version of an MCP server, with apps acting locally rather than in the cloud.

For developers. AppFunctions are presented as straightforward to build on and able to run on-device with no latency and enhanced privacy. The source also says AppFunctions interact directly with app state and leverage Android’s permission and privileged-agent model—features intended to block functions users don’t want running.

But the limitations show up quickly, and they’re central to the argument. The source says AppFunctions remain restrictive compared with what’s possible elsewhere.

It notes that AppFunctions are limited to functions exposed by apps and the OS. It also says there’s still no hint of long-term project scratchpads or self-learning capabilities.

More importantly, it argues that AppFunctions feel structurally incapable of supporting autonomous workflows that span multiple systems or maintain state across long-term tasks on their own.

The source draws a line between what could be “game-changing” and what likely won’t happen. Asking Gemini to book a cab. add milk to a shopping list. and cancel a hotel booking could be enough for many users. But it’s less clear AppFunctions could open the door to a phone managing things like stock portfolio handling. budgeting. or booking an entire vacation away—especially if the agent needs to coordinate across services over time.

The same ambition that makes agents useful also makes them risky

The source doesn’t treat these capabilities as just a convenience upgrade. It raises the practical danger that comes with more power.

An agent that understands someone’s needs and data is more powerful than simple tool calling. But with that power comes risk—especially if an agent is given unfettered access to accounts and documents.

The safer path, the source says, is careful scoping: giving an AI a carefully bounded environment so it can accomplish specific tasks.

Proper sandboxing is presented as the key, but also the hard part. Containers and virtual machines are described as too resource-intensive to spin up for multiple agentic tasks.

WebAssembly (WASM) is said to show promise for lightweight, secure applications, but the source also calls the specification a “broken mess” when it comes to cross-language support for developers.

Then come the lingering concerns that don’t disappear just because the interface looks smooth: sensitive data management and history. per-task permissions and shared memory. and how authentication between agents and services should work. The source stresses that you don’t want an AI to retain context of emails in plain-text memory.

Whoever solves these problems, the source suggests, gets a powerful AI-centric tool that could upend Android and iOS as the AI platforms of the future.

So is the answer a new kind of phone, or a different kind of integration?

The closing movement is less certain—and that’s where the unease really lands.

The source argues that AppFunctions, MCP, and similar frameworks are still incredibly useful because they make today’s operating systems accessible to AI. They’re positioned as the future of mobile AI in the short term.

But beyond that, the source casts doubt on the long-term direction of “apps as the core unit of computing.” The claim is blunt: given the wind is blowing toward agent-service integrations, those integrations may be the future rather than more app-first activity.

It asks readers to picture a world where you don’t need a 300MB taxi app when an MCP request could do the job. It also points at the clutter people already feel—app drawers and notification shades—and frames a potential alternative: everything arriving at the end of a quicker voice command.

If AI becomes the primary way people interact with computing, the source argues, then smartphone operating systems may need to be rebuilt from the ground up—not just updated.

And that’s the tension the last year has been teaching, one tool at a time. Today’s agents can do impressive things. but they still depend on a structure phones weren’t originally built to provide. The moment you want the assistant to work like an actual agent—over time. across systems. with bounded but meaningful access—you start bumping into the walls of Android and iOS.

In that collision between ambition and architecture, the real story isn’t whether AI is good enough. It’s whether the platform underneath it is.

agentic AI MCP Model Context Protocol Android AppFunctions Gemini Anthropic OpenAI Claude Code Codex Microsoft Project Solara OpenClaw smartphone AI AI-first operating system sandboxing WebAssembly WASM privacy and security

Ana Souza 16 hours ago

0 6 minutes read

Leave a Reply Cancel reply