The Intelligence Layer
Three companies shipped desktop-native AI in the same week, each integrating at a different depth. Intelligence is becoming an OS layer, the same way networking and graphics did before it.
A GenAI Newsletter by Raj
For the past two years, AI lived in a browser tab. You opened ChatGPT or Claude, typed a question, got an answer, and went back to whatever you were doing. The AI had no idea what was on your screen or what files were on your machine.
That is changing fast. Three companies shipped desktop-native AI within days of each other, and a fourth approach emerged from individual developers. Each has a different idea of how AI should live on your computer, and looking at them side by side tells you a lot about where this is headed.
Approach 1: Replace the OS
Perplexity Personal Computer launched for Mac on April 16. It manages your local files, native applications, and web browsing. It reads your email, calendar, and messages. It uses roughly 20 AI models internally, routing each task to whichever model is best suited for it. With a Mac mini, it runs 24/7. You can start tasks remotely from your iPhone with two-factor authentication.
CEO Aravind Srinivas: "A traditional operating system processes commands; an AI operating system focuses on goals."
This is the most ambitious version of desktop AI anyone has shipped. Perplexity is saying the file system, app launcher, notification center, and browser are all implementation details that should be hidden behind a goal-oriented AI layer. You say what you want done and the system figures out which files, apps, and APIs need to be orchestrated.
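The "roughly 20 models" claim implies a routing layer between the goal and the model fleet. Here is a minimal illustrative sketch of per-task routing, not Perplexity's actual implementation; the task types and model names are invented for the example:

```python
# Illustrative sketch of per-task model routing (all names hypothetical):
# each task type maps to whichever model is assumed best suited for it.
ROUTES = {
    "summarize_email": "small-fast-model",
    "write_report": "large-reasoning-model",
    "browse_web": "agentic-browsing-model",
}

def route(task_type: str) -> str:
    # Unknown task types fall back to a general-purpose model.
    return ROUTES.get(task_type, "general-model")

print(route("write_report"))  # large-reasoning-model
print(route("rename_files"))  # general-model
```

The interesting design question is upstream of this table: a goal like "prepare my weekly report" first has to be decomposed into task types before anything can be routed at all.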
It costs $200/month. Whether the productivity gain justifies that depends on how much of your work can actually be expressed as goals. Writing "prepare my weekly report using data from these three spreadsheets and email it to the team" is a good fit. Browsing, reading, and forming opinions are not. The question for Perplexity is how much of a knowledge worker's day falls into the first category.
Approach 2: Live Alongside the OS
Gemini for Mac launched the same week. It's free for all users on macOS 15+. Press Option+Space from anywhere and Gemini appears as an overlay. It can see your screen and answer questions about whatever you're looking at.
Google took the opposite approach from Perplexity. Gemini doesn't manage your computer. It shows up when you call it, answers your question, and goes away. You stay in control of your OS, your files, your apps. The AI is a second opinion you can summon, not a manager that runs in the background.
Alongside the desktop app, Google shipped Gemini 3.1 Flash TTS, a text-to-speech model with audio tags that let you control vocal style, pace, and delivery. It supports 70+ languages and watermarks all output with SynthID. It currently holds the top Elo score (1,211) on the Artificial Analysis TTS leaderboard. Combined with the desktop overlay, this positions Gemini as something you can both see and hear.
The interesting thing about making this free is that Google is prioritizing distribution over revenue. If a hundred million people get used to pressing Option+Space to ask AI a question, Google has built a new kind of search habit that's much harder to displace than a browser bookmark.
Approach 3: Be the Terminal
Claude Code and the terminal-agent ecosystem represent a third philosophy. AI lives in your command line. There is no visual interface beyond text.
Claude Code already has /loop for recurring background tasks, /schedule for cron-like agents, /batch for parallel work across worktrees, skills for domain-specific capabilities, and MCP for connecting to external tools. It reads your repo, writes code, runs tests, manages git, and handles multi-step workflows. This week, Anthropic's Claude Managed Agents (now in public beta) added production infrastructure: sandboxing, permissions, state management, error recovery.
The terminal approach has the deepest integration of any of these. A terminal agent can read any file, run any command, and compose any Unix tool into a pipeline. The Perplexity and Gemini approaches are limited by what their apps can access through macOS APIs. The terminal has no such constraint.
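To make "compose any Unix tool into a pipeline" concrete, here is a toy example of the kind of one-liner a terminal agent can generate and run, counting TODO comments per file from grep-style output (the input data is inlined so the example is self-contained):

```shell
# Count TODO comments per file: extract the filename field, then
# tally and rank. Prints "2 a.py" before "1 b.py".
printf 'a.py:# TODO fix\nb.py:# TODO later\na.py:# TODO test\n' \
  | cut -d: -f1 | sort | uniq -c | sort -rn
```

Nothing here is AI-specific, and that is the point: the agent inherits forty years of composable tooling for free, rather than being limited to whatever its host app exposes.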
The tradeoff is that the audience is limited to people who already work in a terminal. My mother will never use Claude Code. She might use Perplexity Personal Computer in five years, and she could use Gemini for Mac today.
Approach 4: The Companion Layer
A fourth approach is emerging from individual developers. It doesn't try to replace anything or live anywhere specific. It sits next to your cursor as a teaching companion.
Clicky (by FarzaTV) watches your screen, listens to your questions, speaks answers back, and points at things on screen. Farza built it to learn DaVinci Resolve. Within days, someone built a Hindi version for teaching elderly parents how to make UPI payments, and someone else built a Clicky SDK for embedding the pattern in any app.
This approach assumes you're already in the right application and you already know what you want to do. You just need help figuring out how. A video editor who can't find the color grading panel. A parent who wants to send money through Google Pay. A new employee trying to navigate their company's internal tools.
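The companion pattern boils down to a simple loop: capture the screen, combine it with the user's question, then speak an answer and point at a location. Here is a hypothetical sketch of that loop with stubbed functions; none of these names are Clicky's actual API:

```python
# Hypothetical companion-layer loop (names are illustrative, not Clicky's API):
# watch the screen, take a question, return speech plus a point-at location.

def capture_screen() -> bytes:
    """Stub: a real companion would grab a screenshot here."""
    return b"<screenshot-bytes>"

def answer(question: str, screen: bytes) -> dict:
    """Stub: a real companion would call a multimodal model here."""
    return {"speech": f"Here's how: {question}", "point_at": (120, 340)}

def companion_step(question: str) -> dict:
    # One iteration of the loop: see, think, respond.
    screen = capture_screen()
    return answer(question, screen)

result = companion_step("Where is the color grading panel?")
print(result["speech"])   # Here's how: Where is the color grading panel?
print(result["point_at"]) # (120, 340)
```

The structure explains why the pattern spread so quickly: swap the stubs for a screenshot call, a model call, and a TTS call, and you have a working companion for any application.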
Of the four approaches, this one serves the widest range of people. Most users don't need an AI operating system. They need someone to show them where the button is.
The Pricing Tells a Story
| Approach | Product | Price | Who it's for |
|---|---|---|---|
| Replace the OS | Perplexity Personal Computer | $200/mo | Power users, executives |
| Overlay | Gemini for Mac | Free | Everyone |
| Terminal | Claude Code + Managed Agents | $200/mo (Max) | Developers |
| Companion | Clicky and derivatives | Free / open source | Learners, non-technical users |
Google is giving it away to build habit. Perplexity and Anthropic are charging premium prices because their users can measure the productivity gain. The companion layer is free because it's built by individuals solving their own problems, same as the skills we talked about last week.
Where This Goes
All four approaches will coexist for a while because they serve different people doing different things. The long-term trajectory is toward convergence. Gemini will eventually act on your screen, Perplexity will get cheaper as inference costs fall, Claude Code will eventually get a visual layer, and the companion pattern will get absorbed into operating systems as an accessibility feature.
The more useful question is which mental model becomes the default. Right now most people think of AI as "a chat window I type into." Within a year the default will probably be "something running on my machine." How much control it has is the open question, and this week gave us four different answers.
This is the tenth edition of my weekly deep dive into what is actually happening at the frontier of Generative AI. Previous editions: The Falling Price of Intelligence / Why Looping Is the New Scaling / The Quiet Skill Revolution / AI Gets Personal / The Stack Got Leaked / The Stack Eats the Model
This Week's Radar:
- Perplexity Personal Computer: AI that manages your Mac, runs 24/7, starts tasks from iPhone
- Gemini for Mac: Free native desktop app, Option+Space overlay, screen awareness
- Gemini 3.1 Flash TTS: Audio tags for voice control, 70+ languages, SynthID watermarking
- Claude Managed Agents: Production infrastructure for deploying terminal agents at scale
- OpenAI-Cerebras $20B deal: OpenAI diversifying away from NVIDIA
- OpenAI $100/mo tier: Unlimited GPT-5.4, 10x Codex, between Plus and Pro