If you’ve ever tried using AI on your inbox, you’ve probably felt two competing emotions.
The first is relief: finally, something that can turn 73 unread messages into a short list of what matters.
The second is dread: if you let this thing send a reply, you might wake up to a minor social crisis you didn’t authorize.
That tension is the whole game.
The fastest way to make an “inbox agent” actually usable is to start with a hard constraint: advice-only. The agent can read, classify, summarize, and draft. It cannot send. It cannot archive. It cannot unsubscribe. It cannot do anything irreversible.
That sounds limiting, but in practice it’s what turns AI from a novelty into a habit. You keep your voice. You keep your relationships. You still get the leverage.
This post lays out a workflow you can copy in a weekend: how to choose inputs, pick defaults, shape the output so it fits on one screen, and handle the ways it fails.
What “inbox triage” actually means
Inbox triage is not “write emails faster.”
It’s “reduce decisions.”
Most inbox stress isn’t the typing. It’s the constant tiny judgment calls: is this urgent, is it worth responding, should I schedule something, do I owe someone a yes/no, is this just noise.
A good triage artifact should do three things:
- separate the few messages that can hurt you if ignored,
- turn the rest into a small set of queued decisions,
- give you enough context to act without re-reading five threads.
If your triage output makes you open more tabs, you didn’t build triage. You built a new kind of procrastination.
The core rule: advice-only until trust is earned
Here’s the operational boundary I recommend:
The agent may propose actions and draft text, but it may not take actions.
It’s not just about avoiding embarrassment. It’s about feedback loops.
When an agent can send things, you feel like you need to supervise it continuously. Continuous supervision is how these projects die. Advice-only lets you run it on a schedule and only intervene when something looks off.
If you want a default timeline: keep it advice-only for at least two weeks of daily usage. If it’s still reliably helpful, you can add one “low-risk action” later (for example: moving obvious newsletters into a folder). One at a time.
Inputs: choose one channel and make it boring
The trap is to start with “my whole digital life.” Email, Slack, iMessage, LinkedIn, calendar invites, GitHub notifications, all at once.
Don’t.
Pick one:
- If your real inbox pain is email, start with email.
- If your real pain is one chat system, start there.
Then set a hard bound: “unread messages from the last 24 hours” or “the newest 50 items.” You’re building a recurring artifact, not a perfect archive.
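Here is a minimal sketch of that bound in Python. It applies both limits at once (a 24-hour window and a 50-item cap), which is one reasonable reading of the rule rather than the only one. The `Message` record and the `bound_inbox` name are hypothetical; a real mail or chat API will hand you something richer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Message:
    # Hypothetical minimal record; real inbox APIs return many more fields.
    sender: str
    subject: str
    received_at: datetime
    unread: bool = True


def bound_inbox(messages, now, max_age_hours=24, max_items=50):
    """Return (kept, over_cap): recent unread messages, newest first, capped."""
    cutoff = now - timedelta(hours=max_age_hours)
    recent = [m for m in messages if m.unread and m.received_at >= cutoff]
    recent.sort(key=lambda m: m.received_at, reverse=True)
    # over_cap tells the caller that some recent unread items were dropped.
    over_cap = len(recent) > max_items
    return recent[:max_items], over_cap
```

Passing `now` in explicitly (for example `datetime.now(timezone.utc)`) keeps the function easy to test and avoids mixing naive and timezone-aware timestamps.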
Also decide whether the agent sees the full message body or just the sender, subject, and snippet. Full bodies produce better summaries and draft replies, but they expose more private content to the agent. If you’re doing this for work, start with sender, subject, and snippet, and widen only if the summaries turn out too thin.
The output template that makes this work
Most inbox tools fail because the output is too long.
Triage is only useful if it fits on one screen and you can skim it in under a minute.
Use a fixed template. Here’s one that works well for a daily run:
Inbox triage (last 24h)
Urgent (today) …
Waiting on you …
Can ignore …
Drafts (ask-first) …
The secret is that each line should be written as a decision, not a summary. “Reply with yes/no.” “Schedule.” “No action.”
If you get the structure right, you stop doing inbox archaeology.
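To make the fixed template concrete, here is a small rendering sketch. It assumes each entry has already been reduced to a (subject, decision) pair; the `render_triage` name and the "- none" placeholder for empty sections are my own choices, not part of any tool.

```python
# Section names follow the template above; order is fixed so the output
# looks the same every day.
SECTIONS = ("Urgent (today)", "Waiting on you", "Can ignore", "Drafts (ask-first)")


def render_triage(decisions, window="last 24h"):
    """Render {section: [(subject, decision), ...]} as a one-screen message."""
    lines = [f"Inbox triage ({window})"]
    for section in SECTIONS:
        lines.append(section)
        entries = decisions.get(section, [])
        if not entries:
            # Hypothetical placeholder so an empty section is still visible.
            lines.append("- none")
        for subject, decision in entries:
            # Every line ends in a decision, not a summary.
            lines.append(f"- {subject} → {decision}")
    return "\n".join(lines)
```

Because the section order never changes, you learn where to look, and an empty "Urgent (today)" section becomes a signal in its own right.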
A concrete workflow (what to implement)
Run the agent once per day at a consistent time. Twice per day if your inbox is heavy, but avoid “always on.” Always-on creates always-anxious.
When it runs, it should:
- pull the newest N unread items,
- collapse each thread into a short, neutral summary,
- classify each thread into one of a small number of buckets,
- propose the next action.
Drafting replies is optional, and you should treat it like a power tool: useful, but easy to misuse.
I’m being deliberate about “optional.” Drafting for everything is how you get spammy output that you don’t trust. Start by drafting only for messages in the “Waiting on you” bucket where the reply is unambiguous.
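The run described above can be sketched as a short loop. Everything model-specific is passed in as a function, so `summarize`, `classify`, and `draft` are hypothetical hooks you would back with whatever model or rules you use. The sketch assumes `classify` returns a (bucket, action, unambiguous) triple, which is a convention invented here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

BUCKETS = ("Urgent (today)", "Waiting on you", "Can ignore")


@dataclass
class TriageItem:
    summary: str
    bucket: str
    action: str
    draft: Optional[str] = None


def triage(threads, summarize, classify, draft=None):
    """Summarize, bucket, and propose an action for each thread.

    A draft is produced only for 'Waiting on you' threads that the
    classifier marked as unambiguous, and only if a draft hook is given.
    """
    items = []
    for thread in threads:
        bucket, action, unambiguous = classify(thread)
        if bucket not in BUCKETS:
            raise ValueError(f"unknown bucket: {bucket!r}")
        item = TriageItem(summary=summarize(thread), bucket=bucket, action=action)
        if draft is not None and bucket == "Waiting on you" and unambiguous:
            item.draft = draft(thread)
        items.append(item)
    return items
```

Note that nothing in this loop sends, archives, or labels anything. It only returns data for you to read.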
Example output (what it should actually send you)
Here’s a realistic advice-only triage message:
Inbox triage (last 24h)
Urgent (today)
- Doc to sign (deadline today) → open + sign
Waiting on you
- Scheduling: 2 threads want times → pick 3 slots
- Quick yes/no: “Are you joining the call?” → reply yes
Can ignore
- 6 updates/newsletters (no reply)
Drafts (ask-first)
- I can draft a 3-sentence response for the scheduling threads. Want me to?
Notice what’s missing: a wall of text.
The point is not to capture everything. It’s to make the next 10 minutes of inbox work smoother.
Defaults that prevent this from becoming annoying
If you want speed, set defaults now and don’t bike-shed them later.
Cadence: daily is usually right. If you do twice per day, pick two specific times.
N: 30–60 unread items is a good bound. If you regularly exceed that, your triage output should include “you’re over the cap” as a signal.
Buckets: keep it to 3–4 buckets total. Humans can’t hold seven categories in their head.
Length cap: one screen. If it can’t fit, it must compress harder.
Drafting rule: start with “draft only on request.” Then expand to “draft for the top 1–3 messages that clearly need a reply.”
Escalation: one retry + one alert if the inbox API fails. Otherwise stay quiet.
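The escalation default can be sketched in a few lines. `fetch` and `alert` are hypothetical callables: the first pulls your inbox, the second notifies you however you prefer. This version retries immediately with no delay, which is a simplification; a short pause between attempts is probably wise in practice.

```python
def fetch_with_retry(fetch, alert, retries=1):
    """Call fetch(); on failure retry `retries` times, then alert exactly once."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return fetch()
        except Exception as exc:  # broad on purpose: any failure counts
            last_error = exc
    alert(f"Inbox fetch failed after {retries + 1} attempts: {last_error}")
    return None
```

On success the function says nothing, which matches the "otherwise stay quiet" rule. The caller should skip the triage message entirely when the result is `None`.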
Permission boundaries (make trust explicit)
Advice-only is the big boundary. But it’s worth being explicit about smaller ones.
The agent should not:
- unsubscribe you (even if it’s obvious),
- archive or delete (even if it’s spam),
- “follow up” with people,
- accept calendar invites,
- send messages in your voice.
If you later decide to allow one action, choose one that is reversible and socially harmless. Think “label this thread” rather than “reply.”
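One way to make these boundaries explicit is a deny-by-default check in code, so that a new capability has to be added on purpose. The action names and the `require_permission` helper below are illustrative assumptions, not an existing API.

```python
# Advice-only: the agent may read and propose, nothing else.
ALLOWED_ACTIONS = frozenset({"read", "classify", "summarize", "draft"})


class ActionNotPermitted(Exception):
    """Raised when the agent attempts something outside its allowlist."""


def require_permission(action, allowed=ALLOWED_ACTIONS):
    """Deny by default: anything not explicitly allowed is refused."""
    if action not in allowed:
        raise ActionNotPermitted(f"agent may not perform: {action}")
```

If you later allow one reversible action, you would pass something like `ALLOWED_ACTIONS | {"label"}`, and the change shows up as a one-line diff you can review.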
Failure modes (and what to do when they happen)
You don’t need a sophisticated agent to get value. You do need a plan for the ways it breaks.
Failure mode 1: The agent is confidently wrong about urgency
What it looks like: it puts a low-stakes newsletter in “Urgent,” or it misses a time-sensitive request because the subject line is vague.
How you catch it: add one line in the triage output that lists the “top 5 newest items” by sender/subject, even if classified as ignorable. That gives you a quick sanity check without re-reading everything.
How you fix it: tighten the urgency heuristic. For example: “urgent only if it has an explicit deadline/time, or it’s from a short allowlist of senders.” If you don’t want to maintain a sender allowlist, rely on the time signal alone: anything that mentions today/tomorrow, a calendar time, or the word “deadline.”
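A rough version of that heuristic fits in one regular expression plus an optional allowlist. The pattern below is a guess at what "mentions a time" means (words like today or deadline, or clock times such as 5pm and 3:30 PM). It will produce false positives, for example a newsletter titled "Today’s digest", so treat it as a starting point to tune.

```python
import re

# Matches "today", "tomorrow", "deadline", or a clock time like "5pm" / "3:30 PM".
TIME_SIGNAL = re.compile(
    r"\b(today|tomorrow|deadline)\b|\b\d{1,2}(:\d{2})?\s?(am|pm)\b",
    re.IGNORECASE,
)


def is_urgent(sender, text, allowlist=frozenset()):
    """Urgent if the sender is allowlisted or the text carries a time signal.

    The allowlist is assumed to hold lowercase addresses.
    """
    if sender.lower() in allowlist:
        return True
    return TIME_SIGNAL.search(text) is not None
```

Run it over the subject plus snippet, or the full body if you have granted the agent access to it.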
Failure mode 2: Draft replies sound like a robot
What it looks like: generic pleasantries, overly formal tone, or it agrees to things too eagerly.
How you catch it: the agent should label drafts as drafts, and you should treat them like autocomplete, not like a final answer.
How you fix it: create a short “voice note” the agent uses for drafting: two sentences on tone, one sentence on how you say no, one sentence on how you propose times. This is the highest leverage personalization you can do.
Failure mode 3: It creates more work than it saves
What it looks like: you spend longer reading the triage than you would have spent scanning your inbox.
How you catch it: measure it. Time yourself reading the triage for three days. If it regularly takes longer than 60 seconds to read, it’s too long.
How you fix it: reduce N. Reduce buckets. Remove summaries from the “Can ignore” bucket. Limit draft replies. You’re compressing for action, not for completeness.
Failure mode 4: It silently stops running
What it looks like: the run fails quietly, and because you had stopped thinking about it, you don’t notice for a week.
How you catch it: store the last-success timestamp somewhere durable, and if it hasn’t run for 48 hours, send one alert.
How you fix it: don’t add more automation. Make the trigger boring and reliable. A scheduled run beats clever event-based triggers here.
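The last-success check for this failure mode can be sketched with a small JSON state file. The file layout, the `alerted` flag, and both function names are assumptions made for this example. The flag is what keeps it to one alert per outage instead of one per check.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def record_success(path, now):
    """Call at the end of every successful run; also clears the alerted flag."""
    Path(path).write_text(json.dumps({"last_success": now.isoformat()}))


def check_heartbeat(path, now, alert, max_age_hours=48):
    """Return True if triage is stale; alert once per stale period."""
    p = Path(path)
    state = json.loads(p.read_text()) if p.exists() else {}
    last = state.get("last_success")
    stale = last is None or (
        now - datetime.fromisoformat(last) > timedelta(hours=max_age_hours)
    )
    if stale and not state.get("alerted"):
        alert(f"Inbox triage has not succeeded in the last {max_age_hours} hours")
        state["alerted"] = True
        p.write_text(json.dumps(state))
    return stale
```

The check has to run from a separate schedule than the triage job itself, otherwise it stops at the same moment the thing it is watching does. Use timezone-aware timestamps throughout, for example `datetime.now(timezone.utc)`.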
The “upgrade path” if you want more power later
Once you’ve used advice-only triage for a couple of weeks, you’ll know what you actually want.
Some people want better summaries. Some want better drafting. Some want the agent to surface “open loops” (things you promised and forgot).
A safe progression looks like this:
- Start with advice-only triage.
- Then add drafting for only the top 1–3 items.
- Then add a “follow-up reminders” section (still no sending).
- If you want one real action, make it reversible (labeling, moving to a folder).
Anything beyond that is where you should demand real monitoring.
If you skip straight to autonomy, you’ll spend your time apologizing.
Closing: the goal is a calmer inbox, not a faster typist
The reason this works is that it respects the real constraint: you can delegate sorting, but you can’t delegate relationships.
Advice-only triage gives you leverage without giving up control.
If you implement this and stick to one screen of output, you’ll feel the difference in a week.