Don't trust large context windows

Summary

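The article argues that an LLM's effective context is far smaller than the advertised figure, and that past roughly 100k tokens the model typically enters a "dumb zone" where performance degrades. The author holds that auto-compaction cannot fully solve the problem and recommends managing context actively: move key information into standalone documents (such as a PRD or a plan), and open new sessions frequently so the model always works in the "smart zone".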

Why read it

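According to the article, RULER and Chroma's report show that effective context is only a fraction of the advertised window, and heavy workloads push a session outside the smart zone. The author's answer is a new session plus a self-written spec as the handoff artifact. Structured projects such as obra/superpowers and mattpocock/skills go further and build the entire workflow around small, named artifacts.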

Original article

May 06 2026

#note #ai #llm #tech #practices

I recently watched a video that put a name on something I'd been feeling. The author splits an LLM's context window into two zones. There's the smart zone, where the model is sharp, and the dumb zone, where attention drops off and the model starts forgetting what you told it five minutes ago. The cutoff sits somewhere around 100k tokens. It doesn't matter how big the advertised context window is.

This matters because coding agents will happily walk you straight into the dumb zone. A modern agent burns through tokens fast. A few file reads, a long debug session, a sprawling test run, and you're at 100k before lunch. Meanwhile vendors keep advertising windows of 200k, 1M, even 2M, as if those numbers represented a usable working set. They don't. Studies like RULER and Chroma's report on context rot show that effective context is a fraction of the advertised number, and that performance degrades gradually as you fill the window.

Large context windows are mostly a marketing number. The architectures behind them work, but they paper over a problem the underlying attention mechanism doesn't really solve. The number on the box gets bigger every release. The usable part doesn't keep up.

Modern agents are getting smart about this. Tools like Claude Code now auto-compact: when the session gets long, the agent summarizes the history and starts fresh. That helps. But auto-compaction kicks in after you've already spent time in the dumb zone, and the summary is itself produced by a model that's already degraded. Better than nothing, but I'd rather avoid the situation altogether.

What I do is open a new session and pass it a spec I wrote myself. That's a much higher-signal handoff than any automated summary, because I get to decide what matters going forward. It's the breadcrumb approach applied to agents: leave an artifact that the next session, or the next person, can pick up cleanly.
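As a rough illustration of what such a handoff artifact could look like, here is a minimal Python sketch that renders a spec as Markdown. The `HandoffSpec` class and its fields (`goal`, `decisions`, `open_questions`, `next_steps`) are hypothetical names chosen for this example. The post does not say what the author's spec actually contains, so treat the structure as one possible shape rather than the author's format.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffSpec:
    """Hypothetical container for what the next session needs to know."""

    goal: str
    decisions: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the spec as a small Markdown document."""
        sections = [
            ("Decisions so far", self.decisions),
            ("Open questions", self.open_questions),
            ("Next steps", self.next_steps),
        ]
        lines = ["# Handoff spec", "", "## Goal", "", self.goal]
        for title, items in sections:
            if not items:
                # Skip empty sections so the artifact stays short.
                continue
            lines += ["", f"## {title}", ""]
            lines += [f"- {item}" for item in items]
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    spec = HandoffSpec(
        goal="Finish the importer refactor.",
        decisions=["Keep the existing CLI flags."],
        next_steps=["Port the remaining tests."],
    )
    print(spec.to_markdown())
```

The point of writing it by hand is that you choose what goes into each section. The rendered text can be saved to a file and given to the new session as its first message.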

You can take this further. Projects like obra/superpowers and mattpocock/skills structure entire agent workflows around small, named artifacts: PRDs, plans, skills, sub-agent handoffs. Each one is a way to keep the working session in the smart zone by deliberately moving information out of the session and into something the next session can read.

So I treat my context window like a budget. I assume only the first chunk is really working for me, and everything I can move out of the live session and into a written artifact is one less thing for attention to fight over.
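The budget idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 100k limit is the rough cutoff mentioned earlier, the four-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and `ContextBudget`, `estimate_tokens`, and the 80% warning threshold are hypothetical names and values invented for this example.

```python
from dataclasses import dataclass

# Assumption: the rough "smart zone" cutoff of about 100k tokens.
SMART_ZONE_TOKENS = 100_000
# Assumption: crude heuristic of about 4 characters per token.
# A real tokenizer will give different counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class ContextBudget:
    """Tracks an estimate of how much of the smart zone a session has used."""

    limit: int = SMART_ZONE_TOKENS
    warn_ratio: float = 0.8  # hypothetical threshold for an early warning
    spent: int = 0

    def add(self, text: str) -> None:
        """Record text that entered the session (prompt, file read, output)."""
        self.spent += estimate_tokens(text)

    @property
    def remaining(self) -> int:
        return max(0, self.limit - self.spent)

    def status(self) -> str:
        """Return 'ok', 'warn', or 'handoff' based on estimated usage."""
        if self.spent >= self.limit:
            return "handoff"
        if self.spent >= self.limit * self.warn_ratio:
            return "warn"
        return "ok"
```

A session wrapper could call `add` on every message and file read, and treat a `"warn"` status as the cue to write the spec and start a fresh session before the budget runs out.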
