AI订阅指南 (AI Subscription Guide)

The agent plan had every step except where to stop

  • 霜华逸梦 #1

    Source: https://dev.to/michaeltruong/the-agent-plan-had-every-step-except-where-to-stop-357h


    I've been running multi-slice agent plans — Renovate migrations, content-pipeline skills, dependency upgrades. I split multi-PR work into slices, each backed by a markdown file with file paths, verification commands, and merge-safe acceptance criteria.

    I assumed the checklist was enough. The plan described what to build. I treated how far the agent could go as implicit. Then an agent merged a pull request I expected to review first.

    The trigger was mundane. During the first slice of a Renovate migration, an agent regrouped dependency buckets in renovate.json — config-only, no version bumps, no runtime behavior. It ran lint and typecheck, opened the pull request, and merged it. The change itself was reasonable.

    What surprised me was the absence of a documented stop line. The migration plan described the edit, the verification commands, and the acceptance criteria. It did not say whether the executing agent should stop at "open PR" or continue to "merge after green checks." The plan was an implementation spec. The agent treated it as permission to finish the job.

    Traditional engineering plans answer: what work should happen, in what order, with what verification? Agent plans increasingly need a second answer: how much autonomy does the next actor get?

    My first reaction was to tighten the repository boundary — branch protection became the safety layer. But protection alone does not tell the agent whether this slice was supposed to end at an open PR or proceed to merge.
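
The gap between what the repository allows and what the plan intends can be sketched as two separate inputs. This is a minimal illustration, not the author's tooling: `next_action`, its parameters, and the `"stop-and-ask"` outcome are all hypothetical names chosen for the example.

```python
from typing import Optional


def next_action(repo_allows_merge: bool, plan_grants_merge: Optional[bool]) -> str:
    """Decide what an agent does once checks are green.

    repo_allows_merge: what branch protection permits (assumed input).
    plan_grants_merge: what the slice says; None means the plan is silent.
    """
    if plan_grants_merge is None:
        # Branch protection cannot fill in missing intent, so this sketch
        # assumes the agent pauses instead of guessing.
        return "stop-and-ask"
    if not plan_grants_merge:
        return "open-pr"
    # The plan grants merge, but the repository still has the final say.
    return "merge" if repo_allows_merge else "open-pr"
```

Even with `repo_allows_merge=True`, the result depends on the second argument, which is the part the original migration plan never stated.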

    The portable fix is for every slice to name exactly how far the executor may go. There are two levels: the default, Open PR only, and the elevated level, Merge granted, which requires an explicit rationale. Each slice also states its Rationale and an Agent instruction, and that instruction is copied verbatim into the prompt.
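
One way to picture a slice that carries its own stop line is a small record with an autonomy level. This is a sketch under assumptions: the post describes markdown slice files, not Python, and the `Slice` and `Autonomy` names, the field layout, and the prompt format below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    OPEN_PR_ONLY = "Open PR only"    # default level
    MERGE_GRANTED = "Merge granted"  # elevated level


@dataclass(frozen=True)
class Slice:
    name: str
    agent_instruction: str
    autonomy: Autonomy = Autonomy.OPEN_PR_ONLY
    rationale: str = ""

    def __post_init__(self) -> None:
        # The elevated level must be justified explicitly.
        if self.autonomy is Autonomy.MERGE_GRANTED and not self.rationale.strip():
            raise ValueError(f"slice {self.name!r}: merge granted without a rationale")

    def render_prompt(self) -> str:
        # The agent instruction is passed through unchanged (verbatim).
        return "\n".join([
            f"Slice: {self.name}",
            f"Autonomy: {self.autonomy.value}",
            f"Rationale: {self.rationale or 'default level'}",
            f"Agent instruction: {self.agent_instruction}",
        ])


if __name__ == "__main__":
    regroup = Slice(
        name="renovate-regroup",  # hypothetical slice name
        agent_instruction="Open the pull request and stop. Do not merge.",
    )
    print(regroup.render_prompt())
```

Making Open PR only the default means a slice that says nothing about merging still ends at review, and granting merge becomes a deliberate act that fails loudly when the rationale is missing.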

    The lesson is narrower: once agents act, plans delegate autonomy whether you write that down or not. Human delegation has always been fuzzy — "take a pass at this" means different things to different people. Agent delegation punishes ambiguity faster because the agent will complete every step it can justify from the text in front of it.

    • 浅笑气节 #2

      I had been using GPT all along; last week I tried Claude, and its coding ability really is stronger.

      • galaxyquest #3

        What are the limits on the free tier? How many hours can you use it?

        • 养只猫叫花花 #4

          Has the API pricing been announced? Is it friendly to small teams?

          • 拾光忘机 #5

            I had been using GPT all along; last week I tried Claude, and its coding ability really is stronger.

            • techguru27 #6

              The UI is well done, but how do the core capabilities compare with open-source alternatives?

              • 忘川 #7

                What are the limits on the free tier? How many hours can you use it?

                • 浅忆横笛 #8

                  I gave it a try and it really is good to use; I'm planning to replace the tools I used before.

                  • ioncloud #9

                    What are the limits on the free tier? How many hours can you use it?

                    • 低吟旅人 #10

                      I had been using GPT all along; last week I tried Claude, and its coding ability really is stronger.
