
Clawd Code Quick Architecture Read: One Page to Understand the Python-First Rewrite Workspace

Published: 2026-03-31
clawd-code claude-code python architecture audit

The latest margrop/clawd-code codebase is no longer just a preserved source archive. It is now a very explicit Python-first porting workspace: src/ is the active implementation surface, tests/ is the verification layer, archive/claude_code_ts_snapshot/ is an optional local archive, and src/reference_data/ is the source of mirrored command and tool inventories.

This short post answers one question only: how is the workspace organized right now?

1. Start with the shape

The current main.py exposes a command surface that is much broader than a simple summary script:

That tells you the project is not trying to be a monolithic clone. It is trying to expose a set of readable layers:

2. Method: command and tool surfaces are now data

Command surface

commands.py loads roughly 207 command entries from src/reference_data/commands_snapshot.json.
It can:

  1. fetch a single command
  2. search commands
  3. render an index
  4. return a mirrored execution message
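The four operations above can be sketched in a few lines. This is a minimal illustration, not the repository's code: the record fields (`name`, `responsibility`), the helper names, and the inline sample data are all assumptions made for the example, and the real snapshot schema may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandEntry:
    # Field names are assumptions; the real snapshot schema may differ.
    name: str
    responsibility: str


def load_commands(raw_json: str) -> list[CommandEntry]:
    """Parse a JSON array of command records into typed entries."""
    records = json.loads(raw_json)
    return [CommandEntry(r["name"], r.get("responsibility", "")) for r in records]


def get_command(entries, name):
    """Fetch a single command by case-insensitive exact name, or None."""
    wanted = name.lower()
    return next((e for e in entries if e.name.lower() == wanted), None)


def find_commands(entries, query):
    """Search names and responsibilities for a substring."""
    needle = query.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.responsibility.lower()
    ]


def render_index(entries):
    """Render a sorted, human-readable index of all commands."""
    ordered = sorted(entries, key=lambda e: e.name)
    return "\n".join(f"- {e.name}: {e.responsibility}" for e in ordered)


def execute_command(entries, name, prompt):
    """Return a mirrored execution message; nothing is actually executed."""
    entry = get_command(entries, name)
    if entry is None:
        return f"Unknown mirrored command: {name}"
    return f"Mirrored command '{entry.name}' would handle prompt {prompt!r}."


# Hypothetical sample data standing in for the snapshot file.
SAMPLE = (
    '[{"name": "review", "responsibility": "review a change"},'
    ' {"name": "compact", "responsibility": "compact the transcript"}]'
)
entries = load_commands(SAMPLE)
print(render_index(entries))
```

Keeping the inventory in a JSON file and the behavior in small pure functions is what lets the command surface be inspected without running anything.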

command_graph.py then splits those commands into:

That makes the command surface feel like a command regime, not a pile of scripts.

Tool surface

tools.py loads roughly 184 tool entries and adds policy filters:

ToolPermissionContext.blocks() can reject tools by exact name or prefix. That means the tool pool is a controlled capability surface, not an “all tools enabled” list.
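A deny-list check of this kind might look like the sketch below. Only the class name and the `blocks()` method come from the post; the field names (`deny_names`, `deny_prefixes`), the `filter_tools` helper, the tool names, and the choice of case-sensitive matching are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPermissionContext:
    # Field names are assumptions made for this sketch.
    deny_names: frozenset = frozenset()
    deny_prefixes: tuple = ()

    def blocks(self, tool_name: str) -> bool:
        """Return True if the tool is denied by exact name or by prefix."""
        if tool_name in self.deny_names:
            return True
        return any(tool_name.startswith(prefix) for prefix in self.deny_prefixes)


def filter_tools(tool_names, context):
    """Keep only the tools that the permission context does not block."""
    return [name for name in tool_names if not context.blocks(name)]


# Hypothetical tool names, used only to exercise the filter.
context = ToolPermissionContext(
    deny_names=frozenset({"BashTool"}),
    deny_prefixes=("mcp",),
)
allowed = filter_tools(["BashTool", "FileReadTool", "mcp_fetch"], context)
print(allowed)
```

Note that an empty string in `deny_prefixes` would block every tool, since every name starts with the empty prefix; a real implementation would probably want to reject that input.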

3. Body: sessions, history, and transcript

The runtime body lives mostly in query_engine.py and runtime.py.

QueryEnginePort

The core state is explicit:

submit_message() does not just emit text. It:

  1. checks the turn budget
  2. builds a summary
  3. updates usage
  4. sets a stop reason
  5. appends transcript state
  6. compacts if necessary
  7. returns a TurnResult
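The seven steps can be traced in a small stand-in class. Only `submit_message()` and `TurnResult` are named in the post; the class name `QueryEngineSketch`, its fields, the stop-reason strings, and the word-count proxy for token usage are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TurnResult:
    prompt: str
    output: str
    stop_reason: str
    input_tokens: int
    output_tokens: int


@dataclass
class QueryEngineSketch:
    # All field names and defaults here are assumptions for illustration.
    max_turns: int = 8
    compact_after_turns: int = 4
    turns_used: int = 0
    input_tokens: int = 0
    output_tokens: int = 0
    transcript: list = field(default_factory=list)

    def submit_message(self, prompt: str) -> TurnResult:
        # 1. Check the turn budget before doing any work.
        if self.turns_used >= self.max_turns:
            return TurnResult(
                prompt, "Turn budget exhausted.", "max_turns_reached",
                self.input_tokens, self.output_tokens,
            )
        # 2. Build a summary (a placeholder for real output).
        output = f"Summary of prompt: {prompt}"
        # 3. Update usage, using word counts as a crude token proxy.
        self.input_tokens += len(prompt.split())
        self.output_tokens += len(output.split())
        # 4. Set a stop reason.
        stop_reason = "completed"
        # 5. Append transcript state.
        self.transcript.append(prompt)
        self.turns_used += 1
        # 6. Compact: keep only the most recent entries.
        if len(self.transcript) > self.compact_after_turns:
            self.transcript = self.transcript[-self.compact_after_turns:]
        # 7. Return a structured result instead of a bare string.
        return TurnResult(
            prompt, output, stop_reason, self.input_tokens, self.output_tokens
        )
```

Tracking `turns_used` separately from the transcript length matters in this sketch: once compaction trims the transcript, its length no longer reflects how many turns were spent.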

The streaming path emits a real event sequence:

History and persistence

HistoryLog stores stage-level events. TranscriptStore keeps replayable prompt history. StoredSession is the persisted snapshot.
That is what makes the runtime a body: state is not hidden, and state can survive.
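To make "state can survive" concrete, here is a rough sketch of a session snapshot round-tripping through a JSON file. `StoredSession` is named in the post, but its fields, the `save_session` and `load_session` helpers, and the one-file-per-session layout are assumptions for this example.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class StoredSession:
    # Field names are assumptions; the real snapshot may carry more state.
    session_id: str
    messages: tuple
    input_tokens: int
    output_tokens: int


def save_session(session: StoredSession, directory: Path) -> Path:
    """Write the session as JSON, one file per session id."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{session.session_id}.json"
    path.write_text(json.dumps(asdict(session), indent=2), encoding="utf-8")
    return path


def load_session(session_id: str, directory: Path) -> StoredSession:
    """Read a session back, restoring the tuple that JSON stored as a list."""
    data = json.loads((directory / f"{session_id}.json").read_text(encoding="utf-8"))
    return StoredSession(
        session_id=data["session_id"],
        messages=tuple(data["messages"]),
        input_tokens=data["input_tokens"],
        output_tokens=data["output_tokens"],
    )


original = StoredSession("demo", ("first prompt", "second prompt"), 4, 10)
with tempfile.TemporaryDirectory() as tmp:
    save_session(original, Path(tmp))
    restored = load_session("demo", Path(tmp))
print(restored == original)
```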

4. Technique: routing and assembly

runtime.py is where the workspace starts to behave like a runtime instead of a catalog.

Prompt routing

PortRuntime.route_prompt() uses transparent token scoring:

  1. normalize the prompt
  2. split into tokens
  3. match against module names, source hints, and responsibilities
  4. score matches
  5. select the best ones

There is no embedding black box here. It is simple, explainable, and easy to audit.
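The five steps translate almost directly into code. The sketch below is one plausible reading, not the actual `route_prompt()` implementation: the `Candidate` and `RoutedMatch` types, substring matching, one point per matching token, and the tie-break by name are all assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record for a routable command or tool.
    kind: str
    name: str
    source_hint: str
    responsibility: str


@dataclass(frozen=True)
class RoutedMatch:
    kind: str
    name: str
    score: int


def tokenize(prompt: str) -> set:
    """Normalize to lowercase and split on anything that is not a letter or digit."""
    return {token for token in re.split(r"[^a-z0-9]+", prompt.lower()) if token}


def route_prompt(prompt: str, candidates, limit: int = 3):
    """Score each candidate by how many prompt tokens appear in its fields."""
    tokens = tokenize(prompt)
    matches = []
    for candidate in candidates:
        haystacks = (
            candidate.name.lower(),
            candidate.source_hint.lower(),
            candidate.responsibility.lower(),
        )
        # One point per token that occurs in any of the three fields.
        score = sum(1 for t in tokens if any(t in h for h in haystacks))
        if score > 0:
            matches.append(RoutedMatch(candidate.kind, candidate.name, score))
    # Highest score first; ties are broken alphabetically for determinism.
    matches.sort(key=lambda m: (-m.score, m.name))
    return matches[:limit]


# Hypothetical candidates, used only to exercise the scorer.
candidates = [
    Candidate("command", "review", "commands/review", "review a code change"),
    Candidate("tool", "FileReadTool", "tools/FileReadTool", "read a file from disk"),
    Candidate("command", "compact", "commands/compact", "compact the transcript"),
]
for match in route_prompt("read the file", candidates):
    print(match.kind, match.name, match.score)
```

Because the score is just a count of matched tokens, any routing decision can be explained by listing which tokens hit which field.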

Session bootstrap

bootstrap_session() assembles the entire flow:

  1. build workspace context
  2. run setup with trusted=True
  3. record history
  4. route the prompt
  5. build an execution registry
  6. execute mirrored command/tool shims
  7. infer permission denials
  8. submit and stream a turn
  9. persist the session

The result is a full RuntimeSession report instead of a single string. That is a good sign: the system wants the process to be inspectable.

5. Startup: prefetch first, trust gate second

The startup chain is one of the strongest parts of the rewrite.

setup.py, prefetch.py, deferred_init.py, and bootstrap_graph.py define an order:

  1. top-level prefetch side effects
  2. warning handler and environment guards
  3. CLI parser and pre-action trust gate
  4. setup() + commands/agents parallel load
  5. deferred init after trust
  6. mode routing
  7. query engine submit loop

run_setup() performs the prefetch work first.
run_deferred_init(trusted) turns trust into four switches:

The rule is simple: decide trust first, then decide capability.

6. Audit: parity is the conscience

parity_audit.py is not a vanity metric. It is the project’s self-check.

It compares:

  1. root file coverage
  2. directory coverage
  3. Python file count vs archived TS-like count
  4. command snapshot coverage
  5. tool snapshot coverage
  6. missing targets

Three numbers matter most:

That says this is a serious surface-mirroring project, not a tiny demo.
Just as important, the audit does not pretend equivalence when the archive is unavailable.
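The "no archive, no claim" behavior can be shown with a small coverage check. This is a sketch under stated assumptions: `CoverageResult`, `audit_coverage`, `missing_targets`, and the file names are invented for the example, and the real audit compares more dimensions than a flat list of expected file names.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class CoverageResult:
    label: str
    covered: int
    total: int

    @property
    def ratio(self) -> float:
        return self.covered / self.total if self.total else 0.0


def audit_coverage(label, expected_names, ported_root: Path,
                   archive_root: Path) -> Optional[CoverageResult]:
    """Count expected targets present in the port; None if no archive exists."""
    if not archive_root.is_dir():
        # Without the archive there is nothing to compare against,
        # so report "unknown" rather than a misleading number.
        return None
    covered = sum(1 for name in expected_names if (ported_root / name).exists())
    return CoverageResult(label, covered, len(expected_names))


def missing_targets(expected_names, ported_root: Path):
    """List the expected targets that the port does not yet contain."""
    return sorted(n for n in expected_names if not (ported_root / n).exists())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "commands.py").touch()
    expected = ["commands.py", "tools.py"]  # hypothetical target names
    no_archive = audit_coverage("root files", expected, root / "src", root / "archive")
    (root / "archive").mkdir()
    with_archive = audit_coverage("root files", expected, root / "src", root / "archive")
    still_missing = missing_targets(expected, root / "src")

print(no_archive, with_archive, still_missing)
```

Returning `None` instead of a zero or full ratio keeps "we could not check" distinct from "we checked and found gaps".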

7. One-line memory aid

If you want the quick Dao / Method / Body / Technique map:

That is the current shape of the workspace.

References

  1. https://github.com/margrop/clawd-code
  2. https://github.com/XingP14/claude-code
  3. https://github.com/ghuntley/claude-code-source-code-deobfuscation