Clawd Code Quick Architecture Read: One Page to Understand the Python-First Rewrite Workspace

Published: 2026-03-31

The latest margrop/clawd-code codebase is no longer just a preserved source archive. It is now a very explicit Python-first porting workspace: src/ is the active implementation surface, tests/ is the verification layer, archive/claude_code_ts_snapshot/ is an optional local archive, and src/reference_data/ is the source of mirrored command and tool inventories.

This short post answers one question only: how is the workspace organized right now?

1. Start with the shape

The current main.py exposes a command surface that is much broader than a simple summary script:

workspace summary
manifest
parity audit
setup report
command graph
tool pool
bootstrap graph
command / tool inventories
routing / bootstrap / turn loop
session load / flush / persist
remote / ssh / teleport / direct-connect / deep-link branches

That tells you the project is not trying to be a monolithic clone. It is trying to expose a set of readable layers:

port_manifest for workspace shape
commands / tools for mirrored inventories
query_engine / runtime for session behavior
setup / bootstrap_graph for startup order
parity_audit for coverage and drift

2. Method: command and tool surfaces are now data

Command surface

commands.py loads roughly 207 command entries from src/reference_data/commands_snapshot.json.
It can:

fetch a single command
search commands
render an index
return a mirrored execution message
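
To make the "commands as data" idea concrete, here is a minimal sketch of loading a snapshot and querying it. The field names (name, source_hint, responsibility), the helper names, and the sample records are assumptions for illustration; the real commands.py and commands_snapshot.json may be shaped differently.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandEntry:
    # Hypothetical entry shape; the real snapshot fields may differ.
    name: str
    source_hint: str = ""
    responsibility: str = ""


def load_commands(raw_json: str) -> tuple[CommandEntry, ...]:
    """Parse a JSON array of command records into immutable entries."""
    return tuple(
        CommandEntry(
            name=item["name"],
            source_hint=item.get("source_hint", ""),
            responsibility=item.get("responsibility", ""),
        )
        for item in json.loads(raw_json)
    )


def get_command(entries, name):
    """Return the entry whose name matches case-insensitively, or None."""
    wanted = name.lower()
    return next((e for e in entries if e.name.lower() == wanted), None)


def find_commands(entries, query, limit=20):
    """Substring search over names and source hints."""
    needle = query.lower()
    hits = [
        e for e in entries
        if needle in e.name.lower() or needle in e.source_hint.lower()
    ]
    return hits[:limit]


def render_index(entries):
    """Render one line per command."""
    return "\n".join(f"- {e.name} ({e.source_hint})" for e in entries)


# Made-up sample data standing in for the real snapshot file.
SAMPLE = json.dumps([
    {"name": "review", "source_hint": "commands/review.ts"},
    {"name": "compact", "source_hint": "commands/compact/index.ts"},
])
commands = load_commands(SAMPLE)
```

Because the inventory is plain data, fetch, search, and index rendering are small pure functions over one tuple.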

command_graph.py then splits those commands into:

builtins
plugin_like
skill_like

That gives the command surface a classified structure instead of leaving it as a pile of scripts.

Tool surface

tools.py loads roughly 184 tool entries and adds policy filters:

simple_mode
include_mcp
ToolPermissionContext

ToolPermissionContext.blocks() can reject tools by exact name or prefix. That means the tool pool is a controlled capability surface, not an “all tools enabled” list.
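
A minimal sketch of what exact-name and prefix blocking could look like. The field names (deny_names, deny_prefixes), the from_iterables constructor, the filter_tools helper, and the simple-mode allowlist are assumptions, not the project's confirmed API; only the blocks() behavior is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPermissionContext:
    # Hypothetical fields: lowercase names and prefixes to reject.
    deny_names: frozenset[str] = frozenset()
    deny_prefixes: tuple[str, ...] = ()

    @classmethod
    def from_iterables(cls, deny_names=(), deny_prefixes=()):
        """Normalize inputs to lowercase so matching is case-insensitive."""
        return cls(
            deny_names=frozenset(n.lower() for n in deny_names),
            deny_prefixes=tuple(p.lower() for p in deny_prefixes),
        )

    def blocks(self, tool_name: str) -> bool:
        """True if the tool is denied by exact name or by prefix."""
        lowered = tool_name.lower()
        # str.startswith accepts a tuple; an empty tuple never matches.
        return lowered in self.deny_names or lowered.startswith(self.deny_prefixes)


# Assumed allowlist for simple_mode; the real set is not stated in the source.
SIMPLE_MODE_TOOLS = frozenset({"bashtool", "filereadtool", "fileedittool"})


def filter_tools(names, *, simple_mode=False, include_mcp=True, permission_context=None):
    """Apply the three policy filters in order and return surviving tool names."""
    selected = list(names)
    if simple_mode:
        selected = [n for n in selected if n.lower() in SIMPLE_MODE_TOOLS]
    if not include_mcp:
        # Assumption: MCP tools are recognizable by "mcp" in the name.
        selected = [n for n in selected if "mcp" not in n.lower()]
    if permission_context is not None:
        selected = [n for n in selected if not permission_context.blocks(n)]
    return selected


ctx = ToolPermissionContext.from_iterables(deny_names=["BashTool"], deny_prefixes=["mcp"])
```

The point of the design is that the pool is computed from policy each time, so a denied tool never reaches the caller.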

3. Body: sessions, history, and transcript

The runtime body lives mostly in query_engine.py and runtime.py.

`QueryEnginePort`

The core state is explicit:

session_id
mutable_messages
permission_denials
total_usage
transcript_store

submit_message() does not just emit text. It:

checks the turn budget
builds a summary
updates usage
sets a stop reason
appends transcript state
compacts if necessary
returns a TurnResult
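
The steps above can be sketched as a single method. This is an illustration under assumptions, not the real QueryEnginePort: the class name EngineSketch, the field names, the stop-reason strings, and the use of word counts as a stand-in for token usage are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TurnResult:
    output: str
    stop_reason: str
    input_tokens: int
    output_tokens: int


@dataclass
class EngineSketch:
    max_turns: int = 8       # assumed turn budget
    compact_after: int = 3   # assumed transcript size before compaction
    messages: list[str] = field(default_factory=list)
    transcript: list[str] = field(default_factory=list)
    input_tokens: int = 0
    output_tokens: int = 0

    def submit_message(self, prompt: str) -> TurnResult:
        # 1. Check the turn budget before doing any work.
        if len(self.messages) >= self.max_turns:
            return TurnResult("turn budget exhausted", "max_turns_reached",
                              self.input_tokens, self.output_tokens)
        # 2. Build a summary of the turn.
        output = f"Prompt: {prompt}"
        # 3. Update usage (word counts stand in for real token counts).
        self.input_tokens += len(prompt.split())
        self.output_tokens += len(output.split())
        # 4. Set a stop reason.
        stop_reason = "completed"
        # 5. Append message and transcript state.
        self.messages.append(prompt)
        self.transcript.append(prompt)
        # 6. Compact: keep only the most recent transcript entries.
        if len(self.transcript) > self.compact_after:
            self.transcript[:] = self.transcript[-self.compact_after:]
        # 7. Return a structured result instead of a bare string.
        return TurnResult(output, stop_reason, self.input_tokens, self.output_tokens)
```

Returning a TurnResult keeps the stop reason and usage next to the text, which is what lets callers inspect a turn rather than parse it.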

The streaming path emits a real event sequence:

message_start
command_match
tool_match
permission_denial
message_delta
message_stop
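
One way to picture that sequence is a generator that yields one dictionary per event. The event names come from the list above; the function name stream_turn, the payload keys, and the choice to emit the three middle events only when there is something to report are assumptions.

```python
from typing import Iterator


def stream_turn(prompt, matched_commands=(), matched_tools=(), denied_tools=()) -> Iterator[dict]:
    """Yield turn events in a fixed order; payload keys are illustrative."""
    yield {"type": "message_start", "prompt": prompt}
    if matched_commands:
        yield {"type": "command_match", "commands": list(matched_commands)}
    if matched_tools:
        yield {"type": "tool_match", "tools": list(matched_tools)}
    if denied_tools:
        yield {"type": "permission_denial", "denials": list(denied_tools)}
    yield {"type": "message_delta", "text": f"Prompt: {prompt}"}
    yield {"type": "message_stop", "stop_reason": "completed"}


events = list(stream_turn("run tests", matched_tools=["BashTool"], denied_tools=["BashTool"]))
event_types = [event["type"] for event in events]
```

A consumer can iterate the generator lazily and react to each event type as it arrives.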

History and persistence

HistoryLog stores stage-level events. TranscriptStore keeps replayable prompt history. StoredSession is the persisted snapshot.
That is what makes the runtime a body: state is not hidden, and state can survive.
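
As a rough illustration of "state can survive", the sketch below writes a session snapshot to JSON and reads it back. The StoredSession fields, the helper names, and the one-file-per-session layout are assumptions; the project's actual persistence format is not described here.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class StoredSession:
    # Hypothetical snapshot fields.
    session_id: str
    messages: tuple[str, ...]
    input_tokens: int
    output_tokens: int


def save_session(session: StoredSession, directory: Path) -> Path:
    """Write the session as <session_id>.json and return the file path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{session.session_id}.json"
    path.write_text(json.dumps(asdict(session), indent=2))
    return path


def load_session(session_id: str, directory: Path) -> StoredSession:
    """Read a session back; JSON lists are converted back to tuples."""
    data = json.loads((directory / f"{session_id}.json").read_text())
    return StoredSession(
        session_id=data["session_id"],
        messages=tuple(data["messages"]),
        input_tokens=data["input_tokens"],
        output_tokens=data["output_tokens"],
    )


with tempfile.TemporaryDirectory() as tmp:
    original = StoredSession("abc123", ("hello", "review the diff"), 4, 6)
    saved_path = save_session(original, Path(tmp))
    restored = load_session("abc123", Path(tmp))
```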

4. Technique: routing and assembly

runtime.py is where the workspace starts to behave like a runtime instead of a catalog.

Prompt routing

PortRuntime.route_prompt() uses transparent token scoring:

normalize the prompt
split into tokens
match against module names, source hints, and responsibilities
score matches
select the best ones

There is no embedding black box here. It is simple, explainable, and easy to audit.
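
A minimal sketch of token scoring in this spirit. The Module fields follow the three match targets listed above, but the tokenizer, the one-point-per-token scoring rule, the tie-break by name, and the sample modules are assumptions rather than the actual PortRuntime.route_prompt() logic.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    source_hint: str
    responsibility: str


def tokenize(prompt: str) -> set[str]:
    """Lowercase the prompt and split on anything that is not a letter or digit."""
    return {tok for tok in re.split(r"[^a-z0-9]+", prompt.lower()) if tok}


def score(tokens: set[str], module: Module) -> int:
    """One point per token found in the name, source hint, or responsibility."""
    haystacks = (
        module.name.lower(),
        module.source_hint.lower(),
        module.responsibility.lower(),
    )
    return sum(1 for tok in tokens if any(tok in hay for hay in haystacks))


def route_prompt(prompt: str, modules, limit: int = 5):
    """Return (score, module) pairs with a positive score, best first."""
    tokens = tokenize(prompt)
    matches = [(score(tokens, m), m) for m in modules]
    matches = [pair for pair in matches if pair[0] > 0]
    # Sort by descending score, then by name so ties are deterministic.
    matches.sort(key=lambda pair: (-pair[0], pair[1].name))
    return matches[:limit]


# Made-up modules for illustration.
MODULES = [
    Module("review", "commands/review.ts", "review a pull request"),
    Module("BashTool", "tools/BashTool.ts", "run shell commands"),
    Module("compact", "commands/compact.ts", "compact transcript history"),
]
routed = route_prompt("review the bash shell", MODULES)
```

Every score can be traced back to the specific tokens that matched, which is what makes this kind of routing easy to audit.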

Session bootstrap

bootstrap_session() assembles the entire flow:

build workspace context
run setup with trusted=True
record history
route the prompt
build an execution registry
execute mirrored command/tool shims
infer permission denials
submit and stream a turn
persist the session

The result is a full RuntimeSession report instead of a single string. That is a good sign: the system wants the process to be inspectable.

5. Startup: prefetch first, trust gate second

The startup chain is one of the strongest parts of the rewrite.

setup.py, prefetch.py, deferred_init.py, and bootstrap_graph.py define an order:

top-level prefetch side effects
warning handler and environment guards
CLI parser and pre-action trust gate
setup() + commands/agents parallel load
deferred init after trust
mode routing
query engine submit loop

run_setup() performs the prefetch work first.
run_deferred_init(trusted) turns trust into four switches:

plugin_init
skill_init
mcp_prefetch
session_hooks

The rule is simple: decide trust first, then decide capability.
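
A small sketch of the trust gate. The four switch names come from the list above; the result class name and the assumption that every switch simply mirrors the trust flag are illustrative, and the real run_deferred_init may do more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredInitResult:
    trusted: bool
    plugin_init: bool
    skill_init: bool
    mcp_prefetch: bool
    session_hooks: bool

    def enabled(self) -> list[str]:
        """Names of the capabilities that were switched on."""
        switches = ("plugin_init", "skill_init", "mcp_prefetch", "session_hooks")
        return [name for name in switches if getattr(self, name)]


def run_deferred_init(trusted: bool) -> DeferredInitResult:
    """Assumption: each capability is enabled only when the workspace is trusted."""
    on = bool(trusted)
    return DeferredInitResult(
        trusted=on,
        plugin_init=on,
        skill_init=on,
        mcp_prefetch=on,
        session_hooks=on,
    )
```

Keeping the decision in one function means an untrusted workspace cannot enable a capability by accident elsewhere in startup.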

6. Audit: parity is the conscience

parity_audit.py is not a vanity metric. It is the project’s self-check.

It compares:

root file coverage
directory coverage
Python file count vs archived TS-like count
command snapshot coverage
tool snapshot coverage
missing targets

Three numbers matter most:

1902 TS-like files in the archive surface snapshot
207 command entries
184 tool entries

That says this is a serious surface-mirroring project, not a tiny demo.
Just as important, the audit does not pretend equivalence when the archive is unavailable.
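
The "no pretended equivalence" behavior can be sketched as a coverage check that reports an explicit unavailable state instead of a ratio. The Coverage class, the audit function, the mapping from archived names to Python targets, and the sample file names are assumptions, not the real parity_audit.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Coverage:
    archive_present: bool
    covered: int
    total: int
    missing: tuple[str, ...]

    def summary(self) -> str:
        if not self.archive_present:
            # Report the gap explicitly rather than claiming 0/0 or 100%.
            return "archive unavailable; parity not evaluated"
        missing = ", ".join(self.missing) or "none"
        return f"{self.covered}/{self.total} covered; missing: {missing}"


def audit(expected: Optional[dict[str, str]], present: set[str]) -> Coverage:
    """Compare expected Python targets against the files that exist.

    expected maps an archived file name to its Python target, or is None
    when the local archive is not available.
    """
    if expected is None:
        return Coverage(False, 0, 0, ())
    missing = tuple(sorted(t for t in expected.values() if t not in present))
    return Coverage(True, len(expected) - len(missing), len(expected), missing)


# Made-up mapping for illustration.
report = audit({"main.tsx": "main.py", "setup.ts": "setup.py"}, {"main.py"})
```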

7. One-line memory aid

If you want the quick Dao / Method / Body / Technique map:

Dao: manifest and summary make the current workspace visible
Method: command graph, tool pool, and bootstrap graph make the organization visible
Body: session, history, and transcript make the state visible
Technique: route, bootstrap, and turn loop make the action visible

That is the current shape of the workspace.

References

https://github.com/margrop/clawd-code
https://github.com/XingP14/claude-code
https://github.com/ghuntley/claude-code-source-code-deobfuscation