
Clawd Code Latest Code Analysis: How a Python-First Claude Code Rewrite Organizes Commands, Tools, Sessions, and Audit

Published: 2026-03-31
clawd-code claude-code python Agent architecture audit porting

Having read the latest code in margrop/clawd-code, I think the main change is obvious: this repository is no longer just an archive of the leaked Claude Code tree. It is now a Python-first porting workspace. The tracked tree centers on src/ as the active implementation, tests/ as verification, archive/claude_code_ts_snapshot/ as an optional local archive, and src/reference_data/ as the mirrored data source for commands, tools, and surface coverage.

That changes the question. We are no longer asking only “what did the original source look like?” We are asking:

  1. can the old system be reorganized into a maintainable Python workspace?
  2. can commands, tools, sessions, permissions, and startup order be modeled explicitly?
  3. how much of the original surface is mirrored today?
  4. which parts are already runnable, and which parts are still skeletal?

That is why I rewrote this post. The previous version still revolved around the leaked snapshot itself. This version needs to talk about the new engineering object: the rewrite workspace.

If I compress the current codebase into one sentence, it would be this:

the repository turns porting into data, state, flow, and audit.

That is the part worth writing down.

1. Start with the shape of the repository

The current README makes the direction explicit. The key entrypoints are:

That list tells you something important: the project is no longer trying to be a single-purpose source dump. It is trying to expose a set of readable layers:

  1. port_manifest for workspace shape;
  2. commands, tools, command_graph, and tool_pool for mirrored inventories;
  3. query_engine, runtime, history, transcript, and session_store for session state;
  4. setup, prefetch, deferred_init, and bootstrap_graph for startup order;
  5. parity_audit for coverage and drift;
  6. remote_runtime and direct_modes for future routing branches.

That is a much more mature story than “we have a copy of the source”.

2. Dao: the direction is now the porting process itself

If I read this through the Daoist lens, the “Dao” is no longer the leaked code as an artifact. The Dao is the porting direction:

PortManifest is the clearest expression of that Dao. It scans the active Python tree, counts files, groups top-level modules, and renders the result as Markdown. It does not try to be smart. It tries to be true.
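To make the scan-count-group-render idea concrete, here is a minimal sketch of what such a manifest could look like. The class name ManifestSketch, its fields, and build_manifest_sketch() are my own illustrative names, not the repository's actual API.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ManifestSketch:
    """Hypothetical stand-in for PortManifest; field names are assumptions."""
    root: str
    total_python_files: int
    top_level_counts: tuple[tuple[str, int], ...]

    def to_markdown(self) -> str:
        # Render the counts as a small Markdown report.
        lines = [
            f"Port root: `{self.root}`",
            f"Total Python files: {self.total_python_files}",
            "",
        ]
        lines += [f"- `{name}`: {count} file(s)" for name, count in self.top_level_counts]
        return "\n".join(lines)


def build_manifest_sketch(src_root: Path) -> ManifestSketch:
    # Count every .py file and group it by its first path component under src_root.
    files = [p for p in src_root.rglob("*.py") if p.is_file()]
    counter = Counter(p.relative_to(src_root).parts[0] for p in files)
    # Largest groups first, ties broken alphabetically, so the output is stable.
    ordered = tuple(sorted(counter.items(), key=lambda item: (-item[1], item[0])))
    return ManifestSketch(str(src_root), len(files), ordered)
```

Calling build_manifest_sketch(Path("src")).to_markdown() would print one line per top-level module. Nothing here interprets the code; it only reports what is on disk, which is the point of the paragraph above.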

QueryEnginePort.render_summary() does something similar. It takes the manifest, command surface, tool surface, session state, usage totals, and transcript state, and compresses them into a single readable report. That is not merely a convenience function. It is a statement of intent: the system wants to know what it is doing before it tries to do more.

That restraint is a good sign. In porting work, the most dangerous mistake is pretending the job is already complete.

3. Method: commands, tools, permissions, and startup graphs

The Method layer is where the repository stops being a pile of files and becomes an organization.

3.1 commands.py

commands.py loads src/reference_data/commands_snapshot.json into a tuple of PortingModule objects. It provides:

The important detail is execute_command(). It does not claim to execute the original system. It returns a mirrored execution message that says, in effect, “this entry would handle the prompt here.” That is a very practical porting move: keep the behavior explainable before trying to make it fully equivalent.
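A rough sketch of that pattern follows. The three PortingModule fields match the attributes this post mentions elsewhere (name, responsibility, source hint), but the JSON shape, the case-insensitive lookup, and the plain-string return value are my assumptions; the real function may well return a richer object.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PortingModule:
    name: str
    responsibility: str
    source_hint: str


def load_modules(raw_json: str) -> tuple[PortingModule, ...]:
    # Assumed snapshot shape: a JSON list of objects with these three keys.
    return tuple(
        PortingModule(e["name"], e["responsibility"], e["source_hint"])
        for e in json.loads(raw_json)
    )


def execute_command(modules: tuple[PortingModule, ...], name: str, prompt: str) -> str:
    # Case-insensitive lookup (an assumption). The message describes what
    # would handle the prompt; it does not run anything.
    for module in modules:
        if module.name.lower() == name.lower():
            return (
                f"Mirrored command '{module.name}' from {module.source_hint} "
                f"would handle prompt {prompt!r}."
            )
    return f"Unknown mirrored command: {name}"
```

The useful property is that a shim like this can be tested today, long before any real behavior has been ported behind it.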

3.2 tools.py

tools.py does the same for the tool surface, but with more policy:

The permission context is especially interesting. ToolPermissionContext.blocks() can reject a tool by exact name or by prefix. That means the tool pool is not only a list; it is also a policy boundary.

In other words, tools are treated as capabilities, and capabilities are filtered by explicit rules.
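The name-or-prefix rule is small enough to sketch in full. The class and method names below come from the post; the field names deny_names and deny_prefixes, the from_iterables() helper, filter_tools(), and the case-insensitive comparison are assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPermissionContext:
    # Field names are assumptions; both are stored lowercased.
    deny_names: frozenset[str] = frozenset()
    deny_prefixes: tuple[str, ...] = ()

    @classmethod
    def from_iterables(cls, deny_names=(), deny_prefixes=()):
        return cls(
            deny_names=frozenset(n.lower() for n in deny_names),
            deny_prefixes=tuple(p.lower() for p in deny_prefixes),
        )

    def blocks(self, tool_name: str) -> bool:
        # A tool is blocked by an exact name match or by any denied prefix.
        lowered = tool_name.lower()
        return lowered in self.deny_names or any(
            lowered.startswith(prefix) for prefix in self.deny_prefixes
        )


def filter_tools(tool_names, ctx: ToolPermissionContext) -> tuple[str, ...]:
    # Keep only the tools the permission context does not block.
    return tuple(name for name in tool_names if not ctx.blocks(name))
```

With a context built from deny_names=["BashTool"] and deny_prefixes=["mcp"], filtering ["BashTool", "FileReadTool", "mcp_server"] leaves only "FileReadTool". The tool names in that example are illustrative.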

3.3 command_graph.py and tool_pool.py

The repository does not stop at raw inventories. It builds higher-level structures:

The classification itself is simple. It looks at source_hint and applies lightweight grouping rules. That is a feature, not a limitation. For a porting workspace, a transparent heuristic is often more useful than a hidden embedding model.
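As an illustration of how little machinery a source_hint heuristic needs, consider the sketch below. The bucket names ("builtin", "plugin", "skill") and the substring rules are hypothetical choices for this example, not a description of the repository's actual categories.

```python
def classify_by_source_hint(source_hint: str) -> str:
    # Hypothetical rules: look for a telltale substring in the source path.
    hint = source_hint.lower()
    if "plugin" in hint:
        return "plugin"
    if "skill" in hint:
        return "skill"
    return "builtin"


def group_by_source_hint(source_hints) -> dict[str, list[str]]:
    # Every bucket is always present, so an empty group is visible in reports.
    groups: dict[str, list[str]] = {"builtin": [], "plugin": [], "skill": []}
    for hint in source_hints:
        groups[classify_by_source_hint(hint)].append(hint)
    return groups
```

Anyone can read these rules, disagree with one, and change it in a single line. That is the trade the paragraph above describes.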

3.4 bootstrap_graph.py

build_bootstrap_graph() is one of the most telling pieces in the new code. It writes startup down as an explicit, ordered sequence of stages:

  1. top-level prefetch side effects
  2. warning handler and environment guards
  3. CLI parser and pre-action trust gate
  4. setup() + commands/agents parallel load
  5. deferred init after trust
  6. mode routing: local / remote / ssh / teleport / direct-connect / deep-link
  7. query engine submit loop

That is not cosmetic. It is a statement that order matters.
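One plausible way to encode that sequence is as plain data. The stage strings below are copied from the list above; the BootstrapGraph dataclass and its as_markdown() method are my own sketch of the shape, not necessarily the repository's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootstrapGraph:
    # An ordered tuple: position in the tuple is the startup order.
    stages: tuple[str, ...]

    def as_markdown(self) -> str:
        return "\n".join(f"{i}. {stage}" for i, stage in enumerate(self.stages, start=1))


def build_bootstrap_graph() -> BootstrapGraph:
    return BootstrapGraph(
        stages=(
            "top-level prefetch side effects",
            "warning handler and environment guards",
            "CLI parser and pre-action trust gate",
            "setup() + commands/agents parallel load",
            "deferred init after trust",
            "mode routing: local / remote / ssh / teleport / direct-connect / deep-link",
            "query engine submit loop",
        )
    )
```

Because the order is data rather than control flow, a test can assert that the trust gate comes before deferred init without running any startup code.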

4. Body: sessions, history, transcript, and persistence

The Body layer is the part of the system that actually carries state.

QueryEnginePort is the clearest “body” object in the workspace. It holds:

That means a turn is not just output. A turn is a state update.

4.1 submit_message()

submit_message() roughly does this:

  1. stop early if the turn budget is exceeded;
  2. build a summary from prompt, matched commands, matched tools, and denials;
  3. estimate usage and update totals;
  4. set a stop reason;
  5. append the prompt to in-memory state;
  6. append it to the transcript store;
  7. compact messages if needed;
  8. return a structured TurnResult.

This is a very good model for a porting runtime. It makes the state changes visible and testable.
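The eight steps can be sketched as a single method. Everything below is an approximation under stated assumptions: EngineSketch, the TurnResult fields, the default limits, the stop-reason strings, and the whitespace word count used as a token estimate are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TurnResult:
    prompt: str
    output: str
    stop_reason: str
    input_tokens: int
    output_tokens: int


@dataclass
class EngineSketch:
    max_turns: int = 8             # assumed default
    max_budget_tokens: int = 2000  # assumed default
    compact_after: int = 12        # assumed default
    messages: list[str] = field(default_factory=list)
    transcript: list[str] = field(default_factory=list)
    total_in: int = 0
    total_out: int = 0

    def submit_message(self, prompt: str, commands=(), tools=(), denials=()) -> TurnResult:
        # 1. Stop early if the turn budget is exhausted.
        if len(self.messages) >= self.max_turns:
            return TurnResult(prompt, "Max turns reached.", "max_turns_reached", 0, 0)
        # 2. Build a summary from the prompt, matches, and denials.
        output = "\n".join([
            f"Prompt: {prompt}",
            f"Matched commands: {', '.join(commands) or 'none'}",
            f"Matched tools: {', '.join(tools) or 'none'}",
            f"Permission denials: {len(denials)}",
        ])
        # 3. Estimate usage (crude whitespace word count) and update totals.
        used_in, used_out = len(prompt.split()), len(output.split())
        self.total_in += used_in
        self.total_out += used_out
        # 4. Set a stop reason.
        over_budget = self.total_in + self.total_out > self.max_budget_tokens
        stop_reason = "max_budget_reached" if over_budget else "completed"
        # 5 and 6. Record the prompt in memory and in the transcript.
        self.messages.append(prompt)
        self.transcript.append(prompt)
        # 7. Compact in-memory messages if they grew past the window.
        if len(self.messages) > self.compact_after:
            self.messages[:] = self.messages[-self.compact_after:]
        # 8. Return a structured result.
        return TurnResult(prompt, output, stop_reason, used_in, used_out)
```

Each numbered comment maps to one step in the list, so a unit test can check the state after any of them.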

4.2 stream_submit_message()

The streaming variant emits explicit events:

Again, the point is observability. The system does not hide the lifecycle behind one opaque string.

4.3 History and transcript

The workspace splits session memory into three layers:

TranscriptStore can append, compact, replay, and flush. That is a small class, but it captures a very important idea: sessions should be mutable, compressible, and persistent.
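A store with those four operations fits in a few lines. The method names follow the post; the entries and flushed fields, the keep_last parameter, and the idea that flush() only flips a flag (rather than writing to disk) are assumptions made to keep the sketch self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptStore:
    entries: list[str] = field(default_factory=list)
    flushed: bool = False

    def append(self, entry: str) -> None:
        # Any new entry makes the store dirty again.
        self.entries.append(entry)
        self.flushed = False

    def compact(self, keep_last: int = 10) -> None:
        # Drop the oldest entries, keeping only the most recent keep_last.
        if keep_last <= 0:
            self.entries.clear()
        elif len(self.entries) > keep_last:
            self.entries[:] = self.entries[-keep_last:]

    def replay(self) -> tuple[str, ...]:
        # Return an immutable copy so callers cannot mutate the store.
        return tuple(self.entries)

    def flush(self) -> None:
        # A real store would persist here; the sketch only records the fact.
        self.flushed = True
```

Mutable maps to append(), compressible to compact(), and persistent to flush(); replay() is what makes the other three observable.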

5. Technique: routing, execution, and runtime assembly

The Technique layer is where the workspace starts acting like a runtime.

5.1 route_prompt()

PortRuntime.route_prompt() does not use embeddings or an opaque classifier. It uses transparent token scoring:

  1. normalize the prompt;
  2. split it into tokens;
  3. compare tokens with module names, source hints, and responsibilities;
  4. assign scores;
  5. sort and select the best matches.

That makes the routing explainable and debuggable. In a porting project, that matters more than sounding clever.
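Here is a minimal sketch of token scoring along those lines. The exact normalization, the substring comparison, the one-point-per-token scoring, and the names Module, RoutedMatch, and limit are assumptions of mine; the real method may weigh things differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    source_hint: str
    responsibility: str


@dataclass(frozen=True)
class RoutedMatch:
    kind: str
    name: str
    score: int


def tokenize(prompt: str) -> set[str]:
    # Normalize: treat '/' and '-' as separators, lowercase everything.
    return {t.lower() for t in prompt.replace("/", " ").replace("-", " ").split()}


def score(tokens: set[str], module: Module) -> int:
    # One point per prompt token that appears in any descriptive field.
    haystacks = (module.name.lower(), module.source_hint.lower(), module.responsibility.lower())
    return sum(1 for token in tokens if any(token in h for h in haystacks))


def route_prompt(prompt: str, commands, tools, limit: int = 5) -> list[RoutedMatch]:
    tokens = tokenize(prompt)
    matches = []
    for kind, modules in (("command", commands), ("tool", tools)):
        for module in modules:
            points = score(tokens, module)
            if points > 0:
                matches.append(RoutedMatch(kind, module.name, points))
    # Highest score first; ties broken by kind, then name, for stable output.
    matches.sort(key=lambda m: (-m.score, m.kind, m.name))
    return matches[:limit]
```

Substring matching is deliberately naive: very short tokens will match many modules. The payoff is that every score can be explained by pointing at the tokens that produced it.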

5.2 bootstrap_session()

This is the most complete runtime assembly path in the repository. It:

  1. builds the workspace context;
  2. runs setup with trusted=True;
  3. records history;
  4. routes the prompt against mirrored inventories;
  5. builds the execution registry;
  6. executes mirrored command/tool shims;
  7. infers permission denials;
  8. streams and submits a query-engine turn;
  9. persists the session;
  10. returns a complete RuntimeSession report.

That report includes context, setup, system init, routing results, execution messages, stream events, turn output, persisted session path, and history. It is a strong design because it returns a whole situation, not just a string.

5.3 Turn loop and mode branches

run_turn_loop() simulates a bounded multi-turn cycle.
remote_runtime.py and direct_modes.py define future-facing branches:

Right now those are largely placeholders, but they are useful placeholders. They make the architecture explicit and leave room for future runtime branching.

6. Startup: prefetch, deferred init, and trust gating

This is one of the strongest improvements in the latest code.

6.1 run_setup()

run_setup() simulates startup prefetch work:

It also returns a SetupReport that includes platform details, trust status, prefetch results, and deferred initialization results.

6.2 run_deferred_init(trusted)

deferred_init.py maps a single boolean into four capability switches:

If trusted is false, those capabilities stay off. If it is true, they turn on.

That is a very clean trust gate. In agent systems, the startup boundary is often more dangerous than the runtime boundary because it is where side effects appear first.
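The shape of that gate is simple enough to sketch. I pass the switch names in as a parameter rather than hardcode them, so none of the names below should be read as the repository's; DeferredInitResult and as_lines() are likewise my own illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredInitResult:
    trusted: bool
    enabled: dict[str, bool]

    def as_lines(self) -> tuple[str, ...]:
        return tuple(f"{name}={state}" for name, state in self.enabled.items())


def run_deferred_init(trusted: bool, switches: tuple[str, ...]) -> DeferredInitResult:
    # One boolean decides every capability: nothing turns on without trust.
    return DeferredInitResult(trusted, {name: trusted for name in switches})
```

For example, run_deferred_init(False, ("switch_1", "switch_2")) reports every switch as off, and flipping the single argument to True turns them all on. There is no path by which one capability can be enabled while the trust flag is false.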

6.3 SetupReport

The setup report turns startup into Markdown so it can be inspected. That fits the rest of the project very well: the code keeps trying to make state visible before it tries to make state automatic.

7. Audit: parity is a conscience, not a slogan

The parity layer is what makes this repository honest.

parity_audit.py compares the Python workspace with the archived snapshot surface along several axes:

  1. root file coverage;
  2. directory coverage;
  3. current Python file count versus the count of TypeScript-like files in the archive;
  4. command snapshot coverage;
  5. tool snapshot coverage;
  6. missing root targets;
  7. missing directory targets.

The numbers are worth remembering:

Those counts tell you the project is not just a toy rewrite. It is mirroring a large enough surface that a proper audit becomes necessary.

The most important detail, though, is that the audit is explicit when the archive is unavailable. It does not fake completeness. That is the right attitude for this kind of work.
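The core of such an audit is set arithmetic plus one honest early return. In the sketch below, CoverageAxis, ParityReport, audit_axis(), and run_parity_audit() are hypothetical names, and passing the expected and present names in as dictionaries is a simplification of however the real module gathers them.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CoverageAxis:
    name: str
    covered: int
    total: int
    missing: tuple[str, ...]


@dataclass(frozen=True)
class ParityReport:
    archive_present: bool
    axes: tuple[CoverageAxis, ...] = ()

    def to_markdown(self) -> str:
        if not self.archive_present:
            return "Parity audit: local archive unavailable; coverage was not computed."
        lines = ["Parity audit:"]
        for axis in self.axes:
            lines.append(f"- {axis.name}: {axis.covered}/{axis.total}")
            lines += [f"  - missing: {name}" for name in axis.missing]
        return "\n".join(lines)


def audit_axis(name: str, expected, present) -> CoverageAxis:
    # Coverage is the share of expected names that exist on the Python side.
    expected_set = set(expected)
    missing = tuple(sorted(expected_set - set(present)))
    return CoverageAxis(name, len(expected_set) - len(missing), len(expected_set), missing)


def run_parity_audit(archive_root: Path, expected: dict, present: dict) -> ParityReport:
    # Say so explicitly when the archive is absent instead of reporting 0/0.
    if not archive_root.is_dir():
        return ParityReport(archive_present=False)
    axes = tuple(audit_axis(axis, names, present.get(axis, ())) for axis, names in expected.items())
    return ParityReport(True, axes)
```

The early return is the important line. A report that says "unavailable" cannot be mistaken for a report that says "complete".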

8. What changed compared with the previous version of this post

The previous article was about the leaked snapshot and the broader legal/ethical context.
This one is about the current repository:

  1. Python-first workspace instead of a tracked TypeScript leak;
  2. mirrored inventories instead of static source commentary;
  3. runtime reports instead of vague architectural impressions;
  4. startup and trust gating instead of simply reading files;
  5. parity audit instead of pretending equivalence.

That is a major shift. The repository now behaves more like a system under reconstruction than an archive under observation.

9. What I would call the Dao / Method / Body / Technique map

If I compress the whole repository into one mental model:

That map is useful because it tells you where to look when the code changes.

If the manifest changes, the Dao is shifting.
If the command graph or permissions change, the Method is shifting.
If the session/transcript model changes, the Body is shifting.
If routing or runtime assembly changes, the Technique is shifting.

That is a clean way to think about this workspace.

10. Closing thought

I do not think the right way to describe this repository is “it copied the leaked code” or “it rewrote the leaked code”.

The better description is:

it translated a large source surface into a Python workspace that can be inspected, audited, routed, and gradually made more complete.

That is a much more interesting engineering object.

The project is now less about preserving a source tree and more about turning a source tree into a reproducible system of:

And that is exactly the kind of thing worth writing about.

References

  1. https://github.com/margrop/clawd-code
  2. https://github.com/XingP14/claude-code
  3. https://github.com/ghuntley/claude-code-source-code-deobfuscation