Clawd Code Latest Code Analysis: How a Python-First Claude Code Rewrite Organizes Commands, Tools, Sessions, and Audit
After reading the latest code in margrop/clawd-code, the main change is obvious: this repository is no longer just an archive of the leaked Claude Code tree. It is now a Python-first porting workspace. The tracked tree centers on src/ as active implementation, tests/ as verification, archive/claude_code_ts_snapshot/ as an optional local archive, and src/reference_data/ as the mirrored data source for commands, tools, and surface coverage.
That changes the question. We are no longer asking only “what did the original source look like?” We are asking:
- can the old system be reorganized into a maintainable Python workspace?
- can commands, tools, sessions, permissions, and startup order be modeled explicitly?
- how much of the original surface is mirrored today?
- which parts are already runnable, and which parts are still skeletal?
That is why I rewrote this post. The previous version still revolved around the leaked snapshot itself. This version needs to talk about the new engineering object: the rewrite workspace.
If I compress the current codebase into one sentence, it would be this:
the repository turns porting into data, state, flow, and audit.
That is the part worth writing down.
1. Start with the shape of the repository
The current README makes the direction explicit. The key entrypoints are:
summary, manifest, parity-audit, setup-report, command-graph, tool-pool, bootstrap-graph, subsystems, commands, tools, route, bootstrap, turn-loop, flush-transcript, load-session, remote-mode, ssh-mode, teleport-mode, direct-connect-mode, deep-link-mode, show-command, show-tool, exec-command, exec-tool
That list tells you something important: the project is no longer trying to be a single-purpose source dump. It is trying to expose a set of readable layers:
- port_manifest for workspace shape;
- commands, tools, command_graph, and tool_pool for mirrored inventories;
- query_engine, runtime, history, transcript, and session_store for session state;
- setup, prefetch, deferred_init, and bootstrap_graph for startup order;
- parity_audit for coverage and drift;
- remote_runtime and direct_modes for future routing branches.
That is a much more mature story than “we have a copy of the source”.
2. Dao: the direction is now the porting process itself
If I read this through the Daoist lens, the “Dao” is no longer the leaked code as an artifact. The Dao is the porting direction:
- make structure visible first;
- make mirrored surfaces queryable;
- make runtime state inspectable;
- make audit honest.
PortManifest is the clearest expression of that Dao. It scans the active Python tree, counts files, groups top-level modules, and renders the result as Markdown. It does not try to be smart. It tries to be true.
QueryEnginePort.render_summary() does something similar. It takes the manifest, command surface, tool surface, session state, usage totals, and transcript state, and compresses them into a single readable report. That is not merely a convenience function. It is a statement of intent: the system wants to know what it is doing before it tries to do more.
That restraint is a good sign. In porting work, the most dangerous mistake is pretending the job is already complete.
3. Method: commands, tools, permissions, and startup graphs
The Method layer is where the repository stops being a pile of files and becomes an organization.
3.1 commands.py
commands.py loads src/reference_data/commands_snapshot.json into a tuple of PortingModule objects. It provides:
- get_command()
- get_commands()
- find_commands()
- execute_command()
- render_command_index()
The important detail is execute_command(). It does not claim to execute the original system. It returns a mirrored execution message that says, in effect, “this entry would handle the prompt here.” That is a very practical porting move: keep the behavior explainable before trying to make it fully equivalent.
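The mirrored-execution pattern can be sketched like this. The PortingModule fields and message wording are assumptions; only the shape (load a JSON snapshot, return an explanatory message instead of executing) follows the description above:

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class PortingModule:
    """Hypothetical mirrored command entry loaded from a snapshot file."""
    name: str
    description: str

def load_commands(snapshot_json: str) -> tuple:
    """Parse a commands snapshot (JSON array) into PortingModule objects."""
    entries = json.loads(snapshot_json)
    return tuple(PortingModule(e["name"], e.get("description", "")) for e in entries)

def execute_command(commands, name: str, prompt: str) -> str:
    """Mirrored execution: explain what would happen instead of doing it."""
    match = next((c for c in commands if c.name == name), None)
    if match is None:
        return f"[mirror] no command named {name!r}"
    return f"[mirror] command {match.name!r} would handle prompt: {prompt!r}"
```

Keeping the return value a plain explanatory string makes the surface testable long before any behavior is truly equivalent.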
3.2 tools.py
tools.py does the same for the tool surface, but with more policy:
- simple_mode
- include_mcp
- ToolPermissionContext
The permission context is especially interesting. ToolPermissionContext.blocks() can reject a tool by exact name or by prefix. That means the tool pool is not only a list, it is a policy boundary.
In other words, tools are treated like capability, and capability is filtered by explicit rules.
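A minimal sketch of that policy boundary, assuming a deny-by-name and deny-by-prefix rule set (the field names are illustrative, not the repository's actual ones):

```python
from dataclasses import dataclass, field

@dataclass
class ToolPermissionContext:
    """Hypothetical policy boundary: reject tools by exact name or prefix."""
    denied_names: set = field(default_factory=set)
    denied_prefixes: tuple = ()

    def blocks(self, tool_name: str) -> bool:
        if tool_name in self.denied_names:
            return True
        return any(tool_name.startswith(p) for p in self.denied_prefixes)

def filter_tools(tools, ctx: ToolPermissionContext):
    """Reduce the tool pool to the currently allowed subset."""
    return [t for t in tools if not ctx.blocks(t)]
```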
3.3 command_graph.py and tool_pool.py
The repository does not stop at raw inventories. It builds higher-level structures:
- CommandGraph splits commands into builtins, plugin_like, and skill_like;
- ToolPool filters tools into the currently allowed subset.
The classification itself is simple. It looks at source_hint and applies lightweight grouping rules. That is a feature, not a limitation. For a porting workspace, a transparent heuristic is often more useful than a hidden embedding model.
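A transparent heuristic of that kind might look like this; the specific hint keywords are assumptions, but the pattern (keyword match on source_hint, default to builtins) follows the description above:

```python
def classify_command(source_hint: str) -> str:
    """Lightweight grouping rule on the source hint; no hidden model."""
    hint = source_hint.lower()
    if "plugin" in hint:
        return "plugin_like"
    if "skill" in hint:
        return "skill_like"
    return "builtins"

def build_command_graph(commands):
    """Split (name, source_hint) pairs into the three mirrored buckets."""
    graph = {"builtins": [], "plugin_like": [], "skill_like": []}
    for name, hint in commands:
        graph[classify_command(hint)].append(name)
    return graph
```

The payoff is debuggability: when a command lands in the wrong bucket, a single string comparison explains why.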
3.4 bootstrap_graph.py
build_bootstrap_graph() is one of the most telling pieces in the new code. It turns startup into a doctrine:
- top-level prefetch side effects
- warning handler and environment guards
- CLI parser and pre-action trust gate
- setup() + commands/agents parallel load
- deferred init after trust
- mode routing: local / remote / ssh / teleport / direct-connect / deep-link
- query engine submit loop
That is not cosmetic. It is a statement that order matters.
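One way to make that ordering doctrine executable is to model startup as stages with explicit dependencies. The stage names below paraphrase the list above; the graph shape and ordering function are assumptions, not the repository's actual code:

```python
def build_bootstrap_graph():
    """Hypothetical startup graph: (stage, dependencies) pairs."""
    return [
        ("prefetch", []),
        ("env_guards", ["prefetch"]),
        ("cli_parse_trust_gate", ["env_guards"]),
        ("setup_parallel_load", ["cli_parse_trust_gate"]),
        ("deferred_init", ["setup_parallel_load"]),
        ("mode_routing", ["deferred_init"]),
        ("query_engine_loop", ["mode_routing"]),
    ]

def topological_order(stages):
    """Kahn-style ordering; the chain above is linear, but a real graph
    could fan out, and this catches accidental cycles either way."""
    done, order = set(), []
    pending = list(stages)
    while pending:
        for i, (name, deps) in enumerate(pending):
            if all(d in done for d in deps):
                done.add(name)
                order.append(name)
                pending.pop(i)
                break
        else:
            raise ValueError("cycle in bootstrap graph")
    return order
```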
4. Body: sessions, history, transcript, and persistence
The Body layer is the part of the system that actually carries state.
QueryEnginePort is the clearest “body” object in the workspace. It holds:
- session_id
- mutable_messages
- permission_denials
- total_usage
- transcript_store
That means a turn is not just output. A turn is a state update.
4.1 submit_message()
submit_message() roughly does this:
- stop early if the turn budget is exceeded;
- build a summary from prompt, matched commands, matched tools, and denials;
- estimate usage and update totals;
- set a stop reason;
- append the prompt to in-memory state;
- append it to the transcript store;
- compact messages if needed;
- return a structured TurnResult.
This is a very good model for a porting runtime. It makes the state changes visible and testable.
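The turn shape above can be sketched as a small stateful class. The field names, budget check, usage estimate, and compaction policy here are all assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class TurnResult:
    """Structured output of one turn: summary, stop reason, usage."""
    summary: str
    stop_reason: str
    usage: int

@dataclass
class QueryEnginePort:
    """Hypothetical session body: a turn is a state update, not just output."""
    max_turns: int = 10
    turns: int = 0
    total_usage: int = 0
    messages: list = field(default_factory=list)
    transcript: list = field(default_factory=list)

    def submit_message(self, prompt: str) -> TurnResult:
        if self.turns >= self.max_turns:           # stop early on budget
            return TurnResult("", "turn_budget_exceeded", 0)
        usage = len(prompt.split())                # crude usage estimate
        self.total_usage += usage
        self.turns += 1
        self.messages.append(prompt)               # in-memory state
        self.transcript.append(prompt)             # replayable store
        if len(self.messages) > 4:                 # compact if needed
            self.messages = self.messages[-4:]
        return TurnResult(f"handled: {prompt}", "end_turn", usage)
```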
4.2 stream_submit_message()
The streaming variant emits explicit events:
- message_start
- command_match
- tool_match
- permission_denial
- message_delta
- message_stop
Again, the point is observability. The system does not hide the lifecycle behind one opaque string.
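A streaming variant of that lifecycle is naturally a generator. The event names follow the list above; the payload shapes are assumptions:

```python
def stream_submit_message(prompt, matched_commands=(), matched_tools=(), denials=()):
    """Yield explicit lifecycle events instead of one opaque string."""
    yield ("message_start", prompt)
    for command in matched_commands:
        yield ("command_match", command)
    for tool in matched_tools:
        yield ("tool_match", tool)
    for denial in denials:
        yield ("permission_denial", denial)
    for token in prompt.split():                 # per-chunk content events
        yield ("message_delta", token)
    yield ("message_stop", None)
```

A consumer can log, filter, or assert on each event type independently, which is the observability point the text is making.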
4.3 History and transcript
The workspace splits session memory into three layers:
- HistoryLog for stage-level history;
- TranscriptStore for replayable prompt history;
- StoredSession for persisted snapshots.
TranscriptStore can append, compact, replay, and flush. That is a small class, but it captures a very important idea: sessions should be mutable, compressible, and persistent.
5. Technique: routing, execution, and runtime assembly
The Technique layer is where the workspace starts acting like a runtime.
5.1 route_prompt()
PortRuntime.route_prompt() does not use embeddings or an opaque classifier. It uses transparent token scoring:
- normalize the prompt;
- split it into tokens;
- compare tokens with module names, source hints, and responsibilities;
- assign scores;
- sort and select the best matches.
That makes the routing explainable and debuggable. In a porting project, that matters more than looking clever.
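The five steps above can be sketched as a single scoring function. The scoring weight (plain token overlap) and the module dictionary shape are assumptions:

```python
def route_prompt(prompt: str, modules: list, top_k: int = 3):
    """Transparent token scoring: overlap between prompt tokens and each
    module's name, source hint, and responsibility words."""
    tokens = set(prompt.lower().split())         # normalize + tokenize
    scored = []
    for mod in modules:
        vocab = set(
            mod["name"].lower().split("_")
            + mod.get("source_hint", "").lower().split()
            + mod.get("responsibility", "").lower().split()
        )
        score = len(tokens & vocab)              # assign a score
        if score:
            scored.append((score, mod["name"]))
    # Sort by score (descending), then name, and select the best matches.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]
```

Every routing decision can be reproduced by hand with a set intersection, which is exactly the debuggability claim above.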
5.2 bootstrap_session()
This is the most complete runtime assembly path in the repository. It:
- builds the workspace context;
- runs setup with trusted=True;
- records history;
- routes the prompt against mirrored inventories;
- builds the execution registry;
- executes mirrored command/tool shims;
- infers permission denials;
- streams and submits a query-engine turn;
- persists the session;
- returns a complete RuntimeSession report.
That report includes context, setup, system init, routing results, execution messages, stream events, turn output, persisted session path, and history. It is a strong design because it returns a whole situation, not just a string.
5.3 Turn loop and mode branches
run_turn_loop() simulates a bounded multi-turn cycle.
remote_runtime.py and direct_modes.py define future-facing branches:
remote, ssh, teleport, direct-connect, deep-link
Right now those are largely placeholders, but they are useful placeholders. They make the architecture explicit and leave room for future runtime branching.
6. Startup: prefetch, deferred init, and trust gating
This is one of the strongest improvements in the latest code.
6.1 run_setup()
run_setup() simulates startup prefetch work:
- start_mdm_raw_read()
- start_keychain_prefetch()
- start_project_scan(root)
It also returns a SetupReport that includes platform details, trust status, prefetch results, and deferred initialization results.
6.2 run_deferred_init(trusted)
deferred_init.py maps a single boolean into four capability switches:
- plugin_init
- skill_init
- mcp_prefetch
- session_hooks
If trusted is false, those capabilities stay off. If it is true, they turn on.
That is a very clean trust gate. In agent systems, the startup boundary is often more dangerous than the runtime boundary because it is where side effects appear first.
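The whole gate fits in a few lines. A sketch, assuming the function simply fans one boolean out into the four switches named above:

```python
def run_deferred_init(trusted: bool) -> dict:
    """One trust decision gates four capability switches, all off by default."""
    return {
        "plugin_init": trusted,
        "skill_init": trusted,
        "mcp_prefetch": trusted,
        "session_hooks": trusted,
    }
```

The value of making this a pure function is that the trust boundary can be unit-tested without touching any plugin, skill, or MCP machinery.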
6.3 SetupReport
The setup report turns startup into Markdown so it can be inspected. That fits the rest of the project very well: the code keeps trying to make state visible before it tries to make state automatic.
7. Audit: parity is a conscience, not a slogan
The parity layer is what makes this repository honest.
parity_audit.py compares the Python workspace with the archived snapshot surface along several axes:
- root file coverage;
- directory coverage;
- current Python file count versus the archived TS-like count;
- command snapshot coverage;
- tool snapshot coverage;
- missing root targets;
- missing directory targets.
The numbers are worth remembering:
- total_ts_like_files = 1902
- command_entry_count = 207
- tool_entry_count = 184
Those counts tell you the project is not just a toy rewrite. It is mirroring a large enough surface that a proper audit becomes necessary.
The most important detail, though, is that the audit is explicit when the archive is unavailable. It does not fake completeness. That is the right attitude for this kind of work.
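One audit axis can be sketched as a set comparison that is explicit about a missing archive instead of faking completeness. The report keys and coverage formula are assumptions:

```python
def audit_coverage(current, archived):
    """Compare mirrored names against an archived snapshot surface.

    `archived` may be None when the local archive is unavailable; in that
    case the audit says so rather than reporting fake coverage."""
    if archived is None:
        return {"status": "archive_unavailable", "coverage": None, "missing": []}
    missing = sorted(archived - current)
    coverage = (len(archived) - len(missing)) / len(archived) if archived else 1.0
    return {"status": "ok", "coverage": coverage, "missing": missing}
```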
8. What changed compared with the previous version of this post
The previous article was about the leaked snapshot and the broader legal/ethical context.
This one is about the current repository:
- Python-first workspace instead of a tracked TypeScript leak;
- mirrored inventories instead of static source commentary;
- runtime reports instead of vague architectural impressions;
- startup and trust gating instead of simply reading files;
- parity audit instead of pretending equivalence.
That is a major shift. The repository now behaves more like a system under reconstruction than an archive under observation.
9. What I would call the Dao / Method / Body / Technique map
If I compress the whole repository into one mental model:
- Dao: porting direction and explicit surface visibility;
- Method: command graph, tool pool, permissions, and bootstrap order;
- Body: session state, history, transcript, and persisted sessions;
- Technique: prompt routing, mirrored execution, runtime assembly, and mode branches.
That map is useful because it tells you where to look when the code changes.
If the manifest changes, the Dao is shifting.
If the command graph or permissions change, the Method is shifting.
If the session/transcript model changes, the Body is shifting.
If routing or runtime assembly changes, the Technique is shifting.
That is a clean way to think about this workspace.
10. Closing thought
I do not think the right way to describe this repository is “it copied the leaked code” or “it rewrote the leaked code”.
The better description is:
it translated a large source surface into a Python workspace that can be inspected, audited, routed, and gradually made more complete.
That is a much more interesting engineering object.
The project is now less about preserving a source tree and more about turning a source tree into a reproducible system of:
- manifest;
- inventory;
- session;
- startup;
- audit.
And that is exactly the kind of thing worth writing about.
References
- https://github.com/margrop/clawd-code
- https://github.com/XingP14/claude-code
- https://github.com/ghuntley/claude-code-source-code-deobfuscation