ble-reticulum/migration/reports/Codex_Report_20260517_1430.md

Phase 2 Investigation Report

Target: BLEInterface.py _handle_identity_handshake

No files were modified.

Source Excerpt

1202 def _handle_identity_handshake(self, address: str, data: bytes) -> bool:
1217     # Identity handshake detection: exactly 16 bytes
1218     if len(data) != 16:
1219         return False

1222     peer_identity = self.address_to_identity.get(address)
1223     if peer_identity:
1227         if data == peer_identity:
1228             RNS.log(...)
1229             return True
1233         RNS.log(...)
1234         return True

1238     central_identity = bytes(data)
1239     identity_hash = self._compute_identity_hash(central_identity)

1243     if self._check_duplicate_identity(address, central_identity):
1250         self.driver.disconnect(address)
1253         return True

1255     self.address_to_identity[address] = central_identity
1256     self.identity_to_address[identity_hash] = address

1261     mtu = self.driver.get_peer_mtu(address)
1262     if not mtu:
1263         mtu = 23

1266     frag_key = self._get_fragmenter_key(central_identity, address)
1268     with self.frag_lock:
1269         self.fragmenters[frag_key] = BLEFragmenter(mtu=mtu)
1271         self.reassemblers[frag_key] = BLEReassembler()

1274     if identity_hash not in self.spawned_interfaces:
1278         self._spawn_peer_interface(...)
1285     else:
1287         existing_if = self.spawned_interfaces[identity_hash]
1290         existing_if.peer_address = address
1291         self.address_to_interface[address] = existing_if

1297     self._last_real_data[identity_hash] = time.time()
1300     if address in self._pending_identity_connections:
1301         del self._pending_identity_connections[address]
1303     return True
1305 except Exception as e:
1306     RNS.log(...)
1307     return True

Callers

Direct caller:

Driver callback chain:

Related but separate:

  • handle_peripheral_data contains older/parallel identity-handshake handling. It is not a caller of _handle_identity_handshake.

Callees

  • _compute_identity_hash(peer_identity) at line 1899
  • _check_duplicate_identity(address, peer_identity) at line 1051
  • self.driver.disconnect(address)
  • self.driver.get_peer_mtu(address)
  • _get_fragmenter_key(peer_identity, address) at line 1886
  • BLEFragmenter(mtu=mtu)
  • BLEReassembler()
  • _spawn_peer_interface(...) at line 1920
  • RNS.log(...)
  • time.time()

Object Attributes Read

  • self.address_to_identity
  • self.identity_to_address, indirectly through _check_duplicate_identity
  • self._pending_detach, indirectly through _check_duplicate_identity
  • self.driver.connected_peers, indirectly through _check_duplicate_identity
  • self.peers, indirectly through _check_duplicate_identity
  • self._last_real_data, indirectly through _check_duplicate_identity
  • self._zombie_timeout, indirectly through _check_duplicate_identity
  • self.spawned_interfaces
  • self.frag_lock
  • self.fragmenters
  • self.reassemblers
  • self.address_to_interface
  • self._pending_identity_connections

Object Attributes Mutated

  • self.address_to_identity[address]
  • self.identity_to_address[identity_hash]
  • self.fragmenters[frag_key]
  • self.reassemblers[frag_key]
  • existing_if.peer_address
  • self.address_to_interface[address]
  • self._last_real_data[identity_hash]
  • self._pending_identity_connections, deletion
  • Indirectly via _check_duplicate_identity: stale address cleanup and possible zombie disconnect behavior.
  • Indirectly via _spawn_peer_interface: spawned_interfaces, address_to_interface, address_to_identity, identity_to_address, _pending_detach, _last_real_data, RNS.Transport.interfaces.

External Objects Touched

  • self.driver: disconnect(), get_peer_mtu(), connected_peers.
  • BLEFragmenter / BLEReassembler: imported through the backend shim at lines 81-99.
  • BLEPeerInterface: created indirectly through _spawn_peer_interface.
  • RNS.Transport.interfaces: mutated indirectly by _spawn_peer_interface.
  • Wall clock: time.time().

Reticulum Dependencies

  • RNS.log is used directly for status, warnings, and errors.
  • RNS.Transport.interfaces is modified indirectly when a peer interface is spawned.
  • Interface lifecycle expectations are Reticulum-specific: creating a BLEPeerInterface, setting interface fields, and making it visible to Transport are not pure protocol work.

BLE Driver / Platform Dependencies

  • MTU lookup uses self.driver.get_peer_mtu(address).
  • Duplicate rejection may call self.driver.disconnect(address).
  • Duplicate detection depends on self.driver.connected_peers.
  • The whole function is called from BLE notification/write callbacks, so it sits directly on the driver/platform boundary.
  • Address handling is BLE/MAC-rotation specific.

Pure Protocol Decisions

Small pieces are pure enough to extract:

  • len(data) == 16 means “candidate identity handshake”.
  • Existing identity plus matching 16-byte payload means “duplicate handshake, consume”.
  • Existing identity plus different 16-byte payload means “identity-like data, consume with warning”.
  • New identity value is bytes(data).
  • identity_hash = peer_identity.hex()[:16].
  • frag_key = peer_identity.hex().
  • Default MTU fallback is 23 if no MTU is available, although this is BLE policy, not Reticulum protocol.

These decisions can be represented as a small classifier/result object.

Side Effects

The current function is mostly side effects:

  • Consumes or rejects incoming data before normal reassembly.
  • Logs multiple outcomes.
  • Disconnects duplicate connections.
  • Stores identity/address mappings.
  • Creates fragmentation state.
  • Spawns or updates Reticulum peer interfaces.
  • Updates zombie/activity tracking.
  • Removes pending identity timeout state.
  • Suppresses all exceptions and consumes the packet anyway.

Proposed Split

A conservative split is possible, but only for a very small helper. I would not move the full function into C++/Rust. The function is too entangled with BLE driver state, Reticulum interface lifecycle, locks, MAC rotation, pending detach cleanup, and logging.

Good extraction candidate:

  • A pure “classify identity handshake input” helper.

Keep in Python glue:

  • Duplicate identity policy using current connection state.
  • Driver disconnect.
  • MTU lookup.
  • Fragmenter/reassembler creation.
  • Peer interface spawn/update.
  • Reticulum registration.
  • Logging.
  • Pending timeout and zombie tracking.

Proposed Python-Facing Helper API

For protocol core:

result = classify_identity_handshake(data, known_identity=None)

Possible result shape:

{
    "is_handshake": bool,
    "consume": bool,
    "status": "not_handshake" | "new_identity" | "duplicate_same" | "duplicate_mismatch",
    "peer_identity": bytes | None,
    "identity_hash": str | None,
    "fragmenter_key": str | None,
}

Equivalent C++ binding could expose:

ble_protocol_core_cpp.classify_identity_handshake(data, known_identity=None)

This should not decide duplicate-connection rejection. That decision currently depends on live Python/driver state.
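A minimal Python sketch of this helper, assuming the result shape above. It covers only the pure decisions listed earlier (length check, comparison against a known identity, key derivation). The function name comes from the proposal; the constant name and internal layout are illustrative, not the project's actual code.

```python
from typing import Optional

IDENTITY_LEN = 16  # identity handshake payloads are exactly 16 bytes


def classify_identity_handshake(data: bytes,
                                known_identity: Optional[bytes] = None) -> dict:
    """Classify a payload without touching driver or Reticulum state."""
    result = {
        "is_handshake": False,
        "consume": False,
        "status": "not_handshake",
        "peer_identity": None,
        "identity_hash": None,
        "fragmenter_key": None,
    }
    if len(data) != IDENTITY_LEN:
        return result  # normal data path continues

    result["is_handshake"] = True
    result["consume"] = True
    if known_identity:
        # An identity is already known for this address: consume either way.
        same = bytes(data) == bytes(known_identity)
        result["status"] = "duplicate_same" if same else "duplicate_mismatch"
        return result

    identity = bytes(data)
    result["status"] = "new_identity"
    result["peer_identity"] = identity
    result["identity_hash"] = identity.hex()[:16]  # short key: first 8 bytes
    result["fragmenter_key"] = identity.hex()      # full key: 32 hex chars
    return result
```

The helper deliberately takes no address and no driver, so the same inputs can be replayed against a C++ binding later.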

Risks

  • High fragility if the whole function is ported: it would pull platform and Reticulum lifecycle into protocol core.
  • The “16 bytes means handshake” rule can collide with legitimate 16-byte data if identity is missing or already known. Current behavior intentionally consumes such data.
  • _check_duplicate_identity has nontrivial live-state behavior: pending detach, stale connection cleanup, zombie timeout, and driver disconnect. Moving it would obscure race-sensitive behavior.
  • There is duplicate identity-handshake logic in handle_peripheral_data; changing one path without reconciling the other could produce divergent behavior.
  • Exception handling currently consumes packets even after failure. Tests should lock this down before any refactor.

Test Plan

Before extraction:

  • Unit-test classification only: non-16-byte data, new 16-byte identity, duplicate matching identity, duplicate mismatching identity.
  • Characterize existing _handle_identity_handshake behavior with fake driver and fake fragmenter/reassembler classes.
  • Verify duplicate identity rejection calls driver.disconnect(address) and still returns True.
  • Verify MTU fallback to 23.
  • Verify successful handshake mutates identity maps, creates fragmenter/reassembler, removes pending identity entry, and updates _last_real_data.
  • Verify existing spawned interface path updates peer_address and address_to_interface.
  • Verify exceptions return True, preserving current consume-on-error behavior.
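The characterization tests above need a driver double. The following is a sketch under assumptions: `FakeDriver` and `effective_mtu` are hypothetical names, and `connected_peers` is modelled as a set because the report does not state its real type. `effective_mtu` only mirrors the fallback shown in the excerpt (lines 1261-1263).

```python
class FakeDriver:
    """Test double that records the driver calls made during a handshake."""

    def __init__(self, mtus=None, connected_peers=None):
        self._mtus = dict(mtus or {})
        self.connected_peers = set(connected_peers or ())
        self.disconnected = []  # addresses passed to disconnect(), in order

    def get_peer_mtu(self, address):
        # None models "no MTU negotiated yet".
        return self._mtus.get(address)

    def disconnect(self, address):
        self.disconnected.append(address)
        self.connected_peers.discard(address)


def effective_mtu(driver, address, default=23):
    """Mirror of the excerpt's fallback: a missing or zero MTU becomes 23."""
    mtu = driver.get_peer_mtu(address)
    return mtu if mtu else default
```

A test can then assert on `driver.disconnected` instead of patching a real BLE stack.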

Recommendation: extract only the pure classifier first, if anything. Leave _handle_identity_handshake as Python glue around that classifier. Full extraction would make the system more fragile at this phase.

My Response

Our goal is to migrate this project to C++ so it can be integrated with the C++ microReticulum running on an ESP32-S3 (LilyGo T-Beam SUPREME). You suggested keeping some Python. Given the migration goal, what is your updated assessment?

Codex Reply

Updated assessment: for the migration goal, “keep Python glue” should mean “keep it only in the workstation Reticulum interface during transition,” not “preserve Python as part of the final ESP32 path.”

For _handle_identity_handshake, I would still not port the Python function wholesale. But I would now classify more of it as C++ migration-relevant than I did in the narrow Phase 2 report.

The right split is:

Port to C++ protocol/core logic

  • Detect whether incoming data is an identity handshake: currently len(data) == 16.
  • Represent the peer identity as the stable peer key.
  • Compute:
    • short identity key: peer_identity.hex()[:16]
    • full fragmenter key: peer_identity.hex()
  • Decide whether a 16-byte payload should be consumed as:
    • new identity
    • duplicate same identity
    • duplicate mismatching identity-like data
    • not handshake
  • Maintain peer identity/address association in a platform-neutral peer/session table.
  • Own fragmenter/reassembler creation policy once identity and MTU are known.
  • Track pending identity timeout state.
  • Track last real data time for duplicate/zombie replacement policy, if that policy is wanted on embedded too.
  • Emit protocol events such as IdentityAccepted, DuplicateRejected, PeerReady, DataConsumed, DataForReassembly.

Do not port as-is

  • Direct RNS.log(...).
  • Direct RNS.Transport.interfaces.append(...).
  • Python BLEPeerInterface creation.
  • Python dictionaries as the authoritative model.
  • Direct calls to self.driver.disconnect(...).
  • Python locking structure.
  • BlueZ/Linux-specific connection assumptions.

Those should become adapter callbacks or platform bindings.

So the updated answer is: yes, this area is relevant to C++ migration, but not as a literal method extraction. It should become a C++ peer handshake/session state machine with a thin Python adapter on Linux and a BLE/NimBLE-style adapter on ESP32.

A plausible C++-first API would look like:

enum class HandshakeAction {
    NotHandshake,
    ConsumedDuplicateSame,
    ConsumedDuplicateMismatch,
    AcceptedNewIdentity,
    RejectDuplicate
};

struct IdentityHandshakeResult {
    HandshakeAction action;
    std::array<uint8_t, 16> peer_identity;
    std::string identity_key;     // first 8 bytes as 16 hex chars
    std::string fragmenter_key;   // full 16 bytes as 32 hex chars
    uint16_t mtu;                 // effective MTU, 23 if none was negotiated
    bool should_disconnect;
    bool should_create_fragmentation_state;
    bool should_mark_peer_ready;
};

Then the Python side would temporarily do:

result = protocol_core.handle_identity_handshake(address, data, mtu, connection_state)

if result.should_disconnect:
    driver.disconnect(address)

if result.should_create_fragmentation_state:
    fragmenters[result.fragmenter_key] = BLEFragmenter(mtu=result.mtu)
    reassemblers[result.fragmenter_key] = BLEReassembler()

if result.should_mark_peer_ready:
    _spawn_peer_interface(...)

On ESP32/microreticulum, the same result would drive C++ BLE connection/session objects instead of Python Reticulum interfaces.

My revised recommendation:

  1. Do not migrate _handle_identity_handshake as one blob.
  2. Do migrate its protocol/session semantics into C++.
  3. First build a C++ PeerSessionManager or BLEProtocolSession that owns identity, MTU, duplicate, pending-handshake, and fragmenter/reassembler state.
  4. Keep Python BLEInterface.py as a compatibility adapter until the embedded C++ path is ready.
  5. Add equivalence tests that prove the C++ session manager makes the same decisions as current Python for known edge cases.

In short: the previous “leave Python glue” assessment is correct for the current Linux Reticulum integration, but for the ESP32 target the durable boundary should move lower. Python should become a test harness/adapter, while C++ should own the BLE protocol session model.

Next Instruction from ChatGPT

Phase 2 goal is not to preserve Python as the final architecture.

The final target is a C++ BLE protocol/session layer usable by microReticulum on ESP32-S3/T-Beam SUPREME, with Python retained only as a transitional Linux test adapter.

Design a C++ BLEProtocolSession or PeerSessionManager.

Do not modify BLEInterface.py yet.

Use current BLEInterface._handle_identity_handshake behavior as the reference, but do not port it literally.

Produce:

  1. proposed C++ header
  2. state owned by the class
  3. event/result structs
  4. exact behavior mapping from current Python cases
  5. what remains in Python adapter
  6. what would become ESP32/microReticulum adapter code
  7. unit test plan comparing decisions against current Python behavior
  8. migration steps from current Python dictionaries toward C++ session ownership

Codex response

Updated Assessment

Given the final target is ESP32-S3/microReticulum, _handle_identity_handshake should be treated as a reference behavior for a C++ session manager, not as Python glue with a small helper. The Python BLEInterface.py should eventually delegate identity/session decisions to C++ and only perform Linux/Reticulum adapter actions.

Below is the design I would use for Phase 2.

1. Proposed C++ Header

#pragma once

#include <array>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

namespace ble_reticulum {

using PeerIdentity = std::array<uint8_t, 16>;

enum class Role {
    Unknown,
    Central,
    Peripheral,
};

enum class HandshakeDecision {
    NotHandshake,
    DuplicateKnownIdentity,
    MismatchedKnownIdentity,
    AcceptedNewIdentity,
    RejectedDuplicateIdentity,
    ErrorConsumed,
};

enum class AdapterAction {
    None,
    DisconnectPeer,
    CreateFragmentationState,
    PeerReady,
    UpdatePeerAddress,
    RemovePendingIdentity,
    MarkRealData,
};

struct ConnectionState {
    std::string address;
    Role role = Role::Unknown;
    std::optional<PeerIdentity> known_identity;
    std::optional<uint16_t> negotiated_mtu;
    bool existing_identity_connected_elsewhere = false;
    bool existing_address_connected = false;
    bool existing_address_in_peer_table = false;
    bool identity_has_pending_detach = false;
    bool existing_connection_is_zombie = false;
    std::string existing_address_for_identity;
};

struct HandshakeResult {
    HandshakeDecision decision = HandshakeDecision::NotHandshake;
    std::vector<AdapterAction> actions;

    bool consumed = false;
    bool accepted = false;
    bool should_disconnect = false;

    std::optional<PeerIdentity> peer_identity;
    std::string identity_key;      // first 8 bytes as 16 hex chars
    std::string fragmenter_key;    // full 16 bytes as 32 hex chars

    uint16_t mtu = 23;
    std::string peer_name;
    std::string replacement_old_address;

    std::string warning;
    std::string error;
};

struct PeerSession {
    std::string address;
    PeerIdentity identity;
    std::string identity_key;
    std::string fragmenter_key;
    uint16_t mtu = 23;
    Role role = Role::Unknown;
    double pending_identity_since = 0.0;
    double last_real_data = 0.0;
    bool has_fragmentation_state = false;
    bool peer_ready = false;
};

class BLEProtocolSession {
public:
    explicit BLEProtocolSession(double pending_identity_timeout = 30.0,
                                double zombie_timeout = 120.0);

    HandshakeResult handle_identity_handshake(
        const std::string& address,
        const std::vector<uint8_t>& data,
        const ConnectionState& state,
        double now_seconds);

    void mark_pending_identity(const std::string& address, double now_seconds);
    void remove_pending_identity(const std::string& address);

    bool has_identity_for_address(const std::string& address) const;
    std::optional<PeerIdentity> identity_for_address(const std::string& address) const;

    std::optional<std::string> address_for_identity_key(const std::string& identity_key) const;

    void accept_identity(const std::string& address,
                         const PeerIdentity& identity,
                         uint16_t mtu,
                         Role role,
                         double now_seconds);

    void update_peer_address(const std::string& identity_key,
                             const std::string& new_address,
                             double now_seconds);

    void remove_address(const std::string& address);

    static bool is_identity_handshake_payload(const std::vector<uint8_t>& data);
    static PeerIdentity identity_from_payload(const std::vector<uint8_t>& data);
    static std::string compute_identity_key(const PeerIdentity& identity);
    static std::string compute_fragmenter_key(const PeerIdentity& identity);

private:
    double pending_identity_timeout_;
    double zombie_timeout_;

    std::unordered_map<std::string, PeerIdentity> address_to_identity_;
    std::unordered_map<std::string, std::string> identity_key_to_address_;
    std::unordered_map<std::string, PeerSession> sessions_by_identity_key_;
    std::unordered_map<std::string, double> pending_identity_connections_;

    HandshakeResult consume_known_identity_case(
        const std::string& address,
        const std::vector<uint8_t>& data,
        const PeerIdentity& known_identity) const;

    bool should_reject_duplicate_identity(const std::string& address,
                                          const std::string& identity_key,
                                          const ConnectionState& state) const;
};

} // namespace ble_reticulum

This is intentionally session-oriented, not Reticulum-interface-oriented. It can be bound into Python with pybind11 now and compiled directly into microReticulum later.

2. State Owned By The Class

The C++ class should own protocol/session state that is currently scattered across Python dictionaries:

  • address_to_identity_: address to 16-byte identity.
  • identity_key_to_address_: short identity key to current address.
  • sessions_by_identity_key_: stable peer session table.
  • pending_identity_connections_: addresses waiting for identity handshake.
  • Per-peer MTU value.
  • Per-peer last real data timestamp.
  • Whether fragmentation state should exist for the peer.
  • Whether peer is ready for upper-layer routing.

It should not own Linux driver objects, Reticulum Interface objects, Python locks, or BlueZ-specific objects.

3. Event / Result Structs

HandshakeResult is the main bridge. It should be boring and explicit:

  • decision: what happened.
  • consumed: whether the input data must not go to the reassembler.
  • accepted: whether a new identity was accepted.
  • should_disconnect: whether the adapter should disconnect the peer.
  • actions: ordered adapter work to perform.
  • peer_identity, identity_key, fragmenter_key.
  • mtu.
  • peer_name.
  • replacement_old_address.
  • warning / error strings for adapter logging.

The adapter should never have to reverse-engineer behavior from side effects.

4. Exact Behavior Mapping From Current Python

Current Python behavior at BLEInterface.py:1202:

| Python case | Current behavior | C++ decision |
| --- | --- | --- |
| len(data) != 16 | Return False; normal BLE data path continues | NotHandshake, consumed=false |
| len(data) == 16, known identity exists, payload matches known identity | Log debug, return True, do not reassemble | DuplicateKnownIdentity, consumed=true |
| len(data) == 16, known identity exists, payload differs | Log warning, return True, do not reassemble | MismatchedKnownIdentity, consumed=true, warning set |
| New 16-byte identity, duplicate identity check rejects | Log warning, call driver.disconnect(address), return True | RejectedDuplicateIdentity, consumed=true, should_disconnect=true, action DisconnectPeer |
| New 16-byte identity accepted | Store mappings, get MTU or fallback 23, create fragmentation state, spawn/update peer interface, mark real data, clear pending identity | AcceptedNewIdentity, consumed=true, actions CreateFragmentationState, PeerReady, MarkRealData, RemovePendingIdentity |
| Existing identity_hash already spawned under old address | Update peer interface address and address_to_interface | AcceptedNewIdentity or address update result with action UpdatePeerAddress |
| Any exception | Log error, return True, consume data | ErrorConsumed, consumed=true, error set |

Duplicate identity behavior maps from _check_duplicate_identity:

| Python duplicate subcase | Current behavior | C++ result |
| --- | --- | --- |
| No identity or invalid length | Allow | no rejection |
| Same identity, same address | Allow | no rejection |
| Same identity, different address, pending detach exists | Allow after stale cleanup | action/update state to replace stale address |
| Existing address not connected and not in peer table | Allow after stale cleanup | action/update state to replace stale address |
| Existing connection is zombie | Allow, request old disconnect | action DisconnectPeer for old address plus accept new |
| Existing connection alive | Reject duplicate | RejectedDuplicateIdentity |
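The duplicate subcases can be written as a small reference model for the equivalence tests. This is a sketch, not the current `_check_duplicate_identity`: it assumes the subcases are evaluated in the order listed, reuses field names from the proposed C++ `ConnectionState`, and treats "no existing address for this identity" as the allow case. The return strings are illustrative labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionState:
    # Field names follow the proposed C++ struct; this Python mirror is illustrative.
    address: str
    existing_address_for_identity: Optional[str] = None
    identity_has_pending_detach: bool = False
    existing_address_connected: bool = False
    existing_address_in_peer_table: bool = False
    existing_connection_is_zombie: bool = False


def duplicate_decision(identity: Optional[bytes], state: ConnectionState) -> str:
    """Return 'allow', 'replace_stale', 'disconnect_old' or 'reject'."""
    if not identity or len(identity) != 16:
        return "allow"  # no identity or invalid length
    old = state.existing_address_for_identity
    if old is None or old == state.address:
        return "allow"  # identity unseen, or same identity on the same address
    if state.identity_has_pending_detach:
        return "replace_stale"
    if not state.existing_address_connected and not state.existing_address_in_peer_table:
        return "replace_stale"
    if state.existing_connection_is_zombie:
        return "disconnect_old"  # accept new, ask adapter to drop the old link
    return "reject"  # existing connection is alive
```

Keeping the liveness facts in `ConnectionState` means the decision itself stays free of driver calls and wall-clock reads.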

5. What Remains In Python Adapter

During transition, Python should keep only adapter work:

  • Receive driver callback.
  • Convert Python bytes to C++ input.
  • Build ConnectionState from current Linux/Reticulum state.
  • Call BLEProtocolSession.handle_identity_handshake(...).
  • Log result using RNS.log.
  • If requested, call driver.disconnect(address).
  • If requested, create Python BLEFragmenter / BLEReassembler until those are also session-owned in C++.
  • If requested, call _spawn_peer_interface(...).
  • Maintain compatibility dictionaries only as mirrors while migrating:
    • address_to_identity
    • identity_to_address
    • spawned_interfaces
    • address_to_interface
    • _pending_identity_connections
    • _last_real_data

Eventually Python should stop being the source of truth for identity/session state.
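One way the transitional adapter could apply a result is a plain action dispatcher. This is a sketch under assumptions: actions are represented as strings named after the proposed `AdapterAction` values (a real pybind11 binding would expose enum members instead), and the `adapter` methods are hypothetical wrappers around the existing Python side effects.

```python
def apply_handshake_result(result, address, adapter):
    """Run the adapter side effects requested by a handshake result, in order.

    Returns True when the payload was consumed and must skip reassembly.
    """
    handlers = {
        "DisconnectPeer": lambda: adapter.disconnect(address),
        "CreateFragmentationState": lambda: adapter.create_fragmentation_state(
            result.fragmenter_key, result.mtu),
        "PeerReady": lambda: adapter.peer_ready(result.identity_key, address),
        "UpdatePeerAddress": lambda: adapter.update_peer_address(
            result.identity_key, address),
        "RemovePendingIdentity": lambda: adapter.remove_pending_identity(address),
        "MarkRealData": lambda: adapter.mark_real_data(result.identity_key),
    }
    for action in result.actions:
        handlers[action]()  # a KeyError here flags an action the adapter does not know
    return result.consumed
```

Because the adapter only reacts to the ordered action list, it never has to infer what happened from side effects.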

6. ESP32 / microReticulum Adapter Code

On ESP32-S3, the adapter should replace Python-specific side effects with embedded equivalents:

  • BLE stack callback receives connection handle/address and payload.
  • Adapter builds ConnectionState from NimBLE/ESP-IDF connection table.
  • Calls BLEProtocolSession.handle_identity_handshake(...).
  • If should_disconnect, call BLE stack disconnect for the connection handle.
  • If CreateFragmentationState, allocate or initialize C++ fragmenter/reassembler objects for the peer session.
  • If PeerReady, register the peer/session with microReticulum's link/interface abstraction.
  • If UpdatePeerAddress, update the connection handle/address mapping.
  • If RemovePendingIdentity, clear pending timeout state.
  • If MarkRealData, refresh liveness timestamp.
  • If consumed=false, pass payload to the reassembler/data path.

ESP32 code should use connection handles internally where possible. BLE MAC address should be metadata, not the only stable key, because the Python behavior already shows MAC rotation issues.
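The "identity is the stable key, address is metadata" idea can be modelled in a few lines. The sketch below is a Python model for the test harness, not ESP32 code; `PeerTable` and `bind` are hypothetical names, and the short key uses the same `identity.hex()[:16]` derivation described earlier.

```python
class PeerTable:
    """Sessions keyed by identity; the BLE address is replaceable metadata."""

    def __init__(self):
        self.address_by_identity = {}  # identity_key -> current address
        self.identity_by_address = {}  # address -> identity_key

    def bind(self, identity: bytes, address: str) -> str:
        """Bind an identity to an address; return 'new', 'unchanged' or 'moved'."""
        key = identity.hex()[:16]
        old = self.address_by_identity.get(key)
        if old == address:
            return "unchanged"
        if old is not None:
            # Address rotated: drop the stale reverse mapping.
            self.identity_by_address.pop(old, None)
        self.address_by_identity[key] = address
        self.identity_by_address[address] = key
        return "new" if old is None else "moved"
```

A 'moved' result corresponds to the UpdatePeerAddress action: the session survives while only its address changes.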

7. Unit Test Plan

Tests should compare C++ decisions against current Python behavior before any behavior change.

Core C++ tests:

  • Non-16-byte data returns NotHandshake, consumed=false.
  • 16-byte new identity returns AcceptedNewIdentity, identity key equals identity.hex()[:16].
  • Fragmenter key equals full identity hex.
  • Missing MTU falls back to 23.
  • Provided MTU is preserved.
  • Known identity plus same payload returns DuplicateKnownIdentity, consumed=true.
  • Known identity plus different payload returns MismatchedKnownIdentity, consumed=true.
  • Duplicate same identity at live different address returns RejectedDuplicateIdentity, should_disconnect=true.
  • Duplicate identity with pending detach is accepted and proposes stale address replacement.
  • Duplicate identity with old address disconnected and absent from peer table is accepted.
  • Duplicate identity with zombie old connection is accepted and requests old disconnect.
  • Error-path compatibility: malformed adapter state should produce an ErrorConsumed result with consumed=true, matching the current Python consume-on-error semantics.

Python equivalence tests:

  • Use a fake BLEInterface or small harness around current _handle_identity_handshake.
  • Use a fake driver recording disconnect() and get_peer_mtu().
  • Run identical inputs through Python and C++.
  • Compare:
    • consumed/not consumed
    • disconnect requested/called
    • accepted/rejected
    • identity key
    • fragmenter key
    • MTU
    • pending identity removal
    • mapping updates
    • spawned/update-interface intent

Do not require log string equality. Test decision categories and side effects.
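A comparison harness along these lines could look as follows. It is a sketch: `Outcome` and `find_mismatches` are hypothetical names, and `run_python` / `run_cpp` stand for whatever wrappers normalize each implementation's behavior into an `Outcome`. Log text is left out of the comparison on purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    """Comparable summary of one handshake; log strings are excluded."""
    consumed: bool
    disconnected: bool
    accepted: bool
    identity_key: Optional[str] = None
    fragmenter_key: Optional[str] = None
    mtu: Optional[int] = None
    pending_removed: bool = False


def find_mismatches(cases, run_python, run_cpp):
    """Run each (address, data) case through both runners and collect differences."""
    mismatches = []
    for address, data in cases:
        py = run_python(address, data)
        cpp = run_cpp(address, data)
        if py != cpp:
            mismatches.append((address, data, py, cpp))
    return mismatches
```

An empty list means both implementations agreed on every case; each mismatch tuple carries both outcomes for the failure report.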

8. Migration Steps

  1. Add pure C++ identity/session types alongside current protocol core.
  2. Bind the session manager into ble_protocol_core_cpp for Python tests.
  3. Add tests that encode current _handle_identity_handshake cases without changing BLEInterface.py.
  4. Add a Python adapter harness in tests that converts current Python state into ConnectionState.
  5. Once tests pass, add optional use of C++ session manager behind an environment flag, still mirroring Python dictionaries.
  6. Move identity mapping source of truth to C++ session manager while keeping Python dictionaries as compatibility mirrors.
  7. Move pending identity timeout and last-real-data tracking into C++.
  8. Move fragmenter/reassembler ownership into C++ session state.
  9. Reduce _handle_identity_handshake to adapter code: call C++, perform requested side effects.
  10. Reuse the same C++ session manager from microReticulum ESP32 BLE adapter.
  11. Once embedded path is stable, retire Python-only identity decision logic.
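The environment flag in step 5 could be read with a helper like the one below. The variable name `BLE_RETICULUM_CPP_SESSION` is an assumption made for illustration; the report does not name the flag.

```python
import os


def use_cpp_session(environ=os.environ) -> bool:
    """Opt in to the C++ session manager; the Python path stays the default."""
    value = environ.get("BLE_RETICULUM_CPP_SESSION", "0")  # hypothetical flag name
    return value.strip().lower() in {"1", "true", "yes", "on"}
```

Passing `environ` as a parameter lets tests exercise both paths without mutating the process environment.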

Bottom line: _handle_identity_handshake should become a C++ session-state decision engine plus thin platform adapters. Python remains useful for Linux field testing, but the authoritative BLE protocol/session model should move to C++.