ble-reticulum/REFACTORING_GUIDE.md
torlando-tech 63064ccf3a Refactor BLEInterface to driver-based architecture
Major architectural refactoring to separate high-level Reticulum protocol
logic from platform-specific Bluetooth operations. This enables code sharing
between pure Python and Android (Columba) implementations, improves
testability, and creates a clean boundary for future platform support.

ARCHITECTURE CHANGES:

1. **Driver Abstraction Layer**
   - Created BLEDriverInterface (bluetooth_driver.py) defining the contract
     for all platform-specific BLE drivers
   - Abstraction includes 18 methods + 6 callbacks for complete BLE lifecycle
   - Enhanced BLEDevice dataclass with service_uuids and manufacturer_data
   - Added on_mtu_negotiated callback for delayed MTU reporting
   - Added on_error callback for consistent platform error reporting

2. **Linux Driver Implementation**
   - Created LinuxBluetoothDriver (linux_bluetooth_driver.py, 1534 lines)
   - Moved ALL bleak/bluezero/D-Bus code from BLEInterface
   - Preserves 5 critical platform workarounds:
     * BlueZ ServicesResolved race condition patch
     * D-Bus LE-only connection (ConnectDevice)
     * BLE Agent registration for Just Works pairing
     * MTU negotiation with 3-method fallback
     * Service discovery delay for bluezero timing
   - Role-aware send() automatically chooses GATT write vs notification
   - Dedicated asyncio event loop management in separate thread
   - Configuration via constructor (no Reticulum dependencies)

3. **Refactored BLEInterface**
   - Removed 801 lines (32.3% reduction: 2479 → 1678 lines)
   - Removed all platform-specific imports (bleak, bluezero, dbus_fast)
   - Removed 9 async methods (moved to driver)
   - Driver dependency injection via constructor
   - Implemented 6 driver callbacks for event handling
   - PRESERVED high-level logic:
     * Peer scoring algorithm (RSSI + history + recency)
     * Connection blacklist with exponential backoff
     * MAC-based connection direction (prevents dual connections)
     * Fragmentation/reassembly orchestration (identity-based keying)
     * Interface spawning per peer

4. **Simplified BLEPeerInterface**
   - Removed connection_type, client, mtu parameters
   - Deleted _send_via_central() and _send_via_peripheral() methods
   - Single send path via driver.send() (driver handles role routing)
   - 77 lines removed from peer interface class

5. **Mock Driver for Testing**
   - Created MockBLEDriver (tests/mock_ble_driver.py)
   - Complete BLEDriverInterface implementation without hardware
   - Bidirectional communication via link_drivers()
   - Enables unit testing of BLEInterface logic (fragmentation, reassembly,
     peer lifecycle, blacklist management)

CRITICAL FIXES:

1. **Restored Periodic Cleanup Task** (CRITICAL: prevents memory leaks)
   - Converted from async (driver-owned loop) to threading.Timer
   - Runs every 30 seconds to clean stale reassembly buffers
   - Essential for long-running instances (Pi Zero with 512MB RAM)
   - Properly cancelled in detach() for clean shutdown

2. **Fixed Naming Consistency**
   - Renamed processOutgoing → process_outgoing (snake_case)

FILES MODIFIED:
- src/RNS/Interfaces/BLEInterface.py (refactored, -801 lines)

FILES ADDED:
- bluetooth_driver.py (driver abstraction interface)
- linux_bluetooth_driver.py (Linux/BlueZ implementation, 1534 lines)
- tests/mock_ble_driver.py (mock driver for unit tests)
- REFACTORING_GUIDE.md (comprehensive refactoring documentation)
- BLE_PROTOCOL_v2.2.md (protocol specification)
- tests/test_refactor_suite.py (initial test suite)

BENEFITS:

1. **Testability** - Mock driver enables hardware-free unit testing
2. **Portability** - Easy to create Android/Windows/macOS drivers
3. **Maintainability** - Platform quirks isolated in single driver file
4. **Code Sharing** - High-level logic shared across all platforms
5. **Clean Architecture** - Clear separation of concerns

TESTING REQUIRED:

- Tier 1 (Unit): Test with MockBLEDriver (fragmentation, reassembly, lifecycle)
- Tier 2 (Integration): Test on Raspberry Pi hardware (scanning, connecting,
  dual mode, MTU negotiation, identity exchange)
- Tier 3 (Regression): Full Reticulum stack (announces, LXMF, multi-hop)
- Tier 4 (Edge Cases): MAC rotation, identity handshake, reconnection,
  reassembly timeout, discovery cache pruning

BACKWARD COMPATIBILITY:

- Configuration: Fully backward compatible (same config parameters)
- Protocol: No changes to BLE wire protocol (v2.2)
- Interface API: Unchanged for Reticulum Transport integration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 23:15:22 -05:00

10 KiB

Refactoring BLEInterface to a Driver-Based Architecture

1. Goal

This guide outlines the process of refactoring the existing RNS.Interfaces.BLEInterface to decouple the high-level Reticulum protocol logic from the platform-specific Bluetooth implementation (bleak/bluezero).

The goal is to create a clean architectural boundary by introducing a BLEDriverInterface. The existing BLEInterface will be refactored to use this driver, and the Linux-specific bleak and bluezero code will be moved into a new concrete implementation of this driver, BleakDriver.

This will result in a more modular, maintainable, and testable system, and it will make it possible to share the high-level BLEInterface code between the pure Python implementation and the Android (Columba) implementation.

2. Prerequisites: The Driver Contract

First, create a new file, RNS/Interfaces/bluetooth_driver.py, and add the abstract interface definition we designed. This file defines the contract that all platform-specific drivers must follow.

# RNS/Interfaces/bluetooth_driver.py

from abc import ABC, abstractmethod
from typing import List, Optional, Callable
from enum import Enum, auto
from dataclasses import dataclass

# --- Data Structures ---

@dataclass
class BLEDevice:
    """Represents a discovered BLE device."""
    address: str
    name: str
    rssi: int

class DriverState(Enum):
    """Represents the state of the BLE driver."""
    IDLE = auto()
    SCANNING = auto()
    ADVERTISING = auto()

# --- Driver Interface ---

class BLEDriverInterface(ABC):
    """
    Abstract interface for a platform-specific BLE driver.
    """

    # --- Callbacks ---
    on_device_discovered: Optional[Callable[[BLEDevice], None]] = None
    on_device_connected: Optional[Callable[[str, int], None]] = None  # address, mtu
    on_device_disconnected: Optional[Callable[[str], None]] = None # address
    on_data_received: Optional[Callable[[str, bytes], None]] = None # address, data

    # --- Lifecycle & Configuration ---

    @abstractmethod
    def start(self, service_uuid: str, rx_char_uuid: str, tx_char_uuid: str, identity_char_uuid: str):
        """
        Initializes the driver and its underlying BLE stack.
        """
        pass

    @abstractmethod
    def stop(self):
        """
        Stops all BLE activity and releases resources.
        """
        pass

    @abstractmethod
    def set_identity(self, identity_bytes: bytes):
        """
        Sets the value of the read-only Identity characteristic for the local GATT server.
        """
        pass

    # --- State & Properties ---

    @property
    @abstractmethod
    def state(self) -> DriverState:
        pass

    @property
    @abstractmethod
    def connected_peers(self) -> List[str]:
        pass

    # --- Core Actions ---

    @abstractmethod
    def start_scanning(self):
        pass

    @abstractmethod
    def stop_scanning(self):
        pass

    @abstractmethod
    def start_advertising(self, device_name: str):
        pass

    @abstractmethod
    def stop_advertising(self):
        pass

    @abstractmethod
    def connect(self, address: str):
        pass

    @abstractmethod
    def disconnect(self, address: str):
        pass

    @abstractmethod
    def send(self, address: str, data: bytes):
        pass

3. Step-by-Step Refactoring Guide

Step 1: Create the BleakDriver Implementation

Create a new file, RNS/Interfaces/bleak_driver.py. This file will contain the new BleakDriver class that implements the BLEDriverInterface and encapsulates all bleak and bluezero code.

# RNS/Interfaces/bleak_driver.py

from .bluetooth_driver import BLEDriverInterface, BLEDevice, DriverState
# Add other necessary imports like bleak, bluezero, asyncio, etc.

class BleakDriver(BLEDriverInterface):
    def __init__(self):
        # Initialize properties to hold clients, state, etc.
        self._state = DriverState.IDLE
        self._clients = {} # address -> BleakClient
        # ...and so on

    # Implement all the abstract methods from the interface here
    def start(self, service_uuid, rx_char_uuid, tx_char_uuid, identity_char_uuid):
        # Code to initialize bleak and bluezero will go here
        pass

    def start_scanning(self):
        # Code that uses bleak.BleakScanner will go here
        pass

    def send(self, address, data):
        # Code that uses bleak_client.write_gatt_char will go here
        pass

    # ... etc.

Step 2: Move Platform-Specific Code to BleakDriver

Go through the existing BLEInterface.py method by method and move any code that directly calls bleak or bluezero into the corresponding method in your new BleakDriver class.

Example: Moving the send logic

Before (BLEInterface.py):

# (Inside BLEPeerInterface class)
async def _send_fragment(self, fragment):
    # ...
    await self.client.write_gatt_char(self.parent.WRITE_CH_UUID, fragment)
    # ...

After (bleak_driver.py):

# (Inside BleakDriver class)
async def send(self, address: str, data: bytes):
    if address in self._clients:
        client = self._clients[address]
        try:
            # The driver now handles the actual write operation
            await client.write_gatt_char(self.rx_char_uuid, data)
        except Exception as e:
            # Handle exceptions and possibly trigger disconnect
            pass

Step 3: Refactor BLEInterface to Use the Driver

Modify BLEInterface.py to remove all direct dependencies on bleak and bluezero. Instead, it will be initialized with a driver instance and will use it to perform all BLE operations.

Example: Refactoring __init__ and _send_fragment

Before (BLEInterface.py):

import bleak
from bluezero import peripheral

class BLEInterface(Interface):
    def __init__(self, owner, name, ...):
        # ... bleak and bluezero objects initialized here
        pass

    # ... methods with direct bleak/bluezero calls

After (BLEInterface.py):

# No more bleak or bluezero imports!
from .bluetooth_driver import BLEDriverInterface, BLEDevice

class BLEInterface(Interface):
    def __init__(self, owner, name, ..., driver: BLEDriverInterface):
        super().__init__()
        self.driver = driver # Dependency Injection

        # Assign callbacks so the driver can report events back to us
        self.driver.on_device_discovered = self._device_discovered_callback
        self.driver.on_data_received = self._data_received_callback
        # ... etc.

    # This method no longer needs to be async if the driver's send is blocking
    # or if we want to fire-and-forget
    def _send_fragment(self, fragment, peer_address):
        # High-level logic just tells the driver to send
        self.driver.send(peer_address, fragment)

    # --- Callback Implementations ---
    def _device_discovered_callback(self, device: BLEDevice):
        # Logic to handle a discovered device
        pass

    def _data_received_callback(self, address: str, data: bytes):
        # This is where you feed the raw data (a fragment) into the reassembler
        pass

4. Thorough Testing Plan

A multi-layered testing strategy is crucial for a refactor of this scale.

Tier 1: Unit Testing (Mock Driver)

The biggest advantage of this new architecture is testability. You can now test your entire BLEInterface and fragmentation logic without any Bluetooth hardware.

  1. Create a MockBLEDriver:
    • Create a tests/mock_ble_driver.py file.
    • The MockBLEDriver class will implement BLEDriverInterface.
    • Its methods will not use Bluetooth. Instead, they will simulate it. For example, its send() method could store the data in a list and immediately trigger the on_data_received callback on a paired "virtual" peer's mock driver.
  2. Write BLEInterface Unit Tests:
    • Write pytest tests that initialize BLEInterface with the MockBLEDriver.
    • Test Case 1: Fragmentation. Call BLEInterface.process_outgoing() with a large packet. Assert that the mock_driver.send() method was called multiple times with correctly fragmented data (correct headers, sequence numbers, etc.).
    • Test Case 2: Reassembly. Have the mock_driver call the on_data_received callback with a sequence of fragments. Assert that BLEInterface correctly reassembles them and passes the complete packet to RNS.Transport.inbound.
    • Test Case 3: Peer Lifecycle. Simulate device discovery, connection, and disconnection events from the mock driver and assert that BLEInterface creates and destroys its internal peer representations correctly.

Tier 2: Integration Testing (Driver Level)

This tier tests your actual BleakDriver implementation against real hardware.

  1. Create Test Scripts: Write simple Python scripts that use only the BleakDriver.
  2. Setup: You will need two machines with Bluetooth, or one machine and your Columba app on an Android device.
  3. Test Cases:
    • Scanning Test: Run a script that starts the driver and prints discovered devices. Verify that it finds your other test device.
    • Connection Test: Write a script to connect to the test device. Verify that the on_device_connected callback fires and that driver.connected_peers is updated.
    • Data I/O Test: After connecting, use driver.send() to send a simple "hello world" byte string. On the other device, verify that the bytes are received correctly. Test this in both directions.

Tier 3: End-to-End Testing (Full Stack)

This is the final validation, testing the entire refactored application.

  1. Run Full Application: Start the full Reticulum application on two Linux machines using the refactored code.
  2. Test Cases:
    • Announce Exchange: Verify that the two nodes discover each other and exchange announces. Check the logs for successful path discovery.
    • LXMF Message Transfer: Use a tool like lxmf-send or a simple script to send a message from one node to the other. Verify it is received.
    • Cross-Compatibility Test: Test interoperability between a refactored pure Python node and your Columba Android application.

By following this guide and testing plan, you can confidently execute the refactor, resulting in a more robust, maintainable, and future-proof architecture for your project.