Add tests to test_zombie_connection_detection.py (which CI runs) to cover:
- _handle_identity_handshake: non-16-byte rejection, duplicate handling
- _pending_identity_connections cleanup after handshake
- _spawn_peer_interface zombie tracking initialization
These tests cover the same code paths as test_v2_2_identity_handshake.py
but are in a file that CI includes, achieving 100% patch coverage.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace Mock-based fixtures with real BLEInterface instances in
stale identity check tests. This ensures coverage.py properly
tracks execution of production code paths.
The Mock approach with method binding executed the production code,
but coverage tracking was inconsistent. Using real instances
guarantees proper coverage attribution.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add test for _pending_identity_connections cleanup during successful
identity handshake (lines 1272-1275), achieving 100% patch coverage
for PR #38 changes.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tests verify that:
- Duplicate 16-byte handshake matching known identity is consumed
- Different 16-byte data is also consumed to prevent reassembler errors
- Non-16-byte data is not incorrectly consumed as handshake
- Normal handshake processing works when identity not yet known
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Kotlin provides the identity via callback (from the identity characteristic read),
the address_to_identity mapping gets set BEFORE the 16-byte handshake data arrives
through _data_received_callback. Previously, _handle_identity_handshake would see that the
identity already existed and return False, so the 16-byte handshake data was
passed to the reassembler, where it failed with "Invalid fragment type 0xXX".
The fix checks if received 16-byte data matches the known identity and consumes it
silently if so. This prevents the handshake data from being misinterpreted as a
fragment.
Symptoms fixed:
- BLEReassembler: Invalid fragment type 0xc9 (first byte of peer identity)
- Messages not flowing even though connections appear established
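A minimal sketch of the check described above, assuming identities are raw
16-byte values keyed by peer address. The function name and the plain dict are
illustrative only and are not the project's actual API.

```python
from typing import Dict

IDENTITY_LEN = 16  # handshake payload size stated in the commit message


def is_redundant_handshake(address: str, data: bytes,
                           address_to_identity: Dict[str, bytes]) -> bool:
    """Return True if `data` repeats the identity already known for `address`."""
    known = address_to_identity.get(address)
    if known is None:
        # Identity not known yet: normal handshake processing applies.
        return False
    # Consume the packet only if it is exactly the known 16-byte identity.
    return len(data) == IDENTITY_LEN and data == known


# Hypothetical peer whose identity starts with 0xc9, as in the symptom above.
known_identities = {"AA:BB:CC:DD:EE:FF": bytes([0xC9]) + bytes(15)}
repeat = is_redundant_handshake("AA:BB:CC:DD:EE:FF",
                                bytes([0xC9]) + bytes(15), known_identities)
fragment = is_redundant_handshake("AA:BB:CC:DD:EE:FF",
                                  b"\x01payload", known_identities)
```

Here `repeat` is True (the data is swallowed instead of reaching the
reassembler) and `fragment` is False (the data continues to reassembly).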
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When BLE link degrades, 1-byte keepalives may still work while larger data
packets fail. Both sides think the connection is "alive" based on keepalives,
but data can't flow. This causes a deadlock where new connections are
rejected as "duplicates" even though the existing connection is non-functional.
This change adds zombie detection by tracking when real data (not keepalives)
was last received. If an existing connection has only exchanged keepalives
for > 30 seconds (configurable via _zombie_timeout), new connections from
the same identity are allowed and the zombie connection is disconnected.
Changes:
- Add _last_real_data dict to track last real data timestamp per identity
- Add _zombie_timeout (default 30s) for configurable zombie threshold
- Update _check_duplicate_identity with Check 3: zombie detection
- Update _handle_ble_data to track real data activity after keepalive filter
- Initialize tracking in _handle_identity_handshake and _spawn_peer_interface
- Clean up tracking in _process_pending_detaches
- Add comprehensive test suite for zombie detection
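The zombie test can be sketched as follows. This is an illustration under
assumptions, not the actual _check_duplicate_identity code: the function name,
the `now` parameter, and the choice to treat a missing tracking entry as
"not a zombie" are all hypothetical.

```python
import time
from typing import Dict, Optional

ZOMBIE_TIMEOUT = 30.0  # seconds; default stated in the commit message


def is_zombie(identity: str, last_real_data: Dict[str, float],
              timeout: float = ZOMBIE_TIMEOUT,
              now: Optional[float] = None) -> bool:
    """True if no real (non-keepalive) data arrived within `timeout` seconds."""
    now = time.time() if now is None else now
    last = last_real_data.get(identity)
    if last is None:
        # Assumption: without a tracking entry we cannot call it a zombie.
        return False
    return (now - last) > timeout


def record_real_data(identity: str, last_real_data: Dict[str, float],
                     now: Optional[float] = None) -> None:
    """Call after the keepalive filter, so keepalives never refresh the stamp."""
    last_real_data[identity] = time.time() if now is None else now
```

Because only non-keepalive packets refresh the timestamp, a link that still
passes 1-byte keepalives ages out once the timeout elapses.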
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a peer disconnects, identity_to_address is NOT cleaned up immediately -
it's only removed after a 2-second grace period. However, _check_duplicate_identity
was not checking if the existing address is still connected before rejecting.
This caused legitimate reconnections from the same identity (for example after
MAC rotation) to be incorrectly rejected as "duplicates" during the grace
period or when cleanup was delayed.
The fix adds two checks before rejecting:
1. If pending_detach exists for this identity (old connection already gone)
2. If existing address is not in connected_peers or peers dict
Also adds TDD tests that demonstrate the bug and verify the fix.
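A rough sketch of the two checks, assuming pending detaches are keyed by
identity and that "connected_peers or peers dict" can be collapsed into one set
of connected addresses. Names and data shapes are illustrative only.

```python
from typing import Dict, Set


def is_duplicate_identity(identity: str, new_address: str,
                          identity_to_address: Dict[str, str],
                          pending_detach: Dict[str, float],
                          connected_addresses: Set[str]) -> bool:
    """Return True only if `identity` is live at a different address."""
    existing = identity_to_address.get(identity)
    if existing is None or existing == new_address:
        return False
    # Check 1: old connection is already gone and awaiting grace-period cleanup.
    if identity in pending_detach:
        return False
    # Check 2: the mapping is stale because the old address is not connected.
    if existing not in connected_addresses:
        return False
    return True
```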
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add tests that exercise the actual code path in linux_bluetooth_driver.py
for duplicate identity exception handling. These tests patch BleakClient
to verify that:
- Duplicate identity exceptions are logged as WARNING, not ERROR
- on_error callback uses 'info' severity for duplicate identity errors
- Normal connection failures still use 'error' severity
This improves patch coverage for the duplicate identity handling fix
by testing the driver code directly rather than just the logic in isolation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix BLEInterface.handle_peripheral_data to use _compute_identity_hash
instead of RNS.Identity.full_hash for consistent identity hash computation
- Update MockBLEDriver.on_device_connected callback to match the
(address, peer_identity) signature in bluetooth_driver.py
- Fix test_v2_2_identity_handshake.py and test_v2_2_race_conditions.py
to properly mock ble_reticulum.Interface without breaking the namespace
- Use BLEFragmenter/BLEReassembler directly in tests instead of
non-existent _create_fragmenter/_create_reassembler methods
- Fix asyncio.get_event_loop() deprecation in test_ble_peer_interface.py
for Python 3.10+ compatibility
- Update MAC address test fixtures to account for v2.2 MAC sorting logic
- Fix test_peer_address_mac_rotation to properly simulate MAC rotation
where old connection is dropped before new one arrives
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Document the narrow race window where data could arrive from an old MAC
address before onAddressChanged callback invalidates the cache entry.
The window is very small since onAddressChanged fires synchronously
during Kotlin deduplication, and _address_changed_callback() cleans up
the stale cache entry.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, _handle_identity_handshake (peripheral mode) did not check
for duplicate identities. If a peer connected via two MACs simultaneously,
both connections could be accepted.
Now, _handle_identity_handshake calls _check_duplicate_identity before
accepting the handshake. If the identity is already connected at a
different MAC, the new connection is rejected and disconnected.
This makes both central and peripheral modes consistent in rejecting
duplicate connections during MAC rotation overlap.
Also adds tests for peripheral mode duplicate rejection.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a peer connects with an identity already connected at a different
MAC address (Android MAC rotation), the connection is correctly rejected.
However, the error message format "Connection failed to {address}" was
matching the blacklist regex, causing the new MAC to be blacklisted.
After 3 duplicate rejections, the new MAC would be blacklisted for 60s+,
creating connectivity gaps when the old MAC finally disconnected.
Fix:
- Detect "Duplicate identity" in exception message
- Use severity "info" instead of "error" (doesn't trigger blacklist)
- Use safe message format "Duplicate identity rejected for {address}"
which doesn't match the blacklist regex pattern
Also adds comprehensive tests for MAC rotation blacklist behavior.
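The classification can be sketched like this. The blacklist regex is not shown
in this log, so BLACKLIST_PATTERN below is a hypothetical stand-in, and the
function name is illustrative rather than the driver's real code.

```python
import re
from typing import Tuple

# Hypothetical stand-in for the blacklist pattern; the real regex is not quoted here.
BLACKLIST_PATTERN = re.compile(r"Connection failed to ([0-9A-Fa-f:]{17})")


def classify_connect_error(address: str, exc: Exception) -> Tuple[str, str]:
    """Return (severity, message) for a failed connection attempt."""
    if "Duplicate identity" in str(exc):
        # 'info' severity and a message that the blacklist pattern cannot match.
        return "info", f"Duplicate identity rejected for {address}"
    return "error", f"Connection failed to {address}: {exc}"


severity, message = classify_connect_error(
    "AA:BB:CC:DD:EE:FF", RuntimeError("Duplicate identity at another MAC"))
```

With the stand-in pattern, `message` no longer matches the blacklist regex,
while ordinary failures keep the original format and 'error' severity.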
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Expand _compute_identity_hash docstring to explain:
- Uses truncated 64-bit keys for spawned_interfaces and identity_to_address
- Birthday-bound collision risk becomes significant at ~2^32 (~4.3 billion) identities
- Astronomically safe for BLE mesh networks with <100 peers
- Note that fragmenter keys use full 32-char hex for packet reassembly
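The collision estimate can be checked with the standard birthday approximation
p ~= 1 - exp(-n(n-1) / 2^(b+1)) for b-bit keys. The hex identity below is a
made-up example value.

```python
import math


def collision_probability(n_identities: int, key_bits: int = 64) -> float:
    """Birthday approximation for at least one collision among n random keys."""
    exponent = n_identities * (n_identities - 1) / 2.0 ** (key_bits + 1)
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate for tiny x


full_hex = "c9" * 16        # hypothetical 32-char (128-bit) hex identity
short_key = full_hex[:16]   # 16 hex chars = 64-bit truncated key

p_small_mesh = collision_probability(100)      # about 2.7e-16
p_at_bound = collision_probability(2 ** 32)    # about 0.39
```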
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add test suite to verify that BLEPeerInterface.peer_address is properly
updated when BLE MAC address rotation occurs. Tests cover all 4 code paths:
- _address_changed_callback: primary path for address migration
- _mtu_negotiated_callback: when interface exists for identity at new address
- _handle_identity_handshake: when identity arrives at new address
- _spawn_peer_interface: when reusing interface for new address
Also includes tests for:
- Proper logging of address updates for debugging
- Consistency between peer_address and address_to_interface mapping
- Multiple consecutive MAC rotations
These tests prevent regression of the bidirectional BLE communication bug
where peripheral->central sends failed after MAC rotation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When BLE MAC address rotation occurs (same identity, different address),
the BLEPeerInterface.peer_address field was not being updated. This caused
sends to fail with "Cannot send - not connected" because Python was using
the stale address that no longer matched Kotlin's connectedPeers map.
This fix updates peer_address in all code paths where MAC rotation can occur:
- _mtu_negotiated_callback: when interface already exists for identity
- _handle_identity_handshake: when interface already exists for identity
- _address_changed_callback: when address migration is triggered
- _spawn_peer_interface: when reusing existing interface for new address
Fixes bidirectional BLE communication failure where peripheral could not
send data to central after MAC rotation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When _process_pending_detaches() finds that an address has reconnected
during the grace period, the pending detach entry was not being removed
from _pending_detach dict. This caused the entry to be re-evaluated on
every cleanup cycle.
Now properly deletes the entry when cancelling the detach.
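A minimal sketch of the corrected behavior, assuming _pending_detach maps a key
(identity or address) to the time the detach was scheduled. The function
signature is illustrative; the 2-second default reflects the grace period
mentioned elsewhere in this log.

```python
import time
from typing import Dict, List, Optional, Set


def process_pending_detaches(pending_detach: Dict[str, float],
                             connected: Set[str],
                             grace_period: float = 2.0,
                             now: Optional[float] = None) -> List[str]:
    """Return keys whose grace period expired; drop entries that reconnected."""
    now = time.time() if now is None else now
    expired = []
    for key, scheduled_at in list(pending_detach.items()):
        if key in connected:
            # Reconnected during the grace period: cancel AND delete the entry,
            # so it is not re-evaluated on every cleanup cycle.
            del pending_detach[key]
            continue
        if now - scheduled_at >= grace_period:
            del pending_detach[key]
            expired.append(key)
    return expired
```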
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add test coverage for new BLEInterface cleanup functionality:
- _cleanup_pending_identity_connections: timeout handling for non-Reticulum devices
- _process_pending_detaches: delayed interface detachment with grace period
- _validate_spawned_interfaces: orphaned interface cleanup
- Multi-address disconnect handling (keeping interface alive for MAC rotation)
- Thread-safe _spawn_peer_interface with locking
- Central disconnect callback with address fallback
These tests cover the 195 lines of new code in BLEInterface.py to improve
patch coverage for PR #35.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix mock_ble_driver.py import path (src/RNS/Interfaces -> src/ble_reticulum)
- Add address_to_interface, _pending_detach, _pending_detach_grace_period to test fixture
- Update test_identity_cache to expect grace period behavior (deferred cleanup)
- Update test_v2_2_mac_sorting to use renamed _cleanup_stale_address function
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rewrite _validate_spawned_interfaces() with 3-pass approach:
- Pass 1: Collect orphaned addresses
- Pass 2: Clean up address mappings, track interfaces to detach
- Pass 3: Only detach interfaces with zero connected addresses
- Fragmenters only cleaned up when interface fully detached
- Enhance _spawn_peer_interface() reuse logic:
- Update address_to_identity and identity_to_address when reusing
- Cancel pending detach for the identity
- Mark interface as online
- Fix disconnect callbacks to preserve fragmenters:
- _device_disconnected_callback: defer fragmenter cleanup to grace period
- handle_central_disconnected: same fragmenter preservation
- _process_pending_detaches: clean up fragmenters on actual detach
- Rename _cleanup_stale_interface() to _cleanup_stale_address():
- No longer detaches interface during MAC rotation
- Only cleans up stale address-specific mappings
- Interface preserved for reuse with new address
Fixes orphaned peer interfaces and "No fragmenter for peer" warnings
during BLE MAC address rotation.
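The 3-pass approach can be sketched as below. Interfaces are represented by
plain string ids and the function returns the interfaces to detach; both are
simplifications for illustration, not the real method's signature.

```python
from typing import Dict, List, Set


def validate_spawned_interfaces(address_to_interface: Dict[str, str],
                                connected_peers: Set[str]) -> List[str]:
    """Remove orphaned address mappings; return interfaces safe to detach."""
    # Pass 1: collect addresses the driver no longer reports as connected.
    orphaned = [a for a in address_to_interface if a not in connected_peers]

    # Pass 2: clean up address mappings and remember the affected interfaces.
    candidates: List[str] = []
    for address in orphaned:
        interface = address_to_interface.pop(address)
        if interface not in candidates:
            candidates.append(interface)

    # Pass 3: detach only interfaces with zero remaining connected addresses.
    still_in_use = set(address_to_interface.values())
    return [i for i in candidates if i not in still_in_use]
```

An interface that lost one address during MAC rotation but still has another
connected address survives pass 3, which is what keeps its fragmenter alive.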
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Connections from non-Reticulum BLE devices (AirTags, BLE scanners, etc.)
that connect to our GATT server but never complete the identity handshake
are now automatically disconnected after 30 seconds.
Changes:
- Track pending identity connections with timestamps in _pending_identity_connections
- Add _cleanup_pending_identity_connections() to disconnect stale connections
- Remove from pending tracking when identity is provided in callback
- Add debug logging for cleanup timer operations
This prevents non-protocol devices from appearing indefinitely in the
BLE connections list.
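A small sketch of the timeout sweep, assuming _pending_identity_connections
maps an address to its connection timestamp. The helper name is hypothetical,
and the caller would disconnect each returned address.

```python
import time
from typing import Dict, List, Optional

PENDING_IDENTITY_TIMEOUT = 30.0  # seconds, per the commit message


def pop_stale_pending(pending: Dict[str, float],
                      timeout: float = PENDING_IDENTITY_TIMEOUT,
                      now: Optional[float] = None) -> List[str]:
    """Remove and return addresses that never completed the handshake in time."""
    now = time.time() if now is None else now
    stale = [addr for addr, connected_at in pending.items()
             if now - connected_at > timeout]
    for addr in stale:
        del pending[addr]
    return stale
```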
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
BLE peer interfaces weren't being cleaned up when connections dropped
if the identity-to-address mapping wasn't available at disconnect time.
This caused orphaned interfaces to persist (peer interfaces shown with
zero active connections).
Changes:
- Add address_to_interface mapping for direct address-based cleanup
- Update _device_disconnected_callback with dual-index approach:
try identity lookup first, fall back to address_to_interface
- Update handle_central_disconnected with same dual-index approach
- Add _validate_spawned_interfaces() periodic validation (every 30s)
that cross-checks interfaces against driver.connected_peers
- Update _cleanup_stale_interface and _address_changed_callback to
maintain the new mapping
- Clear address_to_interface on detach()
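The dual-index lookup might look like the following sketch. Interfaces are
stand-in strings and the function name is illustrative only.

```python
from typing import Dict, Optional


def find_interface_on_disconnect(address: str,
                                 address_to_identity: Dict[str, str],
                                 spawned_interfaces: Dict[str, str],
                                 address_to_interface: Dict[str, str]) -> Optional[str]:
    """Try the identity index first, then fall back to the address index."""
    identity = address_to_identity.get(address)
    if identity is not None and identity in spawned_interfaces:
        return spawned_interfaces[identity]
    # Identity mapping missing at disconnect time: use the direct address index.
    return address_to_interface.get(address)
```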
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous patch changed RNS.log in the test's namespace, but
BLEInterface.py imports RNS at module load time. To capture log
calls from _address_changed_callback(), we need to patch RNS.log
where it's used: ble_reticulum.BLEInterface.RNS.log
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test was using mock_rns.log.assert_called() but the real
BLEInterface code calls the actual RNS.log function, not the mock.
Fixed by patching RNS.log at module level to capture actual log calls,
then asserting the "no identity found" warning was logged.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 14 tests that exercise the REAL BLEInterface code:
- TestIdentityCacheOnDisconnect: verify caching on disconnect
- TestIdentityCacheOnDataReceive: verify cache restoration on data
- TestAddressChangedCallback: verify address migration
- TestCacheTTL: verify 60-second TTL behavior
- TestReassemblyCodePath: verify cache in reassembly path
- TestEdgeCases: concurrent access, multiple disconnects
Tests skip locally if RNS is not installed but run in CI to provide
actual line coverage on the identity cache changes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add TestIdentityCacheIntegration class that imports and tests actual
BLEInterface methods instead of just mocking the logic. This should
provide codecov coverage on the changed lines.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add CODECOV_TOKEN to codecov-action uploads (required for v4)
- Change unit test coverage from single file to entire package
- Add codecov.yml with coverage thresholds and flags
Note: CODECOV_TOKEN secret must be added to repository settings.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Python's disconnect callback fires but the driver layer (Android/Kotlin)
maintains or quickly re-establishes the GATT connection, data was being
dropped because address_to_identity was cleared.
Changes:
- Add _identity_cache with 60-second TTL to preserve identities after disconnect
- Cache identity in _device_disconnected_callback before cleanup
- Check cache in _handle_ble_data and restore identity if found
- Add on_address_changed callback for dual connection deduplication
- Add _address_changed_callback to migrate identity mappings
- Support driver.request_identity_resync() for fallback recovery
This fixes the "no identity for peer X, dropping data" warning that occurred
when the Python layer lost track of a peer that was still connected at the
driver level.
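A minimal sketch of a TTL cache like the one described, assuming entries are
keyed by address. The class and method names are hypothetical and do not mirror
the real _identity_cache structure.

```python
import time
from typing import Dict, Optional, Tuple


class IdentityCache:
    """Remember identities briefly after disconnect so late data can be matched."""

    def __init__(self, ttl: float = 60.0) -> None:
        self.ttl = ttl
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def remember(self, address: str, identity: bytes,
                 now: Optional[float] = None) -> None:
        self._entries[address] = (identity, time.time() if now is None else now)

    def lookup(self, address: str, now: Optional[float] = None) -> Optional[bytes]:
        now = time.time() if now is None else now
        entry = self._entries.get(address)
        if entry is None:
            return None
        identity, stored_at = entry
        if now - stored_at > self.ttl:
            del self._entries[address]  # expired: evict on access
            return None
        return identity
```

In this sketch the disconnect path would call `remember()` before clearing
address_to_identity, and the data path would call `lookup()` when no identity
is found for the sender.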
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update install.sh to copy from src/ble_reticulum/
- Update test files with new source paths
- Update GitHub workflows for new package structure
- Remove temporary refactoring helper scripts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The sed replacement was too aggressive - it replaced the import for
the base Interface class from the Reticulum package itself.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes namespace collision with Reticulum's own RNS.Interfaces package.
When both packages were installed, the collision caused import issues
and prevented BLE discovery between devices.
Changes:
- Rename src/RNS/Interfaces/ to src/ble_reticulum/
- Update pyproject.toml package configuration
- Update all imports in source and test files
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The peer_identity parameter is already the identity hash received from
the BLE handshake. Calling RNS.Identity.full_hash() on it again produced
a completely different value, causing identity mismatch between peers.
This caused "no reassembler for X" errors because the sending peer's
identity didn't match what the receiving peer computed.
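The mismatch is easy to reproduce in isolation. The sketch below uses
hashlib.sha256 as a stand-in for RNS.Identity.full_hash and assumes a 16-byte
truncated identity hash; the key bytes are made up.

```python
import hashlib


def full_hash(data: bytes) -> bytes:
    """Stand-in for RNS.Identity.full_hash, assumed here to be SHA-256."""
    return hashlib.sha256(data).digest()


public_key = b"example public key bytes"      # hypothetical key material
identity_hash = full_hash(public_key)[:16]    # what the handshake is assumed to carry
rehashed = full_hash(identity_hash)[:16]      # the buggy second hashing step

mismatch = identity_hash != rehashed
```

`mismatch` is True: hashing an already hashed identity yields an unrelated
value, so the two peers end up with different lookup keys.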
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(ble): Increase D-Bus monitoring intervals to prevent HCI errors
The D-Bus monitoring threads were polling too frequently (0.5s and 30s),
causing HCI command collisions on BCM43xx single-radio chips. These chips
cannot handle concurrent BLE operations, and the frequent D-Bus activity
was interfering with scan/advertise cycles.
Changes:
- Increase D-Bus disconnect monitor interval from 0.5s to 5s
- Increase stale connection poll interval from 30s to 120s
This eliminates HCI errors (Opcode 0x2005/0x2006) while preserving
disconnect detection functionality with slightly higher latency.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(ble): Convert D-Bus monitoring to event-driven approach
Replace polling-based D-Bus monitoring with true event-driven pattern:
1. D-Bus monitor thread:
- Use asyncio.Event instead of periodic sleep
- Store loop reference for thread-safe shutdown
- Use call_soon_threadsafe to wake loop on stop
2. Stale poll thread:
- Replace busy-wait loop (240 x 0.5s) with single Event.wait()
- Increase interval from 120s to 300s (safety net only)
- Immediate response to stop signal
Benefits:
- Zero CPU usage while waiting (no periodic wakeups)
- Immediate shutdown response (ms instead of 5s)
- Cleaner, simpler code
- Maintains disconnect detection via D-Bus signals
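The pattern can be sketched as follows, under stated assumptions: the class and
attribute names are illustrative, and the real monitor also subscribes to D-Bus
signals, which is omitted here.

```python
import asyncio
import threading
from typing import Optional


class EventDrivenMonitor:
    """Background thread that idles on an asyncio.Event instead of polling."""

    def __init__(self) -> None:
        self._loop: Optional[asyncio.AbstractEventLoop] = None
        self._stop_event: Optional[asyncio.Event] = None
        self._ready = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        asyncio.run(self._main())

    async def _main(self) -> None:
        self._loop = asyncio.get_running_loop()  # stored for thread-safe shutdown
        self._stop_event = asyncio.Event()
        self._ready.set()
        await self._stop_event.wait()  # sleeps until signalled, no periodic wakeups

    def start(self) -> None:
        self._thread.start()
        self._ready.wait()

    def stop(self) -> None:
        try:
            # Wake the loop from another thread; Event.set is not thread-safe itself.
            self._loop.call_soon_threadsafe(self._stop_event.set)
        except RuntimeError:
            pass  # loop already closed, e.g. stop() called twice
        self._thread.join(timeout=5)


def stale_poll(stop_flag: threading.Event, interval: float = 300.0) -> int:
    """Safety-net loop: one Event.wait() per interval, returns at once on stop."""
    checks = 0
    while not stop_flag.wait(timeout=interval):
        checks += 1  # a stale-connection check would run here
    return checks
```

`threading.Event.wait(timeout)` returns True as soon as the flag is set, which
is what replaces the 240 x 0.5s sleep loop.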
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* test(ble): Add comprehensive unit tests for HCI error fixes
Add 26 new unit tests for the event-driven D-Bus monitoring fixes
that eliminated HCI errors on BCM43xx single-radio chips.
Test coverage:
- TestEventDrivenDBusMonitor: Tests asyncio.Event usage, immediate
wake response, call_soon_threadsafe cross-thread signaling
- TestStalePollImprovements: Tests threading.Event.wait() usage,
300s interval, immediate stop response
- TestStopShutdownBehavior: Tests stop() async signaling, RuntimeError
handling, shutdown latency improvement
- TestIntegrationScenarios: Tests full lifecycle, multiple stop calls
- TestCodeVerification: Verifies actual source patterns match expected
All 26 tests pass without requiring pytest-asyncio plugin.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(tests): Use threading.RLock instead of asyncio.Lock in test fixtures
In Python 3.8/3.9, asyncio.Lock() binds to the current event loop when it is
constructed. When test_hci_error_fixes.py runs first (alphabetically) and uses
asyncio.run(), the event loop is closed after each test. Subsequent test fixtures
that create asyncio.Lock() then fail with "no current event loop" errors.
Since these are mock fixtures that don't need async semantics, use
threading.RLock() instead.
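A sketch of such a fixture helper, with hypothetical attribute names
(`peer_lock`, `peers`). Constructing a threading.RLock needs no event loop, so
test ordering no longer matters.

```python
import threading
from types import SimpleNamespace


def create_mock_interface() -> SimpleNamespace:
    """Hypothetical fixture helper; the lock is created without any event loop."""
    return SimpleNamespace(peer_lock=threading.RLock(), peers={})


mock = create_mock_interface()
with mock.peer_lock:
    with mock.peer_lock:  # RLock is re-entrant, so nested use does not deadlock
        mock.peers["AA:BB:CC:DD:EE:FF"] = object()
```

This only suits mocks whose lock is used with a plain `with` statement; code
under test that does `async with` on the lock would still need an asyncio lock.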
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(tests): Replace all asyncio.Lock() with threading.RLock() in test mocks
asyncio.Lock() needs a current event loop at construction time in Python 3.8/3.9.
When test files using asyncio.run() execute first, the event loop is closed,
causing subsequent test fixtures to fail when creating asyncio.Lock().
Fixed in:
- test_peripheral_disconnect_cleanup.py (mock_gatt_server fixture)
- test_bluez_state_cleanup.py (mock_driver fixture)
- test_ble_peer_interface.py (create_mock_peer_interface helper)
- conftest.py (create_mock_interface helper)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: torlando-tech <torlando-tech@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The _cleanup_stale_interface() method was cleaning up identity_to_address
but not the reverse mapping address_to_identity. This caused:
- Memory leak: stale entries accumulate over time
- Inconsistent state: bidirectional mappings become out of sync
This fix matches the cleanup pattern in _disconnected_callback() which
properly cleans up both directions of the mapping.
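A minimal sketch of cleaning both directions together. The helper name is
illustrative, and the guard on the reverse entry is an added assumption so that
an address already reassigned to another identity is left alone.

```python
from typing import Dict, Optional


def cleanup_stale_mapping(identity: str,
                          identity_to_address: Dict[str, str],
                          address_to_identity: Dict[str, str]) -> Optional[str]:
    """Remove both directions of the identity/address mapping."""
    address = identity_to_address.pop(identity, None)
    # Only drop the reverse entry if it still points at this identity.
    if address is not None and address_to_identity.get(address) == identity:
        del address_to_identity[address]
    return address
```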
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
pip extracts package metadata from the wheel filename, which must follow
PEP 427 format: {package}-{version}-{pythontag}-{abi}-{platform}.whl
The previous filename (dbus_fast-armv6l-$$.whl) caused pip to fail with
"Invalid wheel filename (wrong number of parts)" on 32-bit ARM systems.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
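The wheel filename rule from the fix above can be sanity-checked with a few
lines. This is a simplified part-count check, not pip's actual parser, and the
version and tags in the second filename are made-up examples.

```python
def wheel_name_parts_ok(filename: str) -> bool:
    """Check that a wheel filename has the PEP 427 number of dash-separated parts."""
    if not filename.endswith(".whl"):
        return False
    parts = filename[: -len(".whl")].split("-")
    # name-version-pythontag-abi-platform, plus an optional build tag.
    return len(parts) in (5, 6)


temp_style = wheel_name_parts_ok("dbus_fast-armv6l-12345.whl")
pep427_style = wheel_name_parts_ok("dbus_fast-2.0.0-cp311-cp311-linux_armv6l.whl")
```

`temp_style` is False (only three parts, like the old PID-based name) and
`pep427_style` is True.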
Detect and handle RNS installations via `uv tool install rns`:
- Add uv detection when python3 can import RNS (path contains /uv/tools/)
- Add shebang-based detection when system python3 differs from tool's python
- Install dependencies using `uv pip install --python`
- Handle uv Python path for setcap Bluetooth permissions
This fixes "Could not determine installation mode" errors for users
who install Reticulum with uv instead of pip/pipx.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolved conflicts:
- BLEInterface.py, test_v2_2_mac_sorting.py, CLAUDE.md: Keep release
(HW_MTU fix, pending_mtu, MAC rotation recovery/bypass, corrected tests)
- install.sh, README.md: Keep main (JustWorksRepairing auto-configuration)
- CHANGELOG.md: Consolidated detailed notes from main into [0.2.2] section
- Added FUNDING.yml from main
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>