feat: add zombie connection detection to break symmetric deadlock

When a BLE link degrades, 1-byte keepalives may still get through while larger
data packets fail. Both sides consider the connection "alive" based on
keepalives, but no data can flow. The result is a deadlock: new connections are
rejected as "duplicates" even though the existing connection is non-functional.

This change adds zombie detection by tracking when real data (not keepalives)
was last received. If an existing connection has exchanged only keepalives
for more than 30 seconds (configurable via _zombie_timeout), a new connection
from the same identity is allowed and the zombie connection is disconnected.
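
For illustration only, the rule above can be sketched as a small standalone
helper. This is not the code in this commit: the ZombieTracker class, its
method names, the injectable clock, and the "a 1-byte payload is a keepalive"
test are all assumptions made for the sketch. Only the _last_real_data and
_zombie_timeout names and the 30 second default come from the change itself.

```python
import time


class ZombieTracker:
    """Hypothetical helper; the commit keeps this state on the interface."""

    def __init__(self, zombie_timeout=30.0, clock=time.monotonic):
        self._zombie_timeout = zombie_timeout
        self._clock = clock  # injectable so tests can control time (assumption)
        self._last_real_data = {}  # identity_hash -> time of last real data

    def on_connect(self, identity_hash):
        # Start the timer at connect so a fresh connection is not a zombie.
        self._last_real_data[identity_hash] = self._clock()

    def on_data(self, identity_hash, payload):
        # Assumption: keepalives are exactly one byte and never count as data.
        if len(payload) == 1:
            return
        self._last_real_data[identity_hash] = self._clock()

    def on_detach(self, identity_hash):
        # Drop tracking state once the peer is gone.
        self._last_real_data.pop(identity_hash, None)

    def is_zombie(self, identity_hash):
        last = self._last_real_data.get(identity_hash)
        if last is None:
            return False  # untracked identity: no evidence of a zombie
        return self._clock() - last > self._zombie_timeout
```

In a duplicate-identity check, a True result from is_zombie would mean the
new connection is accepted and the old one is torn down, instead of the new
connection being rejected as a duplicate.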

Changes:
- Add _last_real_data dict to track last real data timestamp per identity
- Add _zombie_timeout (default 30s) for configurable zombie threshold
- Update _check_duplicate_identity with Check 3: zombie detection
- Update _handle_ble_data to track real data activity after keepalive filter
- Initialize tracking in _handle_identity_handshake and _spawn_peer_interface
- Clean up tracking in _process_pending_detaches
- Add comprehensive test suite for zombie detection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
torlando-tech 2026-01-18 12:47:45 -05:00
commit 73be6d93c0
5 changed files with 439 additions and 1 deletions

@@ -90,6 +90,8 @@ def ble_interface(mock_rns, mock_driver):
     interface._identity_cache_ttl = 60
     interface._pending_detach = {}  # identity_hash -> timestamp
     interface._pending_detach_grace_period = 2.0  # seconds
+    interface._last_real_data = {}  # Track last real data activity for zombie detection
+    interface._zombie_timeout = 30.0  # Zombie connection timeout
     # Fragmentation
     interface.fragmenters = {}