fix(ble): Add connection race condition prevention and improve error handling
Implements comprehensive connection state tracking to prevent "Operation already in progress" errors and connection retry storms.

BLE Interface changes:
- Record connection attempts before calling driver.connect()
- Add 5-second rate limiting between attempts to same peer
- Skip connections already in progress via _connecting_peers check
- Downgrade expected race conditions to DEBUG level
- Auto-blacklist MAC addresses on connection failures
- Add diagnostic logging for concurrent connection tracking

BLE Driver changes:
- Add _connecting_peers set to track in-progress connections
- Prevent concurrent connection attempts to same address
- Attach cleanup callbacks to connection Futures
- Add defense-in-depth cleanup in finally blocks
- Detailed logging for connection state debugging

Documentation updates:
- Add deployment workflow documentation to README.md
- Update .github/workflows/README.md with CD workflow details
- Document containerized runner SSH configuration
- Update reference documentation (CLAUDE.md, BLE_PROTOCOL, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
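
The interface-side rate limiting described in the message can be sketched roughly as follows. The interface code itself is not shown in this commit view, so the class name `ConnectionRateLimiter`, the method `should_attempt`, and the injectable `clock` parameter are all hypothetical; only the 5-second interval and the "record before connecting" ordering come from the message.

```python
import time


class ConnectionRateLimiter:
    """Hypothetical helper: allow one connection attempt per peer per interval."""

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self._min_interval = min_interval  # seconds between attempts to one peer
        self._clock = clock                # injectable so tests need not sleep
        self._last_attempt = {}            # address -> time of the last attempt

    def should_attempt(self, address):
        now = self._clock()
        last = self._last_attempt.get(address)
        if last is not None and now - last < self._min_interval:
            return False  # too soon after the previous attempt
        # Record the attempt before the caller invokes driver.connect(),
        # so a second discovery callback arriving meanwhile is rejected.
        self._last_attempt[address] = now
        return True
```

Recording the timestamp before the connection starts, rather than after it completes, is what closes the window in which two discovery callbacks could both pass the check.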
parent 1e4f1f5fb3
commit 12ff03d2fa
7 changed files with 444 additions and 5 deletions
@@ -40,6 +40,18 @@ class DriverState(Enum):
class BLEDriverInterface(ABC):
    """
    Abstract interface for a platform-specific BLE driver.

    Driver implementations should maintain connection state tracking
    to prevent race conditions from concurrent connection attempts:

        self._connecting_peers: set = set()  # addresses with pending connections
        self._connecting_lock: threading.Lock = threading.Lock()

    The connect() method should check this set before initiating a connection,
    and always clean up the set in a finally block to ensure proper state
    management even on connection failures. This prevents "Operation already
    in progress" errors when discovery callbacks trigger multiple simultaneous
    connection attempts to the same peer.
    """

    # --- Callbacks ---

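The pattern the docstring asks for can be sketched as below. This is a minimal illustration, not the driver code from this commit: `SketchDriver` and the `connect_impl` callable are hypothetical, and the connection is made synchronously for brevity, whereas the commit message says the real driver attaches cleanup callbacks to connection Futures.

```python
import threading


class SketchDriver:
    """Hypothetical driver showing the _connecting_peers guard."""

    def __init__(self, connect_impl):
        self._connecting_peers: set = set()  # addresses with pending connections
        self._connecting_lock: threading.Lock = threading.Lock()
        self._connect_impl = connect_impl    # hypothetical: does the real work

    def connect(self, address: str) -> bool:
        # Check and claim the address in one critical section so that two
        # callers cannot both decide that no attempt is in progress.
        with self._connecting_lock:
            if address in self._connecting_peers:
                return False  # an attempt to this peer is already pending
            self._connecting_peers.add(address)
        try:
            self._connect_impl(address)
            return True
        finally:
            # Runs on success and on failure, so a failed attempt never
            # leaves the address stuck in the set.
            with self._connecting_lock:
                self._connecting_peers.discard(address)
```

The lock is held only while the set is inspected or modified, never across the connection itself, so a slow connection to one peer does not block attempts to other peers.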
@@ -256,6 +268,11 @@ This tier tests your actual `BleakDriver` implementation against real hardware.
* **Scanning Test:** Run a script that starts the driver and prints discovered devices. Verify that it finds your other test device.
* **Connection Test:** Write a script to connect to the test device. Verify that the `on_device_connected` callback fires and that `driver.connected_peers` is updated.
* **Data I/O Test:** After connecting, use `driver.send()` to send a simple "hello world" byte string. On the other device, verify that the bytes are received correctly. Test this in both directions.
* **Connection Race Condition Test:** Simulate rapid discovery callbacks for the same peer (e.g., by triggering `on_device_discovered` multiple times in quick succession). Verify that:
    - Only one connection attempt is made (check `driver._connecting_peers` contains only one entry)
    - No "Operation already in progress" errors appear in logs
    - The `_connecting_peers` set is properly cleaned up after connection (success or failure)
    - Subsequent connection attempts are properly rate-limited (5-second minimum interval)
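
One way to exercise the race condition check without hardware is sketched below. `FakeDriver`, its `finish` event, and `simulate_rapid_discovery` are hypothetical names introduced for illustration; a real run of this tier would use the actual `BleakDriver` against a test device. The sketch covers the single-attempt and cleanup checks only, not the rate-limit check.

```python
import threading


class FakeDriver:
    """Hypothetical stand-in for a driver; not the real BleakDriver."""

    def __init__(self):
        self._connecting_peers = set()
        self._connecting_lock = threading.Lock()
        self.attempts = 0
        self.finish = threading.Event()  # lets the test hold a connection open

    def connect(self, address):
        with self._connecting_lock:
            if address in self._connecting_peers:
                return False  # attempt already in progress
            self._connecting_peers.add(address)
        try:
            self.attempts += 1
            self.finish.wait(timeout=5)  # simulate a slow connection
            return True
        finally:
            with self._connecting_lock:
                self._connecting_peers.discard(address)


def simulate_rapid_discovery(driver, address, callbacks=5):
    """Fire several discovery callbacks for one peer and collect the results."""
    results = []
    rejected = threading.Semaphore(0)

    def on_device_discovered():
        accepted = driver.connect(address)
        results.append(accepted)
        if not accepted:
            rejected.release()

    threads = [threading.Thread(target=on_device_discovered) for _ in range(callbacks)]
    for t in threads:
        t.start()
    # Wait until every duplicate has been turned away, then let the winner finish.
    for _ in range(callbacks - 1):
        rejected.acquire(timeout=5)
    driver.finish.set()
    for t in threads:
        t.join()
    return results
```

After `simulate_rapid_discovery(driver, address)` returns, `driver.attempts` should be 1 and `driver._connecting_peers` should be empty. Holding the winning connection open until every duplicate has been rejected makes the outcome independent of thread scheduling.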

### Tier 3: End-to-End Testing (Full Stack)
