docs(spec): fix §10.2 Resource integrity hash — prefix is not r, not hashed
§10.2 step 3 wrongly equated the random-hash prefix prepended to the Resource body with the advertisement's `r` field, and step 5 fed that prefix into the hash/expected_proof input. Upstream RNS uses two distinct get_random_hash()[:4] values: a throwaway prefix the receiver strips and discards, and self.random_hash (the adv `r` field). The integrity hash is SHA256(uncompressed_plaintext || r) over the prefix-stripped, decompressed body, exactly as §10.8 already stated.

- §10.2 steps 3 & 5 corrected to agree with §10.8
- §10.8: renamed misleading plaintext_with_random / data_with_random
- §10.12: wire-layering block rewritten to match
- README: errata entry under Spec corrections

Verified against RNS 1.2.5 (Resource.py:332,405,412,440-443,682-694,755).

Resolves #9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 3eea25977a
commit 1b955d19a9
2 changed files with 18 additions and 11 deletions
@@ -31,6 +31,9 @@ As content grows, `SPEC.md` will be split into per-layer files (packet header, i
 
 Errata that may invalidate code built against an earlier revision of `SPEC.md`. Newest first. Feature additions and ordinary edits live in `git log` — this section is reserved for cases where the spec said one thing, that turned out to be wrong, and an implementer who pulled the bad version needs to fix their code.
 
+- **2026-05-17 — §10.2 Resource integrity hash: the 4-byte prefix is NOT `r`, and is NOT in the hash input.**
+Bad text introduced in [`95823ad`](../../commit/95823ad); on master from 2026-05-03 to 2026-05-17. §10.2 step 3 wrongly equated the random-hash *prefix* prepended to the Resource body with the advertisement's `r` field, and step 5 wrongly fed that prefix into `hash`/`expected_proof` (claiming `hash = SHA256(random_hash || body || random_hash)`). Upstream `RNS/Resource.py` (1.2.4) uses *two distinct* `get_random_hash()[:4]` values: a throwaway prefix the receiver strips and discards (`:405`/`412`, `:682`), and `self.random_hash` — the advertisement's `r` field (`:440`, `:1285`). The integrity hash is `SHA256(uncompressed_plaintext || r)` over the prefix-stripped, decompressed body (`:441`, `:694`) — exactly as §10.8 already stated. An implementer who trusted §10.2 step 5 computes a hash no spec-compliant peer accepts; every Resource is rejected as `CORRUPT`. §10.2 corrected to agree with §10.8; §10.12's wire-layering block fixed to match. Surfaced by [issue #9](../../issues/9).
+
 - **2026-05-06 — §2.1 flag byte: bit 7 is the IFAC flag, not part of `header_type`.**
 Bad text introduced in [`8c4d550`](../../commit/8c4d550), corrected in [`0c2021e`](../../commit/0c2021e); on master from 2026-05-04 to 2026-05-06. The corrected layout is `ifac_flag(bit 7) | header_type(bit 6) | context_flag(5) | transport_type(4) | destination_type(3-2) | packet_type(1-0)`, matching the official manual §4.6.3 and upstream `RNS/Packet.py:246` (parse mask `0b01000000 >> 6`) / `RNS/Transport.py:1003` (IFAC setter `raw[0] | 0x80`). Implementers who consumed the bad version will mis-parse every IFAC-protected packet as `header_type ∈ {2, 3}` and drop it. Surfaced by [issue #4](../../issues/4) item #1.
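
To make the corrected §2.1 layout concrete, here is a minimal Python sketch of a flag-byte parser. The names `Flags` and `parse_flag_byte` are hypothetical and chosen for illustration only; they are not upstream RNS API. Only the bit positions are taken from the errata entry above.

```python
from typing import NamedTuple


class Flags(NamedTuple):
    """Fields of the first header byte (hypothetical container type)."""
    ifac_flag: int
    header_type: int
    context_flag: int
    transport_type: int
    destination_type: int
    packet_type: int


def parse_flag_byte(flags: int) -> Flags:
    """Split the flag byte using the corrected layout from the errata."""
    return Flags(
        ifac_flag=(flags & 0b10000000) >> 7,         # bit 7
        header_type=(flags & 0b01000000) >> 6,       # bit 6 only
        context_flag=(flags & 0b00100000) >> 5,      # bit 5
        transport_type=(flags & 0b00010000) >> 4,    # bit 4
        destination_type=(flags & 0b00001100) >> 2,  # bits 3-2
        packet_type=flags & 0b00000011,              # bits 1-0
    )
```

Under this layout `0x80` parses as `ifac_flag=1, header_type=0`. The bad two-bit reading `(0x80 >> 6) & 0b11` yields 2 instead, which is the mis-parse the entry warns about.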

26 SPEC.md
@@ -2267,13 +2267,15 @@ Given input data and an `RNS.Link` in `ACTIVE` state (`RNS/Resource.py:248-478`)
 
 1. **Optional metadata prefix.** If the caller supplied a `metadata` dict, msgpack-pack it and prepend `length(3 bytes, big-endian uint24) || packed_metadata` to the body. The `has_metadata` (`x`) flag in the advertisement signals this. Receivers strip the prefix during reassembly (line 699-707).
 2. **Optional bz2 compression.** If `auto_compress` is true and the data fits within `auto_compress_limit` (default 64 MiB), the body is bz2-compressed and the `compressed` (`c`) flag is set. If compression doesn't shrink the data, the uncompressed form is sent and `c` is cleared.
-3. **Random hash prefix.** A 4-byte (`Resource.RANDOM_HASH_SIZE`) random hash is prepended to the (compressed-or-not) body. This is the `r` field in the advertisement and is part of the input to `hash` and `expected_proof`.
+3. **Random hash prefix.** A 4-byte (`Resource.RANDOM_HASH_SIZE`) random hash is prepended to the (compressed-or-not) body — `Resource.py:405`/`412`, a fresh `RNS.Identity.get_random_hash()[:4]` call. This prefix is **not** the `r` field, and is **not** part of the `hash` / `expected_proof` input. It is a separate throwaway value that travels inside the encrypted blob; the receiver strips and discards it (§10.8 step 3). The advertisement's `r` field carries a *different* value — `self.random_hash`, generated by its own `get_random_hash()[:4]` call at `Resource.py:440` — which is the actual integrity-hash and hashmap salt.
 4. **Link encryption.** The full `random_hash || (compressed?) data` blob is encrypted using `link.encrypt(...)` — i.e. the link-derived Token form (§3.1), no ephemeral_pub prefix. The `encrypted` (`e`) flag is set.
-5. **Hash and proof material.**
-- `data_with_random = random_hash || (compressed?) plaintext`
-- `hash = SHA256(data_with_random || random_hash)` (32 bytes)
+5. **Hash and proof material** (`Resource.py:440-443`). All three are computed over the **original uncompressed `plaintext`** — the caller's input, including any metadata prefix from step 1 (`Resource.py:332`) — *not* the compressed body, and *not* the random-prefixed wire blob from step 3:
+- `random_hash = RNS.Identity.get_random_hash()[:4]` — the value the advertisement's `r` field carries.
+- `hash = SHA256(plaintext || random_hash)` (32 bytes)
 - `truncated_hash = hash[:16]`
-- `expected_proof = SHA256(data_with_random || hash)` (32 bytes) — what the receiver will eventually return in the RESOURCE_PRF packet.
+- `expected_proof = SHA256(plaintext || hash)` (32 bytes) — what the receiver will eventually return in the RESOURCE_PRF packet.
+
+The 4-byte prefix from step 3 is **not** in any of these inputs. The receiver strips the prefix and bz2-decompresses *before* hashing (§10.8 steps 3-5), so the sender must hash the uncompressed, unprefixed `plaintext` for the two sides to agree. A receiver that includes the prefix, or hashes the compressed form, rejects every legitimate Resource as `CORRUPT`.
 6. **Part split.** The encrypted body is sliced into parts of size `SDU = link.mtu - HEADER_MAXSIZE - IFAC_MIN_SIZE`. Each part becomes a packed `RNS.Packet(link, part_data, context=RESOURCE)`; the packed wire bytes are stored in `parts[i]` for later sending.
 7. **Hashmap.** Each part is fingerprinted to `MAPHASH_LEN = 4 bytes`. The full hashmap is `b"".join(map_hashes)`. **Hash collisions within the COLLISION_GUARD_SIZE = 2 × WINDOW_MAX + HASHMAP_MAX_LEN window are detected at construction time** — if two parts hash to the same 4-byte map_hash within that window, the random hash is regenerated and the whole hashmap is recomputed. Without this guard, the receiver can't disambiguate which part it just received from a part-request that named a colliding map_hash.
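
Steps 2, 3 and 5 of the corrected text can be sketched in Python as follows. This is an illustration under stated assumptions, not the upstream implementation. `prepare_resource` and its return keys are hypothetical names, `os.urandom` stands in for `RNS.Identity.get_random_hash()[:4]`, and the `auto_compress_limit` check and link encryption are left out.

```python
import bz2
import hashlib
import os

RANDOM_HASH_SIZE = 4  # Resource.RANDOM_HASH_SIZE, per step 3


def prepare_resource(plaintext: bytes, auto_compress: bool = True) -> dict:
    """Sketch of steps 2, 3 and 5; `plaintext` already holds any metadata prefix."""
    # Step 2: use the bz2 form only when it is actually smaller.
    body, compressed = plaintext, False
    if auto_compress:
        candidate = bz2.compress(plaintext)
        if len(candidate) < len(plaintext):
            body, compressed = candidate, True

    # Step 3: throwaway prefix (os.urandom is an assumed stand-in for
    # RNS.Identity.get_random_hash()[:4]). It goes only into the wire blob.
    prefix = os.urandom(RANDOM_HASH_SIZE)
    wire_blob = prefix + body  # this is what step 4 would encrypt

    # Step 5: a second, independent value; this one is the advertisement's `r`.
    random_hash = os.urandom(RANDOM_HASH_SIZE)
    resource_hash = hashlib.sha256(plaintext + random_hash).digest()
    return {
        "wire_blob": wire_blob,
        "compressed": compressed,
        "r": random_hash,
        "hash": resource_hash,
        "truncated_hash": resource_hash[:16],
        "expected_proof": hashlib.sha256(plaintext + resource_hash).digest(),
    }
```

The point of the sketch is that `prefix` never reaches either SHA-256 call, and both digests take the uncompressed `plaintext` rather than `body`.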

@@ -2445,7 +2447,7 @@ When the receiver has assembled the full resource (`received_count == total_part
 2. `link.decrypt(...)` to plaintext.
 3. Strip the 4-byte `random_hash` prefix — **discard, do NOT compare to advertisement.r** (see callout below).
 4. If `compressed`: bz2-decompress.
-5. Recompute `SHA256(plaintext_with_random || random_hash)` and compare to `h`.
+5. Recompute `SHA256(plaintext || random_hash)` — over the prefix-stripped, decompressed body — and compare to `h`.
 6. If match: peel off metadata if `x` is set, write `data` to the destination; status = `COMPLETE`.
 7. If mismatch: status = `CORRUPT`; cancel.
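
The receiver side of steps 3 to 6 might look like the sketch below. `verify_resource` and `split_metadata` are hypothetical helper names, the input is assumed to be the already reassembled and link-decrypted blob, and returning `None` stands in for setting status to `CORRUPT`. Unpacking the metadata with msgpack is not shown.

```python
import bz2
import hashlib
from typing import Optional, Tuple

RANDOM_HASH_SIZE = 4  # length of the throwaway prefix


def verify_resource(decrypted: bytes, r: bytes, h: bytes,
                    compressed: bool) -> Optional[bytes]:
    """Return the verified plaintext, or None where the spec says CORRUPT."""
    # Step 3: strip the prefix and discard it; it is never compared to `r`.
    body = decrypted[RANDOM_HASH_SIZE:]
    # Step 4: undo bz2 only when the advertisement's `c` flag was set.
    plaintext = bz2.decompress(body) if compressed else body
    # Step 5: hash the stripped, decompressed body salted with the adv `r`.
    if hashlib.sha256(plaintext + r).digest() != h:
        return None
    return plaintext


def split_metadata(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Step 6 when `x` is set: peel `uint24 length || packed_metadata`."""
    size = int.from_bytes(plaintext[:3], "big")
    return plaintext[3:3 + size], plaintext[3 + size:]
```

A malformed bz2 stream makes `bz2.decompress` raise; a fuller implementation would presumably treat that as corruption too, which this sketch leaves to the caller.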

@@ -2459,7 +2461,7 @@ When the receiver has assembled the full resource (`received_count == total_part
 > formula `SHA256(data || r)`). A receiver that does
 > `assert prefix == advertisement.r` will reject every legitimate
 > Resource as corrupt. Just strip and discard. Integrity is proven
-> exclusively by step 5's `SHA256(plaintext_with_random || random_hash)`
+> exclusively by step 5's `SHA256(plaintext || random_hash)`
 > against `h` — that's the only check that matters; the prefix
 > bytes are scaffolding.

@@ -2467,7 +2469,7 @@ On `COMPLETE`, the receiver emits the proof:
 
 ```
 proof_data = resource_hash(32) || full_proof(32)
-where full_proof = SHA256(data_with_random || resource_hash)
+where full_proof = SHA256(plaintext || resource_hash)
 ```
 
 sent as `RNS.Packet(link, proof_data, packet_type=PROOF, context=RESOURCE_PRF)` (`Resource.py:755-766`). The `full_proof` is exactly what the initiator pre-computed as `expected_proof` in §10.2 step 5 — it can validate the proof bytewise without re-running the SHA-256.
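
As a rough illustration of the proof exchange, the sketch below builds `proof_data` on the receiver and checks it on the initiator. `build_proof_data` and `proof_is_valid` are hypothetical names, and the constant-time comparison via `hmac.compare_digest` is a choice made for this sketch, not something the spec text requires.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 digest size in bytes


def build_proof_data(plaintext: bytes, resource_hash: bytes) -> bytes:
    """Receiver side: resource_hash(32) || full_proof(32)."""
    full_proof = hashlib.sha256(plaintext + resource_hash).digest()
    return resource_hash + full_proof


def proof_is_valid(proof_data: bytes, expected_proof: bytes) -> bool:
    """Initiator side: compare the trailing 32 bytes with the stored value."""
    if len(proof_data) != 2 * HASH_LEN:
        return False
    # Bytewise comparison only; no SHA-256 is recomputed here.
    return hmac.compare_digest(proof_data[HASH_LEN:], expected_proof)
```

Because `expected_proof` was stored when the Resource was built, the initiator does not need to keep the plaintext around to validate the proof.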

@@ -2511,10 +2513,12 @@ The 3-byte big-endian uint24 metadata length encoding (§10.2 step 1) is what li
 Encryption layering is **outermost** — the wire bytes look like:
 
 ```
-plaintext = data_with_random || random_hash # SHA-256 input
-data_with_random = random_hash(4) || maybe_compressed_body
+wire_blob = prefix(4) || maybe_compressed # the body that gets encrypted
+prefix = fresh get_random_hash()[:4] # NOT `r`; receiver strips & discards
 maybe_compressed = compressed_body iff `c` flag, else uncompressed
-parts[i] = link.encrypt( data_with_random[i*SDU : (i+1)*SDU] )
+parts[i] = link.encrypt(wire_blob)[i*SDU : (i+1)*SDU] # encrypt whole, then slice
+
+hash = SHA256(uncompressed_body || random_hash) # integrity; random_hash = adv `r`
 ```
 
 Critically, **the link encryption is applied to the WHOLE concatenated data first, then sliced into parts** — not to each part individually. This means part boundaries don't align with cipher block boundaries; a missing part can't be decrypted in isolation. The receiver must accumulate all parts before calling `link.decrypt()` (`Resource.py:676-679`).
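
The encrypt-whole-then-slice ordering can be sketched as below. The `encrypt` and `decrypt` callables are placeholders for `link.encrypt` and `link.decrypt`, `split_parts` and `reassemble` are hypothetical names, and `sdu` is passed in rather than derived from the link MTU.

```python
from typing import Callable, List


def split_parts(wire_blob: bytes, encrypt: Callable[[bytes], bytes],
                sdu: int) -> List[bytes]:
    """Encrypt the whole blob with one call, then slice the ciphertext."""
    encrypted = encrypt(wire_blob)
    return [encrypted[i:i + sdu] for i in range(0, len(encrypted), sdu)]


def reassemble(parts: List[bytes], decrypt: Callable[[bytes], bytes]) -> bytes:
    """Join every part first; decrypt exactly once over the full ciphertext."""
    return decrypt(b"".join(parts))
```

Any whole-message transform works as a stand-in when experimenting. For instance, passing `lambda b: b[::-1]` for both callables round-trips the blob through `reassemble`, while applying it to a single part does not recover the corresponding slice of the input.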