From 88960829f1d134e918052513051ccfd3190e45cc Mon Sep 17 00:00:00 2001
From: fiatjaf
Date: Tue, 29 Oct 2024 11:30:33 -0300
Subject: [PATCH] nip45: add hyperloglog relay response.

---
 45.md | 55 ++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 9 deletions(-)

diff --git a/45.md b/45.md
index 14d656ec..842d3402 100644
--- a/45.md
+++ b/45.md
@@ -29,29 +29,66 @@ In case a relay uses probabilistic counts, it MAY indicate it in the response wi
 
 Whenever the relay decides to refuse to fulfill the `COUNT` request, it MUST return a `CLOSED` message.
 
+## HyperLogLog
+
+Relays may return a hex-encoded HyperLogLog value together with the count.
+
+```
+["COUNT", , {"count": , "hll": ""}]
+```
+
+This enables merging results from multiple relays and yields a reasonable estimate of reaction counts, comment counts and follower counts, while saving many millions of bytes of bandwidth for everybody.
+
+### Algorithm
+
+The HLL value must be calculated with a precision of `8`, i.e. with 256 registers.
+
+To compute HLL values, first initialize the 256 registers to `0` each; then, for every event to be counted,
+
+ 1. take byte `16` of the `id` and use it to determine the register index;
+ 2. count the number of leading zero bits in the following bytes `17..24` of the `id`;
+ 3. if the number of leading zeros is greater than what was previously stored in that register, overwrite it.
+
+That is all that has to be done on the relay side, and therefore the only part needed for interoperability.
+
+On the client side, the HLL values received from different relays can be merged by going through the registers of each relay's HLL value and picking the highest value for each register, regardless of the relay.
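The relay-side register update and the client-side merge described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function names (`leading_zero_bits`, `add_event`, `merge`) are hypothetical, byte positions are assumed to be zero-based, and the range `17..24` is assumed to be inclusive (8 bytes).

```python
def leading_zero_bits(data: bytes) -> int:
    """Count the leading zero bits of a byte string."""
    count = 0
    for byte in data:
        if byte == 0:
            count += 8
            continue
        # bit_length() is the position of the highest set bit (1..8)
        count += 8 - byte.bit_length()
        break
    return count


def add_event(registers: bytearray, event_id_hex: str) -> None:
    """Relay side: update the 256 registers with one event id (hex)."""
    event_id = bytes.fromhex(event_id_hex)
    index = event_id[16]                        # assumption: zero-based byte 16
    zeros = leading_zero_bits(event_id[17:25])  # assumption: bytes 17..24 inclusive
    if zeros > registers[index]:
        registers[index] = zeros


def merge(*hlls: bytes) -> bytearray:
    """Client side: keep the highest value seen for each register."""
    merged = bytearray(256)
    for hll in hlls:
        for i, value in enumerate(hll):
            if value > merged[i]:
                merged[i] = value
    return merged


# Usage: a relay starts from 256 zeroed registers and feeds in event ids.
registers = bytearray(256)
# Made-up id: byte 16 is 0x05, bytes 17..24 start with 0x00 0x10.
add_event(registers, "00" * 16 + "05" + "00" + "10" + "00" * 13)
```

A `bytearray` is used here because each register fits in a single byte, so a client can pass register sets from several relays straight to `merge`.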
+
+Finally, the absolute count can be estimated from the merged registers. The estimation methods are not described here; it is better to check the source code of an implementation. There can also be different ways of performing the estimation, with different quirks applied on top of the raw registers.
+
+### `hll` encoding
+
+The `hll` value must be the concatenation of the 256 registers, each being a uint8 value (i.e. a byte). Therefore `hll` will be a 512-character hex string.
+
 ## Examples
 
-### Followers count
-
-```
-["COUNT", , {"kinds": [3], "#p": []}]
-["COUNT", , {"count": 238}]
-```
-
-### Count posts and reactions
+### Count notes and reactions
 
 ```
 ["COUNT", , {"kinds": [1, 7], "authors": []}]
 ["COUNT", , {"count": 5}]
 ```
 
-### Count posts approximately
+### Count notes approximately
 
 ```
 ["COUNT", , {"kinds": [1]}]
 ["COUNT", , {"count": 93412452, "approximate": true}]
 ```
 
+### Followers count with HyperLogLog
+
+```
+["COUNT", , {"kinds": [3], "#p": []}]
+["COUNT", , {"count": 16578, "hll": "0607070505060806050508060707070706090d080b0605090607070b07090606060b0705070709050807080805080407060906080707080507070805060509040a0b06060704060405070706080607050907070b08060808080b080607090a06060805060604070908050607060805050d05060906090809080807050e0705070507060907060606070708080b0807070708080706060609080705060604060409070a0808050a0506050b0810060a0908070709080b0a07050806060508060607080606080707050806080c0a0707070a080808050608080f070506070706070a0908090c080708080806090508060606090906060d07050708080405070708"}]
+```
+
+### Reaction counts with HyperLogLog
+
+```
+["COUNT", , {"kinds": [7], "#e": []}]
+["COUNT", , {"count": 2044, "hll": "01ef070505060806050508060707070706090d080b0605090607070b07090606060b0705070709050807080805080407060906080707080507070805060509040a0b06060704060405070706080607050907070b08060808080b080607090a06060805060604070908050607060805050d05060906090809080807050e0705070507060907060606070708080b0807070708080706060609080705060604060409070a0808050a0506050b0810060a0908070709080b0a07050806060508060607080606080707050806080c0a0707070a080808050608080f070506070706070a0908090c080708080806090508060606090906060d07050708080405070708"}]
+```
+
 ### Relay refuses to count
 
 ```