Implemented comprehensive database query tracking to identify clients causing
high CPU usage through excessive database queries. The relay now tracks and
displays query statistics per WebSocket connection in the admin UI.
Features Added:
- Track db_queries_executed and db_rows_returned per connection
- Calculate query rate (queries/minute) and row rate (rows/minute)
- Display stats in admin UI grouped by IP address and WebSocket
- Show: IP, Subscriptions, Queries, Rows, Query Rate, Duration
Implementation:
- Added tracking fields to per_session_data structure
- Increment counters in handle_req_message() and handle_count_message()
- Extract stats from pss in query_subscription_details()
- Updated admin UI to display IP address and query metrics
Use Case:
Admins can now identify abusive clients by monitoring:
- High query rates (>50 queries/min indicates polling abuse)
- High row counts (>10K rows/min indicates broad filter abuse)
- Query patterns (high queries + low rows = targeted, high both = crawler)
This enables informed decisions about which IPs to blacklist based on
actual resource consumption rather than just connection count.
The kind index was adding subscriptions multiple times when filters contained
duplicate kinds (e.g., 'kinds': [1, 1, 1] or multiple filters with same kind).
This caused:
- Redundant malloc/free operations during add/remove
- Multiple index entries for same subscription+kind pair
- Excessive TRACE logging (7+ removals for single subscription)
- Wasted CPU cycles on duplicate operations
Fix:
- Added bitmap-based deduplication in add_subscription_to_kind_index()
- Uses 8KB bitmap (65536 bits) to track which kinds already added
- Prevents adding same subscription to same kind index multiple times
- Reduces index operations by 3-10x for subscriptions with duplicate kinds
Performance Impact:
- Eliminates redundant malloc/free cycles
- Reduces lock contention on kind index operations
- Decreases log volume significantly
- Should reduce CPU usage by 20-40% under production load
The kind index optimization in v1.1.4 introduced a critical bug that caused
segmentation faults in production. The bug was in add_subscription_to_kind_index()
which directly assigned sub->next for no-kind-filter subscriptions, corrupting
the main active_subscriptions linked list.
Root Cause:
- subscription_t has only ONE 'next' pointer used by active_subscriptions list
- Code tried to reuse 'next' for no_kind_filter_subs list
- This overwrote the active_subscriptions linkage, breaking list traversal
- Result: segfaults when iterating subscriptions
Fix:
- Added no_kind_filter_node_t wrapper structure (like kind_subscription_node_t)
- Changed no_kind_filter_subs from subscription_t* to no_kind_filter_node_t*
- Updated add/remove functions to use wrapper nodes
- Updated broadcast function to iterate through wrapper nodes
This follows the same pattern already used for kind_index entries and
prevents any corruption of the subscription structure's next pointer.
The print_version() function was displaying a hardcoded 'v1.0.0' string instead
of using the VERSION define from main.h. This caused version mismatches where
the git tag and main.h showed v1.1.1 but the binary reported v1.0.0.
Now print_version() uses the VERSION macro, ensuring all version displays are
consistent and automatically updated when increment_and_push.sh updates main.h.
Previously, send_notice_message() called queue_message() with NULL pss, causing
all NOTICE messages to fail silently. This affected filter validation errors
(e.g., invalid kinds > 65535 per NIP-01) where clients received no response.
Changes:
- Updated send_notice_message() signature to accept struct per_session_data* pss
- Updated 37 call sites across websockets.c (31) and nip042.c (6)
- Updated forward declarations in main.c, websockets.c, and nip042.c
- Added tests/invalid_kind_test.sh to verify NOTICE responses for invalid filters
Fixes issue where REQ with kinds:[99999] received no response instead of NOTICE.