# Database Query Performance Analysis Report **Analysis Date:** 2026-02-02 **Log Duration:** ~6 minutes (15:24:50 - 15:30:58) **Total Queries:** 366 queries **Data Source:** serverlog.txt --- ## Executive Summary The relay is experiencing moderate performance issues with an average query time of **10.4ms** and a maximum query time of **672ms**. The primary bottlenecks are: 1. **Tag-based searches using `json_each()`** - 53% of all queries (194/366) 2. **Monitoring system queries** - Taking 540-550ms each 3. **Multiple pubkey lookups** - Kind 10002 queries with 15-50 pubkeys --- ## Query Performance Metrics ### Overall Statistics - **Total Queries:** 366 - **Average Query Time:** 10,440 μs (10.4 ms) - **Minimum Query Time:** 14 μs - **Maximum Query Time:** 672,846 μs (672.8 ms) - **Slow Queries (>10ms):** 8 queries (2.2%) ### Query Type Breakdown | Type | Count | Percentage | |------|-------|------------| | REQ | 359 | 98.1% | | MONITOR | 7 | 1.9% | --- ## Critical Performance Issues ### 1. **SLOWEST QUERY: 672ms Tag Search (IP: 192.42.116.178)** ```sql SELECT id, pubkey, created_at, kind, content, sig, tags FROM events WHERE 1=1 AND (kind < 20000 OR kind >= 30000) AND kind IN (5,6300,7000,2004,1622) AND EXISTS ( SELECT 1 FROM json_each(json(tags)) WHERE json_extract(value, '$[0]') = ? AND json_extract(value, '$[1]') IN (?) ) ORDER BY created_at DESC LIMIT 100 ``` **Problem:** Full table scan with JSON parsing for every row **Impact:** 672ms for 0 results (wasted computation) **Root Cause:** No index on tag values, requires scanning all events --- ### 2. **Monitoring System Queries: 540-550ms Each** ```sql SELECT * FROM active_subscriptions_log ORDER BY created_at DESC ``` **Occurrences:** 4 queries in 6 minutes **Average Time:** 545ms **Rows Returned:** 20-52 rows **Problem:** Extremely slow for small result sets **Root Cause:** Likely missing index on `created_at` column --- ### 3. **Tag-Based Searches (json_each) - 53% of All Queries** - **Total:** 194 queries (53% of all queries) - **Pattern:** `EXISTS (SELECT 1 FROM json_each(json(tags)) WHERE ...)` - **Most Common:** Kind 1984 (105 queries), Kind 1111 (47 queries) - **Problem:** Every tag search requires full JSON parsing **Example Pattern:** ```sql WHERE kind IN (1984) AND pubkey IN (?) AND EXISTS ( SELECT 1 FROM json_each(json(tags)) WHERE json_extract(value, '$[0]') = ? AND json_extract(value, '$[1]') IN (?,?,?,?,?,?,?,?,?,?,?) ) ``` --- ### 4. **Multiple Pubkey Lookups (Kind 10002)** - **Total:** 64 queries for kind 10002 - **Average Time:** 2,500-3,300 μs (2.5-3.3ms) - **Pattern:** Queries with 15-50 pubkeys in `IN` clause - **Problem:** Large `IN` clauses without proper indexing **Example:** ```sql WHERE kind IN (10002) AND pubkey IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) ``` --- ## Client Activity Analysis ### Top Query Sources (by IP) | IP Address | Query Count | Percentage | Notes | |------------|-------------|------------|-------| | 45.84.107.222 | 101 | 27.6% | **Highest activity** | | 23.234.109.54 | 69 | 18.9% | Second highest | | 185.220.101.38 | 56 | 15.3% | Third highest | | 192.42.116.178 | 51 | 13.9% | **Source of 672ms query** | | 149.22.80.85 | 34 | 9.3% | | | 174.138.53.241 | 24 | 6.6% | | | Others | 31 | 8.5% | 6 other IPs | **Observation:** Top 3 IPs account for 61.8% of all queries --- ## Most Common Query Patterns | Kind Filter | Query Count | Percentage | |-------------|-------------|------------| | kind IN (1984) | 105 | 28.7% | | kind IN (10002) | 64 | 17.5% | | kind IN (1111) | 47 | 12.8% | | kind IN (0,2,3,10002) | 24 | 6.6% | | kind IN (9735) | 23 | 6.3% | | kind IN (0,30315,10002,10050) | 20 | 5.5% | | Others | 83 | 22.7% | --- ## Optimization Recommendations ### Priority 1: Critical (Immediate Action Required) #### 1.1 Add Index on `active_subscriptions_log.created_at` **Impact:** Will reduce monitoring queries from 540ms to <10ms **Effort:** Low **SQL:** ```sql CREATE INDEX IF NOT EXISTS idx_active_subscriptions_created_at ON active_subscriptions_log(created_at DESC); ``` #### 1.2 Implement Tag Indexing System **Impact:** Will reduce tag searches from 100-600ms to <10ms **Effort:** High **Options:** - **Option A:** Create separate `event_tags` table with indexes ```sql CREATE TABLE event_tags ( event_id TEXT NOT NULL, tag_name TEXT NOT NULL, tag_value TEXT NOT NULL, FOREIGN KEY (event_id) REFERENCES events(id) ); CREATE INDEX idx_event_tags_lookup ON event_tags(tag_name, tag_value); ``` - **Option B:** Use SQLite JSON1 extension with generated columns (if available) - **Option C:** Implement application-level tag caching **Recommended:** Option A (most reliable and performant) --- ### Priority 2: High (Implement Within Week) #### 2.1 Optimize Multiple Pubkey Queries **Current:** `pubkey IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)` **Problem:** Large IN clauses are inefficient **Solution:** - Add composite index: `CREATE INDEX idx_events_kind_pubkey ON events(kind, pubkey, created_at DESC);` - Consider query rewriting for >10 pubkeys #### 2.2 Add Query Result Caching **Target Queries:** - Kind 0 (profile) lookups - frequently repeated - Kind 10002 (relay lists) - relatively static - Kind 1984 (reports) - can be cached for 30-60 seconds **Implementation:** - Use in-memory LRU cache (e.g., 1000 entries) - Cache key: hash of SQL + parameters - TTL: 30-60 seconds for most queries --- ### Priority 3: Medium (Optimize Over Time) #### 3.1 Disable or Throttle Monitoring Queries **Current:** Monitoring queries run every ~60 seconds **Impact:** Each query takes 540ms **Options:** - Increase throttle to 300 seconds (5 minutes) - Disable monitoring entirely if not actively used - Optimize `active_subscriptions_log` table structure #### 3.2 Implement Query Complexity Limits **Problem:** Some queries scan entire table (e.g., 672ms query returned 0 results) **Solution:** - Reject queries without time bounds (require `since` or `until`) - Limit number of kinds in single query (max 10) - Limit number of pubkeys in single query (max 20) #### 3.3 Add Composite Indexes for Common Patterns ```sql -- For kind + created_at queries (most common pattern) CREATE INDEX idx_events_kind_created ON events(kind, created_at DESC); -- For kind + pubkey + created_at queries CREATE INDEX idx_events_kind_pubkey_created ON events(kind, pubkey, created_at DESC); ``` --- ## Database Schema Recommendations ### Current Issues 1. **No tag indexing** - Forces full table scans with JSON parsing 2. **Missing created_at indexes** - Monitoring queries are extremely slow 3. **No composite indexes** - Multi-condition queries inefficient ### Recommended Schema Changes ```sql -- 1. Add tag indexing table CREATE TABLE IF NOT EXISTS event_tags ( event_id TEXT NOT NULL, tag_name TEXT NOT NULL, tag_value TEXT NOT NULL, FOREIGN KEY (event_id) REFERENCES events(id) ON DELETE CASCADE ); CREATE INDEX idx_event_tags_lookup ON event_tags(tag_name, tag_value); CREATE INDEX idx_event_tags_event_id ON event_tags(event_id); -- 2. Add monitoring table index CREATE INDEX IF NOT EXISTS idx_active_subscriptions_created_at ON active_subscriptions_log(created_at DESC); -- 3. Add composite indexes for common query patterns CREATE INDEX IF NOT EXISTS idx_events_kind_created ON events(kind, created_at DESC); CREATE INDEX IF NOT EXISTS idx_events_kind_pubkey_created ON events(kind, pubkey, created_at DESC); -- 4. Add index for pubkey lookups CREATE INDEX IF NOT EXISTS idx_events_pubkey_created ON events(pubkey, created_at DESC); ``` --- ## Expected Performance Improvements | Optimization | Current Avg | Expected Avg | Improvement | |--------------|-------------|--------------|-------------| | Tag searches (with event_tags table) | 100-600ms | 5-20ms | **95-97%** | | Monitoring queries (with index) | 540ms | 5-10ms | **98%** | | Multiple pubkey queries (with index) | 2.5-3.3ms | 0.5-1ms | **70-80%** | | Overall average query time | 10.4ms | 2-3ms | **70-80%** | --- ## Client Behavior Analysis ### Potentially Abusive Patterns #### IP: 192.42.116.178 (51 queries) - **Issue:** Generated the slowest query (672ms) - **Pattern:** Complex tag searches with multiple kinds - **Recommendation:** Monitor for repeated expensive queries #### IP: 45.84.107.222 (101 queries - 27.6% of total) - **Issue:** Highest query volume - **Pattern:** Mix of kind 10002, 1984, and various other kinds - **Recommendation:** Acceptable if queries are efficient; monitor for abuse ### Normal Behavior - Most queries are <1ms (fast) - Majority return 0-10 rows (reasonable) - Query patterns match typical Nostr client behavior --- ## Action Plan ### Immediate (Today) 1. ✅ Add index on `active_subscriptions_log.created_at` 2. ✅ Increase monitoring throttle from 60s to 300s (or disable) 3. ✅ Monitor IP 192.42.116.178 for repeated expensive queries ### This Week 1. ⏳ Design and implement `event_tags` table 2. ⏳ Add composite indexes for common query patterns 3. ⏳ Implement query complexity limits (require time bounds) ### This Month 1. ⏳ Implement query result caching (LRU cache) 2. ⏳ Add query cost estimation and rejection 3. ⏳ Optimize subscription matching algorithm --- ## Monitoring Recommendations ### Key Metrics to Track 1. **Average query time** - Target: <5ms 2. **P95 query time** - Target: <50ms 3. **P99 query time** - Target: <100ms 4. **Queries >100ms** - Target: <1% of queries 5. **Tag search percentage** - Target: <30% after optimization ### Alert Thresholds - **Critical:** Average query time >20ms for 5 minutes - **Warning:** Any single query >1000ms (1 second) - **Info:** Client making >100 queries/minute --- ## Conclusion The relay is experiencing performance issues primarily due to: 1. **Lack of tag indexing** (53% of queries affected) 2. **Missing indexes on monitoring tables** (540ms queries) 3. **Inefficient multiple pubkey lookups** Implementing the recommended optimizations will reduce average query time from **10.4ms to 2-3ms** (70-80% improvement) and eliminate the 500-600ms slow queries entirely. **Priority:** Implement tag indexing system and add missing indexes within 1 week to prevent performance degradation as database grows.