Log shipping ingestion

We have two ways of getting WAF log data from Sucuri:

  1. Push (primary): log shipping — Sucuri POSTs batches of log entries to our endpoint as events happen.
  2. Pull (fallback): audit_trails API — we ask Sucuri for entries by date. See audit-trails.

Both feeds are blocked-requests only — see waf-metrics-blocked-only.

Endpoint

POST /api/v1/logs (app/controllers/api/v1/logs_controller.rb)

| Aspect | Value |
|---|---|
| Method | POST |
| Content-Type | application/json |
| Max payload | 50 MB |
| Auth | privateKey (or api_key) parameter, checked against ENV['WAF_LOG_INGESTION_API_KEY']; falls back to any valid App.api_key |
| CSRF | Skipped (API endpoint) |
| Sucuri config | Configured on Sucuri’s side under Coralogix-style log forwarding (matches the field names below) |
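
The auth rule in the table can be sketched roughly as follows. This is an illustration, not the controller's actual code: `authorized_key?` and the `app_api_keys` list are hypothetical stand-ins (the real fallback checks App.api_key records), and using a constant-time comparison via `OpenSSL.secure_compare` is an assumption about how the secrets might be compared.

```ruby
require "openssl"

# Hypothetical helper: true when the supplied key matches the dedicated
# ingestion key, or failing that, any known app key.
# `params` is a plain Hash standing in for Rails request params.
def authorized_key?(params, ingestion_key:, app_api_keys: [])
  supplied = (params["privateKey"] || params["api_key"]).to_s
  return false if supplied.empty?

  # Dedicated key first (ENV['WAF_LOG_INGESTION_API_KEY'] in the real app).
  if ingestion_key && !ingestion_key.empty? &&
     OpenSSL.secure_compare(supplied, ingestion_key)
    return true
  end

  # Fallback: any valid app key (assumed here to be a list of strings).
  app_api_keys.any? { |key| OpenSSL.secure_compare(supplied, key) }
end
```

A call might look like `authorized_key?(params, ingestion_key: ENV["WAF_LOG_INGESTION_API_KEY"], app_api_keys: keys)`, where `keys` is whatever collection of valid app keys the application exposes.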

Request payload shape

```json
{
  "subsystemName": "example.com",
  "logEntries": [
    {
      "severity": 3,
      "text": "<Apache combined log line with Sucuri block fields>",
      "category": "<block code / event type>"
    },
    ...
  ]
}
```

  • subsystemName — the customer’s domain. We look up the corresponding Waf via Waf.lookup_sucuri_domain.
  • logEntries[].text — Apache-format combined log line with appended Sucuri fields (block reason, geo, cache hit, etc.). Parsed by ApacheLogParser.
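
As a rough sketch of how a body with this shape could be read, the snippet below parses the JSON and pulls out the two fields described above. `ShippedBatch` and `parse_shipped_batch` are hypothetical names for illustration, and the strictness of the checks is an assumption; the real controller may validate differently.

```ruby
require "json"

# Hypothetical value object for one shipped batch.
ShippedBatch = Struct.new(:domain, :entries, keyword_init: true)

# Parses a raw request body into a ShippedBatch.
# Raises JSON::ParserError on invalid JSON and ArgumentError when the
# expected top-level fields are missing or have the wrong type.
def parse_shipped_batch(raw_body)
  data = JSON.parse(raw_body)
  raise ArgumentError, "payload must be a JSON object" unless data.is_a?(Hash)

  domain = data["subsystemName"]
  entries = data["logEntries"]
  unless domain.is_a?(String) && !domain.empty?
    raise ArgumentError, "subsystemName missing"
  end
  raise ArgumentError, "logEntries must be an array" unless entries.is_a?(Array)

  ShippedBatch.new(domain: domain, entries: entries)
end
```

The `domain` value is what would be handed to Waf.lookup_sucuri_domain, and each element of `entries` carries the `text` line for ApacheLogParser.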

Processing pipeline

  1. Controller validates content-type, size, auth.
  2. Looks up Waf by subsystemName. Unknown domain → returns accepted with reason: 'unknown_domain' and (unless allowlisted) notifies Honeybadger.
  3. Creates a WafLogEntry row with raw_payload (jsonb) + payload_size.
  4. Enqueues WafLogProcessorWorker (Sidekiq queue: waf_logs, retry: 5).
  5. Worker calls WafLogProcessor → extracts the first log entry’s Apache text → ApacheLogParser returns structured fields (client_ip, http_method, request_path, http_status, response_size, referrer, user_agent, proxy_block_id, cache_hit, geo, timestamp).
  6. Worker records the parsed result through WafMetricsRecorder (drives time-series counters) and WafIngestionCounter (fast-read counter for backroom admin views).
  7. On success, raw_payload is cleared to save DB space — unless ENV['KEEP_RAW_PAYLOADS_FOR_DEBUGGING'] is set.
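
To make step 5 more concrete, here is a minimal sketch of parsing the standard part of an Apache combined log line. It is not the real ApacheLogParser: `COMBINED_LOG` and `parse_combined_line` are hypothetical names, the timestamp is kept as a raw string, and because the layout of the appended Sucuri fields is not described here, everything after the user agent is returned unparsed in `extra`.

```ruby
# Standard Apache combined log format. Whatever follows the user agent
# (the Sucuri-specific fields) is captured unparsed in `extra`.
COMBINED_LOG = /
  \A(?<client_ip>\S+)\s\S+\s\S+\s
  \[(?<timestamp>[^\]]+)\]\s
  "(?<http_method>\S+)\s(?<request_path>\S+)(?:\s\S+)?"\s
  (?<http_status>\d{3})\s
  (?<response_size>\d+|-)\s
  "(?<referrer>[^"]*)"\s
  "(?<user_agent>[^"]*)"
  (?<extra>.*)\z
/x

# Returns a Hash of structured fields, or nil when the line does not match.
def parse_combined_line(line)
  m = COMBINED_LOG.match(line.strip)
  return nil unless m

  {
    client_ip: m[:client_ip],
    timestamp: m[:timestamp], # raw string; time zone handling is left out
    http_method: m[:http_method],
    request_path: m[:request_path],
    http_status: m[:http_status].to_i,
    # Apache logs "-" when no body bytes were sent.
    response_size: m[:response_size] == "-" ? 0 : m[:response_size].to_i,
    referrer: m[:referrer],
    user_agent: m[:user_agent],
    extra: m[:extra].strip
  }
end
```

Returning nil for unparseable lines keeps the decision about retries or error reporting with the caller, which in this pipeline would be the worker.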

Customer-facing query API

Customers query the ingested log data via GET /api/logs/query and GET /api/logs/scroll. See docs/api/api-logs-endpoint.md. Maximum date range per query: 90 days. Default page size: 10,000 (also the maximum).
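
The limits above could be enforced along these lines. This is a sketch under assumptions: `normalize_log_query` is a hypothetical helper, and whether the real API rejects or clamps an oversized page size, and exactly how it counts the 90 days, is not stated in this document.

```ruby
require "date"

MAX_RANGE_DAYS = 90
MAX_PAGE_SIZE = 10_000

# Hypothetical validator for query parameters.
# Rejects ranges longer than MAX_RANGE_DAYS and clamps the page size
# into 1..MAX_PAGE_SIZE (clamping rather than rejecting is an assumption).
def normalize_log_query(from:, to:, page_size: MAX_PAGE_SIZE)
  raise ArgumentError, "from must not be after to" if from > to

  span_days = (to - from).to_i # Date subtraction yields a number of days
  if span_days > MAX_RANGE_DAYS
    raise ArgumentError, "date range exceeds #{MAX_RANGE_DAYS} days"
  end

  { from: from, to: to, page_size: page_size.clamp(1, MAX_PAGE_SIZE) }
end
```

For example, `normalize_log_query(from: Date.new(2024, 1, 1), to: Date.new(2024, 3, 31))` spans exactly 90 days and passes, while extending `to` by one more day raises.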

Skipped / silently dropped

SILENTLY_SKIPPED_DOMAINS in the controller suppresses Honeybadger notifications for domains Sucuri ships logs for but we don’t manage (e.g. able.expssl-wild.com).

If a WAF's log_analytics_enabled? returns false, the payload is acknowledged with reason: 'log_analytics_disabled' and dropped; no WafLogEntry row is created.
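
The skip and drop rules in this section can be summarized as a small decision function. This is a sketch, not the controller's code: `WafStub`, `Decision`, and `ingestion_decision` are hypothetical, and it assumes that no WafLogEntry row is created for an unknown domain either.

```ruby
# Domains Sucuri ships logs for but we don't manage (example from this doc).
SILENTLY_SKIPPED_DOMAINS = ["able.expssl-wild.com"].freeze

# Hypothetical stand-in for the Waf model.
WafStub = Struct.new(:domain, :analytics_enabled) do
  def log_analytics_enabled?
    analytics_enabled
  end
end

# reason:  acknowledgement reason returned to Sucuri (nil when ingested)
# persist: whether a WafLogEntry row would be created
# notify:  whether Honeybadger would be notified
Decision = Struct.new(:reason, :persist, :notify, keyword_init: true)

# `waf` is the looked-up record, or nil when the domain is unknown.
def ingestion_decision(waf, domain)
  if waf.nil?
    Decision.new(reason: "unknown_domain", persist: false,
                 notify: !SILENTLY_SKIPPED_DOMAINS.include?(domain))
  elsif !waf.log_analytics_enabled?
    Decision.new(reason: "log_analytics_disabled", persist: false, notify: false)
  else
    Decision.new(reason: nil, persist: true, notify: false)
  end
end
```

In every branch the request is still acknowledged; only the side effects (row creation and notification) differ.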

Time zones

Sucuri expresses timestamps and the audit-trail date parameter in US Eastern time. We store and query in UTC. See time-zones.
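
To illustrate the conversion, the sketch below turns a US Eastern wall-clock time into UTC using only the Ruby standard library. It is not how the application does it (a Rails app would more likely rely on ActiveSupport's time zone support); `eastern_to_utc` and its helpers are hypothetical, and the daylight saving rule is hard-coded to the US rule in effect since 2007 (second Sunday of March to first Sunday of November).

```ruby
require "date"

# Returns the n-th Sunday (1-based) of the given month.
def nth_sunday(year, month, n)
  first = Date.new(year, month, 1)
  first_sunday = first + ((7 - first.wday) % 7)
  first_sunday + 7 * (n - 1)
end

# True when an Eastern wall-clock time falls within daylight saving time.
# The ambiguous 01:00-02:00 hour on the fall-back day is treated as DST.
def eastern_dst?(year, month, day, hour)
  date = Date.new(year, month, day)
  dst_start = nth_sunday(year, 3, 2)
  dst_end = nth_sunday(year, 11, 1)
  return false if date < dst_start || date > dst_end
  return hour >= 2 if date == dst_start
  return hour < 2 if date == dst_end

  true
end

# Converts an Eastern wall-clock time to a UTC Time.
def eastern_to_utc(year, month, day, hour = 0, min = 0, sec = 0)
  offset = eastern_dst?(year, month, day, hour) ? "-04:00" : "-05:00"
  Time.new(year, month, day, hour, min, sec, offset).utc
end
```

With this rule, noon Eastern on a January day maps to 17:00 UTC and noon Eastern on a July day maps to 16:00 UTC, so a day boundary in Eastern time does not line up with a UTC day boundary.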