BLACKSHIELD

Public guide

Network Sensor Scaling and Performance

Capacity planning, performance tuning, and sizing guidance for high-volume network telemetry ingestion. Audience: Platform architects, operations engineers, security engineers. Typical setup time: 10 minutes.

Before you begin

  • You have deployed at least one network sensor and are familiar with basic operations.
  • You understand your network's typical traffic volume and have monitoring in place.
  • You have access to CloudWatch (AWS), Cloud Monitoring (GCP), or Monitor (Azure).

Guide walkthrough

Step 1

Determine traffic volume and sizing tier

Size the sensor infrastructure based on expected network traffic and alert volume.

  • Low volume (<10 Gbps, <1000 alerts/min): t3.medium or g1-small — single sensor instance.
  • Medium volume (10–50 Gbps, 1k–10k alerts/min): m5.large or n1-standard-2 — single sensor, upgrade CPU/memory.
  • High volume (50–500 Gbps, 10k–100k alerts/min): c5.2xlarge or n1-standard-4 — multi-sensor active-passive or active-active.
  • Very high volume (>500 Gbps): multi-sensor active-active with load balancing and dedicated backend.
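
The tier boundaries above can be expressed as a small lookup. This is a minimal sketch, not part of the product: the function name `sizing_tier` and the tier labels are hypothetical, and it assumes that when traffic and alert volume point to different tiers, the larger tier wins. The list gives no alert bound for the very high tier, so the sketch also assumes that more than 100k alerts/min lands there.

```python
def sizing_tier(traffic_gbps: float, alerts_per_min: float) -> str:
    """Map expected load to a sizing tier using the thresholds in the list above.

    Assumption: either dimension exceeding a tier's bound pushes the
    deployment into the next tier.
    """
    if traffic_gbps < 10 and alerts_per_min < 1_000:
        return "low"        # single sensor instance
    if traffic_gbps <= 50 and alerts_per_min <= 10_000:
        return "medium"     # single sensor with more CPU/memory
    if traffic_gbps <= 500 and alerts_per_min <= 100_000:
        return "high"       # multi-sensor active-passive or active-active
    return "very_high"      # active-active with load balancing


if __name__ == "__main__":
    print(sizing_tier(5, 5_000))
    print(sizing_tier(600, 10))
```

Here a 5 Gbps link that produces 5,000 alerts/min is sized as medium rather than low, because the alert volume exceeds the low-tier bound.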

Step 2

Tuning configuration for your workload

Adjust sensor parameters to match your priorities (real-time vs. accuracy vs. cost).

  • Real-time priority: MIN_SEVERITY=medium, FLUSH_INTERVAL_SECONDS=10, PACKET_SAMPLING_RATE=1.0 (no sampling).
  • Cost-optimized: MIN_SEVERITY=high, FLUSH_INTERVAL_SECONDS=300, PACKET_SAMPLING_RATE=0.1 (10% sampling).
  • High-volume ingestion: set MAX_EVENTS_PER_BATCH=5000, BATCH_TIMEOUT_SECONDS=30.
  • Reduce API load: set SCAN_INTERVAL_SECONDS=60 to ingest periodically rather than continuously.
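
One way to keep these presets consistent is to hold them as data and render the env file from them. The sketch below is illustrative only: the variable names and values come from the list above, while `PROFILES`, `HIGH_VOLUME`, `REDUCED_API_LOAD`, and `render_env` are hypothetical helpers. It also assumes the high-volume and API-load settings can be layered on top of either priority profile, which this guide does not state explicitly.

```python
# Priority profiles, copied from the list above.
PROFILES = {
    "real_time": {
        "MIN_SEVERITY": "medium",
        "FLUSH_INTERVAL_SECONDS": 10,
        "PACKET_SAMPLING_RATE": 1.0,   # no sampling
    },
    "cost_optimized": {
        "MIN_SEVERITY": "high",
        "FLUSH_INTERVAL_SECONDS": 300,
        "PACKET_SAMPLING_RATE": 0.1,   # 10% sampling
    },
}

# Optional overlays (assumed to be combinable with either profile).
HIGH_VOLUME = {"MAX_EVENTS_PER_BATCH": 5000, "BATCH_TIMEOUT_SECONDS": 30}
REDUCED_API_LOAD = {"SCAN_INTERVAL_SECONDS": 60}


def render_env(profile: str, *overlays: dict) -> str:
    """Return KEY=VALUE lines for a profile, with later overlays overriding earlier keys."""
    settings = dict(PROFILES[profile])
    for overlay in overlays:
        settings.update(overlay)
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(render_env("cost_optimized", HIGH_VOLUME, REDUCED_API_LOAD))
```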

Step 3

Sensor type comparison and selection

Choose between Suricata, Zeek, and eBPF based on use case and resource constraints.

  • Suricata: 15–20 Gbps per core, best for malware/IDS detection, highest memory (4–8 GB for 50 Gbps).
  • Zeek: 5–10 Gbps per core, best for protocol analysis and behavior profiling, moderate memory (2–4 GB).
  • eBPF: 50–100 Gbps per core, best for runtime events and system call monitoring, lowest memory (500 MB–1 GB).
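
The per-core figures above translate into a rough core count for a target throughput. The following is a back-of-the-envelope sketch: `PER_CORE_GBPS` copies the ranges from the list, and `cores_needed` is a hypothetical helper. It assumes throughput scales linearly with cores, which real deployments may not achieve, so treat the result as a starting point for load testing.

```python
import math

# Per-core throughput ranges in Gbps (low end, high end), from the list above.
PER_CORE_GBPS = {
    "suricata": (15, 20),
    "zeek": (5, 10),
    "ebpf": (50, 100),
}


def cores_needed(sensor_type: str, target_gbps: float) -> tuple[int, int]:
    """Return (conservative, optimistic) core counts for the target throughput.

    The conservative figure divides by the low end of the per-core range,
    the optimistic figure by the high end. Linear scaling is assumed.
    """
    low, high = PER_CORE_GBPS[sensor_type]
    return math.ceil(target_gbps / low), math.ceil(target_gbps / high)


if __name__ == "__main__":
    for sensor in PER_CORE_GBPS:
        print(sensor, cores_needed(sensor, 50))
```

For a 50 Gbps target, this gives between 5 and 10 cores for Zeek and 3 to 4 for Suricata under the stated assumption.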

Step 4

Monitoring and alerting

Set up dashboards and alerts to track sensor health and performance.

  • Monitor: findings_ingested_total, capture_packets_dropped, cpu_usage, memory_usage, api_request_latency.
  • Alert on: cpu_usage > 80%, memory_usage > 85%, capture_packets_dropped > 1%, api_errors_5xx > 10/min.
  • Enable CloudWatch (AWS), Cloud Monitoring (GCP), or Monitor (Azure) agent on sensor VM.
  • Export metrics to your SIEM or observability platform for centralized alerting.
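
The alert conditions above amount to comparing each metric sample against a fixed limit. The sketch below shows that comparison in isolation and does not call any monitoring API. The keys `capture_packets_dropped_pct` and `api_errors_5xx_per_min` are hypothetical names, and it assumes you have already converted the raw counters into a drop percentage and a per-minute error rate.

```python
# Alert limits from the list above. Units are percent, except the 5xx rate.
THRESHOLDS = {
    "cpu_usage": 80.0,
    "memory_usage": 85.0,
    "capture_packets_dropped_pct": 1.0,   # hypothetical derived metric
    "api_errors_5xx_per_min": 10.0,       # hypothetical derived metric
}


def breached(sample: dict) -> list[str]:
    """Return the sorted names of metrics in the sample that exceed their limit.

    Metrics missing from the sample are treated as 0 and never alert.
    """
    return sorted(
        name for name, limit in THRESHOLDS.items()
        if sample.get(name, 0.0) > limit
    )


if __name__ == "__main__":
    sample = {"cpu_usage": 91.0, "memory_usage": 60.0, "capture_packets_dropped_pct": 1.5}
    print(breached(sample))
```

In practice the same limits would be configured as alarms in CloudWatch, Cloud Monitoring, or Azure Monitor. A local check like this is mainly useful for validating exported metrics before wiring up alerting.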

Demonstration only

This configuration is designed for ease of use. To deploy scanner clients at scale, please plan your deployment architecture accordingly or contact us for enterprise best practices.

Run

sensor-environment-vars.env

bash
# Production real-time configuration
SENSOR_TYPE=suricata
MIN_SEVERITY=medium
SCAN_INTERVAL_SECONDS=30
PACKET_SAMPLING_RATE=1.0
FLUSH_INTERVAL_SECONDS=10
MAX_EVENTS_PER_BATCH=1000
BATCH_TIMEOUT_SECONDS=10
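
Before deploying a file like the one above, it can help to parse it and run a few sanity checks. This is a minimal sketch: `parse_env` and `check_settings` are hypothetical helpers, and the valid ranges (a sampling rate in (0, 1], positive integer intervals and batch sizes) are assumptions inferred from the examples in this guide, not documented limits.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def check_settings(settings: dict) -> list[str]:
    """Return a list of problems found. The ranges checked here are assumptions."""
    problems = []
    rate = float(settings.get("PACKET_SAMPLING_RATE", "1.0"))
    if not 0.0 < rate <= 1.0:
        problems.append("PACKET_SAMPLING_RATE must be in (0, 1]")
    for key in ("SCAN_INTERVAL_SECONDS", "FLUSH_INTERVAL_SECONDS",
                "MAX_EVENTS_PER_BATCH", "BATCH_TIMEOUT_SECONDS"):
        if key in settings and int(settings[key]) <= 0:
            problems.append(f"{key} must be a positive integer")
    return problems


if __name__ == "__main__":
    example = "# demo\nSENSOR_TYPE=suricata\nPACKET_SAMPLING_RATE=1.0\nFLUSH_INTERVAL_SECONDS=10\n"
    print(check_settings(parse_env(example)))
```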

What success looks like

  • Sensor CPU utilization stays below 80% during normal traffic patterns.
  • Memory usage is stable and does not exceed the allocated instance size.
  • Ingestion latency (from capture to platform) is less than 30 seconds.