BLACKSHIELD

Public guide

Network Sensor Scaling and Performance

Capacity planning, performance tuning, and sizing guidance for high-volume network telemetry ingestion. Audience: Platform architects, operations engineers, security engineers. Typical setup time: 10 minutes.

reference

Before you begin

  • You have deployed at least one network sensor and are familiar with basic operations.
  • You understand your network's typical traffic volume and have monitoring in place.
  • You have access to CloudWatch (AWS), Cloud Monitoring (GCP), or Monitor (Azure).

Guide walkthrough

Step 1

Determine traffic volume and sizing tier

Size the sensor infrastructure based on expected network traffic and alert volume.

  • Low volume (<10 Gbps, <1,000 alerts/min): t3.medium or g1-small, single sensor instance.
  • Medium volume (10–50 Gbps, 1k–10k alerts/min): m5.large or n1-standard-2, single sensor with more CPU and memory.
  • High volume (50–500 Gbps, 10k–100k alerts/min): c5.2xlarge or n1-standard-4, multi-sensor active-passive or active-active.
  • Very high volume (>500 Gbps): multi-sensor active-active with load balancing and dedicated backend.
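
The tiers above can be sketched as a small lookup. This is an illustration only: the function name `sizing_tier` and the `Tier` type are hypothetical, and two behaviors are assumptions the guide does not state, namely that a value exactly on a boundary (for example 10 Gbps) falls into the larger tier, and that exceeding either the traffic or the alert-rate limit moves you up a tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_gbps: float            # exclusive upper bound on traffic
    max_alerts_per_min: float  # exclusive upper bound on alert rate
    guidance: str


# Thresholds and instance types are copied from the list above.
TIERS = [
    Tier("low", 10, 1_000, "t3.medium or g1-small, single sensor instance"),
    Tier("medium", 50, 10_000, "m5.large or n1-standard-2, single sensor"),
    Tier("high", 500, 100_000, "c5.2xlarge or n1-standard-4, multi-sensor"),
]
VERY_HIGH = Tier(
    "very high",
    float("inf"),
    float("inf"),
    "multi-sensor active-active with load balancing and dedicated backend",
)


def sizing_tier(gbps: float, alerts_per_min: float) -> Tier:
    """Return the smallest tier whose limits cover both inputs (assumed rule)."""
    for tier in TIERS:
        if gbps < tier.max_gbps and alerts_per_min < tier.max_alerts_per_min:
            return tier
    return VERY_HIGH


print(sizing_tier(25, 4_000).name)
```

Treating the alert rate as an independent trigger is the conservative choice here: a quiet link with a very noisy ruleset is sized by its alert volume rather than its bandwidth.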

Step 2

Tune the configuration for your workload

Adjust sensor parameters to match your priorities (real-time vs. accuracy vs. cost).

  • Real-time priority: MIN_SEVERITY=medium, FLUSH_INTERVAL_SECONDS=10, PACKET_SAMPLING_RATE=1.0 (no sampling).
  • Cost-optimized: MIN_SEVERITY=high, FLUSH_INTERVAL_SECONDS=300, PACKET_SAMPLING_RATE=0.1 (10% sampling).
  • High-volume ingestion: set MAX_EVENTS_PER_BATCH=5000, BATCH_TIMEOUT_SECONDS=30.
  • Reduce API load: set SCAN_INTERVAL_SECONDS=60 for periodic ingestion vs. continuous.
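
One way to keep these presets consistent across sensors is to generate the environment file from a single table. The sketch below assumes the parameter names and values listed above; the profile and overlay labels (`real_time`, `cost_optimized`, `high_volume`, `reduce_api_load`) and the `render_env` helper are hypothetical, and the idea that the last two presets can be layered on top of either priority profile is an assumption.

```python
# Base profiles: values taken from the list above.
PROFILES = {
    "real_time": {
        "MIN_SEVERITY": "medium",
        "FLUSH_INTERVAL_SECONDS": 10,
        "PACKET_SAMPLING_RATE": 1.0,
    },
    "cost_optimized": {
        "MIN_SEVERITY": "high",
        "FLUSH_INTERVAL_SECONDS": 300,
        "PACKET_SAMPLING_RATE": 0.1,
    },
}

# Optional overlays (assumed to be combinable with either profile).
OVERLAYS = {
    "high_volume": {"MAX_EVENTS_PER_BATCH": 5000, "BATCH_TIMEOUT_SECONDS": 30},
    "reduce_api_load": {"SCAN_INTERVAL_SECONDS": 60},
}


def render_env(profile: str, overlays: tuple = ()) -> str:
    """Merge a base profile with overlays and render KEY=value lines."""
    settings = dict(PROFILES[profile])
    for name in overlays:
        settings.update(OVERLAYS[name])
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(render_env("cost_optimized", ("reduce_api_load",)))
```

Later overlays override earlier values for the same key, so the order of the `overlays` argument matters if two overlays ever set the same parameter.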

Step 3

Sensor type comparison and selection

Choose between Suricata, Zeek, and eBPF based on use case and resource constraints.

  • Suricata: 15–20 Gbps per core, best for malware/IDS detection, highest memory (4–8 GB for 50 Gbps).
  • Zeek: 5–10 Gbps per core, best for protocol analysis and behavior profiling, moderate memory (2–4 GB).
  • eBPF: 50–100 Gbps per core, best for runtime events and system call monitoring, lowest memory (500 MB–1 GB).
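
The per-core ranges above translate into a rough core count for a target throughput. The sketch below assumes throughput scales linearly with cores, which real deployments rarely achieve exactly, so treat the result as a starting point rather than a guarantee; the `cores_needed` helper is hypothetical.

```python
import math

# Per-core throughput ranges in Gbps (low, high), copied from the list above.
PER_CORE_GBPS = {
    "suricata": (15, 20),
    "zeek": (5, 10),
    "ebpf": (50, 100),
}


def cores_needed(sensor_type: str, target_gbps: float) -> tuple:
    """Return (best_case, worst_case) core counts, assuming linear scaling."""
    low, high = PER_CORE_GBPS[sensor_type]
    best_case = math.ceil(target_gbps / high)   # every core hits the top of the range
    worst_case = math.ceil(target_gbps / low)   # every core hits the bottom of the range
    return best_case, worst_case


for sensor in PER_CORE_GBPS:
    print(sensor, cores_needed(sensor, 40))
```

Planning against the worst-case figure leaves headroom for traffic bursts and for rule sets that are heavier than average.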

Step 4

Monitoring and alerting

Set up dashboards and alerts to track sensor health and performance.

  • Monitor: findings_ingested_total, capture_packets_dropped, cpu_usage, memory_usage, api_request_latency.
  • Alert on: cpu_usage > 80%, memory_usage > 85%, capture_packets_dropped > 1%, api_errors_5xx > 10/min.
  • Enable CloudWatch (AWS), Cloud Monitoring (GCP), or Monitor (Azure) agent on sensor VM.
  • Export metrics to your SIEM or observability platform for centralized alerting.
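
The alert thresholds above can be expressed as a single check over a metrics sample. This is a sketch, not the platform's alerting logic: the `breached_alerts` function and its parameters are hypothetical, and the drop percentage is assumed to be dropped packets divided by total packets seen, since the guide does not define the denominator.

```python
def breached_alerts(
    cpu_pct: float,
    memory_pct: float,
    packets_seen: int,
    packets_dropped: int,
    api_errors_5xx_per_min: float,
) -> list:
    """Return the names of thresholds (from the list above) that are exceeded."""
    # Assumed definition: drops as a percentage of all packets seen.
    drop_pct = 100.0 * packets_dropped / packets_seen if packets_seen else 0.0
    checks = {
        "cpu_usage": cpu_pct > 80,
        "memory_usage": memory_pct > 85,
        "capture_packets_dropped": drop_pct > 1,
        "api_errors_5xx": api_errors_5xx_per_min > 10,
    }
    return [name for name, breached in checks.items() if breached]


print(breached_alerts(85, 50, 1_000_000, 20_000, 3))
```

All comparisons are strict, matching the ">" in the list, so a sample sitting exactly on a threshold does not fire.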

Demo only

This configuration is designed for ease of use. To deploy scanning clients at scale, plan your deployment architecture accordingly or contact us for enterprise best practices.

Run this

sensor-environment-vars.env

bash
# Production real-time configuration
SENSOR_TYPE=suricata
MIN_SEVERITY=medium
SCAN_INTERVAL_SECONDS=30
PACKET_SAMPLING_RATE=1.0
FLUSH_INTERVAL_SECONDS=10
MAX_EVENTS_PER_BATCH=1000
BATCH_TIMEOUT_SECONDS=10
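
Before rolling a file like this out, a quick sanity check can catch typos. The sketch below parses KEY=value lines and applies a few plausibility rules; the rules themselves (sampling rate in the range (0, 1], positive integers for intervals and batch sizes) are assumptions, not documented limits, and `parse_env` and `validate` are hypothetical helpers.

```python
EXAMPLE = """\
# Production real-time configuration
SENSOR_TYPE=suricata
MIN_SEVERITY=medium
SCAN_INTERVAL_SECONDS=30
PACKET_SAMPLING_RATE=1.0
FLUSH_INTERVAL_SECONDS=10
MAX_EVENTS_PER_BATCH=1000
BATCH_TIMEOUT_SECONDS=10
"""

POSITIVE_INT_KEYS = (
    "SCAN_INTERVAL_SECONDS",
    "FLUSH_INTERVAL_SECONDS",
    "MAX_EVENTS_PER_BATCH",
    "BATCH_TIMEOUT_SECONDS",
)


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def validate(settings: dict) -> list:
    """Return a list of problems found under the assumed plausibility rules."""
    problems = []
    if "PACKET_SAMPLING_RATE" in settings:
        try:
            rate = float(settings["PACKET_SAMPLING_RATE"])
            if not 0.0 < rate <= 1.0:
                problems.append("PACKET_SAMPLING_RATE must be in (0, 1]")
        except ValueError:
            problems.append("PACKET_SAMPLING_RATE must be a number")
    for key in POSITIVE_INT_KEYS:
        if key in settings:
            value = settings[key]
            if not value.isdigit() or int(value) <= 0:
                problems.append(f"{key} must be a positive integer")
    return problems


print(validate(parse_env(EXAMPLE)))
```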

What success looks like

  • Sensor CPU utilization stays below 80% during normal traffic patterns.
  • Memory usage is stable and does not exceed the allocated instance size.
  • Ingestion latency (from capture to platform) is less than 30 seconds.