BLACKSHIELD

Public guide

Network Sensor Troubleshooting

Diagnostic runbook for container startup, traffic capture, API connectivity, health checks, resource usage, and ingestion gaps. Audience: Operations engineers, DevOps teams, security operations teams. Typical setup time: 5-15 minutes.

Before you begin

  • You have SSH or console access to the sensor instance.
  • Docker and basic Linux utilities (tcpdump, curl, netstat) are available on the sensor.
  • You know the sensor's API key and platform API URL.

Guide walkthrough

Step 1

Container startup issues

Diagnose problems with sensor container initialization and API key retrieval.

  • Check container status: docker ps -a | grep network-sensor
  • Review startup logs: docker logs [container-id]. Look for Secrets Manager errors, API connectivity failures, or invalid configuration.
  • Verify the API key is present in Secrets Manager or in the container environment: aws secretsmanager get-secret-value --secret-id blackshield/api-key
  • Confirm network connectivity to the platform API: curl -v https://api.blackshield.chaplau.com/health
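
The container checks above can be combined into one small script. This is a minimal sketch, not part of the BlackShield tooling: the function names `describe_state` and `check_sensor_container` are hypothetical, and it assumes the container name contains `network-sensor`, as the grep above suggests.

```shell
# Hypothetical helper: turn a container's status and restart count
# (as reported by `docker inspect`) into a one-line hint.
describe_state() {
  local status="$1" restarts="$2"
  if [ "$status" = "running" ] && [ "$restarts" -eq 0 ]; then
    echo "ok: running, no restarts"
  elif [ "$status" = "running" ]; then
    echo "warn: running, restarted ${restarts} time(s); review startup logs"
  else
    echo "fail: status=${status}; review startup logs"
  fi
}

# Find the first container whose name matches network-sensor (an assumption)
# and describe its state.
check_sensor_container() {
  local id
  id="$(docker ps -aq --filter name=network-sensor | head -n 1)"
  [ -n "$id" ] || { echo "fail: no network-sensor container found"; return 1; }
  describe_state \
    "$(docker inspect --format '{{.State.Status}}' "$id")" \
    "$(docker inspect --format '{{.RestartCount}}' "$id")"
}
```

Running `check_sensor_container` on the sensor host prints a single line; anything other than `ok` points back to the startup logs.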

Step 2

Traffic capture and mirroring

Verify packets are reaching the sensor and being processed.

  • Check mirroring configuration: AWS (aws ec2 describe-traffic-mirror-sessions), GCP (gcloud compute packet-mirrorings list), Azure (az network vnet tap list)
  • Capture packets on the sensor: sudo tcpdump -ni eth0 -c 10 'udp port 4789'; you should see VXLAN-encapsulated traffic.
  • If no packets, check source VM traffic: trigger curl or ping from a monitored workload and re-run tcpdump.
  • Verify the sensor interface is not in promiscuous mode: ip link show [interface] (look for the PROMISC flag).
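
The capture and interface checks can be scripted so they do not hang when no mirrored traffic arrives. The sketch below rests on assumptions: the function names are hypothetical, and the `eth0` default and the 15-second capture window are illustrative values.

```shell
# Hypothetical helper: read `ip link show` output on stdin and report
# whether the PROMISC flag is listed for the interface.
promisc_state() {
  if grep -q 'PROMISC'; then echo "promisc"; else echo "not promisc"; fi
}

# Capture up to 10 VXLAN packets (UDP port 4789), giving up after 15 seconds
# so the command returns even when nothing is being mirrored.
capture_vxlan_sample() {
  local iface="${1:-eth0}"
  sudo timeout 15 tcpdump -ni "$iface" -c 10 'udp port 4789'
}
```

For example, `ip link show eth0 | promisc_state` prints the flag state, and `capture_vxlan_sample eth0` can be re-run after generating traffic from a monitored workload.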

Step 3

API connectivity and authentication

Ensure the sensor can reach and authenticate with the platform.

  • Test connectivity: curl -s -H 'Authorization: Bearer [API_KEY]' https://api.blackshield.chaplau.com/api/v1/health
  • Check sensor logs for '401 Unauthorized' or '403 Forbidden'.
  • Verify the API key is active in Settings > API Keys and has Ingestion scope.
  • If the sensor is behind a firewall, test over a different network path (via NAT, proxy, etc.).
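
To separate reachability problems from authentication problems, it can help to look only at the HTTP status code of the health request. The following is a rough sketch: `explain_status` and `probe_api` are hypothetical names, and the hints merely restate the checks listed above rather than documented platform behavior.

```shell
# Hypothetical helper: map an HTTP status code to the next check to run.
explain_status() {
  case "$1" in
    200)     echo "ok: reachable and authenticated" ;;
    401|403) echo "auth failure: verify the key is active and has Ingestion scope" ;;
    000)     echo "no response: check DNS, firewall, NAT, or proxy path" ;;
    *)       echo "unexpected status $1: check sensor logs" ;;
  esac
}

# Print only the status code of the authenticated health request.
# curl reports 000 when it receives no HTTP response at all.
probe_api() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer ${BLACKSHIELD_API_KEY}" \
    https://api.blackshield.chaplau.com/api/v1/health
}
```

A typical call is `explain_status "$(probe_api)"`, run with `BLACKSHIELD_API_KEY` exported in the shell.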

Step 4

Health check and readiness

Verify the sensor health endpoint and readiness probe.

  • Health check: curl -s http://localhost:8080/health — should return 200 with 'ok' status.
  • Readiness: curl -s http://localhost:8080/ready — confirms sensor is ready to ingest.
  • If health fails, check disk space (docker exec [id] df -h), memory (docker stats), and CPU.
  • Review container resource limits: docker inspect [id] | grep -A 10 HostConfig
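
A sensor that has just started may need a few seconds before its endpoints respond, so a single curl can give a misleading failure. The sketch below polls the two endpoints with a small retry loop; the helper names, the attempt count, and the delay are assumptions chosen for illustration.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay_seconds> <command...>
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# -f makes curl exit non-zero on HTTP error responses (4xx/5xx).
health_ok() { curl -sf --max-time 5 http://localhost:8080/health >/dev/null; }
ready_ok()  { curl -sf --max-time 5 http://localhost:8080/ready  >/dev/null; }

# Print the memory (bytes) and CPU (nano-CPUs) limits of a container.
# A value of 0 means that particular limit is not set.
show_limits() {
  docker inspect --format 'memory={{.HostConfig.Memory}} nano_cpus={{.HostConfig.NanoCpus}}' "$1"
}
```

For example, `retry 5 2 health_ok && retry 5 2 ready_ok && echo "sensor healthy and ready"` waits up to roughly ten seconds per endpoint before reporting a failure.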

Step 5

Resource usage and tuning

Monitor CPU, memory, and disk usage; adjust configuration if needed.

  • Real-time stats: docker stats [container-id] — look for CPU % and memory %.
  • High CPU: raise MIN_SEVERITY so fewer low-severity findings are processed, set PACKET_SAMPLING_RATE < 1, or use SENSOR_FILTER_RULES.
  • High memory: reduce MAX_ALERTS_BUFFER_SIZE, or lower FLUSH_INTERVAL_SECONDS so buffered alerts are flushed more often.
  • Disk full: docker exec [id] du -sh /var/log/ — remove old scan logs or increase EBS/disk size.
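
For a one-shot check instead of the live `docker stats` view, the percentages can be compared against a threshold in a script. This is only a sketch: `over_threshold` and `sensor_usage` are hypothetical helpers, and the 80 percent threshold in the usage note is an arbitrary example, not a BlackShield recommendation.

```shell
# Hypothetical helper: succeed when a percentage such as "85.30%" is
# strictly greater than a numeric threshold. awk does the decimal comparison.
over_threshold() {
  awk -v v="${1%\%}" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

# Print one sample of CPU and memory percentages for a container,
# for example "12.34% 56.78%".
sensor_usage() {
  docker stats --no-stream --format '{{.CPUPerc}} {{.MemPerc}}' "$1"
}
```

One way to use it: run `set -- $(sensor_usage [container-id])`, then `over_threshold "$1" 80 && echo "high CPU"` and `over_threshold "$2" 80 && echo "high memory"`.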

Step 6

Ingestion gaps and missing findings

Diagnose why expected findings are not appearing in the platform.

  • Check sensor metrics: docker exec [id] curl http://localhost:8080/metrics | grep ingestion
  • Verify MIN_SEVERITY is not filtering out findings: set MIN_SEVERITY=low temporarily to test.
  • Check for 429 (rate limit) or 503 (service degraded) errors in sensor logs.
  • Confirm findings are being sent: docker exec [id] tcpdump -ni eth0 -c 20 'host api.blackshield.chaplau.com and dst port 443' (the payload is TLS-encrypted, so this confirms only that packets are leaving for the platform).
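
Counting rate-limit and service-degraded responses over a recent window gives a quick sense of whether throttling explains the gap. The sensor's log format is not documented here, so the sketch below simply matches the bare status codes 429 and 503 as whole words and may over-count; `count_throttle_errors` is a hypothetical name.

```shell
# Hypothetical helper: count log lines on stdin that mention 429 or 503
# as a whole word. grep -c exits non-zero when the count is 0, so
# `|| true` keeps the function's exit status clean.
count_throttle_errors() {
  grep -cwE '429|503' || true
}
```

For example, `docker logs --since 15m [container-id] 2>&1 | count_throttle_errors` prints the number of matching lines from the last 15 minutes.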

Run this

troubleshooting-commands.sh

bash
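#!/bin/bash
# Quick troubleshooting commands

# List sensor containers, including stopped ones.
docker ps -a | grep network-sensor
# Show the last 50 log lines of the first running sensor container.
docker logs --tail 50 "$(docker ps -q --filter name=network-sensor | head -n 1)"
# Check authenticated connectivity to the platform API.
curl -s -H "Authorization: Bearer $BLACKSHIELD_API_KEY" https://api.blackshield.chaplau.com/api/v1/health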

What success looks like

  • Container is running and not restarting continuously.
  • Sensor can successfully authenticate to the platform API.
  • Network telemetry is flowing to the platform with no dropped packets.