Node.js Worker Threads for CPU Tasks

When to use
- CPU-bound tasks (crypto, compression, image/video processing) that would otherwise block the event loop.
- Avoid for short, tiny tasks; the worker overhead may outweigh the benefit.
Patterns
- Use a pool (e.g., piscina/workerpool) with a bounded size; queue tasks.
- Pass large buffers as Transferables; avoid heavy serialization.
- Propagate cancellation/timeouts; surface errors to the main thread (see the sketch below).
Observability
- Track queue length, task duration, worker utilization, and crashes.
- Monitor event loop lag to confirm the offload is paying off.
Safety
- Validate inputs in the main thread; avoid running untrusted code in workers.
- Cleanly shut down the pool on SIGTERM; drain and close workers.
Checklist
- CPU tasks isolated to workers; pool sized to cores.
- Transferables used for big buffers; timeouts set.
- Metrics for queue/utilization/errors in place.
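A minimal sketch of the offload pattern using only the built-in node:worker_threads module; a pooled library such as piscina is usually the better fit in production. The hashing task, buffer size, and timeout value are illustrative assumptions, not part of the original post.

// cpu-offload.js - single file acting as both main thread and worker via isMainThread.
const { Worker, isMainThread, parentPort, workerData } = require('node:worker_threads');

if (isMainThread) {
  // Offload one CPU-bound task; transfer the buffer instead of copying it.
  function runHashTask(buffer, timeoutMs = 5000) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(__filename, { workerData: buffer, transferList: [buffer] });
      const timer = setTimeout(() => {
        worker.terminate();                      // enforce the timeout from the main thread
        reject(new Error('worker timed out'));
      }, timeoutMs);
      worker.once('message', (digest) => { clearTimeout(timer); resolve(digest); });
      worker.once('error', (err) => { clearTimeout(timer); reject(err); });
    });
  }

  const payload = new ArrayBuffer(32 * 1024 * 1024); // stand-in for a large upload
  runHashTask(payload).then(console.log).catch(console.error);
} else {
  // CPU-bound work runs here, off the main event loop.
  const { createHash } = require('node:crypto');
  parentPort.postMessage(createHash('sha256').update(Buffer.from(workerData)).digest('hex'));
}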

March 18, 2025 · DevCraft Studio · 4284 views

Scaling Node.js: Fastify vs Express

Performance notes
- Fastify: schema-driven, AJV validation, low-overhead routing; typically better RPS and lower p99s.
- Express: mature ecosystem; middleware can add overhead; great for quick prototypes.
Bench considerations
- Compare with the same validation/parsing; disable unnecessary middleware in Express.
- Test under keep-alive, and under HTTP/1.1 and HTTP/2 where applicable.
- Measure event loop lag and heap, not just RPS.
Migration tips
- Start new services with Fastify; for existing Express apps, migrate edge routes first.
- Replace middleware with hooks/plugins; map Express middlewares to Fastify equivalents.
- Validate payloads with JSON schema for speed and safety (see the sketch below).
Operational tips
- Use pino (Fastify's default) for structured logs; avoid console.log overhead.
- Keep plugin count minimal; watch async hooks cost.
- Load test with realistic payload sizes; profile hotspots before/after migration.
Takeaway: Fastify wins on raw throughput and structured DX; Express is still fine for smaller apps, but performance-critical services benefit from Fastify's lower overhead.
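A minimal Fastify sketch with JSON-schema validation on the request body; the route path, schema fields, and port are illustrative assumptions.

// server.js - minimal Fastify service with schema validation (AJV runs under the hood).
const fastify = require('fastify')({ logger: true }); // pino-based structured logging

fastify.post('/items', {
  schema: {
    body: {
      type: 'object',
      required: ['name', 'qty'],
      properties: {
        name: { type: 'string', maxLength: 100 },
        qty: { type: 'integer', minimum: 1 },
      },
    },
  },
}, async (request) => {
  // Body is already validated when the handler runs; invalid payloads get a 400 automatically.
  return { ok: true, item: request.body };
});

fastify.listen({ port: 3000, host: '0.0.0.0' })
  .catch((err) => { fastify.log.error(err); process.exit(1); });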

March 2, 2025 · DevCraft Studio · 3857 views

Rate Limiting Java REST APIs

Approaches
- Token bucket for burst + steady-state control; sliding window for fairness.
- Enforce at the edge (gateway/ingress) plus app-level limits for per-tenant safety.
Spring implementation
- Use filters/interceptors with Redis/Lua for atomic buckets (a simplified sketch follows this list).
- Key by tenant/user/IP; return 429 with Retry-After.
- Expose metrics per key and rule; alert on near-capacity.
Considerations
- Separate auth failures from rate limits; avoid blocking login endpoints too aggressively.
- Keep rule configs dynamic; hot-reload from a config store.
- Combine with circuit breakers/timeouts for upstream dependencies.
Checklist
- Edge and app-level limits defined.
- Redis-based atomic counters/buckets with TTL.
- Metrics + logs for limit decisions; alerts in place.
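A simplified sketch of the filter approach using an in-process token bucket keyed by tenant; a production setup would back the bucket with Redis/Lua so counting stays atomic across instances. The class names, the X-Tenant-Id header, and the bucket parameters are assumptions for illustration.

// RateLimitFilter.java - per-tenant token bucket enforced in a servlet filter (Spring Boot 3 / Jakarta).
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Component
public class RateLimitFilter extends OncePerRequestFilter {

    private static final int CAPACITY = 100;          // burst size
    private static final double REFILL_PER_SEC = 50;  // steady-state rate

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    @Override
    protected void doFilterInternal(HttpServletRequest req, HttpServletResponse res, FilterChain chain)
            throws ServletException, IOException {
        String tenant = req.getHeader("X-Tenant-Id");
        String key = tenant != null ? tenant : req.getRemoteAddr();
        Bucket bucket = buckets.computeIfAbsent(key, k -> new Bucket(CAPACITY, REFILL_PER_SEC));
        if (bucket.tryConsume()) {
            chain.doFilter(req, res);
        } else {
            res.setStatus(429);
            res.setHeader("Retry-After", "1");        // hint clients to back off
            res.getWriter().write("rate limit exceeded");
        }
    }

    /** Naive in-memory token bucket; swap for a Redis/Lua script for multi-instance atomicity. */
    static final class Bucket {
        private final int capacity;
        private final double refillPerSec;
        private double tokens;
        private long lastRefillNanos = System.nanoTime();

        Bucket(int capacity, double refillPerSec) {
            this.capacity = capacity;
            this.refillPerSec = refillPerSec;
            this.tokens = capacity;
        }

        synchronized boolean tryConsume() {
            long now = System.nanoTime();
            tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillPerSec);
            lastRefillNanos = now;
            if (tokens >= 1) { tokens -= 1; return true; }
            return false;
        }
    }
}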

February 15, 2025 · DevCraft Studio · 4360 views

Go High-Performance Concurrency Playbook

Principles
- Keep goroutine counts bounded; size pools to CPU cores and downstream QPS.
- Apply backpressure: bounded channels + select with default to shed load early.
- Context everywhere: cancel on timeouts/parent cancellation; close resources.
- Prefer immutability; minimize shared state. Use channels for coordination.
Patterns
- Worker pool with buffered jobs; fan-out/fan-in coordinated via contexts (see the sketch below).
- Rate limit with time.Ticker or golang.org/x/time/rate.
- Semaphore via buffered channel for limited resources (DB, disk, external API).
Sync cheatsheet
- sync.Mutex for critical sections; avoid long hold times.
- sync.WaitGroup for bounded concurrent tasks; errgroup for cancellation on first error.
- sync.Map only for high-concurrency, write-light cases; prefer map+mutex otherwise.
Instrument & guard
- pprof: net/http/pprof + CPU/mem profiles in staging under load.
- Trace blocking: GODEBUG=schedtrace=1000,scavtrace=1 when diagnosing.
- Metrics: goroutines, GC pause, allocations, queue depth, worker utilization.
Checklist
- Context per request; timeouts set at ingress.
- Bounded goroutines; pools sized and observable.
- Backpressure on queues; drop/timeout strategy defined.
- pprof/metrics enabled in non-prod and behind auth in prod.
- Load tests for saturation behavior and graceful degradation.
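A minimal sketch of a bounded worker pool with load shedding and context cancellation; the pool size, queue depth, and the trivial job type are illustrative assumptions.

// workerpool.go - bounded worker pool with backpressure and context cancellation.
package main

import (
	"context"
	"errors"
	"fmt"
	"runtime"
	"sync"
	"time"
)

var ErrOverloaded = errors.New("queue full: load shed")

type Pool struct {
	jobs chan int
	wg   sync.WaitGroup
}

func NewPool(ctx context.Context, workers, queueDepth int) *Pool {
	p := &Pool{jobs: make(chan int, queueDepth)} // bounded queue = backpressure
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for {
				select {
				case <-ctx.Done(): // parent cancellation/timeouts stop workers
					return
				case job, ok := <-p.jobs:
					if !ok {
						return
					}
					_ = job * job // stand-in for real CPU/IO work
				}
			}
		}()
	}
	return p
}

// Submit sheds load instead of blocking when the queue is full.
func (p *Pool) Submit(job int) error {
	select {
	case p.jobs <- job:
		return nil
	default:
		return ErrOverloaded
	}
}

func (p *Pool) Close() { close(p.jobs); p.wg.Wait() }

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	pool := NewPool(ctx, runtime.NumCPU(), 256)
	for i := 0; i < 1000; i++ {
		if err := pool.Submit(i); err != nil {
			fmt.Println("shed:", i) // caller decides: retry, return 429, or drop
		}
	}
	pool.Close()
}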

February 10, 2025 · DevCraft Studio · 4228 views

Laravel Queues with Horizon: Reliable Setup

Workers & balancing Define queue priorities; dedicate workers per queue (emails, webhooks, default). Use balance strategies (simple, auto) and cap max processes per supervisor. Reliability Set retry/backoff per job; push non-idempotent tasks carefully. Configure timeout and retry_after (keep retry_after > max job time). Use Redis with persistence; enable horizon:supervisors monitors. Observability Horizon dashboard: throughput, runtime, failures, retries. Alert on rising failures and long runtimes; log payload/context for failed jobs. Prune failed jobs with retention policy; send to DLQ when needed. Deployment Restart Horizon on deploy to pick up code; use horizon:terminate. Ensure supervisor/systemd restarts Horizon if it dies. Checklist Queues prioritized; supervisors sized. Retries/backoff and timeouts set; DLQ plan. Monitoring/alerts configured; failed job retention in place.

January 20, 2025 · DevCraft Studio · 4050 views

Spring Boot Observability: Metrics, Traces, Logs

Metrics
- Use Micrometer + Prometheus: management.endpoints.web.exposure.include=prometheus,health,info.
- Add JVM, Tomcat, and DB pool meters; set percentiles for latencies.
- Create SLIs: request latency, error rate, saturation (threads/connections), GC pauses.
Traces
- Spring Boot 3 supports OTel via Micrometer Tracing: add spring-boot-starter-actuator + micrometer-tracing-bridge-otel + an exporter (OTLP/Zipkin/Jaeger).
- Propagate headers (traceparent); ensure async executors propagate tracing context (e.g., via a context-propagating executor/task decorator).
- Sample smartly: lower rates on noisy paths; raise them for errors.
Logs
- Use a JSON layout; include traceId/spanId for correlation.
- Avoid verbose INFO in hot paths; keep payload size bounded.
Dashboards & alerts
- Latency/error SLO dashboards per endpoint.
- DB pool saturation, thread pool queue depth, GC pause, heap used %, 5xx rate.
- Alerts on SLO burn rates; include exemplars linking metrics → traces → logs.
Checklist
- Actuator endpoints secured and exposed only where needed.
- OTLP exporter configured; sampling tuned (see the config sketch below).
- Trace/log correlation verified in staging.
- Dashboards + alerts reviewed with on-call.
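A sketch of the relevant Actuator/tracing properties, assuming Spring Boot 3 with micrometer-tracing-bridge-otel and the OTLP exporter on the classpath; the collector URL, sampling rate, and log pattern values are illustrative.

# application.properties (excerpt) - metrics, tracing, and log correlation
management.endpoints.web.exposure.include=prometheus,health,info
management.metrics.distribution.percentiles-histogram.http.server.requests=true
management.metrics.distribution.percentiles.http.server.requests=0.95,0.99

# Tracing: sample 10% by default; spans shipped via OTLP
management.tracing.sampling.probability=0.1
management.otlp.tracing.endpoint=http://otel-collector:4318/v1/traces

# Correlate logs with traces (traceId/spanId in every line)
logging.pattern.level=%5p [${spring.application.name:},%X{traceId:-},%X{spanId:-}]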

January 8, 2025 · DevCraft Studio · 4383 views

Tuning Kafka Consumers (Java)

Core settings
- Size max.poll.interval.ms to processing time; max.poll.records to batch size.
- Use fetch.min.bytes/fetch.max.wait.ms to trade latency vs. throughput.
- Set enable.auto.commit=false; commit sync/async after processing the batch (see the sketch below).
Concurrency
- Prefer multiple consumer instances over a massive max.poll.records.
- For CPU-bound steps, hand off to a bounded executor; avoid blocking the poll thread.
Ordering & retries
- Keep partition affinity when ordering matters; use a DLT for poison messages.
- Back off with jitter on retries; limit attempts per message.
Observability
- Metrics: lag per partition, commit latency, rebalances, processing time, error rates.
- Log offsets and partitions for errors; trace batch sizes.
Checklist
- Poll loop never blocks; work delegated to a bounded pool.
- Commits after successful processing; DLT in place.
- Lag and rebalance metrics monitored.
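A sketch of the poll/commit shape with the plain Java client; the topic name, group id, and the exact setting values are illustrative assumptions to be tuned per workload.

// OrdersConsumer.java - manual commits after a successfully processed batch.
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrdersConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-service");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit only after processing
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "200");     // batch sized to processing time
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000");
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "65536");    // trade a little latency for throughput
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "500");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // keep this fast or hand off to a bounded executor
                }
                if (!records.isEmpty()) {
                    consumer.commitSync(); // offsets advance only after the batch succeeded
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("partition=%d offset=%d%n", record.partition(), record.offset());
    }
}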

November 22, 2024 · DevCraft Studio · 4178 views

Go Profiling in Production with pprof

Capturing safely
- Expose /debug/pprof behind auth/VPN (e.g., curl -s -H "Authorization: Bearer ..."); see the sketch below.
- CPU profile: go tool pprof http://host/debug/pprof/profile?seconds=30.
- Heap profile: .../heap; goroutines: .../goroutine?debug=2.
- For containers: kubectl port-forward, then grab profiles; avoid CPU throttling in prod while profiling.
Reading CPU profiles
- Look at flat vs. cumulative time; identify hot functions.
- Flamegraph: go tool pprof -http=:8081 cpu.pprof.
- Check GC activity and syscalls; watch for mutex contention.
Reading heap profiles
- Compare live allocations vs. in-use objects; watch large []byte and map growth.
- Look for leaks via a rising heap over time; diff profiles between runs.
Goroutine dumps
- Spot leaked goroutines (blocked on channel/lock/I/O).
- Common culprits: missing cancel, unbounded worker creation, stuck time.After.
Best practices
- Add pprof only when needed in prod; default it on in staging.
- Sample under load close to real traffic.
- Keep artifacts: store profiles with build SHA + timestamp; compare after releases.
- Combine with metrics (alloc rate, GC pauses, goroutines) to validate fixes.
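A sketch of exposing pprof on a separate, guarded listener rather than the default mux; the port, token source, and middleware shape are assumptions for illustration.

// pprofserver.go - pprof on its own listener, guarded by a bearer token.
package main

import (
	"net/http"
	"net/http/pprof"
	"os"
)

// requireToken rejects requests without the expected Authorization header.
func requireToken(next http.Handler) http.Handler {
	token := os.Getenv("PPROF_TOKEN") // illustrative: inject via env var or secret store
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if token == "" || r.Header.Get("Authorization") != "Bearer "+token {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	// Register pprof explicitly instead of importing it for side effects on DefaultServeMux.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.Handle("/debug/pprof/heap", pprof.Handler("heap"))
	mux.Handle("/debug/pprof/goroutine", pprof.Handler("goroutine"))

	// Separate local port so pprof never rides on the public listener.
	go func() { _ = http.ListenAndServe("127.0.0.1:6060", requireToken(mux)) }()

	select {} // stand-in for the real service
}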

November 12, 2024 · DevCraft Studio · 4440 views

Circuit Breakers with Resilience4j

Core settings
- Sliding window (count- or time-based), failure rate threshold, slow-call threshold, minimum calls.
- Wait duration in open state; permitted calls in half-open; automatic transition.
Patterns
- Wrap HTTP/DB/queue clients; combine with timeouts/retries/bulkheads (see the sketch below).
- Tune per dependency; differentiate fast-fail vs. tolerant paths.
- Provide a fallback only when it is safe/idempotent.
Observability
- Export metrics: state changes, calls (success/failure/slow), not-permitted count.
- Log state transitions; add exemplars linking to traces.
- Alert on frequent open/half-open oscillation.
Checklist
- Per-downstream breaker with tailored thresholds.
- Timeouts and retries composed correctly (timeout → breaker → retry).
- Metrics/logs/traces wired; alerts on open rate.
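A sketch of a per-dependency breaker wiring the core settings from the post; the concrete threshold values, the breaker name, and the downstream call are illustrative assumptions.

// PaymentsBreaker.java - count-based circuit breaker around a downstream call.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

import java.time.Duration;
import java.util.function.Supplier;

public class PaymentsBreaker {
    public static void main(String[] args) {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(50)
                .minimumNumberOfCalls(20)                        // don't judge on a tiny sample
                .failureRateThreshold(50.0f)                     // % of failures that opens the breaker
                .slowCallDurationThreshold(Duration.ofSeconds(2))
                .slowCallRateThreshold(80.0f)                    // % of slow calls that opens it
                .waitDurationInOpenState(Duration.ofSeconds(30))
                .permittedNumberOfCallsInHalfOpenState(5)
                .automaticTransitionFromOpenToHalfOpenEnabled(true)
                .build();

        CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
        CircuitBreaker breaker = registry.circuitBreaker("paymentsApi");

        // Log state transitions; in production, also export them as metrics.
        breaker.getEventPublisher().onStateTransition(event -> System.out.println(event));

        Supplier<String> guarded = CircuitBreaker.decorateSupplier(breaker, PaymentsBreaker::callPayments);
        System.out.println(guarded.get()); // throws CallNotPermittedException while the breaker is open
    }

    private static String callPayments() {
        return "OK"; // stand-in for the real HTTP/DB call (which should carry its own timeout)
    }
}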

October 30, 2024 · DevCraft Studio · 3538 views

Java GC Tuning: G1 and ZGC in Practice

Choosing
- G1: balanced latency/throughput for heaps of roughly 4–64 GB; predictable pauses.
- ZGC: sub-10ms pauses on large heaps; great for latency-sensitive APIs; slightly higher CPU.
Baseline flags
- G1: -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+ParallelRefProcEnabled -XX:+AlwaysPreTouch
- ZGC: -XX:+UseZGC -XX:+ZGenerational -XX:+AlwaysPreTouch
- Set -Xms = -Xmx for a stable footprint; size the heap from prod RSS data (see the launch example below).
Metrics to watch
- Pause p95/p99, GC CPU %, allocation rate, remembered set size (G1), heap occupancy.
- STW reasons: promotion failure, humongous allocations (G1), metaspace growth.
Common fixes
- Reduce humongous allocations: avoid giant byte[]; use chunked buffers.
- Lower pause targets only after measuring; avoid over-constraining MaxGCPauseMillis.
- Cap thread counts (-XX:ParallelGCThreads, -XX:ConcGCThreads) if CPU is saturated.
- For ZGC, make sure huge pages are configured appropriately; watch NUMA pinning.
Checklist
- Heap sized from live data; -Xms = -Xmx.
- GC logs on (JDK 17+): -Xlog:gc*:file=gc.log:utctime,uptime,level,tags:filecount=10,filesize=20M
- Dashboards for pause/CPU/allocation.
- Load test changes before prod; compare pause histograms release to release.
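An example launch command combining the G1 flags above with fixed heap sizing and rotating GC logs; the heap size and the jar name are illustrative.

# Illustrative G1 launch: fixed heap, pause target, pre-touched pages, rotating GC logs
java -Xms8g -Xmx8g \
     -XX:+UseG1GC -XX:MaxGCPauseMillis=200 \
     -XX:+ParallelRefProcEnabled -XX:+AlwaysPreTouch \
     -Xlog:gc*:file=gc.log:utctime,uptime,level,tags:filecount=10,filesize=20M \
     -jar app.jar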

October 5, 2024 · DevCraft Studio · 4801 views