Performance Tuning#

Vulnerability-Lookup is a multi-layered system. Tuning each layer appropriately is important for production deployments handling large volumes of vulnerability data.

This guide covers the main components: Gunicorn (web server), Kvrocks (primary storage), Valkey/Redis (cache), PostgreSQL (user accounts and metadata), and optional PgBouncer (connection pooling).

Production Reference Architecture#

A production deployment serving high traffic typically places several edge layers in front of Vulnerability-Lookup. The diagram below shows the topology used by the CIRCL/GCVE production instances and is a good reference for any deployment that needs to handle many concurrent connections.

flowchart LR
    Client(["Client<br/>HTTP/2 + TLS"])
    HA["HAProxy<br/>TLS termination<br/>L4/L7 load balancing"]
    V["Varnish<br/>HTTP reverse-proxy cache"]
    NGX_D["Nginx<br/>open-data dumps"]
    NGX_S["Nginx<br/>static-asset CDN"]
    VL["Vulnerability-Lookup<br/>Gunicorn + Flask"]
    PGB["PgBouncer<br/>connection pooling"]
    PG[("PostgreSQL")]
    KV[("Kvrocks<br/>primary storage")]
    VAL[("Valkey")]
    DUMPS[("Daily Kvrocks<br/>NDJSON dumps")]
    STATIC[("Flask /static/<br/>CSS · JS · images")]

    Client -->|HTTPS| HA
    HA -->|HTTP/2| V
    HA -->|"HTTP/2<br/>/dumps/"| NGX_D
    HA -->|"HTTP/2<br/>/static"| NGX_S
    V -->|HTTP/1.1| VL
    NGX_D --> DUMPS
    NGX_S --> STATIC
    VL -->|SQL| PGB
    PGB --> PG
    VL -->|RESP| KV
    KV -.->|replication| VAL

Compared to the simpler Nginx-only setup described later in this document, this stack adds four dedicated front-end services — HAProxy, Varnish, and two Nginx instances (one for the open-data dumps under /dumps/, one acting as a CDN for the Flask application’s /static assets) — each solving a specific scaling problem.

HAProxy (TLS termination and load balancing)#

HAProxy sits at the edge of the deployment and is the only component exposed directly to clients. It serves several roles:

TLS termination: HAProxy handles the HTTPS/TLS handshake, decrypts incoming traffic, and forwards plain HTTP to Varnish over a private interface. Centralising TLS in HAProxy keeps certificate management in one place and offloads cryptographic work from Varnish and Gunicorn.
HTTP/2 support: Clients connect with HTTP/2, which multiplexes many requests over a single TLS connection. HAProxy speaks HTTP/2 both to the client and to the immediate backends (Varnish and the dumps Nginx), keeping a small number of multiplexed connections all the way to the cache tier. Varnish then converts to HTTP/1.1 before reaching Gunicorn, which does not need to deal with HTTP/2 framing.
Load balancing: When Vulnerability-Lookup or Varnish runs as multiple instances (horizontal scaling), HAProxy distributes requests across them using a configurable algorithm (roundrobin, leastconn, etc.) and performs active health checks to remove unhealthy backends from rotation.
Connection management: HAProxy maintains long-lived keep-alive connections to the backends and absorbs sudden bursts of short-lived client connections. This is particularly valuable in front of cache and application layers, which have lower concurrency ceilings than a dedicated load balancer.
PROXY protocol: HAProxy can forward the original client IP to the backend using the PROXY protocol — already supported by Gunicorn through the --proxy-protocol flag — so that the application sees the real client IP for logging, rate limiting, and access control.
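
To make the PROXY protocol item more concrete: in version 1 of the protocol, the proxy prepends a single human-readable line to the TCP stream before any HTTP bytes. The sketch below parses such a line. The helper name and the sample addresses are illustrative and not part of Vulnerability-Lookup; Gunicorn does this parsing itself when --proxy-protocol is set.

```python
from typing import NamedTuple


class ProxyHeader(NamedTuple):
    family: str     # "TCP4" or "TCP6"
    src_ip: str     # original client address
    dst_ip: str     # address the client connected to
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyHeader:
    """Parse a PROXY protocol v1 TCP header line (hypothetical helper)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError("not a PROXY v1 TCP header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    if family not in ("TCP4", "TCP6"):
        raise ValueError(f"unsupported family: {family}")
    return ProxyHeader(family, src_ip, dst_ip, int(src_port), int(dst_port))


# Sample line using documentation-only addresses.
header = parse_proxy_v1(b"PROXY TCP4 192.0.2.10 198.51.100.1 56324 443\r\n")
print(header.src_ip)  # the client IP the application should log
```

The backend only sees the real client address if every hop either speaks the PROXY protocol or forwards the address in a header, so it is worth checking each link of the chain separately.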

Example configuration#

The CIRCL/GCVE production HAProxy splits traffic into three backends: the application cache (Varnish, with the application server as a backup), a static-file backend for the open-data dumps, and a CDN backend for the Flask application’s static assets (CSS/JS/images served from /static/).

frontend web
  mode http
  bind [::]:80 accept-proxy
  bind [::]:443 ssl crt /etc/haproxy/certs/ strict-sni accept-proxy

  # clean header
  http-request del-header X-Forwarded-Proto
  http-request del-header X-Forwarded-For

  # redirect to https
  http-request redirect scheme https code 308 unless { ssl_fc }
  http-request add-header X-Forwarded-Proto https if { ssl_fc } # tell the backend we are using https

  default_backend cache
  acl vulnerability hdr(host) -i vulnerability.circl.lu
  acl vulnerability-dump path_beg /dumps/
  acl vulnerability-cdn path_beg /static/

  use_backend vulnerability-cdn if vulnerability-cdn
  use_backend vulnerability-dump if vulnerability-dump
  use_backend cache if vulnerability


backend cache
  option httpchk
  http-check send meth GET uri /api/system/configInfo hdr Host vulnerability.circl.lu
  option forwardfor
  server varnish localhost:6081 check proto h2
  server vulnerability [2001:db8::1]:10001 check backup

backend vulnerability-dump
  option httpchk
  http-check send meth GET uri / hdr Host vulnerability.circl.lu
  http-request set-path "%[path,regsub(^/dumps/,/)]"
  option forwardfor
  server vulnerability [2001:db8::2]:80 check

backend vulnerability-cdn
  option httpchk
  http-check send meth GET uri / hdr Host vulnerability.circl.lu
  option forwardfor
  server vulnerability [2001:db8::3]:80 check

A few details worth noting:

The cache backend marks the application server ([2001:db8::1]:10001) as backup: HAProxy only routes there if Varnish goes unhealthy, so the cache failure mode is degraded performance, not an outage.
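server varnish localhost:6081 check proto h2: HAProxy speaks HTTP/2 to Varnish, matching the protocol shown in the diagram above. The two Nginx backends in this example carry no proto h2, so HAProxy reaches them over HTTP/1.1 unless proto h2 is added to their server lines.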
http-request set-path "%[path,regsub(^/dumps/,/)]" strips the /dumps/ prefix before reaching the Nginx static-file server, so the files can live at the root of that server’s document root.
The health check uses /api/system/configInfo (see website/web/api/v1/system.py).
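
The routing decisions in the frontend can be restated in a few lines of Python. This is only a sketch of the rule order and of the /dumps/ prefix rewrite, not something HAProxy runs; the function name is made up for illustration.

```python
import re

SITE_HOST = "vulnerability.circl.lu"


def pick_backend(host: str, path: str) -> tuple[str, str]:
    """Return (backend name, path as seen by that backend)."""
    # use_backend rules are evaluated in order; the first match wins.
    if path.startswith("/static/"):
        return "vulnerability-cdn", path
    if path.startswith("/dumps/"):
        # Same effect as: http-request set-path "%[path,regsub(^/dumps/,/)]"
        return "vulnerability-dump", re.sub(r"^/dumps/", "/", path)
    if host.lower() == SITE_HOST:       # hdr(host) -i is case-insensitive
        return "cache", path
    return "cache", path                # default_backend cache


print(pick_backend("vulnerability.circl.lu", "/dumps/cve.ndjson"))
```

Note that the two path-based rules do not check the Host header, so in this configuration any host name reaching the frontend is routed to the CDN or dump backend when the path matches.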

Varnish (HTTP reverse-proxy cache)#

Varnish is an HTTP reverse-proxy cache placed between HAProxy and the application. Its purpose is to serve repeated GET requests for the same vulnerability data directly from memory, without invoking Gunicorn, Kvrocks, or PostgreSQL.

Why this matters for Vulnerability-Lookup:

Vulnerability data is largely read-only and highly cacheable. A given record (e.g., GET /vuln/cve-2024-1234) changes infrequently, while many clients query the same records repeatedly. Caching these responses for even a few minutes can collapse tens of thousands of identical requests into a handful of cache hits.
API endpoints benefit too. Aggregations such as recent vulnerabilities, source statistics, or KEV exports can be cached with short TTLs while still remaining sufficiently fresh.
Latency. A Varnish cache hit responds in microseconds — much faster than serialising JSON from Kvrocks through Flask.
Burst protection. Varnish absorbs traffic spikes (intentional or accidental) and prevents thundering-herd effects from reaching the Gunicorn worker pool.

When configuring Varnish, define cacheability via Cache-Control response headers in the Flask application (preferred) or via VCL rules. Good candidates for caching are stable read endpoints such as /vuln/<id>, /api/v1/vulnerability/<id>, and static assets. Endpoints that modify state (POST/PUT/DELETE), authenticated views, and personalised dashboards must bypass the cache.
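
One way to express these rules on the application side is a small policy function whose result is written to the Cache-Control response header, for example from a Flask after_request hook. The sketch below is an assumption about how such a policy could look, not code from Vulnerability-Lookup; the prefix list, the 900-second default, and the function name are illustrative.

```python
# Path prefixes treated as cacheable in this sketch (taken from the text above).
CACHEABLE_PREFIXES = ("/vuln/", "/api/v1/vulnerability/", "/static/")


def cache_control_for(method: str, path: str, authenticated: bool,
                      max_age: int = 900) -> str:
    """Pick a Cache-Control value for a response (illustrative policy)."""
    if method not in ("GET", "HEAD"):
        return "no-store"               # state-changing requests bypass the cache
    if authenticated:
        return "private, no-store"      # never share personalised responses
    if path.startswith(CACHEABLE_PREFIXES):
        return f"public, max-age={max_age}"
    return "no-cache"                   # anything else must be revalidated


print(cache_control_for("GET", "/vuln/cve-2024-1234", authenticated=False))
```

Defaulting unknown endpoints to no-cache is the conservative choice: a new view stays uncached until someone decides it is safe to share.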

Example configuration#

A minimal VCL that mirrors the CIRCL production setup. It passes authenticated API requests (those carrying x-api-key) to the backend without caching them, strips all cookies except vulnerability-lookup so that anonymous responses share a single cache key, gzip-compresses textual payloads, and applies a 15-minute TTL both in Varnish (beresp.ttl) and in the client (max-age=900), with a 2-hour grace window during which stale objects may still be served while they are refreshed.

vcl 4.1;

backend default {
    .host = "2001:db8::1";
    .port = "10001";
    .first_byte_timeout = 30s;
    .between_bytes_timeout = 20s;
}

sub vcl_recv {
    # don't cache authenticated api requests
    if (req.http.x-api-key) {
        return (pass);
    }
    # remove cookie to force cache for special bootstrap content
    if (req.url ~ "^/bootstrap($|/.*)") {
        unset req.http.cookie;
    }
    # remove unwanted cookie to cache more requests
    if (req.http.Cookie) {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(vulnerability-lookup)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

        if (req.http.Cookie == "") {
            unset req.http.Cookie;
        }
    }
}

sub vcl_backend_response {
    set beresp.ttl = 15m;
    set beresp.grace = 2h;
    # replace the backend's Cache-Control so browsers cache for 15 minutes (900 s)
    unset beresp.http.Cache-Control;
    set beresp.http.cache-control = "max-age=900";
    # compress text content
    if (beresp.http.content-type ~ "(application/json|text/html|image/svg|text/css|text/javascript)") {
        set beresp.do_gzip = true;
    }
    # add Last-Modified when the backend omits it, so clients can revalidate with If-Modified-Since
    if (!beresp.http.last-modified) {
        set beresp.http.last-modified = now;
    }
}
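
The chain of regsuball() calls in vcl_recv is compact but hard to read. The Python sketch below applies the same substitutions step by step so the effect can be tested outside Varnish. It is an illustration of the technique, not part of the deployment, and the function name is made up.

```python
import re
from typing import Optional


def keep_session_cookie(cookie_header: str,
                        keep: str = "vulnerability-lookup") -> Optional[str]:
    """Mirror the regsuball() chain: keep one cookie, drop all others."""
    c = ";" + cookie_header
    c = re.sub(r"; +", ";", c)                              # normalise separators
    c = re.sub(";(" + re.escape(keep) + ")=", r"; \1=", c)  # mark the cookie to keep
    c = re.sub(r";[^ ][^;]*", "", c)                        # delete unmarked cookies
    c = re.sub(r"^[; ]+|[; ]+$", "", c)                     # trim leftover separators
    return c or None                                        # None stands for "header unset"


print(keep_session_cookie("a=1; vulnerability-lookup=abc; b=2"))
print(keep_session_cookie("_ga=GA1.2; theme=dark"))
```

The trick is the space inserted in front of the cookie to keep: the deletion pattern only matches a semicolon followed by a non-space character, so the marked cookie survives.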

Nginx (static-file backends)#

Two dedicated Nginx instances handle the static-file traffic that should never touch Varnish or Gunicorn. Both are routed by HAProxy on path prefixes (/dumps/ and /static).

Open-data dumps (/dumps/) — serves the daily NDJSON exports of the Kvrocks database produced by bin/dump.py (one file per feed plus comments, bundles, sightings, and KEV entries — see Command-Line Interface). These exports are published as open data for archival and research; they are static files written once a day and therefore do not need to go through the application stack:

Serving the files directly from disk with sendfile is much cheaper than proxying through the application stack.
Decoupling bulk downloads from the application protects the Gunicorn worker pool from being tied up by long-running file transfers.
A directory listing (autoindex on) lets researchers discover available feeds without needing an additional index endpoint.

The location /dumps/ block to add to the Nginx server is shown in Reverse Proxy (Nginx) below.

Static-asset CDN (/static) — serves the CSS, JavaScript, and image assets shipped with the Flask application. In the simpler single-Nginx setup described later, these are served by the same Nginx that proxies the application. In the CIRCL production topology the /static prefix is split off to a separate Nginx instance so that asset traffic — which is by far the most frequent request type for a typical browser session — never competes with dynamic requests for HAProxy or Varnish backend connections.

Storage and database backends#

The right-hand side of the diagram shows the data layer:

PgBouncer → PostgreSQL — relational data (users, comments, bundles, sightings, watchlists). See PgBouncer and PostgreSQL below.
Kvrocks → Valkey — Kvrocks is the primary vulnerability store; in this setup it replicates to a Valkey instance over the Redis replication protocol, providing a hot standby and additional read capacity. See Kvrocks and Valkey/Redis.

The RESP label on the Vulnerability-Lookup → Kvrocks edge refers to the REdis Serialization Protocol, the TCP wire format that Kvrocks shares with Redis and Valkey. The application talks to Kvrocks through a redis-py client on the standard Kvrocks TCP port (default 10002).
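
For readers unfamiliar with RESP, a command is sent as an array of bulk strings, each prefixed with its length. The sketch below encodes a command by hand to show the framing; in practice redis-py does this for the application, and the key name used here is a placeholder.

```python
def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]                 # array header: number of elements
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length-prefixed bulk string
    return b"".join(out)


print(encode_resp_command("PING"))
print(encode_resp_command("GET", "some-key"))      # "some-key" is a placeholder
```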

The local cache layer (Valkey/Redis on a Unix socket) used by the application itself is omitted from the diagram for clarity — it is described in the Valkey/Redis section.

Gunicorn (Web Server)#

Gunicorn serves the Flask application. Its settings are controlled via config/generic.json and applied in bin/start_website.py.

Workers#

The website_workers setting in config/generic.json controls the number of Gunicorn worker processes. A common starting point:

workers = (2 × CPU_CORES) + 1

For I/O-bound workloads (typical for Vulnerability-Lookup), you can go higher since the gevent async worker class is used. Monitor memory usage — each worker is a separate process.
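
As a quick way to apply the formula on a given host, the sketch below reads the core count and prints a starting value for website_workers. The function name is illustrative, and the result is a starting point to be validated against memory usage, not a recommendation.

```python
import os


def suggested_workers(cpu_cores: int) -> int:
    """Starting point from the formula above: (2 x cores) + 1."""
    return 2 * cpu_cores + 1


cores = os.cpu_count() or 1   # os.cpu_count() may return None
print(f"{cores} cores -> website_workers = {suggested_workers(cores)}")
```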

Timeouts#

The default request timeout is 300 seconds (5 minutes). This accommodates large API responses (e.g., full vulnerability exports). The graceful shutdown timeout is 2 seconds.

Key Gunicorn flags used:

--worker-class gevent       # async worker for concurrent I/O
--timeout 300               # max request processing time
--graceful-timeout 2        # grace period on shutdown
--reuse-port                # SO_REUSEPORT for load balancing across workers
--proxy-protocol            # preserve client IPs behind a reverse proxy
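
The project passes these flags from bin/start_website.py. For comparison, the same settings could be written as a Gunicorn configuration file, which is plain Python. The file below is a hypothetical equivalent, not something shipped with Vulnerability-Lookup; the setting names are the Gunicorn counterparts of the flags above.

```python
# gunicorn.conf.py (hypothetical file, shown only to map flags to settings)
worker_class = "gevent"    # --worker-class gevent
timeout = 300              # --timeout 300
graceful_timeout = 2       # --graceful-timeout 2
reuse_port = True          # --reuse-port
proxy_protocol = True      # --proxy-protocol
```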

Kvrocks (Primary Storage)#

Kvrocks is the primary data store for all vulnerability records. Its configuration is in storage/kvrocks.conf.

Worker Threads#

workers 8

workers sets the number of I/O threads that process client commands. Increase it on machines with more CPU cores and high concurrency; a reasonable value is the number of CPU cores available to the Kvrocks process.

Connection Limits#

maxclients 10000
tcp-backlog 511

maxclients sets the maximum number of concurrent client connections. Ensure the OS file descriptor limit (ulimit -n) is higher than this value.

For tcp-backlog, also verify the kernel setting:

sysctl net.core.somaxconn        # should be >= tcp-backlog
sysctl net.ipv4.tcp_max_syn_backlog
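
The two checks above can be scripted. The sketch below compares the Kvrocks settings with the OS limits and returns human-readable warnings. Function names are illustrative; read_live_limits() works on Linux only and reports the limits of the Python process that runs it, which may differ from those of the Kvrocks service.

```python
def limit_warnings(maxclients: int, tcp_backlog: int,
                   nofile_soft: int, somaxconn: int) -> list[str]:
    """Compare Kvrocks connection settings with OS limits."""
    warnings = []
    if nofile_soft <= maxclients:
        warnings.append(
            f"nofile ({nofile_soft}) should be higher than maxclients ({maxclients})")
    if somaxconn < tcp_backlog:
        warnings.append(
            f"somaxconn ({somaxconn}) is below tcp-backlog ({tcp_backlog})")
    return warnings


def read_live_limits() -> tuple[int, int]:
    """Return (soft nofile limit, somaxconn) for the current process on Linux."""
    import resource  # Unix only
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    with open("/proc/sys/net/core/somaxconn") as fh:
        return soft, int(fh.read())


# Values from the configuration above, checked against example OS limits.
print(limit_warnings(10000, 511, nofile_soft=1024, somaxconn=128))
```

On a live host, `limit_warnings(10000, 511, *read_live_limits())` performs the same comparison with real values.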

RocksDB Tuning#

Kvrocks uses RocksDB as its storage engine. Key parameters:

rocksdb.block_cache_size 4096           # block cache in MB (default 4 GB)
rocksdb.max_open_files 8096             # file descriptors for SST files
rocksdb.write_buffer_size 64            # memtable size in MB
rocksdb.max_write_buffer_number 4       # concurrent memtables
rocksdb.max_background_jobs 4           # compaction and flush threads
rocksdb.target_file_size_base 128       # SST file target size in MB
rocksdb.max_total_wal_size 512          # WAL size limit in MB

Recommendations for large instances:

block_cache_size: Increase to fit the hot dataset in memory. On a dedicated server with plenty of RAM, allocate 25-50% of available memory.
max_open_files: Set to -1 to keep all files open (avoids open/close overhead).
write_buffer_size: Increase for write-heavy workloads (e.g., initial bulk import). Larger buffers reduce write amplification.
max_background_jobs: Increase on multi-core systems to speed up compaction.
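
To turn the 25-50% guideline into a number for rocksdb.block_cache_size (which is expressed in MB), a small calculation is enough. The sketch below assumes a server dedicated to Kvrocks; the function name and the RAM sizes in the loop are examples only.

```python
def block_cache_mb(total_ram_gb: float, fraction: float = 0.25) -> int:
    """Block cache size in MB for a given share of RAM."""
    if not 0.25 <= fraction <= 0.5:
        raise ValueError("fraction is outside the 25-50% range discussed above")
    return int(total_ram_gb * 1024 * fraction)


for ram_gb in (32, 64, 128):
    low, high = block_cache_mb(ram_gb, 0.25), block_cache_mb(ram_gb, 0.5)
    print(f"{ram_gb} GB RAM: rocksdb.block_cache_size between {low} and {high}")
```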

Disk I/O Throttling#

max-io-mb 0                 # 0 = no limit on flush/compaction write rate
max-replication-mb 0        # 0 = no limit on replication rate

Set max-io-mb to a non-zero value if compaction I/O impacts foreground latency.

Valkey/Redis (Cache Layer)#

The cache layer uses a Unix domain socket for lowest-latency local communication. Its configuration is in cache/cache.conf.

# TCP disabled, Unix socket only (keep comments on their own line in this file)
port 0
unixsocket cache.sock
unixsocketperm 700
tcp-keepalive 300

Since the cache runs locally on a Unix socket (no TCP overhead), it is already optimized for latency. The main tuning concern is memory:

Monitor memory usage with redis-cli -s cache/cache.sock INFO memory.
Set maxmemory and an eviction policy (e.g., allkeys-lru) if the cache should not grow unbounded.
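
The output of INFO memory is a list of key:value lines, which makes it easy to check from a script whether a memory limit is configured. The sketch below parses such output; the sample text is made up for the example, and the function names are illustrative.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the key:value lines printed by INFO; skip '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            key, value = line.split(":", 1)
            fields[key] = value
    return fields


def cache_is_unbounded(info: dict[str, str]) -> bool:
    """A maxmemory of 0 means that no memory limit is configured."""
    return int(info.get("maxmemory", "0")) == 0


# Made-up sample in the shape of `redis-cli -s cache/cache.sock INFO memory`.
sample = "# Memory\r\nused_memory:1048576\r\nmaxmemory:0\r\nmaxmemory_policy:noeviction\r\n"
info = parse_info(sample)
print(info["used_memory"], cache_is_unbounded(info))
```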

PostgreSQL#

PostgreSQL stores user accounts, comments, bundles, sightings, and watchlists. Tune based on your expected user base and concurrent connections.

# Connection / memory
max_connections = 200                     # keep moderate when using PgBouncer
shared_buffers = 64GB                     # ~25% of RAM
work_mem = 64MB                           # per sort/hash operation, not per query; lower it if many run concurrently
maintenance_work_mem = 4GB

# WAL / checkpoints
wal_buffers = 16MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB

# Autovacuum
autovacuum = on
autovacuum_max_workers = 10
autovacuum_naptime = 10s
autovacuum_vacuum_cost_limit = 2000

# Logging
log_connections = on
log_disconnections = on
log_min_duration_statement = 2000         # log slow queries > 2s

Note

The shared_buffers value should be adjusted based on your actual server RAM. A common guideline is ~25% of available memory.
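
A short calculation helps to sanity-check these memory settings against the RAM of the target server. The sketch below applies the ~25% guideline and estimates the memory used if every connection runs a single sort or hash at full work_mem. The second figure is an illustration rather than a ceiling, since one query can use work_mem several times; both function names are made up.

```python
def shared_buffers_gb(total_ram_gb: float, fraction: float = 0.25) -> float:
    """shared_buffers following the ~25% of RAM guideline."""
    return total_ram_gb * fraction


def work_mem_one_op_each_gb(work_mem_mb: int, max_connections: int) -> float:
    """Memory used if every connection runs one sort/hash at full work_mem."""
    return work_mem_mb * max_connections / 1024


print(shared_buffers_gb(256))              # the 64GB above implies ~256 GB of RAM
print(work_mem_one_op_each_gb(64, 200))    # work_mem and max_connections from above
```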

PgBouncer (Optional)#

PgBouncer is a connection pooler that sits between the application and PostgreSQL. It reduces the overhead of establishing new connections and manages a pool of persistent connections. This is recommended when running many Gunicorn workers (each worker may hold its own database connections).

[databases]
vulnlookup = host=127.0.0.1 port=5432 dbname=vulnlookup

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
admin_users = vuln-lookup

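# PgBouncer only treats ";" or "#" at the start of a line as a comment,
# so the notes below sit on their own lines.
# Transaction pooling suits short web requests.
pool_mode = transaction
# Server connections allowed per user/database pair.
default_pool_size = 200
min_pool_size = 50
reserve_pool_size = 50
# Seconds a client waits before the reserve pool is used.
reserve_pool_timeout = 5.0
max_client_conn = 5000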

# Logging / stats
logfile = /var/log/pgbouncer/pgbouncer.log
log_connections = 1
log_disconnections = 1

Note

When using PgBouncer, point DB_CONFIG_DICT in config/website.py to the PgBouncer port (6432) instead of PostgreSQL directly (5432).

Note

Here default_pool_size (200) equals PostgreSQL’s max_connections (200), and reserve_pool_size (50) can push the total to 250 server connections. Leave room for the reserve pool and for direct admin access by either increasing max_connections or reducing default_pool_size.
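
The budget can be checked with simple arithmetic. The sketch below assumes a single user/database pair (one pool) and an arbitrary reserve of 5 connections for administrators; both are assumptions for illustration, as are the function names.

```python
def pgbouncer_server_connections(default_pool_size: int, reserve_pool_size: int,
                                 pools: int = 1) -> int:
    """Upper bound on server connections PgBouncer may open.

    A pool is one user/database pair; a single pool is assumed by default.
    """
    return (default_pool_size + reserve_pool_size) * pools


def headroom(max_connections: int, default_pool_size: int,
             reserve_pool_size: int, admin_reserved: int = 5) -> int:
    """Connections left over; a negative result means oversubscription."""
    used = pgbouncer_server_connections(default_pool_size, reserve_pool_size)
    return max_connections - admin_reserved - used


print(headroom(200, 200, 50))   # values from the configurations above
print(headroom(200, 150, 30))   # one possible adjustment
```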

SQLAlchemy Connection Pool#

The SQLAlchemy engine options in config/website.py control the application-level connection pool to PostgreSQL:

Without PgBouncer (default):

SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_size": 100,               # persistent connections in the pool
    "max_overflow": 50,             # extra connections during traffic spikes
    "pool_timeout": 30,             # seconds to wait for a connection
    "pool_recycle": 1800,           # recycle connections every 30 minutes
    "pool_pre_ping": True,          # verify connection liveness before use
}

Warning

Each Gunicorn worker process creates its own connection pool, and each pool can open up to pool_size + max_overflow connections. The total number of possible connections is therefore (pool_size + max_overflow) × website_workers. For example, with pool_size=100, max_overflow=50, and 49 workers, the application could open up to 7,350 connections, far exceeding PostgreSQL’s max_connections. Without PgBouncer, reduce pool_size and max_overflow so that (pool_size + max_overflow) × website_workers stays well below max_connections.
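
The sizing can be worked out before deploying. SQLAlchemy's QueuePool may open up to pool_size + max_overflow connections per process, so the sketch below counts both. The function names and the reserve of 10 connections are assumptions made for the example.

```python
def max_app_connections(pool_size: int, max_overflow: int, workers: int) -> int:
    """Worst-case connections opened by all Gunicorn workers together."""
    return (pool_size + max_overflow) * workers


def pool_size_for(max_connections: int, workers: int,
                  max_overflow: int = 0, reserved: int = 10) -> int:
    """Largest per-worker pool_size that fits under max_connections.

    The result is clamped to 1 because pool_size 0 means "no limit".
    """
    per_worker = (max_connections - reserved) // workers
    return max(per_worker - max_overflow, 1)


print(max_app_connections(100, 50, 49))   # settings shown above
print(pool_size_for(200, 49))             # fits 49 workers under max_connections = 200
```

When the clamp to 1 applies, the budget no longer fits, and the number of workers or max_connections has to change instead.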

With PgBouncer (let PgBouncer manage the pool):

SQLALCHEMY_ENGINE_OPTIONS = {
    # With QueuePool, pool_size 0 means "no limit" (it does not disable pooling);
    # PgBouncer is what caps the server connections here.
    "pool_size": 0,
    "max_overflow": 0,
    "pool_pre_ping": True,
    "pool_timeout": 30,
    "pool_recycle": 3600,
}

Logging#

Reducing log verbosity in production decreases disk I/O and improves throughput:

config/generic.json: set loglevel to WARNING or ERROR
config/website.py: set LOG_LEVEL to WARNING
storage/kvrocks.conf: log-level warning (default)
Per-feeder log levels in config/modules.cfg: change level = DEBUG to level = WARNING for production

The logging configuration in config/logging.json uses rotating file handlers (1 MB per file, 5 backups) which bounds disk usage.
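
The disk bound follows directly from the handler parameters: one active file plus the rotated backups. The sketch below builds an equivalent handler with Python's standard library and computes that bound. The file path is a throwaway example, and 1,000,000 bytes is an assumed reading of "1 MB"; the actual values live in config/logging.json.

```python
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Throwaway path for the example; delay=True avoids creating the file.
log_path = os.path.join(tempfile.gettempdir(), "vulnerability-lookup-example.log")

handler = RotatingFileHandler(
    log_path,
    maxBytes=1_000_000,   # rotate once the active file reaches ~1 MB
    backupCount=5,        # keep 5 rotated files
    delay=True,
)

# One active file plus backupCount rotated files.
max_disk_bytes = handler.maxBytes * (handler.backupCount + 1)
print(f"at most ~{max_disk_bytes / 1_000_000:.0f} MB of logs per handler")
```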

Operating System Tuning#

File Descriptors#

Kvrocks, Valkey, and Gunicorn all benefit from high file descriptor limits:

# /etc/security/limits.conf  (or systemd unit override)
* soft nofile 65536
* hard nofile 65536

Network Stack#

For high-concurrency deployments:

sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
sysctl -w net.core.netdev_max_backlog=4096
sysctl -w vm.overcommit_memory=1              # recommended for Redis/Kvrocks

Transparent Huge Pages (THP)#

Disable THP for Kvrocks and Valkey to avoid latency spikes:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
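
To verify the setting from a monitoring script, read the same file back: the kernel lists all modes and puts the active one in square brackets. The sketch below extracts it; the function names are illustrative and read_thp_mode() only works on Linux.

```python
import re


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode, e.g. 'never' from 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unexpected THP status: {text!r}")
    return match.group(1)


def read_thp_mode(path: str = "/sys/kernel/mm/transparent_hugepage/enabled") -> str:
    """Read the active THP mode from sysfs (Linux only)."""
    with open(path) as fh:
        return active_thp_mode(fh.read())


print(active_thp_mode("always madvise [never]\n"))
```

The echo command above only lasts until the next reboot, so a check like this is useful after restarts.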

Reverse Proxy (Nginx)#

When running behind Nginx, a minimal performance-oriented configuration:

upstream vulnlookup {
    server 127.0.0.1:10001;
    keepalive 64;
}

server {
    listen 443 ssl http2;
    server_name vulnerability.example.org;

    client_max_body_size 10m;

    location / {
        proxy_pass http://vulnlookup;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_connect_timeout 10;
        proxy_read_timeout 300;           # match Gunicorn timeout
        proxy_send_timeout 300;
    }

    # Open-data NDJSON dumps produced daily by ``bin/dump.py``.
    # Served as static files; never proxied to Gunicorn.
    location /dumps/ {
        alias /path/to/vulnerability-lookup/dumps/;   # $VULNERABILITY_LOOKUP_HOME/dumps/

        autoindex on;                     # public directory listing
        autoindex_exact_size off;
        autoindex_localtime on;

        # Serve .ndjson with a useful content type
        types {
            application/x-ndjson  ndjson;
        }
        default_type application/octet-stream;

        sendfile on;                      # zero-copy for large dumps
        tcp_nopush on;
        gzip_static on;                   # serve pre-compressed .ndjson.gz if present

        # Dumps are regenerated daily; allow downstream caches to keep them briefly.
        expires 1h;
        add_header Cache-Control "public";

        # Only GET/HEAD make sense for a static dump tree.
        limit_except GET HEAD { deny all; }
    }
}

Note

Gunicorn is started with --proxy-protocol. If your reverse proxy supports the PROXY protocol, enable it. Otherwise, rely on X-Forwarded-For headers and consider removing --proxy-protocol from bin/start_website.py.

Note

The alias path under location /dumps/ must point at the dumps/ directory under the Vulnerability-Lookup home ($VULNERABILITY_LOOKUP_HOME/dumps/), which is where bin/dump.py writes the NDJSON exports. The Nginx process must have read access to this directory.

Summary#

Layer	Key Settings	Tune When
Gunicorn	`website_workers`, `--timeout`, `--worker-class gevent`	High concurrent users, slow responses
Kvrocks	`workers`, `maxclients`, `rocksdb.block_cache_size`	Large dataset, slow queries, high write load
Valkey/Redis	`maxmemory`, eviction policy	Cache memory growing unbounded
PostgreSQL	`max_connections`, `shared_buffers`, `work_mem`	Slow user/account queries, high concurrency
PgBouncer	`pool_mode`, `default_pool_size`, `max_client_conn`	Many workers exhausting PostgreSQL connections
SQLAlchemy	`pool_size`, `max_overflow`, `pool_pre_ping`	Connection errors, stale connections
OS	`nofile`, `somaxconn`, THP	Connection refused errors, latency spikes
