Key-IP Sentinel

Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.

Features

  • First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
  • Streaming reverse proxy built on httpx.AsyncClient and FastAPI StreamingResponse.
  • Trusted proxy IP extraction that only accepts X-Real-IP from configured upstream networks.
  • Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
  • Admin API protected by JWT and Redis-backed login lockout.
  • Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
  • Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.
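
As an illustration of the token hashing named in the first feature, here is a minimal sketch of HMAC-SHA256 hashing using only the Python standard library. The helper name hash_token and the hex digest output are assumptions made for this example, not necessarily what the code under app/ does. The secret would come from SENTINEL_HMAC_SECRET.

```python
import hashlib
import hmac


def hash_token(token: str, secret: str) -> str:
    """Return the hex HMAC-SHA256 digest of an API key.

    Hypothetical helper: the name and the hex output format are
    assumptions for illustration only.
    """
    return hmac.new(
        secret.encode("utf-8"),   # key, e.g. the value of SENTINEL_HMAC_SECRET
        token.encode("utf-8"),    # message: the raw bearer token
        hashlib.sha256,
    ).hexdigest()


def digests_match(a: str, b: str) -> bool:
    """Compare two digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(a, b)
```

Because the digest is keyed, the stored value cannot be recomputed from a leaked API key without also knowing the secret, and the raw key never needs to be written to Redis or PostgreSQL.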

Repository Layout

sentinel/
├── app/
├── db/
├── nginx/
├── frontend/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
└── README.md

Runtime Notes

  • Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
  • PostgreSQL stores authoritative token bindings and intercept logs.
  • Archive retention removes inactive bindings from the active table after ARCHIVE_DAYS. A later request from the same token will bind again on first use.
  • SENTINEL_FAILSAFE_MODE=closed rejects requests when both Redis and PostgreSQL are unavailable. SENTINEL_FAILSAFE_MODE=open allows traffic through in that situation.
  • Binding rules support single (single IP or single CIDR), multiple (multiple discrete IPs), and all (allow all source IPs).
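
The three binding rule types above can be sketched with the standard ipaddress module. This is a minimal illustration only: the function name ip_allowed and the (rule_type, allowed) data shape are assumptions, and the real service matches CIDRs in PostgreSQL rather than in Python.

```python
import ipaddress


def ip_allowed(rule_type: str, allowed: list[str], client_ip: str) -> bool:
    """Check a client IP against a binding rule (hypothetical data shape)."""
    if rule_type == "all":
        # Rule "all": every source IP is accepted.
        return True

    ip = ipaddress.ip_address(client_ip)

    if rule_type == "single":
        # One entry, either a plain IP or a CIDR block. A plain IP
        # parses as a /32 (IPv4) or /128 (IPv6) network.
        if len(allowed) != 1:
            return False
        return ip in ipaddress.ip_network(allowed[0], strict=False)

    if rule_type == "multiple":
        # Several discrete IPs; the client must equal one of them.
        return any(ip == ipaddress.ip_address(entry) for entry in allowed)

    raise ValueError(f"unknown rule type: {rule_type!r}")
```

For example, ip_allowed("single", ["10.0.0.0/24"], "10.0.0.7") is true, while the same rule rejects "10.0.1.7".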

Sentinel and New API Relationship

Sentinel and New API are expected to run as two separate Docker Compose projects:

  • The Sentinel compose contains nginx, sentinel-app, redis, and postgres.
  • The New API compose contains your existing New API service and its own dependencies.
  • The two stacks communicate through a shared external Docker network.

Traffic flow:

Client / SDK
    |
    |  request to Sentinel public endpoint
    v
Sentinel nginx  ->  sentinel-app  ->  New API service  ->  model backend
                         |
                         +-> redis / postgres

Clients must call Sentinel rather than New API directly. Requests that bypass Sentinel are never checked, so IP binding does not take effect for them.

Use one external network name for both compose projects. This repository currently uses:

shared_network

In the Sentinel compose:

  • sentinel-app joins shared_network
  • nginx exposes the public entrypoint
  • DOWNSTREAM_URL points to the New API service name on that shared network

In the New API compose:

  • The New API container must also join shared_network
  • The New API service name must match what Sentinel uses in DOWNSTREAM_URL

Example:

  • New API compose service name: new-api
  • New API internal container port: 3000
  • Sentinel .env: DOWNSTREAM_URL=http://new-api:3000

If your New API service is named differently, change DOWNSTREAM_URL accordingly, for example:

DOWNSTREAM_URL=http://my-newapi:3000

Local Development

Backend

  1. Install uv and ensure Python 3.13 is available.
  2. Create the environment and sync dependencies:
uv sync
  3. Copy .env.example to .env and update the secrets and service addresses.
  4. Start PostgreSQL and Redis.
  5. Run the API:
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000

Frontend

  1. Install dependencies:
cd frontend
npm install
  2. Start the Vite dev server:
npm run dev

The Vite config proxies /admin/api/* to http://127.0.0.1:7000.

If you prefer the repository root entrypoint, uv run main.py starts the same FastAPI app on APP_PORT (default 7000).

Dependency Management

  • Local Python development uses uv via pyproject.toml.
  • Container builds still use requirements.txt because the Dockerfile is intentionally minimal and matches the delivery requirements.

Production Deployment

1. Create the shared Docker network

Create the external network once on the Docker host:

docker network create shared_network

Both compose projects must reference this exact same external network name.

2. Make sure New API joins the shared network

In the New API project, add the external network to the New API service.

Minimal example:

services:
  new-api:
    image: your-new-api-image
    networks:
      - default
      - shared_network

networks:
  shared_network:
    external: true

Important:

  • new-api here is the service name that Sentinel will resolve on the shared network.
  • The port in DOWNSTREAM_URL must be the container internal port, not the host published port.
  • If New API already listens on 3000 inside the container, use http://new-api:3000.

3. Prepare Sentinel environment

  1. Copy .env.example to .env.
  2. Replace SENTINEL_HMAC_SECRET, ADMIN_PASSWORD, and ADMIN_JWT_SECRET.
  3. Verify DOWNSTREAM_URL points to the New API service name on shared_network.
  4. Keep PG_DSN aligned with the fixed PostgreSQL container password in docker-compose.yml, or update both together.

Example .env for Sentinel:

DOWNSTREAM_URL=http://new-api:3000
REDIS_ADDR=redis://redis:6379
REDIS_PASSWORD=
PG_DSN=postgresql+asyncpg://sentinel:password@postgres:5432/sentinel
SENTINEL_HMAC_SECRET=replace-with-a-random-32-byte-secret
ADMIN_PASSWORD=replace-with-a-strong-password
ADMIN_JWT_SECRET=replace-with-a-random-jwt-secret
TRUSTED_PROXY_IPS=172.24.0.0/16
SENTINEL_FAILSAFE_MODE=closed
APP_PORT=7000
ALERT_WEBHOOK_URL=
ALERT_THRESHOLD_COUNT=5
ALERT_THRESHOLD_SECONDS=300
ARCHIVE_DAYS=90

Notes:

  • TRUSTED_PROXY_IPS should match the Docker subnet used by the Sentinel internal network.
  • If Docker recreates the compose network with a different subnet, update this value.
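
To show why TRUSTED_PROXY_IPS has to match the Nginx container's subnet, here is a minimal sketch of trusted-proxy IP extraction. The function names are hypothetical, and accepting a comma-separated list of CIDRs is an assumption (the example .env shows a single CIDR). The real logic lives in app/core/ip_utils.py and may differ.

```python
import ipaddress
from typing import Optional


def parse_trusted_networks(raw: str) -> list:
    """Parse a TRUSTED_PROXY_IPS-style value into network objects.

    Assumption: multiple CIDRs, if used, are comma-separated.
    """
    return [
        ipaddress.ip_network(part.strip(), strict=False)
        for part in raw.split(",")
        if part.strip()
    ]


def resolve_client_ip(peer_ip: str, x_real_ip: Optional[str], trusted: list) -> str:
    """Return the IP to bind against.

    X-Real-IP is honoured only when the direct socket peer is inside a
    trusted network; otherwise the header is ignored.
    """
    peer = ipaddress.ip_address(peer_ip)
    if x_real_ip and any(peer in net for net in trusted):
        try:
            return str(ipaddress.ip_address(x_real_ip.strip()))
        except ValueError:
            pass  # malformed header value: fall back to the socket peer
    return str(peer)
```

If the compose network is recreated on a different subnet, the Nginx peer address no longer falls inside the configured range. In this sketch every request would then resolve to the Nginx container address instead of the real client.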

4. Build the Sentinel frontend bundle

cd frontend
npm install
npm run build
cd ..

This produces frontend/dist, which Nginx serves at /admin/ui/.

5. Confirm Sentinel compose prerequisites

  • Build the frontend first. If frontend/dist is missing, /admin/ui/ cannot be served by Nginx.
  • Ensure the external Docker network shared_network already exists before starting Sentinel.

6. Start the Sentinel stack

docker compose up --build -d

Services (with the example 8016:80 port mapping in docker-compose.yml):

  • http://<host>:8016/ forwards model API traffic through Sentinel.
  • http://<host>:8016/admin/ui/ serves the admin console.
  • http://<host>:8016/admin/api/* serves the admin API.
  • http://<host>:8016/health exposes the app health check.

7. Verify cross-compose connectivity

After both compose stacks are running:

  1. Open http://<host>:8016/health and confirm it returns {"status":"ok"}.
  2. Open http://<host>:8016/admin/ui/ and log in with ADMIN_PASSWORD.
  3. Send a real model API request to Sentinel, not to New API directly.
  4. Check the Bindings page and confirm the token appears with a recorded binding rule.

Example test request:

curl http://<host>:8016/v1/models \
  -H "Authorization: Bearer <your_api_key>"

If your client still points directly to New API, Sentinel will not see the request and no binding will be created.

Which Port Should Clients Use?

With the current example compose in this repository:

  • Sentinel public port: 8016
  • New API internal container port: usually 3000

That means:

  • For testing, clients should call http://<host>:8016/...
  • Sentinel forwards internally to http://new-api:3000

Do not point clients at host port 3000 if that bypasses Sentinel.

How To Go Live Without Changing Client Config

If you want existing clients to stay unchanged, Sentinel must take over the original external entrypoint that clients already use.

Typical cutover strategy:

  1. Keep New API on the shared internal Docker network.
  2. Stop exposing New API directly to users.
  3. Expose Sentinel on the old public host/port instead.
  4. Keep DOWNSTREAM_URL pointing to the internal New API service on shared_network.

For example, if users currently call http://<host>:3000, then in production you should expose Sentinel on that public port and make New API internal-only.

The current 8016:80 mapping in docker-compose.yml is a local test mapping, not the only valid production setup.

Admin API Summary

  • POST /admin/api/login
  • GET /admin/api/dashboard
  • GET /admin/api/bindings
  • POST /admin/api/bindings/unbind
  • PUT /admin/api/bindings/ip
  • POST /admin/api/bindings/ban
  • POST /admin/api/bindings/unban
  • GET /admin/api/logs
  • GET /admin/api/logs/export
  • GET /admin/api/settings
  • PUT /admin/api/settings

All admin endpoints except /admin/api/login require Authorization: Bearer <jwt>.

Key Implementation Details

  • app/proxy/handler.py keeps the downstream response fully streamed, including SSE responses.
  • app/core/ip_utils.py never trusts client-supplied X-Forwarded-For.
  • app/services/binding_service.py batches last_used_at updates every 5 seconds through an asyncio.Queue.
  • app/services/alert_service.py pushes webhooks once the Redis counter reaches the configured threshold.
  • app/services/archive_service.py prunes stale bindings on a scheduler interval.
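
The batching of last_used_at updates mentioned above can be sketched as follows. This is a minimal illustration, not the code in app/services/binding_service.py: flush_loop and write_batch are hypothetical names, and write_batch stands in for whatever single database statement the service actually issues.

```python
import asyncio


async def flush_loop(queue: asyncio.Queue, write_batch, interval: float = 5.0) -> None:
    """Drain queued (token_hash, used_at) pairs every `interval` seconds.

    Entries for the same token are collapsed to the newest timestamp so
    that each flush produces at most one write per token.
    """
    while True:
        await asyncio.sleep(interval)
        latest: dict = {}
        # Drain everything that accumulated since the previous flush.
        while not queue.empty():
            token_hash, used_at = queue.get_nowait()
            latest[token_hash] = max(used_at, latest.get(token_hash, used_at))
        if latest:
            # Hypothetical async callable, e.g. one bulk UPDATE statement.
            await write_batch(latest)
```

Request handlers would only call queue.put_nowait((token_hash, timestamp)), which keeps the database write off the request path. The trade-off is that last_used_at can lag by up to one interval.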

Suggested Smoke Checks

  1. GET /health returns {"status":"ok"}.
  2. A first request with a new bearer token creates a binding in PostgreSQL and Redis.
  3. A second request from the same IP is allowed and refreshes last_used_at.
  4. A request from a different IP is rejected with 403 and creates an intercept_logs record, unless the binding rule is all.
  5. /admin/api/login returns a JWT and the frontend can load /admin/api/dashboard.
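
The behaviour expected in checks 2 to 4 can be modelled in memory. The sketch below is only a mental model of first-use binding: InMemoryBinder is a hypothetical class, a dict stands in for PostgreSQL and Redis, and rule types other than a single bound IP are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBinder:
    """Toy model of first-use IP binding (not the real service)."""

    bindings: dict = field(default_factory=dict)    # token hash -> bound IP
    intercepts: list = field(default_factory=list)  # stand-in for intercept_logs

    def handle(self, token_hash: str, client_ip: str) -> int:
        """Return the HTTP status the proxy would be expected to produce."""
        bound_ip = self.bindings.get(token_hash)
        if bound_ip is None:
            # First use: bind the token to the caller's IP and allow.
            self.bindings[token_hash] = client_ip
            return 200
        if bound_ip == client_ip:
            # Same IP as the binding: allow.
            return 200
        # Different IP: record the intercept and reject.
        self.intercepts.append((token_hash, client_ip))
        return 403


binder = InMemoryBinder()
first = binder.handle("hash-1", "198.51.100.4")   # first use creates the binding
second = binder.handle("hash-1", "198.51.100.4")  # same IP is allowed
third = binder.handle("hash-1", "203.0.113.9")    # different IP is rejected
```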