Key-IP Sentinel

Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.

Features

  • First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
  • Streaming reverse proxy built on httpx.AsyncClient and FastAPI StreamingResponse.
  • Trusted proxy IP extraction that only accepts X-Real-IP from configured upstream networks.
  • Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
  • Admin API protected by JWT and Redis-backed login lockout.
  • Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
  • Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.
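
The first-use bind flow can be sketched as follows. This is an illustrative stand-in, not the actual code under app/: plain dicts stand in for Redis and PostgreSQL, and `token_fingerprint` / `lookup_binding` are hypothetical names.

```python
import hmac
import hashlib

# Hypothetical secret; in Sentinel this comes from SENTINEL_HMAC_SECRET.
SECRET = b"replace-with-a-random-32-byte-secret"

def token_fingerprint(bearer_token: str) -> str:
    """HMAC-SHA256 the raw API key so the plaintext key is never stored."""
    return hmac.new(SECRET, bearer_token.encode(), hashlib.sha256).hexdigest()

def lookup_binding(token: str, redis_cache: dict, pg_table: dict):
    """Cache-aside lookup: try the cache, fall back to the authoritative
    store, then backfill the cache on a hit."""
    key = token_fingerprint(token)
    if key in redis_cache:           # cache hit
        return redis_cache[key]
    binding = pg_table.get(key)      # authoritative store
    if binding is not None:
        redis_cache[key] = binding   # backfill the cache
    return binding
```

On a miss in both stores, the request's source IP would be recorded as the new binding (the "first use" in first-use bind).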

Repository Layout

sentinel/
├── app/
├── db/
├── nginx/
├── frontend/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
└── README.md

Runtime Notes

  • Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
  • PostgreSQL stores authoritative token bindings and intercept logs.
  • Archive retention removes inactive bindings from the active table after ARCHIVE_DAYS; a later request with the same token binds again on first use.
  • SENTINEL_FAILSAFE_MODE=closed rejects requests when both Redis and PostgreSQL are unavailable; open lets traffic through in that case.
  • Binding rules support single (single IP or single CIDR), multiple (multiple discrete IPs), and all (allow all source IPs).
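
The three binding rules could be evaluated with the standard-library ipaddress module along these lines (`ip_allowed` is a hypothetical helper, not necessarily the name used in app/):

```python
from ipaddress import ip_address, ip_network

def ip_allowed(rule: str, bound: list[str], source_ip: str) -> bool:
    """Evaluate a binding rule against a request's source IP.

    rule: "single" (one IP or one CIDR), "multiple" (discrete IPs),
    or "all" (any source IP is accepted).
    """
    if rule == "all":
        return True
    src = ip_address(source_ip)
    if rule == "single":
        # One entry that may be either a host address or a CIDR block;
        # strict=False lets "10.0.0.5" parse as the /32 network 10.0.0.5/32.
        return src in ip_network(bound[0], strict=False)
    if rule == "multiple":
        return any(src == ip_address(entry) for entry in bound)
    return False
```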

Sentinel and New API Relationship

Sentinel and New API are expected to run as two separate Docker Compose projects:

  • The Sentinel compose contains nginx, sentinel-app, redis, and postgres.
  • The New API compose contains your existing New API service and its own dependencies.
  • The two stacks communicate through a shared external Docker network.

Traffic flow:

Client / SDK
    |
    |  request to Sentinel public endpoint
    v
Sentinel nginx  ->  sentinel-app  ->  New API service  ->  model backend
                         |
                         +-> redis / postgres

The key point: clients must call Sentinel rather than New API directly; otherwise IP binding never takes effect.

Use one external network name for both compose projects. This repository currently uses:

shared_network

In the Sentinel compose:

  • sentinel-app joins shared_network
  • nginx exposes the public entrypoint
  • DOWNSTREAM_URL points to the New API service name on that shared network

In the New API compose:

  • The New API container must also join shared_network
  • The New API service name must match what Sentinel uses in DOWNSTREAM_URL

Example:

  • New API compose service name: new-api
  • New API internal container port: 3000
  • Sentinel .env: DOWNSTREAM_URL=http://new-api:3000

If your New API service is named differently, change DOWNSTREAM_URL accordingly, for example:

DOWNSTREAM_URL=http://my-newapi:3000

Common New API Connection Patterns

In practice, New API is usually run in one of two ways.

Pattern A: Production machine, New API in its own compose

This is the recommended production arrangement.

New API keeps its own compose project and typically joins:

  • default
  • shared_network

That means New API can continue to use its own internal compose network for its own dependencies, while also exposing its service name to Sentinel through shared_network.

Example New API compose fragment:

services:
  new-api:
    image: your-new-api-image
    networks:
      - default
      - shared_network

networks:
  shared_network:
    external: true

With this setup, Sentinel still uses:

DOWNSTREAM_URL=http://new-api:3000

Pattern B: Test machine, New API started as a standalone container

On a test machine, you may not use a second compose project at all. Instead, you can start a standalone New API container with docker run, as long as that container also joins shared_network.

Example:

docker run -d \
  --name new-api \
  --network shared_network \
  your-new-api-image

Important:

  • The container name or reachable hostname must match what Sentinel uses in DOWNSTREAM_URL.
  • If the container is not named new-api, then adjust .env accordingly.
  • The port in DOWNSTREAM_URL is still the New API container's internal listening port.

Example:

DOWNSTREAM_URL=http://new-api:3000

or, if your standalone container is named differently:

DOWNSTREAM_URL=http://new-api-test:3000

Local Development

Backend

  1. Install uv and ensure Python 3.13 is available.
  2. Create the environment and sync dependencies:
     uv sync
  3. Copy .env.example to .env and update secrets and addresses.
  4. Start PostgreSQL and Redis.
  5. Run the API:
     uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000

Frontend

  1. Install dependencies:
     cd frontend
     npm install
  2. Start the Vite dev server:
     npm run dev

The Vite config proxies /admin/api/* to http://127.0.0.1:7000.

Alternatively, running uv run main.py from the repository root starts the same FastAPI app on APP_PORT (default 7000).

Dependency Management

  • Local Python development uses uv via pyproject.toml.
  • The container runtime image uses requirements.txt and intentionally installs only Python dependencies.
  • Application source code is mounted by Compose at runtime, so the offline host does not need to rebuild the image just to load the current backend code.

Offline Deployment Model

If your production machine has no internet access, use this repository as follows:

  1. Build the key-ip-sentinel:latest image on a machine with internet access.
  2. Export that image as a tar archive.
  3. Import the archive on the offline machine.
  4. Place the repository files on the offline machine.
  5. Start the stack with docker compose up -d, not docker compose up --build -d.

This works because:

  • Dockerfile installs only Python dependencies into the image.
  • docker-compose.yml mounts ./app into the running sentinel-app container.
  • The offline machine only needs the prebuilt image plus the repository files.

Important limitation:

  • If you change Python dependencies in requirements.txt, you must rebuild and re-export the image on a connected machine.
  • If you only change backend application code under app/, you do not need to rebuild the image; restarting the container is enough.
  • frontend/dist must already exist before deployment, because Nginx serves the built frontend directly from the repository.
  • The base images used by this stack, such as nginx:alpine, redis:7-alpine, and postgres:16, must also be available on the offline host in advance.

Prepare images on a connected machine

Build and export the Sentinel runtime image:

docker build -t key-ip-sentinel:latest .
docker save -o key-ip-sentinel-latest.tar key-ip-sentinel:latest

Also export the public images used by Compose if the offline machine cannot pull them:

docker pull nginx:alpine
docker pull redis:7-alpine
docker pull postgres:16

docker save -o sentinel-support-images.tar nginx:alpine redis:7-alpine postgres:16

If the admin frontend is not already built, build it on the connected machine too:

cd frontend
npm install
npm run build
cd ..

Then copy these items to the offline machine:

  • the full repository working tree
  • key-ip-sentinel-latest.tar
  • sentinel-support-images.tar if needed

Import images on the offline machine

docker load -i key-ip-sentinel-latest.tar
docker load -i sentinel-support-images.tar

Start on the offline machine

After .env, frontend/dist, and shared_network are ready:

docker compose up -d

Production Deployment

1. Create the shared Docker network

Create the external network once on the Docker host:

docker network create shared_network

Both compose projects must reference the same external network name.

2. Make sure New API joins the shared network

In the New API project, add the external network to the New API service.

Minimal example:

services:
  new-api:
    image: your-new-api-image
    networks:
      - default
      - shared_network

networks:
  shared_network:
    external: true

Important:

  • new-api here is the service name that Sentinel will resolve on the shared network.
  • The port in DOWNSTREAM_URL must be the container internal port, not the host published port.
  • If New API already listens on 3000 inside the container, use http://new-api:3000.
  • On a production host, New API can keep both default and shared_network at the same time.
  • On a test host, you can skip a second compose project and use docker run, but the container must still join shared_network.

3. Prepare Sentinel environment

  1. Copy .env.example to .env.
  2. Replace SENTINEL_HMAC_SECRET, ADMIN_PASSWORD, and ADMIN_JWT_SECRET.
  3. Verify DOWNSTREAM_URL points to the New API service name on shared_network.
  4. Keep PG_DSN aligned with the fixed PostgreSQL container password in docker-compose.yml, or update both together.

Example .env for Sentinel:

DOWNSTREAM_URL=http://new-api:3000
REDIS_ADDR=redis://redis:6379
REDIS_PASSWORD=
PG_DSN=postgresql+asyncpg://sentinel:password@postgres:5432/sentinel
SENTINEL_HMAC_SECRET=replace-with-a-random-32-byte-secret
ADMIN_PASSWORD=replace-with-a-strong-password
ADMIN_JWT_SECRET=replace-with-a-random-jwt-secret
TRUSTED_PROXY_IPS=172.24.0.0/16
SENTINEL_FAILSAFE_MODE=closed
APP_PORT=7000
ALERT_WEBHOOK_URL=
ALERT_THRESHOLD_COUNT=5
ALERT_THRESHOLD_SECONDS=300
ARCHIVE_DAYS=90

Notes:

  • TRUSTED_PROXY_IPS should match the Docker subnet used by the Sentinel internal network.
  • If Docker recreates the compose network with a different subnet, update this value.

4. Build the Sentinel frontend bundle

cd frontend
npm install
npm run build
cd ..

This produces frontend/dist, which Nginx serves at /admin/ui/.

If the target host is offline, do this on a connected machine first and copy the resulting frontend/dist directory with the repository.

5. Confirm Sentinel compose prerequisites

  • Build the frontend first. If frontend/dist is missing, /admin/ui/ cannot be served by Nginx.
  • Ensure the external Docker network shared_network already exists before starting Sentinel.
  • Ensure key-ip-sentinel:latest, nginx:alpine, redis:7-alpine, and postgres:16 are already present on the host if the host cannot access the internet.

6. Start the Sentinel stack

docker compose up -d

Use docker compose up --build -d only on a connected machine where rebuilding the Sentinel image is actually intended.

Services:

  • http://<host>/ forwards model API traffic through Sentinel.
  • http://<host>/admin/ui/ serves the admin console.
  • http://<host>/admin/api/* serves the admin API.
  • http://<host>/health exposes the app health check.

7. Verify cross-compose connectivity

After both compose stacks are running:

  1. Open http://<host>:8016/health and confirm it returns {"status":"ok"}.
  2. Open http://<host>:8016/admin/ui/ and log in with ADMIN_PASSWORD.
  3. Send a real model API request to Sentinel, not to New API directly.
  4. Check the Bindings page and confirm the token appears with a recorded binding rule.

Example test request:

curl http://<host>:8016/v1/models \
  -H "Authorization: Bearer <your_api_key>"

If your client still points directly to New API, Sentinel will not see the request and no binding will be created.

Which Port Should Clients Use?

With the current example compose in this repository:

  • Sentinel public port: 8016
  • New API internal container port: usually 3000

That means:

  • For testing now, clients should call http://<host>:8016/...
  • Sentinel forwards internally to http://new-api:3000

Do not point clients at host port 3000 if that bypasses Sentinel.

How To Go Live Without Changing Client Config

If you want existing clients to stay unchanged, Sentinel must take over the original external entrypoint that clients already use.

Typical cutover strategy:

  1. Keep New API on the shared internal Docker network.
  2. Stop exposing New API directly to users.
  3. Expose Sentinel on the old public host/port instead.
  4. Keep DOWNSTREAM_URL pointing to the internal New API service on shared_network.

For example, if users currently call http://host:3000, then in production you should eventually expose Sentinel on that old public port and make New API internal-only.

The current 8016:80 mapping in docker-compose.yml is a local test mapping, not the only valid production setup.

Admin API Summary

  • POST /admin/api/login
  • GET /admin/api/dashboard
  • GET /admin/api/bindings
  • POST /admin/api/bindings/unbind
  • PUT /admin/api/bindings/ip
  • POST /admin/api/bindings/ban
  • POST /admin/api/bindings/unban
  • GET /admin/api/logs
  • GET /admin/api/logs/export
  • GET /admin/api/settings
  • PUT /admin/api/settings

All admin endpoints except /admin/api/login require Authorization: Bearer <jwt>.
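
For illustration, the signing scheme behind such a bearer JWT can be reproduced with the standard library alone, assuming HS256. The app itself presumably uses a JWT library; `sign_token` and `verify_token` are hypothetical names:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"replace-with-a-random-jwt-secret"  # ADMIN_JWT_SECRET in .env

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(claims: dict) -> str:
    """Produce a compact HS256 JWT: header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_token(token: str) -> bool:
    """Recompute the signature and compare in constant time."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

A client would then send the token as Authorization: Bearer <jwt> on every admin call after login.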

Key Implementation Details

  • app/proxy/handler.py keeps the downstream response fully streamed, including SSE responses.
  • app/core/ip_utils.py never trusts client-supplied X-Forwarded-For.
  • app/services/binding_service.py batches last_used_at updates every 5 seconds through an asyncio.Queue.
  • app/services/alert_service.py pushes webhooks once the Redis counter reaches the configured threshold.
  • app/services/archive_service.py prunes stale bindings on a scheduler interval.
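
The last_used_at batching mentioned above can be sketched roughly like this; `batch_writer` and `flush` are illustrative names, and in the real service the flush would issue a single batched UPDATE against PostgreSQL:

```python
import asyncio
import time

async def batch_writer(queue: asyncio.Queue, flush, interval: float = 5.0):
    """Collapse queued (token_hash, timestamp) updates and flush one batch
    per interval instead of issuing a row update per request."""
    pending: dict = {}
    deadline = time.monotonic() + interval
    while True:
        timeout = max(0.0, deadline - time.monotonic())
        try:
            token_hash, ts = await asyncio.wait_for(queue.get(), timeout=timeout)
            pending[token_hash] = ts        # keep only the newest timestamp
        except asyncio.TimeoutError:
            if pending:
                await flush(dict(pending))  # e.g. one UPDATE ... WHERE IN (...)
                pending.clear()
            deadline = time.monotonic() + interval
```

Request handlers only enqueue, so the hot path never waits on the database.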

Suggested Smoke Checks

  1. GET /health returns {"status":"ok"}.
  2. A first request with a new bearer token creates a binding in PostgreSQL and Redis.
  3. A second request from the same IP is allowed and refreshes last_used_at.
  4. A request from a different IP is rejected with 403 and creates an intercept_logs record, unless the binding rule is all.
  5. /admin/api/login returns a JWT and the frontend can load /admin/api/dashboard.