# Key-IP Sentinel
Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.
## Features
- First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
- Streaming reverse proxy built on `httpx.AsyncClient` and FastAPI `StreamingResponse`.
- Trusted proxy IP extraction that only accepts `X-Real-IP` from configured upstream networks.
- Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
- Admin API protected by JWT and Redis-backed login lockout.
- Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
- Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.
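The first feature above, HMAC-SHA256 token hashing, can be sketched in a few lines. `SENTINEL_HMAC_SECRET` comes from this repository's configuration, but `hash_token` and the surrounding shape are a hypothetical illustration, not the actual code:

```python
import hashlib
import hmac

# Illustrative sketch (not the repository's actual code): Sentinel-style
# token hashing. Only an HMAC-SHA256 digest keyed by SENTINEL_HMAC_SECRET
# is stored, so neither Redis nor PostgreSQL ever holds the raw API key.
def hash_token(raw_token: str, secret: bytes) -> str:
    return hmac.new(secret, raw_token.encode(), hashlib.sha256).hexdigest()

secret = b"replace-with-a-random-32-byte-secret"
digest = hash_token("sk-example-key", secret)
print(len(digest))  # 64: a SHA-256 digest rendered as hex
```

Because the digest is deterministic for a given secret, it can serve as both the Redis cache key and the PostgreSQL lookup key for a binding.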
## Repository Layout

```
sentinel/
├── app/
├── db/
├── nginx/
├── frontend/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
└── README.md
```
## Runtime Notes
- Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
- PostgreSQL stores authoritative token bindings and intercept logs.
- Archive retention removes inactive bindings from the active table after `ARCHIVE_DAYS`. A later request from the same token will bind again on first use.
- `SENTINEL_FAILSAFE_MODE=closed` rejects requests when both Redis and PostgreSQL are unavailable; `open` allows traffic through.
- Binding rules support `single` (a single IP or a single CIDR), `multiple` (multiple discrete IPs), and `all` (allow all source IPs).
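The three binding rules can be illustrated with the standard `ipaddress` module. This is a hedged sketch of the matching logic only; the function name and the rule storage format are assumptions, not the repository's schema:

```python
import ipaddress

# Illustrative sketch of the three binding rules described above.
# "single" holds one IP or one CIDR, "multiple" holds discrete IPs,
# and "all" matches any source address.
def ip_allowed(rule: str, bound: list[str], client_ip: str) -> bool:
    if rule == "all":
        return True
    ip = ipaddress.ip_address(client_ip)
    if rule == "single":
        # A bare IP parses as a /32 (or /128) network, so one code path
        # covers both the single-IP and single-CIDR cases.
        return ip in ipaddress.ip_network(bound[0], strict=False)
    if rule == "multiple":
        return any(ip == ipaddress.ip_address(b) for b in bound)
    return False

print(ip_allowed("single", ["203.0.113.0/24"], "203.0.113.9"))      # True
print(ip_allowed("multiple", ["198.51.100.1"], "198.51.100.3"))     # False
```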
## Sentinel and New API Relationship
Sentinel and New API are expected to run as two separate Docker Compose projects:
- The Sentinel compose contains `nginx`, `sentinel-app`, `redis`, and `postgres`.
- The New API compose contains your existing New API service and its own dependencies.
- The two stacks communicate through a shared external Docker network.
Traffic flow:
```
Client / SDK
     |
     | request to Sentinel public endpoint
     v
Sentinel nginx -> sentinel-app -> New API service -> model backend
                       |
                       +-> redis / postgres
```
The key point is that clients must call Sentinel, not New API directly; otherwise IP binding will not take effect.
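The per-request decision Sentinel makes in this flow can be sketched as pure logic. This is a hypothetical illustration (the real code streams through httpx and persists to Redis/PostgreSQL); `decide` and its in-memory `bindings` dict are invented for the example:

```python
# Hypothetical sketch of the enforcement decision made before forwarding
# to New API. Bindings map a token hash to the IP seen on first use.
def decide(bindings: dict, token_hash: str, client_ip: str) -> str:
    bound_ip = bindings.get(token_hash)
    if bound_ip is None:
        bindings[token_hash] = client_ip  # first use: bind to this IP
        return "forward"
    if bound_ip == client_ip:
        return "forward"
    return "reject-403"  # intercept: would be logged and counted for alerts

b: dict = {}
print(decide(b, "abc", "203.0.113.9"))   # forward (first use binds)
print(decide(b, "abc", "203.0.113.9"))   # forward (same IP)
print(decide(b, "abc", "198.51.100.1"))  # reject-403 (different IP)
```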
## Recommended Deployment Topology
Use one external network name for both compose projects. This repository currently uses `shared_network`.
In the Sentinel compose:
- `sentinel-app` joins `shared_network`.
- `nginx` exposes the public entrypoint.
- `DOWNSTREAM_URL` points to the New API service name on that shared network.
In the New API compose:
- The New API container must also join `shared_network`.
- The New API service name must match what Sentinel uses in `DOWNSTREAM_URL`.
Example:
- New API compose service name: `new-api`
- New API internal container port: `3000`
- Sentinel `.env`: `DOWNSTREAM_URL=http://new-api:3000`
If your New API service is named differently, change `DOWNSTREAM_URL` accordingly, for example:

```
DOWNSTREAM_URL=http://my-newapi:3000
```
## Common New API Connection Patterns
In practice, you may run New API in either of these two ways.
### Pattern A: Production machine, New API in its own compose
This is the recommended production arrangement.
New API keeps its own compose project and typically joins both:

- `default`
- `shared_network`
That means New API can continue to use its own internal compose network for its own dependencies, while also exposing its service name to Sentinel through `shared_network`.
Example New API compose fragment:
```yaml
services:
  new-api:
    image: your-new-api-image
    networks:
      - default
      - shared_network

networks:
  shared_network:
    external: true
```
With this setup, Sentinel still uses:
```
DOWNSTREAM_URL=http://new-api:3000
```
### Pattern B: Test machine, New API started as a standalone container
On a test machine, you may not use a second compose project at all. Instead, you can start a standalone New API container with `docker run`, as long as that container also joins `shared_network`.
Example:
```
docker run -d \
  --name new-api \
  --network shared_network \
  your-new-api-image
```
Important:
- The container name or reachable hostname must match what Sentinel uses in `DOWNSTREAM_URL`.
- If the container is not named `new-api`, adjust `.env` accordingly.
- The port in `DOWNSTREAM_URL` is still the New API container's internal listening port.
Example:

```
DOWNSTREAM_URL=http://new-api:3000
```

or, if your standalone container is named differently:

```
DOWNSTREAM_URL=http://new-api-test:3000
```
## Local Development
### Backend
- Install `uv` and ensure Python 3.13 is available.
- Create the environment and sync dependencies:

  ```
  uv sync
  ```

- Copy `.env.example` to `.env` and update the secrets and addresses.
- Start PostgreSQL and Redis.
- Run the API:

  ```
  uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000
  ```
### Frontend
- Install dependencies:

  ```
  cd frontend
  npm install
  ```

- Start the Vite dev server:

  ```
  npm run dev
  ```
The Vite config proxies `/admin/api/*` to `http://127.0.0.1:7000`.
If you prefer the repository root entrypoint, `uv run main.py` starts the same FastAPI app on `APP_PORT` (default `7000`).
## Dependency Management
- Local Python development uses `uv` via `pyproject.toml`.
- The container runtime image uses `requirements.txt` and intentionally installs only Python dependencies.
- Application source code is mounted by Compose at runtime, so the offline host does not need to rebuild the image just to load the current backend code.
## Offline Deployment Model
If your production machine has no internet access, use this repository as follows:
- Build the `key-ip-sentinel:latest` image on a machine with internet access.
- Export that image as a tar archive.
- Import the archive on the offline machine.
- Place the repository files on the offline machine.
- Start the stack with `docker compose up -d`, not `docker compose up --build -d`.
This works because:
- `Dockerfile` installs only Python dependencies into the image.
- `docker-compose.yml` mounts `./app` into the running `sentinel-app` container.
- The offline machine only needs the prebuilt image plus the repository files.
Important limitations:

- If you change Python dependencies in `requirements.txt`, you must rebuild and re-export the image on a connected machine.
- If you only change backend application code under `app/`, you do not need to rebuild the image; restarting the container is enough.
- `frontend/dist` must already exist before deployment, because Nginx serves the built frontend directly from the repository.
- The base images used by this stack, such as `nginx:alpine`, `redis:7-alpine`, and `postgres:16`, must also be available on the offline host in advance.
### Prepare images on a connected machine
Build and export the Sentinel runtime image:
```
docker build -t key-ip-sentinel:latest .
docker save -o key-ip-sentinel-latest.tar key-ip-sentinel:latest
```
Also export the public images used by Compose if the offline machine cannot pull them:
```
docker pull nginx:alpine
docker pull redis:7-alpine
docker pull postgres:16
docker save -o sentinel-support-images.tar nginx:alpine redis:7-alpine postgres:16
```
If the admin frontend is not already built, build it on the connected machine too:
```
cd frontend
npm install
npm run build
cd ..
```
Then copy these items to the offline machine:
- the full repository working tree
- `key-ip-sentinel-latest.tar`
- `sentinel-support-images.tar`, if needed
### Import images on the offline machine

```
docker load -i key-ip-sentinel-latest.tar
docker load -i sentinel-support-images.tar
```
### Start on the offline machine
After `.env`, `frontend/dist`, and `shared_network` are ready:

```
docker compose up -d
```
## Production Deployment
### 1. Create the shared Docker network
Create the external network once on the Docker host:
```
docker network create shared_network
```
Both compose projects must reference this exact same external network name.
### 2. Make sure New API joins the shared network
In the New API project, add the external network to the New API service.
Minimal example:
```yaml
services:
  new-api:
    image: your-new-api-image
    networks:
      - default
      - shared_network

networks:
  shared_network:
    external: true
```
Important:
- `new-api` here is the service name that Sentinel will resolve on the shared network.
- The port in `DOWNSTREAM_URL` must be the container's internal port, not the host-published port.
- If New API already listens on `3000` inside the container, use `http://new-api:3000`.
- On a production host, New API can keep both `default` and `shared_network` at the same time.
- On a test host, you can skip a second compose project and use `docker run`, but the container must still join `shared_network`.
### 3. Prepare Sentinel environment
- Copy `.env.example` to `.env`.
- Replace `SENTINEL_HMAC_SECRET`, `ADMIN_PASSWORD`, and `ADMIN_JWT_SECRET`.
- Verify `DOWNSTREAM_URL` points to the New API service name on `shared_network`.
- Keep `PG_DSN` aligned with the fixed PostgreSQL container password in `docker-compose.yml`, or update both together.
Example `.env` for Sentinel:

```
DOWNSTREAM_URL=http://new-api:3000
REDIS_ADDR=redis://redis:6379
REDIS_PASSWORD=
PG_DSN=postgresql+asyncpg://sentinel:password@postgres:5432/sentinel
SENTINEL_HMAC_SECRET=replace-with-a-random-32-byte-secret
ADMIN_PASSWORD=replace-with-a-strong-password
ADMIN_JWT_SECRET=replace-with-a-random-jwt-secret
TRUSTED_PROXY_IPS=172.24.0.0/16
SENTINEL_FAILSAFE_MODE=closed
APP_PORT=7000
ALERT_WEBHOOK_URL=
ALERT_THRESHOLD_COUNT=5
ALERT_THRESHOLD_SECONDS=300
ARCHIVE_DAYS=90
```
Notes:
- `TRUSTED_PROXY_IPS` should match the Docker subnet used by the Sentinel internal network.
- If Docker recreates the compose network with a different subnet, update this value.
### 4. Build the Sentinel frontend bundle
```
cd frontend
npm install
npm run build
cd ..
```
This produces `frontend/dist`, which Nginx serves at `/admin/ui/`.
If the target host is offline, do this on a connected machine first and copy the resulting `frontend/dist` directory with the repository.
### 5. Confirm Sentinel compose prerequisites
- Build the frontend first. If `frontend/dist` is missing, Nginx cannot serve `/admin/ui/`.
- Ensure the external Docker network `shared_network` already exists before starting Sentinel.
- Ensure `key-ip-sentinel:latest`, `nginx:alpine`, `redis:7-alpine`, and `postgres:16` are already present on the host if it cannot access the internet.
### 6. Start the Sentinel stack

```
docker compose up -d
```
Use `docker compose up --build -d` only on a connected machine where rebuilding the Sentinel image is actually intended.
Services:
- `http://<host>/` forwards model API traffic through Sentinel.
- `http://<host>/admin/ui/` serves the admin console.
- `http://<host>/admin/api/*` serves the admin API.
- `http://<host>/health` exposes the app health check.
### 7. Verify cross-compose connectivity
After both compose stacks are running:
- Open `http://<host>:8016/health` and confirm it returns `{"status":"ok"}`.
- Open `http://<host>:8016/admin/ui/` and log in with `ADMIN_PASSWORD`.
- Send a real model API request to Sentinel, not to New API directly.
- Check the `Bindings` page and confirm the token appears with a recorded binding rule.
Example test request:
```
curl http://<host>:8016/v1/models \
  -H "Authorization: Bearer <your_api_key>"
```
If your client still points directly to New API, Sentinel will not see the request and no binding will be created.
## Which Port Should Clients Use?
With the current example compose in this repository:
- Sentinel public port: `8016`
- New API internal container port: usually `3000`
That means:
- For testing now, clients should call `http://<host>:8016/...`
- Sentinel forwards internally to `http://new-api:3000`.
Do not point clients at host port `3000`; doing so bypasses Sentinel.
## How To Go Live Without Changing Client Config
If you want existing clients to stay unchanged, Sentinel must take over the original external entrypoint that clients already use.
Typical cutover strategy:
- Keep New API on the shared internal Docker network.
- Stop exposing New API directly to users.
- Expose Sentinel on the old public host/port instead.
- Keep `DOWNSTREAM_URL` pointing to the internal New API service on `shared_network`.
For example, if users currently call `http://host:3000`, then in production you should eventually expose Sentinel on that old public port and make New API internal-only.
The current `8016:80` mapping in `docker-compose.yml` is a local test mapping, not the only valid production setup.
## Admin API Summary
- `POST /admin/api/login`
- `GET /admin/api/dashboard`
- `GET /admin/api/bindings`
- `POST /admin/api/bindings/unbind`
- `PUT /admin/api/bindings/ip`
- `POST /admin/api/bindings/ban`
- `POST /admin/api/bindings/unban`
- `GET /admin/api/logs`
- `GET /admin/api/logs/export`
- `GET /admin/api/settings`
- `PUT /admin/api/settings`
All admin endpoints except `/admin/api/login` require `Authorization: Bearer <jwt>`.
## Key Implementation Details
- `app/proxy/handler.py` keeps the downstream response fully streamed, including SSE responses.
- `app/core/ip_utils.py` never trusts client-supplied `X-Forwarded-For`.
- `app/services/binding_service.py` batches `last_used_at` updates every 5 seconds through an `asyncio.Queue`.
- `app/services/alert_service.py` pushes webhooks once the Redis counter reaches the configured threshold.
- `app/services/archive_service.py` prunes stale bindings on a scheduler interval.
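The queue-based batching in `binding_service.py` can be illustrated with a toy `asyncio.Queue` consumer. The sketch below is assumption-laden (the item format, the short flush interval, and all names are invented for the example), not the repository's code:

```python
import asyncio

# Toy sketch of queue-based batching: request handlers enqueue token
# hashes cheaply, and a single consumer drains the queue on an interval
# and issues one bulk update instead of one UPDATE per request.
async def flush_loop(queue: asyncio.Queue, flushed: list, interval: float = 0.05):
    while True:
        await asyncio.sleep(interval)
        batch = set()
        while not queue.empty():
            batch.add(queue.get_nowait())  # duplicates collapse via the set
        if batch:
            # Real code would run one UPDATE ... WHERE token_hash = ANY(...) here.
            flushed.append(sorted(batch))

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    flushed: list = []
    task = asyncio.create_task(flush_loop(queue, flushed))
    for tok in ["t1", "t2", "t1"]:  # two updates for t1 become one row write
        queue.put_nowait(tok)
    await asyncio.sleep(0.1)  # let one flush interval elapse
    task.cancel()
    return flushed

print(asyncio.run(main()))  # [['t1', 't2']]
```

The design choice is the usual write-coalescing trade-off: `last_used_at` may lag by up to one interval, but the hot request path never waits on PostgreSQL.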
## Suggested Smoke Checks
- `GET /health` returns `{"status":"ok"}`.
- A first request with a new bearer token creates a binding in PostgreSQL and Redis.
- A second request from the same IP is allowed and refreshes `last_used_at`.
- A request from a different IP is rejected with `403` and creates an `intercept_logs` record, unless the binding rule is `all`.
- `/admin/api/login` returns a JWT and the frontend can load `/admin/api/dashboard`.