# Key-IP Sentinel

Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.

## Features

- First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
- Streaming reverse proxy built on `httpx.AsyncClient` and FastAPI `StreamingResponse`.
- Trusted proxy IP extraction that only accepts `X-Real-IP` from configured upstream networks.
- Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
- Admin API protected by JWT and Redis-backed login lockout.
- Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
- Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.

## Repository Layout

```text
sentinel/
├── app/
├── db/
├── nginx/
├── frontend/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
└── README.md
```

## Runtime Notes

- Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
- PostgreSQL stores authoritative token bindings and intercept logs.
- Archive retention removes inactive bindings from the active table after `ARCHIVE_DAYS`. A later request from the same token will bind again on first use.
- `SENTINEL_FAILSAFE_MODE=closed` rejects requests when both Redis and PostgreSQL are unavailable. `open` allows traffic through.

## Local Development

### Backend

1. Install `uv` and ensure Python 3.13 is available.
2. Create the environment and sync dependencies:

   ```bash
   uv sync
   ```

3. Copy `.env.example` to `.env` and update secrets plus addresses.
4. Start PostgreSQL and Redis.
5. Run the API:

   ```bash
   uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000
   ```

### Frontend

1. Install dependencies:

   ```bash
   cd frontend
   npm install
   ```

2. Start Vite dev server:

   ```bash
   npm run dev
   ```

The Vite config proxies `/admin/api/*` to `http://127.0.0.1:7000`.

If you prefer the repository root entrypoint, `uv run main.py` now starts the same FastAPI app on `APP_PORT` (default `7000`).

## Dependency Management

- Local Python development uses `uv` via [`pyproject.toml`](/d:/project/sentinel/pyproject.toml).
- Container builds still use [`requirements.txt`](/d:/project/sentinel/requirements.txt) because the Dockerfile is intentionally minimal and matches the delivery requirements.

## Production Deployment

### 1. Prepare environment

1. Copy `.env.example` to `.env`.
2. Replace `SENTINEL_HMAC_SECRET`, `ADMIN_PASSWORD`, and `ADMIN_JWT_SECRET`.
3. Verify `DOWNSTREAM_URL` points to the internal New API service.
4. Keep `PG_DSN` aligned with the fixed PostgreSQL container password in `docker-compose.yml`, or update both together.

### 2. Build the frontend bundle

```bash
cd frontend
npm install
npm run build
cd ..
```

This produces `frontend/dist`, which Nginx serves at `/admin/ui/`.

### 3. Build prerequisites

- Build the frontend first. If `frontend/dist` is missing, `/admin/ui/` cannot be served by Nginx.
- Ensure the external Docker network `llm-shared-net` already exists if `DOWNSTREAM_URL=http://new-api:3000` should resolve across stacks.

### 4. Start the stack

```bash
docker compose up --build -d
```

Services:

- `http:///` forwards model API traffic through Sentinel.
- `http:///admin/ui/` serves the admin console.
- `http:///admin/api/*` serves the admin API.
- `http:///health` exposes the app health check.

## Admin API Summary

- `POST /admin/api/login`
- `GET /admin/api/dashboard`
- `GET /admin/api/bindings`
- `POST /admin/api/bindings/unbind`
- `PUT /admin/api/bindings/ip`
- `POST /admin/api/bindings/ban`
- `POST /admin/api/bindings/unban`
- `GET /admin/api/logs`
- `GET /admin/api/logs/export`
- `GET /admin/api/settings`
- `PUT /admin/api/settings`

All admin endpoints except `/admin/api/login` require `Authorization: Bearer `.
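
As a rough illustration of the bearer-token requirement, the sketch below builds (but does not send) a login request and an authenticated admin request with the Python standard library. The base URL is the local development address used earlier in this README; the `password` field in the login body and both helper names are assumptions made for illustration, not confirmed details of the Sentinel admin API.

```python
import json
import urllib.request

# Assumption: the backend is running locally on the default dev port.
BASE_URL = "http://127.0.0.1:7000"


def build_login_request(password: str) -> urllib.request.Request:
    """Build a POST to /admin/api/login.

    Assumption: the endpoint accepts a JSON body with a "password" field.
    Check the real request schema before relying on this.
    """
    body = json.dumps({"password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/admin/api/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_admin_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request to any other admin endpoint with the JWT attached."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method=method,
    )
```

Passing either request to `urllib.request.urlopen` would actually send it; how the JWT is extracted from the login response depends on the response shape, which is not documented here.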

## Key Implementation Details

- `app/proxy/handler.py` keeps the downstream response fully streamed, including SSE responses.
- `app/core/ip_utils.py` never trusts client-supplied `X-Forwarded-For`.
- `app/services/binding_service.py` batches `last_used_at` updates every 5 seconds through an `asyncio.Queue`.
- `app/services/alert_service.py` pushes webhooks once the Redis counter reaches the configured threshold.
- `app/services/archive_service.py` prunes stale bindings on a scheduler interval.

## Suggested Smoke Checks

1. `GET /health` returns `{"status":"ok"}`.
2. A first request with a new bearer token creates a binding in PostgreSQL and Redis.
3. A second request from the same IP is allowed and refreshes `last_used_at`.
4. A request from a different IP is rejected with `403` and creates an `intercept_logs` record.
5. `/admin/api/login` returns a JWT and the frontend can load `/admin/api/dashboard`.
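
The behavior exercised by checks 2 to 4 can be approximated with a minimal in-memory sketch. It is not the code in `app/services/binding_service.py`: the `bindings` dict stands in for both the Redis cache and the PostgreSQL table, `HMAC_SECRET` stands in for `SENTINEL_HMAC_SECRET`, and the function names are hypothetical. It only illustrates HMAC-SHA256 token hashing, first-use binding, and CIDR matching.

```python
import hashlib
import hmac
import ipaddress

# Placeholder for SENTINEL_HMAC_SECRET; never hard-code a real secret.
HMAC_SECRET = b"change-me"

# Hypothetical in-memory store: token hash -> bound network.
# The real service uses Redis (cache) and PostgreSQL (authoritative).
bindings: dict = {}


def hash_token(token: str) -> str:
    """Return the HMAC-SHA256 hex digest so the raw token is never stored."""
    return hmac.new(HMAC_SECRET, token.encode("utf-8"), hashlib.sha256).hexdigest()


def check_request(token: str, client_ip: str) -> int:
    """Return 200 if the request is allowed, 403 if the IP does not match."""
    key = hash_token(token)
    ip = ipaddress.ip_address(client_ip)
    bound = bindings.get(key)
    if bound is None:
        # First use: bind the token to a single-address network (/32 or /128).
        bindings[key] = ipaddress.ip_network(client_ip)
        return 200
    # Later uses: allow only addresses inside the bound CIDR.
    return 200 if ip in bound else 403
```

Calling `check_request` twice with the same token and IP allows both requests, while a third call from a different IP returns `403`. The sketch leaves out the `last_used_at` refresh, the `intercept_logs` record, and alerting.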