# Key-IP Sentinel
Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.
## Features
- First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
- Streaming reverse proxy built on `httpx.AsyncClient` and FastAPI `StreamingResponse`.
- Trusted proxy IP extraction that only accepts `X-Real-IP` from configured upstream networks.
- Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
- Admin API protected by JWT and Redis-backed login lockout.
- Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
- Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.
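
The hashing and CIDR-matching steps of the first-use bind can be sketched as follows. This is a minimal illustration, not the code in `app/`: the helper names `hash_token` and `ip_matches_binding`, the example secret, and the example addresses are all assumptions made for this sketch.

```python
import hashlib
import hmac
import ipaddress


def hash_token(token: str, secret: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of a bearer token (hypothetical helper)."""
    return hmac.new(secret, token.encode("utf-8"), hashlib.sha256).hexdigest()


def ip_matches_binding(client_ip: str, bound_cidr: str) -> bool:
    """Check whether client_ip falls inside the bound network (hypothetical helper)."""
    network = ipaddress.ip_network(bound_cidr, strict=False)
    return ipaddress.ip_address(client_ip) in network


# Illustrative values only; a real deployment would read the secret from
# SENTINEL_HMAC_SECRET and the binding from Redis or PostgreSQL.
secret = b"example-secret"
digest = hash_token("sk-example-token", secret)
print(len(digest))  # 64 hex characters
print(ip_matches_binding("203.0.113.7", "203.0.113.0/24"))   # True
print(ip_matches_binding("198.51.100.9", "203.0.113.0/24"))  # False
```

Storing only the keyed digest means a leaked binding table does not reveal usable API keys, and matching against a CIDR rather than a single address lets one binding cover a whole network.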
## Repository Layout
```text
sentinel/
├── app/
├── db/
├── nginx/
├── frontend/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
└── README.md
```
## Runtime Notes
- Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
- PostgreSQL stores authoritative token bindings and intercept logs.
- Archive retention removes inactive bindings from the active table after `ARCHIVE_DAYS`. A later request from the same token will bind again on first use.
- `SENTINEL_FAILSAFE_MODE=closed` rejects requests when both Redis and PostgreSQL are unavailable. `SENTINEL_FAILSAFE_MODE=open` lets traffic through in that situation.
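
The failsafe rule can be read as a small decision function. The sketch below is an assumption about how such a check might look, not the project's implementation: the function name `failsafe_decision` and the `"lookup"` outcome (continue with the normal binding check while at least one store is reachable) are hypothetical.

```python
def failsafe_decision(redis_ok: bool, postgres_ok: bool, mode: str) -> str:
    """Decide what to do with a request given store health (hypothetical helper).

    Returns "lookup" when at least one store is reachable (assumed behavior),
    otherwise "reject" for closed mode and "allow" for open mode.
    """
    if mode not in ("closed", "open"):
        raise ValueError(f"unknown failsafe mode: {mode!r}")
    if redis_ok or postgres_ok:
        return "lookup"
    return "reject" if mode == "closed" else "allow"


print(failsafe_decision(False, False, "closed"))  # reject
print(failsafe_decision(False, False, "open"))    # allow
print(failsafe_decision(False, True, "closed"))   # lookup
```

Closed mode favors security over availability; open mode keeps model traffic flowing during an outage at the cost of temporarily skipping the IP check.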
## Local Development
### Backend
1. Install `uv` and ensure Python 3.13 is available.
2. Create the environment and sync dependencies:
```bash
uv sync
```
3. Copy `.env.example` to `.env` and update the secrets and service addresses.
4. Start PostgreSQL and Redis.
5. Run the API:
```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000
```
### Frontend
1. Install dependencies:
```bash
cd frontend
npm install
```
2. Start the Vite dev server:
```bash
npm run dev
```
The Vite config proxies `/admin/api/*` to `http://127.0.0.1:7000`.
For the backend, `uv run main.py` from the repository root is an alternative entrypoint: it starts the same FastAPI app on `APP_PORT` (default `7000`).
## Dependency Management
- Local Python development uses `uv` via [`pyproject.toml`](pyproject.toml).
- Container builds still use [`requirements.txt`](requirements.txt) because the Dockerfile is intentionally minimal and matches the delivery requirements.
## Production Deployment
### 1. Prepare environment
1. Copy `.env.example` to `.env`.
2. Replace `SENTINEL_HMAC_SECRET`, `ADMIN_PASSWORD`, and `ADMIN_JWT_SECRET`.
3. Verify `DOWNSTREAM_URL` points to the internal New API service.
4. Keep `PG_DSN` aligned with the fixed PostgreSQL container password in `docker-compose.yml`, or update both together.
### 2. Build the frontend bundle
```bash
cd frontend
npm install
npm run build
cd ..
```
This produces `frontend/dist`, which Nginx serves at `/admin/ui/`.
### 3. Build prerequisites
- Build the frontend first. If `frontend/dist` is missing, `/admin/ui/` cannot be served by Nginx.
- Ensure the external Docker network `llm-shared-net` already exists if `DOWNSTREAM_URL=http://new-api:3000` should resolve across stacks.
### 4. Start the stack
```bash
docker compose up --build -d
```
Endpoints:
- `http://<host>/` forwards model API traffic through Sentinel.
- `http://<host>/admin/ui/` serves the admin console.
- `http://<host>/admin/api/*` serves the admin API.
- `http://<host>/health` exposes the app health check.
## Admin API Summary
- `POST /admin/api/login`
- `GET /admin/api/dashboard`
- `GET /admin/api/bindings`
- `POST /admin/api/bindings/unbind`
- `PUT /admin/api/bindings/ip`
- `POST /admin/api/bindings/ban`
- `POST /admin/api/bindings/unban`
- `GET /admin/api/logs`
- `GET /admin/api/logs/export`
- `GET /admin/api/settings`
- `PUT /admin/api/settings`
All admin endpoints except `/admin/api/login` require `Authorization: Bearer <jwt>`.
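
As an illustration of the header requirement, the sketch below builds an authenticated request with the standard library without sending it. The helper name `admin_request`, the base URL, and the placeholder token are assumptions; the JWT itself would come from `POST /admin/api/login`, whose request and response shapes are not described here.

```python
import urllib.request


def admin_request(base_url: str, path: str, jwt: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) an admin API request with a bearer token."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}{path}",
        method=method,
        headers={"Authorization": f"Bearer {jwt}"},
    )


# "<jwt>" is a placeholder for a token obtained from the login endpoint.
req = admin_request("http://127.0.0.1:7000", "/admin/api/dashboard", "<jwt>")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it against a running instance.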
## Key Implementation Details
- `app/proxy/handler.py` keeps the downstream response fully streamed, including SSE responses.
- `app/core/ip_utils.py` never trusts client-supplied `X-Forwarded-For`.
- `app/services/binding_service.py` batches `last_used_at` updates every 5 seconds through an `asyncio.Queue`.
- `app/services/alert_service.py` pushes webhooks once the Redis counter reaches the configured threshold.
- `app/services/archive_service.py` prunes stale bindings on a scheduler interval.
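
The trusted-proxy rule (accept `X-Real-IP` only from configured upstream networks, and ignore `X-Forwarded-For`) might look roughly like the sketch below. It is not the contents of `app/core/ip_utils.py`: the function name `extract_client_ip`, the lowercase header keys, and the fallback to the peer address on a malformed header are assumptions made for illustration.

```python
import ipaddress
from typing import Mapping, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def extract_client_ip(peer_ip: str, headers: Mapping[str, str], trusted: Sequence[Network]) -> str:
    """Return the client IP, honoring X-Real-IP only when the peer is trusted."""
    peer = ipaddress.ip_address(peer_ip)
    if any(peer in net for net in trusted):
        real_ip = headers.get("x-real-ip", "").strip()
        if real_ip:
            try:
                return str(ipaddress.ip_address(real_ip))
            except ValueError:
                pass  # malformed header: fall back to the peer address
    # Untrusted peer, or no usable header: X-Forwarded-For is never consulted.
    return str(peer)


# Hypothetical trusted range standing in for the Nginx container network.
trusted = [ipaddress.ip_network("172.16.0.0/12")]
print(extract_client_ip("172.18.0.2", {"x-real-ip": "203.0.113.7"}, trusted))    # 203.0.113.7
print(extract_client_ip("198.51.100.9", {"x-real-ip": "203.0.113.7"}, trusted))  # 198.51.100.9
```

Because the header is read only when the direct peer is a known proxy, a client that connects directly cannot spoof its bound IP by setting `X-Real-IP` itself.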
## Suggested Smoke Checks
1. `GET /health` returns `{"status":"ok"}`.
2. A first request with a new bearer token creates a binding in PostgreSQL and Redis.
3. A second request from the same IP is allowed and refreshes `last_used_at`.
4. A request from a different IP is rejected with `403` and creates an `intercept_logs` record.
5. `/admin/api/login` returns a JWT and the frontend can load `/admin/api/dashboard`.