Compare commits

...

3 Commits

Author SHA1 Message Date
7ed6f70bab Fix binding token extraction and harden startup concurrency 2026-03-05 14:40:27 +08:00
feb99faaf3 Configure Linux host-network deployment 2026-03-04 19:37:09 +08:00
663999f173 feat: support fully offline deployment 2026-03-04 16:04:19 +08:00
11 changed files with 363 additions and 180 deletions

View File

@@ -5,10 +5,12 @@ PG_DSN=postgresql+asyncpg://sentinel:password@postgres:5432/sentinel
SENTINEL_HMAC_SECRET=replace-with-a-random-32-byte-secret
ADMIN_PASSWORD=replace-with-a-strong-password
ADMIN_JWT_SECRET=replace-with-a-random-jwt-secret
TRUSTED_PROXY_IPS=172.18.0.0/16
TRUSTED_PROXY_IPS=172.30.0.0/24
SENTINEL_FAILSAFE_MODE=closed
APP_PORT=7000
UVICORN_WORKERS=4
ALERT_WEBHOOK_URL=
ALERT_THRESHOLD_COUNT=5
ALERT_THRESHOLD_SECONDS=300
ARCHIVE_DAYS=90
ARCHIVE_SCHEDULER_LOCK_KEY=2026030502

View File

@@ -6,5 +6,4 @@ RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
FROM python:3.13-slim-bookworm
WORKDIR /app
COPY --from=builder /install /usr/local
COPY app/ ./app/
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7000", "--workers", "4"]
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${APP_PORT:-7000} --workers ${UVICORN_WORKERS:-4}"]

README.md
View File

@@ -1,18 +1,18 @@
# Key-IP Sentinel
Key-IP Sentinel is a FastAPI-based reverse proxy that enforces first-use IP binding for model API keys before traffic reaches a downstream New API service.
Key-IP Sentinel is a FastAPI-based reverse proxy that enforces "bind to the source IP on first use" control for model API keys before requests reach a downstream New API service.
## Features
## Features
- First-use bind with HMAC-SHA256 token hashing, Redis cache-aside, and PostgreSQL CIDR matching.
- Streaming reverse proxy built on `httpx.AsyncClient` and FastAPI `StreamingResponse`.
- Trusted proxy IP extraction that only accepts `X-Real-IP` from configured upstream networks.
- Redis-backed intercept alert counters with webhook delivery and PostgreSQL audit logs.
- Admin API protected by JWT and Redis-backed login lockout.
- Vue 3 + Element Plus admin console for dashboarding, binding operations, audit logs, and live runtime settings.
- Docker Compose deployment with Nginx, app, Redis, and PostgreSQL.
- Automatic first-use binding, hashing tokens with HMAC-SHA256 and combining a Redis cache-aside layer with PostgreSQL-stored binding rules.
- Streaming reverse proxy built on `httpx.AsyncClient` and FastAPI `StreamingResponse`, with streaming pass-through.
- Trusted-proxy IP extraction that only accepts `X-Real-IP` from configured upstream networks.
- Redis-backed intercept counters with webhook alerts, plus PostgreSQL audit logs.
- Admin console login via JWT, with a Redis-backed login-failure lockout.
- Vue 3 + Element Plus admin console for the dashboard, bindings, audit logs, and runtime settings.
- Docker Compose deployment with Nginx, the app, Redis, and PostgreSQL.
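The HMAC-SHA256 token hashing mentioned in the feature list can be sketched as follows. This is a minimal illustration only; `hash_token` is a hypothetical name, not the actual application function:

```python
import hashlib
import hmac

def hash_token(token: str, secret: str) -> str:
    """Hash an API key with HMAC-SHA256 so the raw key is never stored."""
    return hmac.new(secret.encode(), token.encode(), hashlib.sha256).hexdigest()

digest = hash_token("sk-example", "replace-with-a-random-32-byte-secret")
print(len(digest))  # 64 hex characters
```

Because the digest is keyed by `SENTINEL_HMAC_SECRET`, a leaked database dump does not reveal usable API keys.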
## Repository Layout
## Repository Layout
```text
sentinel/
@@ -26,83 +26,103 @@ sentinel/
└── README.md
```
## Runtime Notes
## Runtime Notes
- Redis stores binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
- PostgreSQL stores authoritative token bindings and intercept logs.
- Archive retention removes inactive bindings from the active table after `ARCHIVE_DAYS`. A later request from the same token will bind again on first use.
- `SENTINEL_FAILSAFE_MODE=closed` rejects requests when both Redis and PostgreSQL are unavailable. `open` allows traffic through.
- Binding rules support `single` (single IP or single CIDR), `multiple` (multiple discrete IPs), and `all` (allow all source IPs).
- Redis stores the binding cache, alert counters, daily dashboard metrics, and mutable runtime settings.
- PostgreSQL stores authoritative binding records and intercept logs.
- The archive retention job removes bindings inactive for longer than `ARCHIVE_DAYS` from the active table; a later request with the same token binds again on first use.
- `SENTINEL_FAILSAFE_MODE=closed` rejects requests when both Redis and PostgreSQL are unavailable; `open` lets traffic through.
- Binding rules support three modes: `single` (one IP or one CIDR), `multiple` (multiple discrete IPs), and `all` (allow all source IPs).
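The three rule modes can be illustrated with Python's `ipaddress` module. This is a simplified sketch with hypothetical rule shapes; the real service stores and evaluates rules in its own code:

```python
from ipaddress import ip_address, ip_network

def ip_allowed(client_ip: str, rule_type: str, rule_values: list[str]) -> bool:
    """Evaluate a source IP against a binding rule of type single/multiple/all."""
    if rule_type == "all":
        return True  # allow any source IP
    if rule_type == "multiple":
        return client_ip in rule_values  # list of discrete IPs
    # "single": one IP or one CIDR
    return ip_address(client_ip) in ip_network(rule_values[0], strict=False)

print(ip_allowed("10.0.0.7", "single", ["10.0.0.0/24"]))  # True
print(ip_allowed("10.0.1.7", "single", ["10.0.0.0/24"]))  # False
```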
## Sentinel and New API Relationship
## Sentinel and New API Relationship
Sentinel and New API are expected to run as **two separate Docker Compose projects**:
Sentinel and New API are expected to be deployed as two separate Docker Compose projects:
- The **Sentinel compose** contains `nginx`, `sentinel-app`, `redis`, and `postgres`.
- The **New API compose** contains your existing New API service and its own dependencies.
- The two stacks communicate through a **shared external Docker network**.
- The Sentinel compose contains `nginx`, `sentinel-app`, `redis`, and `postgres`
- The New API compose contains your existing New API service and its own dependencies
- The two stacks communicate over a shared external Docker network
Traffic flow:
Traffic flow:
```text
Client / SDK
Client / SDK
|
| request to Sentinel public endpoint
| request sent to Sentinel's public entrypoint
v
Sentinel nginx -> sentinel-app -> New API service -> model backend
Sentinel nginx -> sentinel-app -> New API service -> model backend
|
+-> redis / postgres
```
The key point is: **clients should call Sentinel, not call New API directly**, otherwise IP binding will not take effect.
The key point: clients must call Sentinel, not New API directly; otherwise IP binding will not take effect.
## Recommended Deployment Topology
## Getting the Real Client IP on Linux
Use one external network name for both compose projects. This repository currently uses:
If you want to record real LAN client IPs on a Linux deployment host, do not expose the public entrypoint through Docker bridge port publishing such as `3000:80`.
The recommended production topology is:
- `nginx` uses `network_mode: host`
- `nginx` listens directly on host port `3000`
- `sentinel-app` stays on the internal bridge network with a fixed IP
- `sentinel-app` also joins `shared_network` to reach New API
- `new-api` stays internally reachable and is no longer exposed to clients directly
The reasons for this design:
- When Docker publishes ports via `ports:`, the client-facing hop usually goes through NAT
- As a result, containers see a bridge address like `172.28.x.x` instead of the real client IP
- `shared_network` only carries internal traffic between Sentinel and New API; it does not determine the source address of inbound client connections
With `network_mode: host`, `nginx` receives real inbound connections on the host directly, so it can forward the real source IP to `sentinel-app` via `X-Real-IP`.
## Recommended Deployment Topology
Both compose projects use the same external network name. The current repository convention is:
```text
shared_network
```
In the Sentinel compose:
In the Sentinel compose:
- `sentinel-app` joins `shared_network`
- `nginx` exposes the public entrypoint
- `DOWNSTREAM_URL` points to the **New API service name on that shared network**
- `sentinel-app` joins `shared_network`
- `nginx` exposes the public entrypoint via the Linux host network
- `DOWNSTREAM_URL` points to the New API service name on `shared_network`
In the New API compose:
In the New API compose:
- The New API container must also join `shared_network`
- The New API service name must match what Sentinel uses in `DOWNSTREAM_URL`
- The New API container must also join `shared_network`
- The New API service name must match the hostname in Sentinel's `DOWNSTREAM_URL`
Example:
Example:
- New API compose service name: `new-api`
- New API internal container port: `3000`
- Sentinel `.env`: `DOWNSTREAM_URL=http://new-api:3000`
- New API compose service name: `new-api`
- New API internal container port: `3000`
- Sentinel `.env`: `DOWNSTREAM_URL=http://new-api:3000`
If your New API service is named differently, change `DOWNSTREAM_URL` accordingly, for example:
If your New API service is named differently, change `DOWNSTREAM_URL` accordingly, for example:
```text
DOWNSTREAM_URL=http://my-newapi:3000
```
## Common New API Connection Patterns
## Two Common Ways to Connect New API
In practice, you may run New API in either of these two ways.
In practice, New API is usually connected in one of two ways.
### Pattern A: Production machine, New API in its own compose
### Pattern A: production machine, New API in its own compose project
This is the recommended production arrangement.
This is the recommended production setup.
New API keeps its own compose project and typically joins:
New API keeps its own compose project and typically joins both:
- `default`
- `shared_network`
That means New API can continue to use its own internal compose network for its own dependencies, while also exposing its service name to Sentinel through `shared_network`.
This way, New API keeps using its own internal network for its dependencies while also exposing its service name to Sentinel through `shared_network`.
Example New API compose fragment:
Example New API compose fragment:
```yaml
services:
@@ -117,17 +137,17 @@ networks:
external: true
```
With this setup, Sentinel still uses:
In this case, Sentinel still uses:
```text
DOWNSTREAM_URL=http://new-api:3000
```
### Pattern B: Test machine, New API started as a standalone container
### Pattern B: test machine, New API started directly with `docker run`
On a test machine, you may not use a second compose project at all. Instead, you can start a standalone New API container with `docker run`, as long as that container also joins `shared_network`.
On a test machine you may not use a second compose project at all; you can start a standalone New API container with `docker run`, as long as it joins `shared_network`.
Example:
Example:
```bash
docker run -d \
@@ -136,84 +156,157 @@ docker run -d \
your-new-api-image
```
Important:
Note:
- The container name or reachable hostname must match what Sentinel uses in `DOWNSTREAM_URL`.
- If the container is not named `new-api`, then adjust `.env` accordingly.
- The port in `DOWNSTREAM_URL` is still the New API container's internal listening port.
- The container name or a resolvable hostname must match the hostname in Sentinel's `DOWNSTREAM_URL`
- If the container is not named `new-api`, update `.env` accordingly
- The port in `DOWNSTREAM_URL` should still be the container's internal listening port
Example:
Example:
```text
DOWNSTREAM_URL=http://new-api:3000
```
or, if your standalone container is named differently:
If the container is named differently:
```text
DOWNSTREAM_URL=http://new-api-test:3000
```
## Local Development
## Local Development
### Backend
### Backend
1. Install `uv` and ensure Python 3.13 is available.
2. Create the environment and sync dependencies:
1. Install `uv` and make sure Python 3.13 is available.
2. Create the virtual environment and sync dependencies:
```bash
uv sync
```
3. Copy `.env.example` to `.env` and update secrets plus addresses.
4. Start PostgreSQL and Redis.
5. Run the API:
3. Copy `.env.example` to `.env` and fill in the secrets and connection addresses.
4. Start PostgreSQL and Redis.
5. Start the API:
```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 7000
```
### Frontend
### Frontend
1. Install dependencies:
1. Install dependencies:
```bash
cd frontend
npm install
```
2. Start Vite dev server:
2. Start the Vite dev server:
```bash
npm run dev
```
The Vite config proxies `/admin/api/*` to `http://127.0.0.1:7000`.
The Vite dev proxy forwards `/admin/api/*` to `http://127.0.0.1:7000`.
If you prefer the repository root entrypoint, `uv run main.py` now starts the same FastAPI app on `APP_PORT` (default `7000`).
If you prefer starting from the repository root, you can also run `uv run main.py`, which starts the same FastAPI app on `APP_PORT` (default `7000`).
## Dependency Management
## Dependency Management
- Local Python development uses `uv` via [`pyproject.toml`](/d:/project/sentinel/pyproject.toml).
- Container builds still use [`requirements.txt`](/d:/project/sentinel/requirements.txt) because the Dockerfile is intentionally minimal and matches the delivery requirements.
- Local Python development dependencies are managed with `uv` via [`pyproject.toml`](/c:/project/sentinel/pyproject.toml)
- The container runtime image installs Python dependencies from [`requirements.txt`](/c:/project/sentinel/requirements.txt)
- Application source code is mounted at runtime by Compose, so offline machines do not need to rebuild the image for frequent backend code changes
## Production Deployment
## Offline Deployment Model
### 1. Create the shared Docker network
If your production machine has no internet access, the recommended way to use this repository is:
Create the external network once on the Docker host:
1. Build the `key-ip-sentinel:latest` image on an internet-connected machine
2. Export the image as a tar archive
3. Import the image on the offline machine
4. Copy the entire repository to the offline machine
5. Start with `docker compose up -d`, not `docker compose up --build -d`
This approach works because:
- The `Dockerfile` only installs Python dependencies
- `docker-compose.yml` mounts `./app` at runtime
- The offline machine only needs the pre-built image and the repository files
Limitations to keep in mind:
- If you change `requirements.txt`, you must rebuild and re-export the image on a connected machine
- If you only change backend code under `app/`, a rebuild is usually unnecessary; restarting the container is enough
- `frontend/dist` must be built ahead of time, because Nginx serves the frontend static files directly from the repository directory
- Public images such as `nginx:alpine`, `redis:7-alpine`, and `postgres:16` must also be prepared on the offline machine in advance
### Prepare images on a connected machine
Build and export the Sentinel runtime image:
```bash
docker build -t key-ip-sentinel:latest .
docker save -o key-ip-sentinel-latest.tar key-ip-sentinel:latest
```
If the offline machine cannot pull public images, export those as well:
```bash
docker pull nginx:alpine
docker pull redis:7-alpine
docker pull postgres:16
docker save -o sentinel-support-images.tar nginx:alpine redis:7-alpine postgres:16
```
If the admin console frontend has not been built yet, build it on the connected machine as well:
```bash
cd frontend
npm install
npm run build
cd ..
```
Then copy the following to the offline machine:
- The entire repository working directory
- `key-ip-sentinel-latest.tar`
- `sentinel-support-images.tar` (if needed)
### Import images on the offline machine
```bash
docker load -i key-ip-sentinel-latest.tar
docker load -i sentinel-support-images.tar
```
### Start on the offline machine
Once `.env`, `frontend/dist`, and `shared_network` are ready, run:
```bash
docker compose up -d
```
## Production Deployment
### 1. Create the shared Docker network
Create the external network once on the Docker host:
```bash
docker network create shared_network
```
Both compose projects must reference this exact same external network name.
Both compose projects must reference this exact external network name.
### 2. Make sure New API joins the shared network
### 2. Make sure New API joins the shared network
In the **New API** project, add the external network to the New API service.
In the New API project, add this external network to the New API service.
Minimal example:
Minimal example:
```yaml
services:
@@ -228,22 +321,22 @@ networks:
external: true
```
Important:
Important:
- `new-api` here is the **service name** that Sentinel will resolve on the shared network.
- The port in `DOWNSTREAM_URL` must be the **container internal port**, not the host published port.
- If New API already listens on `3000` inside the container, use `http://new-api:3000`.
- On a production host, New API can keep both `default` and `shared_network` at the same time.
- On a test host, you can skip a second compose project and use `docker run`, but the container must still join `shared_network`.
- `new-api` here is the service name that Sentinel resolves on the shared network
- The port in `DOWNSTREAM_URL` must be the container's internal listening port, not the host-mapped port
- If New API listens on `3000` inside the container, use `http://new-api:3000`
- On a production host, New API can join both `default` and `shared_network` at the same time
- On a test host, you can skip the second compose project and use `docker run`, but the container must still join `shared_network`
### 3. Prepare Sentinel environment
### 3. Prepare the Sentinel environment variables
1. Copy `.env.example` to `.env`.
2. Replace `SENTINEL_HMAC_SECRET`, `ADMIN_PASSWORD`, and `ADMIN_JWT_SECRET`.
3. Verify `DOWNSTREAM_URL` points to the New API **service name on `shared_network`**.
4. Keep `PG_DSN` aligned with the fixed PostgreSQL container password in `docker-compose.yml`, or update both together.
1. Copy `.env.example` to `.env`
2. Replace `SENTINEL_HMAC_SECRET`, `ADMIN_PASSWORD`, and `ADMIN_JWT_SECRET`
3. Confirm `DOWNSTREAM_URL` points to the New API service name on `shared_network`
4. Keep `PG_DSN` consistent with the PostgreSQL password in `docker-compose.yml`; if you change one, change both
Example `.env` for Sentinel:
Example `.env` for Sentinel:
```text
DOWNSTREAM_URL=http://new-api:3000
@@ -253,7 +346,7 @@ PG_DSN=postgresql+asyncpg://sentinel:password@postgres:5432/sentinel
SENTINEL_HMAC_SECRET=replace-with-a-random-32-byte-secret
ADMIN_PASSWORD=replace-with-a-strong-password
ADMIN_JWT_SECRET=replace-with-a-random-jwt-secret
TRUSTED_PROXY_IPS=172.24.0.0/16
TRUSTED_PROXY_IPS=172.30.0.0/24
SENTINEL_FAILSAFE_MODE=closed
APP_PORT=7000
ALERT_WEBHOOK_URL=
@@ -262,12 +355,13 @@ ALERT_THRESHOLD_SECONDS=300
ARCHIVE_DAYS=90
```
Notes:
Notes:
- `TRUSTED_PROXY_IPS` should match the Docker subnet used by the Sentinel internal network.
- If Docker recreates the compose network with a different subnet, update this value.
- `TRUSTED_PROXY_IPS` should match the subnet of Sentinel's internal bridge network; it is what makes the `nginx` hop trusted
- If Docker recreates the network with a different subnet, update this value accordingly
- The production compose in this repository pins `sentinel-net` to `172.30.0.0/24`, so the default should be `TRUSTED_PROXY_IPS=172.30.0.0/24`
### 4. Build the Sentinel frontend bundle
### 4. Build the Sentinel frontend bundle
```bash
cd frontend
@@ -276,74 +370,81 @@ npm run build
cd ..
```
This produces `frontend/dist`, which Nginx serves at `/admin/ui/`.
The build produces `frontend/dist`, which Nginx serves as the static site for `/admin/ui/`.
### 5. Confirm Sentinel compose prerequisites
If the target host is offline, run this step on a connected machine first and copy `frontend/dist` over as well.
- Build the frontend first. If `frontend/dist` is missing, `/admin/ui/` cannot be served by Nginx.
- Ensure the external Docker network `shared_network` already exists before starting Sentinel.
### 5. Confirm Sentinel compose prerequisites
### 6. Start the Sentinel stack
- The frontend must be built first; without `frontend/dist`, `/admin/ui/` cannot be served
- The external network `shared_network` must be created in advance
- If the host has no internet access, `key-ip-sentinel:latest`, `nginx:alpine`, `redis:7-alpine`, and `postgres:16` must be prepared beforehand
- This production compose assumes a Linux host, because the public entrypoint uses `network_mode: host`
### 6. Start the Sentinel stack
```bash
docker compose up --build -d
docker compose up -d
```
Services:
Use `docker compose up --build -d` only on a connected machine when you explicitly need to rebuild the image.
- `http://<host>/` forwards model API traffic through Sentinel.
- `http://<host>/admin/ui/` serves the admin console.
- `http://<host>/admin/api/*` serves the admin API.
- `http://<host>/health` exposes the app health check.
Service entrypoints:
### 7. Verify cross-compose connectivity
- `http://<host>:3000/`: model API traffic forwarded through Sentinel
- `http://<host>:3000/admin/ui/`: admin console frontend
- `http://<host>:3000/admin/api/*`: admin API
- `http://<host>:3000/health`: health check
After both compose stacks are running:
### 7. Verify cross-compose connectivity and the real client IP
1. Open `http://<host>:8016/health` and confirm it returns `{"status":"ok"}`.
2. Open `http://<host>:8016/admin/ui/` and log in with `ADMIN_PASSWORD`.
3. Send a real model API request to Sentinel, not to New API directly.
4. Check the `Bindings` page and confirm the token appears with a recorded binding rule.
Once both stacks are running:
Example test request:
1. From another LAN machine, open `http://<host>:3000/health` and confirm it returns `{"status":"ok"}`
2. Open `http://<host>:3000/admin/ui/` and log in with `ADMIN_PASSWORD`
3. Send a real model API request to Sentinel, not to New API directly
4. Check the `Bindings` page and confirm the token appears with a recorded binding rule
5. Confirm the recorded binding IP is the real LAN client IP, not a Docker bridge address
Example test request:
```bash
curl http://<host>:8016/v1/models \
curl http://<host>:3000/v1/models \
-H "Authorization: Bearer <your_api_key>"
```
If your client still points directly to New API, Sentinel will not see the request and no binding will be created.
If the client still calls New API directly, Sentinel never sees the traffic and no binding is created.
## Which Port Should Clients Use?
## Which Port Should Clients Use?
With the current example compose in this repository:
With the current Linux production compose in this repository:
- Sentinel public port: `8016`
- New API internal container port: usually `3000`
- Sentinel public port: `3000`
- New API internal container port: usually `3000`
That means:
That means:
- **For testing now**, clients should call `http://<host>:8016/...`
- **Sentinel forwards internally** to `http://new-api:3000`
- Clients should call `http://<host>:3000/...`
- Sentinel forwards internally to `http://new-api:3000`
Do **not** point clients at host port `3000` if that bypasses Sentinel.
Do not point clients directly at New API's host port; that would bypass Sentinel.
## How To Go Live Without Changing Client Config
## How To Go Live Without Changing Client Config
If you want existing clients to stay unchanged, Sentinel must take over the **original external entrypoint** that clients already use.
If you want existing client configuration to stay completely unchanged, Sentinel must take over the public address and port that clients already use.
Typical cutover strategy:
A typical cutover looks like this:
1. Keep New API on the shared internal Docker network.
2. Stop exposing New API directly to users.
3. Expose Sentinel on the old public host/port instead.
4. Keep `DOWNSTREAM_URL` pointing to the internal New API service on `shared_network`.
1. Keep New API running on the internal shared network
2. Stop exposing New API directly to end users
3. Expose Sentinel on the original public address and port
4. Keep `DOWNSTREAM_URL` pointing to the internal New API service on `shared_network`
For example, if users currently call `http://host:3000`, then in production you should eventually expose Sentinel on that old public port and make New API internal-only.
For example, if existing clients have been calling `http://host:3000`, the production cutover should let Sentinel take over that port `3000` and make New API internal-only.
The current `8016:80` mapping in [`docker-compose.yml`](/d:/project/sentinel/docker-compose.yml) is a **local test mapping**, not the only valid production setup.
The [`docker-compose.yml`](/c:/project/sentinel/docker-compose.yml) in this repository is already adjusted for this Linux production layout: Nginx listens on host port `3000` via the host network, and New API stays internal.
## Admin API Summary
## Admin API Summary
- `POST /admin/api/login`
- `GET /admin/api/dashboard`
@@ -357,20 +458,24 @@ The current `8016:80` mapping in [`docker-compose.yml`](/d:/project/sentinel/doc
- `GET /admin/api/settings`
- `PUT /admin/api/settings`
All admin endpoints except `/admin/api/login` require `Authorization: Bearer <jwt>`.
All admin endpoints except `/admin/api/login` require:
## Key Implementation Details
```text
Authorization: Bearer <jwt>
```
- `app/proxy/handler.py` keeps the downstream response fully streamed, including SSE responses.
- `app/core/ip_utils.py` never trusts client-supplied `X-Forwarded-For`.
- `app/services/binding_service.py` batches `last_used_at` updates every 5 seconds through an `asyncio.Queue`.
- `app/services/alert_service.py` pushes webhooks once the Redis counter reaches the configured threshold.
- `app/services/archive_service.py` prunes stale bindings on a scheduler interval.
## Key Implementation Details
## Suggested Smoke Checks
- `app/proxy/handler.py` streams the downstream response through in full, including SSE
- `app/core/ip_utils.py` does not trust the client-supplied `X-Forwarded-For`
- `app/services/binding_service.py` batches `last_used_at` updates every 5 seconds through an `asyncio.Queue`
- `app/services/alert_service.py` pushes a webhook once the Redis counter reaches the configured threshold
- `app/services/archive_service.py` archives stale bindings on a schedule
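The batched `last_used_at` flush mentioned above can be sketched as a periodic queue drain. This is a simplified illustration of the pattern, not the actual `binding_service` code; names like `batch_flush` are hypothetical:

```python
import asyncio

async def batch_flush(queue: asyncio.Queue, flush, interval: float = 5.0) -> None:
    """Every `interval` seconds, drain queued token digests and flush once."""
    while True:
        await asyncio.sleep(interval)
        pending = set()
        while not queue.empty():
            pending.add(queue.get_nowait())
        if pending:
            await flush(pending)

async def demo() -> list[str]:
    batches: list[set] = []

    async def record(batch: set) -> None:
        batches.append(batch)

    queue: asyncio.Queue = asyncio.Queue()
    for token in ("tok-a", "tok-b", "tok-a"):
        queue.put_nowait(token)  # duplicate updates collapse into one flush
    task = asyncio.create_task(batch_flush(queue, record, interval=0.01))
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sorted(batches[0])

print(asyncio.run(demo()))  # ['tok-a', 'tok-b']
```

Collapsing duplicate tokens into a set means each flush issues one write per token, regardless of how many requests arrived during the interval.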
1. `GET /health` returns `{"status":"ok"}`.
2. A first request with a new bearer token creates a binding in PostgreSQL and Redis.
3. A second request from the same IP is allowed and refreshes `last_used_at`.
4. A request from a different IP is rejected with `403` and creates an `intercept_logs` record, unless the binding rule is `all`.
5. `/admin/api/login` returns a JWT and the frontend can load `/admin/api/dashboard`.
## Suggested Smoke Checks
1. `GET /health` returns `{"status":"ok"}`
2. A first request with a new bearer token creates a binding in PostgreSQL and Redis
3. A second request from the same IP is allowed and refreshes `last_used_at`
4. A request from a different IP returns `403` and writes an `intercept_logs` record, unless the binding rule is `all`
5. `/admin/api/login` returns a JWT, and the frontend can load `/admin/api/dashboard`

View File

@@ -54,6 +54,7 @@ class Settings(BaseSettings):
admin_jwt_expire_hours: int = 8
archive_job_interval_minutes: int = 60
archive_batch_size: int = 500
archive_scheduler_lock_key: int = Field(default=2026030502, alias="ARCHIVE_SCHEDULER_LOCK_KEY")
metrics_ttl_days: int = 30
webhook_timeout_seconds: int = 5

View File

@@ -3,6 +3,7 @@ from __future__ import annotations
import hashlib
import hmac
from datetime import UTC, datetime, timedelta
from typing import Mapping
from fastapi import HTTPException, status
from jose import JWTError, jwt
@@ -34,6 +35,19 @@ def extract_bearer_token(authorization: str | None) -> str | None:
return token.strip()
def extract_request_token(headers: Mapping[str, str]) -> tuple[str | None, str | None]:
bearer_token = extract_bearer_token(headers.get("authorization"))
if bearer_token:
return bearer_token, "authorization"
for header_name in ("x-api-key", "api-key"):
header_value = headers.get(header_name)
if header_value and header_value.strip():
return header_value.strip(), header_name
return None, None
def verify_admin_password(password: str, settings: Settings) -> bool:
return hmac.compare_digest(password, settings.admin_password)

View File

@@ -14,7 +14,7 @@ from redis.asyncio import from_url as redis_from_url
from app.api import auth, bindings, dashboard, logs, settings as settings_api
from app.config import RUNTIME_SETTINGS_REDIS_KEY, RuntimeSettings, Settings, get_settings
from app.models import intercept_log, token_binding # noqa: F401
from app.models.db import close_db, ensure_schema_compatibility, get_session_factory, init_db
from app.models.db import close_db, ensure_schema_compatibility, get_engine, get_session_factory, init_db
from app.proxy.handler import router as proxy_router
from app.services.alert_service import AlertService
from app.services.archive_service import ArchiveService
@@ -70,6 +70,8 @@ def configure_logging() -> None:
root_logger.handlers.clear()
root_logger.addHandler(handler)
root_logger.setLevel(logging.INFO)
logging.getLogger("httpx").setLevel(logging.WARNING)
logging.getLogger("httpcore").setLevel(logging.WARNING)
configure_logging()
@@ -153,6 +155,7 @@ async def lifespan(app: FastAPI):
)
archive_service = ArchiveService(
settings=settings,
engine=get_engine(),
session_factory=session_factory,
binding_service=binding_service,
runtime_settings_getter=lambda: app.state.runtime_settings,

View File

@@ -6,6 +6,8 @@ from sqlalchemy.orm import DeclarativeBase
from app.config import Settings
SCHEMA_COMPATIBILITY_LOCK_KEY = 2026030501
class Base(DeclarativeBase):
pass
@@ -62,6 +64,10 @@ async def ensure_schema_compatibility() -> None:
"CREATE INDEX IF NOT EXISTS idx_token_bindings_ip ON token_bindings(bound_ip)",
]
async with engine.begin() as connection:
await connection.execute(
text("SELECT pg_advisory_xact_lock(:lock_key)"),
{"lock_key": SCHEMA_COMPATIBILITY_LOCK_KEY},
)
for statement in statements:
await connection.execute(text(statement))

View File

@@ -9,7 +9,7 @@ from fastapi.responses import JSONResponse, StreamingResponse
from app.config import Settings
from app.core.ip_utils import extract_client_ip
from app.core.security import extract_bearer_token
from app.core.security import extract_request_token
from app.dependencies import get_alert_service, get_binding_service, get_settings
from app.services.alert_service import AlertService
from app.services.binding_service import BindingService
@@ -56,7 +56,7 @@ async def reverse_proxy(
alert_service: AlertService = Depends(get_alert_service),
):
client_ip = extract_client_ip(request, settings)
token = extract_bearer_token(request.headers.get("authorization"))
token, token_source = extract_request_token(request.headers)
if token:
binding_result = await binding_service.evaluate_token_binding(token, client_ip)
@@ -75,6 +75,7 @@ async def reverse_proxy(
status_code=binding_result.status_code,
content={"detail": binding_result.detail},
)
logger.debug("Token binding check passed.", extra={"client_ip": client_ip, "token_source": token_source})
else:
await binding_service.increment_request_metric("allowed")

View File

@@ -5,9 +5,9 @@ from datetime import UTC, datetime, timedelta
from typing import Callable
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from sqlalchemy import delete, select
from sqlalchemy import delete, select, text
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker
from sqlalchemy.ext.asyncio import AsyncConnection, AsyncEngine, AsyncSession, async_sessionmaker
from app.config import RuntimeSettings, Settings
from app.models.token_binding import TokenBinding
@@ -20,33 +20,45 @@ class ArchiveService:
def __init__(
self,
settings: Settings,
engine: AsyncEngine,
session_factory: async_sessionmaker[AsyncSession],
binding_service: BindingService,
runtime_settings_getter: Callable[[], RuntimeSettings],
) -> None:
self.settings = settings
self.engine = engine
self.session_factory = session_factory
self.binding_service = binding_service
self.runtime_settings_getter = runtime_settings_getter
self.scheduler = AsyncIOScheduler(timezone="UTC")
self._leader_connection: AsyncConnection | None = None
async def start(self) -> None:
if self.scheduler.running:
return
self.scheduler.add_job(
self.archive_inactive_bindings,
trigger="interval",
minutes=self.settings.archive_job_interval_minutes,
id="archive-inactive-bindings",
replace_existing=True,
max_instances=1,
coalesce=True,
)
self.scheduler.start()
if not await self._acquire_leader_lock():
logger.info("Archive scheduler leader lock not acquired; skipping local scheduler start.")
return
try:
self.scheduler.add_job(
self.archive_inactive_bindings,
trigger="interval",
minutes=self.settings.archive_job_interval_minutes,
id="archive-inactive-bindings",
replace_existing=True,
max_instances=1,
coalesce=True,
)
self.scheduler.start()
except Exception:
await self._release_leader_lock()
raise
logger.info("Archive scheduler started on current worker.")
async def stop(self) -> None:
if self.scheduler.running:
self.scheduler.shutdown(wait=False)
await self._release_leader_lock()
async def archive_inactive_bindings(self) -> int:
runtime_settings = self.runtime_settings_getter()
@@ -82,3 +94,43 @@ class ArchiveService:
if total_archived:
logger.info("Archived inactive bindings.", extra={"count": total_archived})
return total_archived
async def _acquire_leader_lock(self) -> bool:
if self._leader_connection is not None:
return True
connection = await self.engine.connect()
try:
acquired = bool(
await connection.scalar(
text("SELECT pg_try_advisory_lock(:lock_key)"),
{"lock_key": self.settings.archive_scheduler_lock_key},
)
)
except Exception:
await connection.close()
logger.exception("Failed to acquire archive scheduler leader lock.")
return False
if not acquired:
await connection.close()
return False
self._leader_connection = connection
return True
async def _release_leader_lock(self) -> None:
if self._leader_connection is None:
return
connection = self._leader_connection
self._leader_connection = None
try:
await connection.execute(
text("SELECT pg_advisory_unlock(:lock_key)"),
{"lock_key": self.settings.archive_scheduler_lock_key},
)
except Exception:
logger.warning("Failed to release archive scheduler leader lock cleanly.")
finally:
await connection.close()

View File

@@ -2,32 +2,29 @@ services:
nginx:
image: nginx:alpine
container_name: sentinel-nginx
network_mode: host
restart: unless-stopped
ports:
- "8016:80"
depends_on:
- sentinel-app
volumes:
- ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
- ./frontend/dist:/etc/nginx/html/admin/ui:ro
networks:
- sentinel-net
sentinel-app:
build:
context: .
dockerfile: Dockerfile
image: key-ip-sentinel:latest
container_name: sentinel-app
restart: unless-stopped
env_file:
- .env
volumes:
- ./app:/app/app:ro
depends_on:
- redis
- postgres
networks:
- sentinel-net
- shared_network
sentinel-net:
ipv4_address: 172.30.0.10
shared_network:
redis:
image: redis:7-alpine
@@ -67,5 +64,8 @@ volumes:
networks:
sentinel-net:
driver: bridge
ipam:
config:
- subnet: 172.30.0.0/24
shared_network:
external: true

View File

@@ -1,4 +1,4 @@
worker_processes auto;
worker_processes 8;
events {
worker_connections 4096;
@@ -17,12 +17,12 @@ http {
limit_req_zone $binary_remote_addr zone=api:10m rate=60r/m;
upstream sentinel_app {
server sentinel-app:7000;
server 172.30.0.10:7000;
keepalive 128;
}
server {
listen 80;
listen 3000;
server_name _;
client_max_body_size 32m;
@@ -51,7 +51,7 @@ http {
proxy_pass http://sentinel_app;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto http;
proxy_set_header Connection "";
}
@@ -60,7 +60,7 @@ http {
proxy_pass http://sentinel_app/health;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto http;
}
@@ -69,7 +69,7 @@ http {
proxy_pass http://sentinel_app;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto http;
proxy_set_header Connection "";
}