# Edge GitOps - KServe on k3s with GPU

GitOps setup for deploying ML models using KServe on a k3s cluster with GPU support (DGX Spark).

## Prerequisites

- k3s cluster with GPU support
- kubectl configured to access the cluster
- Gitea instance for the GitOps repository
- FluxCD CLI installed

## Architecture

```
edge-gitops/
├── clusters/
│   └── k3s-dgx/
│       ├── flux-system/   # FluxCD installation
│       ├── gpu-support/   # NVIDIA GPU Operator
│       ├── kserve/        # KServe installation
│       └── apps/          # ML model deployments
├── apps/                  # Reusable app manifests
└── infrastructure/        # Base infrastructure
```

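The cluster directory is wired together with Kustomize; a minimal sketch of a `clusters/k3s-dgx/kustomization.yaml` that includes the subdirectories above (assuming each subdirectory carries its own `kustomization.yaml`):

```yaml
# clusters/k3s-dgx/kustomization.yaml (sketch; adjust to the actual layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - flux-system
  - gpu-support
  - kserve
  - apps
```
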
## Setup Instructions

### 1. Bootstrap FluxCD

```bash
flux bootstrap git \
  --url=ssh://git@gitea.example.com/edge-gitops/edge-gitops.git \
  --branch=main \
  --path=clusters/k3s-dgx \
  --components=source-controller,kustomize-controller,helm-controller,notification-controller
```

### 2. Configure Gitea SSH Key

Generate an SSH key for FluxCD:

```bash
ssh-keygen -t ed25519 -N "" -f flux-gitea-key
```

Add the public key to your Gitea repository as a deploy key with write access (the bootstrap process pushes the flux-system manifests back to the repository).

### 3. Update Repository Configuration

Edit `clusters/k3s-dgx/flux-system/gotk-sync.yaml` to match your Gitea URL:

```yaml
url: ssh://git@your-gitea-instance.com/edge-gitops/edge-gitops.git
```

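For context, the `url` field belongs to the `GitRepository` object that `gotk-sync.yaml` defines, paired with a `Kustomization` pointing at the cluster path. A trimmed-down sketch (object names follow FluxCD bootstrap defaults; the intervals are illustrative):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m
  url: ssh://git@your-gitea-instance.com/edge-gitops/edge-gitops.git
  ref:
    branch: main
  secretRef:
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/k3s-dgx
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```
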
### 4. Deploy the Stack

Commit and push the changes:

```bash
git add .
git commit -m "Initial GitOps setup for KServe on k3s"
git push origin main
```

FluxCD will automatically sync the changes to your cluster.

## Components

### GPU Support

- NVIDIA GPU Operator (v23.9.1)
- NVIDIA Device Plugin
- DCGM Exporter for monitoring
- GPU Node Feature Discovery

### KServe

- KServe Core (v0.12.0)
- GPU-enabled serving runtime
- Istio Gateway for networking
- Model storage (PVC)

### Example Model

- Huihui-granite-4.1-30b-abliterated (Hugging Face)
- GPU-accelerated inference
- REST API endpoint

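Under GitOps, the GPU Operator listed above would typically be installed declaratively rather than with `helm install`. A sketch of what the `gpu-support/` manifests might contain (the repository URL is NVIDIA's public Helm repo; the namespace and intervals are assumptions):

```yaml
# clusters/k3s-dgx/gpu-support/ (sketch)
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: nvidia
  namespace: gpu-operator
spec:
  interval: 1h
  url: https://helm.ngc.nvidia.com/nvidia
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: gpu-operator
  namespace: gpu-operator
spec:
  interval: 1h
  chart:
    spec:
      chart: gpu-operator
      version: v23.9.1
      sourceRef:
        kind: HelmRepository
        name: nvidia
```
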
## Usage

### Deploy a New Model

1. Create a new InferenceService in `clusters/k3s-dgx/apps/`:

   ```yaml
   apiVersion: serving.kserve.io/v1beta1
   kind: InferenceService
   metadata:
     name: your-model
     namespace: kserve
   spec:
     predictor:
       model:
         modelFormat:
           name: huggingface
         storageUri: "hf://your-org/your-model"
         resources:
           limits:
             nvidia.com/gpu: "1"
   ```

2. Commit and push the changes.

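If the model weights are staged on the cluster's model-storage PVC rather than pulled from Hugging Face at startup, KServe's `pvc://` scheme can be used in `storageUri` instead (the claim name `model-storage` and the path are assumptions based on this repo's PVC manifest):

```yaml
# Hypothetical alternative: serve from the in-cluster PVC instead of hf://
storageUri: "pvc://model-storage/your-model"
```
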
### Test the Model

```bash
# Get the service URL
kubectl get inferenceservice huihui-granite -n kserve

# Test inference (v2 inference protocol, matching the request body)
curl -X POST http://your-service-url/v2/models/huihui-granite/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "text", "shape": [1], "datatype": "BYTES", "data": ["Hello world"]}]}'
```

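The request body is strict JSON, so a stray quote fails server-side with an opaque error; a quick local sanity check before sending (assumes `python3` on the workstation):

```shell
# Validate the inference payload locally before POSTing it.
REQUEST='{"inputs": [{"name": "text", "shape": [1], "datatype": "BYTES", "data": ["Hello world"]}]}'
echo "$REQUEST" | python3 -m json.tool > /dev/null && echo "payload ok"
```
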
## Monitoring

Check FluxCD status:

```bash
flux get all --all-namespaces
```

Check GPU status:

```bash
kubectl get nodes -o jsonpath='{.items[*].status.allocatable.nvidia\.com/gpu}'
```

Check KServe services:

```bash
kubectl get inferenceservices -n kserve
```

## Troubleshooting

### GPU Not Available

```bash
kubectl describe node | grep -A 5 nvidia.com/gpu
```

### KServe Pods Not Starting

```bash
kubectl logs -n kserve deployment/kserve-controller-manager
kubectl get pods -n kserve
```

### FluxCD Sync Issues

```bash
flux reconcile kustomization flux-system --with-source
flux logs
```

## Customization

### GPU Resources

Edit `clusters/k3s-dgx/apps/huihui-granite-inference.yaml` to adjust GPU allocation.

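As a sketch, requesting two GPUs for the predictor would mean changing the resource limits in that file (the surrounding fields follow the InferenceService example in the Usage section):

```yaml
resources:
  limits:
    nvidia.com/gpu: "2"
```
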

### Storage

Modify `clusters/k3s-dgx/kserve/model-storage-pvc.yaml` for different storage requirements.


### Networking

Update `clusters/k3s-dgx/kserve/istio-gateway.yaml` for custom ingress configuration.