Envoy ExtProc Integration
The Semantic Router leverages Envoy's External Processing (ExtProc) filter to implement intelligent routing decisions. This integration provides a clean separation between traffic management (Envoy) and business logic (Semantic Router), enabling sophisticated routing capabilities while maintaining high performance.
Understanding Envoy ExtProc
What is ExtProc?
External Processing (ExtProc) is an Envoy filter that allows external services to participate in request and response processing. Unlike other extension mechanisms, ExtProc provides:
- Streaming Processing: Handle requests and responses as they flow through Envoy
- Full Control: Modify headers, body, and routing decisions
- Low Latency: Optimized gRPC communication between Envoy and external services
- Fault Tolerance: Built-in failure handling and timeout management
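The streaming exchange described above can be sketched as a receive/reply loop. This is a minimal sketch, not the real API: `Message`, `Reply`, and `Stream` are simplified stand-ins for the ExtProc protobuf messages and the bidirectional gRPC stream, and `pickModel` and the `x-selected-model` header are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Phase identifies which part of the HTTP exchange a message carries.
// Simplified stand-in for the variants of an ExtProc processing request.
type Phase int

const (
	RequestHeaders Phase = iota
	RequestBody
	ResponseHeaders
	ResponseBody
)

// Message is a hypothetical, simplified processing request from Envoy.
type Message struct {
	Phase   Phase
	Headers map[string]string
	Body    []byte
}

// Reply is a hypothetical, simplified processing response to Envoy.
type Reply struct {
	Phase      Phase
	SetHeaders map[string]string // header mutations Envoy should apply
}

// Stream mimics the Recv/Send shape of a bidirectional gRPC stream.
type Stream interface {
	Recv() (*Message, error)
	Send(*Reply) error
}

// Process answers every message until the peer closes the stream.
func Process(s Stream) error {
	for {
		msg, err := s.Recv()
		if errors.Is(err, io.EOF) {
			return nil // the exchange is complete
		}
		if err != nil {
			return err
		}
		reply := &Reply{Phase: msg.Phase}
		if msg.Phase == RequestBody {
			// Placeholder for a routing decision based on the body.
			reply.SetHeaders = map[string]string{"x-selected-model": pickModel(msg.Body)}
		}
		if err := s.Send(reply); err != nil {
			return err
		}
	}
}

// pickModel is a toy stand-in for real classification logic.
func pickModel(body []byte) string {
	if len(body) > 32 {
		return "large-model"
	}
	return "small-model"
}

// fakeStream replays a fixed list of messages and records the replies.
type fakeStream struct {
	in  []*Message
	out []*Reply
}

func (f *fakeStream) Recv() (*Message, error) {
	if len(f.in) == 0 {
		return nil, io.EOF
	}
	m := f.in[0]
	f.in = f.in[1:]
	return m, nil
}

func (f *fakeStream) Send(r *Reply) error {
	f.out = append(f.out, r)
	return nil
}

func main() {
	s := &fakeStream{in: []*Message{
		{Phase: RequestHeaders, Headers: map[string]string{":path": "/v1/chat/completions"}},
		{Phase: RequestBody, Body: []byte(`{"model":"auto"}`)},
	}}
	if err := Process(s); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range s.out {
		fmt.Println(r.Phase, r.SetHeaders)
	}
}
```

Each message gets exactly one reply, which is what lets the processor change headers or body at the phase where the information becomes available.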
ExtProc vs Other Extension Methods
| Extension Method | Use Case | Latency | Flexibility | Complexity |
|---|---|---|---|---|
| HTTP Filters | Simple transformations | Lowest | Limited | Low |
| WebAssembly (WASM) | Sandboxed logic | Low | Medium | Medium |
| ExtProc | Complex business logic | Medium | High | Medium |
| HTTP Callouts | External API calls | High | High | High |
Why ExtProc for Semantic Routing?
- Complex ML Models: Needs the full Python/Go ecosystem for BERT models
- Dynamic Decision Making: Requires sophisticated classification logic
- State Management: Needs caching and request tracking
- Observability: Requires comprehensive metrics and logging
ExtProc Protocol Architecture
Communication Flow
Processing Modes
ExtProc can be configured to process different parts of the request/response lifecycle:
# Envoy ExtProc Configuration
processing_mode:
  request_header_mode: "SEND"      # Process request headers
  response_header_mode: "SEND"     # Process response headers
  request_body_mode: "BUFFERED"    # Process entire request body
  response_body_mode: "BUFFERED"   # Process entire response body
  request_trailer_mode: "SKIP"     # Skip request trailers
  response_trailer_mode: "SKIP"    # Skip response trailers
Mode Options:
- SKIP: Don't send to ExtProc (fastest); applies to headers and trailers, while body modes use NONE for the same effect
- SEND: Send headers/trailers only
- BUFFERED: Send entire body (required for content analysis)
- STREAMED: Send body in chunks (for streaming)
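To make the effect of these modes concrete, here is a small sketch that models a processing mode and lists which phases would reach the external processor. The types below are simplified stand-ins written for this example, not the go-control-plane types, and `SentPhases` is a hypothetical helper. The sketch assumes body modes are switched off with `NONE`, as in Envoy's body mode enum.

```go
package main

import "fmt"

// HeaderMode and BodyMode are simplified stand-ins for Envoy's
// processing mode enums (illustrative only).
type HeaderMode string
type BodyMode string

const (
	HeaderSend HeaderMode = "SEND"
	HeaderSkip HeaderMode = "SKIP"

	BodyNone     BodyMode = "NONE"
	BodyBuffered BodyMode = "BUFFERED"
	BodyStreamed BodyMode = "STREAMED"
)

// ProcessingMode mirrors the header and body fields of the YAML config.
type ProcessingMode struct {
	RequestHeaders  HeaderMode
	RequestBody     BodyMode
	ResponseHeaders HeaderMode
	ResponseBody    BodyMode
}

// sendsBody reports whether a body mode forwards the body at all.
func sendsBody(b BodyMode) bool {
	return b == BodyBuffered || b == BodyStreamed
}

// SentPhases lists, in lifecycle order, the phases forwarded to the processor.
func (m ProcessingMode) SentPhases() []string {
	var phases []string
	if m.RequestHeaders == HeaderSend {
		phases = append(phases, "request_headers")
	}
	if sendsBody(m.RequestBody) {
		phases = append(phases, "request_body")
	}
	if m.ResponseHeaders == HeaderSend {
		phases = append(phases, "response_headers")
	}
	if sendsBody(m.ResponseBody) {
		phases = append(phases, "response_body")
	}
	return phases
}

func main() {
	// Same header and body settings as the configuration above.
	mode := ProcessingMode{
		RequestHeaders:  HeaderSend,
		RequestBody:     BodyBuffered,
		ResponseHeaders: HeaderSend,
		ResponseBody:    BodyBuffered,
	}
	fmt.Println(mode.SentPhases())
}
```

Every phase that is forwarded adds a round trip to the processor, so skipping phases the router does not inspect is the simplest way to reduce overhead.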
Semantic Router ExtProc Implementation
Go Implementation Structure
// Main ExtProc Server
type Server struct {
	router *OpenAIRouter
	server *grpc.Server
	port   int
}

// Router implements the ExtProc service interface
type OpenAIRouter struct {
	Config               *config.RouterConfig
	CategoryDescriptions []string
	Classifier           *classification.Classifier
	PIIChecker           *pii.PolicyChecker
	Cache                *cache.SemanticCache
	ToolsDatabase        *tools.ToolsDatabase

	pendingRequests     map[string][]byte
	pendingRequestsLock sync.Mutex // guards pendingRequests
}

// Compile-time check that OpenAIRouter implements the ExtProc service interface
var _ ext_proc.ExternalProcessorServer = &OpenAIRouter{}
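The `pendingRequests` map and its mutex suggest a request-tracking pattern along the following lines. This is a sketch under assumptions: the source does not show how the router reads or writes the map, so `pendingStore`, `Put`, and `Take` are hypothetical names, and the idea of storing a request body by request ID until a later phase needs it is one plausible use, not the confirmed behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// pendingStore is a hypothetical helper that pairs a map of request
// bodies with the mutex guarding it, like the two fields in OpenAIRouter.
type pendingStore struct {
	mu       sync.Mutex
	requests map[string][]byte
}

func newPendingStore() *pendingStore {
	return &pendingStore{requests: make(map[string][]byte)}
}

// Put records the body for a request ID. The body is copied so later
// changes to the caller's buffer do not affect the stored bytes.
func (p *pendingStore) Put(id string, body []byte) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.requests[id] = append([]byte(nil), body...)
}

// Take returns the stored body and removes it, so entries do not
// accumulate after a request completes.
func (p *pendingStore) Take(id string) ([]byte, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	body, ok := p.requests[id]
	delete(p.requests, id)
	return body, ok
}

func main() {
	store := newPendingStore()

	// Concurrent streams can record bodies safely.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			store.Put(fmt.Sprintf("req-%d", i), []byte("body"))
		}(i)
	}
	wg.Wait()

	body, ok := store.Take("req-2")
	fmt.Println(string(body), ok)
}
```

A mutex is needed because each ExtProc stream runs on its own goroutine in a gRPC server, and Go maps are not safe for concurrent writes.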