Commit

fix: improve documentation and configuration - Update circuit breaker defaults, align metric names, improve docs clarity
teilomillet committed Jan 2, 2025
1 parent fbc68c9 commit aedb65d
Showing 3 changed files with 99 additions and 92 deletions.
175 changes: 98 additions & 77 deletions README.md
@@ -1,109 +1,130 @@
# Hapax: AI Infrastructure

Hapax is a production-ready AI infrastructure layer that ensures uninterrupted AI operations through intelligent provider management and automatic failover. Named after the Greek word ἅπαξ (meaning "once"), it embodies our core promise: configure once, then let it seamlessly manage your AI infrastructure.

## Common AI Infrastructure Challenges

Organizations face several critical challenges in managing AI infrastructure. Service disruptions from provider outages create direct revenue impact, and engineering teams dedicate significant resources to managing multiple AI providers. Visibility into AI usage across departments is limited, compounded by integration requirements that differ from provider to provider.

## Core Capabilities

Hapax delivers a robust infrastructure layer through three core capabilities:

### Intelligent Provider Management
The system ensures continuous service through real-time health monitoring with configurable timeouts and check intervals. Automatic failover between providers maintains zero downtime, while a sophisticated three-state circuit breaker (closed, half-open, open) with configurable thresholds prevents cascade failures. Request deduplication using the singleflight pattern optimizes resource utilization.
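
To make the three circuit breaker states concrete, here is a minimal, self-contained Go sketch of the closed → open → half-open cycle. It is illustrative only; the names and thresholds are assumptions, and it does not mirror Hapax's actual implementation, which is tuned through the `circuitBreaker` configuration shown later.

```go
package main

import (
    "errors"
    "fmt"
    "time"
)

type breakerState int

const (
    stateClosed   breakerState = iota // requests flow normally
    stateOpen                         // requests are rejected immediately
    stateHalfOpen                     // a trial request is allowed through
)

// circuitBreaker is a deliberately simplified three-state breaker.
type circuitBreaker struct {
    state     breakerState
    failures  int
    threshold int           // consecutive failures before opening
    cooldown  time.Duration // how long to stay open before probing again
    openedAt  time.Time
}

// call runs fn through the breaker and updates its state.
func (cb *circuitBreaker) call(fn func() error) error {
    if cb.state == stateOpen {
        if time.Since(cb.openedAt) < cb.cooldown {
            return errors.New("circuit open: request rejected")
        }
        cb.state = stateHalfOpen // cooldown elapsed: let one probe through
    }

    if err := fn(); err != nil {
        cb.failures++
        if cb.state == stateHalfOpen || cb.failures >= cb.threshold {
            cb.state = stateOpen
            cb.openedAt = time.Now()
        }
        return err
    }

    // Success: reset the failure count and close the circuit.
    cb.failures = 0
    cb.state = stateClosed
    return nil
}

func main() {
    cb := &circuitBreaker{threshold: 3, cooldown: 30 * time.Second}
    for i := 0; i < 4; i++ {
        err := cb.call(func() error { return errors.New("provider timeout") })
        fmt.Printf("attempt %d: state=%v err=%v\n", i+1, cb.state, err)
    }
}
```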

### Production-Ready Architecture
The architecture prioritizes reliability through high-performance request routing and load balancing. Comprehensive error handling and request validation ensure data integrity, while structured logging with request tracing enables detailed debugging. Configurable timeout and rate limiting mechanisms protect system resources.

### Security & Monitoring
Security is foundational, implemented through API key-based authentication and comprehensive request validation and sanitization. The monitoring system provides granular usage tracking per endpoint and detailed request logging for operational visibility.

### Real-World Flexibility in Action

Imagine you're running a production service on OpenAI's GPT models and you suddenly want to:
- Add a new Anthropic Claude model endpoint
- Create a fallback strategy
- Implement detailed monitoring

With Hapax, this becomes simple:

```yaml
# Simply append to your existing configuration
providers:
  anthropic:
    type: anthropic
    models:
      claude-3.5-haiku:
        api_key: ${ANTHROPIC_API_KEY}
        endpoint: /v1/anthropic/haiku
```
No downtime. No complex redeployment. Just configuration.

## Usage Tracking & Monitoring

Hapax provides built-in monitoring capabilities through Prometheus integration, offering comprehensive visibility into your AI infrastructure:

### Request Tracking
Monitor API usage through versioned endpoints:
```bash
# Standard endpoint structure
/v1/completions   # AI completion requests
/health           # Global system health status
/v1/health        # Versioned API health status
/metrics          # Prometheus metrics
```
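
For example, with Hapax listening on the default port 8080 used in the deployment examples below, the health and metrics endpoints can be probed directly (illustrative commands):

```bash
# Global and versioned health status
curl -s http://localhost:8080/health
curl -s http://localhost:8080/v1/health

# Prometheus metrics exposition
curl -s http://localhost:8080/metrics | head
```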

### Prometheus Integration
The monitoring system tracks essential metrics including request counts and status by endpoint, request latencies, active request volume, error rates by provider, and circuit breaker states. Health check performance metrics and request deduplication statistics provide deep insights into system efficiency.

Each metric is designed for operational visibility:
- `hapax_http_requests_total` tracks request volume by endpoint and status
- `hapax_http_request_duration_seconds` measures request latency
- `hapax_http_active_requests` shows current load by endpoint
- `hapax_errors_total` monitors error rates by type
- `circuit_breaker_state` indicates provider health status
- `hapax_health_check_duration_seconds` validates provider responsiveness
- `hapax_deduplicated_requests_total` confirms request efficiency
- `hapax_rate_limit_hits_total` tracks rate limiting by client
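
As an illustration, once Prometheus scrapes the `/metrics` endpoint these series can be queried directly. The expressions below are sketches based only on the metric names above; the label names (`type`, `le`) and the histogram layout of the duration metric are assumptions:

```promql
# Approximate p95 request latency over the last 5 minutes
# (assumes hapax_http_request_duration_seconds is exposed as a histogram)
histogram_quantile(0.95, sum by (le) (rate(hapax_http_request_duration_seconds_bucket[5m])))

# Error rate by type over the last 5 minutes
sum by (type) (rate(hapax_errors_total[5m]))

# Current circuit breaker state per provider
circuit_breaker_state
```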

### Access Management
Security is enforced through API key-based authentication, with per-endpoint rate limiting and comprehensive request validation and sanitization.

## Technical Implementation

Example completion request to `/v1/completions`:

```json
{
  "messages": [
    {"role": "system", "content": "You are a customer service assistant."},
    {"role": "user", "content": "I need help with my order #12345"}
  ]
}
```
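
Such a request can be sent with curl. The Authorization header below is an assumption for illustration; the exact authentication scheme depends on your configuration:

```bash
# Illustrative request; adjust host, port, and authentication to your deployment.
curl -s http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HAPAX_API_KEY" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a customer service assistant."},
      {"role": "user", "content": "I need help with my order #12345"}
    ]
  }'
```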

When your primary provider experiences issues, Hapax:
1. Detects the failure through continuous health checks (1-minute intervals)
2. Activates the circuit breaker after 3 consecutive failures
3. Routes traffic to healthy backup providers in preference order
4. Maintains detailed metrics for operational visibility
5. Reconnects to the primary provider once health checks confirm it is back online, restoring optimal resource utilization

## Deployment Options

Deploy Hapax in minutes with our production-ready container:
```bash
docker run -p 8080:8080 \
-e OPENAI_API_KEY=your_key \
-e ANTHROPIC_API_KEY=your_key \
-e CONFIG_PATH=/app/config.yaml \
teilomillet/hapax:latest
```
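
The same container can also be run through Docker Compose. This is a sketch equivalent to the command above; the service name and the optional config mount are illustrative:

```yaml
# docker-compose.yml (illustrative sketch)
services:
  hapax:
    image: teilomillet/hapax:latest
    ports:
      - "8080:8080"
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      CONFIG_PATH: /app/config.yaml
    # Optional: mount a customized config.yaml (see the configuration example below)
    # volumes:
    #   - ./config.yaml:/app/config.yaml
```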

Default configuration is provided but can be customized via `config.yaml`. Each endpoint can have its own configuration and API version, allowing granular control and a smooth evolutionary path for your services:
```yaml
routes:
  - path: /v1/completions
    handler: completion
    version: v1
  - path: /v2/completions
    handler: advanced_completion
    version: v2

circuitBreaker:
  maxRequests: 100
  interval: 30s
  timeout: 10s
  failureThreshold: 5

providerPreference:
  - ollama
  - anthropic
  - openai
```
## Getting Started

```bash
# Pull and start Hapax with default configuration
docker run -p 8080:8080 -e ANTHROPIC_API_KEY=your_api_key teilomillet/hapax:latest

# Or, to use a custom configuration with environment variables:
# 1. Extract the default configuration
docker run --rm teilomillet/hapax:latest cat /app/config.yaml > config.yaml

# 2. Create a .env file to store your environment variables
echo "ANTHROPIC_API_KEY=your_api_key" > .env

# 3. Start Hapax with your configuration and environment variables
docker run -p 8080:8080 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  --env-file .env \
  teilomillet/hapax:latest
```

## Integration Architecture

Hapax provides comprehensive integration capabilities through multiple components:

### REST API with Versioned Endpoints
The API architecture provides dedicated endpoints for core functionalities:
- `/v1/completions` handles AI completions
- `/v1/health` provides versioned API health monitoring
- `/health` offers global system health status
- `/metrics` exposes Prometheus metrics for comprehensive monitoring

### Comprehensive Monitoring
The monitoring infrastructure integrates Prometheus metrics across all critical components, enabling detailed tracking of request latencies, circuit breaker states, provider health status, and request deduplication. This comprehensive approach ensures complete operational visibility.
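
To pull these metrics into an existing Prometheus server, a scrape job along the following lines can be added; the target address assumes the default port 8080:

```yaml
# prometheus.yml excerpt (illustrative)
scrape_configs:
  - job_name: hapax
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:8080"]
```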
### Health Checks
The health monitoring system operates with enterprise-grade configurability. Check intervals default to one minute with adjustable timeouts, while failure thresholds are tuned to prevent false positives. Health monitoring extends from individual providers to Docker container status, with granular per-provider health tracking.
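
In configuration terms, this might look like the sketch below. The key names are assumptions for illustration only; consult the default `config.yaml` extracted in the Getting Started section for the exact schema:

```yaml
# Illustrative sketch; actual keys may differ in the shipped config.yaml
healthCheck:
  interval: 1m        # default check interval
  timeout: 5s         # adjustable per-check timeout
  failureThreshold: 3 # consecutive failures before a provider is marked unhealthy
```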

### Production Safeguards
System integrity is maintained through multiple safeguards: request deduplication prevents redundant processing, automatic failover ensures continuous operation, circuit breaker patterns protect against cascade failures, and structured JSON logging with correlation IDs enables thorough debugging.
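
Because logs are structured JSON with correlation IDs, a single request can be traced with standard tooling. The container name and field name below are assumptions:

```bash
# Follow one request through the logs (field name "correlation_id" is illustrative)
docker logs hapax 2>&1 | jq -c 'select(.correlation_id == "req-1234")'
```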

## Technical Requirements

Running Hapax requires a Docker-compatible environment with network access to AI providers. The system operates efficiently with 1 GB of RAM, though 4 GB is recommended for production deployments. Access credentials (API keys) are required for the supported providers, such as OpenAI and Anthropic.

## Documentation

Comprehensive documentation is available through multiple resources. The [Quick Start Guide](https://github.com/teilomillet/hapax/wiki) provides initial setup instructions, while detailed information about the API and security measures can be found in the [API Documentation](docs/api.md) and [Security Overview](docs/security.md). For operational insights, consult the [Monitoring Guide](docs/monitoring.md).

## Community & Support

- **Discussions**: [GitHub Discussions](https://github.com/teilomillet/hapax/discussions)
- **Documentation**: [Hapax Wiki](https://github.com/teilomillet/hapax/wiki)
- **Issues**: [GitHub Issues](https://github.com/teilomillet/hapax/issues)
## License

Licensed under Apache 2.0. See [LICENSE](LICENSE) for details.

## Our Vision

We believe LLM infrastructure should be simple, reliable, and adaptable. Hapax represents our commitment to making LLM integration accessible and powerful.

---

For detailed technical specifications, visit our [Technical Documentation](docs/technical.md).
2 changes: 1 addition & 1 deletion cmd/hapax/main.go
@@ -20,7 +20,7 @@ var (
    version = flag.Bool("version", false, "Print version and exit")
)

const Version = "v0.0.20"
const Version = "v0.0.21"

func main() {
    flag.Parse()
14 changes: 0 additions & 14 deletions server/validation/middleware.go
@@ -161,7 +161,6 @@ func ValidateCompletion(next http.Handler) http.Handler {
        // Request parsing with detailed error handling
        var req CompletionRequest
        if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
            fmt.Printf("DEBUG: Request parsing error: %v\n", err)
            sendError(
                "Invalid request format",
                []ValidationErrorDetail{{
@@ -174,17 +173,8 @@
            return
        }

        // Add debug logging
        fmt.Printf("DEBUG: Request validation starting\n")
        fmt.Printf("DEBUG: Raw request: %+v\n", req)
        fmt.Printf("DEBUG: Messages count: %d\n", len(req.Messages))
        for i, msg := range req.Messages {
            fmt.Printf("DEBUG: Message[%d] - Role: '%s', Content: '%s'\n", i, msg.Role, msg.Content)
        }

        // Structured validation with detailed error collection
        if err := validate.Struct(req); err != nil {
            fmt.Printf("DEBUG: Validation Error: %v\n", err)
            var details []ValidationErrorDetail
            for _, err := range err.(validator.ValidationErrors) {
                var errorMessage string
@@ -217,10 +207,6 @@
                    Value: fmt.Sprintf("%v", err.Value()),
                }
                details = append(details, detail)

                // EXTREME LOGGING
                fmt.Printf("FORCED ERROR - Field: '%s', Message: '%s', Code: '%s'\n",
                    detail.Field, detail.Message, detail.Code)
            }

            sendError(
