Reference Architecture
F7 Architecture
Event-driven microservices on AWS EKS. This page is the engineering reference for how to build within the F7 platform -- principles, patterns, rules, and tech stack decisions.
1. Architecture Overview
Layered view from client channels through API Gateway, domain services, event streaming, databases, to infrastructure
Client Channels (10)
Amazon API Gateway (HTTP API)
70% cheaper than REST API
Authorization Flow
Core Domain Services (7 Domains, 40+ Services)
Asynchronous (default): Event-Carried State Transfer (ECST) over Kafka. Thick event payloads. Outbox pattern mandatory. At-least-once delivery. All consumers must be idempotent.
Synchronous HTTP (exception): Only when event-driven is not feasible. Requires circuit breaker (Polly), explicit timeout, and retry with exponential backoff. Pass X-User-* headers downstream.
Database Layer -- Per-Service Ownership (RDS PostgreSQL)
Infrastructure
2. Architecture Principles
8 non-negotiable principles every F7 service must follow
3. Tech Stack
Core technologies with justifications and approved alternatives
Polyglot Service Design
Each service chooses its own language, database engine, and cache -- within approved options. Deviations from defaults require Architecture Committee approval.
Language: .NET (default). Go for ultra-low latency. Kotlin needs approval.
Database: PostgreSQL (default). MongoDB for document-heavy workloads; needs approval.
Cache: Redis (default). DynamoDB for serverless scaling scenarios.
| Category | Technology | Justification | Alternatives |
|---|---|---|---|
| Compute | .NET (default) | Team familiarity, rich ecosystem, CQRS/MediatR support, strong performance | Go (ultra-low latency), Kotlin (JVM ecosystem -- needs Architecture Committee approval) |
| Orchestration | AWS EKS (Kubernetes) | Container orchestration, auto-scaling, rolling deployments, service discovery via CoreDNS | None -- standard |
| API Gateway | Amazon API Gateway (HTTP API) | 70% cheaper than REST API type, lower latency, native Lambda authorizer integration | REST API type (only if usage plans are needed; WebSocket is a separate API type) |
| Auth | Lambda Custom Authorizer (TOKEN type, RS256) | Centralized JWT validation, returns IAM policy + context headers, zero auth logic in services | None -- non-negotiable |
| Messaging | Amazon MSK (Kafka) | Event-driven backbone, at-least-once delivery, topic-based pub/sub, consumer groups | None -- standard |
| Database | Amazon RDS PostgreSQL | Shared schema with sharding keys (replacing schema-per-tenant MySQL), Multi-AZ, automated backups | MongoDB (document-heavy, needs Architecture Committee approval) |
| Cache | Amazon ElastiCache Redis (default) | Sub-millisecond latency, permission caching, session data, rate limiting | DynamoDB (serverless scaling scenarios) |
| Connection Pooling | PgBouncer | Efficient connection pooling for PostgreSQL, reduces DB connection overhead | None -- required with RDS PostgreSQL |
| Observability | Prometheus/Grafana, Datadog, AWS X-Ray | Metrics, logging, distributed tracing across all services | None -- all three are standard |
| Deployments | Argo Rollouts (Canary) | Progressive delivery, automated rollback, traffic splitting for safe deployments | None -- standard |
| Secrets | AWS Secrets Manager | Centralized secret storage, automatic rotation, IAM-based access control | None -- non-negotiable |
| Service Mesh | AWS App Mesh | mTLS between services, traffic management, observability sidecar injection | None -- standard |
4. Communication Patterns
Events are the default. HTTP is the exception. Use this decision matrix to choose.
Decision Rule
Default: Kafka event with ECST payload. Exception: Synchronous HTTP only when event-driven is technically not feasible (e.g., first-time data that has no prior event, or real-time query that cannot tolerate eventual consistency). Every sync call must have a circuit breaker, explicit timeout, and retry with exponential backoff.
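The three safeguards required on every sync call can be sketched as follows. This is a minimal, stdlib-only illustration and not the Polly configuration used in .NET services; `CircuitBreaker`, `call_with_retry`, and all thresholds and delays are hypothetical names and values chosen for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures, retries after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retry(fn, breaker, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Retry fn through the breaker, waiting base_delay_s * 2**n between attempts."""
    for attempt in range(attempts):
        try:
            return breaker.call(fn)
        except CircuitOpenError:
            raise  # never retry against an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * (2 ** attempt))
```

The explicit timeout belongs inside `fn` itself, for example `urllib.request.urlopen(url, timeout=2.0)`, so that a hung downstream service counts as a failure instead of blocking the caller.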
| Scenario | Pattern | Guidance |
|---|---|---|
| Service A needs data from Service B (non-critical path) | Kafka Event (ECST) | Service A subscribes to events and stores a local projection. No runtime dependency. |
| Service A needs data from Service B (critical path, no local copy yet) | HTTP via CoreDNS | Exception case only. Use Kubernetes service discovery. Must have circuit breaker + timeout. |
| Multi-service business transaction (e.g., order fulfillment) | Saga (Choreography) | Each service reacts to events and publishes next event. Compensating actions for rollback. |
| External client request (mobile app, web, POS) | API Gateway + Lambda Authorizer | JWT validated at gateway. Headers injected. Routed to target service. |
| Notify 3rd-party integrations of state changes | Webhooks | Outbound HTTP callbacks to registered endpoints with retry logic. |
| Service-to-service call (internal, same cluster) | HTTP via Kubernetes CoreDNS | Service discovery via DNS. Pass X-User-Id, X-User-Email, X-User-Roles headers downstream. |
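The choreographed saga in the matrix can be illustrated with a small in-memory sketch. `EventBus` stands in for Kafka topics, and the event names (`OrderPlaced`, `PaymentCaptured`, `StockReserved`, `StockReservationFailed`, `PaymentRefunded`) and service split are hypothetical, chosen only to show one forward path and one compensating path.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for Kafka topics: delivers each event to its subscribers."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.log = []  # event types in publish order, for inspection

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.log.append(event_type)
        for handler in self.handlers[event_type]:
            handler(payload)


def wire_saga(bus, stock):
    """Wire three hypothetical services together; no central coordinator exists."""
    orders = {}

    def place_order(order):  # Order service: starts the saga
        orders[order["id"]] = "PENDING"
        bus.publish("OrderPlaced", order)

    def on_order_placed(order):  # Payment service
        bus.publish("PaymentCaptured", order)

    def on_payment_captured(order):  # Inventory service
        if stock.get(order["sku"], 0) >= order["qty"]:
            stock[order["sku"]] -= order["qty"]
            bus.publish("StockReserved", order)
        else:
            bus.publish("StockReservationFailed", order)

    def on_stock_failed(order):  # Payment service: compensating action
        bus.publish("PaymentRefunded", order)

    def on_stock_reserved(order):  # Order service
        orders[order["id"]] = "CONFIRMED"

    def on_payment_refunded(order):  # Order service
        orders[order["id"]] = "CANCELLED"

    bus.subscribe("OrderPlaced", on_order_placed)
    bus.subscribe("PaymentCaptured", on_payment_captured)
    bus.subscribe("StockReservationFailed", on_stock_failed)
    bus.subscribe("StockReserved", on_stock_reserved)
    bus.subscribe("PaymentRefunded", on_payment_refunded)
    return place_order, orders
```

Each handler knows only the event it consumes and the event it emits. The rollback is an ordinary event (`PaymentRefunded`) rather than a distributed transaction abort.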
Event Flow Example: Order Lifecycle
5. Identity & Auth Architecture
SSO, JWT validation, permission resolution, and the 5 non-negotiable auth design decisions
Organization Hierarchy
Top-level tenant entity
Org administrators
Brand within the organization
Physical or virtual location
Registered POS / Kiosk / KDS
Token Types
User Token
- Issued after SSO authentication via identity.foodics.com portal
- JWT format with RS256 signing. Long-lived, tied to portal session.
- Contains user identity, roles, and organization scope
Device Token
- Issued during device registration and pairing flow
- Contains device ID, branch, and business context
- Long-lived with automatic rotation
Personal Access Token / Access Client
- Manually issued by administrators for API integrations
- Delegated issuing via access clients for 3rd-party apps
5 Non-Negotiable Auth Design Decisions
SSO Provider is a replaceable black box
It only gives signed JWTs. The entire SSO provider can be swapped without changing any service code.
internal_user_id is the universal identifier
UUID from Identity Service. Every service, event, and header uses this single ID to refer to a user.
No service calls another for authorization at runtime
Permissions are cached locally via Kafka events. Redis cache (microseconds, 99.9% hit rate) -> PostgreSQL external schema (ms) -> Organisation Service API (startup/recovery only).
Identity and authorization are separate concerns
Identity Service = WHO the user is. Organisation Service = WHAT they can do. These are distinct services with distinct data.
A user is a person, not a role
The same person can be an employee in multiple organizations and a customer. Identity is person-centric, not role-centric.
Permission Resolution Priority
1. Redis Cache
Microseconds latency. 99.9% hit rate.
2. PostgreSQL External Schema
Milliseconds. Cache miss fallback.
3. Organisation Service API
Startup/recovery only. Never on hot path.
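The resolution order above can be sketched as a tiered lookup. Plain dictionaries stand in for Redis and the local PostgreSQL schema, `fetch_from_org_service` is a hypothetical callable, and denying by default when both local tiers miss on the hot path is an assumption of this example, not a documented F7 rule.

```python
class PermissionResolver:
    """Resolve permissions from local tiers first; remote lookup only when recovering."""

    def __init__(self, cache, db, fetch_from_org_service):
        self.cache = cache  # stands in for Redis
        self.db = db  # stands in for the local PostgreSQL schema
        self.fetch_from_org_service = fetch_from_org_service  # startup/recovery only

    def resolve(self, user_id, recovering=False):
        """Return (permissions, tier_that_answered)."""
        perms = self.cache.get(user_id)
        if perms is not None:
            return perms, "cache"

        perms = self.db.get(user_id)
        if perms is not None:
            self.cache[user_id] = perms  # repopulate the cache after a miss
            return perms, "db"

        if not recovering:
            # Hot path: never call the Organisation Service.
            # Assumption: an unknown user gets no permissions.
            return frozenset(), "none"

        perms = self.fetch_from_org_service(user_id)
        self.db[user_id] = perms
        self.cache[user_id] = perms
        return perms, "org-service"
```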
Key Auth Flows
API Gateway Header Injection
After JWT validation, the Lambda authorizer returns an IAM policy and context. The API Gateway uses parameter mapping to inject these headers into every request forwarded to services.
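A minimal sketch of the authorizer's response shape, assuming the event carries `authorizationToken` and `methodArn`. `verify_jwt` is a hypothetical callable that performs the RS256 signature and expiry checks and returns the claims; the context key names (`userId`, `userEmail`, `userRoles`) are also assumptions. Context values must be scalars, so roles are joined into one string.

```python
def build_authorizer_response(event, verify_jwt):
    """Return an IAM policy plus context for API Gateway, or a Deny policy."""
    token = event.get("authorizationToken", "")
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):]

    try:
        claims = verify_jwt(token)  # hypothetical: raises if the token is invalid
        user_id = claims["sub"]
        effect = "Allow"
    except Exception:
        claims, user_id, effect = {}, "anonymous", "Deny"

    response = {
        "principalId": user_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": event["methodArn"],
                }
            ],
        },
    }
    if effect == "Allow":
        # Context values must be strings, numbers, or booleans.
        response["context"] = {
            "userId": user_id,
            "userEmail": claims.get("email", ""),
            "userRoles": ",".join(claims.get("roles", [])),
        }
    return response
```

Under these assumptions, the gateway's parameter mapping would read values such as `$context.authorizer.userId` to populate `X-User-Id`.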
| Header | Source | Usage |
|---|---|---|
| X-User-Id | JWT sub claim (internal_user_id UUID) | Identifies the requesting user. Recorded in the audit trail. |
| X-User-Email | JWT email claim | Display name / notification target. |
| X-User-Roles | JWT roles claim | Permission checks against local cache. |
Service-to-Service Rule
When Service A calls Service B internally, it must pass X-User-Id, X-User-Email, and X-User-Roles headers downstream. No JWT is needed for internal calls -- headers are the identity contract.
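One way to honor this rule is a small helper that copies the three identity headers from the incoming request onto every internal call. The helper names are hypothetical, and rejecting a request that lacks any of the headers is an assumption of this sketch.

```python
import urllib.request

IDENTITY_HEADERS = ("X-User-Id", "X-User-Email", "X-User-Roles")


def downstream_headers(incoming):
    """Copy the identity headers from the incoming request, case-insensitively."""
    lookup = {name.lower(): value for name, value in incoming.items()}
    missing = [h for h in IDENTITY_HEADERS if h.lower() not in lookup]
    if missing:
        # Assumption: a request without identity headers is rejected, not forwarded.
        raise ValueError(f"missing identity headers: {', '.join(missing)}")
    return {h: lookup[h.lower()] for h in IDENTITY_HEADERS}


def build_internal_request(url, incoming):
    """Build (but do not send) a request to another service with identity attached."""
    return urllib.request.Request(url, headers=downstream_headers(incoming))
```

Only the identity headers are forwarded; other inbound headers, including `Authorization`, are deliberately left behind because internal calls do not carry the JWT.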
6. Database Strategy
Shared schema with sharding keys, replacing schema-per-tenant MySQL. Strict column-type rules.
Migration: Schema-per-Tenant to Shared Schema
The legacy platform uses schema-per-tenant MySQL. F7 moves to a shared schema with sharding keys in RDS PostgreSQL. Each service scales independently, customer data can be moved between shards, and each service owns its schema exclusively.
| Rule | Specification | Reason |
|---|---|---|
| Primary Keys | UUIDs (uuid_generate_v4()) | Globally unique, no conflicts across shards or services |
| Money Columns | BIGINT (smallest currency unit) | No floating-point rounding errors. Store cents/halalas as integers |
| Timestamps | TIMESTAMPTZ | Timezone-aware. All stored in UTC. Client converts for display |
| Foreign Keys | Within same service only | Cross-service FKs create deployment coupling. Use plain ID columns for external refs |
| Outbox Table | Mandatory in every event-producing service | Atomic write + event publish. Relay process publishes from outbox to Kafka |
| ECST Projections | Dedicated projection tables for consumed events | Local read-only copies of external data. Updated idempotently from Kafka events |
| Connection Pooling | PgBouncer required | Efficient connection reuse. Prevents connection exhaustion under load |
| High Availability | Multi-AZ deployment | Automated failover. RPO near-zero. Automated backups + Performance Insights enabled |
| Float/Double for Money | PROHIBITED | Floating-point arithmetic is not exact. Will produce rounding errors in financial calculations |
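The money rule is easiest to see with a concrete case. The sketch below shows the classic floating-point drift and two hypothetical helpers, `to_minor_units` and `format_minor_units`, that convert between decimal strings and the integer values stored in a BIGINT column. The default exponent of 2 is an assumption that fits currencies with 100 minor units per major unit.

```python
from decimal import Decimal


def to_minor_units(amount, exponent=2):
    """Parse a decimal string such as "19.99" into integer minor units (1999)."""
    scaled = Decimal(amount) * (10 ** exponent)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than {exponent} decimal places")
    return int(scaled)


def format_minor_units(value, exponent=2):
    """Render integer minor units for display, e.g. 1999 -> "19.99"."""
    if exponent == 0:
        return str(value)
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), 10 ** exponent)
    return f"{sign}{whole}.{frac:0{exponent}d}"


# Binary floating point cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum_is_exact = (0.1 + 0.2 == 0.3)  # False

# Integer minor units add exactly.
total = to_minor_units("0.10") + to_minor_units("0.20")  # 30
```

Parsing goes through `Decimal` and a string, never through `float`, so no rounding error is introduced before the value reaches the integer column.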
Outbox Table (Every Event-Producing Service)
-- outbox table schema (table name illustrative)
CREATE TABLE outbox (
    id             UUID PRIMARY KEY,
    aggregate_type VARCHAR(255),
    aggregate_id   UUID,
    event_type     VARCHAR(255),
    payload        JSONB,
    created_at     TIMESTAMPTZ,
    published_at   TIMESTAMPTZ NULL  -- NULL until the relay publishes the event
);
-- relay process polls WHERE published_at IS NULL
-- marks published_at after successful Kafka publish
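The write path and the relay can be sketched end to end. This example uses SQLite from the standard library as a stand-in for PostgreSQL and a plain callable as a stand-in for the Kafka producer; the `orders` table, the `OrderPlaced` event, and all function names are hypothetical.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone


def utc_now():
    return datetime.now(timezone.utc).isoformat()


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total_minor INTEGER)")
    conn.execute(
        """CREATE TABLE outbox (
               id TEXT PRIMARY KEY, aggregate_type TEXT, aggregate_id TEXT,
               event_type TEXT, payload TEXT, created_at TEXT, published_at TEXT)"""
    )


def place_order(conn, total_minor):
    """Write the business row and the outbox row in one transaction."""
    order_id = str(uuid.uuid4())
    payload = json.dumps({"order_id": order_id, "total_minor": total_minor})
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total_minor))
        conn.execute(
            "INSERT INTO outbox VALUES (?, ?, ?, ?, ?, ?, NULL)",
            (str(uuid.uuid4()), "Order", order_id, "OrderPlaced", payload, utc_now()),
        )
    return order_id


def relay_once(conn, publish):
    """Publish every unpublished outbox row, marking each one after success."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY created_at"
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = ? WHERE id = ?", (utc_now(), event_id)
            )
    return len(rows)
```

If the relay crashes after `publish` succeeds but before the row is marked, the event is published again on the next poll. That is the at-least-once behavior that makes idempotent consumers mandatory.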
ECST Projection Table (Every Event-Consuming Service)
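-- example: Order Service stores Menu items locally (table name illustrative)
CREATE TABLE menu_item_projection (
    id               UUID PRIMARY KEY,
    source_service   VARCHAR(100),
    source_entity_id UUID,
    data             JSONB,
    last_event_id    UUID,
    last_updated_at  TIMESTAMPTZ
);
-- idempotent upsert: skip the event if it was already applied. An ordering check
-- (last_event_id >= incoming) only works with time-ordered event IDs such as UUIDv7;
-- random uuid_generate_v4() values support equality checks only
-- read-only local copy of external data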
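An idempotent projection update can be sketched as follows, again with SQLite standing in for PostgreSQL. The sketch assumes each event carries a monotonically increasing integer `version` per entity, which is a hypothetical field not shown in the table above; it is stored alongside `last_event_id` so that both duplicates and late-arriving older events are skipped.

```python
import json
import sqlite3


def create_projection(conn):
    conn.execute(
        """CREATE TABLE menu_item_projection (
               source_entity_id TEXT PRIMARY KEY,
               data TEXT,
               last_event_id TEXT,
               last_version INTEGER)"""
    )


def apply_event(conn, event):
    """Apply an event to the projection; return False if it was skipped."""
    with conn:  # read-check-write happens in one transaction
        row = conn.execute(
            "SELECT last_version FROM menu_item_projection WHERE source_entity_id = ?",
            (event["entity_id"],),
        ).fetchone()
        if row is not None and row[0] >= event["version"]:
            return False  # duplicate or out-of-order delivery: already reflected
        conn.execute(
            "INSERT OR REPLACE INTO menu_item_projection "
            "(source_entity_id, data, last_event_id, last_version) VALUES (?, ?, ?, ?)",
            (event["entity_id"], json.dumps(event["data"]), event["event_id"], event["version"]),
        )
    return True
```

Because the ECST payload carries the full state, replacing the row is safe: the projection always ends up equal to the newest version seen, regardless of delivery order.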
7. Core Design Patterns
Every pattern used across the F7 platform with implementation guidance
8. Forbidden Patterns
Zero tolerance. These patterns are never acceptable in F7 services. Violations block code review.
Shared database between services
Destroys service autonomy. Schema changes cascade. Tight coupling through data.
Do instead: Each service owns its database. Consume events via Kafka for cross-service data.
Direct cross-service database queries
Bypasses service boundaries. Creates hidden dependencies. Breaks encapsulation.
Do instead: Use ECST events to build local projection tables. Query your own DB.
JWT validation inside services
Gateway already validates JWT. Doing it again adds latency and couples services to auth implementation.
Do instead: Read X-User-Id, X-User-Email, X-User-Roles from request headers injected by API Gateway.
FLOAT/DOUBLE for monetary values
Floating-point arithmetic produces rounding errors. Unacceptable for financial data.
Do instead: Use BIGINT storing the smallest currency unit (e.g., cents, halalas). All math is integer.
Foreign keys across service databases
Cross-service FK creates deployment coupling and shared schema ownership.
Do instead: Reference external IDs as plain columns. Enforce consistency via events and eventual consistency.
Business logic in infrastructure
Violates Smart Endpoints, Dumb Pipes. Logic in Kafka consumers, DB triggers, or gateway transforms is untestable and invisible.
Do instead: All business logic in service application layer. Infrastructure is transport and storage only.
Synchronous calls on the critical path to other services
Creates runtime dependency. If the called service is down, your service is down.
Do instead: Subscribe to Kafka events and maintain local projection. Use sync HTTP only as documented exception with circuit breaker.
Publishing events without the Outbox Pattern
If the DB write succeeds but event publish fails (or vice versa), data is inconsistent.
Do instead: Always write events to the outbox table in the same DB transaction. Relay process publishes from outbox.
Breaking API changes on existing versions
Breaks existing clients. Violates backward compatibility. Destroys Strangler Fig coexistence.
Do instead: Create a new API version (/v2/). Maintain old version for 6+ months. New fields are always optional.
Calling another service for authorization at runtime
Adds latency to every request. Creates single point of failure for auth.
Do instead: Cache permissions locally via Kafka events. Resolution: Redis (99.9%) -> PostgreSQL -> Organisation Service (startup/recovery only).
Stored procedures for business logic
Business logic hidden in the database layer. Untestable, not version-controlled, violates separation of concerns.
Do instead: Business logic in application layer. Database is storage only.
Non-idempotent mutating endpoints
Retries cause duplicate side effects. Network failures become data corruption.
Do instead: Accept X-Idempotency-Key header on all mutating endpoints. Store and deduplicate.
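The store-and-deduplicate step can be sketched as a thin wrapper around a mutating handler. The class name, the in-memory dictionary, and the 400 response for a missing key are assumptions for illustration; a real service would need a durable store with a uniqueness guarantee so that concurrent retries cannot both execute.

```python
class IdempotentHandler:
    """Run a mutating handler at most once per idempotency key."""

    def __init__(self, handler):
        self.handler = handler
        self.responses = {}  # stand-in for a durable store keyed by idempotency key

    def handle(self, headers, body):
        key = headers.get("X-Idempotency-Key")
        if not key:
            # Assumption: requests without a key are rejected outright.
            return 400, {"error": "X-Idempotency-Key header is required"}
        if key in self.responses:
            return self.responses[key]  # replay the stored response, no side effects
        response = self.handler(body)
        self.responses[key] = response
        return response
```

A retried request with the same key returns the original response, so a client that never saw the first reply can retry safely.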