Deep Dive: GraphQL Federation

GraphQL Federation is a powerful pattern for building distributed GraphQL systems where multiple independent services contribute to a single, unified GraphQL API. Unlike traditional monolithic GraphQL servers, federation enables teams to own and evolve their own GraphQL schemas while seamlessly connecting them through a gateway router that coordinates queries across services.

What is GraphQL Federation?

GraphQL Federation solves a critical challenge in microservices architectures: how do you provide a unified GraphQL interface when your business logic is split across multiple services? Rather than forcing all teams to work on a single massive schema, Federation enables each service to define and own its portion of the graph while a gateway composes and routes queries to the appropriate services.

The Supergraph Concept

The supergraph is the complete, unified GraphQL schema that clients interact with. It’s composed of multiple subgraphs, where each subgraph is an independent GraphQL service that owns specific types and fields. The gateway router orchestrates communication between subgraphs and handles the complex logic of query planning and execution.

When a client sends a query to the gateway, the gateway:

Parses and validates the query against the supergraph schema
Creates an execution plan that determines which subgraphs need to be queried
Resolves entity references across service boundaries by sending _entities queries, which each subgraph answers through its __resolveReference resolvers
Merges results from multiple services into a single response
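
The four steps above can be sketched as a toy gateway in plain Python. This is only an illustration under stated assumptions, not how Apollo Gateway is implemented: the in-memory "subgraphs", the FIELD_OWNERS table and every function name are invented for the example.

```python
# Toy model of a federated gateway. Everything here is hypothetical and
# in-memory; a real gateway talks to subgraphs over HTTP.

# Which subgraph owns each field of the User type (assumed layout).
FIELD_OWNERS = {"id": "users", "name": "users", "orders": "orders"}

USERS = {"1": {"id": "1", "name": "Alice"}}
ORDERS_BY_USER = {"1": [{"id": "o1", "total": 99.99}]}


def users_subgraph(user_id, fields):
    """Root fetch: the users subgraph resolves Query.user."""
    user = USERS[user_id]
    return {"__typename": "User", **{f: user[f] for f in fields}}


def orders_subgraph(representations):
    """Entity fetch: the orders subgraph resolves User.orders per representation."""
    return [{"orders": ORDERS_BY_USER.get(rep["id"], [])} for rep in representations]


def execute(user_id, requested_fields):
    # 1. Validate the selection against the (toy) supergraph schema.
    unknown = [f for f in requested_fields if f not in FIELD_OWNERS]
    if unknown:
        raise ValueError(f"Unknown fields: {unknown}")

    # 2. Plan: group the requested fields by owning subgraph.
    plan = {}
    for field in requested_fields:
        plan.setdefault(FIELD_OWNERS[field], []).append(field)

    # 3. Fetch from the users subgraph (always including the key field),
    #    then resolve the entity in the orders subgraph if it owns anything.
    user_fields = plan.get("users", [])
    fetch_fields = user_fields if "id" in user_fields else ["id", *user_fields]
    result = users_subgraph(user_id, fetch_fields)
    if "orders" in plan:  # the orders subgraph owns at least one requested field
        representation = {"__typename": "User", "id": result["id"]}
        (entity,) = orders_subgraph([representation])
        # 4. Merge the entity fields into the user object.
        result.update(entity)

    # Return only what the client asked for.
    return {"user": {f: result[f] for f in requested_fields}}


response = execute("1", ["id", "name", "orders"])
print(response)
```

Note that the key field is fetched even when the client did not request it, because the gateway needs it to build the representation for the second subgraph.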

Query Execution Flow

Here’s how a federated query flows through your system:

sequenceDiagram
    participant Client
    participant Gateway as Apollo Gateway
    participant SubgraphA as Subgraph A<br/>(Users Service)
    participant SubgraphB as Subgraph B<br/>(Orders Service)

    Client->>Gateway: POST /graphql<br/>query { user(id: "1") { id name orders { id total } } }
    Gateway->>Gateway: Parse & validate query<br/>Create execution plan
    Gateway->>SubgraphA: query { user(id: "1") { id name __typename } }
    SubgraphA-->>Gateway: { user: { id: "1" name: "Alice" __typename: "User" } }
    Gateway->>SubgraphB: query { _entities(representations: [{ __typename: "User", id: "1" }]) { ... on User { orders { id total } } } }
    SubgraphB-->>Gateway: { _entities: [{ orders: [{ id: "o1", total: 99.99 }] }] }
    Gateway->>Gateway: Merge results
    Gateway-->>Client: { user: { id: "1" name: "Alice" orders: [...] } }

Core Principles

Owned schemas: Each team owns their subgraph schema and can deploy independently
Decentralized: No central team controls the entire graph; services manage their own types
Type expansion: Types can be extended across services using the @key directive
Entity resolution: Services resolve entities they own via __resolveReference
Transparent composition: Clients see a single, seamless API
Query planning: The gateway optimizes query execution across services
Scalability: Services scale independently without affecting the gateway’s core logic

Federation vs Schema Stitching

Before federation, developers used schema stitching—a pattern where a gateway would merge schemas from multiple services and manually resolve entity relationships. While stitching works, it has significant limitations:

Aspect	Schema Stitching	Federation
Type Extension	Manual resolver setup	Declarative `@key` directives
Entity Resolution	Custom resolver logic	Built-in `__resolveReference`
Type Safety	Limited; manual mapping	Strong; SDL-driven composition
Query Planning	Basic; potential N+1 issues	Intelligent; batches requests
Composition Checks	Manual or custom tooling	Apollo Composition Checks (CI/CD)
Development Experience	Verbose, error-prone	Declarative, type-safe
Performance Monitoring	Limited insight	Apollo managed federation support

Why Federation Won: Federation is purpose-built for distributed GraphQL systems. It eliminates the boilerplate of stitching while providing a clear mental model—every type has an owner, and relationships are explicitly declared. This makes federation ideal for organizations with multiple teams and services.

Defining Subgraph Schemas

Subgraph schemas use Federation directives to declare ownership, external fields, and dependencies. Let’s explore the key directives:

Key Federation Directives

@key — Marks a field (or fields) that uniquely identify an entity within a subgraph.

type User @key(fields: "id") {
  id: ID!
  email: String!
  name: String!
}

@external — Declares a field that exists in another subgraph; used when extending types.

extend type User @key(fields: "id") {
  id: ID! @external
  orders: [Order!]!
}

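@requires — Indicates that a field's resolver needs the values of fields owned by another subgraph (declared here with @external). The gateway fetches those fields first and passes them in the entity representation.

extend type User @key(fields: "id") {
  id: ID! @external
  email: String! @external
  displayName: String! @requires(fields: "email")
}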

@provides — Declares that this resolver provides a field from another subgraph, reducing federation calls.

type Order @key(fields: "id") {
  id: ID!
  user: User @provides(fields: "email")
}

extend type User @key(fields: "id") {
  id: ID! @external
  email: String! @external
}
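
To make the effect of @key and @requires concrete, here is a small Python sketch of how a gateway might assemble the representation it sends for an entity fetch. The USER_ENTITY metadata table and the build_representation helper are hypothetical and exist only for this illustration; real gateways derive this information from the composed supergraph schema.

```python
# Hypothetical directive metadata for the User entity as seen by one subgraph.
USER_ENTITY = {
    "key": ["id"],
    # field name -> external fields it declares via @requires
    "requires": {"displayName": ["email"]},
}


def build_representation(typename, entity_meta, known_values, requested_fields):
    """Build the representation a gateway would send for an entity fetch.

    It always carries __typename and the @key fields, plus any external
    fields that the requested fields declare with @requires.
    """
    needed = list(entity_meta["key"])
    for field in requested_fields:
        for dependency in entity_meta["requires"].get(field, []):
            if dependency not in needed:
                needed.append(dependency)

    missing = [name for name in needed if name not in known_values]
    if missing:
        raise KeyError(f"Gateway must fetch {missing} from the owning subgraph first")

    return {"__typename": typename, **{name: known_values[name] for name in needed}}


# Values the gateway already fetched from the subgraph that owns them.
fetched = {"id": "user-1", "email": "alice@example.com", "username": "alice"}

plain = build_representation("User", USER_ENTITY, fetched, ["orders"])
with_requires = build_representation("User", USER_ENTITY, fetched, ["displayName"])
print(plain)
print(with_requires)
```

The first representation contains only the key, while the second also carries email because displayName declares it as a requirement.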

Example: Users Subgraph Schema

# users-subgraph/schema.graphql

type Query {
  user(id: ID!): User
  users(ids: [ID!]!): [User!]!
}

type User @key(fields: "id") {
  id: ID!
  email: String!
  username: String!
  createdAt: DateTime!
  profile: UserProfile!
}

type UserProfile {
  avatar: String
  bio: String
  socialLinks: [SocialLink!]!
}

type SocialLink {
  platform: String!
  url: String!
}

scalar DateTime

Example: Orders Subgraph Schema

# orders-subgraph/schema.graphql

type Query {
  order(id: ID!): Order
  orders(userId: ID!): [Order!]!
}

type Order @key(fields: "id") {
  id: ID!
  userId: String!
  items: [OrderItem!]!
  total: Float!
  status: OrderStatus!
  createdAt: DateTime!
  user: User!
}

extend type User @key(fields: "id") {
  id: ID! @external
  orders: [Order!]!
}

type OrderItem {
  productId: String!
  quantity: Int!
  price: Float!
}

enum OrderStatus {
  PENDING
  CONFIRMED
  SHIPPED
  DELIVERED
  CANCELLED
}

scalar DateTime

Implementing a Subgraph with Apollo Server

Let’s implement the Users subgraph in TypeScript, using an in-memory map in place of a real database:

// users-subgraph/src/index.ts

import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";
import { buildSubgraphSchema } from "@apollo/subgraph";
import gql from "graphql-tag";
import { GraphQLScalarType, Kind } from "graphql";

// ============================================================================
// Type definitions
// ============================================================================

const typeDefs = gql`
  type Query {
    user(id: ID!): User
    users(ids: [ID!]!): [User!]!
  }

  type User @key(fields: "id") {
    id: ID!
    email: String!
    username: String!
    createdAt: DateTime!
    profile: UserProfile!
  }

  type UserProfile {
    avatar: String
    bio: String
    socialLinks: [SocialLink!]!
  }

  type SocialLink {
    platform: String!
    url: String!
  }

  scalar DateTime
`;

// ============================================================================
// Data models and database layer
// ============================================================================

interface UserProfile {
  avatar?: string;
  bio?: string;
  socialLinks: Array<{ platform: string; url: string }>;
}

interface User {
  id: string;
  email: string;
  username: string;
  createdAt: Date;
  profile: UserProfile;
}

// Mock database - replace with real DB in production
const userDatabase: Map<string, User> = new Map([
  [
    "user-1",
    {
      id: "user-1",
      email: "alice@example.com",
      username: "alice",
      createdAt: new Date("2024-01-15"),
      profile: {
        avatar: "https://example.com/avatars/alice.jpg",
        bio: "Software engineer interested in GraphQL",
        socialLinks: [
          { platform: "twitter", url: "https://twitter.com/alice" },
          { platform: "github", url: "https://github.com/alice" },
        ],
      },
    },
  ],
  [
    "user-2",
    {
      id: "user-2",
      email: "bob@example.com",
      username: "bob",
      createdAt: new Date("2024-02-20"),
      profile: {
        avatar: "https://example.com/avatars/bob.jpg",
        bio: "Product manager focused on developer tools",
        socialLinks: [{ platform: "linkedin", url: "https://linkedin.com/in/bob" }],
      },
    },
  ],
]);

// ============================================================================
// Resolvers
// ============================================================================

const resolvers = {
  Query: {
    user: async (_: unknown, { id }: { id: string }) => {
      const user = userDatabase.get(id);
      if (!user) {
        throw new Error(`User not found: ${id}`);
      }
      return user;
    },

    users: async (_: unknown, { ids }: { ids: string[] }) => {
      return ids
        .map((id) => userDatabase.get(id))
        .filter((user): user is User => user !== undefined);
    },
  },

  User: {
    __resolveReference: async (reference: { id: string }) => {
      const user = userDatabase.get(reference.id);
      if (!user) {
        throw new Error(`User not found: ${reference.id}`);
      }
      return user;
    },
  },

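  // Custom scalars must be GraphQLScalarType instances
  // (needs: import { GraphQLScalarType, Kind } from "graphql")
  DateTime: new GraphQLScalarType({
    name: "DateTime",
    serialize: (value) => (value as Date).toISOString(),
    parseValue: (value) => new Date(value as string),
    parseLiteral: (ast) => (ast.kind === Kind.STRING ? new Date(ast.value) : null),
  }),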
};

// ============================================================================
// Server setup
// ============================================================================

async function startServer() {
  const schema = buildSubgraphSchema([{ typeDefs, resolvers }]);

  const server = new ApolloServer({
    schema,
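    // Plugins are an array of objects; per-request hooks are returned from requestDidStart
    plugins: [
      {
        async requestDidStart() {
          return {
            async didResolveOperation(requestContext) {
              console.log(`[Users] Executing operation: ${requestContext.operationName ?? "(anonymous)"}`);
            },
            async didEncounterErrors(requestContext) {
              for (const error of requestContext.errors) {
                console.error("[Users] GraphQL error:", error.message);
              }
            },
          };
        },
      },
    ],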
  });

  const { url } = await startStandaloneServer(server, {
    listen: { port: 4001 },
  });

  console.log(`🚀 Users subgraph ready at ${url}`);
}

startServer().catch((error) => {
  console.error("Failed to start server:", error);
  process.exit(1);
});

// users-subgraph with Spring Boot + DGS Framework (Netflix DGS)
// pom.xml: com.netflix.graphql.dgs:graphql-dgs-spring-boot-starter

@DgsComponent
public class UserDataFetcher {

    // Mock database — replace with real DB
    private static final Map<String, User> USER_DB = Map.of(
        "user-1", new User("user-1", "alice@example.com", "alice",
            LocalDateTime.of(2024, 1, 15, 0, 0),
            new UserProfile("https://example.com/avatars/alice.jpg",
                "Software engineer interested in GraphQL",
                List.of(new SocialLink("twitter", "https://twitter.com/alice"),
                        new SocialLink("github", "https://github.com/alice")))),
        "user-2", new User("user-2", "bob@example.com", "bob",
            LocalDateTime.of(2024, 2, 20, 0, 0),
            new UserProfile("https://example.com/avatars/bob.jpg",
                "Product manager focused on developer tools",
                List.of(new SocialLink("linkedin", "https://linkedin.com/in/bob"))))
    );

    @DgsQuery
    public User user(@InputArgument String id) {
        var user = USER_DB.get(id);
        if (user == null) throw new DgsEntityNotFoundException("User not found: " + id);
        return user;
    }

    @DgsQuery
    public List<User> users(@InputArgument List<String> ids) {
        return ids.stream()
            .map(USER_DB::get)
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
    }

    // Federation entity resolution
    @DgsEntityFetcher(name = "User")
    public User resolveReference(Map<String, Object> reference) {
        String id = (String) reference.get("id");
        var user = USER_DB.get(id);
        if (user == null) throw new DgsEntityNotFoundException("User not found: " + id);
        return user;
    }
}

// Data models
public record User(String id, String email, String username,
    LocalDateTime createdAt, UserProfile profile) {}
public record UserProfile(String avatar, String bio, List<SocialLink> socialLinks) {}
public record SocialLink(String platform, String url) {}

// Spring Boot main
@SpringBootApplication
public class UsersSubgraphApplication {
    public static void main(String[] args) {
        SpringApplication.run(UsersSubgraphApplication.class, args);
    }
}

# users-subgraph with FastAPI + Strawberry GraphQL
# pip install strawberry-graphql[fastapi] uvicorn

import strawberry
from strawberry.fastapi import GraphQLRouter
from strawberry.federation import Schema
from fastapi import FastAPI
from datetime import datetime
from typing import Optional

@strawberry.federation.type(keys=["id"])
class User:
    id: str
    email: str
    username: str
    created_at: datetime
    profile: "UserProfile"

    @classmethod
    def resolve_reference(cls, id: str) -> "User":
        user = USER_DB.get(id)
        if not user:
            raise ValueError(f"User not found: {id}")
        return user

@strawberry.type
class UserProfile:
    avatar: Optional[str]
    bio: Optional[str]
    social_links: list["SocialLink"]

@strawberry.type
class SocialLink:
    platform: str
    url: str

# Mock database
USER_DB: dict[str, User] = {
    "user-1": User(
        id="user-1", email="alice@example.com", username="alice",
        created_at=datetime(2024, 1, 15),
        profile=UserProfile(
            avatar="https://example.com/avatars/alice.jpg",
            bio="Software engineer interested in GraphQL",
            social_links=[
                SocialLink(platform="twitter", url="https://twitter.com/alice"),
                SocialLink(platform="github", url="https://github.com/alice"),
            ],
        ),
    ),
    "user-2": User(
        id="user-2", email="bob@example.com", username="bob",
        created_at=datetime(2024, 2, 20),
        profile=UserProfile(
            avatar="https://example.com/avatars/bob.jpg",
            bio="Product manager focused on developer tools",
            social_links=[SocialLink(platform="linkedin", url="https://linkedin.com/in/bob")],
        ),
    ),
}

@strawberry.type
class Query:
    @strawberry.field
    def user(self, id: str) -> User:
        user = USER_DB.get(id)
        if not user:
            raise ValueError(f"User not found: {id}")
        return user

    @strawberry.field
    def users(self, ids: list[str]) -> list[User]:
        return [u for uid in ids if (u := USER_DB.get(uid))]

schema = Schema(query=Query, enable_federation_2=True)
app = FastAPI()
app.include_router(GraphQLRouter(schema), prefix="/graphql")

// users-subgraph with ASP.NET Core + Hot Chocolate GraphQL
// NuGet: HotChocolate.AspNetCore, HotChocolate.ApolloFederation

[ExtendObjectType(OperationTypeNames.Query)]
public class UserQueries
{
    // internal (not private) so the User reference resolver below can read it
    internal static readonly Dictionary<string, User> UserDb = new()
    {
        ["user-1"] = new User("user-1", "alice@example.com", "alice",
            new DateTime(2024, 1, 15),
            new UserProfile("https://example.com/avatars/alice.jpg",
                "Software engineer interested in GraphQL",
                [new SocialLink("twitter", "https://twitter.com/alice"),
                 new SocialLink("github", "https://github.com/alice")])),
        ["user-2"] = new User("user-2", "bob@example.com", "bob",
            new DateTime(2024, 2, 20),
            new UserProfile("https://example.com/avatars/bob.jpg",
                "Product manager focused on developer tools",
                [new SocialLink("linkedin", "https://linkedin.com/in/bob")])),
    };

    public User? GetUser(string id) =>
        UserDb.TryGetValue(id, out var user) ? user : throw new Exception($"User not found: {id}");

    public IEnumerable<User> GetUsers(string[] ids) =>
        ids.Select(id => UserDb.GetValueOrDefault(id)).OfType<User>();
}

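[Key("id")]
public record User(string Id, string Email, string Username,
    DateTime CreatedAt, UserProfile Profile)
{
    // Federation entity resolver (Hot Chocolate's __resolveReference equivalent).
    // [Node]/[NodeResolver] belong to Relay global object identification, not federation.
    [ReferenceResolver]
    public static User? ResolveReference(string id) =>
        UserQueries.UserDb.GetValueOrDefault(id);
}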

public record UserProfile(string? Avatar, string? Bio, List<SocialLink> SocialLinks);
public record SocialLink(string Platform, string Url);

// Program.cs
var builder = WebApplication.CreateBuilder(args);
builder.Services
    .AddGraphQLServer()
    .AddQueryType()
    .AddTypeExtension<UserQueries>()
    .AddType<User>()
    .AddApolloFederationV2();

var app = builder.Build();
app.MapGraphQL();
app.Run();

Entity Resolution and `__resolveReference`

Entity resolution is how federation knows how to fetch a specific entity from its owning service. The __resolveReference function is called when the gateway needs to resolve an entity reference (typically a reference to a type from another subgraph).

How `__resolveReference` Works

When a query plan crosses a subgraph boundary, the gateway sends the target subgraph an _entities query. Each item in its representations argument is a reference object containing __typename and the entity's @key fields:

// One representation sent by the gateway to the Users subgraph
{
  __typename: "User",
  id: "user-1"
}

// The subgraph's __resolveReference receives this object
User: {
  __resolveReference: async (reference: { id: string }) => {
    // Fetch and return the full User object
    return userDatabase.get(reference.id);
  }
}

// DGS: entity fetcher is the __resolveReference equivalent
@DgsEntityFetcher(name = "User")
public User resolveReference(Map<String, Object> reference) {
    // The gateway sends: { "__typename": "User", "id": "user-1" }
    String id = (String) reference.get("id");
    // Fetch and return the full User object
    return userDatabase.get(id);
}

# Strawberry: resolve_reference classmethod is the __resolveReference equivalent
@strawberry.federation.type(keys=["id"])
class User:
    id: str

    @classmethod
    def resolve_reference(cls, id: str) -> "User":
        # The gateway sends: { "__typename": "User", "id": "user-1" }
        # Fetch and return the full User object
        return user_database.get(id)

// Hot Chocolate: [ReferenceResolver] is the __resolveReference equivalent
[Key("id")]
public record User(string Id, string Email, string Username)
{
    // The gateway sends: { "__typename": "User", "id": "user-1" }
    [ReferenceResolver]
    public static User? ResolveReference(string id) =>
        // Fetch and return the full User object
        UserDatabase.TryGetValue(id, out var user) ? user : null;
}
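
On the subgraph side, the federation library routes each representation to the matching reference resolver. The following Python sketch imitates that dispatch with plain dictionaries; the registry and the function names are assumptions made for illustration and are not part of any federation library.

```python
# Minimal sketch of the subgraph side of entity resolution.

USER_DATABASE = {
    "user-1": {"id": "user-1", "email": "alice@example.com", "username": "alice"},
}


def resolve_user_reference(representation):
    """Plays the role of User.__resolveReference."""
    return USER_DATABASE.get(representation["id"])


# One reference resolver per entity type, keyed by __typename.
REFERENCE_RESOLVERS = {"User": resolve_user_reference}


def resolve_entities(representations):
    """Plays the role of the subgraph's Query._entities field."""
    results = []
    for representation in representations:
        typename = representation["__typename"]
        resolver = REFERENCE_RESOLVERS.get(typename)
        if resolver is None:
            raise ValueError(f"No reference resolver for {typename}")
        entity = resolver(representation)
        # Tag the result so inline fragments such as `... on User` can match it.
        results.append({"__typename": typename, **entity} if entity else None)
    return results


entities = resolve_entities([
    {"__typename": "User", "id": "user-1"},
    {"__typename": "User", "id": "missing"},
])
print(entities)
```

The result list has one entry per representation, in the same order, with None for entities that could not be found; the gateway relies on that alignment when it merges the response.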

Advanced Entity Resolution with DataLoader

For production systems, use DataLoader to batch entity resolution calls and prevent N+1 problems:

// users-subgraph/src/dataloader.ts

import DataLoader from "dataloader";

export type DataLoaders = {
  userLoader: DataLoader<string, User | null>;
};

export function createDataLoaders(): DataLoaders {
  return {
    userLoader: new DataLoader<string, User | null>(async (userIds) => {
      // Batch fetch users in one query (`db` stands in for your data-access layer)
      const users: User[] = await db.users.findByIds([...userIds]);
      const usersById = new Map(users.map((u) => [u.id, u] as const));

      // DataLoader requires one result per key, in the same order as the keys
      return userIds.map((id) => usersById.get(id) ?? null);
    }),
  };
}

// Spring Boot DataLoader with DGS
import com.netflix.graphql.dgs.DgsDataLoader;
import org.dataloader.BatchLoader;
import java.util.concurrent.CompletableFuture;

@DgsDataLoader(name = "userLoader")
public class UserDataLoader implements BatchLoader<String, User> {
    private final UserRepository userRepository;

    // Constructor injection; without it the final field is never initialized
    public UserDataLoader(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    @Override
    public CompletableFuture<List<User>> load(List<String> userIds) {
        return CompletableFuture.supplyAsync(() -> {
            // Batch fetch users from database
            Map<String, User> usersById = userRepository.findAllById(userIds)
                .stream()
                .collect(Collectors.toMap(User::id, u -> u)); // User is a record, so the accessor is id()
            // Return in same order as requested; missing users map to null
            return userIds.stream()
                .map(usersById::get)
                .collect(Collectors.toList());
        });
    }
}

# Python DataLoader with strawberry-graphql
from strawberry.dataloader import DataLoader

async def load_users(user_ids: list[str]) -> list[User | None]:
    # Batch fetch users from database
    users = await db.users.find_by_ids(user_ids)
    users_by_id = {u.id: u for u in users}
    # Return in same order as requested
    return [users_by_id.get(uid) for uid in user_ids]

def create_dataloaders():
    return {"user_loader": DataLoader(load_fn=load_users)}

// Hot Chocolate DataLoader
public class UserByIdDataLoader : BatchDataLoader<string, User>
{
    private readonly IUserRepository _repository;

    public UserByIdDataLoader(IBatchScheduler batchScheduler, IUserRepository repository)
        : base(batchScheduler)
    {
        _repository = repository;
    }

    protected override async Task<IReadOnlyDictionary<string, User>> LoadBatchAsync(
        IReadOnlyList<string> keys, CancellationToken ct)
    {
        // Batch fetch users from database
        var users = await _repository.FindByIdsAsync(keys, ct);
        return users.ToDictionary(u => u.Id);
    }
}

Then use it in your __resolveReference:

User: {
  // Reference resolvers receive (reference, context, info); there is no args parameter
  __resolveReference: async (
    reference: { id: string },
    context: { dataloaders: DataLoaders }
  ) => {
    const user = await context.dataloaders.userLoader.load(reference.id);
    if (!user) throw new Error(`User not found: ${reference.id}`);
    return user;
  }
}

@DgsEntityFetcher(name = "User")
public CompletableFuture<User> resolveReference(
        Map<String, Object> reference,
        DataFetchingEnvironment env) {
    String id = (String) reference.get("id");
    DataLoader<String, User> loader = env.getDataLoader("userLoader");
    return loader.load(id)
        .thenApply(user -> {
            if (user == null) throw new DgsEntityNotFoundException("User not found: " + id);
            return user;
        });
}

@strawberry.federation.type(keys=["id"])
class User:
    id: str

    @classmethod
    async def resolve_reference(cls, info: strawberry.types.Info, id: str) -> "User":
        user = await info.context["user_loader"].load(id)
        if not user:
            raise ValueError(f"User not found: {id}")
        return user

[ReferenceResolver]
public static async Task<User> ResolveReferenceAsync(
    string id,
    UserByIdDataLoader dataLoader,
    CancellationToken ct)
{
    var user = await dataLoader.LoadAsync(id, ct);
    if (user is null) throw new Exception($"User not found: {id}");
    return user;
}
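
To see why a loader turns many load calls into one batch, here is a deliberately tiny batching loader written with asyncio only. It is a sketch of the general technique under simplifying assumptions (no caching, no error handling), not the implementation used by the dataloader package, DGS, Strawberry or Hot Chocolate; TinyLoader and batch_load_users are names invented for this example.

```python
import asyncio


class TinyLoader:
    """Coalesces load() calls made before the event loop next runs into a
    single call to batch_fn."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = []  # (key, future) pairs waiting for the next dispatch
        self._task = None   # keeps a reference to the scheduled dispatch task

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        if not self._pending:
            # First load of a new batch: schedule one dispatch for all of them.
            self._task = loop.create_task(self._dispatch())
        self._pending.append((key, future))
        return future

    async def _dispatch(self):
        batch, self._pending = self._pending, []
        keys = [key for key, _ in batch]
        values = await self._batch_fn(keys)  # one value per key, in key order
        for (_, future), value in zip(batch, values):
            future.set_result(value)


USERS = {"user-1": {"id": "user-1"}, "user-2": {"id": "user-2"}}
batch_calls = []  # records the keys of each batch to make the batching visible


async def batch_load_users(user_ids):
    batch_calls.append(list(user_ids))
    return [USERS.get(user_id) for user_id in user_ids]


async def main():
    loader = TinyLoader(batch_load_users)
    # Three reference resolutions issued together, as one _entities
    # request with three representations would trigger them.
    return await asyncio.gather(
        loader.load("user-1"), loader.load("user-2"), loader.load("missing")
    )


results = asyncio.run(main())
print(results, batch_calls)
```

All three loads are registered before the dispatch task gets a chance to run, so the batch function is called once with all three keys. Creating the loader per request, as the examples above do through the context, keeps one request's batches separate from another's.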

Setting Up Apollo Gateway / Apollo Router

The gateway (or router) is the entry point that composes all subgraphs and routes queries. Apollo provides two options:

Apollo Gateway (Node.js)

// gateway/src/index.ts

import { ApolloGateway, IntrospectAndCompose } from "@apollo/gateway";
import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";

async function startGateway() {
  const gateway = new ApolloGateway({
    supergraphSdl: new IntrospectAndCompose({
      subgraphs: [
        { name: "users", url: "http://localhost:4001/graphql" },
        { name: "orders", url: "http://localhost:4002/graphql" },
      ],
      // Polling interval for schema changes
      pollIntervalInMs: 10000,
    }),
  });

  const server = new ApolloServer({ gateway });

  const { url } = await startStandaloneServer(server, {
    listen: { port: 4000 },
    // Gateway-side context. It is not sent to subgraphs automatically; forward
    // values such as headers explicitly (for example with a RemoteGraphQLDataSource
    // subclass that overrides willSendRequest).
    context: async ({ req }) => ({
      userId: req.headers["x-user-id"],
    }),
  });

  console.log(`🚀 Gateway ready at ${url}`);
}

startGateway().catch(console.error);

// Illustrative pseudocode only: in production, run Apollo Router (Rust) in front of
// your JVM subgraphs. FederatedSchemaBuilder and fetchSchemaFromSubgraph are
// hypothetical helpers shown to convey the idea of composition, not library APIs.

@SpringBootApplication
@EnableFeignClients
public class GatewayApplication {

    @Bean
    public GraphQLSchema buildFederatedSchema() {
        // Hypothetical composition step: fetch each subgraph's SDL, then merge
        var usersSchema = fetchSchemaFromSubgraph("http://localhost:4001/graphql");
        var ordersSchema = fetchSchemaFromSubgraph("http://localhost:4002/graphql");

        return FederatedSchemaBuilder.newFederatedSchema()
            .mergeSubgraph("users", usersSchema)
            .mergeSubgraph("orders", ordersSchema)
            .build();
    }

    public static void main(String[] args) {
        SpringApplication.run(GatewayApplication.class, args);
    }
}

// application.yml
// server.port: 4000
// subgraphs:
//   users: http://localhost:4001/graphql
//   orders: http://localhost:4002/graphql
//   poll-interval-ms: 10000

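# Illustrative sketch of a Python gateway using Ariadne
# pip install ariadne starlette uvicorn httpx
# Caveat: make_federated_schema builds a subgraph-style schema from SDL. It does not
# plan queries or route them to subgraphs, so treat this as pseudocode and run
# Apollo Router in front of Python subgraphs in production.

from ariadne.contrib.federation import make_federated_schema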
from starlette.applications import Starlette
from starlette.routing import Route
from ariadne.asgi import GraphQL
import httpx

async def start_gateway():
    # Fetch and compose schemas from subgraphs
    async with httpx.AsyncClient() as client:
        users_sdl = await fetch_subgraph_sdl(client, "http://localhost:4001/graphql")
        orders_sdl = await fetch_subgraph_sdl(client, "http://localhost:4002/graphql")

    # Build federated schema
    schema = make_federated_schema([users_sdl, orders_sdl])

    app = Starlette(routes=[
        Route("/graphql", GraphQL(schema, debug=True)),
    ])
    return app

async def fetch_subgraph_sdl(client: httpx.AsyncClient, url: str) -> str:
    response = await client.post(url, json={"query": "{ _service { sdl } }"})
    return response.json()["data"]["_service"]["sdl"]

// ASP.NET Core gateway using Hot Chocolate schema stitching
// (this is stitching, not Apollo Federation composition)
// NuGet: HotChocolate.Stitching

var builder = WebApplication.CreateBuilder(args);

// AddHttpClient returns an IHttpClientBuilder, so register each client separately
builder.Services.AddHttpClient("users", c => c.BaseAddress = new Uri("http://localhost:4001/graphql"));
builder.Services.AddHttpClient("orders", c => c.BaseAddress = new Uri("http://localhost:4002/graphql"));

builder.Services
    .AddGraphQLServer()
    .AddRemoteSchema("users")
    .AddRemoteSchema("orders")
    .AddTypeExtensionsFromFile("./stitching.graphql"); // Optional local extensions

var app = builder.Build();
app.MapGraphQL();
app.Run("http://localhost:4000"); // ASP.NET Core does not listen on port 4000 by default

Apollo Router (Production-Recommended)

Apollo Router is a high-performance Rust-based router that’s production-ready:

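# router.yaml

supergraph:
  listen: 127.0.0.1:4000
  path: /graphql

# Subgraph URLs normally come from the composed supergraph schema;
# override them here only when needed (for example, in local development).
override_subgraph_url:
  users: http://localhost:4001
  orders: http://localhost:4002

# Forward the client's Authorization header to every subgraph.
headers:
  all:
    request:
      - propagate:
          named: "authorization"

# For managed federation, the router reads APOLLO_KEY and APOLLO_GRAPH_REF
# from the environment rather than from this file.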

Query Planning and Execution

Query planning is where the gateway determines the optimal sequence of subgraph calls to satisfy a query. The planner:

Analyzes the query structure
Determines which subgraphs own which fields
Plans entity references across boundaries
Batches requests to minimize round trips
Merges results in the correct structure

Example Query Plan

For this query:

query {
  user(id: "user-1") {
    id
    username
    orders {
      id
      total
    }
  }
}

The query plan looks like:

1. [users] Query.user(id: "user-1") 
   → Returns User { id, username }

2. [orders] Entity fetch
   → Query._entities(representations: [{ __typename: "User", id: "user-1" }])
   → Orders subgraph resolves User.orders and returns the user's Order list

3. [merge] Combine results
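
The planning step for this query can be imitated with a few lines of Python. This toy planner is a sketch under stated assumptions: the OWNERS and KEYS tables, the plan format and the function names are invented for the example, it handles a single entity type, and it assumes nested Order fields all live in the orders subgraph. Real planners work on the composed supergraph schema and also add __typename to fetches.

```python
# (type, field) -> owning subgraph (assumed ownership for this example)
OWNERS = {
    ("Query", "user"): "users",
    ("User", "id"): "users",
    ("User", "username"): "users",
    ("User", "orders"): "orders",
}
KEYS = {"User": ["id"]}


def render(selection):
    """Render a nested selection dict as GraphQL-like text."""
    parts = []
    for field, sub_selection in selection.items():
        parts.append(f"{field} {{ {render(sub_selection)} }}" if sub_selection else field)
    return " ".join(parts)


def plan_query(root_field, entity_type, selection):
    """Split the fields selected on one entity into a root fetch plus one
    entity fetch per additional subgraph."""
    root_subgraph = OWNERS[("Query", root_field)]
    by_subgraph = {}
    for field, sub_selection in selection.items():
        owner = OWNERS[(entity_type, field)]
        by_subgraph.setdefault(owner, {})[field] = sub_selection

    # Step 1: root fetch. Add the @key fields if another subgraph needs them.
    root_selection = dict(by_subgraph.pop(root_subgraph, {}))
    if by_subgraph:
        for key_field in KEYS[entity_type]:
            root_selection.setdefault(key_field, None)
    steps = [{
        "subgraph": root_subgraph,
        "fetch": f"{root_field} {{ {render(root_selection)} }}",
    }]

    # Following steps: one _entities fetch per remaining subgraph.
    for subgraph, fields in by_subgraph.items():
        steps.append({
            "subgraph": subgraph,
            "fetch": f"_entities {{ ... on {entity_type} {{ {render(fields)} }} }}",
            "requires": KEYS[entity_type],
        })
    return steps


query_selection = {"id": None, "username": None, "orders": {"id": None, "total": None}}
steps = plan_query("user", "User", query_selection)
for step in steps:
    print(step)
```

The output mirrors the plan above: a root fetch against the users subgraph followed by an entity fetch against the orders subgraph that depends on the User key.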

Enable query plan debugging:

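const gateway = new ApolloGateway({
  // `subgraphs` is the same list used in the gateway setup
  supergraphSdl: new IntrospectAndCompose({ subgraphs }),
  // With debug enabled, the gateway logs the query plan for each incoming request
  debug: true,
});

const server = new ApolloServer({ gateway });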

// Log incoming operations in a DGS subgraph (the query plan itself is only visible at the router)
@Component
public class QueryPlanLogger implements DgsExecutionCustomizer {
    private static final Logger log = LoggerFactory.getLogger(QueryPlanLogger.class);

    @Override
    public ExecutionInput.Builder customize(ExecutionInput.Builder builder,
            ExecutionInput executionInput, DgsRequestData requestData) {
        // Log the incoming query for debugging
        log.debug("Executing query: {}", executionInput.getQuery());
        // The query plan is produced by the router, not the subgraph; inspect it there
        // (for example, by running Apollo Router in --dev mode)
        return builder;
    }
}

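# Subgraph-side logging with a Strawberry schema extension. A subgraph never
# sees the gateway's query plan, only the operations sent to it.
import logging
from strawberry.extensions import SchemaExtension

logger = logging.getLogger("users-subgraph")

class OperationLogger(SchemaExtension):
    def on_operation(self):
        logger.debug("Executing operation: %s", self.execution_context.query)
        yield  # the operation executes here

# Register it with: Schema(query=Query, extensions=[OperationLogger])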

# For Apollo Router, query plans appear in logs with:
# RUST_LOG=apollo_router=debug
# or use the --dev flag for Apollo Router to expose query plan traces

// Enable query plan tracing with Hot Chocolate diagnostic events
public class QueryPlanDiagnosticEventListener : ExecutionDiagnosticEventListener
{
    private readonly ILogger<QueryPlanDiagnosticEventListener> _logger;

    public override IDisposable ExecuteRequest(IRequestContext context)
    {
        return new RequestScope(_logger, context);
    }

    private class RequestScope : IDisposable
    {
        private readonly IRequestContext _context;
        private readonly ILogger _logger;

        public RequestScope(ILogger logger, IRequestContext context)
        {
            _logger = logger;
            _context = context;
        }

        public void Dispose()
        {
            if (_context.Result is IQueryResult result)
            {
                // Log query plan from extensions
                if (result.Extensions?.TryGetValue("queryPlan", out var plan) == true)
                    _logger.LogDebug("Query Plan: {Plan}", plan);
            }
        }
    }
}

N+1 Problem and DataLoader Pattern

The classic N+1 problem occurs when you fetch a parent entity, then fetch children one by one:

The Problem

query {
  users(limit: 100) {      # 1 query
    id
    orders {               # N queries (1 per user)
      id
    }
  }
}

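In a federated graph the gateway usually batches these references into a single _entities request, but unless the orders subgraph batches its own lookups, each of the N user references still triggers a separate database query: 1 + N queries in total.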
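
The difference is easy to measure with a counter. The Python sketch below uses a hypothetical in-memory orders table, with a counter standing in for database round trips; the table contents and function names are invented for the illustration.

```python
# Hypothetical in-memory "orders table".
ORDERS = [
    {"id": "o1", "user_id": "user-1"},
    {"id": "o2", "user_id": "user-1"},
    {"id": "o3", "user_id": "user-2"},
]
stats = {"queries": 0}  # stands in for database round trips


def find_orders_by_user(user_id):
    """Naive lookup: one round trip per user."""
    stats["queries"] += 1
    return [order for order in ORDERS if order["user_id"] == user_id]


def find_orders_by_users(user_ids):
    """Batched lookup: one round trip for the whole list of users."""
    stats["queries"] += 1
    grouped = {user_id: [] for user_id in user_ids}
    for order in ORDERS:
        if order["user_id"] in grouped:
            grouped[order["user_id"]].append(order)
    return grouped


user_ids = [f"user-{n}" for n in range(1, 101)]  # the 100 users from the root query

stats["queries"] = 0
naive = {user_id: find_orders_by_user(user_id) for user_id in user_ids}
naive_queries = stats["queries"]

stats["queries"] = 0
batched = find_orders_by_users(user_ids)
batched_queries = stats["queries"]

print(naive_queries, batched_queries)
```

Both approaches return the same orders per user; the naive version needs one lookup per user while the batched version needs a single lookup for all of them.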

The Solution: Batch Loading

Implement batch reference resolution on the orders subgraph:

// orders-subgraph/src/dataloaders.ts

import DataLoader from "dataloader";

export function createOrderDataLoaders() {
  // Typing the loader (rather than the callback argument) lets DataLoader pass its readonly key array
  const ordersByUserIdLoader = new DataLoader<string, Order[]>(async (userIds) => {
    // Single batch query for all user IDs (`db` stands in for your data-access layer)
    const ordersByUserId: Record<string, Order[]> = await db.orders.findByUserIds([...userIds]);

    // Return orders grouped by user ID in request order
    return userIds.map((userId) => ordersByUserId[userId] ?? []);
  }, {
    batchScheduleFn: (callback) => {
      // Delay batch by 5ms to allow accumulation
      setTimeout(callback, 5);
    },
  });

  return { ordersByUserIdLoader };
}

@DgsDataLoader(name = "ordersByUserIdLoader")
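public class OrdersByUserIdDataLoader implements BatchLoader<String, List<Order>> {
    private final OrderRepository orderRepository;

    // Constructor injection; without it the final field is never initialized
    public OrdersByUserIdDataLoader(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }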

    @Override
    public CompletableFuture<List<List<Order>>> load(List<String> userIds) {
        return CompletableFuture.supplyAsync(() -> {
            // Single batch query for all user IDs
            Map<String, List<Order>> ordersByUserId = orderRepository
                .findByUserIdIn(userIds)
                .stream()
                .collect(Collectors.groupingBy(Order::getUserId));
            // Return orders grouped by user ID in request order
            return userIds.stream()
                .map(uid -> ordersByUserId.getOrDefault(uid, List.of()))
                .collect(Collectors.toList());
        });
    }
}

async def load_orders_by_user_ids(user_ids: list[str]) -> list[list[Order]]:
    # Single batch query for all user IDs
    all_orders = await db.orders.find_by_user_ids(user_ids)
    orders_by_user: dict[str, list[Order]] = {}
    for order in all_orders:
        orders_by_user.setdefault(order.user_id, []).append(order)
    # Return orders grouped by user ID in request order
    return [orders_by_user.get(uid, []) for uid in user_ids]

def create_order_dataloaders():
    return {"orders_by_user_id_loader": DataLoader(load_fn=load_orders_by_user_ids)}

public class OrdersByUserIdDataLoader : GroupedDataLoader<string, Order>
{
    private readonly IOrderRepository _repository;

    public OrdersByUserIdDataLoader(IBatchScheduler batchScheduler, IOrderRepository repository)
        : base(batchScheduler)
    {
        _repository = repository;
    }

    protected override async Task<ILookup<string, Order>> LoadGroupedBatchAsync(
        IReadOnlyList<string> keys, CancellationToken ct)
    {
        // Single batch query for all user IDs
        var orders = await _repository.FindByUserIdsAsync(keys, ct);
        return orders.ToLookup(o => o.UserId);
    }
}

Then use it in your resolver:

User: {
  orders: async (
    reference: { id: string },
    _,
    context: { dataloaders: DataLoaders }
  ) => {
    return context.dataloaders.ordersByUserIdLoader.load(reference.id);
  },
}

@DgsData(parentType = "User", field = "orders")
public CompletableFuture<List<Order>> getUserOrders(
        DgsDataFetchingEnvironment env) {
    User user = env.getSource(); // parent User, not its toString() representation
    DataLoader<String, List<Order>> loader = env.getDataLoader("ordersByUserIdLoader");
    return loader.load(user.getId());
}

@strawberry.federation.type(keys=["id"])
class User:
    id: str

    @strawberry.field
    async def orders(self, info: strawberry.types.Info) -> list["Order"]:
        # Loaders live under the "dataloaders" key of the request context
        return await info.context["dataloaders"]["orders_by_user_id_loader"].load(self.id)

[ExtendObjectType(typeof(User))]
public class UserOrdersExtension
{
    public async Task<IEnumerable<Order>> GetOrders(
        [Parent] User user,
        OrdersByUserIdDataLoader dataLoader,
        CancellationToken ct) =>
        await dataLoader.LoadAsync(user.Id, ct);
}

Now the entire users query triggers:

1 query to fetch 100 users
1 batch query to fetch all orders for those 100 users
Total: 2 queries instead of 101
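
The arithmetic above can be checked with a small simulation. This is a minimal sketch, assuming a hypothetical in-memory FakeStore that counts every simulated round trip; the class, its method names, and the "user-N" ID format are illustrative and not part of any library.

```typescript
// Hypothetical order row; only the fields needed for the sketch.
interface OrderRow {
  id: string;
  userId: string;
}

// Hypothetical in-memory store that counts each simulated query.
class FakeStore {
  queryCount: number = 0;
  private orders: OrderRow[];

  constructor(orders: OrderRow[]) {
    this.orders = orders;
  }

  // One query that returns the first `limit` user IDs.
  findUserIds(limit: number): string[] {
    this.queryCount += 1;
    const ids: string[] = [];
    for (let i = 1; i <= limit; i++) ids.push("user-" + i);
    return ids;
  }

  // One query per user: what an unbatched resolver ends up doing.
  findOrdersByUserId(userId: string): OrderRow[] {
    this.queryCount += 1;
    return this.orders.filter((row) => row.userId === userId);
  }

  // One query for all users, grouped by user ID.
  findOrdersByUserIds(userIds: readonly string[]): Record<string, OrderRow[]> {
    this.queryCount += 1;
    const grouped: Record<string, OrderRow[]> = {};
    for (const userId of userIds) grouped[userId] = [];
    for (const row of this.orders) {
      const requested = Object.prototype.hasOwnProperty.call(grouped, row.userId);
      const bucket = requested ? grouped[row.userId] : undefined;
      if (bucket !== undefined) bucket.push(row);
    }
    return grouped;
  }
}

// Unbatched: 1 query for users, then 1 query per user.
function countQueriesNaive(store: FakeStore, limit: number): number {
  const userIds = store.findUserIds(limit);
  for (const userId of userIds) store.findOrdersByUserId(userId);
  return store.queryCount;
}

// Batched: 1 query for users, then 1 query for all of their orders.
function countQueriesBatched(store: FakeStore, limit: number): number {
  const userIds = store.findUserIds(limit);
  store.findOrdersByUserIds(userIds);
  return store.queryCount;
}
```

Calling countQueriesNaive with a limit of 100 reports 101 queries, while countQueriesBatched reports 2. DataLoader reaches the batched shape without changing resolver code, because it collects the individual load calls and hands them to the batch function together.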

Authentication and Authorization in Federated Graphs

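Authentication should be handled once, at the gateway, and the resulting identity forwarded to every subgraph. The gateway validates the token and builds a user context. Because gateway context objects are not shared with subgraphs automatically, the identity is typically forwarded on each subgraph request, for example as headers.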

Token Validation in Gateway

// gateway/src/auth.ts

import jwt from "jsonwebtoken";

interface UserContext {
  userId: string;
  email: string;
  roles: string[];
}

export function authenticateRequest(authHeader?: string): UserContext | null {
  if (!authHeader) return null;

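  // Accept only a well-formed "Bearer <token>" header
  const [scheme, token] = authHeader.split(" ");
  if (scheme !== "Bearer" || !token) return null;

  try {
    // Pin the algorithm so the token cannot choose how it is verified
    const decoded = jwt.verify(token, process.env.JWT_SECRET!, {
      algorithms: ["HS256"],
    }) as UserContext;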
    return decoded;
  } catch (error) {
    console.error("Token verification failed:", error);
    return null;
  }
}

// gateway/src/main/java/auth/JwtAuthFilter.java
import io.jsonwebtoken.*;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class JwtAuthFilter extends OncePerRequestFilter {
    @Value("${jwt.secret}")
    private String jwtSecret;

    @Override
    protected void doFilterInternal(HttpServletRequest request,
            HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {
        String authHeader = request.getHeader("Authorization");
        if (authHeader != null && authHeader.startsWith("Bearer ")) {
            String token = authHeader.substring(7);
            try {
                Claims claims = Jwts.parserBuilder()
                    .setSigningKey(jwtSecret.getBytes())
                    .build()
                    .parseClaimsJws(token)
                    .getBody();
                var auth = new UsernamePasswordAuthenticationToken(
                    claims.getSubject(), null,
                    AuthorityUtils.createAuthorityList(
                        ((List<String>) claims.get("roles")).stream()
                            .map(r -> "ROLE_" + r).toArray(String[]::new)));
                SecurityContextHolder.getContext().setAuthentication(auth);
            } catch (JwtException e) {
                log.error("Token verification failed: {}", e.getMessage());
            }
        }
        filterChain.doFilter(request, response);
    }
}

# gateway/auth.py
import jwt
import os
from dataclasses import dataclass

@dataclass
class UserContext:
    user_id: str
    email: str
    roles: list[str]

def authenticate_request(auth_header: str | None) -> UserContext | None:
    if not auth_header:
        return None
    token = auth_header.replace("Bearer ", "")
    try:
        decoded = jwt.decode(
            token,
            os.getenv("JWT_SECRET", ""),
            algorithms=["HS256"],
        )
        return UserContext(
            user_id=decoded["userId"],
            email=decoded["email"],
            roles=decoded.get("roles", []),
        )
    except jwt.PyJWTError as e:
        print(f"Token verification failed: {e}")
        return None

// gateway/Auth/JwtAuthHandler.cs
using Microsoft.AspNetCore.Authentication.JwtBearer;
using Microsoft.IdentityModel.Tokens;
using System.Text;

public record UserContext(string UserId, string Email, List<string> Roles);

public static class AuthExtensions
{
    public static UserContext? AuthenticateRequest(string? authHeader)
    {
        if (authHeader is null) return null;
        var token = authHeader.Replace("Bearer ", "");
        try
        {
            var handler = new System.IdentityModel.Tokens.Jwt.JwtSecurityTokenHandler();
            var key = Encoding.UTF8.GetBytes(
                Environment.GetEnvironmentVariable("JWT_SECRET") ?? "");
            handler.ValidateToken(token, new TokenValidationParameters
            {
                ValidateIssuerSigningKey = true,
                IssuerSigningKey = new SymmetricSecurityKey(key),
                ValidateIssuer = false,
                ValidateAudience = false,
            }, out var validatedToken);
            var jwt = (System.IdentityModel.Tokens.Jwt.JwtSecurityToken)validatedToken;
            return new UserContext(
                jwt.Claims.First(c => c.Type == "userId").Value,
                jwt.Claims.First(c => c.Type == "email").Value,
                jwt.Claims.Where(c => c.Type == "roles").Select(c => c.Value).ToList()
            );
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"Token verification failed: {ex.Message}");
            return null;
        }
    }
}

Context Propagation

const server = new ApolloServer({
  gateway,
  context: async ({ req }) => {
    const user = authenticateRequest(req.headers.authorization);
    
    return {
      user,
      authenticated: !!user,
      dataloaders: createDataLoaders(),
    };
  },
});

// Propagate authenticated user via GraphQL context
@Component
public class GraphQLContextBuilder implements DgsRequestCustomizer {
    @Override
    public DgsRequestData customize(DgsRequestData requestData, HttpServletRequest request) {
        var authHeader = request.getHeader("Authorization");
        var userContext = JwtAuthFilter.extractUserContext(request);
        return DgsRequestData.builder()
            .from(requestData)
            .extensions(Map.of(
                "user", userContext,
                "authenticated", userContext != null
            ))
            .build();
    }
}

# FastAPI context dependency
from fastapi import Request, Depends

async def get_graphql_context(request: Request) -> dict:
    auth_header = request.headers.get("authorization")
    user = authenticate_request(auth_header)
    return {
        "user": user,
        "authenticated": user is not None,
        "dataloaders": create_dataloaders(),
        "request": request,
    }

# In your GraphQL router setup:
app.include_router(GraphQLRouter(schema, context_getter=get_graphql_context))

// ASP.NET Core context propagation via middleware
public class GraphQLContextMiddleware
{
    private readonly RequestDelegate _next;

    public async Task InvokeAsync(HttpContext context)
    {
        var authHeader = context.Request.Headers.Authorization.ToString();
        var user = AuthExtensions.AuthenticateRequest(authHeader);

        context.Items["user"] = user;
        context.Items["authenticated"] = user is not null;

        await _next(context);
    }
}

// In Hot Chocolate, access via IResolverContext:
// var user = context.GetGlobalStateOrDefault<UserContext>("user");
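
The context built above lives only in the gateway process, so the identity still has to travel with each subgraph request. The following is a minimal sketch of one way to do that with headers. The header names, the GatewayUser shape, and both function names are assumptions made for illustration; with Apollo Gateway, the gateway-side function would typically be called from a RemoteGraphQLDataSource subclass's willSendRequest hook.

```typescript
// Hypothetical identity shape, mirroring the gateway's UserContext.
interface GatewayUser {
  userId: string;
  email: string;
  roles: string[];
}

type HeaderMap = Record<string, string>;

// Header names are assumptions for this sketch, not a federation standard.
const USER_ID_HEADER = "x-user-id";
const USER_EMAIL_HEADER = "x-user-email";
const USER_ROLES_HEADER = "x-user-roles";

// Gateway side: turn the verified identity into headers for the subgraph request.
function toSubgraphHeaders(user: GatewayUser | null): HeaderMap {
  const headers: HeaderMap = {};
  if (user === null) return headers; // anonymous request: forward nothing
  headers[USER_ID_HEADER] = user.userId;
  headers[USER_EMAIL_HEADER] = user.email;
  headers[USER_ROLES_HEADER] = user.roles.join(",");
  return headers;
}

// Subgraph side: rebuild the identity from headers set by the gateway.
function fromSubgraphHeaders(headers: HeaderMap): GatewayUser | null {
  const userId = headers[USER_ID_HEADER];
  if (!userId) return null;
  const rawRoles = headers[USER_ROLES_HEADER] || "";
  return {
    userId: userId,
    email: headers[USER_EMAIL_HEADER] || "",
    roles: rawRoles === "" ? [] : rawRoles.split(","),
  };
}
```

This design only holds if subgraphs are reachable exclusively from the gateway; otherwise any caller could set these headers directly. Role names containing commas would also need a different encoding, such as JSON.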

Authorization in Subgraphs

Use field-level directives for authorization:

# orders-subgraph/schema.graphql

directive @auth(roles: [String!]!) on FIELD_DEFINITION

type Order @key(fields: "id") {
  id: ID!
  total: Float! @auth(roles: ["ADMIN", "ORDER_VIEWER"])
  items: [OrderItem!]!
}

Implement the directive:

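import { defaultFieldResolver, GraphQLSchema } from "graphql";
import { getDirective, MapperKind, mapSchema } from "@graphql-tools/utils";

function authDirectiveTransformer(schema: GraphQLSchema) {
  return mapSchema(schema, {
    [MapperKind.OBJECT_FIELD]: (fieldConfig) => {
      // getDirective returns undefined when the field has no @auth directive
      const authDirective = getDirective(schema, fieldConfig, "auth")?.[0];
      
      if (!authDirective) return fieldConfig;

      // Fields without an explicit resolver fall back to graphql-js's default
      const originalResolver = fieldConfig.resolve ?? defaultFieldResolver;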

      return {
        ...fieldConfig,
        resolve: async (source, args, context, info) => {
          if (!context.user) {
            throw new Error("Unauthorized: authentication required");
          }

          const requiredRoles: string[] = authDirective.roles;
          const hasRole = requiredRoles.some((role) =>
            context.user.roles.includes(role)
          );

          if (!hasRole) {
            throw new Error(`Forbidden: requires one of ${requiredRoles.join(", ")}`);
          }

          return originalResolver?.(source, args, context, info);
        },
      };
    },
  });
}

// Spring Security method-level authorization on DGS resolvers
@PreAuthorize("hasAnyRole('ADMIN', 'ORDER_VIEWER')")
@DgsData(parentType = "Order", field = "total")
public double getTotal(DgsDataFetchingEnvironment env) {
    Order order = env.getSource();
    return order.getTotal();
}

// Custom authorization directive via DGS instrumentation
@Component
public class AuthDirectiveInstrumentation extends SimpleInstrumentation {
    @Override
    public DataFetcher<?> instrumentDataFetcher(DataFetcher<?> dataFetcher,
            InstrumentationFieldFetchParameters params) {
        var field = params.getEnvironment().getFieldDefinition();
        var authDirective = field.getDirective("auth");
        if (authDirective == null) return dataFetcher;

        @SuppressWarnings("unchecked")
        List<String> requiredRoles = (List<String>) authDirective.getArgument("roles").getValue();

        return env -> {
            var authentication = SecurityContextHolder.getContext().getAuthentication();
            if (authentication == null || !authentication.isAuthenticated()) {
                throw new AccessDeniedException("Unauthorized: authentication required");
            }
            boolean hasRole = authentication.getAuthorities().stream()
                .anyMatch(a -> requiredRoles.contains(a.getAuthority().replace("ROLE_", "")));
            if (!hasRole) {
                throw new AccessDeniedException(
                    "Forbidden: requires one of " + String.join(", ", requiredRoles));
            }
            return dataFetcher.get(env);
        };
    }
}

# Strawberry custom permission class for field-level auth
import strawberry
from strawberry.permission import BasePermission
from strawberry.types import Info

class IsAuthenticated(BasePermission):
    message = "Unauthorized: authentication required"

    def has_permission(self, source, info: Info, **kwargs) -> bool:
        return info.context.get("authenticated", False)

def has_role(*roles: str) -> type[BasePermission]:
    """Build a permission class; Strawberry instantiates permission classes itself."""

    class HasRole(BasePermission):
        message = f"Forbidden: requires one of {', '.join(roles)}"

        def has_permission(self, source, info: Info, **kwargs) -> bool:
            user = info.context.get("user")
            if not user:
                return False
            return any(r in user.roles for r in roles)

    return HasRole

@strawberry.type
class Order:
    id: str
    _total: strawberry.Private[float]  # stored on the object, hidden from the schema

    @strawberry.field(
        permission_classes=[IsAuthenticated, has_role("ADMIN", "ORDER_VIEWER")]
    )
    def total(self) -> float:
        return self._total

// Hot Chocolate field authorization with custom directive
using HotChocolate.Authorization;

[Authorize(Roles = new[] { "ADMIN", "ORDER_VIEWER" })]
public double Total => _total;

// Or via custom authorization handler
public class AuthDirectiveHandler : AuthorizationHandler
{
    protected override ValueTask<AuthorizeResult> AuthorizeAsync(
        IMiddlewareContext context, AuthorizeDirective directive)
    {
        var user = context.GetGlobalStateOrDefault<UserContext>("user");

        if (user is null)
            return new(AuthorizeResult.NotAllowed);

        var requiredRoles = directive.Roles ?? [];
        bool hasRole = requiredRoles.Any(role => user.Roles.Contains(role));

        if (!hasRole)
            return new(AuthorizeResult.NotAllowed);

        return new(AuthorizeResult.Allowed);
    }
}

// Register in Program.cs:
// builder.Services.AddGraphQLServer()
//     .AddAuthorizationHandler<AuthDirectiveHandler>();

Schema Governance and Composition Checks

As your federated graph grows, schema governance becomes critical. Apollo provides composition checks that validate schema changes before they’re deployed.
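
To make the idea of a schema check concrete, here is a toy sketch of classifying changes between two schema versions. It is not Apollo's algorithm: the SchemaShape model (type name to field name to type string), the change kinds, and the rule that every field type change counts as breaking are simplifying assumptions for illustration. Real checks also take composition rules and recorded client operations into account.

```typescript
// Simplified schema model: type name -> field name -> field type as a string.
type SchemaShape = Record<string, Record<string, string>>;

interface SchemaChange {
  kind: "TYPE_REMOVED" | "FIELD_REMOVED" | "FIELD_TYPE_CHANGED" | "FIELD_ADDED";
  path: string;
  breaking: boolean;
}

function hasKey(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Compare two versions and list the changes to types present in `before`.
function diffSchemas(before: SchemaShape, after: SchemaShape): SchemaChange[] {
  const changes: SchemaChange[] = [];
  for (const typeName of Object.keys(before)) {
    if (!hasKey(after, typeName)) {
      changes.push({ kind: "TYPE_REMOVED", path: typeName, breaking: true });
      continue;
    }
    const oldFields = before[typeName] || {};
    const newFields = after[typeName] || {};
    for (const fieldName of Object.keys(oldFields)) {
      const path = typeName + "." + fieldName;
      if (!hasKey(newFields, fieldName)) {
        changes.push({ kind: "FIELD_REMOVED", path: path, breaking: true });
      } else if (newFields[fieldName] !== oldFields[fieldName]) {
        // Conservative: treat any type change as breaking.
        changes.push({ kind: "FIELD_TYPE_CHANGED", path: path, breaking: true });
      }
    }
    for (const fieldName of Object.keys(newFields)) {
      if (!hasKey(oldFields, fieldName)) {
        const path = typeName + "." + fieldName;
        changes.push({ kind: "FIELD_ADDED", path: path, breaking: false });
      }
    }
  }
  return changes;
}

// A CI job would fail the build when this returns true.
function hasBreakingChanges(changes: SchemaChange[]): boolean {
  return changes.some((change) => change.breaking);
}
```

Additive changes pass while removals and type changes fail, which is the same policy that makes additive schema evolution workable without versioned endpoints.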

Apollo Studio Integration

// .apollo/apollo.config.js

module.exports = {
  client: {
    service: {
      name: "my-federated-graph",
      url: "http://localhost:4000/graphql",
    },
  },
};

// application.yml (Spring Boot DGS federation config)
// apollo:
//   graph-ref: "my-federated-graph@current"
//   key: "${APOLLO_KEY}"
// subgraph:
//   name: "users"
//   url: "http://localhost:4000/graphql"

// Equivalent Java bean configuration:
@Configuration
public class ApolloConfig {
    @Value("${apollo.graph-ref:my-federated-graph@current}")
    private String graphRef;

    @Value("${subgraph.url:http://localhost:4000/graphql}")
    private String subgraphUrl;
}

# apollo_config.py — equivalent configuration for Strawberry federation
import os

APOLLO_CONFIG = {
    "client": {
        "service": {
            "name": os.getenv("APOLLO_GRAPH_REF", "my-federated-graph"),
            "url": os.getenv("SUBGRAPH_URL", "http://localhost:4000/graphql"),
        }
    }
}

// appsettings.json equivalent for Hot Chocolate federation
// {
//   "Apollo": {
//     "GraphRef": "my-federated-graph@current",
//     "Key": "<APOLLO_KEY>"
//   },
//   "Subgraph": {
//     "Name": "users",
//     "Url": "http://localhost:4000/graphql"
//   }
// }

// In Program.cs, bind the configuration:
builder.Services.Configure<ApolloOptions>(
    builder.Configuration.GetSection("Apollo"));

CI/CD Composition Check

# .github/workflows/schema-check.yml

name: GraphQL Schema Check

on:
  pull_request:
    paths:
      - "schema.graphql"

jobs:
  composition-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: "18"

      - name: Install dependencies
        run: npm install

      - name: Run composition check
        run: npx -p @apollo/rover rover subgraph check my-federated-graph@current --name users --schema ./schema.graphql
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}

Rules to enforce:

# apollo.config.yaml

composition:
  checkRules:
    - rule: BREAKING_SCHEMA_CHANGE
      level: ERROR
    - rule: TYPE_QUERY_ROOT_CHANGE
      level: ERROR
    - rule: DIRECTIVE_REMOVED
      level: WARN

Performance: Caching, Persisted Queries, and Query Limits

Response Caching

Set cache hints at the field level:

import { cacheControlFromInfo } from "@apollo/cache-control-types";

const resolvers = {
  User: {
    profile: (user, _args, _context, info) => {
      // Dynamic cache hint: 1 hour
      cacheControlFromInfo(info).setCacheHint({ maxAge: 3600 });
      return user.profile;
    },
    orders: (user, _args, { dataloaders }, info) => {
      // Dynamic cache hint: 5 min, user-specific
      cacheControlFromInfo(info).setCacheHint({ maxAge: 300, scope: "PRIVATE" });
      return dataloaders.ordersByUserIdLoader.load(user.id);
    },
  },
};

// Spring Boot with DGS — use @Cacheable for field-level caching
@DgsData(parentType = "User", field = "profile")
@Cacheable(value = "userProfiles", key = "#env.source.id")
public UserProfile getProfile(DgsDataFetchingEnvironment env) {
    User user = env.getSource();
    return user.getProfile(); // Cached for 1 hour via Spring Cache
}

@DgsData(parentType = "User", field = "orders")
public CompletableFuture<List<Order>> getOrders(DgsDataFetchingEnvironment env) {
    User user = env.getSource();
    // Loaded via DataLoader, cache hints set via Apollo Federation @cacheControl directive
    DataLoader<String, List<Order>> loader = env.getDataLoader("ordersByUserIdLoader");
    return loader.load(user.getId());
}

// In schema: add @cacheControl directive for Apollo Router
// type User { profile: UserProfile @cacheControl(maxAge: 3600) }

# Strawberry — cache hints via schema directive
import strawberry
from strawberry.types import Info

@strawberry.type
class User:
    id: str

    @strawberry.field(directives=[strawberry.directive_field("cacheControl", maxAge=3600)])
    def profile(self) -> "UserProfile":
        return self._profile  # 1 hour cache hint

    @strawberry.field
    async def orders(self, info: Info) -> list["Order"]:
        # 5 min, user-specific — set via @cacheControl(maxAge: 300, scope: PRIVATE)
        return await info.context["dataloaders"]["orders_by_user_id_loader"].load(self.id)

// Hot Chocolate — cache control via directives or response cache
using HotChocolate.Caching;

[ExtendObjectType(typeof(User))]
public class UserCacheExtensions
{
    [CacheControl(MaxAge = 3600)] // 1 hour
    public UserProfile GetProfile([Parent] User user) => user.Profile;

    [CacheControl(MaxAge = 300, Scope = CacheControlScope.Private)] // 5 min, private
    public async Task<IEnumerable<Order>> GetOrders(
        [Parent] User user,
        OrdersByUserIdDataLoader dataLoader,
        CancellationToken ct) =>
        await dataLoader.LoadAsync(user.Id, ct);
}
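
Field-level hints matter because they combine into one policy for the whole response. Apollo Server's documented rule is that the response takes the lowest maxAge of any field and becomes PRIVATE if any field is PRIVATE. The sketch below reproduces that rule in isolation; the type and function names are hypothetical, and it leaves out details such as the default maxAge of 0 that Apollo Server applies to unhinted root and object-typed fields.

```typescript
type CacheScope = "PUBLIC" | "PRIVATE";

// Hint attached to one field in the response.
interface FieldCacheHint {
  maxAge: number; // seconds
  scope?: CacheScope;
}

// Policy for the response as a whole.
interface ResponseCachePolicy {
  maxAge: number;
  scope: CacheScope;
}

// The most restrictive hint wins: lowest maxAge, PRIVATE if any field is PRIVATE.
function combineCacheHints(hints: readonly FieldCacheHint[]): ResponseCachePolicy {
  if (hints.length === 0) return { maxAge: 0, scope: "PUBLIC" }; // nothing to cache on
  let maxAge = Infinity;
  let scope: CacheScope = "PUBLIC";
  for (const hint of hints) {
    maxAge = Math.min(maxAge, hint.maxAge);
    if (hint.scope === "PRIVATE") scope = "PRIVATE";
  }
  return { maxAge: maxAge, scope: scope };
}
```

With the hints used in this section, a query selecting both profile (3600 seconds) and orders (300 seconds, PRIVATE) is cached for 300 seconds per user, so one short-lived field shortens caching for everything fetched alongside it.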

Persisted Queries

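Automatic persisted queries (APQ) reduce request payload size by letting clients send a SHA-256 hash in place of the full query text. On their own they do not restrict which operations can run; the security benefit comes from pairing them with a safelist of registered operations: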

import { ApolloServer } from "@apollo/server";
import { InMemoryLRUCache } from "@apollo/utils.keyvaluecache";

const server = new ApolloServer({
  schema,
  // APQ is on by default in Apollo Server; this gives it a dedicated cache and TTL
  persistedQueries: {
    cache: new InMemoryLRUCache(),
    ttl: 900, // seconds
  },
});

// Apollo Router handles Automatic Persisted Queries (APQ) natively
// For DGS, enable APQ via Spring Cache
@Configuration
public class PersistedQueriesConfig {
    @Bean
    public CacheManager cacheManager() {
        var cacheConfig = RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofHours(24));
        return RedisCacheManager.builder(redisConnectionFactory())
            .withCacheConfiguration("persisted-queries", cacheConfig)
            .build();
    }
}

// Apollo Router router.yaml:
// apq:
//   enabled: true
//   subgraph:
//     all:
//       enabled: true

# Apollo Router handles APQ natively
# For Ariadne/Strawberry, implement a custom APQ middleware
import hashlib
import json
from starlette.middleware.base import BaseHTTPMiddleware

class AutoPersistedQueryMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, cache=None):
        super().__init__(app)
        self.cache = cache or {}  # Use Redis in production

    async def dispatch(self, request, call_next):
        if request.method == "POST":
            body = await request.json()
            extensions = body.get("extensions", {})
            apq = extensions.get("persistedQuery", {})
            if apq and not body.get("query"):
                query_hash = apq.get("sha256Hash")
                cached_query = self.cache.get(query_hash)
                if cached_query:
                    # Inject cached query into request
                    body["query"] = cached_query
        return await call_next(request)

// Hot Chocolate — Automatic Persisted Queries
using HotChocolate.PersistedQueries;

// In Program.cs:
builder.Services
    .AddGraphQLServer()
    .UseAutomaticPersistedQueryPipeline()
    .AddInMemoryQueryStorage(); // Use Redis in production:
    // .AddRedisQueryStorage(redis);

// Client configuration (using Strawberry Shake or plain HttpClient):
// Add extensions: { persistedQuery: { version: 1, sha256Hash: "..." } }
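
All of the server setups above implement the same exchange: a hash-only request either hits the store or is answered with PersistedQueryNotFound, after which the client retries with both hash and query text so the server can register it. The sketch below models that exchange with a hypothetical ApqRegistry class. The hash function is injected so the example has no dependencies; a real server would use hex-encoded SHA-256, and the "HashMismatch" label is this sketch's own name, not a protocol constant.

```typescript
// Injected so the sketch stays dependency-free; real APQ uses SHA-256 hex digests.
type HashFn = (queryText: string) => string;

interface ApqRequest {
  sha256Hash: string;
  query?: string; // omitted on the first, hash-only attempt
}

interface ApqOutcome {
  query: string | null; // query text to execute, if resolved
  error: string | null; // error label to send back, if not
}

class ApqRegistry {
  private hashFn: HashFn;
  private stored: Record<string, string> = {};

  constructor(hashFn: HashFn) {
    this.hashFn = hashFn;
  }

  resolve(request: ApqRequest): ApqOutcome {
    const hash = request.sha256Hash;
    if (request.query === undefined) {
      // Hash-only request: serve from the store or ask the client to retry.
      const known = Object.prototype.hasOwnProperty.call(this.stored, hash);
      return known
        ? { query: this.stored[hash], error: null }
        : { query: null, error: "PersistedQueryNotFound" };
    }
    // Hash plus query: verify before registering, so a client cannot
    // store arbitrary text under someone else's hash.
    if (this.hashFn(request.query) !== hash) {
      return { query: null, error: "HashMismatch" };
    }
    this.stored[hash] = request.query;
    return { query: request.query, error: null };
  }
}
```

The verification step is what keeps the store trustworthy. A middleware that only reads from the cache never learns new queries, so the registration branch matters as much as the lookup.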

Client-side:

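import { HttpLink } from "@apollo/client";
import { createPersistedQueryLink } from "@apollo/client/link/persisted-queries";
import { sha256 } from "crypto-hash";

const httpLink = new HttpLink({ uri: "http://localhost:4000/graphql" });

const link = createPersistedQueryLink({
  sha256, // a hash function is required
  useGETForHashedQueries: true,
}).concat(httpLink);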

// Java Apollo client (using com.apollographql.apollo3)
ApolloClient client = ApolloClient.Builder()
    .serverUrl("http://localhost:4000/graphql")
    .httpEngine(DefaultHttpEngine.Builder()
        .addInterceptor(new AutoPersistedQueryInterceptor())
        .build())
    .build();

# Python client with gql
from gql import Client
from gql.transport.aiohttp import AIOHTTPTransport
import hashlib
from graphql import print_ast

class PersistedQueryTransport(AIOHTTPTransport):
    async def execute(self, document, *args, **kwargs):
        query_str = print_ast(document)
        query_hash = hashlib.sha256(query_str.encode()).hexdigest()
        kwargs.setdefault("extensions", {})["persistedQuery"] = {
            "version": 1, "sha256Hash": query_hash
        }
        return await super().execute(document, *args, **kwargs)

transport = PersistedQueryTransport(url="http://localhost:4000/graphql")
client = Client(transport=transport)

// Strawberry Shake client (NuGet: StrawberryShake.Transport.Http)
builder.Services
    .AddMyGraphQLClient()
    .ConfigureHttpClient(c => c.BaseAddress = new Uri("http://localhost:4000/graphql"))
    .AddHttpMessageHandler<PersistedQueryHttpMessageHandler>();

// Persisted queries are handled automatically by Strawberry Shake
// during code generation with --persisted-query flag

Query Complexity Analysis

Reject expensive queries before execution by estimating their cost and enforcing a limit:

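import { GraphQLError } from "graphql";
import {
  getComplexity,
  simpleEstimator,
  fieldExtensionsEstimator,
} from "graphql-query-complexity";

const MAX_COMPLEXITY = 5000;

const server = new ApolloServer({
  schema,
  // plugins is an array; per-request hooks are returned from requestDidStart
  plugins: [
    {
      async requestDidStart() {
        return {
          async didResolveOperation({ request, document }) {
            const complexity = getComplexity({
              schema,
              operationName: request.operationName,
              query: document,
              variables: request.variables,
              estimators: [
                fieldExtensionsEstimator(),
                simpleEstimator({ defaultComplexity: 1 }),
              ],
            });

            if (complexity > MAX_COMPLEXITY) {
              throw new GraphQLError(
                `Query too complex: ${complexity} > ${MAX_COMPLEXITY}`
              );
            }

            console.log(`Query complexity: ${complexity}`);
          },
        };
      },
    },
  ],
});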

// DGS query complexity via custom instrumentation
@Component
public class QueryComplexityInstrumentation extends SimpleInstrumentation {
    private static final int MAX_COMPLEXITY = 5000;

    @Override
    public InstrumentationContext<ExecutionResult> beginExecuteOperation(
            InstrumentationExecuteOperationParameters params) {
        var document = params.getExecutionContext().getDocument();
        int complexity = calculateComplexity(document);

        if (complexity > MAX_COMPLEXITY) {
            throw new AbortExecutionException(
                String.format("Query too complex: %d > %d", complexity, MAX_COMPLEXITY));
        }

        log.info("Query complexity: {}", complexity);
        return super.beginExecuteOperation(params);
    }

    private int calculateComplexity(Document document) {
        // Use graphql-java ComplexityCalculator or implement custom logic
        return ComplexityCalculator.calculate(document);
    }
}

# Strawberry query complexity
from strawberry.extensions import QueryDepthLimiter
from strawberry.extensions.query_complexity import QueryComplexityExtension

schema = strawberry.Schema(
    query=Query,
    extensions=[
        QueryDepthLimiter(max_depth=10),
        QueryComplexityExtension(
            max_complexity=5000,
            estimators=[
                SimpleEstimator(default_complexity=1),
                FieldEstimator(),
            ],
        ),
    ],
)

// Hot Chocolate query complexity
using HotChocolate.Execution.Configuration;

builder.Services
    .AddGraphQLServer()
    .SetMaxAllowedComplexity(5000)
    .ModifyRequestOptions(opt =>
    {
        opt.Complexity.Enable = true;
        opt.Complexity.MaximumAllowed = 5000;
        opt.Complexity.DefaultComplexity = 1;
        opt.Complexity.DefaultResolverComplexity = 5;
    });

// Field-level complexity hints:
// [GraphQLComplexity(10)]
// public IEnumerable<Order> GetOrders() => ...
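
To show where a number like 5000 comes from, here is a toy estimator over a hand-built selection tree. It is a sketch of the general idea (field cost plus list multiplier times child cost), not the algorithm used by any of the libraries above; the SelectionNode shape, the default cost of 1, and the assumed list sizes are all illustrative.

```typescript
// Hypothetical, simplified view of a query's selection set.
interface SelectionNode {
  field: string;
  cost?: number; // cost of resolving the field itself, defaults to 1
  multiplier?: number; // expected number of items for list fields, defaults to 1
  children?: SelectionNode[];
}

// Cost of a field = its own cost + multiplier * (sum of its children's costs).
function estimateComplexity(node: SelectionNode): number {
  const ownCost = node.cost === undefined ? 1 : node.cost;
  const multiplier = node.multiplier === undefined ? 1 : node.multiplier;
  let childCost = 0;
  for (const child of node.children || []) {
    childCost += estimateComplexity(child);
  }
  return ownCost + multiplier * childCost;
}

// Returns the estimate, or throws when it exceeds the budget.
function enforceComplexityBudget(root: SelectionNode, maxComplexity: number): number {
  const complexity = estimateComplexity(root);
  if (complexity > maxComplexity) {
    throw new Error(`Query too complex: ${complexity} > ${maxComplexity}`);
  }
  return complexity;
}

// users(limit: 100) { id orders { id } }, assuming about 10 orders per user.
const usersQueryTree: SelectionNode = {
  field: "users",
  multiplier: 100,
  children: [
    { field: "id" },
    { field: "orders", multiplier: 10, children: [{ field: "id" }] },
  ],
};
```

For usersQueryTree, orders costs 1 + 10 × 1 = 11 per user, each user costs 1 + 11 = 12, and the whole query costs 1 + 100 × 12 = 1201, which fits a 5000 budget. Nesting one more list level would multiply the total again, and that multiplicative growth is what the limit guards against.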

Federation vs REST Comparison

Aspect	GraphQL Federation	REST Microservices
Query Specificity	Fetch exactly what you need	Over/under-fetching
Multiple Resources	Single request, multiple fields	Multiple endpoints, multiple requests
Versioning	Additive schema evolution	API versioning (v1, v2, v3)
Type Safety	Strong schema, auto-generated types	Manual API contracts
Caching	Field-level caching, persisted queries	HTTP caching, cache invalidation
Monitoring	Detailed query insights	Basic request/response logging
Composition	Declarative federation directives	Service discovery, API gateways
Error Handling	Partial success possible	All-or-nothing responses
Learning Curve	Moderate (GraphQL + Federation concepts)	Low (standard REST patterns)
Production Maturity	Production-ready (Apollo)	Widely adopted, stable

When to choose Federation: You have multiple teams, need flexible querying, strong types, and a unified API surface.

When to choose REST: Simple services, standard CRUD operations, or teams unfamiliar with GraphQL.

Production Checklist

Before deploying your federated graph to production:

Architecture & Design

Document subgraph ownership (who owns each type/field)
Define clear boundaries between services
Plan for eventual consistency across services
Design for graceful degradation (partial service failures)
Set up dependency mapping (which subgraphs call which)

Implementation

Implement __resolveReference for all @key types
Add DataLoader for batch entity resolution
Use static typing or generated types in every subgraph (TypeScript, Java, C#, or Python type hints)
Implement proper error handling and logging
Add request ID tracking for distributed tracing
Set up health check endpoints on all subgraphs

Performance

Enable query complexity analysis
Configure cache hints on frequently-accessed fields
Set up persisted queries for client applications
Use Apollo Router instead of Apollo Gateway (better performance)
Configure appropriate timeouts for subgraph calls
Load test with realistic query patterns
Monitor query execution times across services

Security

Implement authentication at the gateway level
Use authorization directives on sensitive fields
Validate all input at gateway and subgraph levels
Use HTTPS for all service-to-service communication
Rotate JWT secrets regularly
Implement rate limiting at gateway
Audit logs for all administrative changes

Monitoring & Operations

Set up distributed tracing (Jaeger, Datadog, etc.)
Monitor subgraph response times and error rates
Configure alerts for composition check failures
Set up schema change notifications
Implement circuit breakers for failing subgraphs
Monitor gateway memory and CPU usage
Set up dashboards for key metrics
Document runbooks for common issues
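
The circuit-breaker item in the list above can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the SubgraphCircuitBreaker class, its thresholds, and the choice to pass the current time in as a parameter (which keeps the sketch deterministic) are hypothetical, and a production gateway would more likely rely on its router's or service mesh's built-in traffic controls.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class SubgraphCircuitBreaker {
  private state: BreakerState = "CLOSED";
  private failures: number = 0;
  private openedAt: number = 0;
  private failureThreshold: number;
  private resetAfterMs: number;

  constructor(failureThreshold: number, resetAfterMs: number) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
  }

  // Whether a request to the subgraph may be sent at time `nowMs`.
  allowRequest(nowMs: number): boolean {
    if (this.state === "OPEN" && nowMs - this.openedAt >= this.resetAfterMs) {
      this.state = "HALF_OPEN"; // cool-down elapsed: allow probe traffic
    }
    return this.state !== "OPEN";
  }

  // A successful call closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed probe, or too many consecutive failures, opens the breaker.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = nowMs;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

While the breaker is open, the gateway can skip the failing subgraph and return null for its fields alongside an error entry, which is the partial-success behavior that graceful degradation depends on.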

Deployment

Automate schema composition checks in CI/CD
Use semantic versioning for subgraph schemas
Plan zero-downtime deployments
Set up canary deployments for schema changes
Keep gateway and subgraph dependencies aligned
Document rollback procedures
Test schema changes in staging before production

Governance

Establish schema review process
Document federation directive usage standards
Set up automated linting for GraphQL schemas
Review composition check failures as a team
Maintain type coverage metrics
Schedule regular architecture reviews

Conclusion

GraphQL Federation provides a powerful, scalable approach to building distributed GraphQL systems. By decentralizing schema ownership and providing declarative composition, federation enables independent teams to move faster while maintaining a unified API surface.

The key to a successful federated graph is:

Clear ownership boundaries and entity resolution
Thoughtful performance optimization (batching, caching, query planning)
Strong type safety and schema governance
Comprehensive monitoring and observability
Security-first design with proper authentication and authorization

Whether you’re starting a new microservices architecture or refactoring an existing system, federation provides a solid foundation for scalable, maintainable GraphQL APIs.

GraphQL Federation Server-to-Server Communication Technologies