Controller architecture

High Level Overview

Physical layout and operating model

Legend:

  • Yellow box: namespace
  • Rounded box: processes
  • Rectangle: CR instances
  • Solid arrow (-->) with reconciles: a controller watches the resource and actively reconciles it
  • Solid arrow (-->) with creates: a controller creates the resource
  • Solid arrow (-->) with operates: a controller manages/deploys another process
  • Dotted arrow (-.->) with consumes: a process reads or references the resource as input without actively watching or reconciling it (i.e. the resource is treated as an input/lookup, not as a trigger for a reconcile loop)
flowchart LR
  subgraph hypershift
    cluster-operator([HyperShift Operator])
  end
  subgraph user-clusters
    HostedClusterA
    NodePoolA
  end
  subgraph cluster-a
    control-plane-operator([Control Plane Operator])
    capi-manager([CAPI Manager])
    capi-provider([CAPI Provider])
    HostedControlPlane
    ExternalInfraCluster
    cp-components([Control Plane Components])
    capi-cluster[CAPICluster]
    capi-machine-template[CAPIInfrastructureMachineTemplate]
    capi-machineset[CAPI MachineSet]
    capi-machine[CAPI Machine]
    capi-provider-machine[CAPIInfrastructureMachine]
  end
  cluster-operator-->|reconciles|HostedClusterA
  cluster-operator-->|operates|control-plane-operator
  cluster-operator-->|operates|capi-manager
  cluster-operator-->|operates|capi-provider
  cluster-operator-->|creates|HostedControlPlane
  cluster-operator-->|creates|capi-cluster
  cluster-operator-->|creates|ExternalInfraCluster
  cluster-operator-->|reconciles|NodePoolA
  cluster-operator-->|creates|capi-machine-template
  cluster-operator-->|creates|capi-machineset
  control-plane-operator-->|operates|cp-components
  control-plane-operator-->|reconciles|HostedControlPlane
  capi-manager-->|reconciles|capi-cluster
  capi-manager-->|reconciles|capi-machineset
  capi-manager-->|creates|capi-machine
  capi-provider-->|reconciles|capi-machine
  capi-provider-->|creates|capi-provider-machine
  capi-provider-.->|consumes|capi-machine-template

Major Components

HyperShift Operator

The HyperShift Operator is a singleton within the management cluster that manages the lifecycle of hosted clusters represented by HostedCluster resources.

A single version of the HyperShift Operator knows how to manage multiple hosted OCP versions.

The HyperShift Operator is responsible for:

  • Processing HostedCluster and NodePool resources and managing Control Plane Operator and Cluster API (CAPI) deployments which do the actual work of installing a control plane.
  • Managing the lifecycle of the hosted cluster by handling rollouts of new Control Plane Operator and CAPI deployments based on version changes to HostedCluster and NodePool resources.
  • Aggregating and surfacing information about clusters.

HostedCluster Controller

graph TD
  hosted-cluster-controller[HostedCluster Controller] --> reconcile([Reconcile HostedCluster])
  reconcile --> is-deleted{{Deleted?}}
  is-deleted -->|Yes| teardown([Teardown])
  is-deleted -->|No| sync([Sync])
  teardown --> teardown-complete{{Teardown complete?}}
  teardown-complete -->|Yes| return
  teardown-complete -->|No| reconcile
  sync --> create-namespace([Create Namespace])
  create-namespace --> deploy-cp-operator([Deploy Control Plane Operator])
  deploy-cp-operator --> deploy-capi-manager([Deploy CAPI Manager])
  deploy-capi-manager --> deploy-capi-provider([Deploy CAPI Provider])
  deploy-capi-provider --> create-capi-cluster([Create CAPICluster])
  create-capi-cluster --> create-hosted-control-plane([Create HostedControlPlane])
  create-hosted-control-plane --> create-external-infra-cluster([Create ExternalInfraCluster])
  create-external-infra-cluster --> has-initial-nodes{{HostedCluster has initial nodes?}}
  has-initial-nodes -->|Yes| create-node-pool([Create NodePool])
  has-initial-nodes -->|No| return
  create-node-pool --> return
  return([End])
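
The flowchart above can be read as a single idempotent reconcile function. The following is a minimal Python sketch of that reading, not the operator's actual code: `HostedCluster`, `FakeManagementCluster`, `SYNC_ORDER` and `reconcile_hosted_cluster` are hypothetical names, the management cluster is an in-memory list, and teardown is assumed to remove one object per pass so that the "Teardown complete?" loop is visible.

```python
from dataclasses import dataclass, field

# Kinds the Sync path ensures, in the order the flowchart lists them.
SYNC_ORDER = (
    "Namespace", "ControlPlaneOperator", "CAPIManager", "CAPIProvider",
    "CAPICluster", "HostedControlPlane", "ExternalInfraCluster",
)


@dataclass
class HostedCluster:
    name: str
    deleted: bool = False
    initial_nodes: int = 0


@dataclass
class FakeManagementCluster:
    """In-memory stand-in holding the objects of a single hosted cluster."""
    objects: list = field(default_factory=list)

    def ensure(self, kind: str, name: str) -> None:
        # Create-if-missing keeps repeated reconciles idempotent.
        if (kind, name) not in self.objects:
            self.objects.append((kind, name))


def reconcile_hosted_cluster(hc: HostedCluster, cluster: FakeManagementCluster) -> bool:
    """Return True when reconciliation is done, False to requeue."""
    if hc.deleted:
        # Teardown (assumed): remove the newest object, requeue until none remain.
        if cluster.objects:
            cluster.objects.pop()
        return not cluster.objects
    for kind in SYNC_ORDER:
        cluster.ensure(kind, hc.name)
    if hc.initial_nodes > 0:
        cluster.ensure("NodePool", hc.name)
    return True
```

Calling the function repeatedly with the same `HostedCluster` leaves the fake cluster unchanged, which is the property a real reconcile loop relies on.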

NodePool Controller

graph TD
  nodepool-controller[NodePool Controller] --> reconcile([Reconcile NodePool])
  reconcile --> is-deleted{{Deleted?}}
  is-deleted -->|Yes| teardown([Teardown])
  is-deleted -->|No| sync([Sync])
  sync --> create-capi-machineset([Create CAPIMachineSet])
  create-capi-machineset --> create-capi-infra-machine-template([Create CAPIInfrastructureMachineTemplate])
  create-capi-infra-machine-template --> return
  teardown --> teardown-complete{{Teardown complete?}}
  teardown-complete -->|Yes| return
  teardown-complete -->|No| reconcile
  return([End])
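
As a rough illustration of the NodePool flow above, the sketch below writes the two CAPI resources from a NodePool on every sync, so a later change to the NodePool (for example its replica count) is propagated. Everything here is an assumption made for illustration: the dict-based `store`, the function name, and the field names `replicas`, `machineType` and `templateRef` are hypothetical and do not reflect the real API schemas.

```python
def reconcile_node_pool(node_pool: dict, store: dict) -> bool:
    """Return True when done, False to requeue. `store` maps (kind, name) -> object."""
    name = node_pool["name"]
    keys = [
        ("CAPIMachineSet", name),
        ("CAPIInfrastructureMachineTemplate", name),
    ]
    if node_pool.get("deleted"):
        # Teardown (assumed): delete one derived object per pass, then requeue.
        for key in keys:
            if key in store:
                del store[key]
                return False
        return True  # nothing left to delete, teardown complete
    # Sync: write the desired state in the order shown in the flowchart.
    store[keys[0]] = {"replicas": node_pool["replicas"], "templateRef": name}
    store[keys[1]] = {"machineType": node_pool["machineType"]}
    return True
```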

ExternalInfraCluster Controller

graph TD
  external-infra-cluster-controller[ExternalInfraCluster Controller] --> reconcile([Reconcile ExternalInfraCluster])
  reconcile --> is-deleted{{Deleted?}}
  is-deleted -->|Yes| teardown([Teardown])
  is-deleted -->|No| sync([Sync])
  teardown --> teardown-complete{{Teardown complete?}}
  teardown-complete -->|Yes| return
  teardown-complete -->|No| reconcile
  sync --> get-hosted-control-plane([Get HostedControlPlane])
  get-hosted-control-plane --> is-hcp-ready{{Is HostedControlPlane ready?}}
  is-hcp-ready -->|No| reconcile
  is-hcp-ready -->|Yes| update-infra-status([Update ExternalInfraCluster status])
  update-infra-status --> return
  return([End])
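
The key behavior in this flow is the wait: the ExternalInfraCluster status is only updated once the HostedControlPlane reports ready, and until then the request is requeued. A minimal sketch of that gating follows, with plain dicts standing in for the resources. The `status.ready` field and the function name are hypothetical, and teardown is treated as a no-op because the diagram does not say what it involves.

```python
def reconcile_external_infra_cluster(infra: dict, hcp: dict) -> bool:
    """Return True when done, False to requeue."""
    if infra.get("deleted"):
        # Teardown is not detailed in the diagram; assumed to complete immediately.
        return True
    # Sync: look up the HostedControlPlane and check its readiness.
    if not hcp.get("status", {}).get("ready", False):
        return False  # not ready yet: requeue and check again later
    infra.setdefault("status", {})["ready"] = True
    return True
```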

Control Plane Operator

The Control Plane Operator is deployed by the HyperShift Operator into a hosted control plane namespace and manages the rollout of a single version of the hosted cluster's control plane.

The Control Plane Operator is versioned in lockstep with a specific OCP version and is decoupled from the management cluster's version.

The Control Plane Operator is responsible for:

  • Provisioning all the infrastructure required to host a control plane (whether this means creating or adopting existing infrastructure). This infrastructure may be management cluster resources, external cloud provider resources, etc.
  • Deploying an OCP control plane configured to run in the context of the provisioned infrastructure.
  • Implementing any versioned behavior necessary to roll out the new version (e.g. version-specific changes at layers above OCP itself, like configuration or infrastructure changes).

HostedControlPlane Controller

graph TD
  hosted-control-plane-controller[HostedControlPlane Controller] --> reconcile([Reconcile HostedControlPlane])
  reconcile --> is-deleted{{Deleted?}}
  is-deleted -->|Yes| teardown([Teardown])
  is-deleted -->|No| sync([Sync])
  teardown --> teardown-complete{{Teardown complete?}}
  teardown-complete -->|Yes| return
  teardown-complete -->|No| reconcile
  sync --> create-infra([Deploy Control Plane Components])
  create-infra --> create-config-operator([Deploy Hosted Cluster Config Operator])
  create-config-operator --> is-infra-ready{{Infra ready?}}
  is-infra-ready -->|Yes| update-hosted-controlplane-ready([Update HostedControlPlane status])
  is-infra-ready -->|No| reconcile
  update-hosted-controlplane-ready --> return
  return([End])
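
One way to picture this loop: the controller keeps declaring the deployments it wants, and only marks the HostedControlPlane ready once all of them report ready. The sketch below models that with an in-memory namespace. The `Namespace` class, the component names, the `status.ready` field and the single-pass teardown are illustrative assumptions, not the operator's real types or behavior.

```python
from dataclasses import dataclass, field

# Example component names only; the real set is larger and version specific.
CONTROL_PLANE_COMPONENTS = ("etcd", "apiserver", "controller-manager")
CONFIG_OPERATOR = "hosted-cluster-config-operator"


@dataclass
class Namespace:
    """Hosted control plane namespace: deployment name -> ready flag."""
    deployments: dict = field(default_factory=dict)


def reconcile_hosted_control_plane(hcp: dict, ns: Namespace) -> bool:
    """Return True when done, False to requeue."""
    if hcp.get("deleted"):
        ns.deployments.clear()  # teardown assumed to finish in one pass
        return True
    # Sync: deploy the components, then the Hosted Cluster Config Operator.
    for name in CONTROL_PLANE_COMPONENTS:
        ns.deployments.setdefault(name, False)
    ns.deployments.setdefault(CONFIG_OPERATOR, False)
    if not all(ns.deployments.values()):
        return False  # infra not ready yet: requeue
    hcp.setdefault("status", {})["ready"] = True
    return True
```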

Hosted Cluster Config Operator

The Hosted Cluster Config Operator is a control plane component maintained by HyperShift that's a peer to other control plane components (e.g., etcd, apiserver, controller-manager), and is managed by the Control Plane Operator in the same way as those other control plane components.

The Hosted Cluster Config Operator is versioned in lockstep with a specific OCP version and is decoupled from the management cluster's version.

The Hosted Cluster Config Operator is responsible for:

  • Reading CAs from the hosted cluster to configure the kube controller manager CA bundle running in the hosted control plane
  • Reconciling resources that live on the hosted cluster:
    • CRDs created by operators that are absent from the hosted cluster (RequestCount CRD created by cluster-kube-apiserver-operator)
    • Clearing any user changes to the ClusterVersion resource (all updates should be driven via HostedCluster API)
    • ClusterOperator stubs for control plane components that run outside the hosted cluster.
    • Global Configuration that is managed via the HostedCluster API
    • Namespaces that are normally created by operators that are absent from the cluster.
    • RBAC that is normally created by operators that are absent from the cluster.
    • Registry configuration
    • Default ingress controller
    • Control Plane PKI (kubelet serving CA, control plane signer CA)
    • Konnectivity Agent
    • OpenShift APIServer resources (APIServices, Service, Endpoints)
    • OpenShift OAuth APIServer resources (APIServices, Service, Endpoints)
    • Monitoring Configuration (set node selector to non-master nodes)
    • Pull Secret
    • OAuth serving cert CA
    • OAuthClients required by the console
    • Cloud Credential Secrets (contain STS role for components that need cloud access)
    • OLM CatalogSources
    • OLM PackageServer resources (APIService, Service, Endpoints)

Resource dependency diagram

  • Dotted lines are dependencies (ownerRefs)
  • Solid lines are associations (e.g. infrastructureRefs or controlPlaneRefs on specs)
classDiagram
  HostedCluster
  HostedControlPlane ..> CAPICluster
  ExternalInfraCluster ..> CAPICluster
  CAPICluster ..> HostedCluster
  CAPICluster --> HostedControlPlane
  CAPICluster --> ExternalInfraCluster
  CAPIMachineSet ..> CAPICluster
  CAPIMachineSet --> CAPIInfrastructureMachineTemplate
  CAPIMachine ..> CAPIMachineSet
  CAPIMachine --> CAPIInfrastructureMachine
  CAPIInfrastructureMachine ..> CAPIMachine
  CAPIInfrastructureMachineTemplate ..> CAPICluster
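
To make the ownership chain easier to reason about, the dotted edges can be written down as a child-to-owner map and walked transitively. The sketch below assumes each dotted arrow points from the dependent resource to its owner, which is one reading of the diagram. `OWNER_REFS` and `dependents_of` are hypothetical names, and the result only indicates which kinds would be affected by a cascading delete under that reading.

```python
# Dotted edges from the diagram, read as: dependent kind -> owner kind.
OWNER_REFS = {
    "HostedControlPlane": "CAPICluster",
    "ExternalInfraCluster": "CAPICluster",
    "CAPICluster": "HostedCluster",
    "CAPIMachineSet": "CAPICluster",
    "CAPIMachine": "CAPIMachineSet",
    "CAPIInfrastructureMachine": "CAPIMachine",
    "CAPIInfrastructureMachineTemplate": "CAPICluster",
}


def dependents_of(owner: str) -> set:
    """Return every kind that is directly or transitively owned by `owner`."""
    found = set()
    frontier = [owner]
    while frontier:
        current = frontier.pop()
        for child, parent in OWNER_REFS.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found
```

Under this reading, `dependents_of("HostedCluster")` returns every other kind in the diagram, since all ownership chains end at the HostedCluster.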

Transformations

The diagrams below show how certain important resources are derived from others. These are resources created by our operators, not by CAPI.

classDiagram
  CAPICluster ..> HostedControlPlane
  CAPICluster ..> ExternalInfraCluster
  HostedControlPlane ..> HostedCluster
  ExternalInfraCluster ..> HostedCluster

classDiagram
  CAPIInfrastructureTemplate ..> NodePool
  CAPIInfrastructureTemplate ..> HostedCluster
  CAPIMachineSet ..> NodePool
  CAPIMachineSet ..> HostedCluster
  CAPIMachineSet ..> CAPIInfrastructureTemplate
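
The two derivation diagrams together form a small dependency graph, so a topological sort gives an order in which every resource's inputs exist before it is derived. The sketch below uses Python's standard `graphlib` module (Python 3.9+) and assumes each arrow points from the derived resource to its source. `DERIVED_FROM` and `derivation_order` are hypothetical names, and the result is just one valid ordering, not necessarily the order in which the controllers create these resources.

```python
from graphlib import TopologicalSorter

# Edges from both diagrams, read as: derived kind -> kinds it is derived from.
DERIVED_FROM = {
    "CAPICluster": {"HostedControlPlane", "ExternalInfraCluster"},
    "HostedControlPlane": {"HostedCluster"},
    "ExternalInfraCluster": {"HostedCluster"},
    "CAPIInfrastructureTemplate": {"NodePool", "HostedCluster"},
    "CAPIMachineSet": {"NodePool", "HostedCluster", "CAPIInfrastructureTemplate"},
}


def derivation_order() -> list:
    """Return the kinds ordered so that sources come before what they produce."""
    return list(TopologicalSorter(DERIVED_FROM).static_order())


if __name__ == "__main__":
    for kind in derivation_order():
        print(kind)
```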