Cluster Providers
A ClusterProvider is one of the three provider types in the openMCP architecture, the other two being PlatformService and ServiceProvider. ClusterProviders are responsible for managing Kubernetes clusters and access to them, based on our cluster API.
This document describes the tasks of a ClusterProvider and the contract it needs to fulfill in order to work within the openMCP ecosystem.
Deploying a ClusterProvider
ClusterProviders are usually deployed via the provider deployment mechanism and need to stick to the corresponding contract.
Implementing a ClusterProvider
Provider Configuration
Most ClusterProviders will require some form of configuration. Since the provider deployment does not allow passing configuration to the binary via an argument directly, they need to read the configuration from a k8s resource. Depending on the provider, it might even support multiple configuration resources and/or reconcile them instead of just reading them statically.
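As an illustration, the sketch below lists configuration objects of a hypothetical ProviderConfig custom resource using controller-runtime's unstructured client. The API group, version, and kind are made up; a real provider would use its own typed API and may reconcile the resources instead of reading them once at startup.

package provider

import (
    "context"
    "fmt"

    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// loadProviderConfigs reads all configuration resources of a hypothetical
// ProviderConfig CRD. A real ClusterProvider would use its own typed API and
// might watch these resources instead of reading them only once at startup.
func loadProviderConfigs(ctx context.Context, c client.Client) ([]unstructured.Unstructured, error) {
    list := &unstructured.UnstructuredList{}
    list.SetGroupVersionKind(schema.GroupVersionKind{
        Group:   "myprovider.openmcp.cloud", // hypothetical API group
        Version: "v1alpha1",
        Kind:    "ProviderConfigList",
    })
    if err := c.List(ctx, list); err != nil {
        return nil, fmt.Errorf("failed to list provider configurations: %w", err)
    }
    return list.Items, nil
}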
Cluster Profiles
Out of the configuration(s), the ClusterProvider has to generate ClusterProfile resources. They serve as a form of service discovery and look like this:
apiVersion: clusters.openmcp.cloud/v1alpha1
kind: ClusterProfile
metadata:
  name: default.gardener.mcpd-gcp-large
spec:
  providerConfigRef:
    name: mcpd-gcp-large
  providerRef:
    name: gardener
  supportedVersions:
  - version: 1.33.3
  - deprecated: true
    version: 1.33.2
  - version: 1.32.7
  - deprecated: true
    version: 1.32.6
  - deprecated: true
    version: 1.32.5
  - deprecated: true
    version: 1.32.4
  - deprecated: true
    version: 1.32.3
  - deprecated: true
    version: 1.32.2
- spec.providerRef is the name of the ClusterProvider that created this ClusterProfile. It should be filled with the value that the provider received via its --provider-name argument.
- spec.providerConfigRef is the name of the provider configuration that is responsible for this profile. Whether this refers to an actual k8s resource, an internal value, or just a static string depends on the provider implementation. It is used as a label value, though, and therefore has to match the corresponding regex.
- spec.supportedVersions is a list of Kubernetes versions that are supported by this provider for this profile.
The name of the ClusterProfile can be freely chosen. In this example, it follows the format X.Y.Z, where X is the environment name, Y is the name of the ClusterProvider, and Z is the name of the provider configuration that created this profile. A naming scheme like this avoids potential conflicts between multiple ClusterProviders (or multiple instances of the same ClusterProvider).
ClusterProfile resources are cluster-scoped and do not have a status.
Note that each ClusterProvider must generate at least one ClusterProfile in order to be usable.
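A minimal sketch of generating such a profile from one configuration could look like the following. It uses controller-runtime's CreateOrUpdate helper; the Go field and type names on the ClusterProfile spec (ProviderRef, ProviderConfigRef, SupportedVersions, SupportedK8sVersion) are assumptions derived from the YAML above, not verified API names.

package provider

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// ensureClusterProfile creates or updates the ClusterProfile belonging to one
// provider configuration. Spec field names are assumptions based on the YAML
// example; check the clustersv1alpha1 package for the actual definitions.
func ensureClusterProfile(ctx context.Context, c client.Client, environment, providerName, configName string, versions []clustersv1alpha1.SupportedK8sVersion) error {
    profile := &clustersv1alpha1.ClusterProfile{
        ObjectMeta: metav1.ObjectMeta{
            // <environment>.<provider>.<config> avoids conflicts between providers
            Name: fmt.Sprintf("%s.%s.%s", environment, providerName, configName),
        },
    }
    _, err := controllerutil.CreateOrUpdate(ctx, c, profile, func() error {
        profile.Spec.ProviderRef.Name = providerName
        profile.Spec.ProviderConfigRef.Name = configName
        profile.Spec.SupportedVersions = versions
        return nil
    })
    return err
}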
Cluster Management
The main purpose of ClusterProviders is the management of k8s clusters. Each ClusterProvider therefore needs a controller that reconciles the Cluster resource, which looks like this:
apiVersion: clusters.openmcp.cloud/v1alpha1
kind: Cluster
metadata:
  annotations:
    clusters.openmcp.cloud/providerinfo: foobar
  labels:
    clusters.openmcp.cloud/k8sversion: 1.31.11
    clusters.openmcp.cloud/provider: gardener
  name: my-cluster
  namespace: my-namespace
spec:
  kubernetes:
    version: 1.32.8
  profile: default.myprovider.myprofile
  purposes:
  - my-purpose
  tenancy: Shared
Some information about the different fields:
- The clusters.openmcp.cloud/k8sversion and clusters.openmcp.cloud/provider labels are not set by default. The cluster provider can populate them to allow for easier filtering or better column information in kubectl get.
  - Note that spec.kubernetes.version contains a desired k8s version, which does not have to match the actual k8s version that is displayed in the label.
- The clusters.openmcp.cloud/providerinfo annotation can be used to hold additional provider-specific information. It is displayed as a column on kubectl get -o wide.
- spec.kubernetes.version can contain a desired k8s version. If not set, the provider has to derive it from its configuration. The provider can decide to either throw an error or choose a version if an invalid/unsupported version is specified.
- spec.profile is the most important field for a ClusterProvider. It references the ClusterProfile that should be used for this cluster.
  - The referenced profile contains a reference to the ClusterProvider it belongs to. Since multiple ClusterProviders can run in parallel, this allows a ClusterProvider to determine whether it is responsible for this cluster resource or not.
    - ClusterProviders must only ever act on Cluster resources that reference profiles belonging to themselves!
  - The profile is immutable.
  - This can also contain further configuration, e.g. for the Gardener ClusterProvider, each provider configuration (which is referenced in the profile) can specify a different Gardener landscape and/or project to use.
- spec.purposes and spec.tenancy are mostly relevant for the scheduler and usually don't need to be evaluated by the ClusterProvider.
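The following sketch shows how a provider could populate the informational labels and annotation described in the list above. The label and annotation keys are copied from the example manifest; whether clustersv1alpha1 exposes constants for them is not checked here.

package provider

import (
    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// decorateCluster sets the purely informational labels and annotation on a
// Cluster resource. The keys are taken from the example manifest above.
func decorateCluster(cluster *clustersv1alpha1.Cluster, providerName, actualK8sVersion, providerInfo string) {
    if cluster.Labels == nil {
        cluster.Labels = map[string]string{}
    }
    cluster.Labels["clusters.openmcp.cloud/provider"] = providerName
    // the actual version, which may differ from the desired spec.kubernetes.version
    cluster.Labels["clusters.openmcp.cloud/k8sversion"] = actualK8sVersion
    if cluster.Annotations == nil {
        cluster.Annotations = map[string]string{}
    }
    cluster.Annotations["clusters.openmcp.cloud/providerinfo"] = providerInfo
}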
Reconciliation Logic
Before doing anything in a reconciliation, the ClusterProvider needs to check whether it is responsible for the Cluster resource or not. For this, it has to check whether it itself created the ClusterProfile that is referenced in spec.profile or whether it was created by a different ClusterProvider. It can either keep track of created ClusterProfile resources internally or compare spec.providerRef.name in the profile to its own name (passed in via the --provider-name argument). If the name differs, another ClusterProvider is responsible for this resource and the ClusterProvider must not touch it.
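A sketch of this responsibility check, assuming the spec field names from the examples above (Spec.Profile on the Cluster, Spec.ProviderRef.Name on the ClusterProfile), could look like this:

package provider

import (
    "context"
    "fmt"

    "sigs.k8s.io/controller-runtime/pkg/client"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// isResponsible returns true if the ClusterProfile referenced by the Cluster
// was created by this provider, i.e. its providerRef matches the name passed
// in via --provider-name.
func isResponsible(ctx context.Context, c client.Client, cluster *clustersv1alpha1.Cluster, providerName string) (bool, error) {
    profile := &clustersv1alpha1.ClusterProfile{}
    // ClusterProfiles are cluster-scoped, so only the name is needed
    if err := c.Get(ctx, client.ObjectKey{Name: cluster.Spec.Profile}, profile); err != nil {
        return false, fmt.Errorf("unable to fetch ClusterProfile %q: %w", cluster.Spec.Profile, err)
    }
    return profile.Spec.ProviderRef.Name == providerName, nil
}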
The rest of the reconciliation logic is largely provider-specific: if the Cluster resource has a deletion timestamp, delete the k8s cluster and everything that belongs to it, then remove the finalizer. Otherwise, ensure that there is a finalizer on the Cluster resource and create/update the actual k8s cluster.
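Put together, a reconciler skeleton could look roughly like the sketch below. The finalizer name and the ensureExternalCluster/deleteExternalCluster helpers are placeholders for the provider-specific parts, and isResponsible is the helper sketched above.

package provider

import (
    "context"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// hypothetical finalizer name, chosen by the provider
const clusterFinalizer = "myprovider.openmcp.cloud/finalizer"

type ClusterReconciler struct {
    Client       client.Client
    ProviderName string // value of the --provider-name argument
}

func (r *ClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    cluster := &clustersv1alpha1.Cluster{}
    if err := r.Client.Get(ctx, req.NamespacedName, cluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // ignore Cluster resources that belong to a different ClusterProvider
    responsible, err := isResponsible(ctx, r.Client, cluster, r.ProviderName)
    if err != nil || !responsible {
        return ctrl.Result{}, err
    }

    if !cluster.DeletionTimestamp.IsZero() {
        // delete the actual k8s cluster and everything belonging to it ...
        if err := r.deleteExternalCluster(ctx, cluster); err != nil {
            return ctrl.Result{}, err
        }
        // ... then release the Cluster resource
        controllerutil.RemoveFinalizer(cluster, clusterFinalizer)
        return ctrl.Result{}, r.Client.Update(ctx, cluster)
    }

    // ensure the finalizer is set before creating anything
    if controllerutil.AddFinalizer(cluster, clusterFinalizer) {
        if err := r.Client.Update(ctx, cluster); err != nil {
            return ctrl.Result{}, err
        }
    }

    // provider-specific: create or update the actual k8s cluster
    return ctrl.Result{}, r.ensureExternalCluster(ctx, cluster)
}

// placeholders for the provider-specific cluster operations
func (r *ClusterReconciler) ensureExternalCluster(ctx context.Context, c *clustersv1alpha1.Cluster) error { return nil }
func (r *ClusterReconciler) deleteExternalCluster(ctx context.Context, c *clustersv1alpha1.Cluster) error { return nil }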
Status Reporting
Since creating, updating, or deleting k8s clusters can easily take several minutes, reporting the current status is very important here. It is recommended to make good use of the conditions that are part of the status. ClusterProviders must adhere to the general status reporting rules.
In addition to the common status, the Cluster status contains a few more fields that can be set by the ClusterProvider:
- apiServer should be filled with the k8s cluster's apiserver endpoint, as soon as it is known.
- providerStatus can hold arbitrary data and is meant for provider-specific information. Using it is optional and no other controller will evaluate the contents of this field.
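For example, the apiserver endpoint could be published with a status patch along these lines. The Status.APIServer field name is an assumption based on the description above, and condition handling is omitted:

package provider

import (
    "context"

    "sigs.k8s.io/controller-runtime/pkg/client"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// reportAPIServer publishes the apiserver endpoint in the Cluster status once
// it is known. The APIServer field name is assumed from the prose above;
// conditions should be maintained alongside it, following the general status
// reporting rules.
func reportAPIServer(ctx context.Context, c client.Client, cluster *clustersv1alpha1.Cluster, endpoint string) error {
    old := cluster.DeepCopy()
    cluster.Status.APIServer = endpoint
    return c.Status().Patch(ctx, cluster, client.MergeFrom(old))
}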
Note that any kind of kubeconfig should not be part of the cluster's status - access to the cluster is managed via AccessRequest resources.
Access Management
ClusterProviders are not only responsible for creating and deleting k8s clusters, but also for managing access to their clusters. Controllers and human users can request access to a cluster by creating an AccessRequest resource, which looks like this:
apiVersion: clusters.openmcp.cloud/v1alpha1
kind: AccessRequest
metadata:
  name: my-access
  namespace: my-namespace
  labels:
    # ClusterProviders must only act on AccessRequests where these two labels are set
    # and the value of the first one matches their own provider name.
    clusters.openmcp.cloud/provider: myprovider
    clusters.openmcp.cloud/profile: default.myprovider.myprofile
spec:
  clusterRef: # optional, takes precedence over requestRef if set
    name: my-cluster
    namespace: foo
  requestRef: # optional, at least one of clusterRef and requestRef must be set
    name: my-request
    namespace: bar
  token: # either token or oidc
    permissions:
    - name: foo # optional, not required usually
      namespace: test # optional, results in Role if set and in ClusterRole otherwise
      rules:
      - apiGroups:
        - "*"
        resources:
        - "*"
        verbs:
        - "*"
    roleRefs:
    - kind: ClusterRole
      name: my-clusterrole
  oidc: # either token or oidc
    name: my-oidc-provider
    issuer: https://oidc.example.com
    clientID: my-client-id
    usernameClaim: sub # optional
    usernamePrefix: "my-user:"
    groupsClaim: group # optional
    groupsPrefix: "my-group:"
    extraScopes:
    - foo
    roleBindings:
    - subjects:
      - kind: User
        name: foo
      - kind: Group
        name: bar
      roleRefs:
      - kind: ClusterRole
        name: my-cluster-role
      - kind: Role
        name: my-role
        namespace: default
    roles:
    - name: my-admin
      rules:
      - apiGroups:
        - "*"
        resources:
        - "*"
        verbs:
        - "*"
Note that, while the example shows both, an AccessRequest must have exactly one of spec.token and spec.oidc set, not both.
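A provider's AccessRequest reconciler can therefore dispatch on exactly one of the two fields, roughly as sketched below. The Go field names (Spec.Token, Spec.OIDC) are assumptions based on the YAML example, and the two helpers are placeholders for the provider-specific implementations described in the following sections.

package provider

import (
    "context"
    "fmt"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// handleAccess dispatches to the requested access type; exactly one of
// spec.token and spec.oidc is expected to be set.
func handleAccess(ctx context.Context, ar *clustersv1alpha1.AccessRequest) error {
    switch {
    case ar.Spec.Token != nil:
        return grantTokenAccess(ctx, ar)
    case ar.Spec.OIDC != nil:
        // may be unsupported by this provider and result in a denied request
        return grantOIDCAccess(ctx, ar)
    default:
        return fmt.Errorf("AccessRequest %s/%s requests neither token nor oidc access", ar.Namespace, ar.Name)
    }
}

// placeholders for the provider-specific implementations described below
func grantTokenAccess(ctx context.Context, ar *clustersv1alpha1.AccessRequest) error { return nil }
func grantOIDCAccess(ctx context.Context, ar *clustersv1alpha1.AccessRequest) error  { return nil }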
Token-based Access
If spec.token is set, token-based access is requested. The ClusterProvider is expected to create a ServiceAccount, create Role (if namespace is not empty) and ClusterRole (if namespace is empty) resources for each entry in spec.token.permissions, and create RoleBinding and ClusterRoleBinding resources for each entry in spec.token.permissions and each entry in spec.token.roleRefs.
Since token-based access is based on standard RBAC and TokenRequest APIs, it should work on any k8s cluster and is expected to be supported by every ClusterProvider.
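For the token itself, a provider could use the TokenRequest API on the target cluster, for example along these lines. The ServiceAccount name, namespace, and expiration are illustrative only, and the Role/Binding creation described above is omitted:

package provider

import (
    "context"

    authenticationv1 "k8s.io/api/authentication/v1"
    corev1 "k8s.io/api/core/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/utils/ptr"
)

// requestServiceAccountToken ensures a ServiceAccount exists on the target
// cluster and issues a short-lived token for it via the TokenRequest API.
func requestServiceAccountToken(ctx context.Context, cs kubernetes.Interface, namespace, name string) (string, error) {
    sa := &corev1.ServiceAccount{ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}}
    if _, err := cs.CoreV1().ServiceAccounts(namespace).Create(ctx, sa, metav1.CreateOptions{}); err != nil && !apierrors.IsAlreadyExists(err) {
        return "", err
    }
    tr := &authenticationv1.TokenRequest{
        Spec: authenticationv1.TokenRequestSpec{
            ExpirationSeconds: ptr.To(int64(3600)), // 1 hour, for illustration only
        },
    }
    resp, err := cs.CoreV1().ServiceAccounts(namespace).CreateToken(ctx, name, tr, metav1.CreateOptions{})
    if err != nil {
        return "", err
    }
    return resp.Status.Token, nil
}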
OIDC-based Access
If spec.oidc is set, OIDC-based access is requested. Most fields within spec.oidc are required for setting up the trust relationship.
extraScopes is meant to be used with the oidc-login kubectl plugin that handles OIDC authentication.
roleBindings specifies (Cluster)RoleBindings that should be created, while roles can be used to construct additional (Cluster)Roles.
Note that not every ClusterProvider necessarily supports OIDC-based access, and requesting it could result in an error or a denied request.
The spec.oidc field contains a nested struct named OIDCProviderConfig that has a Default() method. Whenever reading data from this field, it is strongly recommended to run the Default() method first, because it takes care of setting some defaults, such as appending a ':' suffix to the username and groups prefixes if it is missing.
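In Go, that could look like the following; Spec.OIDC and UsernamePrefix are assumed field names based on the example above, while Default() is the method mentioned in the note:

package provider

import (
    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// prefixedUsername illustrates why Default() should be called before the OIDC
// configuration is evaluated: afterwards, the username prefix reliably ends
// with ':' and can be concatenated directly.
func prefixedUsername(ar *clustersv1alpha1.AccessRequest, subject string) string {
    oidcCfg := ar.Spec.OIDC
    oidcCfg.Default()
    return oidcCfg.UsernamePrefix + subject
}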
The Preparation of AccessRequests
From a 'raw' AccessRequest, it is not immediately obvious which ClusterProvider is responsible: if spec.clusterRef is not set, first the ClusterRequest that is referenced in spec.requestRef needs to be fetched. From there, the Cluster needs to be fetched, which again leads to the ClusterProfile, and only then does the provider know whether it is responsible or not.
To avoid having to implement this flow in every ClusterProvider and having all ClusterProviders execute it whenever any AccessRequest changes, there is a 'generic' AccessRequest controller that takes over this task. This generic controller reacts only to AccessRequest resources that do not have both the clusters.openmcp.cloud/provider and the clusters.openmcp.cloud/profile labels.
It modifies the AccessRequest in the following way:
- It adds the clusters.openmcp.cloud/provider label with the provider name (extracted from the ClusterProfile) as value.
- It adds the clusters.openmcp.cloud/profile label with the ClusterProfile name as value.
- If spec.clusterRef is empty, it resolves the ClusterRequest reference and fills spec.clusterRef with the information from the ClusterRequest's status.
This means that the AccessRequest controller in a ClusterProvider must only act on AccessRequests that have both of the aforementioned labels set. It can then expect spec.clusterRef to be set and does not need to check spec.requestRef.
It is recommended to use event filtering to avoid reconciling AccessRequests that belong to another provider or have not yet been prepared by the generic controller. The controller-utils library contains a HasLabelPredicate filter that can be used both for verifying that a label exists and for checking whether it has a specific value:
import (
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/predicate"

    ctrlutils "github.com/openmcp-project/controller-utils/pkg/controller"
    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// SetupWithManager sets up the controller with the Manager.
func (r *AccessRequestReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&clustersv1alpha1.AccessRequest{}).
        WithEventFilter(predicate.And(
            // this checks whether the provider label exists and has the correct value
            // 'providerName' holds the value that was passed into the ClusterProvider via the '--provider-name' argument
            ctrlutils.HasLabelPredicate(clustersv1alpha1.ProviderLabel, providerName),
            // this just checks whether the label exists, independent from its value
            ctrlutils.HasLabelPredicate(clustersv1alpha1.ProfileLabel, ""),
            // <potentially more event filters>
        )).
        Complete(r)
}
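Once an AccessRequest has passed these filters, the reconciler can resolve the target cluster directly from spec.clusterRef, for example like this. The spec field names are again assumptions based on the YAML example, and the reconciler is assumed to carry a controller-runtime client:

package provider

import (
    "context"
    "fmt"

    "sigs.k8s.io/controller-runtime/pkg/client"

    clustersv1alpha1 "github.com/openmcp-project/openmcp-operator/api/clusters/v1alpha1"
)

// AccessRequestReconciler is the reconciler whose setup is shown above; it is
// assumed to hold a controller-runtime client.
type AccessRequestReconciler struct {
    Client client.Client
}

// resolveCluster fetches the Cluster referenced by spec.clusterRef, which the
// generic controller guarantees to be filled for prepared AccessRequests.
func (r *AccessRequestReconciler) resolveCluster(ctx context.Context, ar *clustersv1alpha1.AccessRequest) (*clustersv1alpha1.Cluster, error) {
    cluster := &clustersv1alpha1.Cluster{}
    key := client.ObjectKey{Name: ar.Spec.ClusterRef.Name, Namespace: ar.Spec.ClusterRef.Namespace}
    if err := r.Client.Get(ctx, key, cluster); err != nil {
        return nil, fmt.Errorf("unable to resolve cluster reference of AccessRequest %s/%s: %w", ar.Namespace, ar.Name, err)
    }
    return cluster, nil
}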