controller-runtime (used by the operator) uses a cached client.
This causes issues when quick, successive reconciliations of a resource happen and the later iterations do not get an up-to-date view of an updated/patched object. For example:
- the controller calls an API (either an external one or the kube-apiserver's) to create a resource
- the request succeeds, the object's status is patched with the ID, and the controller returns from the reconciliation loop (the update/patch causes a requeue)
- another reconciliation happens
- the controller (using controller-runtime's client) gets the object, sees that there is no ID in its status, and attempts to call create again
- this can result in
  - an error if the call specifies unique criteria (like an ID, or a name which has to be unique)
  - or duplicated resources
Up until now, this has been handled by calling one of the Reduce*() helper functions, which are aware of the objects' schema and their priorities, so that surplus objects can be removed without consequences.
With external API integrations (e.g. the Konnect API in #370) this does not work, because the resource is created in the external API and, more importantly, the repeated create API calls will result in HTTP 409 Conflicts.
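The sequence above can be reproduced in a small, self-contained simulation. This is only a sketch under assumptions: `cluster`, `externalAPI`, `reconcile` and `errConflict` are hypothetical names, and the cache is modelled as a copy that catches up only when `sync` is called. None of it is controller-runtime or Konnect SDK API.

```go
package main

import "errors"

// errConflict stands in for an HTTP 409 returned by an external API (assumption).
var errConflict = errors.New("409 conflict")

// object is a hypothetical custom resource; StatusID stays empty until
// the external resource has been created.
type object struct {
	Name     string
	StatusID string
}

// externalAPI is a hypothetical external service that enforces unique names.
type externalAPI struct{ byName map[string]string }

func (a *externalAPI) create(name string) (string, error) {
	if _, ok := a.byName[name]; ok {
		return "", errConflict
	}
	id := "id-" + name
	a.byName[name] = id
	return id, nil
}

// cluster models the API server state (live) and a cache of it (cached)
// that only catches up when sync is called.
type cluster struct{ live, cached object }

func (c *cluster) sync() { c.cached = c.live }

// reconcile reads from the cache, creates the external resource when the
// status carries no ID, and patches the live object with the new ID.
func reconcile(c *cluster, api *externalAPI) error {
	obj := c.cached // possibly stale read
	if obj.StatusID != "" {
		return nil // already created, nothing to do
	}
	id, err := api.create(obj.Name)
	if err != nil {
		return err
	}
	c.live.StatusID = id // the status patch goes to the API server, not the cache
	return nil
}
```

Calling `reconcile` twice without a `sync` in between makes the second call return `errConflict`, because the cached copy still has an empty status. After `sync`, the next call is a no-op.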
pmalek changed the title from "Quick subsequent reconciliations cause errors" to "Quick subsequent reconciliations cause errors caused by using cached client" on Oct 11, 2024
A possible way forward (at least for resources that are created against an external API like Konnect) is to implement a fallback lookup when a 409 Conflict is returned. We could add filtering based on the Kubernetes UID so that we're not "adopting" (separate issue #460) resources mapped from elsewhere. This could look like this (example for KonnectGatewayControlPlane):
    var sdkConflictError *sdkkonnecterrs.ConflictError
    if errors.As(err, &sdkConflictError) {
        // The create call returned 409 Conflict: look up the existing control
        // plane by name and Kubernetes UID label instead of failing.
        reqList := operations.ListControlPlanesRequest{
            FilterNameEq: lo.ToPtr(cp.Spec.Name),
            Labels:       lo.ToPtr(KubernetesUIDLabelKey + ":" + string(cp.GetUID())),
        }
        if cp.Spec.ClusterType != nil {
            reqList.FilterClusterTypeEq = lo.ToPtr(string(*cp.Spec.ClusterType))
        }
        respList, err := sdk.ListControlPlanes(ctx, reqList)
        if err != nil {
            return err
        }
        for _, listCP := range respList.ListControlPlanesResponse.Data {
            // req is presumably the create request from the enclosing function (not shown).
            if listCP.Name != nil && *listCP.Name == req.Name {
                // Reuse the ID of the control plane created by an earlier reconciliation.
                cp.Status.SetKonnectID(*listCP.ID)
                break
            }
        }
    }
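To show the idea independently of the Konnect SDK, here is a minimal, self-contained sketch of the same create-then-lookup-on-conflict flow. Everything in it is an assumption made for illustration: `fakeAPI`, `remoteCP`, `ensure`, `errConflict` and the `k8s-uid` label key are hypothetical and do not correspond to real SDK types.

```go
package main

import "errors"

// errConflict stands in for the SDK's 409 Conflict error (assumption).
var errConflict = errors.New("409 conflict")

// uidLabel is a hypothetical label key carrying the Kubernetes object UID.
const uidLabel = "k8s-uid"

// remoteCP is a hypothetical representation of a resource in the external API.
type remoteCP struct {
	ID, Name string
	Labels   map[string]string
}

// fakeAPI is a hypothetical in-memory external API that enforces unique names.
type fakeAPI struct{ items []remoteCP }

func (a *fakeAPI) create(name, uid string) (string, error) {
	for _, it := range a.items {
		if it.Name == name {
			return "", errConflict
		}
	}
	cp := remoteCP{ID: "id-" + name, Name: name, Labels: map[string]string{uidLabel: uid}}
	a.items = append(a.items, cp)
	return cp.ID, nil
}

// list returns the entries matching both the name and the UID label.
func (a *fakeAPI) list(name, uid string) []remoteCP {
	var out []remoteCP
	for _, it := range a.items {
		if it.Name == name && it.Labels[uidLabel] == uid {
			out = append(out, it)
		}
	}
	return out
}

// ensure creates the resource and, on conflict, falls back to a lookup
// restricted to resources labelled with the same Kubernetes UID.
func ensure(api *fakeAPI, name, uid string) (string, error) {
	id, err := api.create(name, uid)
	if err == nil {
		return id, nil
	}
	if !errors.Is(err, errConflict) {
		return "", err
	}
	if matches := api.list(name, uid); len(matches) > 0 {
		return matches[0].ID, nil // our own earlier create; reuse its ID
	}
	return "", err // conflict with a resource we do not own
}
```

Calling `ensure(api, "cp", "uid-1")` twice returns the same ID both times. A call with the same name but a different UID still returns the conflict, which is what keeps the fallback from adopting resources created elsewhere.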
When working with external APIs (like Konnect, hence this might be related to ☔ Konnect managed entities (post GA) #827), we can perform a lookup on conflicts (as proposed in the comment above and as implemented in 1.4.0).
Problem statement
Exemplar logs
Proposed solution
Make use of expectations, as done by ControllerExpectations in https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/controller_utils.go
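A rough sketch of what an expectations tracker could look like for this operator is below. It is loosely modelled on the idea behind the Kubernetes helper, not a copy of it: the `expectations` type, its method names, and the TTL handling are assumptions made for illustration. The intent would be that a reconciler records an expectation after a successful create, the watch event handler marks it observed once the patched object shows up, and reconciliation is skipped while the expectation is still pending.

```go
package main

import (
	"sync"
	"time"
)

// expectation records how many creations for a key have not yet been
// observed through the cache, and when the expectation was set.
type expectation struct {
	pending int
	setAt   time.Time
}

// expectations is a hypothetical, minimal tracker keyed by object
// (for example "namespace/name").
type expectations struct {
	mu    sync.Mutex
	ttl   time.Duration // after this long, a pending expectation is ignored
	items map[string]*expectation
}

func newExpectations(ttl time.Duration) *expectations {
	return &expectations{ttl: ttl, items: map[string]*expectation{}}
}

// expectCreations is called right after issuing n create calls for key.
func (e *expectations) expectCreations(key string, n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.items[key] = &expectation{pending: n, setAt: time.Now()}
}

// creationObserved is called from the watch event handler once the cache
// has seen the result of a create (e.g. the status now carries an ID).
func (e *expectations) creationObserved(key string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if exp, ok := e.items[key]; ok && exp.pending > 0 {
		exp.pending--
	}
}

// satisfied reports whether it is safe to reconcile key: nothing is
// pending, everything pending was observed, or the expectation expired.
func (e *expectations) satisfied(key string) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	exp, ok := e.items[key]
	if !ok {
		return true
	}
	if exp.pending <= 0 || time.Since(exp.setAt) >= e.ttl {
		delete(e.items, key)
		return true
	}
	return false
}
```

The expiry is there so that a missed event cannot block a resource forever; the right TTL value would need to be decided as part of this issue.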
Related material
https://www.youtube.com/watch?v=wMqzAOp15wo&t=748s
Acceptance criteria