[WIP] OCPBUGS-65621: add dedicated service account to crb, cvo and version pod #1266
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: ehearne-redhat. The full list of commands accepted by this bot can be found here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing … |
|
/retest |
|
/test e2e-aws-ovn-upgrade |
|
@ehearne-redhat: The specified target(s) for … The following commands are available to trigger optional jobs: … Use … Details: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/test e2e-aws-ovn-techpreview |
|
/retest |
roleRef:
  kind: ClusterRole
  name: system:openshift:scc:privileged
  apiGroup: rbac.authorization.k8s.io
I'm unsure whether this is needed or not. I initially added it because I was looking at the wrong test, e2e-upgrade-into-change, and did not see the update-payload-dedicated-sa service account on the version- pod in the must-gather files. However, I did see it in e2e-upgrade-out-of-change.
I'm thinking it might be necessary, as the updatepayload pod carries the openshift.io/required-scc: privileged annotation. So, I'll leave it for the reviewer to decide. I can test without it too, if that suits.
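For context, the shape being discussed is roughly the following sketch. The ServiceAccount name update-payload-dedicated-sa and the ClusterRole system:openshift:scc:privileged come from this thread; the binding name and exact layout are illustrative assumptions, not necessarily what the PR ships:

```yaml
# Sketch only: a dedicated ServiceAccount for the update payload pod, bound to
# the pre-existing system:openshift:scc:privileged ClusterRole so the pod can
# satisfy its openshift.io/required-scc: privileged annotation.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-payload-dedicated-sa
  namespace: openshift-cluster-version
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: update-payload-dedicated-sa-privileged  # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
- kind: ServiceAccount
  name: update-payload-dedicated-sa
  namespace: openshift-cluster-version
```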
|
/retest |
|
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker. Details: In response to this: |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: … Details: In response to this: |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: … Details: In response to this: |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@@ -8,7 +8,21 @@ metadata:
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
huh, did we not need this before for some reason? Maybe we'd been relying on there only being one group that contained a ClusterRole kind? 😅 If you have any context on how this worked without this line, having that context in a commit message for future devs would be nice. Seems like there should be something in-cluster alerting on (Cluster)RoleBindings that leave off such an important property.
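For reference, a fully specified binding looks like the sketch below (the subject is an assumption based on the cvo-dedicated-sa name that appears in the logs later in this thread). One detail worth noting: roleRef is immutable after creation, so a binding created with a wrong roleRef has to be deleted and recreated rather than patched.

```yaml
# Sketch: a ClusterRoleBinding with all three roleRef fields spelled out.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator  # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io  # the property discussed above
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: cvo-dedicated-sa
  namespace: openshift-cluster-version
```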
I don't have much context on how it worked without this line. I added it to see if it would resolve test errors I got from cluster-version-operator pods. It seemed like an important property, so I added it back. I can also test without this line.
namespace/openshift-cluster-version node/ci-op-yy1vhb3d-0bd36-c4ngb-master-2 pod/cluster-version-operator-5b954c7498-xprpt uid/51a23f9d-39b8-4a29-bebd-30fb7de19b57 container/cluster-version-operator restarted 30 times at:
non-zero exit at 2025-12-01 15:56:52.358476306 +0000 UTC m=+612.085122721: cause/Error code/255 reason/ContainerExit
E1201 15:56:22.959953 1 reflector.go:205] "Failed to watch" err="failed to list *v1.FeatureGate: featuregates.config.openshift.io is forbidden: User \"system:serviceaccount:openshift-cluster-version:cvo-dedicated-sa\" cannot list resource \"featuregates\" in API group \"config.openshift.io\" at the cluster scope" logger="UnhandledError" reflector="github.com/openshift/client-go/config/informers/externalversions/factory.go:125" type="*v1.FeatureGate"
I keep getting this error, for example, in e2e-agnostic-ovn-upgrade-into-change and e2e-agnostic-ovn-upgrade-out-of-change.
namespace: openshift-cluster-version
annotations:
  kubernetes.io/description: Dedicated Service Account for the Cluster Version Operator.
  include.release.openshift.io/self-managed-high-availability: "true"
I haven't looked into HyperShift, but I expect we'll need a ServiceAccount in the hosted-Kube-API there too?
I can look further into this and get back to you. :)
They appear to be using their own service account and setup --> https://github.com/openshift/hypershift/blob/main/control-plane-operator/controllers/hostedcontrolplane/v2/assets/cluster-version-operator/deployment.yaml
If I can't get my changes to work it might not be a bad idea to replicate their own setup. Thanks for pointing me in the right direction! :)
|
/test e2e-hypershift |
|
Will address failing tests tomorrow. |
|
/retest |
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cvo-required-config-reader
rules:
- apiGroups: ["config.openshift.io"]
  resources: ["featuregates", "clusteroperators", "clusterversions", "proxies", "infrastructures"]
  verbs: ["get", "list", "watch"]
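For completeness, the companion binding would plausibly look like the sketch below. The binding name cvo-required-config-binding and the subject cvo-dedicated-sa are taken from comments and logs elsewhere in this thread; the exact manifest in the PR may differ.

```yaml
# Sketch: bind the dedicated CVO ServiceAccount to the reader role defined above.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cvo-required-config-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cvo-required-config-reader
subjects:
- kind: ServiceAccount
  name: cvo-dedicated-sa
  namespace: openshift-cluster-version
```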
The core issue of one failing upgrade job may be a missing cluster profile annotation.
The upgrade-out-of-change jobs are failing on the following error message:
"Failed to watch" err="failed to list *v1.FeatureGate: featuregates.config.openshift.io is forbidden: User \"system:serviceaccount:openshift-cluster-version:cluster-version-operator\" cannot list resource \"featuregates\" in API group \"config.openshift.io\" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io \"cvo-required-config-reader\" not found" logger="UnhandledError" reflector="github.com/openshift/client-go/config/informers/externalversions/factory.go:125" type="*v1.FeatureGate"
The PR's ClusterRoleBinding cvo-required-config-binding is applied because it carries the cluster profile annotation. However, the ClusterRole referenced by the binding's roleRef is not applied because it does not have a cluster profile annotation. This could result in the RBAC: clusterrole.rbac.authorization.k8s.io \"cvo-required-config-reader\" not found error. The kube-apiserver allows creating a RoleBinding that references a non-existent Role.
The upgrade-into-change jobs are failing on the following error message:
"Failed to watch" err="failed to list *v1.FeatureGate: featuregates.config.openshift.io is forbidden: User \"system:serviceaccount:openshift-cluster-version:default\" cannot list resource \"featuregates\" in API group \"config.openshift.io\" at the cluster scope" logger="UnhandledError" reflector="github.com/openshift/client-go/config/informers/externalversions/factory.go:125" type="*v1.FeatureGate"
This one is funny. I have some theories, but I'll need to have another look tomorrow.
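Following the diagnosis above, the fix for the out-of-change failure would be to give the ClusterRole the same cluster-profile annotation as its binding so both manifests are included in the payload together. This is a sketch under that assumption, reusing the annotation already used on the ServiceAccount in this PR; the exact annotation set should mirror the binding's:

```yaml
# Sketch: add the cluster-profile annotation so the ClusterRole is applied
# alongside the ClusterRoleBinding that references it.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cvo-required-config-reader
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
rules:
- apiGroups: ["config.openshift.io"]
  resources: ["featuregates", "clusteroperators", "clusterversions", "proxies", "infrastructures"]
  verbs: ["get", "list", "watch"]
```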
@DavidHurta thank you so much for your insight! I'll try your suggestion and see if it helps move the out-of change job :)
|
/retest |
|
/retest |
1 similar comment
|
/retest |
|
/retest |
5 similar comments
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
@ehearne-redhat: The following tests failed; say /retest to rerun all failed tests: |
Full PR test history. Your PR dashboard. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What

… openshift-cluster-version namespace.

Why