From 5b62cc43984adf2396c25a4a81a93fc9dc25321d Mon Sep 17 00:00:00 2001 From: Tongtong Zhou Date: Thu, 12 Feb 2026 12:53:28 +0800 Subject: [PATCH] HYPERFLEET-485 - test: add test cases for adapter, API validation, async processing, concurrent processing, and nodepool --- test-design/testcases/adapter.md | 333 +++++++++++++- test-design/testcases/api-validation.md | 434 ++++++++++++++++++ test-design/testcases/async-processing.md | 386 ++++++++++++++++ .../testcases/concurrent-processing.md | 114 +++++ test-design/testcases/nodepool.md | 202 ++++++++ .../testcases/nodepool_lifecycle_negative.md | 108 +++++ 6 files changed, 1576 insertions(+), 1 deletion(-) create mode 100644 test-design/testcases/api-validation.md create mode 100644 test-design/testcases/async-processing.md create mode 100644 test-design/testcases/concurrent-processing.md create mode 100644 test-design/testcases/nodepool_lifecycle_negative.md diff --git a/test-design/testcases/adapter.md b/test-design/testcases/adapter.md index adce183..9c79753 100644 --- a/test-design/testcases/adapter.md +++ b/test-design/testcases/adapter.md @@ -3,7 +3,10 @@ ## Table of Contents 1. [Adapter framework can detect and report failures to cluster API endpoints](#test-title-adapter-framework-can-detect-and-report-failures-to-cluster-api-endpoints) -2. [Adapter framework can detect and handle resource timeouts](#test-title-adapter-framework-can-detect-and-handle-resource-timeouts) +2. [Adapter Job failure correctly reports Health=False](#test-title-adapter-job-failure-correctly-reports-healthfalse) +3. [Adapter framework can detect and handle resource timeouts](#test-title-adapter-framework-can-detect-and-handle-resource-timeouts) +4. [Adapter crash recovery and message redelivery](#test-title-adapter-crash-recovery-and-message-redelivery) +5. 
[API handles incomplete adapter status reports gracefully](#test-title-api-handles-incomplete-adapter-status-reports-gracefully) --- @@ -80,6 +83,120 @@ curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ --- +## Test Title: Adapter Job failure correctly reports Health=False + +### Description + +This test validates that when an adapter Job runs but exits with a non-zero exit code (simulated via `SIMULATE_RESULT=failure`), the adapter framework correctly detects the job failure and reports `Health=False` back to the HyperFleet API. This is distinct from invalid K8s resource errors — this tests the scenario where the Job is created and scheduled successfully but the workload itself fails. + +**Note:** There are two distinct adapter failure modes that should be differentiated: +- **Job execution failure** (this test): Job is created successfully (`Applied=True`) but the workload fails (`Health=False`). The `data` field should contain error details (exit code, error message). +- **Job creation failure** (covered by test 1 "Adapter framework can detect and report failures"): The Job or K8s resource cannot be created at all (`Applied=False`, `Health=False`). + +Both failure modes should be verified for Cluster and NodePool resource types, as the adapter framework logic applies to both. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + + +--- + +### Preconditions +1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources +2. HyperFleet API, Sentinel, and Adapter services are deployed and running successfully +3. 
Example-adapter supports `SIMULATE_RESULT=failure` environment variable + +--- + +### Test Steps + +#### Step 1: Record original adapter configuration and set failure mode +**Action:** +- Record current `SIMULATE_RESULT` value for restoration +- Set adapter to failure mode: +```bash +kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=failure +kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet --timeout=60s +``` + +**Expected Result:** +- Adapter deployment updated and new pod is Running + +#### Step 2: Create a cluster to trigger adapter processing +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "failure-test", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' +``` + +**Expected Result:** +- API returns successful response with cluster ID + +#### Step 3: Wait for job execution and verify job failure +**Action:** +- Wait for adapter to process the event and job to run +- Check job status: +```bash +kubectl get jobs -n ${CLUSTER_ID} -o json | jq '.items[0].status.conditions[]? 
| select(.type=="Failed")' +``` + +**Expected Result:** +- Job exists in the cluster namespace +- Job status shows `Failed=True` +- Job exit code is non-zero + +#### Step 4: Verify adapter status report +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ + | jq '.items[0].conditions' +``` + +**Expected Result:** +- `Applied` condition has `status: "True"` (Job was created successfully before failing) +- `Available` condition has `status: "False"` (work not completed; currently returns empty string — see Issue #25) +- `Health` condition has `status: "False"` with reason indicating job failure +- `data` field contains error details: + - Exit code (non-zero) + - Error message describing the failure + +#### Step 5: Verify top-level resource status reflects failure +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} | jq '.status.conditions' +``` + +**Expected Result:** +- `Ready` condition remains `status: "False"` (cluster did not reach ready state) +- `Available` condition remains `status: "False"` +- Cluster does not transition to Ready while adapter reports failure + +#### Step 6: Restore adapter to normal mode +**Action:** +```bash +kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=success +kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet --timeout=60s +``` + +**Expected Result:** +- Adapter restored to normal operation + +--- + ## Test Title: Adapter framework can detect and handle resource timeouts ### Description @@ -172,3 +289,217 @@ curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ ``` --- + + +## Test Title: Adapter crash recovery and message redelivery + +### Description + +This test validates that when an adapter crashes during event processing, the message broker correctly redelivers the message after the adapter recovers. 
This ensures that no events are lost due to adapter failures and the system maintains eventual consistency. + +**Note:** This test requires SIMULATE_RESULT=crash mode in the example-adapter, which may not be implemented. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + + +--- + +### Preconditions +1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) +2. HyperFleet API, Sentinel, and Adapter services are deployed +3. Google Pub/Sub is configured with appropriate acknowledgment deadline and retry policy + +--- + +### Test Steps + +#### Step 1: Configure adapter to crash on event receipt +**Action:** +- Set adapter environment variable to crash mode: +```bash +kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=crash +kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet +``` + +**Expected Result:** +- Adapter deployment updated +- New adapter pod starts + +#### Step 2: Create a cluster to trigger event +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "crash-test", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' +``` + +**Expected Result:** +- API returns successful response with cluster ID +- Sentinel publishes event to broker + +#### Step 3: Verify adapter crashes on event receipt +**Action:** +- Monitor adapter pod status: +```bash +kubectl get pods -n hyperfleet -l app.kubernetes.io/instance=example-adapter -w +``` + +**Expected Result:** +- Adapter pod crashes (CrashLoopBackOff or Error state) +- Pod restarts automatically + +#### Step 4: Restore adapter to normal 
mode +**Action:** +```bash +kubectl set env deployment -n hyperfleet -l app.kubernetes.io/instance=example-adapter SIMULATE_RESULT=success +kubectl rollout status deployment/example-adapter-hyperfleet-adapter -n hyperfleet +``` + +**Expected Result:** +- Adapter deployment updated +- New adapter pod starts and remains Running + +#### Step 5: Verify message redelivery and processing +**Action:** +- Wait for adapter to process redelivered message +- Check cluster status: +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.' +``` + +**Expected Result:** +- Adapter eventually processes the cluster event +- Status is reported (may show Applied=True, Available progressing) +- No event is lost + +--- + +### Notes + +- Message redelivery behavior depends on Google Pub/Sub configuration: + - Uses acknowledgment deadline and retry policy + - If adapter crashes before acknowledging message, Pub/Sub will redeliver after the acknowledgment deadline expires +- Sentinel may also republish events during its polling cycle if generation > observed_generation + +--- + +## Test Title: API handles incomplete adapter status reports gracefully + +### Description + +This test validates that the HyperFleet API can gracefully handle adapter status reports that are missing expected fields. When an adapter reports status with incomplete or missing condition fields (e.g., missing `reason`, `message`, or `observed_generation`), the API should accept the report without crashing and store what is available, rather than rejecting the entire status update. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + + +--- + +### Preconditions +1. 
Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources +2. HyperFleet API is deployed and running successfully +3. A cluster resource has been created and its cluster_id is available + +--- + +### Test Steps + +#### Step 1: Submit a status report with missing optional fields +**Action:** +- Send a status report with minimal fields (missing `reason`, `message`): +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ + -H "Content-Type: application/json" \ + -d '{ + "adapter": "test-incomplete-adapter", + "conditions": [ + { + "type": "Applied", + "status": "True" + } + ] + }' +``` + +**Expected Result:** +- API accepts the status report (HTTP 200/201) +- API does not return an error or crash +- Status is stored with available fields; missing fields default to empty or null + +#### Step 2: Submit a status report with missing observed_generation +**Action:** +- Send a status report without `observed_generation`: +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ + -H "Content-Type: application/json" \ + -d '{ + "adapter": "test-no-generation-adapter", + "conditions": [ + { + "type": "Applied", + "status": "True", + "reason": "ResourceApplied", + "message": "Resource applied successfully" + } + ] + }' +``` + +**Expected Result:** +- API accepts the status report +- `observed_generation` defaults to 0 or null +- Cluster conditions are not corrupted + +#### Step 3: Submit a status report with empty conditions array +**Action:** +- Send a status report with an empty conditions list: +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses \ + -H "Content-Type: application/json" \ + -d '{ + "adapter": "test-empty-conditions-adapter", + "conditions": [] + }' +``` + +**Expected Result:** +- API either accepts the report with empty conditions, or returns a clear validation error (HTTP 400) +- API 
does not crash or return HTTP 500 + +#### Step 4: Verify cluster state is not corrupted +**Action:** +- Retrieve the cluster and its statuses: +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} | jq '.conditions' +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.' +``` + +**Expected Result:** +- Cluster conditions remain consistent and valid +- Previously reported statuses from other adapters are not affected +- Incomplete status entries are stored but do not interfere with cluster Ready/Available evaluation + +--- diff --git a/test-design/testcases/api-validation.md b/test-design/testcases/api-validation.md new file mode 100644 index 0000000..3c629b6 --- /dev/null +++ b/test-design/testcases/api-validation.md @@ -0,0 +1,434 @@ +# Feature: API Input Validation + +## Table of Contents + +1. [API validates cluster name format (RFC 1123)](#test-title-api-validates-cluster-name-format-rfc-1123) +2. [API validates cluster name length](#test-title-api-validates-cluster-name-length) +3. [API validates nodepool name length (15 characters)](#test-title-api-validates-nodepool-name-length-15-characters) +4. [API validates required spec field](#test-title-api-validates-required-spec-field) +5. [API validates Kind field](#test-title-api-validates-kind-field) +6. [API validates JSON format](#test-title-api-validates-json-format) +7. [API handles database connection failure gracefully](#test-title-api-handles-database-connection-failure-gracefully) + +--- + +## Test Title: API validates cluster name format (RFC 1123) + +### Description + +This test validates that the HyperFleet API correctly rejects cluster creation requests with invalid name formats that don't comply with RFC 1123 DNS label naming conventions. 
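When automating this case, the same RFC 1123 DNS label rule can be checked client-side before driving the API, which makes it easy to generate known-good and known-bad names. A minimal sketch in bash; `is_rfc1123_label` is a local test-tooling helper (not an API call), and the regex mirrors the pattern quoted in the expected validation error:

```bash
# Local pre-check mirroring the RFC 1123 DNS label rule enforced by the API.
# This is test tooling only, not part of the HyperFleet API itself.
is_rfc1123_label() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

is_rfc1123_label "valid-name-01" && echo "valid-name-01: ok"
is_rfc1123_label "Invalid_Name_With_Underscore" || echo "Invalid_Name_With_Underscore: rejected"
```

Names the helper rejects (uppercase, underscores, leading or trailing hyphens) are exactly the ones the API is expected to answer with HTTP 400.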
+ +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API is deployed and running successfully +2. API is accessible via port-forward or ingress + +--- + +### Test Steps + +#### Step 1: Send POST request with invalid name format +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "Invalid_Name_With_Underscore", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains validation error: +```json +{ + "code": "HYPERFLEET-VAL-000", + "detail": "name must match pattern ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$", + "status": 400, + "title": "Validation Failed" +} +``` + +--- + +## Test Title: API validates cluster name length + +### Description + +This test validates that the HyperFleet API correctly rejects cluster creation requests with names exceeding the maximum allowed length. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. 
HyperFleet API is deployed and running successfully + +--- + +### Test Steps + +#### Step 1: Send POST request with name exceeding max length +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "this-is-a-very-long-cluster-name-that-exceeds-the-maximum-allowed-length-for-cluster-names", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains validation error about name length + +--- + +## Test Title: API validates nodepool name length (15 characters) + +### Description + +This test validates that the HyperFleet API correctly rejects nodepool creation requests with names exceeding 15 characters. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API is deployed and running successfully +2. 
A valid cluster exists + +--- + +### Test Steps + +#### Step 1: Send POST request with nodepool name exceeding 15 characters +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "NodePool", + "name": "this-is-too-long-nodepool-name", + "spec": {"nodeCount": 1, "platform": {"type": "gcp", "gcp": {"machineType": "n2-standard-4"}}} + }' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains validation error: +```json +{ + "code": "HYPERFLEET-VAL-000", + "detail": "name must be at most 15 characters", + "status": 400, + "title": "Validation Failed" +} +``` + +--- + +## Test Title: API validates required spec field + +### Description + +This test validates that the HyperFleet API correctly rejects cluster creation requests with missing or null spec field, as defined in the OpenAPI specification. + +**Note:** This test currently FAILS - see Issue #10. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. 
HyperFleet API is deployed and running successfully + +--- + +### Test Steps + +#### Step 1: Send POST request without spec field +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "test-no-spec" + }' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains validation error: +```json +{ + "code": "HYPERFLEET-VAL-000", + "detail": "spec is required", + "status": 400, + "title": "Validation Failed" +} +``` + +**Actual Result (Bug - Issue #10):** +- API returns HTTP 200 and creates cluster with `spec: null` + +--- + +## Test Title: API validates Kind field + +### Description + +This test validates that the HyperFleet API correctly rejects requests with invalid Kind field values. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API is deployed and running successfully + +--- + +### Test Steps + +#### Step 1: Send POST request with wrong Kind value +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "WrongKind", + "name": "test-wrong-kind", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains validation error about invalid Kind + +--- + +## Test Title: API validates JSON format + +### Description + +This test validates that the HyperFleet API correctly rejects requests with malformed JSON. 
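When scripting this case it helps to confirm locally which payloads are genuinely malformed before asserting on the API's response. A small sketch, assuming `python3` is on PATH (a `jq -e '.'` pipe works equally well); this is test tooling, not behavior of the API:

```bash
# Local check: does a payload parse as JSON at all?
# Used to classify test inputs; not part of the HyperFleet API.
is_valid_json() {
  printf '%s' "$1" | python3 -c 'import json,sys; json.load(sys.stdin)' 2>/dev/null
}

is_valid_json '{"kind": "Cluster", "name": "ok"}' && echo "well-formed"
is_valid_json '{invalid json content' || echo "malformed as intended"
```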
+ +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API is deployed and running successfully + +--- + +### Test Steps + +#### Step 1: Send POST request with invalid JSON +**Action:** +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{invalid json content' +``` + +**Expected Result:** +- API returns HTTP 400 Bad Request +- Response contains JSON parsing error + +--- + +## Test Title: API handles database connection failure gracefully + +### Description + +This test validates that the API handles database connection failures gracefully for cluster and nodepool operations, ensuring proper error responses (HTTP 503), no data corruption, and automatic recovery when the connection is restored. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Negative | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + + +--- + +### Preconditions +1. HyperFleet API is deployed and running successfully +2. A cluster has been created and is in Ready state +3. Ability to simulate database connection failure (e.g., stop PostgreSQL pod) +4. 
Ability to restore database connection + +--- + +### Test Steps + +#### Step 1: Establish baseline - verify normal operations +**Action:** +- Verify API is functioning correctly before simulating failure: +```bash +# Verify cluster operations +curl -s ${API_URL}/api/hyperfleet/v1/clusters | jq '.total' + +# Verify nodepool operations (using a known cluster ID) +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools | jq '.total' +``` + +**Expected Result:** +- API returns HTTP 200 with valid cluster list +- API returns HTTP 200 with valid nodepool list +- Record baseline data for comparison after recovery + +#### Step 2: Simulate database connection failure +**Action:** +- Stop or disrupt the PostgreSQL pod: +```bash +kubectl scale deployment -n hyperfleet hyperfleet-hyperfleet-api-postgresql --replicas=0 +``` + +**Expected Result:** +- PostgreSQL pod is terminated + +#### Step 3: Verify API returns proper error during outage +**Action:** +- Attempt API operations during database outage: +```bash +# List clusters +curl -s -w "\n%{http_code}" ${API_URL}/api/hyperfleet/v1/clusters + +# Create cluster +curl -s -w "\n%{http_code}" -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "db-failure-test", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' + +# List nodepools (using a known cluster ID from baseline) +curl -s -w "\n%{http_code}" ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools + +# Create nodepool +curl -s -w "\n%{http_code}" -X POST ${API_URL}/api/hyperfleet/v1/nodepools \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "NodePool", + "name": "db-fail-np", + "cluster_id": "'${CLUSTER_ID}'", + "spec": {"nodeCount": 1, "platform": {"type": "gcp", "gcp": {"machineType": "n2-standard-4"}}} + }' +``` + +**Expected Result:** +- All operations (cluster and nodepool) return HTTP 503 (Service Unavailable) +- 
API process does not crash +- No partial data is written +- API service remains running and responsive (returns errors, does not hang) + +#### Step 4: Restore database connection +**Action:** +- Restore the PostgreSQL pod: +```bash +kubectl scale deployment -n hyperfleet hyperfleet-hyperfleet-api-postgresql --replicas=1 +kubectl rollout status deployment/hyperfleet-hyperfleet-api-postgresql -n hyperfleet --timeout=60s +``` +- Wait for API to reconnect + +**Expected Result:** +- PostgreSQL pod starts successfully +- API reconnects to the database automatically + +#### Step 5: Verify recovery and data integrity +**Action:** +- Verify normal operations resume: +```bash +# Verify cluster operations +curl -s ${API_URL}/api/hyperfleet/v1/clusters | jq '.total' + +# Verify nodepool operations +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools | jq '.total' +``` + +**Expected Result:** +- API returns HTTP 200 with valid data for both cluster and nodepool operations +- All baseline clusters and nodepools are intact (no data loss or corruption) +- No orphaned records from failed operations during outage +- End-to-end workflow functions correctly (create new cluster and verify it reaches Ready state) + +--- diff --git a/test-design/testcases/async-processing.md b/test-design/testcases/async-processing.md new file mode 100644 index 0000000..1ef4941 --- /dev/null +++ b/test-design/testcases/async-processing.md @@ -0,0 +1,386 @@ +# Feature: Asynchronous Processing and Generation Tracking + +## Table of Contents + +1. [Generation update triggers new adapter processing](#test-title-generation-update-triggers-new-adapter-processing) +2. [Sentinel detects and publishes generation changes](#test-title-sentinel-detects-and-publishes-generation-changes) +3. [Adapter tracks observed_generation correctly](#test-title-adapter-tracks-observed_generation-correctly) +4. 
[Adapter resilience - processes events after restart](#test-title-adapter-resilience---processes-events-after-restart) +5. [Multiple events for same cluster are idempotent](#test-title-multiple-events-for-same-cluster-are-idempotent) + +--- + +## Test Title: Generation update triggers new adapter processing + +### Description + +This test validates that when a cluster's spec is updated, the generation field is incremented, and Sentinel detects this change and triggers a new adapter processing cycle. This ensures that spec changes are propagated through the system correctly. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Positive | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API, Sentinel, and Adapter are deployed and running +2. A cluster exists in Ready state + +--- + +### Test Steps + +#### Step 1: Create a cluster and wait for Ready state +**Action:** +```bash +# Create cluster +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "Cluster", + "name": "generation-test", + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-central1"}}} + }' + +# Wait for Ready state +``` + +**Expected Result:** +- Cluster created with generation=1 +- Cluster reaches Ready state + +--- + +#### Step 2: Record initial generation and observed_generation +**Action:** +```bash +# Get cluster generation +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} | jq '.generation' + +# Get adapter observed_generation +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.items[0].observed_generation' +``` + +**Expected Result:** +- Cluster generation = 1 +- Adapter observed_generation = 1 + +--- + +#### Step 3: Update cluster spec to trigger generation increment +**Action:** 
+```bash +curl -X PATCH ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} \ + -H "Content-Type: application/json" \ + -d '{ + "spec": {"platform": {"type": "gcp", "gcp": {"projectID": "test", "region": "us-west1"}}} + }' +``` + +**Expected Result:** +- API returns updated cluster with generation=2 +- Spec change is persisted + +--- + +#### Step 4: Wait for Sentinel to detect and publish event +**Action:** +- Wait for Sentinel polling interval (30-60 seconds) +- Check Sentinel logs for generation change detection + +**Expected Result:** +- Sentinel detects generation changed from 1 to 2 +- Sentinel publishes new event to broker + +--- + +#### Step 5: Verify adapter processed new generation +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.items[0].observed_generation' +``` + +**Expected Result:** +- Adapter observed_generation = 2 +- Adapter has processed the updated spec + +--- + +## Test Title: Sentinel detects and publishes generation changes + +### Description + +This test validates that Sentinel correctly detects generation changes during its polling cycle and publishes events only for resources that need processing. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Positive | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet API and Sentinel are deployed and running +2. 
Access to Sentinel logs + +--- + +### Test Steps + +#### Step 1: Create cluster and wait for initial processing +**Action:** +- Create cluster via API +- Wait for adapter to process and report status + +**Expected Result:** +- Cluster created +- Adapter status reported with observed_generation=1 + +--- + +#### Step 2: Monitor Sentinel logs during polling +**Action:** +```bash +kubectl logs -n hyperfleet -l app.kubernetes.io/name=sentinel -f +``` + +**Expected Result:** +- Sentinel logs show polling activity +- No events published for clusters where generation == observed_generation + +--- + +#### Step 3: Update cluster spec +**Action:** +- Update cluster spec via API +- Monitor Sentinel logs + +**Expected Result:** +- Sentinel detects generation > observed_generation +- Sentinel publishes event for the cluster +- Log shows "generation changed" or similar message + +--- + +## Test Title: Adapter tracks observed_generation correctly + +### Description + +This test validates that adapters correctly track and report observed_generation, which indicates the last generation they successfully processed. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Positive | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet system is deployed and running +2. 
Adapter is processing events correctly + +--- + +### Test Steps + +#### Step 1: Create cluster and verify initial observed_generation +**Action:** +```bash +# Create cluster +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters \ + -H "Content-Type: application/json" \ + -d '{"kind": "Cluster", "name": "obs-gen-test", "spec": {...}}' + +# Wait and check status +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.items[0].observed_generation' +``` + +**Expected Result:** +- observed_generation = 1 (matches cluster generation) + +--- + +#### Step 2: Update cluster multiple times +**Action:** +- Update cluster spec 3 times +- Check observed_generation after each update + +**Expected Result:** +- After update 1: generation=2, observed_generation eventually=2 +- After update 2: generation=3, observed_generation eventually=3 +- After update 3: generation=4, observed_generation eventually=4 + +--- + +## Test Title: Adapter resilience - processes events after restart + +### Description + +This test validates that adapters can recover from restarts and continue processing pending events. Events that were not acknowledged before restart should be redelivered and processed. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Positive | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet system is deployed and running +2. 
Adapter is running normally + +--- + +### Test Steps + +#### Step 1: Create cluster and verify initial processing +**Action:** +- Create cluster via API +- Verify adapter processes it and reports status + +**Expected Result:** +- Cluster created and processed +- Adapter status shows Available=True or processing in progress + +--- + +#### Step 2: Restart adapter pod +**Action:** +```bash +kubectl delete pod -n hyperfleet -l app.kubernetes.io/name=hyperfleet-adapter +``` + +**Expected Result:** +- Adapter pod is terminated +- New adapter pod starts up + +--- + +#### Step 3: Create new cluster while adapter is restarting +**Action:** +- Create another cluster via API during adapter restart window + +**Expected Result:** +- Cluster created successfully (API is independent of adapter) + +--- + +#### Step 4: Verify adapter processes pending events after restart +**Action:** +- Wait for adapter to fully restart +- Check status of the new cluster + +**Expected Result:** +- Adapter recovers and connects to broker +- Adapter processes pending events +- Both clusters eventually show adapter status + +--- + +## Test Title: Multiple events for same cluster are idempotent + +### Description + +This test validates that when multiple events are published for the same cluster (e.g., due to retries or rapid updates), the adapter handles them idempotently without creating duplicate resources. + +--- + +| **Field** | **Value** | +|-----------|-----------| +| **Pos/Neg** | Positive | +| **Priority** | Tier2 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + +--- + +### Preconditions +1. HyperFleet system is deployed and running +2. 
Adapter uses discovery pattern to check existing resources + +--- + +### Test Steps + +#### Step 1: Create cluster and wait for initial processing +**Action:** +- Create cluster via API +- Wait for adapter to create resources + +**Expected Result:** +- Cluster created +- Adapter creates namespace and other resources +- Status reported + +--- + +#### Step 2: Trigger multiple events for same cluster +**Action:** +- Update cluster spec multiple times in rapid succession +- Or restart Sentinel to trigger re-polling + +**Expected Result:** +- Multiple events may be published for the same cluster + +--- + +#### Step 3: Verify no duplicate resources created +**Action:** +```bash +# Check namespace count (should be exactly 1) +kubectl get namespace ${CLUSTER_ID} + +# Check job count (should be 1 or managed correctly) +kubectl get jobs -n ${CLUSTER_ID} +``` + +**Expected Result:** +- Only one namespace exists for the cluster +- No duplicate jobs or resources +- Adapter discovers existing resources and evaluates postconditions + +--- + +#### Step 4: Verify status is consistent +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.items | length' +``` + +**Expected Result:** +- Only one status entry per adapter +- Status reflects latest state (not duplicated) + +--- diff --git a/test-design/testcases/concurrent-processing.md b/test-design/testcases/concurrent-processing.md new file mode 100644 index 0000000..3e52659 --- /dev/null +++ b/test-design/testcases/concurrent-processing.md @@ -0,0 +1,114 @@ +# Feature: Concurrent Processing + +## Table of Contents + +1. [Concurrent cluster creation - no message loss](#test-title-concurrent-cluster-creation---no-message-loss) + +--- + +## Test Title: Concurrent cluster creation - no message loss + +### Description + +This test validates that the system can handle multiple cluster creation requests submitted simultaneously without message loss, resource conflicts, or processing failures. 
It ensures that the message broker, Sentinel, and adapters can correctly process concurrent events and that all clusters reach their expected final state.
+
+---
+
+| **Field** | **Value** |
+|-----------|-----------|
+| **Pos/Neg** | Positive |
+| **Priority** | Tier1 |
+| **Status** | Draft |
+| **Automation** | Not Automated |
+| **Version** | MVP |
+| **Created** | 2026-02-11 |
+| **Updated** | 2026-02-11 |
+
+
+---
+
+### Preconditions
+1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources
+2. HyperFleet API, Sentinel, and Adapter services are deployed and running successfully
+3. The adapters defined in testdata/adapter-configs are all deployed successfully
+
+---
+
+### Test Steps
+
+#### Step 1: Submit 5 cluster creation requests simultaneously
+**Action:**
+- Submit 5 POST requests in parallel using background processes, saving each response so the cluster IDs can be collected for the later steps:
+```bash
+CLUSTER_IDS=()
+for i in $(seq 1 5); do
+  curl -s -X POST ${API_URL}/api/hyperfleet/v1/clusters \
+    -H "Content-Type: application/json" \
+    -d "{
+      \"kind\": \"Cluster\",
+      \"name\": \"concurrent-test-${i}\",
+      \"spec\": {\"platform\": {\"type\": \"gcp\", \"gcp\": {\"projectID\": \"test\", \"region\": \"us-central1\"}}}
+    }" -o /tmp/concurrent-test-${i}.json &
+done
+wait
+
+# Collect the cluster IDs used by the later steps
+for i in $(seq 1 5); do
+  CLUSTER_IDS+=("$(jq -r '.id' /tmp/concurrent-test-${i}.json)")
+done
+```
+
+**Expected Result:**
+- All 5 requests return successful responses (HTTP 200/201)
+- Each response contains a unique cluster ID
+- No request is rejected or fails due to concurrency
+
+#### Step 2: Wait for all clusters to be processed
+**Action:**
+- Poll each cluster's status until all reach Ready state or a timeout is reached (10 minutes per cluster in this example):
+```bash
+for CLUSTER_ID in "${CLUSTER_IDS[@]}"; do
+  for attempt in $(seq 1 60); do
+    READY=$(curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} \
+      | jq -r '.conditions[] | select(.type == "Ready") | .status')
+    [ "${READY}" = "True" ] && break
+    sleep 10
+  done
+done
+```
+
+**Expected Result:**
+- All 5 clusters eventually reach Ready=True and Available=True
+- No cluster is stuck in a pending or processing state indefinitely
+
+#### Step 3: Verify Kubernetes resources for all clusters
+**Action:**
+- Check that each 
cluster has its own namespace and expected resources: +```bash +for CLUSTER_ID in ${CLUSTER_IDS[@]}; do + kubectl get namespace ${CLUSTER_ID} + kubectl get jobs -n ${CLUSTER_ID} +done +``` + +**Expected Result:** +- 5 separate namespaces exist (one per cluster) +- Each namespace contains the expected jobs/resources created by adapters +- No cross-contamination between clusters (resources are isolated) + +#### Step 4: Verify adapter statuses for all clusters +**Action:** +- Check that each cluster has complete adapter status reports: +```bash +for CLUSTER_ID in ${CLUSTER_IDS[@]}; do + curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/statuses | jq '.items | length' +done +``` + +**Expected Result:** +- Each cluster has the expected number of adapter status entries +- All adapters report Applied=True, Available=True, Health=True for each cluster +- No missing status reports (no message was lost) + +#### Step 5: Cleanup +**Action:** +- Delete all 5 test clusters: +```bash +for CLUSTER_ID in ${CLUSTER_IDS[@]}; do + curl -X DELETE ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID} +done +``` + +**Expected Result:** +- All clusters are deleted successfully + +--- diff --git a/test-design/testcases/nodepool.md b/test-design/testcases/nodepool.md index c783973..557060c 100644 --- a/test-design/testcases/nodepool.md +++ b/test-design/testcases/nodepool.md @@ -4,6 +4,8 @@ 1. [Nodepools Resource Type - Workflow Validation](#test-title-nodepools-resource-type---workflow-validation) 2. [Nodepools Resource Type - K8s Resource Check Aligned with Preinstalled NodePool Related Adapters Specified](#test-title-nodepools-resource-type---k8s-resource-check-aligned-with-preinstalled-nodepool-related-adapters-specified) +3. [Nodepools Resource Type - List and Get API Operations](#test-title-nodepools-resource-type---list-and-get-api-operations) +4. 
[Nodepools Resource Type - Multiple NodePools Coexistence](#test-title-nodepools-resource-type---multiple-nodepools-coexistence) --- @@ -173,3 +175,203 @@ curl -X DELETE ${API_URL}/api/hyperfleet/v1/nodepools/{nodepool_id} ``` --- + +## Test Title: Nodepools Resource Type - List and Get API Operations + +### Description + +This test validates that the NodePool list and get API endpoints work correctly. It verifies that all nodepools belonging to a cluster can be listed, and that individual nodepool details can be retrieved by nodepool ID. + +--- + +| **Field** | **Value** | +|-----------|---------------| +| **Pos/Neg** | Positive | +| **Priority** | Tier1 | +| **Status** | Draft | +| **Automation** | Not Automated | +| **Version** | MVP | +| **Created** | 2026-02-11 | +| **Updated** | 2026-02-11 | + + +--- + +### Preconditions + +1. Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources +2. HyperFleet API and HyperFleet Sentinel services are deployed and running successfully +3. The adapters defined in testdata/adapter-configs are all deployed successfully +4. A cluster resource has been created and its cluster_id is available +5. At least one nodepool has been created under the cluster + +--- + +### Test Steps + +#### Step 1: Create a nodepool under the cluster +**Action:** +- Submit a POST request to create a NodePool resource: +```bash +curl -X POST ${API_URL}/api/hyperfleet/v1/nodepools \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "NodePool", + "name": "list-get-test", + "cluster_id": "'${CLUSTER_ID}'", + "spec": { ... } + }' +``` + +**Expected Result:** +- API returns successful response with nodepool ID + +#### Step 2: List all nodepools belonging to the cluster +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools | jq '.' 
+```
+
+**Expected Result:**
+- Response returns HTTP 200 (OK) status code
+- Response contains an array of nodepools
+- The created nodepool appears in the list
+- Each nodepool entry includes: id, name, cluster_id, conditions, spec
+
+#### Step 3: Get individual nodepool details by ID
+**Action:**
+```bash
+curl -s ${API_URL}/api/hyperfleet/v1/nodepools/${NODEPOOL_ID} | jq '.'
+```
+
+**Expected Result:**
+- Response returns HTTP 200 (OK) status code
+- Response contains the full nodepool object including:
+  - `id`: Matches the requested nodepool ID
+  - `name`: Matches the created nodepool name
+  - `cluster_id`: Matches the parent cluster ID
+  - `conditions`: Current condition states
+  - `spec`: The nodepool spec as submitted
+
+#### Step 4: Cleanup resources
+**Action:**
+- Delete the nodepool-specific Kubernetes resources (the selector below assumes adapters label resources with the `nodepool-id` label used elsewhere in this suite):
+```bash
+kubectl delete jobs,configmaps -n ${CLUSTER_ID} -l nodepool-id=${NODEPOOL_ID}
+```
+
+**Expected Result:**
+- Nodepool-specific resources are deleted successfully
+
+**Note:** This is a workaround cleanup method. Once CLM supports DELETE operations for "nodepools" resource type, this step should be replaced with:
+```bash
+curl -X DELETE ${API_URL}/api/hyperfleet/v1/nodepools/{nodepool_id}
+```
+
+---
+
+## Test Title: Nodepools Resource Type - Multiple NodePools Coexistence
+
+### Description
+
+This test validates that multiple nodepools can be created under the same cluster and coexist without conflicts. It verifies that each nodepool is processed independently by the adapters, has its own set of Kubernetes resources, and reports its own status without interfering with other nodepools.
+
+---
+
+| **Field** | **Value** |
+|-----------|---------------|
+| **Pos/Neg** | Positive |
+| **Priority** | Tier1 |
+| **Status** | Draft |
+| **Automation** | Not Automated |
+| **Version** | MVP |
+| **Created** | 2026-02-11 |
+| **Updated** | 2026-02-11 |
+
+
+---
+
+### Preconditions
+
+1. 
Environment is prepared using [hyperfleet-infra](https://github.com/openshift-hyperfleet/hyperfleet-infra) with all required platform resources +2. HyperFleet API and HyperFleet Sentinel services are deployed and running successfully +3. The adapters defined in testdata/adapter-configs are all deployed successfully +4. A cluster resource has been created and its cluster_id is available + - **Cleanup**: Cluster resource cleanup should be handled in test suite teardown where cluster was created + +--- + +### Test Steps + +#### Step 1: Create multiple nodepools under the same cluster +**Action:** +- Create 3 nodepools with different names: +```bash +for i in 1 2 3; do + curl -X POST ${API_URL}/api/hyperfleet/v1/nodepools \ + -H "Content-Type: application/json" \ + -d "{ + \"kind\": \"NodePool\", + \"name\": \"np-coexist-${i}\", + \"cluster_id\": \"${CLUSTER_ID}\", + \"spec\": { ... } + }" +done +``` + +**Expected Result:** +- All 3 nodepools are created successfully +- Each returns a unique nodepool ID + +#### Step 2: Verify all nodepools appear in the list +**Action:** +```bash +curl -s ${API_URL}/api/hyperfleet/v1/clusters/${CLUSTER_ID}/nodepools | jq '.items | length' +``` + +**Expected Result:** +- List contains all 3 nodepools +- Each nodepool has a distinct ID and name + +#### Step 3: Verify each nodepool reaches Ready state independently +**Action:** +- Check each nodepool's conditions: +```bash +for NODEPOOL_ID in ${NODEPOOL_IDS[@]}; do + curl -s ${API_URL}/api/hyperfleet/v1/nodepools/${NODEPOOL_ID} | jq '.conditions' +done +``` + +**Expected Result:** +- All 3 nodepools eventually reach Ready=True and Available=True +- Each nodepool's adapter status is independent (one nodepool's failure does not block others) + +#### Step 4: Verify Kubernetes resources are isolated per nodepool +**Action:** +- Check that each nodepool has its own set of resources: +```bash +kubectl get configmaps -n ${CLUSTER_ID} -l nodepool-id +``` + +**Expected Result:** +- Each nodepool's 
resources are labeled/named distinctly
+- No resource name collisions between nodepools
+- Resources for one nodepool do not overwrite resources of another
+
+#### Step 5: Cleanup resources
+
+**Action:**
+- Delete the nodepool-specific Kubernetes resources (the selector below assumes adapters label resources with the `nodepool-id` label checked in Step 4):
+```bash
+kubectl delete jobs,configmaps -n ${CLUSTER_ID} -l nodepool-id=${NODEPOOL_ID}
+```
+
+**Expected Result:**
+- All nodepool-specific resources are deleted successfully
+
+**Note:** This is a workaround cleanup method. Once CLM supports DELETE operations for "nodepools" resource type, this step should be replaced with:
+```bash
+curl -X DELETE ${API_URL}/api/hyperfleet/v1/nodepools/{nodepool_id}
+```
+
+---
diff --git a/test-design/testcases/nodepool_lifecycle_negative.md b/test-design/testcases/nodepool_lifecycle_negative.md
new file mode 100644
index 0000000..77f00ee
--- /dev/null
+++ b/test-design/testcases/nodepool_lifecycle_negative.md
@@ -0,0 +1,108 @@
+# Feature: NodePool Lifecycle - Negative Tests
+
+## Table of Contents
+
+1. [Create nodepool with non-existent cluster returns 404](#test-title-create-nodepool-with-non-existent-cluster-returns-404)
+
+---
+
+## Test Title: Create nodepool with non-existent cluster returns 404
+
+### Description
+
+This test verifies that the HyperFleet API checks the existence of the parent cluster resource when creating a nodepool, returning HTTP 404 Not Found for non-existent clusters. This ensures proper resource hierarchy validation and prevents orphaned nodepool records.
+
+---
+
+| **Field** | **Value** |
+|-----------|-----------|
+| **Pos/Neg** | Negative |
+| **Priority** | Tier1 |
+| **Status** | Draft |
+| **Automation** | Not Automated |
+| **Version** | MVP |
+| **Created** | 2026-02-11 |
+| **Updated** | 2026-02-11 |
+
+---
+
+### Preconditions
+1. HyperFleet API is deployed and running successfully
+2. 
No cluster exists with the test cluster ID + +--- + +### Test Steps + +#### Step 1: Attempt to create nodepool with non-existent cluster ID +**Action:** +```bash +FAKE_CLUSTER_ID="non-existent-cluster-12345" + +curl -X POST ${API_URL}/api/hyperfleet/v1/clusters/${FAKE_CLUSTER_ID}/nodepools \ + -H "Content-Type: application/json" \ + -d '{ + "kind": "NodePool", + "name": "np-test", + "spec": { + "nodeCount": 1, + "platform": { + "type": "gcp", + "gcp": { + "machineType": "n2-standard-4" + } + } + } + }' +``` + +**Expected Result:** +- API returns HTTP 404 Not Found +- Response follows RFC 9457 Problem Details format: +```json +{ + "type": "https://api.hyperfleet.io/errors/not-found", + "title": "Not Found", + "status": 404, + "detail": "Cluster not found" +} +``` + +--- + +#### Step 2: Verify error response format (RFC 9457) +**Action:** +- Parse the error response and verify required fields + +**Expected Result:** +- Response contains `type` field +- Response contains `title` field +- Response contains `status` field with value 404 +- Optional: Response contains `detail` field with descriptive message + +--- + +#### Step 3: Verify no nodepool was created +**Action:** +```bash +curl -X GET ${API_URL}/api/hyperfleet/v1/clusters/${FAKE_CLUSTER_ID}/nodepools +``` + +**Expected Result:** +- API returns HTTP 404 (cluster doesn't exist, so nodepools list is not accessible) +- OR returns empty list if API allows listing nodepools for non-existent clusters + +--- + +#### Step 4: Verify no Sentinel events triggered +**Action:** +- Check that no namespace was created for the fake cluster +```bash +kubectl get namespace ${FAKE_CLUSTER_ID} +``` + +**Expected Result:** +- Namespace does not exist +- No adapter processing occurred (validation-level rejection) + +---
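The field checks in Step 2 can be scripted. The sketch below factors the RFC 9457 Problem Details validation into a small helper and, so it can run without a live API, exercises it against a sample 404 body. `jq` is assumed to be available; the helper name and the sample payload are illustrative (in the real test the body would come from the Step 1 `curl` call), not part of the API contract.

```shell
#!/usr/bin/env bash
# Hypothetical helper for Step 2: assert that an error body follows the
# RFC 9457 Problem Details shape (string `type`, string `title`, status 404).
validate_problem_details() {
  echo "$1" | jq -e '
    (.type  | type == "string") and
    (.title | type == "string") and
    (.status == 404)
  ' > /dev/null
}

# Sample body mirroring the expected response in Step 1 (illustrative only).
SAMPLE='{"type":"https://api.hyperfleet.io/errors/not-found","title":"Not Found","status":404,"detail":"Cluster not found"}'

if validate_problem_details "$SAMPLE"; then
  echo "RFC 9457 check: PASS"
else
  echo "RFC 9457 check: FAIL"
fi

# A body missing the required `type`/`status` members must be rejected.
if validate_problem_details '{"title":"Not Found"}'; then
  echo "incomplete body check: FAIL"
else
  echo "incomplete body check: PASS"
fi
```

`jq -e` sets a non-zero exit code when the filter result is `false`, which is what lets the helper double as a shell-level assertion; `detail` is deliberately not required, matching the "Optional" note in Step 2.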