Skip to content

pipeline-manager: pipeline tags and distinguish client metadata#6229

Merged
Karakatiza666 merged 3 commits into
mainfrom
group-pipelines
Jun 19, 2026
Merged

pipeline-manager: pipeline tags and distinguish client metadata#6229
Karakatiza666 merged 3 commits into
mainfrom
group-pipelines

Conversation

@Karakatiza666

@Karakatiza666 Karakatiza666 commented May 13, 2026

Copy link
Copy Markdown
Contributor

pipeline-manager changes

This PR adds the ClientMetadata struct that is flattened on serialization and is not bound to the pipeline deployment state; it is to be used by clients to store certain metadata - like description, or tags for organizing the pipelines. client_metadata can be updated when the pipeline is running, it does not trigger version nor refresh_version increments.

web-console changes

Re-generate API bindings, forward the new tags field to be accessed later

Added search bar to the pipeline list in the sidebar when editing a pipeline

Python SDK

Added support for the new tags field

Testing: manual

@Karakatiza666 Karakatiza666 requested a review from snkas May 13, 2026 22:52
@mihaibudiu

Copy link
Copy Markdown
Contributor

so this is not per user, it's per pipeline manager.
what happens to your view if someone else is moving pipelines from another browser?

@gz

gz commented May 13, 2026

Copy link
Copy Markdown
Contributor

I'm a bit confused why we went with folders when users were asking for tags and in the issue everyone seemed to like tags?

@gz gz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets just build the version people asked for (tags) from the beginning no need to confuse users with different models

@@ -0,0 +1,6 @@
-- Add metadata field to the pipeline table.
-- This is arbitrary, optional text provided by the user.
ALTER TABLE pipeline ADD COLUMN metadata TEXT NOT NULL DEFAULT '';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's good to have a catch-all JSON element this. We're already strictly structured, we dont have to switch to the dark side of using unstructured data on top of a relational database.

pub struct PipelineInfo {
pub id: PipelineId,
pub name: String,
/// Deprecated: use `metadata` instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to mark it deprecated; nothing has changed we weren't using it before we dont need to use it now?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to discuss this more with Simon once he's back; I think it's benefitial to fold the "description" column into the "metadata" column model I propose here

&& program_config.is_none())
);

// Metadata-only fast path. `metadata` is a free-form client-side annotation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont understand how it can be free form and at the same time serve as a web-ui folder structure.

If we use if for that IMO it deserves an well structured field that is reserved for use by webconsole so we dont have arbitrary clients overwriting/using it differently.

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

@gz

We're already strictly structured, we dont have to switch to the dark side of using unstructured data on top of a relational database.

The model I am proposing is not "unstructured", it just doesn't bind relational structure to the shape structure for a family of pipeline information fields with similar semantics - they are not used by the backend, not relevant to backend or pipeline deployment lifecycle, only used by the clients - "description", "tags", "path". JSON operations on a small dataset (for the number of pipelines) are fine in Postgres.

Are you saying you would prefer a e.g. "tags" column in PipelineDescriptor, with the same semantics as the "metadata" field I added? (can be changed without triggering pipeline version updates, when the pipeline is running)?

@Karakatiza666

Karakatiza666 commented May 14, 2026

Copy link
Copy Markdown
Contributor Author

... a well structured field that is reserved for use by webconsole so we dont have arbitrary clients overwriting/using it differently.

Two points I want to make here:

  • database structure =/= API structure; if we wanted to harden the API contract we would add validation at endpoint level, not modify the relational schema
  • I don't expect rogue clients to not respect the structure or overwriting the field, losing web-console's information. I guess I misused the term "free-form"; I propose "metadata" as a JSON field which contains various fields clients can set for e.g. organizing the pipelines - like "tags" or "path". Clients would respect this convention and not overwrite the field with e.g. plaintext, but individual clients can add their unique metadata fields they want to without affecting metadata set by other clients, and without having to add support for it in pipeline-manager.
    With the current, simpler model of pipeline-manager not validating "metadata" field it is easier for clients to implement related features, and, again, we don't expect rogue clients that just overwrite other clients' data

@Karakatiza666

Karakatiza666 commented May 14, 2026

Copy link
Copy Markdown
Contributor Author

The customer did not ask for tags specifically, they listed tags and folders as ways to solve their UX problem. I need to discuss tags with Anna because the feature is not trivial - good location to show and edit tags, and e.g. version can be displayed as another tag, which could improve the design and UX

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

Whether tags or folders, the feature is likely to be used by tooling around Feldera some of our customers have built, not just web-console

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

what happens to your view if someone else is moving pipelines from another browser?

They see the change immediately - within the polling period of web-console, which is 2 seconds now

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

What I would like to do is merge this PR with folders as a simpler solution, and then open a separate PR for user-defined tags and search-by-tag. Tags and folders are orthogonal, both are optional, both are useful.
The folder feature is not invasive, it is only "visible" through the small drag icon at the left of the pipeline list item

@snkas snkas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My design suggestion for the backend is:

  • Adding a pipeline field tags, of type Vec<String>
  • Adding a pipeline table field tags with type VARCHAR[] or TEXT[]
  • The tags field can be edited like any other field using PATCH /v0/pipelines/<name>
  • Only tricky part is allowing the tags to be editable while the pipeline is running.

I can get to it sometime mid next week.

Reasoning for my design:

  • In future, the backend can potentially filter by tag without having to do JSON parsing
  • In future, the tags can still be migrated as the type is known to the database, without having to do JSON operations
  • Having a separate pipeline_tag table is likely too much as they are always part of the pipeline
  • Deprecating description should be a separate PR
  • It seems unnecessary to have metadata as nested, as what is "metadata" is not as well defined, and at least for now besides tags I'm not sure how much more of these fields we would add

The above does assume that we don't support for instance defining colors of tags, as that would require them to become independently management objects with their own API endpoint.

@snkas

snkas commented May 15, 2026

Copy link
Copy Markdown
Contributor

Regarding tags vs. nested directory structure: it seems to be most simple to just have tags, as it allows a pipeline to belong to multiple groups, and it is a concept users are already familiar with (e.g., GitHub). Directory structure is also a big departure from the current list.

@snkas

snkas commented May 15, 2026

Copy link
Copy Markdown
Contributor

For design of UI, this seems most straightforward (mock-up):
tags

  • In search bar, you can hint Search pipelines... e.g., "example", "tag:example"
  • There could be a dropdown filter by tags as well, that shows all the tags and once selected shows all pipelines with that tag

@lalithsuresh

lalithsuresh commented May 15, 2026

Copy link
Copy Markdown
Contributor

NACK from me on folders too. Folders are used in environments where there is a natural filesystem hierarchy for projects and a mix of content types (e.g. workspace, catalogs, schemas and separate files for sql like in dbt and other warehouse cloud products). It's not a good fit here.

Tags are the most appropriate here. There's many other things in the backlog we should get to first, but let's go through UI design here as well before implementing anything.

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

I'll remove the folders and make this PR pipeline-manager only. Tags will be waiting for the UI design

@mythical-fred mythical-fred left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backend (pipeline-manager) changes are well-structured and thoroughly tested — the PipelineFieldUpdates struct with exhaustive destructuring is a nice pattern, and the metadata-only fast-path that skips version/refresh_version bumps is well-reasoned.

However, the web-console side adds ~1,060 lines of new behavioral code (DraggableTreeView, PipelineTreeView, SidebarPipelineTreeView, plus the Table.svelte rewrite) with zero tests. This is a hard-rule violation — behavior changes without tests get REQUEST_CHANGES, no exceptions. See inline comments for specifics.

Gerd's concerns about the metadata field design are still open; I won't duplicate them here.

@@ -0,0 +1,588 @@
<script lang="ts" generics="T extends { name: string }">

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 588 lines of new interactive behavior — drag-and-drop, folder creation, folder renaming, expand/collapse state, multi-selection propagation. None of it has tests.

Good first targets for Vitest + @testing-library/svelte:

  1. Tree-building logic (ensureFolder, sortNode) — extract into a pure function and unit-test the tree shape for various folder-path inputs.
  2. folderCheckState — pure logic over a set; easy to test without DOM.
  3. is_metadata_only-style classification on the JS side: parseMetadata, buildMetadata, getFolderPath — these are already pure functions, just not tested.
  4. Component tests: render the tree view with a few items, simulate a drag from one row onto another, assert the onCreateFolderFor callback fires with the correct arguments.

The recommended stack is Vitest + @testing-library/svelte — runs in Node, no browser, no flake.

if (!metadata) {
return {}
}
try {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRY: parseMetadata is duplicated here and in Table.svelte (lines ~57-64). Extract it into a shared utility (e.g. $lib/functions/pipelines/metadata.ts) — along with getFolderPath and buildMetadata — so the parsing logic lives in one place and can be unit-tested once.

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

I agree that "description" API field should not be deprecated in the PR. More so, "description" and "tags" fields should be top-level in the SelectedPipelineInfo API struct and others. I have updated the PR to reflect that.

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

I argue against having separate "description" and "tags" columns in the Pipeline table.
I believe there is a stronger technical case to have the single client_metadata JSON column that stores the "description" and "tags":

  • For the current requested feature set, the backend does not need to know about the concept of tags. This is purely a client-driven feature; the only thing pipeline-manager cares about is the semantics of the family of the client-driven metadata fields: they are independent from pipeline versioning and deployment runtime status, so need special handling when PATCH-ing the pipeline. There is no feature in the near future that would require pipeline-manager knowing about any of these fields - past the strong API typing.
  • Simon cited worse performance of querying the JSON field compared to the SQL ARRAY TEXT. In our case, the performance impact is not material - firstly, all handling of the tags and other metadata is client-side, and there is no use case to process it in pipeline-manager, secondly - pipeline-manager would need to store tens of thousands of pipelines for impact of querying over JSON vs TEXT ARRAY to be noticeable. PostgreSQL 11 we currently use has plenty of operations for querying over JSON. For example, GIN index can be declared to speed up set queries on an array inside JSONB column.
  • If needed, writing a SQL query over JSON(B) field is only a marginal complication, and if we really add more features that require SQL querying over metadata we can easily migrate the schema to dedicated SQL columns because the structure is well known by convention and API contract.

The benefits of a single client_metadata column are clear: the pipeline-manager remains simpler, and clients can iterate on the functionality with trivial changes to the backend without the need for DB migrations.

@Karakatiza666 Karakatiza666 changed the title Add ability to group pipelines into "folders" Make client metadata for pipelines a special case in pipeline-manager, add 'tags' May 18, 2026

@snkas snkas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments -- let's discuss on how best to get this PR in the shape to get merged.

// "material change happened" counter; client-metadata edits are not
// material.
// - No `pipeline_monitor_event` is emitted. Such an event triggers a
// Postgres `NOTIFY` that wakes the per-pipeline runner. The runner has

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no relation between pipeline_monitor_event and NOTIFY.

// This will also return an error if the pipeline does not exist.
let current = get_pipeline(txn, tenant_id, original_name).await?;

// Apply the patch to a copy of the current client metadata so we can both

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a departure from the current way that patches are handled, only at the top level.

///
/// Persisted as a single JSON object in the `client_metadata` text column
/// (renamed from `metadata` in V35). The schema lives in code, not the
/// database — adding a new annotation field only requires extending this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are there so many em-dashes in all comments (which is not ASCII)?

/// not break when newer writers add fields. Empty / missing values
/// (description == "", empty tags) are normalized to `None` so that the
/// serialized form stays minimal (`{}` for fully empty metadata, which we
/// further collapse to `""` in storage).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Can you add in the commit description how backward compatibility is done?


/// Self-descriptive tags for grouping / filtering.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub tags: Option<Vec<String>>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any enforcement on tags?

if self.is_empty() {
String::new()
} else {
// Serializing a fixed-shape struct of strings cannot fail.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed-shape struct? What is that?

}
}

/// True when no annotations are set.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are annotations?

pub fn apply_patch(&mut self, patch: ClientMetadata) {
let ClientMetadata { description, tags } = patch;
if let Some(d) = description {
self.description = if d.is_empty() { None } else { Some(d) };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description is just an empty string, why is it interpreted as None now?

self.description = if d.is_empty() { None } else { Some(d) };
}
if let Some(t) = tags {
self.tags = if t.is_empty() { None } else { Some(t) };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty array seems perfectly fine?

Comment thread openapi.json
}
},
"PatchPipeline": {
"type": "object",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the OpenAPI spec shift so much?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just did due to the new CLientMetadata

/// Client-generated data stored alongside a pipeline.
///
/// Persisted as a single JSON object in the `client_metadata` text column
/// (renamed from `metadata` in V35). The schema lives in code, not the

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no V35 migration?

@snkas

snkas commented May 18, 2026

Copy link
Copy Markdown
Contributor

Does it pass CI?

@Karakatiza666

Karakatiza666 commented May 18, 2026

Copy link
Copy Markdown
Contributor Author

Does it pass CI?

I haven't run CI on it yet

@mythical-fred mythical-fred left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big improvement since my last pass — the untested ~1,060-line drag-and-drop tree view is gone, folders dropped, and the backend now ships a proper integration test (pipeline_client_metadata_update_while_running) that covers the metadata fast-path including the version/refresh_version/monitor-event invariants. The destructuring-based PipelineFieldUpdates pattern is nice. Downgrading my previous REQUEST_CHANGES to a COMMENT.

Not approving yet:

  • @snkas has an active review pass with many open prose threads and a real V35 migration question; let that play out.
  • The metadata-shape design (client_metadata JSON column vs flat tags/description columns) is still being debated between @gz, @snkas and the author. Not my call to settle.

One residual finding inline.

Comment on lines +247 to +337
@@ -273,6 +275,8 @@ const toPipeline = <
) => ({
name: pipeline.name,
description: pipeline.description ?? '',
tags: pipeline.tags ?? [],
path: pipeline.path ?? '',
runtimeConfig: pipeline.runtime_config,
programConfig: pipeline.program_config!,
programCode: pipeline.program_code ?? '',
@@ -293,7 +297,9 @@ const toExtendedPipeline = ({
deploymentStatus: deployment_status,
deploymentStatusSince: pipeline.deployment_status_since,
programStatusSince: pipeline.program_status_since,
description: pipeline.description,
description: pipeline.description ?? '',
tags: pipeline.tags ?? [],
path: pipeline.path ?? '',
id: pipeline.id,
name: pipeline.name,
programCode: pipeline.program_code ?? '',
@@ -327,6 +333,8 @@ const toExtendedPipeline = ({
const fromPipeline = <T extends Partial<Pipeline>>(pipeline: T) => ({
name: pipeline?.name,
description: pipeline?.description,
tags: pipeline?.tags,
path: pipeline?.path,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead field: path is leftover from the removed tree-view feature. The API no longer carries it (no path field on PipelineInfo / PipelineSelectedInfo / PostPutPipeline / PatchPipeline after the refactor — only description and tags), so pipeline.path is always undefined here and in lines 279, 302, 337. Drop the four path: … lines. While here: tags is read into PipelineThumb/Pipeline and written back via fromPipeline, but nothing in this PR actually edits or displays tags — consider trimming the writeback path too until a follow-up PR adds the UI, otherwise it's dead code in waiting.

@mythical-fred mythical-fred left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blockers:

  1. Still no web-console tests for the new search/filter behavior in List.svelte (same as review 4300127304).
  2. Metadata storage shape still unsettled: tags are now present, but the client_metadata JSON column vs. first-class columns decision is still under debate (see review 4341939167 / gz comment). Please resolve before merge.

Also: commit message "Review fixes" is a dirty-history message; please squash/rename.

@snkas snkas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just make sure downstream CI will also pass, and then ready to merge. Nice job!

Comment thread docs.feldera.com/docs/changelog.md

@mythical-fred mythical-fred left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase pickup re-APPROVE.

Force-push afcd946526c7fc28 is a clean rebase onto fresh main (7 ahead / 1 behind at the previous tip). The only PR-affected file that moved is docs.feldera.com/docs/changelog.md, where the new tags / description-no-version-bump entries get correctly threaded under Unreleased while the v0.311.0 / v0.309.0 / v0.307.0 headings that landed on main get inserted around them. Resolved cleanly. No behavior change to the PR itself.

Prior APPROVE @ afcd9465ca stands.

@gz gz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit puzzling that this feature needs 2.2k lines of code.
I understand that much of it is tests, but one wonders if we can maybe fold some of these tests into existing ones (we dont always have to write new tests especially given that CI is already slow)

@Karakatiza666

Copy link
Copy Markdown
Contributor Author

I'll see if some tests can be folded, but the new functionality is not trivial - skipping version bumps for certain fields - so the complexity is justified

@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 18, 2026
@Karakatiza666 Karakatiza666 removed this pull request from the merge queue due to a manual request Jun 18, 2026
@Karakatiza666 Karakatiza666 force-pushed the group-pipelines branch 2 times, most recently from 43b2b01 to c4b67b8 Compare June 18, 2026 05:31
@Karakatiza666 Karakatiza666 enabled auto-merge June 18, 2026 05:31
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 18, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 18, 2026
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 18, 2026

@mythical-fred mythical-fred left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-APPROVE on 653dd2d. The follow-up "Simplify pipeline tags tests" commit is a clean cleanup pass:

  • pipeline_parsing.rs gets rustdoc on the three PIPELINE_COLUMNS_* constants spelling out which descriptor flavor each one feeds (full / monitoring / event-info) and what gets dropped. Helpful for future readers.
  • The split_columns test now goes through a new column_to_descriptor_fields helper that knows client_metadata expands into {description, tags} at the descriptor layer. That's the right shape: the SQL column is one JSON blob, the wire descriptor is two fields, and the test was previously hand-waving around that.
  • Python: test_tags_default_empty folded into test_tags, and test_pipeline_tags_invalid + test_pipeline_description_length merged into test_pipeline_metadata_validation. Less duplication, same coverage (invalid tag set on create, over-limit description on create, exactly-max description accepted, patch with invalid tag rejected).

Nothing else changed.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 18, 2026
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 18, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 18, 2026
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 19, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 19, 2026
Karakatiza666 and others added 3 commits June 19, 2026 19:39
Add a `tags` field to pipelines: a list of free-form labels for organizing,
grouping, and filtering pipelines. Together with the existing `description`,
tags are now stored as client-side metadata in a single `client_metadata`
JSON column (migration V34) rather than a dedicated column. Further
annotation fields can be added by extending the Rust `ClientMetadata` struct
alone, with no additional database migration.

Editing `tags` or `description` no longer bumps the pipeline `version`,
`program_version`, or `refresh_version`, and does not trigger recompilation:
this metadata is client-generated and has no deployment semantics. As a
result it can be patched at any deployment stage (running, paused, or
stopped), whereas core fields remain editable only while stopped.

Compatibility:
- Migration V34 adds `client_metadata`, copies existing descriptions into
  it, and drops the old `description` column. Reads are lenient: an empty
  column, and even corrupted JSON, are read as empty metadata (and logged),
  so a single bad row never blocks reads.
- Descriptions migrated from the old column are exempt from validation until
  next changed, so values predating the constraints stay readable and
  re-submittable.
- PATCH patches each field independently (an omitted field is left
  untouched); an empty string or empty list is a real value that clears the
  field, not a request to leave it unchanged. POST/PUT replace resets omitted
  metadata to empty.
- Validation: up to 50 tags, each up to 50 chars drawn from
  `[a-zA-Z0-9 ._/|\:=-]`; description up to 300 chars. Violations return 400.
  `tags` is included in POST/PUT/PATCH bodies and GET responses (both
  selectors); the OpenAPI spec is regenerated accordingly.

User-facing:
- Python: `PipelineBuilder(..., tags=[...])`, `pipeline.modify(tags=...)`,
  and a new `pipeline.tags()` getter; the REST client sends tags on
  create/replace/patch.
- The fda CLI and the web-console pipeline list carry the new field through.

Signed-off-by: Karakatiza666 <[email protected]>
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
The manager reuses its own HTTPS certificate as the leaf certificate it
hands to each pipeline, so marking it CA:TRUE made rustls reject the
pipeline as a server (CaUsedAsEndEntity) on /pause. Exposed by the new
test_pipeline_metadata_edits_at_deployment_stages, the first crud test
to start and pause a pipeline.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@Karakatiza666

Copy link
Copy Markdown
Contributor Author

A new commit fixes HTTPS test failure caused by the new pipeline test.

The new test test_pipeline_metadata_edits_at_deployment_stages is the first test in test_pipeline_crud.py that actually starts a pipeline and then pauses, resumes, and stops it - every other test in that file only creates and edits pipelines without ever running one. This matters because of how the pipeline manager handles secure connections. When you run the manager in HTTPS mode, it needs a security certificate to prove its identity. Normally there are two distinct kinds of certificate: an "authority" certificate, whose only job is to vouch for other certificates, and an ordinary "server" certificate, which a server presents to identify itself. The manager hands each pipeline it launches a server certificate, and then connects to that pipeline to control it (e.g. to pause it). The secure-connection library we use is strict: it refuses to accept an authority-type certificate when what it expects is a server's own identity, and reports this as CaUsedAsEndEntity.

The problem is in one of our CI jobs, "Make sure manager runs with HTTPS," which runs exactly this test file with the manager in HTTPS mode. That job generates a throwaway test certificate on the fly, and it was marking that certificate as an authority (basicConstraints = critical,CA:TRUE). But this job doesn't give the manager a separate certificate to hand out to pipelines, so the manager reuses its own certificate as each pipeline's server identity. As long as no test ever started a pipeline, this was harmless - the manager never had to connect to a running pipeline, so the certificate's wrong "kind" never mattered. The new test changed that: it starts a pipeline (which works, because the runner uses a more lenient connection path to first reach it), and then asks the manager to pause it. That pause goes through the strict connection path, which sees the authority certificate being used as the pipeline's server identity and rejects it - producing the 503 Service Unavailable every time.

The fix is a one-line change in the CI job: generate the throwaway certificate as an ordinary server certificate (drop the CA:TRUE marking) instead of an authority certificate. This brings it in line with how our other HTTPS test suite already works. That suite (the "OSS platform tests" job) runs the manager with the testing feature flag enabled, which makes the manager auto-generate its own test certificate at startup - and that auto-generated certificate is created as a plain server certificate with no authority marking at all (see ensure_testing_https_cert_key in crates/pipeline-manager/src/config.rs, which sets the host names but never sets basicConstraints). Because that suite's certificate is an ordinary server certificate, it has always been able to start, pause, resume, and stop pipelines over HTTPS without trouble - which is exactly the behavior the new test needs. After this change, both HTTPS jobs generate the same kind of certificate, so the new test passes. No product code changes are required: the tags/metadata feature itself was working correctly the whole time; only the test certificate's configuration was wrong.

@Karakatiza666 Karakatiza666 enabled auto-merge June 19, 2026 19:42
@Karakatiza666 Karakatiza666 added this pull request to the merge queue Jun 19, 2026
Merged via the queue into main with commit 059a68f Jun 19, 2026
1 check passed
@Karakatiza666 Karakatiza666 deleted the group-pipelines branch June 19, 2026 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants