RFC: proposal for declarative VDUs #5792

jcoglan · 2025-11-27T15:04:17Z

This PR contains the design documents we have produced as part of the M4 milestone for the STA project to design a declarative way to express document update validation functions. This was originally written as an "exploratory" or "rationale" document to compare different approaches and discover features that would be necessary to adequately cover the needed functionality, and then a shorter "specification" describing just the features we're proposing to build. At @janl's suggestion these documents form the "Advantages and Disadvantages" and "Detailed Description" parts of the RFC respectively.

ricellis

A couple of initial thoughts on where the proposed APIs could be painful.

ricellis · 2025-12-04T10:05:57Z

src/docs/rfcs/018-declarative-vdu.md

+fields:
+
+- `language`: must have the value `"query"`
+- `validate_doc_update`: contains an _Extended Mango_ expression that encodes


This type of polymorphism pattern that occurs in CouchDB APIs increases the complexity of modelling Couch types in strongly typed languages (e.g. Go, Java) and with technologies like OpenAPI.

| ddoc `language` | `validate_doc_update` type |
|-----------------|----------------------------|
| `javascript`    | string                     |
| `query`         | object                     |

One could argue for, say, the use of oneOf and a discriminator based on language in OpenAPI, but such a change is a breaking one for anyone with an already existing model type representing a design document.

I am, sadly, very aware that this string/object problem already exists for the map property of a design document, see for example IBM/cloudant-go-sdk#507, and I'd be reluctant to see it propagate.
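To make the string/object split concrete, here is a minimal sketch of the branching a client would need when reading a design document. `classifyVdu` and the `kind` labels are hypothetical names for illustration, not part of CouchDB or any SDK, and the sketch assumes a missing `language` field means JavaScript.

```javascript
// Hypothetical helper, not part of CouchDB or any SDK: decide how to read
// `validate_doc_update` based on the design document's `language` field.
function classifyVdu(ddoc) {
  const vdu = ddoc.validate_doc_update;
  if (vdu === undefined) {
    return { kind: "none" };
  }
  // Assumption: a design document without `language` is treated as JavaScript.
  const language = ddoc.language === undefined ? "javascript" : ddoc.language;
  if (language === "query") {
    // Proposed declarative form: the property holds a selector object.
    if (vdu === null || typeof vdu !== "object" || Array.isArray(vdu)) {
      throw new TypeError('language "query" expects a selector object');
    }
    return { kind: "selector", selector: vdu };
  }
  // Existing form: the property holds function source as a string.
  if (typeof vdu !== "string") {
    throw new TypeError(`language "${language}" expects function source as a string`);
  }
  return { kind: "source", language, source: vdu };
}
```

In a statically typed client the same branching has to be encoded in the type itself, which is where the `oneOf`/discriminator modelling cost comes from.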

a way around this would be to allow a single _design/_validate doc that doesn’t need a language field. That’d be a bit of a departure from how things work today, but I personally found the “multiple VDUs” always a bit awkward

ricellis · 2025-12-04T10:06:03Z

src/docs/rfcs/018-declarative-vdu.md

+- `reason`: this contains either a custom error message, or the list of failures
+  generated by the first non-matching selector.
+
+If no custom `$reason` is set, then the `reason` field contains a list of
+failures like so:
+
+    {
+      "error": "forbidden",
+      "reason": {
+        "failures": [
+          {
+            "path": ["$newDoc", "type"],
+            "type": "in",
+            "params": ["movie", "director"]
+          }
+        ]
+      }
+    }
+
+This is consistent with the current working of JavaScript VDUs. Such functions
+can call `throw({ forbidden: obj })` where `obj` is an object, and it will be
+passed back to the client as JSON, i.e. it is already possible for user-defined
+VDUs to generate responses like that above.


Whilst I see the argument that it is already possible for JS VDUs to pass any error structure they like, I feel that formally specifying object types into the reason property is a breaking API change from the existing string usages enshrined in system-defined error responses. Just because someone can change that already with their own functions doesn't mean that they should.

My concern here is that this change means using declarative VDU comes with a burden to check and potentially rewrite all post-write error handling code to handle both cases. That feels like a barrier to adoption or an avenue for unanticipated breakages.

I accept the value of providing more validation information, but I don't agree with overloading the reason property with a different type to do it. Keeping reason a string at least would mean existing code continues to work on new VDU validation failures even if it doesn't provide as much information. Applications that want to use the extra information have coding to do anyway so they could as easily obtain that from some other part of the error structure.
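As an illustration of the extra handling this implies, here is a sketch of client code that tolerates both shapes of `reason`. `forbiddenMessages` is a hypothetical helper, and the object shape is taken from the example response in the RFC excerpt quoted above; nothing here is an existing client API.

```javascript
// Hypothetical client-side helper: turn a 403 response body into display
// messages whether `reason` is a string or a `{ failures: [...] }` object.
function forbiddenMessages(body) {
  const reason = body.reason;
  if (typeof reason === "string") {
    // Existing behaviour: `reason` is a plain message.
    return [reason];
  }
  if (reason !== null && typeof reason === "object" && Array.isArray(reason.failures)) {
    // Proposed behaviour: one entry per failure record.
    return reason.failures.map(
      (f) => `${f.path.join(".")}: failed "${f.type}" with params ${JSON.stringify(f.params)}`
    );
  }
  // Unknown shape: fall back to a JSON rendering rather than throwing.
  return [JSON.stringify(reason) ?? "forbidden"];
}
```

Code that today does something like `alert(body.reason)` would need this kind of branch before a declarative VDU could be adopted safely.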

@ricellis that’s a fair point. For now the design brief was: allow porting currently built-in validations into this, plus what we have as examples in the docs.

I personally would not mind a simplification here.

nickva

Finally got a chance to take a better look. It's a very well written RFC, well done!

I added some comments in-line. If I had to summarize them, the idea I was going with was to try to make everything simpler: avoid having two evaluation modes and two separate syntaxes, avoid too many new query language features, and maybe use the new features for indexing/filtering selectors as well as for the VDU ones.

I'd basically be happy with an 80/20 or 70/30 solution with fewer new features, versus a 95/5 one that comes at the expense of more complexity.

nickva · 2025-12-19T06:56:58Z

src/docs/rfcs/018-declarative-vdu.md

+  `"director"`. If it does not, return a 403 response to the client.
+- Otherwise, accept the write and return a 201 or 202.
+
+The body of the response contains two fields:


I wonder if we could start with something simpler, so that if the selector doesn't match then we just throw a forbidden error (maybe with the design doc id to indicate which one failed)...

nickva · 2025-12-19T07:11:02Z

src/docs/rfcs/018-declarative-vdu.md

+The intent of this interface is that each individual selector expression
+produces a complete list of _all_ the ways in which the input did not match the
+selector expression, so that the client can show all the validation errors to
+the user in one go.


While it's more flexible, I think it might make it harder for users to understand, as it departs from the simpler filter/selector behavior by returning a complex list (union) of failures as opposed to a simple boolean.

while true, imagine writing software against this:

1. a user enters some form data
2. the app sends the data to CouchDB, which validates it. say 3/10 fields fail validation, but we only return the first error
3. user gets presented the one error, fixes it, re-submits
4. CouchDB validates, returns error 2/3
5. and so on

I think those ergonomics would be very unfortunate and I’d be okay with the added complexity here
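The difference between the two behaviours can be sketched with a toy evaluator. This is not the RFC's algorithm: `collectFailures` is a hypothetical function that only understands `{ field: { "$in": [...] } }` on top-level fields, and the failure records simply copy the shape of the example response quoted from the RFC.

```javascript
// Toy evaluator, NOT the RFC's algorithm: checks every field and reports
// all failures instead of stopping at the first one.
function collectFailures(selector, newDoc) {
  const failures = [];
  for (const [field, condition] of Object.entries(selector)) {
    const allowed = condition.$in;
    if (!Array.isArray(allowed)) {
      throw new Error(`this sketch only supports $in, got: ${JSON.stringify(condition)}`);
    }
    if (!allowed.includes(newDoc[field])) {
      // Record shape copied from the RFC's example response.
      failures.push({ path: ["$newDoc", field], type: "in", params: allowed });
    }
  }
  return failures;
}

const selector = {
  type: { $in: ["movie", "director"] },
  rating: { $in: ["G", "PG"] },
};
const all = collectFailures(selector, { type: "book", rating: "X" });
const firstOnly = all.slice(0, 1); // what a "first error only" design would return
```

With the complete list the form can flag both fields after a single round trip; with `firstOnly` the user has to submit once per failing field.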

nickva · 2025-12-19T07:17:14Z

src/docs/rfcs/018-declarative-vdu.md

+defining VDUs. Some of these features _only_ make sense for VDUs and should only
+be allowed in this context.


I'd prefer to minimize the differences between the VDU type selectors and indexing / filtering selectors. It would be nicer, I think, if users just learned one query syntax and could use it everywhere, as opposed to having to remember this works in VDUs but that works in selectors only and so on. This could mean we'd add new operators for indexing (type checking, range checking, etc) as well.

that would be preferable, let’s see if it can be done

nickva · 2025-12-19T07:20:18Z

src/docs/rfcs/018-declarative-vdu.md

+An Extended Mango expression is considered to match a given input if its
+evaluation returns an empty list.
+
+To produce the `path` field, the `match` function will need to track the path it


The additional implementation complexity is why I think at first we can probably get away with covering a good number of use cases just with the existing matchers, maybe with the design doc ids indicating which VDU failed. It's not as flexible as JavaScript, but if users need more flexibility they can just use JavaScript; that's not going away.

nickva · 2025-12-19T07:29:58Z

src/docs/rfcs/018-declarative-vdu.md

+not match, and all failures from all fields should be returned to the caller.
+
+
+### `$if`/`$then`/`$else`


To me, aesthetically, if/then/else seems a bit out of place in what looks like a declarative query language. It makes me think of an imperative language like JavaScript or Python. Since the ifs/thens can be translated to a bunch of ANDs and ORs, users could still build the same logic with those, and it would be nice to minimize changes to our custom selector syntax and grammar.

we can bikeshed the names :)
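The translation nickva mentions can be sketched as follows. `desugarIf` and `ifAsAndOr` are hypothetical names, and the sketch assumes the existing Mango combination operators `$and`, `$or` and `$not` keep their current meaning; it is not taken from the RFC.

```javascript
// "if A then B else C" has the same truth value as (A AND B) OR (NOT A AND C).
function ifAsAndOr(a, b, c) {
  return (a && b) || (!a && c);
}

// The same rewrite on selector objects, using existing Mango combinators.
// Hypothetical helper; the three arguments are ordinary selectors.
function desugarIf(cond, thenSel, elseSel) {
  return {
    $or: [
      { $and: [cond, thenSel] },
      { $and: [{ $not: cond }, elseSel] },
    ],
  };
}

// Example: movies need a numeric year, everything else needs a string name.
const rewritten = desugarIf(
  { type: "movie" },
  { year: { $type: "number" } },
  { name: { $type: "string" } }
);
```

The rewrite preserves only the match/no-match result. Which failures would be reported for the rewritten `$or` is a separate question that the boolean equivalence does not answer.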

nickva · 2025-12-19T07:54:28Z

src/docs/rfcs/018-declarative-vdu.md

+        }
+      ]
+    }
+


A simpler way I could see the above example working would be to have two design docs: one checking for admin and the other for types

_design/type_checks:

"validate_doc_update": { "$newDoc.type": { "$in": ["movie", "director"] }}

nickva · 2025-12-19T08:00:14Z

src/docs/rfcs/018-declarative-vdu.md

+The most "obvious" way to achieve this would be simply allow
+`validate_doc_update` (VDU) functions to be written in Erlang, as we do for
+map-reduce index definitions. However, this is not especially accessible to most
+users, and allows them to execute arbitrary code inside the database engine. A
+better solution would be to design a way for validation rules to be expressed in


I am not sure about Erlang: while it is a full / proper language, it's not something users will probably want to write VDUs in. I could see Lua or something more widespread, perhaps. But we do have JavaScript and that's not going away, so there is always a way to defer complex cases to that. We don't have to invent a 100% replacement for JS VDUs, which makes our job a bit easier!

However, that said, when it comes to us (CouchDB devs) a good number of magically auto-injected VDU docs we use as system VDUs could be (many already were) rewritten in Erlang. In CouchDB internals those are called a bit differently -- they are BDUs ("before-document-update") callbacks.

nickva · 2025-12-19T08:07:44Z

src/docs/rfcs/018-declarative-vdu.md

+context_. The path to the current value is just one thing we will need to store
+in the match context as selectors are evaluated.
+
+One other thing we may want to store in the context is a "mode" that indicates


Evaluation modes, I think, add too much complexity for users. Maybe if we pick one new mode that works the same everywhere, but having to juggle two evaluation methods in a custom query language is, I think, too tricky.

nickva · 2025-12-19T08:18:29Z

src/docs/rfcs/018-declarative-vdu.md

+Strictly speaking, this is an accurate description of the intended validation
+rules. However, it will not give good feedback. Under the evaluation rules
+described above, where all operators return a possibly-empty list of failures,
+the `$or` operator would work be collecting a list of failures for each of its
+sub-selectors, and then if any of these lists is empty, then `$or` returns an
+empty list. If none of the lists is empty, then the most reasonable thing for
+`$or` to do would be to return the combined list of all failures from its
+sub-selectors, since it has no idea what any of them mean.


I think, compared to the complexity added (if/then statements, guards), I'd rather have it be simpler in the first go-around, with less feedback detail. For example, the above could be in a design document named _design/type_checks with that $or:[...] structure, and it can do the job without adding new constructs to the query language. If users want detailed feedback, there is always JavaScript.
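For reference, the `$or` rule described in the quoted RFC text amounts to the sketch below. `orFailures` is a hypothetical name; its input is the list of failure lists already produced by each sub-selector.

```javascript
// Sketch of the `$or` rule from the quoted RFC text: succeed (empty list) if
// any sub-selector produced no failures, otherwise return every failure from
// every sub-selector.
function orFailures(subFailureLists) {
  if (subFailureLists.some((list) => list.length === 0)) {
    return [];
  }
  // Note: an `$or` with no sub-selectors yields [] here; the quoted text does
  // not say what should happen in that case.
  return subFailureLists.flat();
}
```

This shows why the feedback is poor: when no branch matches, the client receives the failures of every branch with no indication of which branch the document was meant to satisfy.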

nickva · 2025-12-19T08:22:36Z

src/docs/rfcs/018-declarative-vdu.md

+To illustrate how the features proposed here could work in practice, we'll now
+work through an example of translating an exiting JavaScript VDU to a
+declarative form. Specifically we will look at the [example from the CouchDB VDU
+docs](https://docs.couchdb.org/en/stable/ddocs/ddocs.html#validate-document-update-functions).


The system _design/_auth VDU is, I think, a great candidate to rewrite as an Erlang BDU. For query-based VDU examples, simple type checks (is this a movie, is this a director, is the year an integer in the right range, etc.) would probably look better.

janl · 2025-12-19T14:56:21Z

We also had a discussion in the last dev meeting and I’m posting relevant bits from the transcript here (some of what Nick is covering he’s already added above).

re if/then/else naming of terms:

I mean I had similar thought. I mean I read the whole thing. I was reading it for the last hour and I didn't get all the way through to the end because there's an awful lot in there. and the one that stood out for me was if then else. Just seems very strange. Those aren't the right words, I don't think, for what it does. I get the intention of what if then else does, but if is run if the thing that matches matches and then is then done afterwards it was very odd to me. I don't think that's quite the right terms for that stuff. But that's detail. (Bob)
My concern really is that I think we're inventing a new language just in a fairly obtuse manner and maybe that's fine. is there any precedent for it though? does Mongo itself have this syntax and we're just aping it or… (Bob)

re adoption:

I mean I realize the barrier of entry for people to write VDUs writing JavaScript there are people that do like doing that and there are quite a lot of people we have that don't like doing that. are they going to doing this better given how poorly they like writing selectors in my experience as so I mean I don't want to it's a great proposal…but I just wonder if it's the right way to what are we trying to solve the fact that doing the JavaScript thing is slow or is it that JavaScript is an unpleasant way to write it I guess it's both… (Bob)
The point I was trying to make with all this [references] is that I could envision a future where we have a lot of example documentation for all these selectors.
Jan Lehnardt: sorry for the validation selectors that for the lack of a better term and maybe even have support in Photon to put them together and then make it easy to construct these design documents so people don't actually have to go into the intricate details of it all but there's this web standard subcategory of micro formats whereas it's just like an here's how you do an email address how you do the HTML markup and you just import this you give it a bunch of values and then it does the right and that it's a little bit more machine readable across the web. so my thinking is we'll build a bunch of quote unquote types for typical stuff that people would want to validate and then they can just plug that together any way they want. and then it's a little bit easier than having to learn all the intricate details of all this kind of stuff. And then there's going to be plenty of examples of people can know… (Jan)

sidebar about just the performance aspect of running JS VDUs:

I mean, we could be running quickjs in process, could we not? (Bob)
We could right I mean it's be kind of a bit dangerous like it we'll try to avoid that for now. (Nick)
Jan translates this as “technically possible but maybe practically not feasible for CouchDB’s stability needs”
Bob recounts Lua in haproxy as a viable option, but that’s not something that fits CouchDB

re can this be simplified?

I'm thinking from the perspective of what if we just did simplify it and it might not cover 100% but it might cover 80%. let's say even with a simple selector if you just say I'd like to match or not and then if it doesn't it throws a forbidden like you don't get a choice to do these paths and do but let's just do the simplest thing could you do that and then we can look at maybe a whole bunch of examples and we see we need to really do this string ends with we need to cap together maybe two things or we need to get the type of something like it's really crucial that when you do these VDUs I want to know the type is a number here and we don't really have is number so maybe there's just sort of a smaller set of minimal kind of checks or things but we can also use them in a regular indexer as well an indexing can say I'd like to if it's a number I'm doing something else I emit this way if it's not I do that for… (Nick)
And we already have So we have this selector language that we kind of impose and say it's there and I don't want to depart too far from it or add too much to it and I'd rather keep it as small as possible because already know selectors use selectors for more things as opposed to selectors plus it has a bunch of other things just for VDUs maybe an extensions or… (Nick)
contra: I think that's really where I currently am is is there a way that we're adding lots of extra options and hoping we cover everything. It's inventing our own language, which is fine if you get it right. It's complexity. I mean, I'm not quite on Nick's page with a version of this that's just less complicated. I mean, I think if you're going to do this, you need to cover everything you can. Otherwise, who's going to use the damn thing? They'll just have to use JavaScript. (Bob)

re implementation:

Of course, you're concerned. I mean, I mean, it's not you [Nick] or I going to be writing the code. (Bob)

Bob sums up his thoughts:

where people try to use it, don't understand it. I'll have to learn it and then I'll have to fix the bugs in it and they'll have to add functions that we didn't think of. so there's the maintenance aspect that I'm concerned about.
I don't object to the idea. I don't think it's technically wrong. I can't think of an alternative, Literally, embedding Lua would make me happy, but I mean wouldn't make anybody else happy. I think you can walk away. You can say yes, if you can write it in this language is complete. You can write anything in that. It's just a question of how much you want to write. if someone else is building it and it's going in and my only concern would be that it's a quality that it comes in, it's got lots of tests. coverage is there as complete as it can be that it's well written it's well documented and that it doesn't crash couch if you use it in stupid ways so from that point of view you fill your boots with my cloud and

rfc: proposal for declarative VDUs

9e6d22c

jcoglan force-pushed the rfc-declarative-vdu branch from 8f5cc3a to 9e6d22c · November 27, 2025 15:14

