Skip to content

Conversation

@kupiakos
Copy link

@kupiakos kupiakos commented Dec 9, 2025

Enable ranges of enum discriminants to be reserved ahead of time with an unnamed enum variant. This requires all users of that enum to consider those values as valid - including within the declaring crate.

This is useful for any situation in which recompiling an enum to handle new discriminant values is infeasible, and using a newtype integer is unergonomic due to the type's intended nature as a set of enumerated values. This includes FFI, Protobuf, and embedded syscalls.

If an enum is valid for every discriminant (it has no niches), it is an open enum and may be as cast from its explicit backing integer. For example:

#[repr(u32)]     // Fruit is represented by any `u32` due to the unnamed *open variant* below.
enum Fruit {
    Apple,       // Apple is represented with 0u32.
    Orange,      // Orange is represented with 1u32.
    Banana = 4,  // Banana is represented with 4u32.
    _ = ..,      // The rest of the discriminants in `u32` are reserved.
}
// Using an `as` cast from `u32`, since it is an open enum.
let fruit = 3 as Fruit;

// Does not match any of the known variants.
match fruit {
    Fruit::Apple | Fruit::Orange | Fruit::Banana => unreachable!(),
    // This wildcard branch is required for all exhaustive matches of `Fruit`.
    // `fruit` preserves its value casting back to `u32`.
    _ => assert_eq!(fruit as u32, 3),
}

While writing this RFC I was not aware of the recent work on a similar RFC by @madsmtm. I considered using an attribute to make an open enum before I had learned about this RFC - I explain why I chose unnamed variants instead in the Alternatives.

@rustbot label T-lang

Important

When responding to RFCs, try to use inline review comments (it is possible to leave an inline review comment for the entire file at the top) instead of direct comments for normal comments and keep normal comments for procedural matters like starting FCPs.

This keeps the discussion more organized.

Rendered.

@rustbot rustbot added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 9, 2025
There is no reason for these bytes to be in flash memory - it is merely
one of the applicable use cases.
unnamed_variants is an unambiguous feature name that describes what the
author is now able to spell. However, it takes more mental effort to
remember what a variant is, while "enum variant" is immediately clear.

This also clarifies the zero-copy deserialization motivation language
further.
@kupiakos kupiakos changed the title RFC: Allow reservation of enum discriminants via unnamed variants RFC: Open up your enum with an unnamed variant Dec 10, 2025
@nielsle
Copy link

nielsle commented Dec 10, 2025

Out of curiosity, could the following syntax work? (or something similar)

#[repr(u32)]
#[reserve_discriminants(5..10)]  
#[reserve_discriminants(15..20)]  
enum Fruit {
    Apple,       // Apple is represented with 0u32.
    Orange,      // Orange is represented with 1u32.
    Banana = 4,  // Banana is represented with 4u32.
}

See f17e862 for reasoning.
The `taken_discrimimant_ranges` and `empty_discriminant_ranges` lints
catch the situation in which an unnamed variant has no effect. The
latter is much more dangerous than the former, so it is
`deny`-by-default. However, both should be allowed for certain codegen
and macro cases as described.

This also moves around and expands some language regarding `derive`
difficulty in open-enum, as well as add a suggested alternative.
@kupiakos
Copy link
Author

kupiakos commented Dec 10, 2025

@nielsle Using an attribute to reserve discriminants/open an enum (of which there are many spellings) is discussed here as a considered alternative. Also, please use inline (file/line-level) comments instead of PR-level comments for RFC comments in the future to keep the discussion organized.

@scottmcm
Copy link
Member

This is useful for any situation in which recompiling an enum to handle new discriminant values is infeasible, and using a newtype integer is unergonomic due to the type's intended nature as a set of enumerated values. This includes FFI, Protobuf, and embedded syscalls.

Fundamentally I think I disagree with this. A C-style open enum is a newtype struct, not a rust enum. (Not unlike how a rust enum isn't an open Box<dyn Trait> either.)

If the main complaint is that you don't like how rustdoc shows that or how R-A generates things for it, I think it'd be better to start with tooling changes for that instead of language changes.

How much of this would be solved if you could just have a #[diagnostics::open_enum] on your struct, for example?

And I think that "well I didn't want to reserve all of the values" both

  1. Isn't well motivated in the text, and
  2. Would be better done with struct TaskState(u8 is 0..8); via pattern types that are already "on the list".

- The unstable [`non_exhaustive_omitted_patterns`] lint has no easy way to
work with this enum-alike, even though treating it like a `non_exhaustive`
enum would be more helpful.
- Generated rustdoc is less clear (the pseudo-enum is grouped with `struct`s).
Copy link
Member

@scottmcm scottmcm Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I've long found this distinction in rustdoc unhelpful always. If it's an opaque type, for example, what do I care if it's a struct or a union? Not to mention that I'd rather have Statement and StatementKind next to each other on https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/ rather than multiple pages apart.

I'd absolutely 👍 an RFC to change rustdoc to group things in a module by namespace instead of item kind, but I don't think the current behaviour is a reason to do a language change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Using colour or an icon or something for struct vs enum to help give my brain an extra cue as I scan? Sure, great! I just don't think it makes sense as a top-level split.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to have more control over rustdoc presentation and organization of types as well, in general. I consider the accurate grouping in rustdoc for an open enum as more removing a papercut than a full motivation for the feature.

Comment on lines +1290 to +1292
- `derive`s cannot be easily written with enum variant names. In order to avoid
duplicating the names, a `derive` macro must directly inter-operate with
another macro that generates these pseudo-variants like `open-enum`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand this. Can you clarify? What duplicate names are you referring to? What's an example derive macro that would be difficult to implement with this attribute-based scheme?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose one issue I can foresee is that a derive macro can't enumerate the known variants with this approach. So e.g., you can't implement Debug or Display or similar via a derive macro on such a type--those would have to be provided by a declarative or attribute macro.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duplicate names I'm referring to are the variant names - the impl block contains the names, but a derive macro needs to be able to see it on the struct definition in order to work with the names. See this comment.


However:

- Open enums require even more typing for the desired semantics.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trivially solved with a macro, right?

Do you expect that this open enum pattern is common outside of specialized and generated code? The motivating examples of bindgen and protobuf are both generated so there is no "typing".

(I personally use open enum patterns in a lot of systems code for ABI types that cross trust boundaries. But I am generally OK with reaching for a macro for these use cases, which I would consider fairly specialized.)

Copy link
Author

@kupiakos kupiakos Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trivially solved with a macro, right?

"trivially" comes with a big asterisk, as described in this comment. What about non-literal discriminant expressions and implicit discriminants? It's not as trivial as it initially seems.

I'm aware that a macro solves the "extra typing" problem, as is mentioned repeatedly throughout the text and in a following bullet point. It is still a drawback to require a macro in order to avoid extra typing or duplicating discriminant values/variant names. Can you suggest alternative language that is less confusing to you?

Do you expect that this open enum pattern is common outside of specialized and generated code? The motivating examples of bindgen and protobuf are both generated so there is no "typing".

I expect the pattern to be more common once users realize they can as cast from the backing integer (lovely!) and improve some benchmarks by shifting validity checks around without sacrificing too much on ergonomics (that extra match arm). Also, folks should be using bindgen but they often don't - this makes it easy for them to get the open enum aspect right.

@kupiakos
Copy link
Author

@scottmcm

This RFC is about more than open enum ergonomics - it's about empowering Rust users to control ABI compatibility for enums.
I'd love to see a future where a Rust dynamic library could add a variant to an enum with fields and have it be a non-ABI-breaking change, as described in Future possibilities.

Fundamentally I think I disagree with this. A C-style open enum is a newtype struct, not a rust enum. (Not unlike how a rust enum isn't an open Box<dyn Trait> either.)

I'm curious - why do you think that a C-style enum is a newtype struct and not a Rust enum? What role does an enum have that restricts it from representing unnamed variants? What are the negative implications to enums with unnamed variants?

The choice of enum or struct reflects the author's intent and informs how other features and tooling are supposed to interpret the type - a set of names with associated values, or a wrapper around a plain integer. That makes C-style enums much simpler to work with by directly reflecting the author's intent to create an enumeration.

An open enum using unnamed variants:

  • Is still represented by a bounded set of representable values. That set is bounded by the known discriminant type.
  • Has that set of representable values declared in the enum definition by the author of the enum.

That's very unlike a Box<dyn Trait>, which allows users of the type to introduce their own valid representation. An explicit-repr enum remains as a set of known discriminants, it's just easier to expand that known set with an unnamed variant.

A newtype integer is a fine way way to represent a C-style open enum, but is an irritating paradigm in practice. My goal is to vastly improve the user experience for those who handle open enums regularly. Having this be a language feature is friendlier to users of open enums, because a solely diagnostic improvement will always require using a 3rd party macro or duplication in order to avoid significant boilerplate.

Consider this enum that we want to be made open with an open_enum attribute:

#[open_enum]
#[repr(u32)]
enum Foo {
    A = 0,
    B = 1,
    C = 5,
    D = 7,
    E = 9,
    F = 10,
    G = 20,
    H = 21,
}

That converts Foo to this definition:

#[derive(ParitalEq, Eq)]  // for use in `match`
struct Foo(pub u32);
// plus checks to ensure no duplicate discriminants
#[allow(non_uppercase_globals)]  // n/a
impl Foo {
    const A: Foo = Foo(0);
    const B: Foo = Foo(1);
    const C: Foo = Foo(5);
    const D: Foo = Foo(7);
    const E: Foo = Foo(9);
    const F: Foo = Foo(10);
    const G: Foo = Foo(20);
    const H: Foo = Foo(21);
}

Myself and clients have run into some issues with this pattern in practice:

  • If you try to combine #[open_enum] with another attribute macro, the declaration order can unexpectedly matter very much. In our case, it was an attribute macro that conditionally enabled Debug based on some cfgs and that caused issues. We had to replace the attribute macro with the intended cfg_attr (and thus duplicating those conditions anywhere open_enum was used).
  • It's not as clear to readers what's going on - that this is actually creating a struct and they can't treat it exactly the same as an enum. No as casts or use Foo::A, for example.
  • How does the user specify derive(Debug) to format with variant names as intended? The 0.5 version of my attribute macro looks for the literal tokens Debug or std::fmt::Debug or similar in derive and intercepts it, but that's obviously faulty and can't handle a use alias correctly. It also doesn't compose well and can only work with a small set of known macro names.
  • How does a derive for FromStr know what the names are, after the attribute macro has run? The "check the tokens in derive in the attribute macro" hack is tolerable for Debug as a special case, but not for custom derives.

There's a solution to this, but it still has serious drawbacks. With a new design, the open_enum macro transforms this:

#[open_enum]
#[derive(open_enum::Debug, open_enum::FromStr)]
#[repr(u32)]
enum Foo {
    /* variants A-H */
}

into this:

// `open_enum` derive macros read `open_enum_variant_names`, as well as
// other macros that are aware of the attribute.
#[derive(open_enum::Debug, open_enum::FromStr, PartialEq, Eq)]  
#[open_enum_variant_names = "A,B,C,D,E,F,G,H"]
struct Foo(pub u32);

impl Foo { /* ... */ }

With this:

  • derive(Debug) now prints Foo(10) and our own visible derive is needed.
  • derives can now read what the known variant names are (and access the discriminant values with Foo::VariantName.0).
  • If I attempt the "check the tokens for Debug" hack to warn or intercept, I can't distinguish from a custom derive imported as Debug.
  • Each way to construct an open enum needs it own macro ecosystem in order to communicate the variant names to derives that use them. In order to integrate with other libraries, they need to coordinate on how variant names are represented in a custom attribute.
  • Open enums are still considered "not an enum" by most of the ecosystem since the way you manipulate it is still different. There's not an obvious ergonomic way for existing libraries with derives that consider variant names to extend their support to open enums even if they're aware of the attribute.

How much of this would be solved if you could just have a #[diagnostics::open_enum] on your struct, for example?

This is similar to the diagonstic::enum_variant suggested alternative and its drawbacks apply. If that or diagnostic::open_enum (or enum_like) were to be merged, I'd improve open-enum to use it. However, it wouldn't solve the derive or duplication issue in a satisfiable manner.

If a derive is intended to manipulate enum variant names, how do you make it work with a newtype integer? If there's a diagnostic attribute, do you duplicate the names in the attribute so a derive can read it? On a large team, I'd want to lint for the attribute's presence.

Also, for feature parity, use Struct::Name would also need to work like use Enum::Name.

And I think that "well I didn't want to reserve all of the values" both

  1. Isn't well motivated in the text, and

I'm aiming to improve the text - can you be more specific? Rust users have good reasons to create enums with 1010 reserved variants, and it's not great. It's not the first time I've seen an enum like that (and won't be the last).

  1. Would be better done with struct TaskState(u8 is 0..8); via pattern types that are already "on the list".

I disagree - the ergonomics of enums with unnamed variants cannot be simply solved with an attribute, and this applies to struct TaskState(u8 is 0..8). The discriminant values are duplicated between the inner type and the variant names, and so constructing the pseudo-enum would have to be done with a macro. That macro needs to handle non-literal discriminant expressions, which yields a very unwieldy pattern type for complex enums. Pattern types would also need to allow overlapping patterns in order to allow non-literal discriminant expressions to be processed by such a macro. Additionally, pattern types are a much more impactful change to the language than unnamed enum variants.

Pattern types and unnamed variants cooperate well - if you write EnumName as u8, the output integer can be a u8 is PATTERN where PATTERN exhaustively covers all valid discriminants for Enum. Unnamed variants make it easier to define a span of valid discriminants.

@tmccombs
Copy link

How much of this would be solved if you could just have a #[diagnostics::open_enum] on your struct, for example?

Another problem with that is it wouldn't really allow having a lint to check that a match includes all known variants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

T-lang Relevant to the language team, which will review and decide on the RFC.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants