
optionally retrieve unescaped pattern where available #1372

Open
njr-11 wants to merge 2 commits into jakartaee:main from njr-11:unescaped-pattern


@njr-11
Member

@njr-11 njr-11 commented Feb 11, 2026

In #1365 (comment) @dstepanov requested the ability to use unescaped patterns for Like and NotLike constraints. That could make sense as an optimization, although I can also see a good case for the current design, where a query that allows for an escape character can be computed in advance and will work regardless. This PR explores a possible compromise that lets the Jakarta Data provider request either form. It adds a .unescapedPattern() method to Like and NotLike that returns a pattern without any escape character if such a pattern is available. If none is available, the Jakarta Data provider needs to fall back to the existing methods for a pattern with an escape character. Alternatively, the Jakarta Data provider could choose to use the existing methods regardless, which will always work.

@njr-11 njr-11 added this to the 1.1 milestone Feb 11, 2026
@gavinking
Member

I am not sure that every database allows patterns with no escape character. @beikov do you recall?

@njr-11
Member Author

njr-11 commented Feb 11, 2026

> I am not sure that every database allows patterns with no escape character. @beikov do you recall?

This isn't necessarily a problem here because we are keeping the .escape() and .pattern() (escaped pattern) methods, which will always have an escape character, so a provider that needs to connect to such databases can always use these existing methods and ignore the .unescapedPattern() method entirely. That said, when a Jakarta Data provider delegates to a Jakarta Persistence provider, it is the Persistence provider that has knowledge of the database, not the Jakarta Data provider, so maybe .unescapedPattern() will end up never being useful. I might be trying too hard here to offer an optimization at the Jakarta Data layer to avoid escape characters when the more appropriate place for that optimization is the Jakarta Persistence provider. It could instead check whether the supplied pattern actually contains the escape character and avoid sending the escape character if it knows the database can handle that.

@beikov

beikov commented Feb 12, 2026

Some databases have a default escape character (usually \) and don't allow disabling the escaping, i.e. string like pattern escape ''. To implement the behavior of "disabling the escaping", in Hibernate ORM for example we escape the pattern with that default escape character, i.e. string like replace(replace(replace(pattern, '\', '\\'), '%', '\%'), '?', '\?'), similar to what LikeRecord.translate() seems to do.
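
The nested replace(...) calls above escape the escape character first and then each wildcard, so that the resulting pattern matches the original text character for character. A minimal Java sketch of the same idea in a single pass follows. The class LikeLiteralEscaper and its escape method are hypothetical names made up for illustration; they are not part of Jakarta Data or Hibernate ORM, and the wildcard characters are supplied by the caller rather than assumed.

```java
// Hypothetical helper for illustration only; not part of Jakarta Data or Hibernate ORM.
final class LikeLiteralEscaper {

    private LikeLiteralEscaper() {
    }

    /**
     * Returns a copy of the literal in which every occurrence of the escape
     * character or of one of the given wildcard characters is preceded by the
     * escape character. Scanning once from left to right gives the same result
     * as replacing the escape character first and the wildcards afterwards.
     */
    static String escape(String literal, char escapeChar, char... wildcards) {
        StringBuilder escaped = new StringBuilder(literal.length() + 8);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == escapeChar || isWildcard(c, wildcards)) {
                escaped.append(escapeChar);
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    // True if c is one of the caller-supplied wildcard characters.
    private static boolean isWildcard(char c, char[] wildcards) {
        for (char w : wildcards) {
            if (w == c) {
                return true;
            }
        }
        return false;
    }
}
```

With a backslash as the escape character and % and _ passed as wildcards, LikeLiteralEscaper.escape("50%_off", '\\', '%', '_') yields the characters 50\%\_off, and a literal that contains none of the special characters comes back unchanged.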

The SQL spec says that the like predicate by default should not have an escape character, but you seem to prefer defaulting to \ as an escape character.

What I don't quite understand is why the LikeRecord even keeps the escaped pattern in the first place. IMO, the LikeRecord should only contain the unescaped pattern and maybe offer a method to construct the escaped pattern on demand that the Jakarta Data provider can use if it wants.

@dstepanov

My comment was about knowing whether the escape is present or not. In the JPA Criteria API (which we also use in Micronaut for internal query construction) there are these variations:

    Predicate like(Expression<String> x, String pattern);

    Predicate like(Expression<String> x, String pattern, char escapeChar);

So I would prefer to invoke the specific method based on whether the escape char is defined or not. Right now I will have to always trigger the one with the escape.

@njr-11
Member Author

njr-11 commented Feb 12, 2026

> What I don't quite understand is why the LikeRecord even keeps the escaped pattern in the first place. IMO, the LikeRecord should only contain the unescaped pattern

There are many ways to construct a Like constraint. Some of them involve the user supplying an already-escaped pattern, so having LikeRecord only contain an unescaped pattern is not an option.

> and maybe offer a method to construct the escaped pattern on demand that the Jakarta Data provider can use if it wants.

For the subset of usage where a Like constraint is constructed in a way that we know it will not require an escape character, we could do as you recommend here and only initialize the unescaped pattern, deferring the computation of an escaped pattern until requested. There are some reasons why this might not be as nice as it sounds. There is nothing to stop a user from performing one-time creation of a Like constraint or more complex Restriction involving Like constraints that they continually reuse throughout their application. If an application does this, an approach to compute upon request ends up continually recomputing instead of once only. To address that, I suppose we could save the value once computed, but then we need to make it thread safe because it is completely valid for the application to be using the same instance from multiple threads. To the application, these constraints and restrictions are supposed to be immutable, which is likely one reason Gavin implemented them as Java records. In summary, we could in some cases switch to what you are asking for here with delayed computation, but it adds complexity and might not be as beneficial as hoped for. Let me know if you would like me to proceed with that, and I'll update the pull.
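
To make the trade-off concrete, here is a rough sketch of what "compute on request, then save the value in a thread-safe way" could look like. Everything in it is an assumption for illustration: LazyPrefixLike is a hypothetical class, not the LikeRecord from this pull request, and it hard-codes % and _ as the wildcard characters. It relies on the computation being deterministic, so two threads that race simply compute and store equal strings.

```java
// Hypothetical class for illustration only; not the LikeRecord from this pull request.
final class LazyPrefixLike {

    private final String literalPrefix;
    private final char escape;

    // volatile so that a value stored by one thread is visible to the others.
    private volatile String escapedPattern;

    LazyPrefixLike(String literalPrefix, char escape) {
        this.literalPrefix = literalPrefix;
        this.escape = escape;
    }

    String literalPrefix() {
        return literalPrefix;
    }

    char escape() {
        return escape;
    }

    /** Computes the escaped "starts with" pattern on first use and caches it. */
    String escapedPattern() {
        String cached = escapedPattern;
        if (cached == null) {
            // Racing threads may both get here. Each computes an equal string,
            // so whichever write lands last is harmless.
            cached = computeEscapedPattern();
            escapedPattern = cached;
        }
        return cached;
    }

    // Assumes % and _ are the wildcards; the trailing % matches any suffix.
    private String computeEscapedPattern() {
        StringBuilder sb = new StringBuilder(literalPrefix.length() + 8);
        for (int i = 0; i < literalPrefix.length(); i++) {
            char c = literalPrefix.charAt(i);
            if (c == escape || c == '%' || c == '_') {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.append('%').toString();
    }
}
```

The sketch shows the added complexity in question: the class can no longer be a plain record with only final components, and it needs a mutable volatile field even though it still looks immutable to the application.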

> So I would prefer to invoke the specific method based on whether the escape char is defined or not. Right now I will have to always trigger the one with the escape.

It was my intention that what I already added under this PR would give you that ability. You first ask for the unescapedPattern() and, if it is non-null, use it without an escape character. Otherwise ask for the escape() character and escaped pattern() and supply the escape character. (I suppose renaming pattern() to escapedPattern() might be a good idea for clarification if we go with an approach like this.) Are there reasons that this doesn't help?
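
As a rough sketch of that lookup order from the provider's side, the code below picks between two like overloads shaped like the JPA Criteria methods quoted earlier in this thread. The interfaces LikeConstraint and PredicateFactory are hypothetical stand-ins so that the example is self-contained; they are not the Jakarta Data or Jakarta Persistence types, and the null return from unescapedPattern() follows the "if non-null" wording above.

```java
// Hypothetical stand-ins for illustration only; not the real Jakarta Data or JPA types.
final class LikeDispatch {

    private LikeDispatch() {
    }

    /** Stand-in for the Like constraint as proposed in this pull request. */
    interface LikeConstraint {
        String unescapedPattern(); // null when no escape-free pattern is available

        String pattern(); // escaped pattern, always available

        char escape(); // escape character used by pattern()
    }

    /** Stand-in for the two like(...) overloads of the Criteria API. */
    interface PredicateFactory<P> {
        P like(String attribute, String pattern);

        P like(String attribute, String pattern, char escapeChar);
    }

    /** Prefers the escape-free overload and falls back to the escaped one. */
    static <P> P toPredicate(String attribute, LikeConstraint like, PredicateFactory<P> factory) {
        String unescaped = like.unescapedPattern();
        if (unescaped != null) {
            return factory.like(attribute, unescaped);
        }
        return factory.like(attribute, like.pattern(), like.escape());
    }
}
```

A provider that targets a database without support for escape-free patterns could skip the first branch and always call the three-argument overload, which matches the "existing methods will always work" fallback.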

@beikov

beikov commented Feb 13, 2026

It's fine to leave it like it is or do the renaming as you suggest. I just thought it might be better to avoid computing something that isn't always necessary.


4 participants