Skip to content

Conversation

@nick-sturrock
Copy link

Previously whitelisted words would only be accepted if the case matched that used in the configured list

@snipe
Copy link
Owner

snipe commented May 31, 2018

Looks like this PR broke the tests. Can you fix that and push again?

@nick-sturrock
Copy link
Author

nick-sturrock commented Jun 1, 2018

Done. This still has some shortcomings that someone with more time (perhaps a future me...) needs to address. There are cases where the original case of the input string is not preserved because of the way the whitelisting replaces all matching parts of the input string with a single placeholder. If the input string contains the same whitelisted word more than once with different capitalisations then all instances of the whitelisted word will end up being replaced with the same capitalisation as the original whitelist entry. Example: 'Pig pIG PIG' with whitelist entry 'pig' becomes 'pig pig pig', because 'Pig', 'pIG' and 'PIG' would all now case-insensitive match 'pig', and the placeholder value in the whitelist transformation would just be 'pig' as the placeholder is taken from the whitelist rather than from the string.

In my use case this doesn't matter as I'm only using the lib to determine whether the content is safe or not - i.e. does the clean string match the original one, but you may want to consider not merging the pull request until a case-safe implementation is added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants