vocabulary extension: keywords relating to string sorting #40

Open
karenetheridge opened this issue Dec 20, 2020 · 2 comments
Comments

karenetheridge commented Dec 20, 2020

  • locale: <string>. Specifies the locale used by the other keywords in this vocabulary. Always evaluates to true and produces an annotation of its own value. When not provided, the default value is implementation-defined (but likely the locale of the current runtime environment).
  • caseSensitive: <boolean>. Always evaluates to true and produces an annotation of its own value. May be derived from the locale if not provided.
  • sorted: <boolean>. Only relevant when the instance is an array; only the array elements that are strings are examined (items of other data types are ignored). Takes the values of locale and caseSensitive into account. When the keyword value is true, evaluates to true if the instance array is sorted; when it is false, evaluates to true if the array isn't sorted.
  • maximum, minimum, exclusiveMaximum, exclusiveMinimum (or maximumString, minimumString, exclusiveMaximumString, exclusiveMinimumString): <string>. Only relevant when the instance is a string. Takes the values of locale and caseSensitive into account. Evaluates to true when the instance string successfully compares to the keyword string.
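
To make the `sorted` keyword concrete, here is a rough Python sketch of how an evaluator might treat it. Everything named here (`collation_key`, `evaluate_sorted`) is hypothetical and not part of any existing implementation. It assumes ascending order with equal neighbours allowed, and it reads a `false` keyword value as "valid only when the array is not sorted". The key function is a placeholder that is NOT locale-aware; a real implementation would swap in a proper collation key for the chosen locale.

```python
from typing import Any, Callable


def collation_key(s: str, case_sensitive: bool) -> str:
    # Placeholder only: plain codepoint order, optionally case-folded.
    # The proposal calls for locale-aware collation instead; plug a real
    # collation key function in here.
    return s if case_sensitive else s.casefold()


def evaluate_sorted(
    instance: Any,
    keyword_value: bool,
    case_sensitive: bool = True,
    key: Callable[[str, bool], Any] = collation_key,
) -> bool:
    # The keyword is only relevant for arrays; anything else passes.
    if not isinstance(instance, list):
        return True
    # Only string elements are examined; other types are ignored.
    keys = [key(item, case_sensitive) for item in instance if isinstance(item, str)]
    # Assumption: "sorted" means ascending, with equal neighbours allowed.
    is_sorted = all(a <= b for a, b in zip(keys, keys[1:]))
    return is_sorted == keyword_value
```

For example, `evaluate_sorted(["a", 1, "b", None, "c"], True)` passes because the non-string items are skipped before the order is checked.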

(yes, some keyword names overlap with the "validation" vocabulary; this should be okay as we can infer the vocabulary based on instance and keyword data type. That is, if both vocabularies are in use, both vocabularies will look at these keywords and attempt to evaluate them, but the vocabulary(ies) with the mismatched data type(s) will do nothing.)
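
A small sketch of that type-based inference, assuming both vocabularies are enabled and each one simply passes when the data types are not its own. The function names are made up for illustration, and the string comparison is a plain codepoint comparison standing in for the locale-aware one:

```python
from typing import Any


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validation_maximum(instance: Any, keyword_value: Any) -> bool:
    # "validation" vocabulary: acts only when both sides are numbers.
    if not (_is_number(instance) and _is_number(keyword_value)):
        return True
    return instance <= keyword_value


def string_maximum(instance: Any, keyword_value: Any) -> bool:
    # String-sorting vocabulary: acts only when both sides are strings.
    if not (isinstance(instance, str) and isinstance(keyword_value, str)):
        return True
    # Stand-in comparison (codepoint order); the proposal wants collation.
    return instance <= keyword_value


def evaluate_maximum(instance: Any, keyword_value: Any) -> bool:
    # Both vocabularies look at the keyword; the mismatched one is a no-op.
    return validation_maximum(instance, keyword_value) and string_maximum(
        instance, keyword_value
    )
```

With this shape, `evaluate_maximum(7, 5)` fails through the numeric path, `evaluate_maximum("cherry", "banana")` fails through the string path, and `evaluate_maximum("apple", 5)` passes because neither vocabulary finds matching types.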

When considering string ordering, we do NOT use Unicode codepoint order; instead we respect the sorting and collation semantics of the stated locale (see https://www.unicode.org/reports/tr10/). Case-sensitive sorting will use Unicode Collation Level 4. Case-insensitive sorting will use Unicode Collation Level 2.

karenetheridge commented Feb 1, 2021

@karenetheridge

I would love to be able to have the existing maximum, minimum keywords support strings, but there is the question of what to do about non-ASCII characters. We could simply say that all non-ASCII characters would be replaced by 0x80 for the purpose of string comparisons, so that all non-ASCII characters are "greater" than ASCII characters, and all non-ASCII characters are equivalent to each other.
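
A minimal Python sketch of that replacement rule, with hypothetical helper names (`ascii_fold_key`, `string_at_most`). It maps every character above 0x7F to the single character U+0080 before comparing, so ASCII characters keep their usual order and all non-ASCII characters compare as equal to one another:

```python
def ascii_fold_key(s: str) -> str:
    # Replace every non-ASCII character with 0x80, leaving ASCII untouched.
    return "".join(ch if ord(ch) < 0x80 else "\x80" for ch in s)


def string_at_most(instance: str, limit: str) -> bool:
    # How a string-aware "maximum" might compare under this rule.
    return ascii_fold_key(instance) <= ascii_fold_key(limit)
```

Under this rule "é" and "ü" are indistinguishable, so `string_at_most("ü", "é")` and `string_at_most("é", "ü")` both hold, while `string_at_most("é", "z")` does not.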
