Introduce validation of lists #20

Open
dimitarvp opened this issue May 5, 2019 · 7 comments

Comments

@dimitarvp

dimitarvp commented May 5, 2019

I've found myself in a need to validate a list with the rough shape of:

  • The list contains structs/maps of different types.
  • Each of the types has a minimum and maximum amount of occurrences allowed (specified by an external non-machine-readable schema).
  • Some of the structs/maps embed others in a tree-like children list attribute.

So, something like this:

[
  %{tag: "00", ...},
  %{tag: "11", ...},
  %{tag: "12", ..., children: [%{tag: "16", ...}, %{tag: "21", ...}]},
  # ...A lot more recursive rules like these follow...
  %{tag: "99", ...}
]

In this example, tags 00, 11 and 99 must each appear exactly once, while tag 12 can appear 1..9999 times but each of its children must appear exactly once per parent. There are other structs/maps that must be present but some of their children are optional (0..1 occurrences).

I was thinking of devising my own code to validate such lists by having a schema like this:

[
  %{tag: "00", limit: 1..1},
  %{tag: "11", limit: 1..1},
  %{tag: "12", limit: 1..9999, children: [%{tag: "16", limit: 1..1}, %{tag: "21", limit: 1..1}, %{tag: "26", limit: 1..1}]},
  %{tag: "99", limit: 1..1},
]

...etc. The point being, have a hierarchical structure that:

  • Gives you a pattern to match on -- in this case it's %{tag: x} when is_binary(x) and x in ~w[00 11 12 16 21 26 99], and
  • Asserts how many times the given pattern can occur.

As the author of the library pointed out in ElixirForum (https://elixirforum.com/t/exvalibur-smart-validation-in-elixir/17677/14), this is akin to XML Schema (XSD), although I'd like to be able to specify such a validation schema by hand as well.

I admit I haven't fully worked out all the specifics in my head yet -- I tried several recursive approaches and some of them work, but they irritated me by being too specialised. So I was wondering if we can generalise a solution inside this library.
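A minimal sketch of the kind of recursive check described above, written against the hand-made schema format from this description. The module name `ListSchema` and the function `valid?/2` are hypothetical and not part of Exvalibur. The sketch assumes every item is a map with a `:tag` key, ignores element order, and treats any tag missing from the schema as invalid.

```elixir
defmodule ListSchema do
  # Hypothetical helper, not part of Exvalibur.
  # Returns true when every tag is known to the schema, every schema entry
  # occurs a number of times within its limit, and the children of every
  # matching item satisfy the nested schema.
  def valid?(items, schema) when is_list(items) and is_list(schema) do
    allowed = MapSet.new(schema, & &1.tag)

    Enum.all?(items, &MapSet.member?(allowed, &1.tag)) and
      Enum.all?(schema, fn rule ->
        matching = Enum.filter(items, &(&1.tag == rule.tag))

        length(matching) in rule.limit and
          Enum.all?(matching, fn item ->
            # Each parent is validated on its own, against its rule's children.
            valid?(Map.get(item, :children, []), Map.get(rule, :children, []))
          end)
      end)
  end
end
```

This only answers yes or no; it says nothing about where the input fails, which is the harder part.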

@am-kantox am-kantox self-assigned this May 5, 2019
@am-kantox am-kantox added the enhancement New feature or request label May 5, 2019
@am-kantox am-kantox added this to the v0.11.0 milestone May 5, 2019
@am-kantox
Owner

@dimitarvp I want to clarify the multiple tag occurrence.

%{tag: "12", limit: 1..9999,
   children: [%{tag: "16", limit: 1..1}, %{tag: "21", limit: 1..1}]}

How does the above control the children limit? E.g. imagine the following instance:

[
  %{tag: "12", children: [%{tag: "16"}, %{tag: "21"}]},
  %{tag: "12", children: [%{tag: "16"}]},
  %{tag: "12", children: [%{tag: "21"}]},
]

Should the uniqueness of children apply only within the immediate parent, or across the whole set of children of all the tag: "12" entries?

@dimitarvp
Author

dimitarvp commented May 5, 2019

Uniqueness should be hierarchical. Every %{tag: "12"} is validated separately. There can be 1..9999 of them but the children records should comply with their rules individually. In my example, you must have tags 16, 21 and 26 inside 12, and each of them can appear only once per tag 12. Globally (when validating the entire list) each of them can occur many more times than once.

@dimitarvp
Author

dimitarvp commented May 5, 2019

To clarify: the example you gave in your comment is invalid because none of the 3 records of type %{tag: "12"} have all the required tags (16, 21 and 26). If all three had them then the list would be valid (see below). Reiterating: in my use case the validation is to be done recursively and individually.

[
  %{tag: "00"},
  %{tag: "11"},
  #...
  %{tag: "12", children: [ %{tag: "16"}, %{tag: "21"}, %{tag: "26"} ] },
  %{tag: "12", children: [ %{tag: "16"}, %{tag: "21"}, %{tag: "26"} ] },
  %{tag: "12", children: [ %{tag: "16"}, %{tag: "21"}, %{tag: "26"} ] },
  #...
  %{tag: "99"}
]

☝️ This is valid if the schema is like in the issue description (updated to add tag 26 to it). Tags 16, 21 and 26 do not get their limit: 1..1 rules violated because the rules are valid only within the scope of their parent (as you said in the latter part of your comment). Additionally: the whole thing is valid also because tags 12 are occurring 3 times which is within 1..9999. And tags 00, 11 and 99 are within the 1..1 limit as well.
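To make the scoping concrete, here is a small sketch that counts child tags per parent and across the whole list. The module `ScopeDemo` and its functions are hypothetical names used only for illustration. With three tag 12 records that each hold tags 16, 21 and 26, the per-parent counts are all 1 (so `limit: 1..1` holds) while the global counts are all 3.

```elixir
defmodule ScopeDemo do
  # Hypothetical helper: counts child tags separately for each parent item.
  def per_parent_counts(items) do
    Enum.map(items, fn item ->
      item |> Map.get(:children, []) |> Enum.frequencies_by(& &1.tag)
    end)
  end

  # Hypothetical helper: counts child tags across all parents at once.
  def global_counts(items) do
    items
    |> Enum.flat_map(&Map.get(&1, :children, []))
    |> Enum.frequencies_by(& &1.tag)
  end
end
```

The 1..1 rule is meant to be checked against each map returned by `per_parent_counts/1`, never against the result of `global_counts/1`.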

@am-kantox
Owner

@dimitarvp I think I get it. So, basically, what we need is the ability to perform:

  • list validation with rules applied to each list element;
  • nested validation with rules applied to the nested term transparently.

We already have Exvalibur.Guards.Default.{min_count/2,max_count/2}, so I imagine syntax somewhat like:

  1. Default guard @spec limit(from_to :: Range.t()) :: guard() as an alias to min_count: from, max_count: to;
  2. Default guard @spec item() :: guard() declaring the specification of the element in the list;
  3. New sigil/macro to declare a nested type to avoid code duplication.
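As a rough illustration of item 1 only, the alias could boil down to expanding a range into a pair of bounds. The module `LimitSketch` below is hypothetical and does not use Exvalibur's guard machinery; how `min_count` and `max_count` actually behave inside the library is an assumption here.

```elixir
defmodule LimitSketch do
  # Hypothetical helper, not part of Exvalibur: expands a range into the
  # keyword pair that the proposed `limit` alias would stand for.
  def expand(%Range{first: from, last: to}) when from <= to do
    [min_count: from, max_count: to]
  end

  # Checks an occurrence count against the expanded bounds.
  def within?(count, bounds) when is_integer(count) do
    count >= Keyword.fetch!(bounds, :min_count) and
      count <= Keyword.fetch!(bounds, :max_count)
  end
end
```

For example, `LimitSketch.expand(1..9999)` gives `[min_count: 1, max_count: 9999]`, and a count of 3 falls within those bounds.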

Then your example would be guarded somewhat as:

use Exvalibur,
  types: [
    leaf_tag: %{guards: %{tag: tag in ["00", "11", "99"]},
                conditions:%{tag: limit: 1..1}},
    branch_leaf_tag: %{guards: %{tag: tag in ["16", "21", "26"]},
                conditions:%{tag: limit: 1..1}},
    branch_tag: %{guards: %{tag: tag in ["12"]},
                conditions:%{tag: limit: 1..1},
                children: %{tag: item(branch_leaf_tag())}}
  ],

  rules: [    
    %{matches: %{tags: leaf_tag() or branch_tag()}}]

Right?

@dimitarvp
Author

More or less yes, unless I am reading your format wrong:

use Exvalibur,
  types: [
    leaf_tag: %{guards: %{tag: tag in ["00", "11", "99"]},
                conditions:%{tag: limit: 1..1}},
    branch_leaf_tag: %{guards: %{tag: tag in ["16", "21", "26"]},
                conditions:%{tag: limit: 1..1}},
    branch_tag: %{guards: %{tag: tag in ["12"]},
                conditions:%{tag: limit: 1..9999},
                children: %{tag: item(branch_leaf_tag())}}
  ],

  rules: [    
    %{matches: %{tags: leaf_tag() or branch_tag()}}
]

Note the limit: 1..9999 in branch_tag.

The part that sucks the most is writing the recursive validation code in such a way that it can show you exactly where the input data doesn't comply. I've tried 3 versions so far and they all did the validation job fine, but the error messages remain unconquered territory for me.
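One possible direction, sketched under assumptions rather than taken from any of those versions: thread a path through the recursion and collect errors instead of returning a boolean. The module `PathErrors`, the `{path, message}` error shape, and the `{tag, index}` path entries are all hypothetical choices; the schema format is the hand-written one from the issue description.

```elixir
defmodule PathErrors do
  # Hypothetical sketch: returns a list of {path, message} tuples. A path is
  # a list of {tag, index} pairs leading to the parent whose children are
  # being checked; [] means the top-level list.
  def errors(items, schema, path \\ []) do
    known = Enum.map(schema, & &1.tag)

    # Tags that no rule at this level accounts for.
    unknown =
      for %{tag: tag} <- items, tag not in known do
        {path, "unexpected tag #{tag}"}
      end

    per_rule =
      Enum.flat_map(schema, fn rule ->
        matching =
          items
          |> Enum.with_index()
          |> Enum.filter(fn {item, _index} -> item.tag == rule.tag end)

        count = length(matching)

        count_errors =
          if count in rule.limit do
            []
          else
            [{path, "tag #{rule.tag}: #{count} occurrence(s), expected #{inspect(rule.limit)}"}]
          end

        # Recurse into each matching item, extending the path with its
        # tag and its position in the current list.
        nested =
          Enum.flat_map(matching, fn {item, index} ->
            errors(
              Map.get(item, :children, []),
              Map.get(rule, :children, []),
              path ++ [{rule.tag, index}]
            )
          end)

        count_errors ++ nested
      end)

    unknown ++ per_rule
  end
end
```

An empty result means the list is valid. A missing tag 26 inside the second element of the list would be reported with the path `[{"12", 1}]`.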

Considering making Ecto embedded schemas and encoding the rules in their changeset methods at this point. ☹️ Might be easier in the long run.

How's your library in terms of pinpointing and reporting exact places in the input where validation fails?

@am-kantox
Owner

am-kantox commented May 5, 2019

@dimitarvp Yeah, 9999, of course, it was a copy-paste leftover.

> How's your library in terms of pinpointing and reporting exact places in the input where validation fails?

It’s uh planned: #13

Actually, I never claimed this library to be the preferred solution for complex data, and I use Ecto myself when it comes to cumbersome nesting.

I will work on this issue in any case and will report back if I find a better, more concise way to describe this kind of data.

@dimitarvp
Author

dimitarvp commented May 5, 2019

Sure, that's my goal as well -- to see if there's a neutral(ish) approach to recursive validation. I quite like Ecto and reach for it in every single hobby project even when there's no actual persistence involved (Ecto 3.0 makes this even easier). The Changeset functionality is priceless.

But since I have been quite interested in strong typing lately, I am trying to find a way to encode complex validations. If I find it, the end goal is for it not to be Elixir-specific.

Feel free to bounce ideas off me here or on ElixirForum.

2 participants