Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Props & quantifiers #217

Merged
merged 17 commits into from
Jul 20, 2020
Merged

Props & quantifiers #217

merged 17 commits into from
Jul 20, 2020

Conversation

shaylew
Copy link
Collaborator

@shaylew shaylew commented Jun 23, 2020

(Previous discussion at #200.)

byorgey and others added 9 commits June 4, 2020 21:54
`holds` tries its best to convert `Prop` to `Bool`.  Once we add things like
`forall` and `exists`, it may not terminate.  For now all it does it just
evaluate the underlying `Bool` value.
So far they parse, typecheck, and pretty-print.  No evaluation yet.
Move the definition of SortMap from Disco.Types.Rules into Disco.Types,
in preparation for adding SortMap to PolyType (allowing qualified types
to show up in surface syntax).

See #179.
@shaylew
Copy link
Collaborator Author

shaylew commented Jun 23, 2020

Wow, it's amazing how much code we get to delete when switching to use simple-enumeration.

There could possibly be even more -- the countType, isEmptyTy, and isSearchable machinery is almost but not quite redundant with the value enumerations. Combining them would (practically) require putting all the enumeration code into Maybe so it could signal "nope, can't enumerate that", and (philosophically) introduce a weird dependency between the typechecker and the representation of terms in the evaluator, but those things might be worth it.

(There's also a tiny bit of code right now in enumFunction that works around a difference of opinion between Disco -- which is happy to enumerate N -> Void or N -> Unit -- and simple-enumeration, which complains about the infinite domain. I made an issue there.)

Ah, it's because nested quantifiers desugar to alternating quantifiers and cases:

Ah, oops, yeah. One quantifier with several bindings is desugaring differently from nested (matching) quantifiers, and the current attempt only works when there aren't intervening cases.

I guess there are a couple levels of sophistication we could go for here:

  1. The current behavior, where search works for one syntactic quantifier in the source program (and exists x. exists y. p is different from exists x y. p).
  2. Lifting the abstraction out of the (single-branch) case, so syntactically-adjacent quantifiers are treated the same whether they're merged or broken up. We could either do this in the compiler (looking through cases) or the desugarer (merging the abstractions before we put the cases in).
  3. Doing something clever in the evaluator (keeping a quantifier stack?) to make the question of which quantifiers get searched together semantic rather than syntactic. For instance, no amount of desugarer magic is going to merge a quantifier hidden behind a function call, but in a pure language one might hope that the obvious inlining transformation wouldn't change the semantics.

(Illustrative but not super realistic example:)

isSquare : N -> Prop
isSquare n = exists (k : N). k^2 == n

!!! exists (x : N). isSquare (x + 5)

@byorgey
Copy link
Member

byorgey commented Jun 23, 2020

There could possibly be even more -- the countType, isEmptyTy, and isSearchable machinery is almost but not quite redundant with the value enumerations. Combining them would (practically) require putting all the enumeration code into Maybe so it could signal "nope, can't enumerate that", and (philosophically) introduce a weird dependency between the typechecker and the representation of terms in the evaluator, but those things might be worth it.

Interesting, I'd have to think about this more. I don't like the sound of "weird dependency between the typechecker and the representation of terms in the evaluator" but I'm not sure I completely understand all the issues. Let's definitely not worry about it for now and maybe we can clean things up even more down the road if we find a reason to.

(There's also a tiny bit of code right now in enumFunction that works around a difference of opinion between Disco -- which is happy to enumerate N -> Void or N -> Unit -- and simple-enumeration, which complains about the infinite domain. I made an issue there.)

Oh, I wondered about that bit of code! When I saw it I thought, "wait, doesn't functionOf already take care of that?" Let's definitely fix that in simple-enumeration.

The current behavior, where search works for one syntactic quantifier in the source program (and exists x. exists y. p is different from exists x y. p).

Ugh, I don't like this option at all. I'd really like to be able to tell students that exists x y z. ... is an abbreviation for exists x. exists y. exists z. ..., and for them to be able to write nested quantifiers without weird gotchas about fair vs. unfair search.

Lifting the abstraction out of the (single-branch) case, so syntactically-adjacent quantifiers are treated the same whether they're merged or broken up. We could either do this in the compiler (looking through cases) or the desugarer (merging the abstractions before we put the cases in).

Yeah, now that I think about it, I think the desugarer is the right place. We already should change the desugarer so it avoids inserting redundant cases. We might as well put in something to collect repeated quantifiers. It is a bit odd though, it feels sort of like "sugaring" rather than "desugaring".

Option 3 is really interesting, and the example is compelling (if not inherently realistic). I'm not sure it's worth doing right now (unless it turns out to be really easy?), but we should keep it in mind.

@byorgey
Copy link
Member

byorgey commented Jun 23, 2020

The Travis build was failing due to a Haddock parse error I created earlier. Hopefully should be fixed now.

@shaylew
Copy link
Collaborator Author

shaylew commented Jun 23, 2020

It is a bit odd though, it feels sort of like "sugaring" rather than "desugaring".

I agree it's a little odd. If we were doing it just to lambdas it would feel more like a core-to-core optimization pass -- increasing the arity of a lambda to avoid immediately returning a closure, which incidentally makes it less strict. But doing it to quantifiers is less like an optimization with a minor semantic side effect and more like a semantic transformation with a minor performance benefit.

I thought a little more about option 3 and it seems like doing it in generality would require fair backtracking search. That sounds like a pain... except that LogicT exists right off the shelf! So while this would definitely be a complication, it does have the upside of giving a very solid semantics to values of type Prop: a prop would be (ignoring the example/counterexample) a LogicT (Disco IErr) (), operations on props would combine those values, and holds would runLogicT them to observe whether they succeed or fail.

(That said, despite my excitement about this, it's probably better to just do (2) for now.)

@byorgey byorgey linked an issue Jun 24, 2020 that may be closed by this pull request
@byorgey byorgey linked an issue Jun 24, 2020 that may be closed by this pull request
@byorgey
Copy link
Member

byorgey commented Jun 24, 2020

(That said, despite my excitement about this, it's probably better to just do (2) for now.)

Yeah, the LogicT thing sounds really cool, but I think I agree with your assessment.

@byorgey
Copy link
Member

byorgey commented Jun 25, 2020

simple-enumeration-0.2.1 is now on Hackage with the enhancements to functionOf.

-- Type helpers
------------------------------------------------------------

mkFunTy :: [Type] -> Type -> Type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this function already exists somewhere in the Disco codebase...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact the place it exists (local to typechecking abstractions) is passing the answer right along, and in the place I though I needed to do this it had already been done. Maybe it belongs in Types but for now I just took it out and un-abstracted it back into desugarDefn.

@byorgey
Copy link
Member

byorgey commented Jun 25, 2020

How hard would it be to add another quantifier, find, such that find (x1 : T1) ... (xn : Tn). prop has type Unit + T1 * ... * Tn? i.e. like exists except it gives you the witness it finds instead of being a Prop. For example

Disco> holds (exists (x:N) (y:N) (z:N). 0 < x < y < z and x^2 + y^2 == z^2)
true
Disco> find (x:N) (y:N) (z:N). 0 < x < y < z and x^2 + y^2 == z^2
right (3, 4, 5)

@shaylew
Copy link
Collaborator Author

shaylew commented Jun 25, 2020

I don't think that would be too hard. It's a new abstraction with its own typing rule, but on the evaluation end we're actually already generating the witness and then throwing it away.

@byorgey
Copy link
Member

byorgey commented Jul 1, 2020

@shaylew , let me know if there's any way I can help on this, or let me know if there are specific parts you're currently working on so I can pick a different aspect to work on. For example, I could start working on getting rid of the special Property type and replace it with terms of type Prop.

@shaylew
Copy link
Collaborator Author

shaylew commented Jul 3, 2020

I made a first pass at unifying propositions and properties. This pass is just the evaluator and not the frontend, so there's still a separate path for parsing and typechecking properties to be simplified away.

I think the major fiddly bit left is dealing with quantified argument names. Currently they get lost and replaced with deeply unhelpful (arg0, arg11) generated names at some point in the pipeline. And in fact it's not completely clear we want to report the argument names, rather than the pattern-bound variable names (though the one subsumes the other when applicable). If someone tries to test forall (x: N, y: N). x < y or x > y we'd maybe want to report a counterexamples with values for x and y, not values for the (unnamed) pair.

Maybe the thing to do here is to capture the environment at the point of failure, then somehow prune it to the variables of interest? Possibly sticking some more information on VPSearch values would make this easier.

The other (less fiddly) things to do are to recover "these things weren't equal" reporting with a Prop-valued equality operator, and to parse and typecheck properties as plain old expressions (of type Prop).

There are also definitely some opportunities for cleanup (e.g. finding a monoid for the expanded TestResult that's lazy enough for unbounded search), and some loose ends that may or may not pan out (e.g. the Prop negation stuff, which falls out for not-quite-free from the choice to represent exists and forall with one search value but which may or may not merit being exposed to the user).

@shaylew
Copy link
Collaborator Author

shaylew commented Jul 8, 2020

@byorgey Do you have opinions about how to unify the syntax of properties and propositions? Right now properties don't require parenthesis around parameters but do require commas between them; but forall/exists in expressions work the other way around (like lambdas).

!!! forall a : Q, b : Q. divide a b * b == a

p :: Prop
p = forall (a : Q) (b: Q). divide a b * b == a

The former style is definitely closer to math notation, the latter is consistent with lambdas. There's no issue with supporting one or the other or both, it's just down to what syntax(es) you want to support on which abstractions.

@byorgey
Copy link
Member

byorgey commented Jul 8, 2020

Ah, good question, I had forgotten about this. I definitely prefer the version with commas and no parentheses, for the reason you mention --- it is more in line with the usual mathematical syntax.

I think while we're at it we should just change the syntax of lambdas to match, i.e. \x : N, y : Z. blah blah. I like not having to put parentheses around variables. It feels weird that if you start out with \x.x but then you decide you want to put a type annotation on the x, you have to add parentheses. Sure, it doesn't match Haskell or Agda syntax, but it will be more internally consistent and, I think, less confusing for students.

@byorgey
Copy link
Member

byorgey commented Jul 8, 2020

This is all looking good, by the way. The issue with names for reporting counterexamples is tricky, I'd be happy to chat about that sometime if you want.

@shaylew
Copy link
Collaborator Author

shaylew commented Jul 8, 2020

So \x : N, y : Z. blah blah would be a function of two arguments, and with parens \(x : N, y : Z). blah blah would take a tuple? And standalone definitions would still not use commas to keep them looking like their callsites?

foo : N -> Z -> Blah
foo x y = blah blah

src/Disco/AST/Generic.hs Outdated Show resolved Hide resolved
src/Disco/AST/Generic.hs Outdated Show resolved Hide resolved
src/Disco/Extensions.hs Outdated Show resolved Hide resolved
-- (look for a counterexample, fail if you find it) and the motive
-- @(True, True)@ corresponds to "exists". The other values
-- arise from negations.
newtype SearchMotive = SearchMotive (Bool, Bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we actually making use of both booleans, or is this left over from the stuff with a prim to invert a prop? I tried tracing its use and it took me to invertMotive, which took me to primNotProp, which took me to ONotProp, which does not seem to be used at all.

If we are in fact not making use of this extra flexibility, should we get rid of it, or should we leave it in just in case we want it in the future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only using two values right now (exists is "succeed if you find a success" and forall is "fail if you find a failure"). The search code is slightly simplified by being able to abstract over "what am I looking for" and "what do I do when I find it", but without the NotProp stuff (which gives rise to the other two values) it'd make just as much sense to do a case and determine those locally.

Up to you if you think you're going to want negation, I think?

Copy link
Member

@byorgey byorgey Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, when you say "if you're going to want negation", you mean a Prop-level negation, right? I guess it would be nice to have if most of the machinery is already there. And actually we kind of do already have it, except not:

Disco> :type not (exists x:N. x == x)
not (∃(x : ℕ). x == x) : Prop
Disco> :test not (exists x:N. x == x)
user error (Pattern match failure in do expression at src/Disco/Interpret/Core.hs:372:3-12)

Copy link
Collaborator Author

@shaylew shaylew Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Yeah, I meant prop-level negation.)

Oh huh, I didn't think that would typecheck! But in fact not, and, and or all check at Prop -> Prop as well as the expected B -> B and B -> Prop, unlike identical user-defined boolean functions!

In the case of not it'd be easy to not desugar it away and make one connective that works for either booleans or props, though I don't know if that's clearer or less clear than using a separate name for the prop version. But this works for not only because we can push it inside searches in the evaluator (by twiddling the motive). There's no representative in our type of ValProp normal forms that could represent the and or or of two searches.

edit: Oh no, there is a way to do this at the desugarer level. You can turn p and q and p or q for props into:

forall b: B. {? p if b, q otherwise ?}
exists b: B. {? p if b, q otherwise ?}

And this is the desugarer, so it's even possible to do this without generating a reporting frame for the introduced variable. I'm not sure how I feel about this but if it's possible to desugar the connectives differently based on whether they're taking booleans or props this is a workable approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in fact not, and, and or all check at Prop -> Prop as well as the expected B -> B and B -> Prop, unlike identical user-defined boolean functions!

Yes, that's because of this code here: https://github.com/disco-lang/disco/pull/217/files#diff-6f4e13d554f1399d016fcb9dc348a635R477 I made those changes to the type checker intentionally when I started adding Prop but I hadn't really thought through all the implications yet.

Long-term, I think it would be ideal for Prop to support and and or (and possibly not), so one can start talking about proofs and evidence etc., maybe even connect Disco to some sort of external theorem provers and/or proof assistants, etc.

I was in the middle of writing this when I saw your edit. I was going to say that it seems clear at this point we should push it off to the future, put the type checking for and, or, not back the way they were, and create an issue to track this feature for the future. So now I guess it's less clear, but maybe still a good idea. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still a good idea to put it off. That trick gets the right answers but it's going to take some work to also get good error messages -- you really don't want to see "I tried 2 things and couldn't find an example" when you ask :test false or false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, makes sense.

generateSamples :: SearchType -> E.IEnumeration a -> Disco e [a]
generateSamples Exhaustive e = return $ E.enumerate e
generateSamples (Randomized n m) e
| E.Finite k <- E.card e, k <= n + m = return $ E.enumerate e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In another comment I asked why when I tested something of the form forall x:B. ... I was getting a message that said something about trying 100 possibilities. Looking at this code, it seems like what is going on is that under the hood it actually just tried the two boolean values, but the SearchType value remained the same, and that's what the pretty-printer used to decide what message to display. Maybe generateSamples should also return a potentially updated SearchType value??

src/Disco/Typecheck.hs Outdated Show resolved Hide resolved
@byorgey
Copy link
Member

byorgey commented Jul 16, 2020

This is so cool! I think it's looking really good. After fixing the various minor things I mentioned in comments above, we should also (probably in this order):

  1. Rebase on top of master and fix merge conflicts.
  2. Fix compiler warnings. There are a bunch at the moment, and I like to keep things -Wall clean as much as possible (see the build script for the GHC options we use). There are a couple warnings about the pattern-match checker exceeding a certain number of iterations which are impossible to turn off, but other than that we should try to get rid of all warnings. Happy to help with that if you want, let me know.
  3. Merge in the restyler PR (this can wait until the very end of course).

shaylew added 4 commits July 19, 2020 19:13
Use props as properties; unify syntax; add =!=.

This touches a large amount of disco code because it changes the syntax
of all multi-argument `lambda`, `forall`, and `exists` expressions to
match the comma-separated syntax of test properties.
@byorgey
Copy link
Member

byorgey commented Jul 20, 2020

This is looking good. What's left to do before merging?

@byorgey byorgey merged commit 351f331 into master Jul 20, 2020
@byorgey byorgey deleted the prop branch July 20, 2020 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce separate Prop type 'forall' and 'exists' expression syntax
3 participants