Initial work on combinatorial models of GATs #135

Draft
wants to merge 4 commits into
base: main

kris-brown
Collaborator

@kris-brown kris-brown commented Dec 23, 2023

This branch contains some experiments toward constructing a canonical category of models for any GAT.

Relation to C-sets

This generalizes C-Sets, which correspond to models of theories with nullary type constructors and unary term constructors (ideally, C-Sets with all their performance characteristics should come automatically for such theories).

Rather than a FinSet for every Ob of a schema C, a GATset associates a nested matrix with each AlgSort. E.g. for Hom2(f,g) ⊣ [(a,b)::Ob, (f,g)::Hom(a,b)] there is a family of FinSets parameterized by a, b, f, and g. The four parameters of Hom2 can be partitioned automatically into two dependent levels of parameters ([[a,b],[f,g]]).

|Hom2| is NOT equal to |Hom|², since f and g are not completely independent: they must have the same dom and codom. E.g. if we have:

    |Ob| = 2
    |Hom(1,1)| = 2
    |Hom(2,1)| = 0
    |Hom(1,2)| = 1
    |Hom(2,2)| = 1

and give names to these elements, e.g.

    Ob = [ω,β]
    Hom(ω,ω) = [η,μ]
    Hom(β,ω) = []
    Hom(ω,β) = [δ]
    Hom(β,β) = [ξ]

then we could potentially get a Hom2 matrix like:

       a=ω    a=β
    ⌜-------------⌝
b=ω | [3 2  |     |     i.e. |Hom2(η,η)| = 3, |Hom2(η,μ)| = 2, |Hom2(μ,η)| = 1
    |  1 0] |[]₀ₓ₀|          |Hom2(μ,μ)| = 0, |Hom2(δ,δ)| = 2, |Hom2(ξ,ξ)| = 1
    |-------------|
b=β |  [2]  | [1] |
    ⌞-------------⌟ 
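A rough sketch of the two-level structure in the example above, written in Python to stay independent of the branch's Julia API. The names `HOM`, `HOM2`, `hom2_card` and `well_formed` are made up for illustration and are not the `NestedMatrix` type from this PR; the point is only that the outer level is indexed by (a,b) and the inner level by (f,g) drawn from Hom(a,b).

```python
# Hypothetical encoding of the example above; NOT the branch's NestedMatrix type.

# First level: the named elements of Hom(a, b) for each pair of objects.
HOM = {
    ("ω", "ω"): ["η", "μ"],
    ("β", "ω"): [],
    ("ω", "β"): ["δ"],
    ("β", "β"): ["ξ"],
}

# Second level: for each (a, b), a square matrix whose (i, j) entry is the
# cardinality of Hom2(f_i, g_j), with f_i and g_j both taken from Hom(a, b).
HOM2 = {
    ("ω", "ω"): [[3, 2], [1, 0]],
    ("β", "ω"): [],
    ("ω", "β"): [[2]],
    ("β", "β"): [[1]],
}

def hom2_card(a, b, f, g):
    """Cardinality of Hom2(f, g) for f, g in Hom(a, b)."""
    names = HOM[(a, b)]
    return HOM2[(a, b)][names.index(f)][names.index(g)]

def well_formed():
    """Each inner matrix must be |Hom(a,b)| x |Hom(a,b)|."""
    return all(
        len(HOM2[k]) == len(HOM[k])
        and all(len(row) == len(HOM[k]) for row in HOM2[k])
        for k in HOM
    )

n_hom = sum(len(v) for v in HOM.values())                   # total morphisms
n_cells = sum(len(v) ** 2 for v in HOM.values())            # valid (f, g) pairs
n_hom2 = sum(sum(row) for m in HOM2.values() for row in m)  # total 2-cells
```

Here `n_cells` counts only the parallel pairs (f,g), which is why it is smaller than `n_hom ** 2`.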

There is rudimentary HTML printing for these nested matrices, and they can also be rendered as a flattened table:

┌───┬───┬─────┬───────┬─────┐
│ a │ b │ dom │ codom │ val │
├───┼───┼─────┼───────┼─────┤
│ 1 │ 1 │   1 │     1 │   3 │
│ 1 │ 1 │   2 │     1 │   1 │
│ 1 │ 1 │   1 │     2 │   2 │
│ 1 │ 1 │   2 │     2 │   0 │
│ 2 │ 1 │   1 │     1 │   2 │
│ 2 │ 2 │   1 │     1 │   1 │
└───┴───┴─────┴───────┴─────┘

The NestedMatrix{T} data structure can be used for:

  • cardinalities of all the dependent sets of a model (T=Int), as above
  • data of term constructors, e.g. id/compose (T=Int)
  • component data for morphisms between GATsets (T=Vector{Int})

Current features

  • add_part!, rem_part!
  • homomorphisms(X,Y; monic, epic, iso, initial)
  • @instance ThCategory{CombinatorialModel, CombinatorialModelMorphism}
  • random model generation
  • basic colimits (coproducts, pushouts, otimes on morphisms, copair)
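To give a feel for what homomorphism search over dependent sets involves, here is a brute-force sketch in Python. It is not the algorithm or the signature used in this branch: the function name is borrowed from the list above, but the model encoding (a pair of an object count and a dict of Hom cardinalities), the restriction to the Ob and Hom(a,b) levels, and the reading of `monic` as "injective on every component" are all assumptions made for illustration.

```python
from itertools import product

def homomorphisms(X, Y, monic=False):
    """Enumerate maps X -> Y between toy models.

    A model is (n_ob, hom_counts), with hom_counts[(a, b)] = |Hom(a, b)| and
    objects numbered from 0. Higher sorts and term constructors are ignored.
    """
    nx, hx = X
    ny, hy = Y
    pairs = [(a, b) for a in range(nx) for b in range(nx)]
    results = []
    for ob_map in product(range(ny), repeat=nx):
        if monic and len(set(ob_map)) < nx:
            continue  # object component is not injective
        # The Hom component at (a, b) is a function
        # Hom_X(a, b) -> Hom_Y(F a, F b), where F is ob_map.
        choices = []
        for a, b in pairs:
            src = hx.get((a, b), 0)
            tgt = hy.get((ob_map[a], ob_map[b]), 0)
            fns = [f for f in product(range(tgt), repeat=src)
                   if not monic or len(set(f)) == src]
            choices.append(fns)
        for combo in product(*choices):
            results.append((ob_map, dict(zip(pairs, combo))))
    return results

# X: one object with one endomorphism.
X = (1, {(0, 0): 1})
# Y: the running example with ω = 0 and β = 1.
Y = (2, {(0, 0): 2, (1, 0): 0, (0, 1): 1, (1, 1): 1})
```

The key dependency is that the target of each Hom component is only known once the object map has been chosen, which is what distinguishes this from searching over independent sets.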

Wishlist

  • Performance (e.g. indexing, CompTime, add/rem_part! not reallocating)
  • Attributes
  • Enumeration + nauty symmetry quotienting, to verify that properties hold for all models up to some size limit

Enforcing equations

For these models to be particularly useful, we need to be able to enforce GAT axioms. For C-Sets this is done with the chase, which performs equality saturation via a sequence of pushouts. Based on some experiments, that algorithm won't work for GATsets as is: it fails to terminate even in simple cases where it ought to. But this isn't a problem, because e-graphs have a semidecidable equality saturation algorithm. Each model corresponds to an e-graph with an e-node for each element of every dependent set. The nested matrix representation is less suited to inference or modification, but shifting flexibly between the e-graph and nested matrix representations could give the best of both worlds.
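As a reminder of the mechanism e-graphs rely on, here is a minimal Python sketch of congruence closure: a union-find over e-classes plus a table of canonical e-nodes. It covers only the merging step (equating two elements forces terms built from them to be equated), not rewrite rules or full equality saturation, and the `EGraph` class is a toy written for this comment rather than code from the branch.

```python
class EGraph:
    """Toy e-graph: union-find over e-classes plus a table of e-nodes."""

    def __init__(self):
        self.parent = []  # union-find parent pointer for each e-class id
        self.nodes = {}   # canonical e-node (op, *child classes) -> e-class id

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add(self, op, *children):
        """Add an e-node, reusing the existing class if it is already known."""
        key = (op, *(self.find(c) for c in children))
        if key not in self.nodes:
            self.nodes[key] = len(self.parent)
            self.parent.append(len(self.parent))
        return self.find(self.nodes[key])

    def union(self, a, b):
        """Assert a = b and restore congruence."""
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self._rebuild()
        return self.find(a)

    def _rebuild(self):
        # Re-canonicalize every e-node; nodes that collide have equal
        # children, so their classes are merged. Repeat until stable.
        changed = True
        while changed:
            changed = False
            canon = {}
            for (op, *children), cls in self.nodes.items():
                key = (op, *(self.find(c) for c in children))
                cls = self.find(cls)
                other = canon.setdefault(key, cls)
                if self.find(other) != cls:
                    self.parent[cls] = self.find(other)
                    changed = True
            self.nodes = canon

eg = EGraph()
f, g, h = eg.add("f"), eg.add("g"), eg.add("h")
fg = eg.add("compose", f, g)
fh = eg.add("compose", f, h)
distinct_before = eg.find(fg) != eg.find(fh)
eg.union(g, h)  # asserting g = h forces compose(f,g) = compose(f,h)
equal_after = eg.find(fg) == eg.find(fh)
```

Rebuilding the whole table on every union is quadratic and only acceptable for a sketch; practical e-graph implementations defer and batch this step.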

Interesting things

  • This is a potential solution to the issue of wanting C-Sets with finite limit constraints. E.g. consider representing a morphism of polynomial functors as a model of the theory: P::TYPE, P'::TYPE, D(p::P)::TYPE, D'(p::P')::TYPE, f(p::P)::P', f#(d)::D(p)⊣[p::P,d::D'(f(p))]
  • We can work with a category of Petri Nets with etale maps by imposing the iso=[:I, :O] constraint on homomorphism search, as this asserts the constraint for each I(s::S,t::T) (resp. O(s,t)) dependent set (see NonStdlib.Theories.ThPetri and tests/combinatorial/HomSearch.jl).
  • One can use random model generation to test code written for models of a GAT. E.g. if someone writes Julia code which assembles a model of ThCategory given any model of ThCategoryWithPushouts, one way to verify the code is correct is to run the function through a series of randomly-generated models which implement ThCategoryWithPushouts (and check that the code doesn't crash, that the resulting models satisfy the ThCategory GAT axioms, etc.). The ability to do this kind of verification increases the value of writing generic code using GATlab.
  • It's important that GATsets need not always be fully "chased" to be usable, i.e. we should be able to represent GATsets which imply an infinite set. E.g. consider a category with one object and one morphism, where the composition of that morphism with itself does not yet have a specific ID. Not all features may be supported for such models, but colimits can produce infinite things from finite building blocks, and we want our class of models to be closed under colimits. Currently, a function output with ID=0 is meant to imply that the resulting element is free (modulo the equations of the GAT).
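The ID=0 convention from the last bullet can be pictured with a small Python sketch. The table layout and the helper names (`free_composites`, `is_fully_chased`, `identify`) are hypothetical and chosen only for illustration; the only thing taken from the text is that an output of 0 marks a free, not yet identified element.

```python
FREE = 0  # sentinel assumed from the text: output ID 0 means "free"

# One object with a single morphism f, stored under the 1-based ID 1.
# compose[(x, y)] holds the ID of the composite of x and y, or FREE.
compose = {(1, 1): FREE}

def free_composites(table):
    """Pairs whose composite is not yet identified with an existing element."""
    return sorted(pair for pair, out in table.items() if out == FREE)

def is_fully_chased(table):
    """True when every composite points at an existing element."""
    return not free_composites(table)

def identify(table, pair, target):
    """Return a copy of the table with one free composite identified."""
    if table[pair] != FREE:
        raise ValueError(f"composite {pair} is already assigned")
    return {**table, pair: target}

# Imposing f . f = f turns the partial model into a finite, fully chased one.
idempotent = identify(compose, (1, 1), 1)
```

Left as is, the `compose` table stands for a model in which f composed with itself is a new free element, and so on for further composites, which is how a finite table can imply an infinite set.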

add and remove part

NestedType/Term, better add/rem part

Iterate through NM

more hom progress. checkpoint before refactor

NestedType/Term, better add/rem part

implement ThCategory, functioning HomSearch

epic / monic constraint (begun)

hom search working iso

initial constraint

codecov bot commented Dec 23, 2023

Codecov Report

Attention: 31 lines in your changes are missing coverage. Please review.

Comparison is base (1ee1536) 96.61% compared to head (bd76a06) 96.40%.

❗ Current head bd76a06 differs from pull request most recent head 2972040. Consider uploading reports for the commit 2972040 to get more accurate results

Files                                  Patch %   Lines
src/combinatorial/Limits.jl              0.00%   13 Missing ⚠️
src/combinatorial/CModels.jl            94.00%   9 Missing ⚠️
src/combinatorial/Visualization.jl      93.33%   5 Missing ⚠️
src/combinatorial/DataStructs.jl        98.24%   4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #135      +/-   ##
==========================================
- Coverage   96.61%   96.40%   -0.21%     
==========================================
  Files          38       45       +7     
  Lines        2067     2756     +689     
==========================================
+ Hits         1997     2657     +660     
- Misses         70       99      +29     


@kris-brown kris-brown self-assigned this Dec 23, 2023
Kris Brown added 3 commits December 23, 2023 12:02
Doesn't terminate when it should: likely because all TGDs and all EGDs must be applied at once, rather than one at a time.
more colimits

remove test file

rem