Crawl statement to feed the Experience Index #167

Merged
merged 8 commits into from
Jan 23, 2024

Conversation

lebaudantoine
Contributor

@lebaudantoine lebaudantoine commented Dec 8, 2023

Purpose

Index Moodle Courses and Modules in the Experience Index

Proposal

This PR proposes several important pieces of logic:

  • Clients of the Experience Index and Moodle web services.
  • Indexers for Moodle LMS.
  • Runner script to run any indexer on any source.
  • Factories to instantiate indexers and data sources.
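
To make the factory and runner idea above concrete, here is a minimal sketch. It is not the code from this PR: `Indexer`, `CoursesIndexer`, `IndexerFactory`, `run_indexer`, and the example URL are all hypothetical names chosen for illustration, and the indexer only reports what it would do instead of calling Moodle.

```python
from typing import Dict, List, Type


class Indexer:
    """Hypothetical base class: an indexer bound to one data source URL."""

    def __init__(self, source_url: str) -> None:
        self.source_url = source_url

    def run(self) -> List[str]:
        raise NotImplementedError


class CoursesIndexer(Indexer):
    """Stand-in for a courses indexer; it only reports what it would do."""

    def run(self) -> List[str]:
        return [f"indexed courses from {self.source_url}"]


class IndexerFactory:
    """Map a short name to the indexer class registered under it."""

    def __init__(self) -> None:
        self._registry: Dict[str, Type[Indexer]] = {}

    def register(self, name: str, indexer_cls: Type[Indexer]) -> None:
        self._registry[name] = indexer_cls

    def create(self, name: str, source_url: str) -> Indexer:
        try:
            indexer_cls = self._registry[name]
        except KeyError:
            raise ValueError(f"Unknown indexer: {name!r}") from None
        return indexer_cls(source_url)


def run_indexer(factory: IndexerFactory, name: str, source_url: str) -> List[str]:
    """Runner entry point: build the named indexer for a source and run it."""
    return factory.create(name, source_url).run()


if __name__ == "__main__":
    factory = IndexerFactory()
    factory.register("courses", CoursesIndexer)
    print(run_indexer(factory, "courses", "https://moodle.example.org"))
```

Keeping the name-to-class mapping in one registry means a CLI command would only need to forward an indexer name and a source URL to the runner.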

Two indexers have been implemented for Moodle:

  • Courses, which indexes all available courses in Moodle
  • CourseContent, which indexes all available modules for a given course
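
As a rough illustration of how such an indexer could be structured, the sketch below models an extract/transform/load cycle over an in-memory course list. Everything here is an assumption made for the example: the class names, the `execute` method, the use of the Moodle course page URL as the IRI, and the dictionary standing in for the Experience Index. The actual classes in this PR may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ETLIndexer(ABC):
    """Hypothetical interface: extract from a source, transform, load into an index."""

    @abstractmethod
    async def extract(self) -> List[Dict[str, Any]]:
        """Fetch raw records from the source."""

    @abstractmethod
    def transform(self, raw: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Convert raw records into experience-like dictionaries."""

    @abstractmethod
    async def load(self, experiences: List[Dict[str, Any]]) -> int:
        """Write experiences to the index and return how many were written."""

    async def execute(self) -> int:
        raw = await self.extract()
        return await self.load(self.transform(raw))


class InMemoryCourses(ETLIndexer):
    """Toy courses indexer: a list replaces Moodle, a dict replaces the index."""

    def __init__(
        self,
        courses: List[Dict[str, Any]],
        index: Dict[str, Dict[str, Any]],
        base_url: str,
    ) -> None:
        self.courses = courses
        self.index = index
        self.base_url = base_url

    async def extract(self) -> List[Dict[str, Any]]:
        return list(self.courses)

    def transform(self, raw: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Assumption: the course page URL is used as the experience IRI.
        return [
            {
                "iri": f"{self.base_url}/course/view.php?id={course['id']}",
                "title": {"en": course["fullname"]},
            }
            for course in raw
        ]

    async def load(self, experiences: List[Dict[str, Any]]) -> int:
        # Keying by IRI makes a second run overwrite instead of duplicate.
        for experience in experiences:
            self.index[experience["iri"]] = experience
        return len(experiences)


if __name__ == "__main__":
    index: Dict[str, Dict[str, Any]] = {}
    indexer = InMemoryCourses(
        [{"id": 7, "fullname": "Intro to Python"}],
        index,
        "https://moodle.example.org",
    )
    print(asyncio.run(indexer.execute()), index)
```

A CourseContent-style indexer would follow the same three steps, taking a course identifier as an extra constructor argument.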

To do:

  • Set up the Warren CLI to run the runner script.
  • Discuss how we would register indexing jobs.
  • Discuss storing the module's type in the technical_datatypes experience attribute.

Next steps:

  • Write indexers to parse statements.
  • Set up the orchestration of the available indexers (cronjobs or airflow).
  • Implement LifeCycle metadata part of an Experience.
  • Implement archive/delete feature in the Experience Index API and client.

Contributor

@wilbrdt wilbrdt left a comment

src/api/core/warren/xi/crawler.py (outdated, resolved)
src/api/core/warren/xi/crawler.py (outdated, resolved)
src/api/core/warren/xi/crawler.py (outdated, resolved)
src/api/core/warren/xi/crawler.py (outdated, resolved)
bin/wip.py (outdated, resolved)
bin/crawl_statement.py (outdated, resolved)
@lebaudantoine lebaudantoine force-pushed the al-crawler branch 2 times, most recently from 5304d59 to 038ff86 Compare December 15, 2023 15:58
Contributor

@quitterie-lcs quitterie-lcs left a comment

Huge!

src/api/core/warren/xi/client.py (outdated, resolved)
src/api/core/warren/xi/routers/experiences.py (outdated, resolved)
Contributor

@jmaupetit jmaupetit left a comment

Looks amazing!

The PR description needs more love ❤️

bin/runner.py (outdated, resolved)
src/api/core/warren/xi/client.py (resolved)
src/api/core/warren/xi/client.py (outdated, resolved)
src/api/core/warren/xi/client.py (outdated, resolved)
src/api/core/warren/xi/indexers/client.py (outdated, resolved)
src/api/core/warren/xi/indexers/etl.py (resolved)
src/api/core/warren/xi/indexers/factory.py (outdated, resolved)
src/api/core/warren/xi/indexers/moodle/models.py (outdated, resolved)
src/api/core/warren/xi/routers/experiences.py (outdated, resolved)
@lebaudantoine lebaudantoine force-pushed the al-crawler branch 3 times, most recently from be05b60 to 1202dad Compare December 28, 2023 18:13
Contributor

@jmaupetit jmaupetit left a comment

Almost ready to merge!

src/api/core/warren/xi/client.py (outdated, resolved)
src/api/core/warren/xi/indexers/etl.py (resolved)
src/api/core/warren/xi/indexers/moodle/etl.py (outdated, resolved)
src/api/core/warren/xi/indexers/moodle/models.py (outdated, resolved)
src/api/core/warren/xi/routers/experiences.py (outdated, resolved)

changeset-bot bot commented Jan 18, 2024

⚠️ No Changeset found

Latest commit: a439dd1

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

@jmaupetit jmaupetit marked this pull request as ready for review January 19, 2024 10:15
@jmaupetit jmaupetit self-assigned this Jan 19, 2024
@wilbrdt wilbrdt force-pushed the al-crawler branch 3 times, most recently from edb24f4 to ed86a12 Compare January 22, 2024 11:34
Contributor

@jmaupetit jmaupetit left a comment

lebaudantoine and others added 8 commits January 23, 2024 09:08
Updated Json attributes to our custom type JsonField.
It makes creation, manipulation, and serialization
of ExperienceCreate instances easier.

The IRI serves as a unique identifier for experiences, acting as a
natural primary key. Enhance developer experience by introducing a new
endpoint dedicated to retrieving an experience based on its IRI.

Added:
* Client Protocol to declare any custom async HTTP client.
* User-friendly interfaces for performing CRUD operations on the XI.
* Async HTTP client for the XI.

Overall, improve the developer experience by introducing these clients,
designed to be simple.
Please note that Delete operations on Experiences and Relations
are not yet implemented in the API.

Added:
* Factories to create indexers and data sources.
* Mixin class to build LangString-inspired dictionaries.
* Interfaces to create executable ETL indexer classes.
* Interface to create an LMS client.

Added tools to index from Moodle:
* available courses
* available modules from a given course

These indexers don't precisely handle course and module lifecycles:
a course or module deleted from Moodle won't be archived in
the XI.

Allow the ETL tool to be run from a runner script.
This script needs to be connected with a CLI.

Database logs are useful for debugging but not used 90% of the time.

httpx_mock does not mock httpx requests to localhost (due to the
non_mocked_hosts fixture). We need to adapt to it and change the base
server URL for the XI endpoints we need to mock.
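
The client Protocol and CRUD interface described in the commit messages above could be sketched roughly as follows. This is an assumption-laden illustration, not the merged code: `HTTPClient`, `FakeHTTPClient`, `ExperienceClient`, the `/experiences/` path, and the `iri` query parameter are hypothetical, since the PR page does not show the real routes or signatures. The fake client keeps everything in memory so the example runs without a server.

```python
import asyncio
from typing import Any, Dict, Optional, Protocol


class HTTPClient(Protocol):
    """Hypothetical protocol: any object with these async methods qualifies."""

    async def get(self, url: str, params: Dict[str, str]) -> Optional[Dict[str, Any]]:
        ...

    async def post(self, url: str, json: Dict[str, Any]) -> Dict[str, Any]:
        ...


class FakeHTTPClient:
    """In-memory stand-in that stores posted payloads keyed by their IRI."""

    def __init__(self) -> None:
        self.store: Dict[str, Dict[str, Any]] = {}

    async def get(self, url: str, params: Dict[str, str]) -> Optional[Dict[str, Any]]:
        return self.store.get(params["iri"])

    async def post(self, url: str, json: Dict[str, Any]) -> Dict[str, Any]:
        self.store[json["iri"]] = json
        return json


class ExperienceClient:
    """Small CRUD-style wrapper; the route and parameter names are assumptions."""

    def __init__(self, http: HTTPClient, base_url: str) -> None:
        self._http = http
        self._url = f"{base_url}/experiences/"

    async def create(self, experience: Dict[str, Any]) -> Dict[str, Any]:
        return await self._http.post(self._url, json=experience)

    async def get_by_iri(self, iri: str) -> Optional[Dict[str, Any]]:
        return await self._http.get(self._url, params={"iri": iri})


async def demo() -> Optional[Dict[str, Any]]:
    client = ExperienceClient(FakeHTTPClient(), "http://xi.example.org")
    await client.create({"iri": "uuid://course-1", "title": {"en": "Intro"}})
    return await client.get_by_iri("uuid://course-1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because `HTTPClient` is a structural protocol, the fake client needs no inheritance, and a real async HTTP client could be swapped in behind the same wrapper.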
@jmaupetit jmaupetit merged commit db25878 into main Jan 23, 2024
27 of 28 checks passed
@jmaupetit jmaupetit deleted the al-crawler branch January 23, 2024 08:15

4 participants