[GIP] Integrate Apache Superset (dashboarding software) into geOrchestra #10
Comments
Looks good to me ! |
For illustration, Superset could come with out-of-the-box statistics like the ones below (linked to https://github.com/georchestra/analytics/): number of unique users per week; users by year; type of OGC service consumption per day. |
I had a first look last week, and it shouldn't be that hard once (sorry for being technical in a GIP, but that's how I work...):
- it can be installed from PyPI, and this method is supported upstream
- some bits could be taken from https://github.com/onaio/ansible-superset/, but that's unmaintained and way too configurable; we can have something much simpler
All that to say, +1 for me. The real value is not in 'having the tool' but in 'providing sample dashboards that replace analytics', like the ones @MaelREBOUX showed. |
Right now, it reads the roles from the Console, filters them to keep only the ROLE_SUPERSET_SOMETHING roles, and checks Superset's roles for a matching Something role (case-insensitive). Seems to work well. Question: I'm considering also adding support based on the user's organization. For instance, if user A belongs to org Geo2france, it would also try to match a possible O_Geo2france role in Superset. Is it relevant? Interesting? Overkill? The idea is to avoid overloading the Console with many roles that can be inferred in another way. The drawback is that it creates a mapping that is not obvious at first look. I can also make this optional. |
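The matching described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual integration code: the function name `match_superset_roles` and the `O_` organization prefix handling are hypothetical, and the role lists are passed in as plain strings rather than read from the Console or from Superset.

```python
from typing import Iterable, List, Optional

ROLE_PREFIX = "ROLE_SUPERSET_"  # prefix mentioned in the comment above
ORG_PREFIX = "O_"               # assumed prefix for organization-derived roles


def match_superset_roles(
    console_roles: Iterable[str],
    superset_roles: Iterable[str],
    org: Optional[str] = None,
) -> List[str]:
    """Return the Superset role names that match the user's Console roles."""
    # Index Superset roles by lowercase name for case-insensitive lookup.
    by_lower = {name.lower(): name for name in superset_roles}

    # Keep only ROLE_SUPERSET_* roles and strip the prefix.
    candidates = [
        role[len(ROLE_PREFIX):]
        for role in console_roles
        if role.upper().startswith(ROLE_PREFIX) and len(role) > len(ROLE_PREFIX)
    ]
    # Optional organization-based candidate, e.g. O_Geo2france.
    if org:
        candidates.append(ORG_PREFIX + org)

    matched: List[str] = []
    for candidate in candidates:
        name = by_lower.get(candidate.lower())
        if name is not None and name not in matched:
            matched.append(name)
    return matched
```

Leaving `org` unset keeps the behaviour limited to the explicit Console roles, which is one way the organization mapping could stay optional.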
Very useful and long-awaited evolution, thanks. Some considerations:
|
coming back to this, i have a start of ansible role to deploy it, and having it behind a subdir looks.. complicated without going through gory hacks, and unsupported upstream. Cf apache/superset#24823, and https://github.com/komoot/superset-reverse-nginx-example among others. how do you have it running ? |
Yup. That was a blind spot when I estimated the work to do. I considered it such a basic feature that I forgot to check. There is a PR that has a good chance of getting into master, has quite some community support, and upstream devs seem OK to consider it, but it is still a bit young to be certain: apache/superset#30134. I'm betting on it and will support it. As a first step (in the next few months), we will probably have to use a fork if we want this feature. I'm also investigating the code. Flask comes with a blueprints module, which is specifically made to support this use case, but the way Flask AppBuilder implements it is a bit weird. And the Superset folks, well, I'm not sure they remember they had this. Anyway, the JS frontend does not support it. And most of the backend doesn't seem to either. The alternative is to use a separate subdomain (I personally don't favour this). |
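For readers unfamiliar with the problem, here is a generic sketch of what serving a Python web app under a sub-path usually involves on the backend side: a small WSGI middleware that moves the prefix from `PATH_INFO` to `SCRIPT_NAME`. The class name `PrefixMiddleware` is hypothetical, and this is not what the upstream PR does. As noted above, it would not be enough for Superset on its own, since the JS frontend and much of the backend do not honour a base path.

```python
class PrefixMiddleware:
    """Serve a WSGI app under a URL prefix (generic sketch, not Superset-specific)."""

    def __init__(self, app, prefix):
        self.app = app
        # Normalize to a single leading slash and no trailing slash.
        self.prefix = "/" + prefix.strip("/")

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == self.prefix or path.startswith(self.prefix + "/"):
            # Move the prefix to SCRIPT_NAME so the app can build correct URLs.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + self.prefix
            environ["PATH_INFO"] = path[len(self.prefix):]
            return self.app(environ, start_response)
        # Requests outside the prefix are not handled by the wrapped app.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
```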
looking at the PR, i see that changing the base path requires 'rebuilding' the JS frontend.. oh well. modern web dev.. |
Well, yes, that's one of the things I'd like to act on. I don't see why the frontend couldn't read a config file provided by the backend, instead of rebuilding based on a hard-coded path... But that won't be a short-term feature ;o). |
Noted. Makes sense to me. This is authorization config though (done through roles), so it might vary depending on the platform. Not really an integration topic IMO.
Didn't know about that, but I'd say it is covered in https://superset.apache.org/docs/security/#content-security-policy-csp, isn't it ?
Auth with headers is geOrchestra's basic behaviour. For me, this is implied in "integration in geOrchestra". Using OpenID would rather be "deploy alongside geOrchestra". The main benefit I see is being able to assign roles to users in the geOrchestra console, like for the other integrated apps. |
I voted : +1 |
+1 ;o) |
+1 too |
+1 |
👍 |
+1 |
Since this GIP is kind of a big evolution, we shall clarify its compatibility with the manifest "geOrchestra is a spatial data infrastructure project whose founding characteristics are: free, modular, interoperable. This project is community driven."
Note: analytics (another GIP) and this GIP are connected, since analytics will use Superset. This is not a problem for modularity: analytics can depend on Superset, while the admin remains free to choose another dataviz system for regular data. |
Great GIP, I vote +0 just because I never took the time to compare Superset with the other choices, but I'm sure you did :) |
Thank you all. I'm going to create a git repo for this feature, that will contain documentation and custom code. I want to name the repo after the path that we will give to access superset visualizations. This is something that was already a bit discussed with @landryb when he was investigating deployment with ansible. We went for Important detail about this: for now, superset does not allow for a dynamic change of this path. changing the path means rebuilding the JS frontend. So it's probably better if we settle on a name/path that is common to all geOrchestra instances. |
LGTM :-) |
i would then expect a repo named |
That will probably come when I deal with analytics. In my mind, the repo |
But then, as it is an open source project, you'll of course be welcome to contribute dashboard demos/samples ;o) |
Proposing |
Also taken ! See https://github.com/apache/superset/blob/master/superset-frontend/src/views/routes.tsx#L205 What about Or, since this kind of dataviz is also called "Business Intelligence", |
OMG, now I understand why they usually take a full domain. |
Or |
Some naming ideas that I had:
(Sometimes the name of a project doesn't need to relate directly to what people think it is supposed to be; sometimes an easy name that sounds good will stick better in people's minds than a complicated name that describes exactly what the project does.) |
Who ?
pi-Geosolutions ([email protected]) with funding from geo2france
Target Module
This will be an additional, optional module
What ?
Apache Superset is a tool for building all kinds of dashboards. What we call a "Business Intelligence" software. It is a Python Flask application.
This proposal is about integrating Superset in geOrchestra:
Superset uses a concept of roles to determine what a registered user can access. This will connect nicely with our roles system. Superset-specific roles will still be managed inside Superset (to determine, for a given role, what it gives access to), but the mapping of which roles are assigned to a given user will be managed in the Console, like for the other apps. It will behave mostly like what we already have in GeoServer, for instance.
Why ?
We don't really have a BI tool right now. Possibly by making dashboards in MapStore, but this is a bit of a stretch when working on data that has no geospatial aspect. And I believe Superset is much richer in what it can do.
Adding an optional BI tool seems like a good asset for geOrchestra. Some platforms are already using it (geo2france, but also geobretagne making experiments in integrating superset and mviewer).
Also, Superset is seriously considered for the visualization part of the new Analytics module, to come soon.
How ?
Superset, for user management, relies on Flask_appbuilder, which supports HTTP remote user mode. It is possible to add some custom logic, which should allow us to also retrieve the roles from the HTTP headers and update accordingly (live) the user's profile.
It should (should) be a fairly simple integration. It should not impact the rest of the platform in any way (except, of course, adding a route in the gateway). Even on Superset's side, the implementation will most likely not touch the core app, but live in a few "config" files that can be shipped alongside the core app, ensuring simple maintenance.
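As a rough sketch of the custom logic mentioned above, the snippet below parses the identity headers and computes which roles to add or remove so the profile follows the headers on each request. The header names (`sec-username`, `sec-roles`), the `;` separator, and the helper names are assumptions made for illustration and should be checked against the gateway configuration. The wiring into Flask AppBuilder's security manager is deliberately left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Optional, Set, Tuple

# Assumed header names and separator (not confirmed by this proposal).
USERNAME_HEADER = "sec-username"
ROLES_HEADER = "sec-roles"
ROLES_SEPARATOR = ";"


@dataclass(frozen=True)
class RemoteUser:
    username: str
    roles: FrozenSet[str]


def parse_remote_user(headers: Mapping[str, str]) -> Optional[RemoteUser]:
    """Build a RemoteUser from request headers, or return None if anonymous."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    username = lowered.get(USERNAME_HEADER, "").strip()
    if not username:
        return None
    raw_roles = lowered.get(ROLES_HEADER, "")
    roles = frozenset(r.strip() for r in raw_roles.split(ROLES_SEPARATOR) if r.strip())
    return RemoteUser(username, roles)


def diff_roles(current: Set[str], wanted: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_remove) needed to bring `current` in line with `wanted`."""
    return set(wanted) - set(current), set(current) - set(wanted)
```

Computing a diff rather than overwriting the role list makes it possible to skip the update entirely when nothing has changed, which keeps the per-request cost low.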
Any potential pitfalls and ways to circumvent them ?
Apart from adding more complexity in the whole geOrchestra architecture, I don't see any.
Oh, yes, one: Superset's doc now mostly deals with a dockerized deployment. There is very little doc on how to deploy it without docker. But I got it working, we should be able to provide an Ansible deployment without too much hassle (I might need a bit of help though, since I don't know much about ansible)
When ?
I plan to have it operational by January 2025. It should be possible to beta-test it around Christmas.
State of the vote: