Various tasks where LLM might help #6145

Open
nicolas-raoul opened this issue Jan 19, 2025 · 4 comments
Assignees
Labels
enhancement gsoc Google Summer of Code

Comments

@nicolas-raoul
Member

nicolas-raoul commented Jan 19, 2025

  • help find the correct category name, for example by suggesting "Cuisine of Japan" when "Japan food" is typed
  • fix obvious typos in depiction search
  • detect unwanted nudity pictures from the caption
  • detect meaningless captions such as PXL1234, etc.
  • show translations of captions etc. that are not available in the user's languages. Either show the translation next to the original, or replace the original when the "AI translation" button is shown. Better than OS-level translation because we know the source language.
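
As a point of comparison for the "meaningless captions" item, here is a minimal rule-based sketch in Python (not the app's code). The prefix list, the extension list, and the function name `looks_meaningless` are all assumptions made for illustration; an LLM would only be needed for the cases such rules miss.

```python
import re

# Assumed camera/file-name prefixes; this list is illustrative, not taken from the app.
_CAMERA_NAME = re.compile(
    r"^(PXL|IMG|DSCN|DSCF|DSC|MVIMG|VID|P)[_-]?\d{3,}([_-]\d+)*$",
    re.IGNORECASE,
)
# Assumed set of image extensions that may trail a pasted file name.
_EXTENSION = re.compile(r"\.(jpe?g|png|heic|webp)$", re.IGNORECASE)


def looks_meaningless(caption: str) -> bool:
    """Return True when the caption looks like a camera-generated name."""
    text = _EXTENSION.sub("", caption.strip())
    if not text:
        return True
    if _CAMERA_NAME.match(text):
        return True
    # A caption without any letter (digits and punctuation only) describes nothing.
    return not any(ch.isalpha() for ch in text)
```

For example, `looks_meaningless("PXL1234")` is true while `looks_meaningless("Cuisine of Japan")` is false. The letter check relies on `str.isalpha`, so short captions in non-Latin scripts are not flagged.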

Feel free to comment if you want to add more, thanks! :-)
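
For the "obvious typos in depiction search" item, a non-LLM baseline could be plain fuzzy matching against labels the app already knows. The sketch below uses Python's standard `difflib`; the function name `suggest_labels`, the sample labels, and the 0.8 cutoff are assumptions for illustration only.

```python
import difflib


def suggest_labels(query, known_labels, limit=3, cutoff=0.8):
    """Return known labels that closely match a possibly misspelled query."""
    # Compare case-insensitively, but return the labels in their original casing.
    by_lowercase = {label.lower(): label for label in known_labels}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lowercase), n=limit, cutoff=cutoff
    )
    return [by_lowercase[hit] for hit in hits]


labels = ["Eiffel Tower", "Tokyo Tower", "Mount Fuji"]
print(suggest_labels("Eifel Tower", labels))
```

This only catches spelling slips. A query like "Japan food" shares too few characters with "Cuisine of Japan" to match, which is where a language model could add value.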

@nicolas-raoul nicolas-raoul added enhancement gsoc Google Summer of Code labels Jan 19, 2025
@nicolas-raoul nicolas-raoul self-assigned this Jan 19, 2025
@Thejas775

An addition? Maybe check the caption against the image.

@nicolas-raoul
Member Author

> Maybe check the caption with image

Indeed great idea! All Android device-embedded models so far seem to be text-to-text, but that could change in the future.

@Thejas775

Thejas775 commented Jan 23, 2025

> > Maybe check the caption with image
>
> Indeed great idea! All Android device-embedded models so far seem to be text-to-text, but that could change in the future.

Yeah, maybe in the future. For now, very few devices support running a model on-device.
I also have a question: why are we not considering API calls?

@whym
Collaborator

whym commented Jan 23, 2025

I think API calls to a Wikimedia-controlled Toolforge server would be fine from a privacy perspective. (We already call Wikimedia servers, including a Toolforge server, so more calls to Wikimedia servers would be fine.) However, whether Toolforge will host a large vision-language model would depend on their policy and is still unclear (https://phabricator.wikimedia.org/T336905).
