Improve handling of JabRef embedding models #12240

Open
InAnYan opened this issue Nov 26, 2024 · 1 comment
Labels
component: ai Related to AI Chat/Summarization

Comments


InAnYan commented Nov 26, 2024

Is your suggestion for improvement related to a problem? Please describe.

Currently, JabRef provides a way to view and download embedding models from the DJL ModelZoo. JabRef also stores the size of each embedding model.

However, this is poorly implemented (another dark chapter of my GSoC project).

The problems are:

  1. The list of models is not updated automatically. In fact, it is hard-coded.
  2. The model sizes in this list are not calculated properly.
  3. There is no way to see which models have already been downloaded.
  4. There is no way to delete old or unused models.
  5. Embedding models access the Internet without the user's consent (agreeing to use AI features is not the same as agreeing to any Internet connection). There should be a way to download a model on one computer and then transfer it to another. This raises open questions: what needs to be downloaded? Where is it stored? Which files must be transferred? Where do they go in JabRef?

Describe the solution you'd like

  1. Provide an up-to-date list of available models using the actual DJL API.
  2. Add the ability to download a model.
  3. Add the ability to select a model to use in AI features.
  4. Provide a way to list downloaded models.
  5. Provide the ability to delete a downloaded model.
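
To make items 4 and 5 (and the size calculation problem) more concrete, here is a minimal sketch using only `java.nio.file`. It assumes that every downloaded model lives in its own sub-directory of a single root folder. That layout, the class name `LocalModelStore`, and all method names are hypothetical; this is not how JabRef or DJL currently store models, so the real cache structure would need to be checked first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of JabRef or DJL. */
final class LocalModelStore {
    // Assumption: one sub-directory per downloaded model directly under this root.
    private final Path root;

    LocalModelStore(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Names of all model directories under the root, sorted alphabetically. */
    List<String> listDownloaded() throws IOException {
        if (!Files.isDirectory(root)) {
            return List.of();
        }
        try (Stream<Path> children = Files.list(root)) {
            return children.filter(Files::isDirectory)
                           .map(p -> p.getFileName().toString())
                           .sorted()
                           .collect(Collectors.toList());
        }
    }

    /** Size on disk in bytes: the sum of all regular files below the model directory. */
    long sizeInBytes(String modelName) throws IOException {
        try (Stream<Path> files = Files.walk(resolve(modelName))) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    /** Deletes the model directory recursively, children before parents. */
    void delete(String modelName) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(resolve(modelName))) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Maps a model name to its directory and rejects names that leave the root. */
    private Path resolve(String modelName) {
        Path dir = root.resolve(modelName).normalize();
        if (dir.equals(root) || !dir.startsWith(root)) {
            throw new IllegalArgumentException("Invalid model name: " + modelName);
        }
        return dir;
    }
}
```

Measuring the size from the files on disk avoids relying on a stored number that may be out of date. A preferences page could call `listDownloaded()` to fill a table and `sizeInBytes(...)` for each row.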

Additionally, there should be a way to download a model in advance, e.g. download the model on one computer, then transfer it to another and install it in JabRef.
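
One possible shape for the transfer workflow is to pack a model directory into a single zip file on the first computer and unpack it on the second. The sketch below uses only `java.util.zip`. The class `ModelArchive` and its methods are hypothetical, and it assumes a model is fully described by the files in one directory, which still has to be verified against what DJL actually stores.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper, not part of JabRef or DJL. */
final class ModelArchive {
    private ModelArchive() {
    }

    /** Packs every regular file below modelDir into zipFile (zipFile must be outside modelDir). */
    static void exportModel(Path modelDir, Path zipFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(modelDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            for (Path file : files) {
                // Zip entry names use '/' as separator on every platform.
                String name = modelDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    /** Unpacks zipFile into targetDir, refusing entries that would escape targetDir. */
    static void importModel(Path zipFile, Path targetDir) throws IOException {
        Path target = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(target);
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = target.resolve(entry.getName()).normalize();
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

With something like this, the UI would only need an "Export model" and an "Import model" button, and the user would never have to know which individual files belong to a model.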

Additional context

It seems there are some useful methods in DJL, though they are not documented thoroughly (https://javadoc.io/doc/ai.djl/api/latest/ai/djl/repository/zoo/ModelZoo.html#listModels()). I couldn't quickly figure out how to combine local (downloaded) models with remote ones, but that is probably just a matter of spending more time on it.

Thi Lo also found a link to the models' metadata (https://mlrepo.djl.ai/model/nlp/text_embedding/ai/djl/huggingface/pytorch/models.json.gz), which should be enough for building the list.
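
As a rough illustration of how that metadata file could be read without going through DJL, the sketch below downloads the gzip file with the JDK's `java.net.http.HttpClient` and decompresses it into a JSON string. The class `ModelMetadataFetcher` and its method names are hypothetical, UTF-8 encoding is an assumption, and the structure of the JSON is not parsed here because I have not verified it. Since this opens a network connection, it should only run after the user has explicitly agreed to it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper, not part of JabRef or DJL. */
final class ModelMetadataFetcher {
    // URL taken from this issue.
    static final URI MODELS_JSON_GZ = URI.create(
            "https://mlrepo.djl.ai/model/nlp/text_embedding/ai/djl/huggingface/pytorch/models.json.gz");

    private ModelMetadataFetcher() {
    }

    /** Decompresses gzip bytes into text (assumes the JSON is UTF-8 encoded). */
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Downloads the metadata file and returns the raw JSON text. */
    static String fetchModelsJson(HttpClient client) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(MODELS_JSON_GZ).GET().build();
        HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return gunzip(response.body());
    }
}
```

A caller would use `ModelMetadataFetcher.fetchModelsJson(HttpClient.newHttpClient())` and pass the result to whatever JSON library the project already uses.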

This is not an easy issue, since it requires designing a usable UI. However, the need for it is not up for debate, so I posted it here.

Maybe introduce an "Available models" section in the AI preferences with a "+" button that opens a dialog for choosing either a remote embedding model or a local one.

@InAnYan InAnYan added the component: ai Related to AI Chat/Summarization label Nov 26, 2024

InAnYan commented Nov 26, 2024

Oliver found an interesting UI for a download dialog (https://forum.image.sc/t/trouble-getting-gpu-to-work-with-instanseg-qupath/102042/26):

Image

Projects
Status: Normal priority