Investigate cloud and on-prem MLOps framework #116

Open

aashish24 opened this issue Dec 9, 2024 · 1 comment

Comments

@aashish24
Collaborator

No description provided.

@BryonLewis
Collaborator

BryonLewis commented Dec 19, 2024

USGS NABat

Currently Understood Process:

  1. The BatAI application lives on NABat servers within their own AWS infrastructure.
  2. A user can launch the BatAI application and it appears as a different page. NABat passes the authentication information (JWT) to BatAI, along with the current file/project/waveform being viewed.
  3. BatAI uses this token and the NABat GraphQL API to retrieve information about the waveform. This includes metadata such as the GUANO metadata, as well as a presigned S3 URL for the waveform itself.
  4. BatAI then takes the presigned S3 URL and downloads the waveform to create the spectrograms and possibly run ML inference for the type.
  5. The user can then view the spectrogram and make a decision about the current annotation. Once this annotation is created, there is a button that allows the user to 'Push Data to NABat', which makes a GraphQL call to update the Manual Annotation Id for the file.
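The data-exchange steps above (3–5) could be sketched roughly as follows. Note this is a hypothetical illustration: the GraphQL endpoint URL, query/mutation names, and field names are assumptions, not the actual NABat schema.

```python
# Hypothetical sketch of steps 3-5. Endpoint, query, and field names are
# assumptions for illustration, not the real NABat GraphQL schema.
import requests

NABAT_GRAPHQL_URL = "https://example.nabat.gov/graphql"  # placeholder endpoint


def build_push_payload(file_id: int, manual_annotation_id: int) -> dict:
    """Build the GraphQL payload for 'Push Data to NABat' (step 5).

    The mutation name and arguments are hypothetical.
    """
    mutation = """
    mutation ($fileId: Int!, $annotationId: Int!) {
      updateFile(id: $fileId, manualAnnotationId: $annotationId) { id }
    }
    """
    return {
        "query": mutation,
        "variables": {"fileId": file_id, "annotationId": manual_annotation_id},
    }


def fetch_waveform_info(jwt_token: str, file_id: int) -> dict:
    """Use the NABat-issued JWT to request metadata and a presigned S3 URL (step 3)."""
    query = """
    query ($fileId: Int!) {
      acousticFile(id: $fileId) { guanoMetadata presignedUrl }
    }
    """
    resp = requests.post(
        NABAT_GRAPHQL_URL,
        json={"query": query, "variables": {"fileId": file_id}},
        headers={"Authorization": f"Bearer {jwt_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["data"]["acousticFile"]


def download_waveform(presigned_url: str, dest_path: str) -> None:
    """Download the waveform through the presigned URL (step 4); no auth header needed."""
    with requests.get(presigned_url, stream=True, timeout=60) as r:
        r.raise_for_status()
        with open(dest_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
```

The presigned URL carries its own temporary credentials in the query string, which is why the download step needs no Authorization header.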

Cloud Deployment:

Currently BatAI runs as Docker containers, using Celery for async tasks with RabbitMQ as the message broker to create the spectrograms and run inference.

NABat utilizes AWS Lambda functions for their async tasks.

Options for BatAI:

  • Convert functions to use AWS Lambda - Some of the local development setup would need to be reworked to make development a bit easier. There is an option for creating an AWS Lambda function from a container image. This method could possibly be used with boto3 in Django to call the function, and the function would produce the results and log them.
  • AWS Fargate - a serverless way to run containers. I believe this was used in the RD-WATCH deployment to AWS for Celery tasks.
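The boto3-from-Django idea in the first option could be sketched as below. The Lambda function name and payload shape are hypothetical; only the `boto3` `invoke` call itself is real API.

```python
# Sketch of invoking a (hypothetical) container-image Lambda from Django via boto3.
import json


def build_invoke_payload(file_id: int) -> bytes:
    """JSON payload handed to the Lambda; the key name is a guess."""
    return json.dumps({"fileId": file_id}).encode()


def run_inference_lambda(file_id: int, region: str = "us-west-2") -> dict:
    """Synchronously invoke the inference Lambda and return its decoded response."""
    import boto3  # imported lazily so build_invoke_payload works without AWS deps

    client = boto3.client("lambda", region_name=region)
    resp = client.invoke(
        FunctionName="batai-inference",    # hypothetical function name
        InvocationType="RequestResponse",  # use "Event" for fire-and-forget
        Payload=build_invoke_payload(file_id),
    )
    return json.loads(resp["Payload"].read())
```

Using `InvocationType="Event"` instead would mirror Celery's fire-and-forget dispatch, at the cost of needing some other channel (logs, a callback, or polling) to collect results.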
