KTO is a training method similar to DPO, originally implemented for LLMs in https://github.com/ContextualAI/HALOs
It can take arbitrary good or bad samples as input, rather than the pairwise preferences required by DPO.
I wonder whether porting it to SD training would make it more user-friendly than Diffusion-DPO (https://github.com/huggingface/diffusers/tree/main/examples/research_projects/diffusion_dpo)?
Imagine collecting a bunch of pictures I like and a bunch I dislike, throwing them into KTO, and getting a LoRA that reflects my personal preferences.
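To illustrate why unpaired samples suffice, here is a minimal sketch of a KTO-style loss in PyTorch. It assumes you already have per-sample log-probabilities under the policy and a frozen reference model; the function name, the simplification of fixing the KL reference point at 0, and the default weights are my own illustrative choices, not the HALOs implementation.

```python
import torch

def kto_loss(policy_logps, ref_logps, labels, beta=0.1, kl_ref=0.0,
             lambda_d=1.0, lambda_u=1.0):
    """Illustrative KTO-style loss on unpaired samples.

    policy_logps / ref_logps: log-probs of each sample under the model
    being trained and under a frozen reference model.
    labels: 1 for desirable ("liked") samples, 0 for undesirable ones.
    kl_ref: reference point (an estimate of the policy/reference KL);
    fixed at 0.0 here for simplicity.
    """
    rewards = policy_logps - ref_logps  # implicit per-sample reward
    desirable = labels.bool()
    # Kahneman-Tversky-style value: push liked samples above the
    # reference point and disliked samples below it.
    v = torch.where(
        desirable,
        torch.sigmoid(beta * (rewards - kl_ref)),
        torch.sigmoid(beta * (kl_ref - rewards)),
    )
    # Separate weights let you rebalance unequal numbers of
    # liked vs. disliked samples.
    weights = torch.where(
        desirable,
        torch.full_like(rewards, lambda_d),
        torch.full_like(rewards, lambda_u),
    )
    return (weights * (1.0 - v)).mean()
```

Note that each sample contributes to the loss on its own, via its label, so no pairing step is needed; for SD training the log-probs would presumably come from the denoising objective, as in the Diffusion-DPO example.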