Hypernetwork training #2284
52 comments · 163 replies
-
I trained an HN for 2500 steps on various Midjourney images to see what would come out of that. Here are two examples of using the trained hypernetwork, with the same seed and two different prompts (images: sd1-4 only, then sd1-4 + midjourney hypernetwork).
-
The wiki page has already been updated with the new info. It recommends 0.000005, or even 0.0000005.
-
Might be interesting to move the hypernetwork selection dropdown to the right of the checkpoint selection dropdown... It would be quicker than fishing for it in the settings.
-
It is not yet clear to me what the hypernetwork is for and how to train it. I'm interested to know how it differs from textual inversion, why there are different sizes, and which size is good for what and when. I am a beginner in this subject.
-
Here are the same prompts as the ones above, but using an HN trained at 0.000005 for 4500 steps instead of 0.00005 at 2500 steps. The effect is more subtle... I would say training at 0.00005 provided results that are closer to what I would expect for a Midjourney style. HNs are really fascinating. I wonder if this is how Midjourney actually applies its "style" on top of other checkpoints... quite possible.
-
How does hypernetwork training speed compare to textual inversion and DreamBooth? And how about VRAM usage?
-
Is an embedding necessary for hypernetwork training, or not?
-
Quick output showing the overall impact a hypernetwork has on the model output. Here, I am prompting for a photo of Tom Cruise with and without the hypernetwork trained on my face. So essentially just like DreamBooth...
-
Testing now with 50 face images and a learning rate of 0.0000005 (below the lowest recommended value).
-
What are the module checkboxes (768, 320, 640, 1280) for when creating a hypernetwork? Edit: OK, I cannot even create a hypernetwork... it gives an error that the .pt file is not found in models/hypernetworks.
-
I have a question: why is the hypernetwork just two linear layers without an activation?
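For anyone wondering what such a module might look like, here is a minimal numpy sketch of the structure as I understand it (two linear layers applied as a residual update to the cross-attention context; the dimensions and the zero-init are illustrative choices, not the exact repo code):

```python
import numpy as np

rng = np.random.default_rng(0)

def hypernetwork_module(x, w1, b1, w2, b2):
    # Two linear layers with no activation in between, applied as a
    # residual update to the cross-attention context.
    return x + ((x @ w1 + b1) @ w2 + b2)

dim = 768                                  # one of the module sizes in the UI
x = rng.normal(size=(77, dim))             # one prompt's context vectors
w1 = rng.normal(size=(dim, dim)) * 0.01
b1 = np.zeros(dim)
w2 = np.zeros((dim, dim))                  # zero-initialised second layer, so
b2 = np.zeros(dim)                         # the fresh module is an identity map

out = hypernetwork_module(x, w1, b1, w2, b2)
print(out.shape, np.allclose(out, x))      # (77, 768) True
```

Note that two linear layers with no nonlinearity between them compose into a single affine map, which is presumably the point of the question; later webui versions added selectable activation functions for hypernetwork layers, if I recall correctly.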
-
Do I understand this correctly: hypernetworks are trained like embeddings, with a very low learning rate and under 5000 steps, and then put in models/hypernetworks as a .pt file? And this might work better than DreamBooth? An answer would be appreciated. This sounds... really good!
-
I get an error message when trying to load my trained HN: "Loading hypernetwork STYLEHN". I trained it in the Textual Inversion tab with the instructions described here and put it in models/hypernetworks.
-
So I trained for 1000 steps on 6 photos of myself and activated the hypernetwork in the settings, but I'm a bit lost as to how to generate photos. Am I using it the same way as an embedding, i.e. "a photo of x", where x is the name of the hypernetwork?
-
When training a hypernetwork, what do I select for "Prompt template file" if I am training for a subject? |
-
I'm going to train a hypernetwork for WD1.3, but their release notes say that the float32 version can only be used for generation, while the full version works for generation and training. For hypernetwork training, should I use only the full version, or is it possible to train using float32 in this case? It uses less VRAM.
-
I have some issues when I train a hypernetwork.
-
@Heathen |
-
There is no option to use DeepBooru for captions in the Train tab for me. Was it removed, or am I missing something?
-
The latest repo has a new "Batch size" option on training. Does anyone know what it does?
-
Just wanted to ask about something I noticed. If I train a network for 2000 steps, look through the saves and find the best one (for instance 1500 steps), and then use that as the base to keep training to, say, 10000 steps, should it continue from 1500 on to 10000? I have noticed that happening, instead of the hypernetwork starting from 0 again. Is this default behaviour?
-
Just a question: is training at 5e-5 for 1000 steps the same as training at 5e-6 for 10000 steps? Will the data trained at 5e-5 lose more resolution, even though the two multiply out to the same total?
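Not in general, though the two can come close when the rate is gentle. A toy gradient-descent sketch (pure illustration, nothing to do with SD's real loss surface) shows both behaviours:

```python
def descend(lr, steps, w=1.0):
    # Toy gradient descent on f(w) = w**2; the gradient is 2*w.
    for _ in range(steps):
        w -= lr * 2 * w
    return w

# Tiny rates: equal lr*steps budgets end up almost identical (~0.905 here).
small_a = descend(5e-5, 1000)
small_b = descend(5e-6, 10000)

# Larger rates: the same equal budget gives visibly different endpoints.
big_a = descend(4e-1, 10)
big_b = descend(4e-2, 100)
print(small_a, small_b, big_a, big_b)
```

On a real, non-convex loss the two schedules will not land in the same place even at tiny rates, so whether the faster one "loses resolution" is an empirical question.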
-
Hi, https://github.com/danielalcalde/stable-diffusion-webui Issue: #2740 (comment) The main idea is to add weight normalization to stop the model weights from exploding so quickly. It would be cool if someone could give it a try. If I get good feedback I will open a pull request to add it to the master branch :)
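For anyone curious what weight normalization means here: the reparameterization splits each weight row into a learned norm g and a direction v, so a gradient step cannot silently blow up the norm through v alone. A minimal numpy sketch (names are mine, not the code from the linked branch):

```python
import numpy as np

def weight_norm(v, g):
    # Reparameterise a weight matrix as w = g * v / ||v||: each row's
    # direction (v) and its norm (g) are learned separately.
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    return g[:, None] * v / norms

v = np.array([[3.0, 4.0], [0.0, 2.0]])   # unnormalised directions
g = np.array([1.0, 5.0])                 # learned per-row norms
w = weight_norm(v, g)
print(np.linalg.norm(w, axis=1))         # [1. 5.] -- row norms pinned to g
```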
-
The training cannot be carried out due to the above error. Please help...
-
In the last few days I've been thinking about how to get img2img to change only the style. If I give it a high denoising strength, the result is completely different from the original image; if I give it a low one, nothing changes.
-
Would somebody please be so kind as to tell me what the bottom input under Learning Rate is for?
-
Hey, I have a problem where the training always cuts off early, and I can't see any info about why. I set it to 2000 steps and it will run for maybe 299 and then stop. I just set this up yesterday, so I'm on the latest versions of everything; I'm on macOS 12.5.1, hypernetwork learning rate 0.00005, save image and save a copy of the embedding every 100 steps. The message just reads "loss:nan" followed by the number of steps. Hope someone can help!
-
Hi folks, if a hypernetwork is just an additional layer for the UNet (have I understood that right?), why can't we train on 1024x1024 px pics? A similar trick has been done with inpainting, hasn't it?
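As I understand the webui implementation (treat the details below as assumptions), the hypernetwork modules rewrite the prompt context that feeds the cross-attention keys and values; image resolution only changes the query side, which the hypernetwork never touches. A toy numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def cross_attention(latents, context, hn=None):
    # Toy single-head cross-attention (learned projections omitted). The
    # hypernetwork, as assumed here, rewrites the prompt context before
    # keys/values are formed; it never sees the latent grid size.
    if hn is not None:
        context = hn(context)
    q, k, v = latents, context, context
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

context = rng.normal(size=(77, 64))        # prompt tokens: fixed count
identity_hn = lambda c: c                  # stand-in for a trained module

out_64 = cross_attention(rng.normal(size=(64 * 64, 64)), context, identity_hn)
out_128 = cross_attention(rng.normal(size=(128 * 128, 64)), context, identity_hn)
print(out_64.shape, out_128.shape)         # (4096, 64) (16384, 64)
```

If that reading is right, it suggests the resolution limit comes from what the base UNet was trained on rather than from the hypernetwork itself.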
-
In order of complexity: TI, HN, LoRA, DB, FT. LoRA has many new subsets now that train more layers, but my heart always belongs to HN, as I found it the best for what I did. Since SDXL we can't train it, but there are so many things we can't do with XL as far as training goes.
-
Starting a discussion where we can exchange hypernetwork training tips and tricks.
One tip I can give is to use a learning rate of 0.00005 for training a hypernetwork... if you use the default 0.005 you will get to NaN very quickly.
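The failure mode behind that NaN can be reproduced in miniature: with a step size past the stable threshold, gradient descent overshoots harder on every update until the numbers overflow. A toy sketch (illustrative only; the rates here have nothing to do with SD's actual loss surface):

```python
import numpy as np

def train(lr, steps, w=1.1):
    # Toy gradient descent on f(w) = w**4. Past the stable step size, each
    # update overshoots harder than the last until the value overflows --
    # which is how a too-high rate ends in "loss: nan".
    w = np.float64(w)
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            w = w - lr * 4 * w**3      # gradient of w**4 is 4*w**3
    return w

stable = train(0.005, 1000)    # small steps: w heads toward the minimum
diverged = train(0.5, 1000)    # 100x larger rate: w blows up to non-finite
print(np.isfinite(stable), np.isfinite(diverged))  # True False
```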
Discoveries:
Questions:
Suggestions/Requests: