training not giving the expected results #1321
Your config doesn't contain

So there can be many different reasons why it doesn't work for you...
I found that the data generation used in EasyOCR may not be efficient enough. Using the images I'm actually working with turned out to be a much more effective way to increase performance. My config file that works for me:
Appreciate your reply and help. I increased the sample to 10K for training and ~4K for validation, also updated the configuration, and started to see better results. What do you think - is there any other room for enhancement to get better results?
I can't really tell how much room there is for improvement. It depends on the model - how much it can learn and memorize. But from my experience, I think it can handle tens of thousands of images, if not hundreds of thousands.
Code I wrote to cut images as described in my method. You have to modify it a bit to make it work, because it's adapted to my project structure.
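For readers without access to that snippet, here is a minimal sketch of this kind of cropping step (not the original script), assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max) and the filename/words labels.csv layout commonly used with the EasyOCR trainer; the annotation list and paths are placeholders:

```python
import csv
from pathlib import Path
from PIL import Image

# Hypothetical annotations: one (image_path, box, text) triple per crop.
# box = (x_min, y_min, x_max, y_max) in pixels.
annotations = [
    ("raw/page_001.png", (120, 40, 360, 90), "Highland"),
    ("raw/page_001.png", (120, 100, 420, 150), "Weaver"),
]

out_dir = Path("train_crops")
out_dir.mkdir(exist_ok=True)

rows = []
for i, (img_path, box, text) in enumerate(annotations):
    # Cut out the labeled region and save it as an individual training image.
    crop = Image.open(img_path).convert("RGB").crop(box)
    crop_name = f"crop_{i:05d}.png"
    crop.save(out_dir / crop_name)
    rows.append((crop_name, text))

# Labels file in the "filename,words" layout expected by the trainer.
with open(out_dir / "labels.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "words"])
    writer.writerows(rows)
```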
Hello, @romanvelichkin. I've been fine-tuning the model specifically to better detect the * symbol, and I've achieved low training and validation losses (around 0.0001 for both). However, I'm facing an issue: even during validation, where the predicted label matches the ground truth, the confidence scores remain consistently low (always below 0.5). I recognize that my dataset is relatively small - 500 images for training and 100 for validation - but I'm actively working to expand it.

My primary concern is that when I use the fine-tuned model's weights, the outputs appear completely random (the pretrained weights I started from are latin_g2.pth). This behavior is puzzling, given that the model has been fine-tuned and the weights were properly saved. My suspicion is that this might be an overfitting issue; however, if that were the case, I would expect the model to at least perform well on images from the validation dataset. Unfortunately, this isn't happening. Do you have any insights into what might be causing these problems or suggestions for how to address them?
So you need the model to have a high confidence score, right? I haven't trained the model that many times, so I can't tell how much data you need to make it more confident. A few suggestions:

- Make sure that the train and val data are somewhat similar and not completely different. A low confidence score can come from that.
- I would advise increasing the input resolution for the scanner (see the sketch after this comment). I personally prefer 2560 - it gives fine results with average inference time. I've checked even bigger resolutions - the model can then detect smaller elements much better, but inference speed drops badly.
- Increase the amount of data - use data generation (https://github.com/Belval/TextRecognitionDataGenerator) and lots of augmentations.
- The models used in EasyOCR are large, so they have to be trained for a long time. Try to increase the number of epochs. Usually you keep training until val accuracy starts getting worse.
- Try to experiment with learning rate and batch size. The golden standard is 32 images per batch. For my tasks I train the EasyOCR recognizer with a batch size of 64.

I didn't get that part: it doesn't show the same result on val data as you got during training?

Keep in mind that even with a well-trained recognizer, EasyOCR may not work properly for small symbols. You also need to train the CRAFT detector to detect those symbols. The detector finds pieces of an image containing text and sends them to the recognizer. If the detector is not trained to find * symbols, it won't send them to the recognizer - no matter how well the recognizer was trained.
Overfitting would mean validation loss starts increasing (or validation accuracy starts decreasing) during training while training loss keeps improving. It doesn't look like your case.
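To make the resolution suggestion above concrete, here is a minimal inference sketch; canvas_size is the readtext argument that caps the image size used internally (2560 is the library default), and the image path is a placeholder:

```python
import easyocr

# Standard English reader; downloads the default detector + recognizer on first run.
reader = easyocr.Reader(["en"])

img = "sample_page.png"  # placeholder path

# Raising canvas_size helps the detector find smaller symbols,
# at the cost of slower inference.
default_res = reader.readtext(img, canvas_size=2560)
high_res = reader.readtext(img, canvas_size=3840)

# readtext returns (bounding box, text, confidence) tuples.
for box, text, conf in high_res:
    print(f"{conf:.2f}  {text}")
```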
Allow me to clarify: I fine-tuned a model starting from the latin_g2.pth weights (let's call the resulting model fine_tuned.pth). When using the latin_g2.pth weights for inference, the results are generally good, although some symbols are not detected correctly. However, when using the fine-tuned weights (fine_tuned.pth), the output becomes significantly worse. Here's an example - ground truth: Highland " Weaver Overpowered; using fine_tuned.pth: (output not shown). For the fine-tuning process, I used the dataset referenced in the EasyOCR repository as a test. However, the results are still not satisfactory. Below is the configuration file (config.yaml) I used for fine-tuning (model architecture - Transformation: 'None'). Let me know your thoughts on this.
I think the problem appears because you're fine-tuning for *, but expecting good results for text. The recognizer is over-tuned on your specific dataset.
2.1. Check how fast you reach high accuracy on val data - if you reach 90% in just a few epochs and then train for 1000 more epochs just to gain another 2%, this can lead not to fine-tuning but to completely re-tuning the model, especially if your dataset is small. The model is pretty big; it will learn 500 images very fast.
2.2. Try training with a reduced learning rate and fewer epochs.
2.3. Train it on all your data, not just the * symbols.
2.4. Increase the dataset.
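It may also be worth double-checking how fine_tuned.pth is loaded at inference time. A rough sketch following EasyOCR's custom-model convention (a user_network directory holding a .yaml and .py with the same base name as the weights) is below; the directory layout and file names are assumptions, not taken from this thread:

```python
import easyocr

# Assumed layout (names are placeholders):
#   EasyOCR_custom/model/fine_tuned.pth
#   EasyOCR_custom/user_network/fine_tuned.yaml
#   EasyOCR_custom/user_network/fine_tuned.py
reader = easyocr.Reader(
    ["en"],
    recog_network="fine_tuned",  # must match the .pth/.yaml/.py base name
    model_storage_directory="EasyOCR_custom/model",
    user_network_directory="EasyOCR_custom/user_network",
)

# Run the fine-tuned recognizer on a validation image (placeholder path)
# and inspect the confidence scores it reports.
for box, text, conf in reader.readtext("val_sample.png"):
    print(f"{conf:.2f}  {text}")
```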
Hi,
I'm trying to train EasyOCR on new Arabic fonts. We created a dataset of 1,200 images with labels. After training, I used the new model to check some images and the results are very poor. The yaml file, a sample of the images, and the labels are attached.
Can anyone help with this or guide me if I'm doing something wrong?
log_dataset.txt
log_train.txt
opt.txt
easyOCr.zip