
How to improve accuracy of gcnn? #6

Open
yingyingll opened this issue Mar 31, 2020 · 5 comments
Labels
question Further information is requested

Comments

@yingyingll

yingyingll commented Mar 31, 2020

Hi,
I am running the code on a combinatorial optimization problem that I defined myself, which is similar to the Capacitated Facility Location problem. However, the accuracy of the GCNN is low: acc@10 is about 45%. Which GCNN parameters should I consider tuning first? Training usually stops at around 140 epochs, after 20 epochs with no improvement on validation.

Thank you.

@gasse
Member

gasse commented Mar 31, 2020

Hello YingYing,

It's hard to say beforehand what will work best for you... I would say the learning rate can make a big difference. Also, you should monitor the training / validation losses. If training stops because the validation loss goes up while the training loss keeps going down, then you are probably overfitting, i.e. not using enough training data. If you don't have access to more training data (few instances), then maybe try to regularize your model a bit. Those are my two cents. Hope it helps!
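For illustration, here is a minimal sketch of that kind of monitoring with patience-based early stopping. Note that train_epoch() and evaluate() are assumed placeholders for your own GCNN training and validation passes, not functions from this repo:

```python
# Hypothetical sketch: patience-based early stopping driven by the validation
# loss. train_epoch() and evaluate() are assumed placeholders for your own
# training / validation code, not functions from this repository.
import math

PATIENCE = 20            # stop after this many epochs without improvement
best_val_loss = math.inf
bad_epochs = 0

for epoch in range(1000):
    train_loss = train_epoch()   # assumed helper: one pass over the training set
    val_loss = evaluate()        # assumed helper: loss on the validation set
    print(f"epoch {epoch}: train loss {train_loss:.4f}, valid loss {val_loss:.4f}")

    # Training loss falling while validation loss rises is the classic
    # overfitting signature: more data or stronger regularization is needed.
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        bad_epochs = 0
    else:
        bad_epochs += 1
        if bad_epochs >= PATIENCE:
            print("early stopping")
            break
```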

@yingyingll
Author

yingyingll commented Apr 2, 2020

Hi gasse,

Thanks a lot. I changed the learning rate and other hyperparameters, but it doesn't help. Both the training and validation accuracies stop improving after several epochs, and the accuracy is lower than that of competitors like SVMrank. I have two questions: 1. Is the GCNN model designed in the paper sensitive to the size of the problem? Maybe the problem I defined is much too big. 2. When I collect samples from the instances, I often collect 0 samples from certain instances, maybe 50% of the time, even though all the instances can be solved by SCIP normally. I would like to know why.
Thank you for your help.

@gasse
Member

gasse commented Apr 7, 2020

Dear YingYing,

In my experience no, the GCNN model is not sensitive to the graph size, at least within the range considered in our experiments. However, the fact that you often collect 0 samples from your training instances may indicate that your problems do not require much branch-and-bound to be solved, so learning to branch will probably have little impact here. I suggest that you display the SCIP statistics on your instances to try and figure out what is going on.
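For example, a minimal sketch of how to inspect those statistics with PySCIPOpt (the interface this repo uses); "instance.lp" below is just a placeholder path:

```python
# Minimal sketch: solve one instance and print SCIP's statistics with PySCIPOpt.
# "instance.lp" is a placeholder path to one of your own instances.
from pyscipopt import Model

m = Model()
m.readProblem("instance.lp")
m.optimize()

print("status       :", m.getStatus())
print("solving time :", m.getSolvingTime())
print("B&B nodes    :", m.getNNodes())  # very few nodes => little branching to learn from
m.printStatistics()                     # full SCIP statistics report
```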

Best,
Maxime

@gasse gasse added the question Further information is requested label Apr 7, 2020
@yingyingll
Author

Hello gasse,

Thanks for your reply! That helps. I changed my problem and the accuracy improved. I have two more questions from reading your work. 1. As mentioned in the paper, have you experimented with reinforcement learning for learning to branch? I wonder if the results are better. 2. In 02_generate_dataset.py, in the SamplingAgent class, does node_record_prob = 0.05 mean that when an instance is solved, the solver branches with 'vanillafullstrong' with probability 0.05 and with 'pscost' with probability 0.95, and that only the nodes branched with 'vanillafullstrong' are saved as samples? Is my understanding right? If so, I wonder why you don't set node_record_prob higher to get more samples from one instance.

Thanks!

@gasse
Member

gasse commented Jul 24, 2020

Hello,

It is maybe a bit late, but here is an answer. Yes, we managed to improve a little bit via RL, but simply put, RL is hard :P Regarding node_record_prob, you are right, this is indeed how it works. The idea is that 'pscost' is not a strong expert and will drift away from the optimal trajectories of the expert 'vanillafullstrong' (assuming the expert is an optimal policy). This partially fixes a potential distribution shift, which is a typical problem when using behavioral cloning. See, e.g., the work of Stéphane Ross on this topic (http://www.cs.cmu.edu/~sross1/publications/ross_phdthesis.pdf).
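In simplified form, a hypothetical sketch of what node_record_prob controls (the real logic lives in SamplingAgent in 02_generate_dataset.py; the function and variable names below are illustrative only):

```python
# Simplified, hypothetical sketch of the sampling scheme controlled by
# node_record_prob; the actual implementation is SamplingAgent in
# 02_generate_dataset.py.
import numpy as np

node_record_prob = 0.05
rng = np.random.RandomState(seed=0)

def pick_branching_rule():
    """Return (branching rule, whether to record a sample) for the current node."""
    if rng.uniform() < node_record_prob:
        # Query the expensive expert and record its scores as a training sample.
        return "vanillafullstrong", True
    # Otherwise follow the cheap default rule; nothing is recorded.
    return "pscost", False
```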

Best
