After carefully studying the example code for the multi-armed bandit in Chapter 6, I found a piece of code that I believe is missing a parameter:
    def sample_bandits(self, n=1):
        bb_score = np.zeros(n)
        choices = np.zeros(n)
        for k in range(n):
            # sample from the bandits' priors, and select the largest sample
            choice = np.argmax(np.random.beta(1 + self.wins, 1 + self.trials - self.wins))
            # sample the chosen bandit
            result = self.bandits.pull(choice)
Here, np.random.beta(1 + self.wins, 1 + self.trials - self.wins) is missing the size parameter, so it returns a single value, not an array. That makes the np.argmax() call useless for picking a bandit, since it will always return 0.
Shouldn't the code be np.random.beta(1 + self.wins, 1 + self.trials - self.wins, len(self.n_bandits))?
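
Whether the call returns a single value depends on whether self.wins and self.trials are scalars or NumPy arrays, which the snippet above does not show. Here is a minimal sketch for checking it. The arrays `wins` and `trials` are made-up stand-ins for self.wins and self.trials, assumed to hold one entry per bandit. As far as I can tell from the NumPy documentation, when size is omitted and the parameters are arrays, np.random.beta broadcasts them and draws one sample per element; a single value comes back only when both parameters are scalars.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the check is repeatable

# Hypothetical stand-ins for self.wins and self.trials (one entry per bandit).
wins = np.array([3.0, 10.0, 1.0])
trials = np.array([10.0, 20.0, 5.0])

# Array parameters, no size argument: one sample is drawn per bandit.
array_sample = np.random.beta(1 + wins, 1 + trials - wins)

# Scalar parameters, no size argument: a single value is returned.
scalar_sample = np.random.beta(1 + 3.0, 1 + 10.0 - 3.0)

# Array parameters with an explicit size matching the number of bandits.
sized_sample = np.random.beta(1 + wins, 1 + trials - wins, size=len(wins))

print(np.shape(array_sample))   # (3,)
print(np.shape(scalar_sample))  # ()
print(np.shape(sized_sample))   # (3,)

# argmax over the per-bandit samples picks the bandit with the largest draw.
choice = np.argmax(array_sample)
print(choice)
```

If the shapes match in the real class as well, the size argument would be redundant there. If self.wins and self.trials turn out to be scalars, the concern above applies.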