Playing a Game with TensorFlow


Implementing a Neural Network

In the previous post we looked at what a neural network is and what it is used for. Here we talk about how to implement those concepts to solve a practical example. In practice it is difficult and time consuming to build your own neural network from scratch and run it in a production environment, so for the implementation we use a purpose-built library called TensorFlow.
With TensorFlow it is much easier to create a neural network and make predictions.

First, install Python 3.x, then install TensorFlow with the built-in pip installer using the following command:

pip install --upgrade tensorflow

To check whether TensorFlow is working, open a command prompt, type python and hit Enter, then enter import tensorflow as tf and hit Enter again. If this gives an error, you haven't installed TensorFlow correctly.

[Figure: TensorFlow installation verification]
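For example, a quick interactive check looks like this (the exact version string depends on what pip installed; if no error appears, the installation is fine):

python
>>> import tensorflow as tf
>>> print(tf.__version__)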
Next, install NumPy:

pip install numpy

Then install OpenAI Gym:

pip install gym
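The model-building code later in this post also uses TFLearn, a high-level API on top of TensorFlow, so install it as well:

pip install tflearn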
Here we play the CartPole-v0 game using TensorFlow.
The game is about a pole attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
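Before writing any training code, it helps to create the environment once and look at its action and observation spaces. This is a minimal sketch using the classic Gym API (env.step returning four values), which the rest of this post assumes:

import gym

env = gym.make('CartPole-v0')
env.reset()
print(env.action_space)       # Discrete(2): push the cart left (0) or right (1)
print(env.observation_space)  # Box(4,): cart position, cart velocity, pole angle, pole tip velocity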
For training our model we need an initial dataset, so we generate one by playing random games.
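The function below assumes a few imports and hyperparameters are defined first. The values here are typical choices for this kind of tutorial and can be tuned; treat them as assumptions rather than fixed requirements:

import gym
import random
import numpy as np
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from statistics import mean, median
from collections import Counter

LR = 1e-3                 # learning rate for the Adam optimizer (assumed value)
env = gym.make('CartPole-v0')
env.reset()
goal_steps = 200          # frames per game (CartPole-v0 caps episodes at 200)
score_requirement = 50    # keep only games that scored at least this much
initial_games = 10000     # number of random games to sample (assumed value)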
def initial_population():
    # [OBS, MOVES]
    training_data = []
    # all scores:
    scores = []
    # just the scores that met our threshold:
    accepted_scores = []
    # iterate through however many games we want:
    for _ in range(initial_games):
        score = 0
        # moves specifically from this environment:
        game_memory = []
        # previous observation that we saw
        prev_observation = []
        # for each frame in the game
        for _ in range(goal_steps):
            # choose random action (0 or 1)
            action = random.randrange(0, 2)
            # do it!
            observation, reward, done, info = env.step(action)

            # notice that the observation is returned FROM the action,
            # so we store the previous observation here, pairing
            # the prev observation to the action we take.
            if len(prev_observation) > 0:
                game_memory.append([prev_observation, action])
            prev_observation = observation
            score += reward
            if done:
                break

        # IF our score is higher than our threshold, we'd like to save
        # every move we made. NOTE the reinforcement methodology here:
        # all we're doing is reinforcing the score, we're not trying
        # to influence the machine in any way as to HOW that score is
        # reached.
        if score >= score_requirement:
            accepted_scores.append(score)
            for data in game_memory:
                # convert to one-hot (this is the output layer for our neural network)
                if data[1] == 1:
                    output = [0, 1]
                elif data[1] == 0:
                    output = [1, 0]

                # saving our training data
                training_data.append([data[0], output])

        # reset env to play again
        env.reset()
        # save overall scores
        scores.append(score)

    # just in case you wanted to reference later
    training_data_save = np.array(training_data)
    np.save('saved.npy', training_data_save)

    # some stats here, to further illustrate the neural network magic!
    print('Average accepted score:', mean(accepted_scores))
    print('Median score for accepted scores:', median(accepted_scores))
    print(Counter(accepted_scores))

    return training_data
The output will be something like this (your numbers will vary, since the games are random):
Average accepted score: 60.40173410404624
Median score for accepted scores: 57.0
Counter({52.0: 30, 50.0: 28, 51.0: 26, 53.0: 24, 54.0: 22, 55.0: 22, 56.0: 17, 57.0:....
Next we create the neural network model to train.
def neural_network_model(input_size):
    network = input_data(shape=[None, input_size, 1], name='input')

    # note: in TFLearn the second argument to dropout is the KEEP
    # probability, so 0.9 means 10% of activations are dropped
    network = fully_connected(network, 200, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 300, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 600, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 300, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 200, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 2, activation='softmax')
    network = regression(network, optimizer='adam', learning_rate=LR,
                         loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(network, tensorboard_dir='log')

    return model
After this, it's time to train the model on our data and then let it play:
def train_model(training_data, model=False):
    # reshape observations to [samples, input_size, 1] to match the input layer
    X = np.array([i[0] for i in training_data]).reshape(-1, len(training_data[0][0]), 1)
    y = [i[1] for i in training_data]

    if not model:
        model = neural_network_model(input_size=len(X[0]))

    model.fit({'input': X}, {'targets': y}, n_epoch=3, snapshot_step=500,
              show_metric=True, run_id='openai_learning')
    return model

training_data = initial_population()
model = train_model(training_data)
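If you want to reuse the trained network later without retraining, TFLearn models can be saved and restored. A minimal sketch (the filename 'cartpole.model' is an arbitrary choice):

model.save('cartpole.model')                # write the trained weights to disk
model = neural_network_model(input_size=4)  # rebuild the same architecture (CartPole has 4 observations)
model.load('cartpole.model')                # restore the trained weights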

scores = []
choices = []
for each_game in range(10):
    score = 0
    game_memory = []
    prev_obs = []
    env.reset()
    for _ in range(goal_steps):
        env.render()

        # the first move is random; after that we ask the model
        if len(prev_obs) == 0:
            action = random.randrange(0, 2)
        else:
            action = np.argmax(model.predict(prev_obs.reshape(-1, len(prev_obs), 1))[0])

        choices.append(action)

        new_observation, reward, done, info = env.step(action)
        prev_obs = new_observation
        game_memory.append([new_observation, action])
        score += reward
        if done:
            break
    scores.append(score)

print('Average Score:', sum(scores) / len(scores))
print('choice 1:{} choice 0:{}'.format(choices.count(1) / len(choices),
                                       choices.count(0) / len(choices)))
print(score_requirement)

[Figure: Game play]
