Playing a Game with TensorFlow


Implementing a Neural Network

In the previous post we looked at what a neural network is and what it is used for. Here we talk about how to implement those concepts to solve a practical example. In practice it is difficult and time consuming to build your own neural network from scratch and run it in a production environment, so for the implementation we use a purpose-built library called TensorFlow.
With TensorFlow it is much easier to create a neural network and make predictions.

First, install Python 3.x, then install TensorFlow with the built-in pip installer using the following command:

pip install --upgrade tensorflow

To check whether TensorFlow is working, open a command prompt, type python and hit Enter, then enter import tensorflow as tf and hit Enter again. If this gives an error, you haven't installed TensorFlow correctly.

[Figure: TensorFlow installation verification]
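For example, a quick interactive check looks like this (the exact version string depends on what pip installed; if no error appears, the installation is fine):

python
>>> import tensorflow as tf
>>> print(tf.__version__)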
Next, install NumPy:

pip install numpy

Then install OpenAI Gym:

pip install gym
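The model-building code later in this post also uses TFLearn, a high-level API on top of TensorFlow, so install it as well:

pip install tflearn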
Here we play the CartPole-v0 game using TensorFlow.
The game is about a pole attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
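Before writing any training code, it helps to create the environment once and look at its action and observation spaces. This is a minimal sketch using the classic Gym API (env.step returning four values), which the rest of this post assumes:

import gym

env = gym.make('CartPole-v0')
env.reset()
print(env.action_space)       # Discrete(2): push the cart left (0) or right (1)
print(env.observation_space)  # Box(4,): cart position, cart velocity, pole angle, pole tip velocity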
For training our model we need an initial dataset, so we generate one by playing random games.
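The function below assumes a few imports and hyperparameters are defined first. The values here are typical choices for this kind of tutorial and can be tuned; treat them as assumptions rather than fixed requirements:

import gym
import random
import numpy as np
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from statistics import mean, median
from collections import Counter

LR = 1e-3                 # learning rate for the Adam optimizer (assumed value)
env = gym.make('CartPole-v0')
env.reset()
goal_steps = 200          # frames per game (CartPole-v0 caps episodes at 200)
score_requirement = 50    # keep only games that scored at least this much
initial_games = 10000     # number of random games to sample (assumed value)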
def initial_population():
    # [OBS, MOVES]
    training_data = []
    # all scores:
    scores = []
    # just the scores that met our threshold:
    accepted_scores = []
    # iterate through however many games we want:
    for _ in range(initial_games):
        score = 0
        # moves specifically from this environment:
        game_memory = []
        # previous observation that we saw
        prev_observation = []
        # for each frame in the game
        for _ in range(goal_steps):
            # choose random action (0 or 1)
            action = random.randrange(0, 2)
            # do it!
            observation, reward, done, info = env.step(action)

            # notice that the observation is returned FROM the action,
            # so we store the previous observation here, pairing
            # the prev observation to the action we take.
            if len(prev_observation) > 0:
                game_memory.append([prev_observation, action])
            prev_observation = observation
            score += reward
            if done:
                break

        # IF our score is higher than our threshold, we'd like to save
        # every move we made. NOTE the reinforcement methodology here:
        # all we're doing is reinforcing the score, we're not trying
        # to influence the machine in any way as to HOW that score is
        # reached.
        if score >= score_requirement:
            accepted_scores.append(score)
            for data in game_memory:
                # convert to one-hot (this is the output layer for our neural network)
                if data[1] == 1:
                    output = [0, 1]
                elif data[1] == 0:
                    output = [1, 0]

                # saving our training data
                training_data.append([data[0], output])

        # reset env to play again
        env.reset()
        # save overall scores
        scores.append(score)

    # just in case you wanted to reference later
    training_data_save = np.array(training_data)
    np.save('saved.npy', training_data_save)

    # some stats here, to further illustrate the neural network magic!
    print('Average accepted score:', mean(accepted_scores))
    print('Median score for accepted scores:', median(accepted_scores))
    print(Counter(accepted_scores))

    return training_data
The output will be something like this (your numbers will vary, since the games are random):
Average accepted score: 60.40173410404624
Median score for accepted scores: 57.0
Counter({52.0: 30, 50.0: 28, 51.0: 26, 53.0: 24, 54.0: 22, 55.0: 22, 56.0: 17, 57.0:....
Next we create the neural network model to train.
def neural_network_model(input_size):
    network = input_data(shape=[None, input_size, 1], name='input')

    # note: in TFLearn the second argument to dropout is the KEEP
    # probability, so 0.9 means 10% of activations are dropped
    network = fully_connected(network, 200, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 300, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 600, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 300, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 200, activation='relu')
    network = dropout(network, 0.9)

    network = fully_connected(network, 2, activation='softmax')
    network = regression(network, optimizer='adam', learning_rate=LR,
                         loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(network, tensorboard_dir='log')

    return model
After this, it's time to train the model on our data and then let it play:
def train_model(training_data, model=False):
    # reshape observations to [samples, input_size, 1] to match the input layer
    X = np.array([i[0] for i in training_data]).reshape(-1, len(training_data[0][0]), 1)
    y = [i[1] for i in training_data]

    if not model:
        model = neural_network_model(input_size=len(X[0]))

    model.fit({'input': X}, {'targets': y}, n_epoch=3, snapshot_step=500,
              show_metric=True, run_id='openai_learning')
    return model

training_data = initial_population()
model = train_model(training_data)
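If you want to reuse the trained network later without retraining, TFLearn models can be saved and restored. A minimal sketch (the filename 'cartpole.model' is an arbitrary choice):

model.save('cartpole.model')                # write the trained weights to disk
model = neural_network_model(input_size=4)  # rebuild the same architecture (CartPole has 4 observations)
model.load('cartpole.model')                # restore the trained weights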

scores = []
choices = []
for each_game in range(10):
    score = 0
    game_memory = []
    prev_obs = []
    env.reset()
    for _ in range(goal_steps):
        env.render()

        # the first move is random; after that we ask the model
        if len(prev_obs) == 0:
            action = random.randrange(0, 2)
        else:
            action = np.argmax(model.predict(prev_obs.reshape(-1, len(prev_obs), 1))[0])

        choices.append(action)

        new_observation, reward, done, info = env.step(action)
        prev_obs = new_observation
        game_memory.append([new_observation, action])
        score += reward
        if done:
            break
    scores.append(score)

print('Average Score:', sum(scores) / len(scores))
print('choice 1:{} choice 0:{}'.format(choices.count(1) / len(choices),
                                       choices.count(0) / len(choices)))
print(score_requirement)

[Figure: Game play]
