.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_ch8\ch8_example.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_ch8_ch8_example.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_ch8_ch8_example.py:


==========================
8.1 Using KAgents
==========================

This example shows how to instantiate and use any of the Kernel Reinforcement Learning algorithms.
Most of the code resembles what you would write with standard RL libraries.
The library-specific parts are commented and explained below.

.. GENERATED FROM PYTHON SOURCE LINES 8-79

.. container:: sphx-glr-animation

    .. raw:: html

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    start game
    Computed global error Bellman mean: 7.89555062855977e-07 iter: 1
    Game 1
    start game
    Computed global error Bellman mean: 1.9174925518446647e-06 iter: 5
    Game 2
    start game
    Computed global error Bellman mean: 0.004522472092035111 iter: 5
    Game 3
    start game
    Computed global error Bellman mean: 0.19431530240631453 iter: 2
    Game 4
    start game
    Computed global error Bellman mean: 4.7390296076303594e-07 iter: 5
    Game 5
    start game
    Computed global error Bellman mean: 2.1349410692526907 iter: 1
    Game 6
    start game
    Computed global error Bellman mean: 0.9118957326956381 iter: 5
    Game 7
    start game
    Computed global error Bellman mean: 0.447413797743634 iter: 5
    Game 8
    start game
    Computed global error Bellman mean: 0.9093983748700462 iter: 5
    Game 9
    start game
    Computed global error Bellman mean: 0.9353377222334376 iter: 5
    Game 10
    start game
    Computed global error Bellman mean: 0.4486237852201142 iter: 5
    Game 11
    start game
    Computed global error Bellman mean: 0.5199007140480226 iter: 5
    Game 12
    start game
    Computed global error Bellman mean: 0.028420690962717533 iter: 5
    Game 13
    start game
    Computed global error Bellman mean: 0.3258012670198149 iter: 5
    Game 14
    start game
    Computed global error Bellman mean: 0.12585790268695965 iter: 5
    Game 15


|

.. code-block:: Python


    # Importing necessary modules
    import codpy.KQLearning as KQLearning
    import gymnasium as gym
    import imageio
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.animation as animation

    env = gym.make("CartPole-v1", render_mode="rgb_array")
    frames = []

    # The agent requires action and observation space dimensions
    agent = KQLearning.KQLearning(env.action_space.n, env.observation_space.shape[0], gamma=0.99)
    steps = 0
    games = 0
    while games < 15:
        print("start game")
        # Store the game history for training
        states, actions, next_states, rewards, dones = [], [], [], [], []
        state, _ = env.reset()
        steps = 0
        while steps < 1000:
            action = agent(state)
            # gymnasium's step() returns (obs, reward, terminated, truncated, info);
            # only `terminated` is kept here, under the name `done`
            next_state, reward, done, _, _ = env.step(action)
            frame = env.render()
            frames.append(frame)

            # The agent expects the game to be passed in reverse order,
            # so each new transition is inserted at the front of its list
            states.insert(0, state)
            actions.insert(0, action)
            next_states.insert(0, next_state)
            rewards.insert(0, reward)
            dones.insert(0, done)
            steps += 1
            state = next_state
            if done:
                break
        # You train your agent once at the end of every episode
        agent.train((states, actions, next_states, rewards, dones))
        games += 1
        print(f"Game {games}")
    env.close()

    # imageio.mimsave("cartpole.gif", frames, fps=30)
    from IPython.display import HTML, Video, display

    # HTML('')
    fig = plt.figure()
    plt.axis("off")
    im = plt.imshow(frames[0])


    def update(frame):
        im.set_array(frame)
        return im,


    ani = animation.FuncAnimation(
        fig,
        update,
        frames=frames,  # Pass the list of frames
        interval=50,  # Delay between frames (ms)
        blit=True  # Optimize rendering
    )
    ani.save("cartpole.mp4", writer="ffmpeg", fps=30, dpi=100)
    display(Video("cartpole.mp4", embed=True))
    # plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (1 minutes 29.548 seconds)


.. _sphx_glr_download_auto_ch8_ch8_example.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: ch8_example.ipynb <ch8_example.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: ch8_example.py <ch8_example.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: ch8_example.zip <ch8_example.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
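
The example above builds the reversed game history by inserting every transition at the front of a list. The sketch below shows one possible alternative for producing the same newest-first ordering with ``collections.deque``. It assumes only what the example's comment states, namely that the agent expects the game in reverse order. ``reversed_history`` is a hypothetical helper written for illustration and is not part of codpy, and plain integers stand in for CartPole observations.

```python
from collections import deque


def reversed_history(transitions):
    """Collect (state, action, next_state, reward, done) tuples, newest first.

    Hypothetical helper for illustration only; it is not a codpy function.
    """
    buffers = [deque() for _ in range(5)]
    for transition in transitions:
        for buffer, value in zip(buffers, transition):
            # appendleft is O(1), whereas list.insert(0, ...) shifts every element
            buffer.appendleft(value)
    return tuple(list(buffer) for buffer in buffers)


# Toy episode with three steps; integers stand in for observations.
episode = [
    (0, 1, 1, 1.0, False),
    (1, 0, 2, 1.0, False),
    (2, 1, 3, 1.0, True),
]
states, actions, next_states, rewards, dones = reversed_history(episode)
print(states)  # [2, 1, 0]
print(dones)   # [True, False, False]
```

The returned tuple has the same five-list layout that the example passes to ``agent.train``. Whether the agent accepts it unchanged is an assumption that should be checked against the library.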