{"id":29321,"date":"2023-12-20T16:57:47","date_gmt":"2023-12-20T11:27:47","guid":{"rendered":"https:\/\/tocxten.com\/?page_id=29321"},"modified":"2023-12-20T21:31:17","modified_gmt":"2023-12-20T16:01:17","slug":"learning-agent","status":"publish","type":"page","link":"https:\/\/tocxten.com\/index.php\/learning-agent\/","title":{"rendered":"Learning Agents"},"content":{"rendered":"\n<p class=\"has-very-light-gray-to-cyan-bluish-gray-gradient-background has-background has-medium-font-size\"><strong>1. Vacuum Cleaner Learning Agent<\/strong><\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>Source Code<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\nclass VacuumLearningAgent:\n    def __init__(self, locations, actions):\n        self.locations = locations\n        self.actions = actions\n        self.q_table = {}\n\n    def perceive(self, environment):\n        return environment\n\n    def choose_action(self, state):\n        state_tuple = tuple(state)\n\n        if state_tuple not in self.q_table:\n            self.q_table&#91;state_tuple] = {action: 0 for action in self.actions}\n\n        # Exploration-exploitation trade-off: explore 20% of the time\n        if np.random.uniform(0, 1) &lt; 0.2:\n            action = np.random.choice(self.actions)  # Explore\n        else:\n            action = max(self.q_table&#91;state_tuple], key=self.q_table&#91;state_tuple].get)  # Exploit\n\n        return action\n\n    def learn(self, state, action, reward, next_state):\n        state_tuple = tuple(state)\n        next_state_tuple = tuple(next_state)\n\n        if state_tuple not in self.q_table:\n            self.q_table&#91;state_tuple] = {a: 0 for a in self.actions}\n\n        if next_state_tuple not in self.q_table:\n            self.q_table&#91;next_state_tuple] = {a: 0 for a in self.actions}\n\n        # Q-learning update: learning rate 0.1, discount factor 0.9\n        self.q_table&#91;state_tuple]&#91;action] += 0.1 * (reward + 0.9 * max(self.q_table&#91;next_state_tuple].values()) - self.q_table&#91;state_tuple]&#91;action])\n\n    def act(self, percept):\n        state = percept&#91;'state']\n        action = self.choose_action(state)\n        print(f\"Agent performs action: {action}\")\n        return action\n\n\nclass VacuumEnvironment:\n    def __init__(self, locations):\n        self.locations = locations\n        self.states = &#91;(loc, 'clean') for loc in locations]\n\n    def update_state(self, location, status):\n        index = self.locations.index(location)\n        self.states&#91;index] = (location, status)\n\n    def get_reward(self, location, action):\n        if action == 'clean':\n            return 1 if self.states&#91;self.locations.index(location)]&#91;1] == 'dirty' else -1\n        else:\n            return 0  # No reward for moving\n\n# Demonstrate the working of the learning agent for the vacuum cleaner\nlocations = &#91;'A', 'B']\nactions = &#91;'clean', 'move_right', 'move_left']\nagent = VacuumLearningAgent(locations=locations, actions=actions)\nenvironment = VacuumEnvironment(locations=locations)\n\nfor _ in range(5):\n    # Copy the state list: environment.states is mutated in place below,\n    # so learn() must receive a snapshot of the pre-update state\n    percept = {'state': list(environment.states)}\n    action = agent.act(percept)\n\n    # Update the environment state\n    location = input(\"Enter the location (A\/B): \").upper()\n    status = input(\"Enter the status (clean\/dirty): \").lower()\n    environment.update_state(location, status)\n\n    # Provide the agent with a reward and update its Q-values\n    reward = environment.get_reward(location, action)\n    agent.learn(percept&#91;'state'], action, reward, environment.states)<\/code><\/pre>\n\n\n\n<p><strong>Output<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Agent performs action: move_right\nEnter the location (A\/B): A\nEnter the status (clean\/dirty): dirty\nAgent performs action: clean\nEnter the location (A\/B): A\nEnter the status (clean\/dirty): clean\nAgent performs action: move_right\nEnter the location (A\/B): B\nEnter 
the status (clean\/dirty): dirty\nAgent performs action: move_left\nEnter the location (A\/B): A\nEnter the status (clean\/dirty): clean\nAgent performs action: clean\nEnter the location (A\/B): B\nEnter the status (clean\/dirty): dirty<\/code><\/pre>\n\n\n\n<p class=\"has-very-light-gray-to-cyan-bluish-gray-gradient-background has-background has-medium-font-size\"><strong>2. Prime Number Learning Agent<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\nclass PrimeNumberLearningAgent:\n    def __init__(self, actions):\n        self.actions = actions\n        self.q_table = {}\n\n    def choose_action(self, state):\n        if state not in self.q_table:\n            self.q_table&#91;state] = {action: 0 for action in self.actions}\n\n        # Exploration-exploitation trade-off: explore 20% of the time\n        if np.random.uniform(0, 1) &lt; 0.2:\n            action = np.random.choice(self.actions)  # Explore\n        else:\n            action = max(self.q_table&#91;state], key=self.q_table&#91;state].get)  # Exploit\n\n        return action\n\n    def learn(self, state, action, reward, next_state):\n        # Guard both states so learn() is safe even for a state\n        # that choose_action() has not seen yet\n        if state not in self.q_table:\n            self.q_table&#91;state] = {a: 0 for a in self.actions}\n\n        if next_state not in self.q_table:\n            self.q_table&#91;next_state] = {a: 0 for a in self.actions}\n\n        # Q-learning update: learning rate 0.1, discount factor 0.9\n        self.q_table&#91;state]&#91;action] += 0.1 * (reward + 0.9 * max(self.q_table&#91;next_state].values()) - self.q_table&#91;state]&#91;action])\n\n    def act(self, state):\n        action = self.choose_action(state)\n        print(f\"Agent predicts: {action}\")\n        return action\n\n\nclass PrimeNumberEnvironment:\n    def is_prime(self, number):\n        if number &lt; 2:\n            return False\n        for i in range(2, int(number**0.5) + 1):\n            if number % i == 0:\n                return False\n        return True\n\n    def get_reward(self, number, action):\n        if action == 'is_prime':\n            return 1 if self.is_prime(number) else -1\n        elif action == 'not_prime':\n            return 1 if not self.is_prime(number) else -1\n        else:\n            return 0\n\n# Demonstrate the working of the learning agent for prime numbers\nactions = &#91;'is_prime', 'not_prime']\nagent = PrimeNumberLearningAgent(actions=actions)\nenvironment = PrimeNumberEnvironment()\n\nfor _ in range(10):\n    number = int(input(\"Enter a positive integer: \"))\n    action = agent.act(number)\n\n    # Provide the agent with a reward and update its Q-values\n    # (the task is stateless, so the next state is the same number)\n    reward = environment.get_reward(number, action)\n    agent.learn(number, action, reward, number)\n<\/code><\/pre>\n\n\n\n<p class=\"has-medium-font-size\"><strong>Output<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Enter a positive integer: 1\nAgent predicts: is_prime\nEnter a positive integer: 2\nAgent predicts: is_prime\nEnter a positive integer: 3\nAgent predicts: not_prime\nEnter a positive integer: 4\nAgent predicts: is_prime\nEnter a positive integer: 5\nAgent predicts: is_prime\nEnter a positive integer: 6\nAgent predicts: is_prime\nEnter a positive integer: 7\nAgent predicts: is_prime\nEnter a positive integer: 8\nAgent predicts: not_prime\nEnter a positive integer: 9\nAgent predicts: is_prime\nEnter a positive integer: 10\nAgent predicts: is_prime<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>1. Vacuum Cleaner Learning agent. Source Code Output 2. 
Prime number Learning Agent Output<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-29321","page","type-page","status-publish","hentry"],"post_mailing_queue_ids":[],"_links":{"self":[{"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/pages\/29321","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/comments?post=29321"}],"version-history":[{"count":7,"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/pages\/29321\/revisions"}],"predecessor-version":[{"id":29341,"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/pages\/29321\/revisions\/29341"}],"wp:attachment":[{"href":"https:\/\/tocxten.com\/index.php\/wp-json\/wp\/v2\/media?parent=29321"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}