Optimizing energy use has been one of the most important challenges in the evolution of smart cities over the last decade. Artificial intelligence techniques such as reinforcement learning have emerged as a catalyst for designing and optimizing smart urban services, for which the generation and use of energy are critical factors. Using a technique based on reinforcement learning, the authors of this research reduced and optimized energy use in a smart city. In the proposed method, a set of agents cooperates toward a shared objective by following an optimal energy distribution policy derived from an action-value function. Lower energy use and reduced cost are only two of the advantages of this cooperation among agents. To estimate the value of each option, the proposed technique considers energy consumption data and the extent to which that option has been used in the past. This design lets each device balance its energy footprint against the reliability of its communications. Simulation results indicate that optimizing energy consumption with the proposed reinforcement learning approach can reduce the yearly energy consumption of the smart city by 35% to 40% or more.
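As an illustration only, the idea of cooperating agents that share an action-value estimate based on observed energy use and on how often each option has been chosen can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `SharedActionValues`, the `simulate` function, the cost table `MEAN_COST`, the UCB-style selection rule, and all numeric values are hypothetical and do not come from the paper.

```python
import math
import random

# Hypothetical setup: each action is an energy distribution option with an
# assumed mean energy cost. These numbers are illustrative, not from the paper.
MEAN_COST = [5.0, 3.0, 4.0]


class SharedActionValues:
    """Action-value table shared by all cooperating agents (assumed design)."""

    def __init__(self, n_actions):
        self.q = [0.0] * n_actions      # running mean reward per action
        self.counts = [0] * n_actions   # how often each action has been taken

    def select(self, c=1.0):
        # Try every action once before relying on the estimates.
        for a, n in enumerate(self.counts):
            if n == 0:
                return a
        total = sum(self.counts)
        # UCB-style score: estimated value plus a bonus for rarely used actions.
        scores = [
            self.q[a] + c * math.sqrt(math.log(total) / self.counts[a])
            for a in range(len(self.q))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, action, reward):
        # Incremental mean: Q <- Q + (r - Q) / N
        self.counts[action] += 1
        self.q[action] += (reward - self.q[action]) / self.counts[action]


def simulate(n_agents=4, steps=500, seed=0):
    """Run agents that read and write one shared table; return the table."""
    rng = random.Random(seed)
    shared = SharedActionValues(len(MEAN_COST))
    for _ in range(steps):
        for _ in range(n_agents):
            a = shared.select()
            energy_used = MEAN_COST[a] + rng.gauss(0.0, 0.5)
            shared.update(a, -energy_used)  # reward is negative energy use
    return shared


if __name__ == "__main__":
    table = simulate()
    best = max(range(len(table.q)), key=table.q.__getitem__)
    print("estimated values:", [round(v, 2) for v in table.q])
    print("preferred action:", best)
```

Sharing a single table is the simplest way to model cooperation: every agent benefits from the experience of the others, and the selection count feeds both the running average and the exploration bonus. A fuller treatment would add state (for example time of day or current load) and per-agent observations, which this sketch omits.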