Self-Driving With Deep Reinforcement (Survey)

  • Hayder Salah Abdalameer, College of Computer Science and Information Technology, University of Al-Qadisiyah, Iraq
  • Ali Abid Al-Shammary, College of Computer Science and Information Technology, University of Al-Qadisiyah, Iraq
Keywords: Autonomous driving, Deep Reinforcement Learning, Reinforcement Learning, Time-to-Collision (TTC), Convolutional Neural Network, Light Detection and Ranging (LiDAR)


This survey focuses on the scientific and technological challenges that the development of autonomous vehicles (AVs) poses for manufacturers and research organizations. Human-driven automobiles are expected to remain a common sight on the roads for the foreseeable future, so AVs will likely have to coexist with them in the same traffic environments. To keep traffic moving smoothly in these mixed-traffic zones, AVs need driving rules and negotiation abilities analogous to those of human drivers. To develop driving abilities comparable to those of humans, model-free deep reinforcement learning is used to imitate the actions of a skilled human driver. The problem of avoiding static obstacles on a two-lane roadway is investigated in simulation.
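As a rough illustration of the model-free idea applied to static obstacle avoidance on a two-lane road, the sketch below uses tabular Q-learning as a small stand-in for a deep network. Everything in it is an assumption made for illustration and is not taken from the surveyed work: the grid-style road, the obstacle positions, the reward values, and the names `step`, `train`, and `greedy_rollout` are all hypothetical.

```python
import random

# Hypothetical toy setup (not from the paper): a road of 10 cells, 2 lanes,
# with static obstacles given as (position, lane) pairs.
ROAD_LENGTH = 10
OBSTACLES = {(3, 0), (6, 1)}


def step(pos, lane, action):
    """Advance one cell; action 0 keeps the lane, action 1 switches lanes."""
    if action == 1:
        lane = 1 - lane
    pos += 1
    if (pos, lane) in OBSTACLES:
        return pos, lane, -10.0, True   # collision ends the episode
    if pos == ROAD_LENGTH - 1:
        return pos, lane, 10.0, True    # reached the end of the road
    reward = -0.1 if action == 1 else 0.0  # small assumed cost per lane change
    return pos, lane, reward, False


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Model-free Q-learning: learns from sampled transitions only."""
    rng = random.Random(seed)
    q = {(p, l): [0.0, 0.0] for p in range(ROAD_LENGTH) for l in (0, 1)}
    for _ in range(episodes):
        pos, lane, done = 0, 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[(pos, lane)][a])
            nxt_pos, nxt_lane, reward, done = step(pos, lane, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward
            if not done:
                target += gamma * max(q[(nxt_pos, nxt_lane)])
            q[(pos, lane)][action] += alpha * (target - q[(pos, lane)][action])
            pos, lane = nxt_pos, nxt_lane
    return q


def greedy_rollout(q):
    """Follow the learned greedy policy from the start cell."""
    pos, lane, done = 0, 0, False
    path, total = [(pos, lane)], 0.0
    while not done:
        action = max((0, 1), key=lambda a: q[(pos, lane)][a])
        pos, lane, reward, done = step(pos, lane, action)
        path.append((pos, lane))
        total += reward
    return path, total


if __name__ == "__main__":
    path, total = greedy_rollout(train())
    print("path:", path)
    print("return:", total)
```

A deep reinforcement learning agent would replace the lookup table `q` with a neural network over sensor inputs, but the update rule keeps the same structure: the agent never uses a model of the road, only observed transitions and rewards.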




How to Cite
Abdalameer, H., & Al-Shammary, A. (2022). Self-Driving With Deep Reinforcement (Survey). Journal of Al-Qadisiyah for Computer Science and Mathematics, 14(3), Comp Page 10-21.
Computer article