[28] Yapeng Gao, Jonas Tebbe, and Andreas Zell. Optimal
Stroke Learning with Policy Gradient Approach for
Robotic Table Tennis. CoRR, abs/2109.03100, 2021.
[29] Ali Ghadirzadeh, Atsuto Maki, Danica Kragic, and
Mårten Björkman. Deep predictive policy training using
reinforcement learning. In 2017 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS),
pages 2351–2358. IEEE, 2017.
[30] Xavier Glorot, Antoine Bordes, and Yoshua Bengio.
Deep Sparse Rectifier Neural Networks. In Geoffrey
Gordon, David Dunson, and Miroslav Dudík, editors,
Proceedings of the Fourteenth International Conference
on Artificial Intelligence and Statistics, volume 15 of
Proceedings of Machine Learning Research, pages 315–
323, Fort Lauderdale, FL, USA, 11–13 Apr 2011.
PMLR.
[31] Sergio Guadarrama, Anoop Korattikara, Oscar Ramirez,
Pablo Castro, Ethan Holly, Sam Fishman, Ke Wang,
Ekaterina Gonina, Neal Wu, Efi Kokiopoulou, Luciano
Sbaiz, Jamie Smith, Gábor Bartók, Jesse Berent, Chris
Harris, Vincent Vanhoucke, and Eugene Brevdo. TF-
Agents: A library for Reinforcement Learning in Ten-
sorFlow, 2018.
[32] Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan,
George Tucker, and Sergey Levine. Learning to
walk via deep reinforcement learning. arXiv preprint
arXiv:1812.11103, 2018.
[33] Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and
Sergey Levine. Soft actor-critic: Off-policy maximum
entropy deep reinforcement learning with a stochastic
actor. In Proceedings of the 35th International Confer-
ence on Machine Learning, pages 1861–1870. PMLR,
2018.
[34] J. Hartley. Toshiba progress towards sensory control
in real time. The Industrial Robot, 14(1):50–52, 1983.
[35] Richard Hartley and Andrew Zisserman. Multiple view
geometry in computer vision. Cambridge university
press, 2003.
[36] Hideaki Hashimoto, Fumio Ozaki, and Kuniji Osuka.
Development of Ping-Pong Robot System Using 7
Degree of Freedom Direct Drive Robots. In Industrial
Applications of Robotics and Machine Vision, 1987.
[37] Kasun Gayashan Hettihewa and Manukid Parnichkun.
Development of a Vision Based Ball Catching Robot. In
2021 Second International Symposium on Instrumenta-
tion, Control, Artificial Intelligence, and Robotics (ICA-
SYMP), pages 1–5. IEEE, 2021.
[38] Matt Hoffman, Bobak Shahriari, John Aslanides,
Gabriel Barth-Maron, Feryal M. P. Behbahani, Tamara
Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang,
Kate Baumli, Sarah Henderson, Alexander Novikov,
Sergio Gómez Colmenarejo, Serkan Cabi, Çağlar
Gülçehre, Tom Le Paine, Andrew Cowie, Ziyu Wang,
Bilal Piot, and Nando de Freitas. Acme: A research
framework for distributed reinforcement learning. arXiv
preprint arXiv:2006.00979, 2020.
[39] Yanlong Huang, Bernhard Schölkopf, and Jan Peters.
Learning optimal striking points for a ping-pong playing
robot. IROS, 2015.
[40] Yanlong Huang, Dieter Büchler, Okan Koç, Bernhard
Schölkopf, and Jan Peters. Jointly learning trajectory
generation and hitting point prediction in robot table
tennis. IEEE-RAS Humanoids, 2016.
[41] Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy,
Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, and
Marco Hutter. Learning agile and dynamic motor skills
for legged robots. Sci. Robotics, 4(26), 2019.
[42] Sebastian Höfer, Kostas Bekris, Ankur Handa,
Juan Camilo Gamboa, Melissa Mozifian, Florian
Golemo, Chris Atkeson, Dieter Fox, Ken Goldberg,
John Leonard, C. Karen Liu, Jan Peters, Shuran
Song, Peter Welinder, and Martha White. Sim2Real
in Robotics and Automation: Applications and
Challenges. IEEE Transactions on Automation
Science and Engineering, 18(2):398–400, 2021. doi:
10.1109/TASE.2021.3064065.
[43] Sergey Ioffe and Christian Szegedy. Batch normaliza-
tion: Accelerating deep network training by reducing
internal covariate shift. In International conference on
machine learning, pages 448–456. PMLR, 2015.
[44] Wenzel Jakob, Jason Rhinelander, and Dean Moldovan.
pybind11 – Seamless operability between C++11 and
Python, 2017. https://github.com/pybind/pybind11.
[45] Gangyuan Jing, Tarik Tosun, Mark Yim, and Hadas
Kress-Gazit. An End-To-End System for Accomplish-
ing Tasks with Modular Robots. In Proceedings of
Robotics: Science and Systems, Ann Arbor, Michigan,
June 2016. doi: 10.15607/RSS.2016.XII.025.
[46] R.E. Kalman. A new approach to linear filtering and
prediction problems. Journal of Basic Engineering, 82
(1):35–45, 1960.
[47] Peter Karkus, Xiao Ma, David Hsu, Leslie Kaelbling,
Wee Sun Lee, and Tomas Lozano-Perez. Differ-
entiable Algorithm Networks for Composable Robot
Learning. In Proceedings of Robotics: Science and
Systems, Freiburg im Breisgau, Germany, June 2019. doi:
10.15607/RSS.2019.XV.039.
[48] Chase Kew, Brian Andrew Ichter, Maryam Bandari,
Edward Lee, and Aleksandra Faust. Neural Collision
Clearance Estimator for Batched Motion Planning. In
The 14th International Workshop on the Algorithmic
Foundations of Robotics (WAFR), 2020.
[49] Piyush Khandelwal, James MacGlashan, Peter Wurman,
and Peter Stone. Efficient Real-Time Inference in Tem-
poral Convolution Networks. In 2021 IEEE Interna-
tional Conference on Robotics and Automation (ICRA),
pages 13489–13495, 2021. doi: 10.1109/ICRA48506.
2021.9560784.
[50] Diederik P. Kingma and Prafulla Dhariwal. Glow: Gen-
erative Flow with Invertible 1x1 Convolutions. In Ad-
vances in Neural Information Processing Systems, 2018.
[51] John Knight and David Lowery. Pingpong-playing robot
controlled by a microcomputer. Microprocessors and