Q-Learning Tic-Tac-Toe, Briefly
Tic-tac-toe doesn’t call for reinforcement learning, except as an exercise or illustration. Recently, I saw several examples implementing Q-learning, all of which were rather long. I thought I’d try tic-tac-toe with Q-learning myself, using Python and TensorFlow and aiming for brevity. The project establishes two baseline strategies and then outperforms them with Q-learning. Many suggestions for extending the project remain.
About the Engineer
Data scientist and software engineer for Deep Learning Analytics. Taught with Python and R for General Assembly and the Metis data science bootcamp. Worked with data at Booz Allen Hamilton, New York University, and the New York City Department of Education. Studied mathematics at the University of Wisconsin–Madison and teaching mathematics at Bard College. Career-best breakdancing result was advancing to the semi-finals of the R16 Korea 2009 individual footwork battle. Writing at planspace.org.
Senior Data Scientist and Software Engineer