Here is a very biased list of books and links that I found useful for students entering our lab (other labs may emphasize different aspects):
Sipser’s broad Introduction to the Theory of Computation
A comprehensive Survey of Deep Learning
Bishop's Pattern Recognition and Machine Learning (bible of traditional machine learning, probabilistic view)
Thesis of Graves (ex-IDSIA) on Supervised Sequence Labelling with Recurrent Networks (covers RNNs, which get little attention in Bishop's book)
Overview of recurrent neural networks with lots of papers
State of the art pattern recognition with deep neural nets on GPUs (lots of recent papers)
Sutton & Barto's Introduction to Reinforcement Learning (survey of traditional RL)
Kaelbling et al.'s broader Survey of Reinforcement Learning
Papers on CoSyNe and Natural Evolution Strategies
Other recent papers on RNNs that learn control without teachers, by Gomez, Koutnik, Wierstra, Schaul, Sehnke, Peters, Osendorfer, Rueckstiess, Foerster, Togelius, Srivastava, and others
Overviews of artificial curiosity and creativity
Theoretically optimal universal stuff:
M. Hutter (ex-IDSIA): Universal Artificial Intelligence. THE book on mathematically optimal universal AI / general problem solvers / universal reinforcement learners (goes far beyond traditional RL and previous AI methods)
Overview sites on universal RL/AI and Goedel machine and optimal program search and incremental search in program space
M. Li and P. M. B. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications (2nd edition). Springer, 1997. THE survey of algorithmic information theory, based on the original work of Kolmogorov and Solomonoff. Foundation of universal optimal predictors, compressors, and general inductive inference machines.