After the experience ⟨1,1,5,2⟩, which value of the table gets updated and what is its new value?
  • Q(1,1) = Q[s,a] + α(r + γ max_a' Q[s',a'] - Q[s,a]) = 1.5 + 0.1(5 + 0.5(3) - 1.5) = 1.5 + 0.1(5) = 2
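
The update above can be checked with a minimal sketch, assuming the experience is ordered ⟨s, a, r, s'⟩ and that α = 0.1 and γ = 0.5, as the arithmetic implies. The function name q_update and the dictionary layout are illustrative choices. Only Q[1,1] = 1.5 and max_a' Q[2,a'] = 3 are taken from the worked answer; the remaining table entries are placeholders.

```python
def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply one Q-learning update in place and return the new Q[s, a]."""
    # Greedy value of the next state: max over the actions stored for s_next.
    best_next = max(v for (state, _), v in Q.items() if state == s_next)
    # Temporal-difference error: target minus current estimate.
    td_error = r + gamma * best_next - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return Q[(s, a)]


# Hypothetical table: only Q[1,1] = 1.5 and max_a' Q[2,a'] = 3 come from the
# worked answer; the other entries are placeholders for illustration.
Q = {(1, 1): 1.5, (1, 2): 0.0, (2, 1): 3.0, (2, 2): 1.0}

new_q = q_update(Q, s=1, a=1, r=5, s_next=2, alpha=0.1, gamma=0.5)
print(new_q)
```

Only the entry for the visited state-action pair (1,1) changes; every other entry of the table is left as it was.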
