Given a set of states S, a set of actions A, and an experience ⟨s,a,r,s'⟩, what is the time complexity to update the value of Q(s,a) using Q-learning?
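  • O(|A|), where |A| is the cardinality of set A. Assuming a tabular Q with constant-time lookup, the update computes the maximum of Q(s',a') over every action a' in A, which costs |A| lookups; the rest of the update is constant-time arithmetic, and the cost does not depend on |S|.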
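
To make the cost concrete, here is a minimal sketch of one tabular Q-learning update in Python. The function name `q_update`, the dictionary-based table `Q`, and the values of the learning rate `alpha` and discount factor `gamma` are illustrative assumptions, not part of the question. The only step whose cost grows with the problem size is the `max` over actions.

```python
from collections import defaultdict


def q_update(Q, actions, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update for the experience (s, a, r, s_next).

    Q is assumed to be a dict keyed by (state, action) pairs, so each
    lookup is O(1) on average. alpha and gamma are illustrative defaults.
    """
    # The max scans every action once: this is the O(|A|) step.
    best_next = max(Q[(s_next, a_next)] for a_next in actions)
    # The remaining arithmetic and the single write are O(1).
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Hypothetical two-action example.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "right")] = 2.0
new_value = q_update(Q, actions, "s0", "left", 1.0, "s1")
```

Note that the number of states never appears in the update: only the entries for s and s' are touched, so the work per experience scales with |A| alone.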
