Sarsa is an on-policy TD control algorithm:
Q(S_t, A_t) ← Q(S_t, A_t) + α[R_{t+1} + γ Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t)]
This update is applied after every transition from a nonterminal state S_t. If S_{t+1} is terminal, then Q(S_{t+1}, A_{t+1}) is defined as zero.
To design an on-policy control algorithm using Sarsa, we continually estimate q_π for the behaviour policy π while at the same time changing π towards greediness with respect to q_π (e.g. by acting ε-greedily with respect to the current Q).
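A minimal tabular sketch of this scheme, assuming an ε-greedy policy. The corridor environment, its constants, and all function names here are illustrative inventions, not from the book: states 0..N−1 with actions left/right, and a reward of 1 for stepping off the right end.

```python
import random
from collections import defaultdict

# Hypothetical toy environment: a 1-D corridor of N nonterminal states.
# Action 0 moves left (clipped at 0), action 1 moves right; reaching
# state N terminates the episode with reward 1, all other rewards are 0.
N = 5
ACTIONS = (0, 1)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else s + 1
    if s2 == N:                      # terminal: episode ends
        return s2, 1.0, True
    return s2, 0.0, False

def epsilon_greedy(Q, s, eps):
    # The policy being both followed and improved (on-policy).
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def sarsa(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    random.seed(seed)
    Q = defaultdict(float)           # unvisited pairs default to 0
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, eps)
        done = False
        while not done:
            s2, r, done = step(s, a)
            # Choose A_{t+1} from the same policy; Q of a terminal
            # state is defined as zero, as in the text.
            a2 = None if done else epsilon_greedy(Q, s2, eps)
            target = r if done else r + gamma * Q[(s2, a2)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q

Q = sarsa()
```

After training, the greedy action in every state should be "right" (action 1), since that is the shortest path to the terminal reward; ε keeps the quintuple (S_t, A_t, R_{t+1}, S_{t+1}, A_{t+1}) that names the algorithm drawn from the policy being improved.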
Source: Sutton and Barto, Reinforcement Learning, 2nd Edition