Update input modality as early as possible.

Previously, we opened the mic but failed to transition to voice input
modality in response to the OnInteractionStarted event.

This caused latency, because the input modality was not updated until
we received a SpeechRecognitionStarted event. The modality is now
updated as soon as OnInteractionStarted fires.

