Should recognition be done separately from the Voice?