Privacy and Security

Why is Privacy a Concern?

Because automatic transcription occurs in the cloud, data is shared with AI services.
All processing of data in the cloud carries some risk, but it is reasonable to assume that reputable vendors who publish statements about how they use and store data pose a lower risk than vendors who are vague about where and how data is processed.

Audio content may contain two types of identifiable information:

- Voice content. Voice content may include names, dates, addresses or other information that identifies individuals directly or can be used to identify them.
- Voice recognition. Voice recognition is generally classified as a biometric technology, which allows individuals to be identified by a unique human characteristic. Unlike other identifiers (IDs, passwords), biometric identifiers (voice, fingerprint, iris) cannot be discarded or replaced, so such data should be handled with caution and with an understanding of the privacy and security risks attached to it.

GDPR (the EU’s General Data Protection Regulation)
The European Data Protection Board has taken the position that “voice recognition” is an example of a physical or physiological biometric identification technique. Where a business processes the personal data of data subjects (EU residents), the GDPR grants those data subjects an array of rights (e.g. right to access, right to delete) and places significant privacy and security obligations on the controllers and processors of that data.

Additional Information

Privacy Assessment

The following questions are a starting point for understanding the security and privacy of your data when using Automatic Speech Recognition:

Audio File Content
1. Any identifying information?
2. Any confidential or restricted information?
Sending Audio File to Vendor
1. Is file name randomised?
2. Is file scanned for malware or viruses?
3. If file is uploaded to a website, is the website secured with SSL/TLS (HTTPS)?
Receiving Transcript
1. Are you required to log in? What is the authentication method?
2. If software is utilised, what encryption is used and how often is the software updated?
Data Access
1. How can you access your data?
2. Are you required to log in every time? What is the authentication method?
3. Who within the vendor and externally has access to your data?
4. Can you share data access? If yes, how, and is it secure?
Data (Audio File and Transcript) Retention and Storage
1. How long is data kept for?
2. Where is your data kept?
3. Is your data encrypted? If yes, at what level?
4. What are your rights to deletion?
Vendor Auditing and Reporting
1. Does the vendor undergo auditing and reporting of its technology?
2. How often is auditing completed?
3. What types of controls are in place to ensure security and minimise risk? (e.g. data encryption systems, NDAs)
4. What accreditations does the vendor hold? (e.g. ISO accreditation, GDPR compliance)
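
Two of the questions under "Sending Audio File to Vendor" can be checked on your own side before anything is uploaded. The following is a minimal Python sketch, not a vendor-specific procedure: the function names `randomised_name` and `uses_https` and the example file name and URLs are hypothetical. It replaces a descriptive file name with a random one, and confirms that an upload address uses HTTPS.

```python
import secrets
from pathlib import Path
from urllib.parse import urlparse


def randomised_name(original: str) -> str:
    """Return a random file name that keeps only the original extension.

    The descriptive part of the name (which may contain a speaker's name
    or a date) is dropped and replaced with 32 random hex characters.
    """
    suffix = Path(original).suffix.lower()
    return f"{secrets.token_hex(16)}{suffix}"


def uses_https(upload_url: str) -> bool:
    """Return True if the upload URL uses the HTTPS scheme.

    This only inspects the URL; it does not validate the server's
    certificate or say anything about how the vendor stores the file.
    """
    return urlparse(upload_url).scheme.lower() == "https"


if __name__ == "__main__":
    # Hypothetical file name and URL, for illustration only.
    print(randomised_name("interview_Jane_Smith_2021-03-04.WAV"))
    print(uses_https("https://example.com/upload"))
```

If you need to match a returned transcript back to its recording, keep the mapping between original and randomised names in a private record on your own device.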

The above privacy assessment is sourced from:

Speech-to-text data policies of popular services

Google

Google Data logging

Microsoft

Microsoft’s Dictate and Translator tools are part of a set of cloud-based services that it calls 'connected experiences'.
In relation to Transcribe, Microsoft state that:

Your audio files will be sent to Microsoft and used only to provide you with this service. When the transcription is done your audio and transcription results are not stored by our service.

Microsoft state that their Australian data is stored in Sydney and Melbourne. This includes Microsoft Teams, Office Online, OneDrive and Stream.
It's unclear whether Microsoft treats user corrections as product improvement data.
Azure Data and Privacy for Speech-to-text

Device-based speech recognition

One way to avoid sharing data in the cloud is to use a speech recognition program that operates on your device, rather than in the cloud.
Microsoft offers device-based speech recognition, but states that it is less accurate than its cloud-based recognition:
- You can use device-based speech recognition without sending your voice data to Microsoft. However, the Microsoft cloud-based speech recognition technologies provide more accurate recognition than the device-based speech recognition. When the Online speech recognition setting is turned off, speech services that don’t rely on the cloud and only use device-based recognition—like the Narrator app or the Windows Speech Recognition app—will still work, and Microsoft won’t collect any voice data.
Google Recorder, available on Google Pixel phones, also provides device-based speech recognition.