Avoid using Alexa: it remembers everything you tell it

; Date: Wed Jul 03 2019

Tags: Amazon »»»» Internet Privacy »»»» Privacy

Alexa is a convenient service from Amazon: a box you can ask questions of, and through which you can request that Amazon (or its partners) perform services. It's an impressive application of voice recognition and artificial intelligence, a voice-activated assistant that does our bidding. You might not remember asking Alexa to start the charging station for your car, but it turns out that Alexa keeps a record of every query or command it services.

The information comes from a press release (below) issued by Senator Coons, along with the letter he received from Amazon in response to questions he had submitted.

The issue is whether the Alexa service keeps records of the requests people make of it, and if so, what happens to those records.

Amazon admits that voice recordings are kept "forever", or at least until the customer requests deletion. Two purposes are named:

  1. Preserving data about the transaction, since transactions often involve an exchange of money; presumably this is so the service provider can prove, if necessary, that the customer actually made the request.
  2. Further training of artificial intelligence models.

A customer can supposedly have those voice recordings deleted, but I imagine most won't even think to do so. The default, therefore, is that Amazon keeps voice recordings from everyone who owns an Alexa device, and since most folks never change defaults, the vast majority of Alexa customers have unknowingly had their requests recorded by Amazon.

You might think this is benign, but there are at least a couple of scenarios where it is dangerous.

  1. A cyber thief could break in and steal the voice recordings and other data, and those recordings could be put to a long list of nefarious uses.
  2. Governments could subpoena those records, and Amazon would have to comply.

Therefore it's better to just avoid using the thing in the first place.

Sources:

Sen. Coons presses Amazon on consumer privacy practices for Alexa devices

MAY 23, 2019

WASHINGTON, D.C. – U.S. Senator Chris Coons (D-Del.), a member of the Senate Judiciary Committee, today sent a letter to Amazon Chief Executive Officer Jeff Bezos requesting information about the company’s privacy and data security practices for Alexa devices. The letter follows recent news reports indicating that Amazon stores and indefinitely preserves text transcripts of user voice recordings, a practice that potentially puts Amazon users’ privacy at risk.

Last year, Coons and Senator Jeff Flake (R-Ariz.) sent a letter to Bezos regarding privacy concerns about the Amazon Echo and its voice-activated software, Alexa. In response, Amazon highlighted many of its privacy protections, including the ability for a consumer to delete any and all voice recordings that Alexa sends to the cloud.

“Unfortunately, recent reporting suggests that Amazon’s customers may not have as much control over their privacy as Amazon had indicated,” Senator Coons wrote. “While I am encouraged that Amazon allows users to delete audio recordings linked to their accounts, I am very concerned by reports that suggest that text transcriptions of these audio records are preserved indefinitely on Amazon’s servers, and users are not given the option to delete these text transcripts. The inability to delete a transcript of an audio recording renders the option to delete the recording largely inconsequential and puts users’ privacy at risk.”

In the letter, Coons requests information on the types of data Amazon collects, stores, and preserves, as well as the degree to which consumers control their personal information.


Mr. Jeff Bezos
Chief Executive Officer
Amazon, Inc.
410 Terry Avenue North
Seattle, WA 98109

Dear Mr. Bezos:

I write regarding Amazon’s privacy and data security practices in light of recent news reports indicating that your company stores and indefinitely preserves text transcriptions of user voice recordings, a practice that potentially puts Amazon users’ privacy at risk.

Last year, I wrote a letter with Senator Flake regarding concerns about the Echo, Amazon’s Internet-connected personal assistant-style device, and its voice-activated software, Alexa. In response to this letter, Amazon highlighted many of its privacy protections. These included assurances about Alexa’s on-device buffer, as well as the ability for a consumer to delete any and all voice recordings that the Alexa system sends to the cloud.

Unfortunately, recent reporting suggests that Amazon’s customers may not have as much control over their privacy as Amazon had indicated. In the letter that we received on July 27, 2018, Amazon wrote that “[c]ustomers can review and listen to the voice recordings associated with their account in the Alexa app, and delete them individually or all at once, which also deletes them from our server.” While I am encouraged that Amazon allows users to delete audio recordings linked to their accounts, I am very concerned by reports that suggest that text transcriptions of these audio records are preserved indefinitely on Amazon’s servers, and users are not given the option to delete these text transcripts. The inability to delete a transcript of an audio recording renders the option to delete the recording largely inconsequential and puts users’ privacy at risk.

In Amazon’s response to my concerns last year, I was assured that “[f]rom early-stage development, [Amazon] built privacy deeply into the hardware and service by design, and with Alexa and Amazon’s Alexa-enabled products we strive to put the control with our customers.” If consumers cannot delete transcripts of their voice recordings, I am concerned that Amazon has not lived up to this standard.

The increasing popularity of in-home, Internet-connected devices and voice-activated technologies raises questions about the types of data they collect, store, and share, as well as the degree to which consumers control their personal information. Companies like Amazon that offer services through these devices should address these concerns by prioritizing consumer privacy and protecting sensitive personal information.

As a member of the Judiciary Committee, I have a longstanding interest in the privacy and security of consumers’ personal data, including information collected by in-home, Internet-connected devices and voice-activated technologies.

I therefore request that Amazon provide answers to the following questions.

  1. Regarding transcripts of user voice recordings that have been sent to the cloud:
    • How long does Amazon store the transcripts of user voice recordings?
    • Do users have the ability to delete any or all of these transcripts?
    • Are there any transcripts that a user cannot delete? If so, why is a user permitted to delete voice recordings but not the corresponding text transcript?
    • For what purpose does Amazon use these transcripts?
    • Does Amazon take any measures to anonymize user identity and other information in relation to these transcripts?
  2. Does Amazon preserve, either in audio or textual form, the Alexa system’s responses to user questions? If so, is a user able to delete the audio or the transcript of such a response?
  3. Regarding the wake word processing system:
    • In determining when a user has finished issuing a command, for how long does the system wait until it stops recording, on average?
    • Is any audio that is captured by the on-device buffer ever sent to the cloud if the wake word is not detected?
    • Is audio that is captured and stored in the device’s temporary memory transcribed by the automatic speech recognition system? If so, are those transcripts similarly stored only in the device’s temporary memory, or are they sent to the cloud?
    • Amazon’s July 27, 2018 letter indicates that the Alexa system comes with a setting whereby a user can allow Alexa to respond to a series of requests without the customer needing to repeat the wake word.
      • Is this a default setting, or does a consumer need to affirmatively enable this setting?
      • For how long does Alexa listen for subsequent commands after the wake word is spoken when this setting is enabled?

I appreciate your prompt attention to this matter and would respectfully request a response by June 30, 2019.

Amazon responds to Sen. Coons' queries


June 28, 2019

The Honorable Christopher A. Coons
United States Senate
218 Russell Senate Office Building
Washington, D.C. 20510

Dear Senator Coons,

Thank you for your May 23, 2019 letter regarding Amazon's privacy and data security practices with respect to our Alexa voice service.

At Amazon, customer trust is our highest priority, and we know we must get privacy right in order to meet our customers’ high expectations. Our customer-centric approach has led us to follow privacy-by-design principles since we introduced the Echo device and Alexa service. For example, as we described in our response to your letter from last year, we designed Echo devices and Alexa to use on-device “keyword spotting” technology to detect when a customer intends to interact with Alexa; to use visual and audible signals to clearly indicate to customers when audio is being recorded for streaming to the cloud; to continually attempt to determine when a customer’s request to Alexa has ended so we can minimize the amount of audio we stream to the cloud; to allow customers to see, hear, and delete the audio that was streamed to the cloud; and to let customers control when their Echo device’s microphone is enabled through a microphone on/off button. We use the customer data we collect to provide the Alexa service and improve the customer experience, and our customers know that their personal information is safe with us.

The answers to your questions are as follows:

  • 1(a). How long does Amazon store the transcripts of user voice recordings?

We retain customers’ voice recordings and transcripts until the customer chooses to delete them.

  • 1(b). Do users have the ability to delete any or all of these transcripts?

Customers can review, listen to, and delete voice recordings associated with their account using the Voice History feature available in the Alexa app and the Alexa Privacy Hub, located at www.amazon.com/alexaprivacy. Customers can delete individual voice recordings, voice recordings from particular timeframes, or all of their voice recordings.

When a customer deletes a voice recording, we delete the transcripts associated with the customer’s account of both of the customer’s request and Alexa’s response. We already delete those transcripts from all of Alexa’s primary storage systems, and we have an ongoing effort to ensure those transcripts do not remain in any of Alexa’s other storage systems. We do not store the audio of Alexa’s response. However, we may still retain other records of customers’ Alexa interactions, including records of actions Alexa took in response to the customer’s request. And when a customer interacts with an Alexa skill, that skill developer may also retain records of the interaction. For example, for many types of Alexa requests – such as when a customer subscribes to Amazon Music Unlimited, places an Amazon Fresh order, requests a car from Uber or Lyft, orders a pizza from Domino’s, or makes an in-skill purchase of premium digital content – Amazon and/or the applicable skill developer obviously need to keep a record of the transaction. And for other types of Alexa requests – for instance, setting a recurring alarm, asking Alexa to remind you of your anniversary, placing a meeting on your calendar, sending a message to a friend – customers would not want or expect deletion of the voice recording to delete the underlying data or prevent Alexa from performing the requested task.
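An aside from me (not part of Amazon's letter): the distinction Amazon draws here, that deleting a recording removes the audio and its transcripts but not the record of what Alexa actually did, can be pictured with a tiny data-model sketch. The types and function below are purely hypothetical and for illustration only; they are not Amazon's actual systems.

```typescript
// Illustrative model of what the letter describes (hypothetical, not Amazon's code):
// deleting a voice recording clears the audio and transcripts tied to the account,
// but the record of the resulting action or transaction can persist with Amazon
// or with the skill developer.

interface AlexaInteraction {
  audio?: Uint8Array;            // the customer's voice recording
  requestTranscript?: string;    // what Alexa thought it heard
  responseTranscript?: string;   // what Alexa said back (audio of this is not stored)
  actionRecord?: string;         // e.g. an order placed or an alarm that was set
}

// Deleting the recording removes audio and both transcripts,
// but deliberately leaves the action/transaction record in place.
function deleteVoiceRecording(interaction: AlexaInteraction): AlexaInteraction {
  return { actionRecord: interaction.actionRecord };
}

const pizzaOrder: AlexaInteraction = {
  audio: new Uint8Array([/* ... */]),
  requestTranscript: "order a large pepperoni pizza",
  responseTranscript: "Okay, ordering from Domino's",
  actionRecord: "pizza order placed",   // hypothetical example record
};

console.log(deleteVoiceRecording(pizzaOrder));
// => { actionRecord: "pizza order placed" }
```

In other words, even after a "full" deletion of the recordings, a trail of what you asked Alexa to do can remain with Amazon or with the skill developer.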

  • 1(c). Are there any transcripts that a user cannot delete? If so, why is a user permitted to delete voice recordings but not the corresponding text transcript?

Please see the answer to 1(b).

  • 1(d). For what purpose does Amazon use these transcripts?

Alexa is designed to get smarter every day – this is accomplished through Amazon’s cloud services, including machine learning software. Our speech recognition and natural language understanding systems use machine learning to adapt to customers’ speech patterns and vocabulary, informed by the way customers use Alexa in the real world.

To work well, machine learning systems need to be trained using real world data. Speech is nuanced, dynamic, and has tremendous variation based on region, dialect, context, environment, and the individual speaker, including their age, whether they are a native or non-native speaker of the language, and whether they have a speech impairment. Training Alexa with voice recordings and transcripts from a diverse range of customers helps ensure Alexa works well for everyone.

In addition to using the transcripts to improve Alexa and the customer experience, we use the transcripts to provide transparency to our customers about what Alexa thought it heard and what Alexa provided as a response. Alexa’s Voice History feature allows customers to play the actual audio that was streamed to the cloud, review the text transcript of what Alexa thought the customer said, and review Alexa’s response. This helps customers to understand how Alexa works.

If Alexa makes a mistake, this helps customers to understand what may have gone wrong (for instance, customers can see and hear how Alexa may have misunderstood a particular word or phrase). Providing customers with the transcript also allows customers to understand and inspect exactly what Alexa is, and is not, recording.

  • 1(e). Does Amazon take any measures to anonymize user identity and other information in relation to these transcripts?

As described in answer 1(d), we allow customers to review the transcripts of their Alexa interactions as part of our Voice History feature, which requires us to keep the transcripts associated with the customer’s account.

  • 2. Does Amazon preserve, either in audio or textual form, the Alexa system’s responses to user questions? If so, is a user able to delete the audio or the transcript of such a response?

Please see the answer to 1(b).

  • 3(a). In determining when a user has finished issuing a command, for how long does the system wait until it stops recording, on average?

Once activated by the wake word or the action button, the Echo device opens an audio stream to the cloud and sends the request to Alexa to respond accordingly. The audio stream closes immediately once our automatic speech recognition system determines the customer has stopped speaking the request. A blue light illuminates on the Echo device to indicate when audio is being streamed to the cloud, and customers can also enable an audible tone that plays when their Echo device begins and ends streaming audio to the cloud.

  • 3(b). Is any audio that is captured by the on-device buffer ever sent to the cloud if the wake word is not detected?

No audio is sent to the cloud unless the Echo device detects the wake word (or customers press a button to speak to Alexa).

  • 3(c). Is audio that is captured and stored in the device’s temporary memory transcribed by the automatic speech recognition system? If so, are those transcripts similarly stored only in the device’s temporary memory, or are they sent to the cloud?

The short buffer on the device is continuously overwritten and is not transcribed by the automatic speech recognition system. Instead, our wake word technology works by identifying acoustic patterns that match the wake word. Once the Echo device has detected the wake word, a recording is sent to our cloud where our systems for speech recognition and natural language understanding process and respond to the customer’s request. Alexa is designed to record and process as little audio from customers as possible. Processing audio that was not intended for Alexa is costly, provides no value to Amazon, and is detrimental to our customers’ Alexa experience.
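Another aside (again, not part of the letter): the pipeline Amazon describes in answers 3(a) through 3(c), a short on-device buffer that is continuously overwritten, a keyword spotter that opens a stream to the cloud when the wake word is matched, and an endpoint detector that closes the stream when the request ends, can be sketched roughly as follows. The interfaces, names, and buffer size here are my own assumptions for illustration, not Amazon's actual implementation.

```typescript
// Illustrative sketch of a wake-word-gated audio pipeline (hypothetical APIs).
// While idle, audio frames only pass through a small ring buffer on the device;
// nothing is streamed until the keyword spotter fires, and streaming stops as
// soon as the endpoint detector decides the request is finished.

type AudioFrame = Float32Array;

interface KeywordSpotter { matchesWakeWord(recent: AudioFrame[]): boolean; }
interface EndpointDetector { endOfSpeech(frame: AudioFrame): boolean; }
interface CloudStream { send(frame: AudioFrame): void; close(): void; }

class RingBuffer {
  private frames: AudioFrame[] = [];
  constructor(private capacity: number) {}
  push(frame: AudioFrame): void {
    this.frames.push(frame);
    if (this.frames.length > this.capacity) this.frames.shift(); // overwrite oldest
  }
  contents(): AudioFrame[] { return [...this.frames]; }
}

function runDevice(
  mic: Iterable<AudioFrame>,
  spotter: KeywordSpotter,
  endpoint: EndpointDetector,
  openStream: () => CloudStream,
): void {
  const buffer = new RingBuffer(50); // assumed to hold roughly a second of audio
  let stream: CloudStream | null = null;

  for (const frame of mic) {
    if (stream === null) {
      // Idle: audio only touches the short on-device buffer.
      buffer.push(frame);
      if (spotter.matchesWakeWord(buffer.contents())) {
        stream = openStream();        // the blue light / tone would turn on here
      }
    } else {
      // Active: stream the request until end-of-speech is detected.
      stream.send(frame);
      if (endpoint.endOfSpeech(frame)) {
        stream.close();               // stop streaming as soon as the request ends
        stream = null;
      }
    }
  }
}
```

The point of the sketch is that, by this design, nothing leaves the device until the wake word fires; the privacy question running through the rest of the correspondence is about what the cloud keeps after that point.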

  • 3(d). Amazon’s July 27, 2018 letter indicates that the Alexa system comes with a setting whereby a user can allow Alexa to respond to a series of requests without the customer needing to repeat the wake word.
    • (i) Is this a default setting, or does a consumer need to affirmatively enable this setting?

“Follow-Up Mode” allows customers to ask Alexa multiple questions and commands without having to use the wake word each time. Follow-Up Mode must be affirmatively enabled by the customer in their Alexa app. Like other customer requests to Alexa, when a customer has enabled Follow-Up Mode, Alexa will end the stream immediately once our automatic speech recognition system determines the customer has stopped speaking to Alexa. A blue light illuminates on the Echo device to indicate when audio is being streamed to the cloud, and customers can also enable an audible tone that plays when their Echo device begins and ends streaming audio to the cloud.

  • 3(d).
    • (ii) For how long does Alexa listen for subsequent commands after the wake word is spoken when this setting is enabled?

Alexa will end the stream immediately once the user ends the conversation or if Alexa detects silence or speech that isn’t intended for Alexa. Customers can find information about the Follow-Up Mode in the Alexa App and on Amazon.com (https://www.amazon.com/gp/help/customer/display.html?nodeId=202201630).

Thank you again for your interest in Amazon's privacy and data security practices with respect to our Alexa voice service.

About the Author(s)

David Herron (davidherron.com) is a writer and software engineer focusing on the wise use of technology. He is especially interested in clean energy technologies like solar power, wind power, and electric cars. David worked for nearly 30 years in Silicon Valley on software ranging from electronic mail systems, to video streaming, to the Java programming language, and has published several books on Node.js programming and electric vehicles.