
 # Select to Speak (for developers)

 Select to Speak is a Chrome OS feature to read text on the screen out loud.


 There are millions of users who greatly benefit from some text-to-speech but
 don’t quite need a full screen reading experience where everything is read
 aloud each step of the way. For these users, whether they are low vision,
 dyslexic, neurologically diverse, or simply prefer to listen to text read
 aloud instead of visually reading it, we have built Select-to-Speak.

 ## Using Select to Speak

 Go to Chrome settings, Accessibility settings, “Manage accessibility Features”,
 and enable “Select to Speak”. You can adjust the preferred voice, highlight
 color, and access text-to-speech preferences from the settings page.

 With this feature enabled, you can read text on the screen in one of three ways:

 - Hold down the Search key, then use the touchpad or external mouse to tap or
 drag a region to be spoken

 - Tap the Select-to-Speak icon in the status tray and use the mouse or
 touchscreen to select a region to be spoken

 - Highlight text and use Search+S to speak only the selected text.

 Read more on the
 [Chrome help page](https://support.google.com/chromebook/answer/9032490?hl=en)
 under “Listen to part of a page”.

 ## Reporting bugs

 Use bugs.chromium.org, filing bugs under the component
 [UI>Accessibility>SelectToSpeak](https://bugs.chromium.org/p/chromium/issues/list?sort=-opened&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified&q=component%3AUI%3EAccessibility%3ESelectToSpeak%20&can=2).

 ## Developing

 *Select to Speak will be abbreviated STS in this section.*

 ### Code location

 STS code lives mainly in four places:

 - A component extension to do the bulk of the logic and processing,
 chrome/browser/resources/chromeos/accessibility/select_to_speak/

 - An event handler, ash/events/select_to_speak_event_handler.h

 - The status tray button, ash/system/accessibility/select_to_speak/select_to_speak_tray.h

 - The floating panel, ash/system/accessibility/select_to_speak/select_to_speak_menu_bubble_controller.h

 In addition, there are settings for STS in
 chrome/browser/resources/ash/settings/os_a11y_page/select_to_speak_subpage.*

 ### Tests

 Tests are in ash_unittests and in browser_tests:

 ```
 out/Release/ash_unittests --gtest_filter="SelectToSpeak*"
 out/Release/browser_tests --gtest_filter="SelectToSpeak*"
 ```

 ### Debugging

 Developers can add log lines to any of the C++ files and see output in the
 console. To debug the STS extension, the easiest way is from an external
 browser. Start Chrome OS on Linux with this command-line flag:

 ```
 out/Release/chrome --remote-debugging-port=9222
 ```

 Now open http://localhost:9222 in a separate instance of the browser, and
 debug the Select to Speak extension background page from there.

 ## How it works

 Like [ChromeVox](chromevox.md), STS is implemented mainly as a component
 Chrome extension which is always loaded and running in the background when
 enabled, and unloaded when disabled. Outside of the extension, the main piece
 of STS code is an EventRewriter which forwards keyboard and mouse events to
 the extension as needed, so that the extension can get events systemwide. The
 status tray button and floating panel are also implemented in native code.

 The STS extension does the following, at a high level:

 1. Tracks key and mouse events to determine when a user has:

     a. Held down “search” and clicked & dragged a rectangle to specify a
     selection

     b. Used “search” + “s” to indicate that selected text should be read

     c. Requested speech to be canceled by tapping ‘control’ or ‘search’
     alone

 2. Determines the Accessibility nodes that make up the selected region

 3. Sends utterances to the Chrome Text-to-Speech extension to be spoken

 4. Tracks utterance progress and updates the focus ring and highlight as needed.

 ### Select to Speak extension structure

 Most STS logic takes place in
 [select_to_speak.js](https://cs.chromium.org/chromium/src/chrome/browser/resources/chromeos/accessibility/select_to_speak/select_to_speak.js).

 #### User input

 User input from mouse, keyboard, and touchscreen events is handled by
 input_handler.js. Most logic here revolves around keeping track of state to
 see if the user has requested text using one of the three ways to activate
 the feature: search + mouse, tray button + mouse, or search + s.

 #### Determining selected content

 Once input_handler determines that the user did request text to be spoken,
 STS must determine which part of the page to read. To do this it requests
 information from the Automation API, and then generates a list of
 AutomationNodes to be read.

 ##### With mouse or touchpad

 select_to_speak.js fires a HitTest to the Automation API at the center of
 the rect selected by the user. When the API gets a result it returns via
 SelectToSpeak.onAutomationHitTest_. This function walks up from the hit
 test node to the nearest container to find a root, then back down through
 all the root’s children to find ones that overlap with the selected rect.
 Walking back down through the children occurs in NodeUtils.findAllMatching,
 and results in a list of AutomationNodes that can be sent for speech.

 If the rect size is below a certain threshold, all nodes within the
 overlapped block parent are selected.
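
The walk back down can be pictured with a minimal sketch. It assumes plain objects standing in for AutomationNodes, each with a `location` rect (`left`, `top`, `width`, `height`) and a `children` array. `rectsOverlap` and `findOverlappingLeaves` are hypothetical helper names for illustration, not the actual NodeUtils functions.

```javascript
// Hypothetical helper: true if two {left, top, width, height} rects intersect.
function rectsOverlap(a, b) {
  return a.left < b.left + b.width && b.left < a.left + a.width &&
      a.top < b.top + b.height && b.top < a.top + a.height;
}

// Hypothetical helper: collects, in tree order, the leaf nodes under `root`
// whose location overlaps `rect`. Subtrees that do not overlap are skipped.
function findOverlappingLeaves(root, rect, out = []) {
  for (const child of root.children) {
    if (!rectsOverlap(child.location, rect)) {
      continue;
    }
    if (child.children.length === 0) {
      out.push(child);
    } else {
      findOverlappingLeaves(child, rect, out);
    }
  }
  return out;
}

// Stand-in tree: a container with two lines of text.
const leaf = (name, top) =>
    ({name, location: {left: 0, top, width: 100, height: 10}, children: []});
const root = {
  name: 'container',
  location: {left: 0, top: 0, width: 100, height: 20},
  children: [leaf('first line', 0), leaf('second line', 10)],
};

// A selection rect that only touches the second line.
const selected =
    findOverlappingLeaves(root, {left: 10, top: 12, width: 20, height: 5});
console.log(selected.map(node => node.name));
```

The sketch prunes any subtree whose own rect misses the selection, which is a simplification; it does not model the small-rect threshold described above.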

 ##### With search + s

 select_to_speak.js requests focus information from the Automation API. The
 focus result is sent to SelectToSpeak.requestSpeakSelectedText_, which
 uses Automation selection to determine which nodes are selected. The main
 complexity here is converting between the Automation selection and its deep
 equivalent, i.e. from parent nodes and offsets to their leaves.
 This occurs in NodeUtils.getDeepEquivalentForSelection. When the first and
 last nodes in selection are found, SelectToSpeak.readNodesInSelection_ is
 used to determine the entire list of AutomationNodes which should be sent
 for speech.

 #### Speaking selected content

 SelectToSpeak.startSpeechQueue_ takes a list of AutomationNodes, determines
 their text content, and sends the result to the Text to Speech API for
 speech. It begins by mapping the text content of the nodes to the nodes
 themselves, so that STS can speak smoothly across node boundaries (e.g.
 across line breaks) and follow speech progress with a highlight. The mapping
 between text and nodes occurs in repeated calls to
 ParagraphUtils.buildNodeGroup to build lists of nodes that should be spoken
 smoothly.


 Each node group is sent to the Text to Speech API, with callbacks to allow
 for speech progress tracking, enabling the highlight to be dynamically
 updated with each word.
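
One way to picture the text-to-node mapping is the sketch below. It concatenates node text into a single utterance string while recording where each node's text starts. `buildGroup` and the `{text, nodes}` shape are illustrative assumptions and do not reflect the actual output of ParagraphUtils.buildNodeGroup.

```javascript
// Hypothetical sketch: joins the `name` of each node into one string and
// records the character offset at which each node's text begins.
function buildGroup(nodes) {
  let text = '';
  const entries = [];
  for (const node of nodes) {
    if (text.length > 0) {
      text += ' ';  // Separate nodes so words do not run together.
    }
    entries.push({node, startChar: text.length});
    text += node.name;
  }
  return {text, nodes: entries};
}

// Two stand-in nodes, as if one sentence were split across a line break.
const group = buildGroup([{name: 'Select to'}, {name: 'Speak reads text.'}]);
console.log(group.text);
console.log(group.nodes.map(entry => entry.startChar));
```

Keeping the start offsets alongside the joined text is what lets a later word event be traced back to the node it came from.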

 #### Highlighting content during speech

 On each word boundary event, the TTS API sends a callback which is handled
 by SelectToSpeak.onTtsWordEvent_. This is used to check against the list of
 nodes being spoken to see which node is currently being spoken, and further
 check against the words in the node to see which word is spoken.
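
The lookup on each word event could look roughly like the sketch below. It assumes a list of entries sorted by the character offset where each node's text starts, and a word event carrying a `charIndex` into the spoken string. `findEntryForCharIndex` and `wordAt` are hypothetical names, not functions from select_to_speak.js.

```javascript
// Hypothetical sketch: returns the last entry whose text starts at or before
// `charIndex`. `entries` must be sorted by ascending `startChar`.
function findEntryForCharIndex(entries, charIndex) {
  let current = null;
  for (const entry of entries) {
    if (entry.startChar > charIndex) {
      break;
    }
    current = entry;
  }
  return current;
}

// Hypothetical sketch: the word that starts at `charIndex`, ending at the
// next whitespace character or at the end of the text.
function wordAt(text, charIndex) {
  const rest = text.slice(charIndex);
  const end = rest.search(/\s/);
  return end === -1 ? rest : rest.slice(0, end);
}

const text = 'Select to Speak reads text.';
const entries = [
  {name: 'Select to', startChar: 0},
  {name: 'Speak reads text.', startChar: 10},
];

// Simulated word event pointing at the start of "reads".
const event = {type: 'word', charIndex: 16};
const entry = findEntryForCharIndex(entries, event.charIndex);
console.log(entry.name, '->', wordAt(text, event.charIndex));
```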

 #### Edge cases

 STS must also handle cases where:

 - Nodes become invalid during speech, e.g. if a page was closed. Speech
 should continue, but the highlight stops.

 - Nodes disappear and re-appear during speech (a user may have switched
 tabs and switched back, or scrolled). The highlight should resume.

 This is handled in SelectToSpeak.updateFromNodeState_.

 ### Communication with SelectToSpeakTray

 STS runs in the extension process, but needs to communicate its three states
 (Inactive, Selecting, and Speaking) to the STS button in the status tray.
 It also needs to listen for users requesting state change using the
 SelectToSpeakTray button. The STS extension uses the AccessibilityPrivate
 method setSelectToSpeakState to inform the SelectToSpeakTray of a
 status change, and listens to onSelectToSpeakStateChangeRequested to know
 when a user wants to change state. The STS extension is the source of truth
 for STS state.
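
The "extension is the source of truth" arrangement can be sketched as follows. The sketch takes an `api` object standing in for `chrome.accessibilityPrivate` so that it runs outside of Chrome OS. The `StateTracker` class, the lowercase state strings, and the transition rule for a tray button press are assumptions made for illustration.

```javascript
// State names mirror the three states above. The string values and the
// transition rule below are assumptions for illustration.
const State = {
  INACTIVE: 'inactive',
  SELECTING: 'selecting',
  SPEAKING: 'speaking',
};

class StateTracker {
  // `api` stands in for chrome.accessibilityPrivate.
  constructor(api) {
    this.api_ = api;
    this.state_ = State.INACTIVE;
    api.onSelectToSpeakStateChangeRequested.addListener(
        () => this.onStateChangeRequested_());
  }

  // The extension owns the state and pushes every change to the tray.
  setState(state) {
    this.state_ = state;
    this.api_.setSelectToSpeakState(state);
  }

  // Assumed behavior: a tray button press starts selection when inactive and
  // returns to inactive otherwise.
  onStateChangeRequested_() {
    this.setState(
        this.state_ === State.INACTIVE ? State.SELECTING : State.INACTIVE);
  }
}

// Fake API that records calls and lets us simulate a tray button press.
const listeners = [];
const sent = [];
const fakeApi = {
  setSelectToSpeakState: state => sent.push(state),
  onSelectToSpeakStateChangeRequested: {addListener: cb => listeners.push(cb)},
};

const tracker = new StateTracker(fakeApi);
listeners.forEach(cb => cb());     // Tray button pressed while inactive.
tracker.setState(State.SPEAKING);  // The user selected a region.
listeners.forEach(cb => cb());     // Tray button pressed while speaking.
console.log(sent);
```

The tray never changes state on its own in this sketch; it only requests a change and waits to be told the result.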

 ### Navigation features

 STS will display a floating control panel when activated. The control panel
 hosts controls for pause/resume, updating reading speed, navigating by sentence
 or paragraph, and deactivating STS.

 #### Floating control panel

 The panel is implemented as a native Ash component,
 [select_to_speak_menu_bubble_controller.h](https://source.chromium.org/chromium/chromium/src/+/main:ash/system/accessibility/select_to_speak/select_to_speak_menu_bubble_controller.h).
 Similar to focus rings, the STS component extension communicates with the panel
 via the `chrome.accessibilityPrivate` API. The
 `chrome.accessibilityPrivate.updateSelectToSpeakPanel` API controls the
 visibility and button states, and panel actions are communicated back to the
 extension by adding a listener to
 `chrome.accessibilityPrivate.onSelectToSpeakPanelAction`.

 When the panel is displayed, STS will no longer dismiss itself when TTS
 playback is complete. The user must quit STS either from the panel or
 the tray button.

 ##### Keyboard shortcuts

 When the panel is displayed, it is initially focused and captures keypresses to
 implement keyboard shortcuts:

 *  Space - activates currently focused button, which is 'Pause/Resume'
    initially.
 *  Left Arrow - Navigate to previous sentence (for RTL languages, this is Right
    Arrow)
 *  Right Arrow - Navigate to next sentence (for RTL languages, this is Left
    Arrow)
 *  Up Arrow - Navigate to previous paragraph
 *  Down Arrow - Navigate to next paragraph

 If the panel loses focus, keyboard shortcuts will no longer work. The user can
 press the Search+S keyboard shortcut (with no text selection) to restore focus
 to the panel.
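
The shortcut table, including the RTL swap, can be restated as a small lookup. The panel handles these keys in native code, so this is only an illustration of the mapping; the action names and the use of DOM-style key names are assumptions.

```javascript
// Action names are hypothetical labels for this sketch.
const Action = {
  ACTIVATE_FOCUSED: 'activateFocusedButton',
  PREVIOUS_SENTENCE: 'previousSentence',
  NEXT_SENTENCE: 'nextSentence',
  PREVIOUS_PARAGRAPH: 'previousParagraph',
  NEXT_PARAGRAPH: 'nextParagraph',
};

// Maps a key name to a panel action, or null if the key is not a shortcut.
// For RTL languages the left and right arrows swap meaning.
function actionForKey(key, isRtl) {
  switch (key) {
    case ' ':
      return Action.ACTIVATE_FOCUSED;
    case 'ArrowLeft':
      return isRtl ? Action.NEXT_SENTENCE : Action.PREVIOUS_SENTENCE;
    case 'ArrowRight':
      return isRtl ? Action.PREVIOUS_SENTENCE : Action.NEXT_SENTENCE;
    case 'ArrowUp':
      return Action.PREVIOUS_PARAGRAPH;
    case 'ArrowDown':
      return Action.NEXT_PARAGRAPH;
    default:
      return null;
  }
}

console.log(actionForKey('ArrowLeft', false));
console.log(actionForKey('ArrowLeft', true));
```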

 ##### Disallowed nodes

 The panel is not shown when STS is activated on nodes where navigation features
 do not add value, such as in system UI or top-level windows.

 *  System UI nodes - any nodes that have a `root` with role `desktop`
 *  Root nodes that are children of the root `desktop` node
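
The two rules can be restated as a predicate over stand-in nodes. This is a sketch under assumptions: nodes are plain objects with `role`, `root`, and `parent` properties, a root node is modeled as being its own `root`, and `isPanelDisallowed` is a hypothetical name rather than the function used in the extension.

```javascript
// Hypothetical predicate restating the two rules above.
function isPanelDisallowed(node) {
  // Rule 1: system UI nodes, whose tree root has role `desktop`.
  if (node.root.role === 'desktop') {
    return true;
  }
  // Rule 2: a root node that is a direct child of the desktop root.
  // In this stand-in model a root node is its own `root`.
  const isRootNode = node.root === node;
  return isRootNode && node.parent !== null && node.parent.role === 'desktop';
}

// Stand-in tree: desktop > top-level window root > web view > page root.
const desktop = {role: 'desktop', parent: null};
desktop.root = desktop;
const shelfButton = {role: 'button', parent: desktop, root: desktop};
const topLevelRoot = {role: 'window', parent: desktop};
topLevelRoot.root = topLevelRoot;
const webView = {role: 'webView', parent: topLevelRoot, root: topLevelRoot};
const pageRoot = {role: 'rootWebArea', parent: webView};
pageRoot.root = pageRoot;
const paragraph = {role: 'paragraph', parent: pageRoot, root: pageRoot};

console.log(isPanelDisallowed(shelfButton));   // System UI node.
console.log(isPanelDisallowed(topLevelRoot));  // Root directly under desktop.
console.log(isPanelDisallowed(paragraph));     // Ordinary page content.
```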

 #### Pause/Resume

 Since `chrome.tts.pause` and `chrome.tts.resume` are not consistently
 implemented across all TTS engines, STS implements pause/resume functionality
 using the `chrome.tts.stop` and `chrome.tts.speak` APIs. While TTS is playing,
 STS keeps track of the current word offset, and when TTS is resumed, it will
 call `speak` with text trimmed to the start of the last spoken word.

 Resuming TTS behaves differently depending on the context:

 *  If TTS was paused within the user-selected text, resuming will play until
    the end of the selected text.
 *  If TTS stopped when it reached the end of the selected text, but before the
    end of the paragraph, resuming will continue from that point to the end of
    the paragraph.
 *  If TTS stopped when it reached the end of a paragraph, resuming will speak
    the next paragraph.
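
The stop-then-speak approach to pausing can be sketched as below. The `tts` argument stands in for `chrome.tts` so the sketch runs outside of Chrome; `ResumableSpeech` is a hypothetical class, and the sketch only covers resuming within the same text, not the paragraph-boundary cases listed above.

```javascript
// Hypothetical sketch of pause/resume built on stop + speak.
class ResumableSpeech {
  // `tts` stands in for chrome.tts.
  constructor(tts, text) {
    this.tts_ = tts;
    this.text_ = text;
    this.baseOffset_ = 0;     // Where the current utterance starts in text_.
    this.lastWordStart_ = 0;  // Start of the last spoken word in text_.
  }

  start() {
    this.speakFrom_(0);
  }

  // Word events report offsets relative to the utterance being spoken, so
  // they are shifted by the utterance's start to index into the full text.
  onWord(charIndex) {
    this.lastWordStart_ = this.baseOffset_ + charIndex;
  }

  pause() {
    this.tts_.stop();
  }

  // Resumes by speaking the text trimmed to the last spoken word.
  resume() {
    this.speakFrom_(this.lastWordStart_);
  }

  speakFrom_(offset) {
    this.baseOffset_ = offset;
    this.tts_.speak(this.text_.slice(offset));
  }
}

// Fake TTS that records what it was asked to do.
const spoken = [];
let stops = 0;
const fakeTts = {speak: text => spoken.push(text), stop: () => stops++};

const speech = new ResumableSpeech(fakeTts, 'The quick brown fox jumps');
speech.start();
speech.onWord(0);
speech.onWord(4);
speech.onWord(10);  // "brown"
speech.pause();
speech.resume();
speech.onWord(6);   // "fox", relative to the trimmed utterance.
speech.pause();
speech.resume();
console.log(spoken);
```

Tracking the base offset matters because each resumed utterance restarts its character indices at zero.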

 #### Paragraph navigation

 Users can navigate to adjacent paragraphs from the current block parent when
 Select-to-speak is active. A 'paragraph' is any block element as defined by
 [ParagraphUtils.isBlock](https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/resources/chromeos/accessibility/select_to_speak/paragraph_utils.js)
 and the navigation occurs in DOM-order.
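
As a rough sketch of DOM-order navigation, the code below collects block nodes with a pre-order walk and steps to the neighbor of the current one. The `isBlock` stand-in here only checks a `display` property and is much simpler than ParagraphUtils.isBlock; `collectBlocks` and `adjacentBlock` are hypothetical names.

```javascript
// Simplified stand-in for ParagraphUtils.isBlock; the real check is more
// involved than a single property comparison.
const isBlock = node => node.display === 'block';

// Collects block nodes under `root` in DOM (pre-order) order.
function collectBlocks(root, out = []) {
  if (isBlock(root)) {
    out.push(root);
  }
  for (const child of root.children) {
    collectBlocks(child, out);
  }
  return out;
}

// Returns the block one step after (+1) or before (-1) `current`, or null
// when there is no such block.
function adjacentBlock(root, current, direction) {
  const blocks = collectBlocks(root);
  const index = blocks.indexOf(current);
  if (index === -1) {
    return null;
  }
  return blocks[index + direction] || null;
}

// Stand-in document: three paragraphs, the second nested in an inline wrapper.
const p1 = {name: 'p1', display: 'block', children: []};
const p2 = {name: 'p2', display: 'block', children: []};
const p3 = {name: 'p3', display: 'block', children: []};
const wrapper = {name: 'wrapper', display: 'inline', children: [p2]};
const doc = {name: 'doc', display: 'inline', children: [p1, wrapper, p3]};

console.log(adjacentBlock(doc, p1, 1).name);
console.log(adjacentBlock(doc, p3, -1).name);
```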

 #### Sentence navigation

 Paragraphs are split into sentences based on the `sentenceStarts` property of
 an AutomationNode. Users can skip to previous and next sentences using a
 technique similar to pause/resume (`stop` then `speak` with trimmed text). See
 [sentence_utils.js](https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/resources/chromeos/accessibility/select_to_speak/sentence_utils.js)
 for logic on breaking node groups into sentences.
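
For a single node, skipping between sentences might look like the sketch below. It assumes `sentenceStarts` is an ascending array of character offsets into the node's text; the helper names are hypothetical, and the real logic in sentence_utils.js works across node groups rather than one string.

```javascript
// Index of the sentence containing `offset`, given ascending `sentenceStarts`.
function currentSentenceIndex(sentenceStarts, offset) {
  let index = 0;
  for (let i = 0; i < sentenceStarts.length; i++) {
    if (sentenceStarts[i] <= offset) {
      index = i;
    }
  }
  return index;
}

// Start offset of the next (+1) or previous (-1) sentence relative to the
// sentence containing `offset`, or null if there is none.
function adjacentSentenceStart(sentenceStarts, offset, direction) {
  const target = currentSentenceIndex(sentenceStarts, offset) + direction;
  return target >= 0 && target < sentenceStarts.length ?
      sentenceStarts[target] :
      null;
}

const text = 'First one. Second one. Third one.';
const sentenceStarts = [0, 11, 23];

// The current word is inside the second sentence (offset 14).
const nextStart = adjacentSentenceStart(sentenceStarts, 14, 1);
console.log(text.slice(nextStart));
```

The trimmed string is what would then be handed to `speak` after calling `stop`.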

 #### Reading speed

 Users can slow down or speed up TTS speaking rate using the floating control
 panel. The rate the user selects in the panel is multiplied by the system
 default TTS rate. So if the user selects 1.2x reading speed in the panel and
 has a system default of 2.0x, the effective TTS rate will be 2.4x.

 When users adjust reading speed, `chrome.tts.stop` is called, and
 `chrome.tts.speak` is then called with text trimmed to the current word
 position, passing in the new effective TTS rate as an option.
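
The rate arithmetic and the restart can be sketched together. The `tts` argument stands in for `chrome.tts`, whose `speak` call accepts a `rate` option; `effectiveRate` and `changeSpeed` are hypothetical helper names.

```javascript
// The panel rate scales the system default rate.
function effectiveRate(panelRate, systemRate) {
  return panelRate * systemRate;
}

// Restarts speech from the current word with the new effective rate.
// `tts` stands in for chrome.tts so the sketch runs outside of Chrome.
function changeSpeed(tts, text, currentWordStart, panelRate, systemRate) {
  tts.stop();
  tts.speak(
      text.slice(currentWordStart),
      {rate: effectiveRate(panelRate, systemRate)});
}

// Fake TTS that records calls.
const calls = [];
const fakeTts = {
  stop: () => calls.push(['stop']),
  speak: (utterance, options) => calls.push(['speak', utterance, options]),
};

// The user picks 1.2x in the panel with a 2.0x system default, while the
// word "fox" (offset 16) is being spoken.
changeSpeed(fakeTts, 'The quick brown fox jumps', 16, 1.2, 2.0);
console.log(calls);
```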

 ### Special case: Google Drive apps

 Google Drive apps require a few work-arounds to work correctly with STS.

 - Any time a Google Drive document is loaded (such as a Doc, Sheet or Slides
 document), the script
 [select_to_speak_gdocs_script](https://cs.chromium.org/chromium/src/chrome/browser/resources/chromeos/accessibility/select_to_speak/select_to_speak_gdocs_script.js?q=select_to_speak_gdocs_script.js+file:%5Esrc/chrome/browser/resources/chromeos/accessibility/select_to_speak/+package:%5Echromium$&dr)
 must be executed to remove aria-hidden from the content container.

 - Using search+s to read highlighted text uses the clipboard to get text data
 from Google Docs, as selection information may not be available in the
 Automation API. This happens mostly in input_handler.js.

 ### Enhanced network voices

 As of M94, Select-to-speak supports natural, server-generated voices. When
 enhanced network voices are enabled, Select-to-speak passes the user's selected
 natural voice name to `chrome.tts.speak`. The TTS request is handled by the
 [Enhanced Network TTS engine](https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/resources/chromeos/accessibility/enhanced_network_tts/).
 The TTS engine then passes the request to native code
 ([EnhancedNetworkTts](https://source.chromium.org/chromium/chromium/src/+/main:chromeos/ash/components/enhanced_network_tts/enhanced_network_tts_impl.h)),
 which in turn sends a network request to the ReadAloud API, which produces
 synthesized audio.

 For instructions on how to add new voices, see
 [go/chromeos-natural-voices](go/chromeos-natural-voices).


 ## For Googlers

 Googlers can check out the Select to Speak feature design docs for more
 details on design as well as UMA.

 - Overall product design, [go/select-to-speak-design](go/select-to-speak-design)

 - On-Screen UI for touch and tablet modes,
 [go/chromeos-sts-on-screen-ui](go/chromeos-sts-on-screen-ui)

 - Reading text at keystroke,
 [go/chromeos-sts-selection-keystroke](go/chromeos-sts-selection-keystroke)

 - Reading text at keystroke in Google Drive apps, [go/sts-selection-in-drive](go/sts-selection-in-drive)

 - Per word highlighting,
 [go/chrome-sts-sentences-and-words](go/chrome-sts-sentences-and-words) and
 [go/chromeos-sts-highlight](go/chromeos-sts-highlight)

 - Navigation features, [go/enhanced-sts-dd](go/enhanced-sts-dd)

 - Enhanced network voices, [go/wavenet-chromeos-dd](go/wavenet-chromeos-dd)