Voice Input

DataWedge 13.0

Overview

Voice Input enables DataWedge to convert spoken entries into text, as if the data were typed or acquired from a scan. Voice Input uses the Google speech recognition engine included on GMS devices. Voice-to-data capture is useful when a barcode is wet, damaged, covered with stray markings, or otherwise cannot be scanned.

Voice Input options:

  • Trigger voice capture by pressing the PTT button
  • Terminate voice capture with a timeout value
  • Set voice commands to navigate within the foreground app or issue specific key presses: TAB, ENTER, NEXT, PREVIOUS, ESC, CLEAR
  • Limit returned data to alpha or numeric characters
  • Play an audio prompt when waiting for data capture
  • Validate spoken data and edit acquired data as needed
  • Capture voice data offline

This feature is supported only on Zebra GMS devices with Android Nougat and later.

Watch the DevTalk presentation on DataWedge Voice Input. Note: The start phrase and its related options are discontinued in DataWedge 13 and higher.

How it Works

Voice data capture is activated either by pressing the PTT button or through the DataWedge Soft Trigger intent API. When relying on the PTT button, Voice Input is configured in the DataWedge profile, and the application receiving the voice capture data must be associated with that profile. When the application is launched, a message appears: "Press and hold PTT button to speak data." Pressing the PTT button plays an audio tone to indicate that voice capture can begin. The user speaks the data to be captured and releases the PTT button when done. The spoken data is then displayed.

When data is delivered through Intent Output, the data source is identified as "voice". The application can use this value to distinguish voice data from other input sources and process it according to its requirements. Barcode scanning and voice input can exist in the same DataWedge profile, so both data capture methods can be used interchangeably.
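The source check described above can be sketched as follows. This is a platform-neutral illustration in Python rather than Android code: the intent extras are modeled as a plain dictionary, the function name `handle_capture` is hypothetical, and the two extra keys are assumptions based on DataWedge Intent Output that should be confirmed against the Intent Output documentation. Only the "voice" source value comes from this guide.

```python
# Assumed extra keys (verify against the DataWedge Intent Output documentation).
SOURCE_KEY = "com.symbol.datawedge.source"
DATA_KEY = "com.symbol.datawedge.data_string"


def handle_capture(extras):
    """Return a (channel, data) tuple describing how the app should treat a capture.

    `extras` is a dict standing in for the extras of the received intent.
    """
    source = extras.get(SOURCE_KEY, "")
    data = extras.get(DATA_KEY, "")
    if source == "voice":
        # Spoken input may carry stray whitespace, so trim it before use.
        return ("voice", data.strip())
    # Any other source (for example, a barcode scan) is passed through unchanged.
    return ("other", data)
```

An application might use the returned channel to decide, for instance, whether to show the captured value for confirmation before committing it.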

Note: When Voice Input is enabled, the Google voice engine included with Google App version 11.21.9.21 (and later) plays a notification tone every few seconds to indicate that it is listening in the background. To minimize the frequency of this tone, Zebra recommends creating a profile (separate from Profile0) and associating it with the required applications/activities. The notification tone is then heard only when the associated app is in the foreground.

Main Features

Voice Input features are accessible from the DataWedge profile.

[Image: Voice input settings]

  • Enabled - Enables voice input.

  • Data capture start option - Select the option to trigger voice capture:

    • PTT button - Sets the PTT button, if it exists on the device, to trigger voice capture. For devices that do not have a PTT button, the PTT button may be mapped to an available button on the device.
    • None - Voice capture is enabled only via intent API; see Soft Trigger API.
  • End detection timeout - Sets the timeout value (in seconds) for the data capture. If the value is set to "0", DataWedge waits indefinitely for the data capture. The timeout is approximate; a delay of 1 to 2 seconds can occur. The default value is "0".

  • Voice commands - Configure and set voice commands to navigate a foreground application. Commands are supported only when PTT button is selected as the start option.

    • Tab command - Sends a tab key event when the specified phrase is spoken.
      • Enabled - Enable/disable the tab command.
      • Phrase - Sets the command phrase to send a tab key. The default phrase is "send tab".
    • Enter command - Sends an enter key event when the specified phrase is spoken.
      • Enabled - Enable/disable the enter command.
      • Phrase - Sets the command phrase to send an enter key. The default phrase is "send enter".
    • Move next command - Moves to the next input field when the specified phrase is spoken.
      • Enabled - Enable/disable the move next command.
      • Phrase - Sets the command phrase to move to the next input field. The default phrase is "move next".
    • Move previous command - Moves to the previous input field when the specified phrase is spoken.
      • Enabled - Enable/disable the move previous command.
      • Phrase - Sets the command phrase to move to the previous input field. The default phrase is "move previous".
    • Escape command - Sends the escape (ESC) key when the specified phrase is spoken.
      • Enabled - Enable/disable the escape command.
      • Phrase - Sets the command phrase to send an ESC key. The default phrase is "send escape".
    • Clear command - Clears the input field currently in focus when the specified phrase is spoken.
      • Enabled - Enable/disable the clear command.
      • Phrase - Sets the command phrase to clear the field. The default phrase is "clear".
  • Data type - Restricts the returned data to the selected type: Any, Alpha, or Numeric.
    • Any - All spoken data is returned. For example, if ABC123 is spoken, ABC123 is returned as is.
    • Alpha - Only alphabetic characters are returned. For example, if ABC123 is spoken, only ABC is returned.
    • Numeric - Only digits are returned. For example, if ABC123 is spoken, only 123 is returned.
  • Data capture waiting tone - Enables/disables audio feedback when the device is waiting to capture data.

  • Offline speech recognition - Enables speech recognition when there is no access to the internet. An offline speech recognition engine is used to detect the spoken data.

  • Validation window - Displays the spoken data after capture so that the result can be validated and, if needed, edited on the same screen. This is useful in offline mode, where results may be less accurate.

    [Image: Validation window]


See Limitations below.
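The effect of the Data type setting described above can be pictured as a simple character filter. The following is a minimal sketch in Python, not DataWedge's actual implementation: the function name `filter_by_data_type` is hypothetical, the numeric codes follow the voice_data_type values listed under Configuration (0 - Any, 1 - Alpha, 2 - Numeric), and restricting the match to ASCII letters and digits is an assumption, since this guide does not state how spaces, punctuation, or non-English characters are handled.

```python
def filter_by_data_type(text, data_type):
    """Return `text` restricted to the characters allowed by `data_type`.

    data_type: 0 = Any, 1 = Alpha, 2 = Numeric (codes taken from voice_data_type).
    """
    if data_type == 0:
        # Any: return the captured text unchanged.
        return text
    if data_type == 1:
        # Alpha: keep ASCII letters only (assumption: ASCII-only matching).
        return "".join(ch for ch in text if ch.isascii() and ch.isalpha())
    if data_type == 2:
        # Numeric: keep ASCII digits only.
        return "".join(ch for ch in text if ch.isascii() and ch.isdigit())
    raise ValueError("data_type must be 0, 1, or 2")
```

With the example from the settings list, the spoken value ABC123 passes through unchanged for Any, is reduced to ABC for Alpha, and is reduced to 123 for Numeric.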

Configuration

Voice Input Parameters

DataWedge Voice Input can be controlled programmatically with DataWedge APIs. Refer to DataWedge Voice Input Plugin in Set Config API to configure the following Voice Input parameters:

Param Name                            Param Values
voice_input_enabled                   true, false
voice_data_capture_start_option       1 - PTT button (default)
voice_data_capture_start_phrase       start (default)
voice_data_capture_end_phrase         [blank] (default)
voice_end_detection_timeout           0-30 (in seconds)
voice_command_tab_enabled             true, false (default)
voice_command_tab_phrase              send tab (default)
voice_command_enter_enabled           true, false (default)
voice_command_enter_phrase            send enter (default)
voice_command_move_next_enabled       true, false (default)
voice_command_move_next_phrase        move next (default)
voice_command_move_previous_enabled   true, false (default)
voice_command_move_previous_phrase    move previous (default)
voice_command_escape_enabled          true, false (default)
voice_command_escape_phrase           send escape (default)
voice_command_clear_enabled           true, false (default)
voice_command_clear_phrase            clear (default)
voice_data_type                       0 - Any, 1 - Alpha, 2 - Numeric
voice_start_phrase_waiting_tone       true, false
voice_data_capture_waiting_tone       true, false
voice_validation_window               true, false
voice_offline_speech                  true, false

Set Voice Input Configuration Sample

Refer to DataWedge Set Config API.
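As a rough illustration of how the Voice Input parameters might be grouped for a Set Config request, the sketch below builds the configuration as a nested Python dictionary and checks two of the documented value ranges first. It is not Android code and does not send anything to DataWedge. The function name `build_voice_config` is hypothetical, and the bundle keys (PROFILE_NAME, PROFILE_ENABLED, CONFIG_MODE, PLUGIN_CONFIG, PLUGIN_NAME, RESET_CONFIG, PARAM_LIST), the plugin name "VOICE", and the use of string values are assumptions to be verified against the Set Config API documentation.

```python
def build_voice_config(profile_name, params):
    """Return a nested dict mirroring an assumed Set Config bundle layout.

    `params` maps Voice Input parameter names to string values.
    """
    # voice_end_detection_timeout is documented as 0-30 seconds.
    timeout = int(params.get("voice_end_detection_timeout", "0"))
    if not 0 <= timeout <= 30:
        raise ValueError("voice_end_detection_timeout must be 0-30 seconds")
    # voice_data_type is documented as 0 (Any), 1 (Alpha), or 2 (Numeric).
    if params.get("voice_data_type", "0") not in ("0", "1", "2"):
        raise ValueError("voice_data_type must be 0, 1, or 2")
    return {
        "PROFILE_NAME": profile_name,
        "PROFILE_ENABLED": "true",
        "CONFIG_MODE": "UPDATE",
        "PLUGIN_CONFIG": {
            "PLUGIN_NAME": "VOICE",       # assumed plugin name
            "RESET_CONFIG": "false",
            "PARAM_LIST": dict(params),   # copy so the caller's dict is not shared
        },
    }


# Example: enable Voice Input with a 5-second timeout and numeric-only data.
config = build_voice_config(
    "VoiceProfile",  # hypothetical profile name
    {
        "voice_input_enabled": "true",
        "voice_end_detection_timeout": "5",
        "voice_data_type": "2",
    },
)
```

On a device, the equivalent structure would be assembled as nested Android bundles and sent in an intent as described in the Set Config API.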

Limitations

  • Voice Input is validated only with English. For use with other languages, the device must be connected to the internet.
  • Offline speech recognition provides lower accuracy levels.
  • In GMS Restricted mode with the use of App Manager's DisableGMSApps action, Voice Input will not work since it relies on Google speech recognition.
  • Do not use Google Assistant while DataWedge Voice Input is in use, as it can lead to undesirable behavior.
  • Voice Input is not supported if Enterprise Home Screen (EHS) is in restricted mode. However, enabling all the privilege settings in EHS will reinstate Voice Input in DataWedge.
  • If the PTT (push-to-talk) button is released during voice capture, the captured data can take 1 to 2 seconds to appear because the speech engine is still listening at that moment.
  • Do not use Voice Input while PTT Express is enabled and running, as this can lead to unexpected behavior.
  • When Voice Input is enabled in an active DataWedge profile, the media volume stream is muted via the volume control of the device.

Related guides: