NuanceAndroidCore (NAC) was designed to be a highly customizable library, facilitating creation of Siri-like voice assistants, without imposing constraints on the look and feel of these assistants. The platform controls the voice dialog and provides callbacks to the UI layer for displaying recognition results and status of the recording and recognition. Consumers of the library render this feedback as they see fit.
The library also performs data lookups, such as retrieving calendar events and alarms within a range of dates, and actions, such as launching an application, adding, deleting, or changing alarms and events, and playing music. Much of the benefit of the library comes from its ability to perform all this seamlessly and automatically. However, in many cases, individual customers might have their own implementations for one or more of these applications, with their own APIs, and sometimes with added functionality. The challenge was to come up with the best way to support these customizations while still controlling and managing the intricacies of the voice dialog and supplying default behaviors for everything else.
This is where the extensible factory proved to be useful. NAC used this pattern in several places, including creation of its action and lookup handlers. NAC registered its own implementation of each handler (for example, a handler for sending an SMS message). Each customer could then customize specific actions by registering a different factory to override the default implementation, should they desire different behavior. In this manner, they could provide code only for the specific overrides they needed and leverage everything else the library provided, including everything related to the voice dialog for that action and any actions that didn't need customization.
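To make the override idea concrete, here is a minimal sketch of a registry that ships with default factories and lets a customer replace one of them. Every name in it (`ActionHandler`, `ActionHandlerFactory`, `ActionHandlerRegistry`, the "sms" and "alarm" keys as plain strings) is an assumption made for illustration and is not the actual NAC API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is an illustration, not the NAC API.
interface ActionHandler {
    String handle(String payload);
}

interface ActionHandlerFactory {
    ActionHandler create();
}

final class ActionHandlerRegistry {
    private final Map<String, ActionHandlerFactory> factories = new HashMap<>();

    ActionHandlerRegistry() {
        // The library registers its own default factories up front.
        register("sms", () -> payload -> "default sms: " + payload);
        register("alarm", () -> payload -> "default alarm: " + payload);
    }

    // Registering under an existing key replaces the default factory for that key.
    void register(String actionType, ActionHandlerFactory factory) {
        factories.put(actionType, factory);
    }

    // Looks up the factory for the action type and asks it for a fresh handler.
    ActionHandler create(String actionType) {
        ActionHandlerFactory factory = factories.get(actionType);
        if (factory == null) {
            throw new IllegalArgumentException("No handler registered for: " + actionType);
        }
        return factory.create();
    }
}
```

A customer who wants different SMS behavior would call something like `registry.register("sms", ...)` with their own factory. The "alarm" handler, and everything else they did not touch, keeps the library default.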
Each action handler factory is chosen from a map keyed on an element of the server response (for example, an action of type "sms" would select the SMS action handler). Building off the sample code introduced in the last post about dynamic enums, the factory method looks like:
(Disclaimer: This snippet was hand-edited from the original code and not compiled. The possibility exists for compilation errors.)
This technique relies on reflection to build up the handlers, but performance is not a concern: constructing a handler takes a very small portion of the overall time to perform an action, which is typically dominated by the wait for recognition results to come back from the server.
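As a rough illustration of reflection-based construction from a keyed map, the sketch below registers handler classes by action type and instantiates them through their no-argument constructors. It is not the original NAC code. The class and method names (`ReflectiveHandlerFactory`, `Handler`, `DefaultSmsHandler`) are hypothetical, and it uses plain string keys rather than the dynamic enum from the previous post.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real NAC handler interface is not shown in this post.
final class ReflectiveHandlerFactory {

    interface Handler {
        String perform(String payload);
    }

    // Stand-in for a library-provided default handler.
    static final class DefaultSmsHandler implements Handler {
        public DefaultSmsHandler() {}

        @Override
        public String perform(String payload) {
            return "sms sent: " + payload;
        }
    }

    private final Map<String, Class<? extends Handler>> handlerTypes = new HashMap<>();

    // Later registrations under the same key override earlier ones.
    void register(String actionType, Class<? extends Handler> type) {
        handlerTypes.put(actionType, type);
    }

    // Finds the class registered for the action type and builds it reflectively.
    Handler create(String actionType) {
        Class<? extends Handler> type = handlerTypes.get(actionType);
        if (type == null) {
            throw new IllegalArgumentException("Unknown action type: " + actionType);
        }
        try {
            Constructor<? extends Handler> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot construct handler for " + actionType, e);
        }
    }
}
```

The reflective call happens once per action, so its cost is small next to a network round trip to the recognition server.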
The voice recognition logic and dialog management for this system are server-side, so new voice domains that NAC is not configured for can be added and deployed to the field after NAC has already been released. This is where the dynamic enum class discussed in the last post became useful. With the dynamic enum, NAC could register all its known handlers with immutable keys, while also supporting key types it doesn't know about. For example, the SMS key type would be defined by NAC and used by a customer wishing to override default SMS handling, but the customer could also create a new key, say "sports", on the fly should the "sports" domain be added to the server domains after the NAC library has been released.
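For readers who have not seen the previous post, the sketch below shows one way a key type can combine predefined constants with keys created at runtime. It is a simplified stand-in, not the dynamic enum class from that post, and `ActionKey` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the dynamic enum idea; not the class from the earlier post.
final class ActionKey {
    // Declared first so it is initialized before the constants below use it.
    private static final Map<String, ActionKey> KNOWN = new ConcurrentHashMap<>();

    // Keys the library knows about at release time.
    static final ActionKey SMS = of("sms");
    static final ActionKey ALARM = of("alarm");

    private final String name;

    private ActionKey(String name) {
        this.name = name;
    }

    // Returns the existing key for this name, or creates and caches one on first use.
    static ActionKey of(String name) {
        return KNOWN.computeIfAbsent(name, ActionKey::new);
    }

    String name() {
        return name;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

Because every name maps to exactly one instance, `ActionKey.of("sms")` returns the same object as `ActionKey.SMS`, while `ActionKey.of("sports")` yields a new key that can be used in a handler map even though the library never declared it.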
In the next post, I will explore an alternate approach using Dagger2 for dependency injection.