Team Comment on the XHTML+Voice Submission

W3C is pleased to receive the XHTML+Voice submission from IBM, Motorola and Opera Software. This submission describes a means to modularize VoiceXML 2.0 and outlines how these modules can be combined with XHTML for multimodal interaction, in which users interact with web pages through a combination of visual and aural modes. The submission includes voice modules that support speech synthesis, speech dialogs, command and control, and speech grammars, together with the ability to attach voice handlers that respond to specific DOM events, thereby re-using the event model familiar to web developers. Voice interaction features are integrated with XHTML and CSS, and can consequently be used directly within XHTML content.

The W3C Voice Browser Working Group is responsible for the development of VoiceXML and associated specifications for speech synthesis and speech recognition. The submission provides modular XML Schemas and DTDs, but assumes, incorrectly, that VoiceXML and these associated specifications all share the same namespace.

The IPR declarations provided with the submission reveal that both IBM and Motorola may own patents or patent applications that apply to the XHTML+Voice submission. Both companies state that they are prepared to offer a non-exclusive license under such patents on reasonable and non-discriminatory (RAND) terms.

Update: An updated version of XHTML+Voice (v 1.1) was contributed to the Voice Browser and Multimodal Interaction Working Groups on 11th March 2003. Both Motorola and IBM have revised their IPR disclosures, agreeing to provide a non-exclusive, royalty-free license for any related patent claims they may have.

Next Steps

The submission will be brought to the attention of the Voice Browser Working Group with a view to stimulating work on modularizing VoiceXML. The Working Group has already made a start on using XML Schema to formalize VoiceXML 2.0 and the associated specifications for speech synthesis and speech grammars, assigning separate XML namespaces to each of these. This work should be done in consultation with the HTML Working Group to take advantage of their experience with the modularization of XHTML in XML Schema.

W3C is proposing to set up a new activity on multimodal interaction. Should this activity be started, the submission will be brought to the attention of the Multimodal Interaction Working Group. It is unfortunate that the submission is encumbered with RAND terms, as this may preclude use of the technology as the basis for a W3C Recommendation for multimodal interaction. For further information on W3C's ongoing work on patent policy, please refer to the Patent Policy Working Group.

Feedback on this technology is encouraged on the www-voice mailing list (public archive). To send mail to this list you must first subscribe by sending an email message to www-voice-request@w3.org with the word subscribe in the subject line (to leave the list, send a message with the word unsubscribe in the subject line instead).

Disclaimer: Placing a Submission on a Working Group/Interest Group agenda does not imply endorsement by either the W3C Staff or the participants of the Working Group/Interest Group, nor does it guarantee that the Working Group/Interest Group will agree to take any specific action on a Submission.


Dave Raggett, Team Contact for the Voice Browser Working Group <dsr@w3.org>