The W3C Workshop on Binary Interchange
of XML Information Item Sets

Call for Participation

24th, 25th and 26th September, 2003
Santa Clara, California, USA

Workshop Scope

Section 1.1 of the Extensible Markup Language (XML) specification gives as a design goal that terseness in XML markup is of minimal importance. The Standard Generalized Markup Language (SGML), of which XML is a profile, has a number of features intended to reduce typing when humans are entering markup directly, or to reduce file sizes, but these features were not included in XML.

The resulting XML specification gave us a highly regular language, but one that can use a considerable amount of bandwidth to transmit in any quantity. Furthermore, although parsing has been greatly simplified in terms of code complexity and run-time requirements, larger data streams necessarily entail greater I/O activity, and this can be significant in some applications.

There has been a steadily increasing demand for ways to transmit pre-parsed XML documents and Schema-defined objects, so that embedded, low-memory and/or low-bandwidth devices can make use of an interoperable, accessible, internationalized, standard representation for structured information without the overhead of parsing an XML text stream.

Multiple separate experimenters have reported significant savings in bandwidth, memory usage and CPU consumption using an ASN.1-based representation of XML documents. Others have claimed that gzip is adequate.
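As a rough illustration of the gzip claim, the following sketch compares the size of an XML text stream before and after generic compression. The sample document is a hypothetical one constructed for this example, so the numbers it prints say nothing about real workloads; it only shows how such a measurement might be made.

```python
import gzip

# Hypothetical sample: a small, highly repetitive XML document built for illustration.
record = '<item><name>widget</name><price currency="USD">9.99</price></item>'
document = ("<catalog>" + record * 200 + "</catalog>").encode("utf-8")

compressed = gzip.compress(document)

raw_size = len(document)
gzip_size = len(compressed)
print(f"raw XML: {raw_size} bytes")
print(f"gzipped: {gzip_size} bytes")
print(f"ratio:   {gzip_size / raw_size:.3f}")

# gzip is lossless: the receiver gets the original text back,
# but still has to run an XML parser over it afterwards.
restored = gzip.decompress(compressed)
```

Note that generic compression addresses bandwidth only. The decompressed stream is still XML text, so the parsing cost on the receiving device is unchanged.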

Advantages of a binary representation of a pre-parsed stream of Information Items (as defined by the XML Infoset) might include:

  1. It would not be restricted to a single schema or vocabulary, and hence could be interoperable between vocabularies;
  2. It would not be restricted to a single application or hardware device, and hence could be interoperable between implementations;
  3. Improved network efficiency and reduced storage needs: compression techniques that make use of domain-specific knowledge often do better than more generic compression;
  4. Sending pre-parsed data could reduce the complexity of applications, and may facilitate creation of simpler internal data structures.
  5. Web Services may need more efficiency, and a pre-parsed binary transmission format may help people to continue to work with Web Services rather than to explore proprietary interfaces.
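To make the idea of a pre-parsed stream concrete, here is a minimal sketch of a toy binary encoding of element and text events. Everything about it is an assumption made for illustration: the event codes, the name table and the byte layout are hypothetical, it is not ASN.1-based, and it ignores attributes, namespaces, comments and processing instructions. It is not a proposal for any actual format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical event codes for this toy format (not from any specification).
START, END, TEXT = 1, 2, 3


def encode(xml_text):
    """Parse XML text once and emit a toy binary stream of events."""
    names, index = [], {}
    events = bytearray()

    def emit_text(text):
        if text:
            data = text.encode("utf-8")  # toy limit: at most 65535 bytes per text run
            events.extend(struct.pack(">BH", TEXT, len(data)))
            events.extend(data)

    def walk(elem):
        if elem.tag not in index:        # store each element name only once
            index[elem.tag] = len(names)
            names.append(elem.tag)
        events.extend(struct.pack(">BH", START, index[elem.tag]))
        emit_text(elem.text)
        for child in elem:
            walk(child)
            emit_text(child.tail)
        events.append(END)

    walk(ET.fromstring(xml_text))

    # Header: number of names, then each name with a 2-byte length prefix.
    header = bytearray(struct.pack(">H", len(names)))
    for name in names:
        data = name.encode("utf-8")
        header.extend(struct.pack(">H", len(data)))
        header.extend(data)
    return bytes(header + events)


def decode(blob):
    """Read the toy stream back into a list of (event, value) tuples."""
    (count,) = struct.unpack_from(">H", blob, 0)
    pos, names = 2, []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        names.append(blob[pos:pos + size].decode("utf-8"))
        pos += size
    out = []
    while pos < len(blob):
        code = blob[pos]
        if code == START:
            (i,) = struct.unpack_from(">H", blob, pos + 1)
            out.append(("start", names[i]))
            pos += 3
        elif code == TEXT:
            (size,) = struct.unpack_from(">H", blob, pos + 1)
            pos += 3
            out.append(("text", blob[pos:pos + size].decode("utf-8")))
            pos += size
        elif code == END:
            out.append(("end", None))
            pos += 1
        else:
            raise ValueError(f"unknown event code {code}")
    return out


sample = "<order><item>pen</item><item>ink</item></order>"
blob = encode(sample)
events = decode(blob)
print(len(sample), len(blob))
print(events)
```

The receiver in this sketch never tokenizes angle brackets or resolves character escapes; it reads fixed-size event headers and length-prefixed strings. The encoding is also independent of any particular vocabulary, which is the property items 1 and 2 describe. The trade-off is that the byte stream is no longer readable as text.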

One potential and very serious disadvantage is that one might lose the View Source Principle which has helped the Web to spread.

The purpose of the Workshop, then, is to study methods to compress XML documents, comparing Infoset-level representations with other methods, in order to determine whether a W3C Working Group might be chartered to produce an interoperable specification for such a transmission format.

Expected Audience

We expect several groups to contribute to the workshop:

Although the Workshop is public, it is restricted to approximately 60 places, with at most two attendees per organization. In addition, people wishing to attend must submit a position paper, and will be informed by the Program Committee of the success of their application. The intent is to make sure that participants have an active interest in the area.

Per W3C Process, attendance is on a strict first-come first-served basis!

Deliverables

The Workshop will produce the following:

These will be published on the W3C Web site by the end of October 2003.

Registration

There will be a limit of 60 participants at the Workshop. To ensure maximum diversity amongst participants, only two participants may attend per organization.

Position papers are required in order to participate in this workshop. Each organization or individual wishing to participate must submit a position paper explaining their interest in the workshop no later than 11th August 2003.

There will be no participation fee.

To attend, you must register by filling out the registration form. The URI for the registration form will be sent to you after your position paper is accepted. Send papers (in XHTML/HTML, DocBook or PDF) directly to the conference chair, Liam Quin.

Position Papers

Organizations wishing to participate in the Workshop must submit a position paper. Position papers can be anywhere from one page to 20 pages or more, but must address at least the following questions:

Position papers are due no later than the 11th of August, 2003.

The Position Papers will be made public no later than one month after the Workshop.

Any future work in this area will be governed by the W3C Patent Policy, and will be on Royalty Free terms.

Workshop Organization

The final agenda will be announced in August. The outline will be as follows:

Wednesday

  1. Introductions
  2. Motivation (W3C)
  3. Selected Position Papers

Thursday

  1. Introduction
  2. Break-out Groups to discuss Requirements
  3. Summaries of Break-Out Work

Friday Morning

  1. Discussion of Further Action
  2. Closing Summary and Luncheon

The W3C Contact and Conference Chair is Liam Quin <liam@w3.org>.

Venue

The Workshop will be hosted in the Bay Area by Sun Microsystems, at the corner of Montague and Lafayette in Santa Clara, CA.

Airport:
  San Jose International Airport, California, USA, is the closest major airport.

Local Hotels:
  Biltmore Hotel: 2151 Laurelwood Road, Santa Clara, CA 95054; +1 408-988-8411
  Embassy Suites: 2885 Lakeside Drive, Santa Clara, CA 95054; +1 408-496-6400
  Four Points By Sheraton: 1250 Lakeside Drive, Sunnyvale, CA 94086; +1 408-738-4888
  The Westin: 5101 Great America Parkway, Santa Clara, CA 95054; +1 408-986-0700

Discussion

Participation by teleconference and by Internet Relay Chat (IRC) may be arranged in some cases. Subsequent discussion is expected to occur on a publicly-readable mailing list.

Resource Statement

This activity will consume 30% of the time of one W3C staff member for chairing the workshop, and 10% of the time of one W3C staff member for managing the workshop website. This workshop is part of the W3C XML Activity.

