Wednesday, April 14, 2010

FastInfoset (FI) - 'binary' webservice

The key idea is that the speed of marshalling and unmarshalling content can be improved by moving away from a textual markup language to a binary one. FI aims to provide more efficient serialization than the text-based XML format. In short, it adopts the concept of Binary XML.

This is done at the expense of human readability and accessibility. No longer can you look at messages through a logging proxy server. No longer can you write normative 'correct' requests and responses to XML files in your SCM repository, run Ant tasks over them and check that your XSD matches expectations.

As such, it is firmly in the camp of 'XML is a detail end users don't need to care about', which is not far behind 'WSDL and XSD are things the toolkit handles for you'. It is also not XML, at least not in the classic sense.

One can think of FI as gzip for XML, though FI aims to optimize both document size and processing performance, whereas gzip optimizes only the size. While the original formatting is lost, no information is lost in the conversion from XML to FI and back to XML.
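To make the gzip half of that comparison concrete, here is a minimal sketch that uses only the JDK. The sample document and the class name GzipXmlSketch are made up for illustration. It shows that gzip shrinks the bytes, but what comes out of decompression is plain XML text again, so the receiver still pays for a full text parse. FI aims to avoid that step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipXmlSketch {

    // Compress the UTF-8 bytes of an XML document with gzip.
    static byte[] gzip(String xml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Decompress back to text. The result is ordinary XML again,
    // so a receiver still has to run a text parser over it.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Build a hypothetical, repetitive sample document.
    static String sampleXml(int items) {
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < items; i++) {
            sb.append("<order><id>").append(i)
              .append("</id><status>shipped</status></order>");
        }
        return sb.append("</orders>").toString();
    }

    public static void main(String[] args) throws IOException {
        String xml = sampleXml(1000);
        byte[] compressed = gzip(xml);
        System.out.println("text bytes: " + xml.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzip bytes: " + compressed.length);
        System.out.println("round trip equal: " + xml.equals(gunzip(compressed)));
    }
}
```

The exact sizes depend on the document, so none are quoted here. Repetitive markup like this sample compresses well, which is the one thing gzip does optimize.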

Between nodes that support it, FastInfoset may deliver tangible speedups through improvements in parse time. This may be at the expense of flexibility, but if the endpoint is written in a contract-last form, there is usually little flexibility in there anyway. Few Java methods are set up to deal with arbitrary amounts of incoming XML content, especially in unknown schemas, so they will not be any less flexible after adopting FastInfoset.

Concerns & Facts:
There are no intellectual property restrictions on its implementation and use.

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1 formalisms, ASN.1 tools are not required by implementations.

The other issue with FastInfoset is security. Whoever implements the parser had better design it to resist malicious content.
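One example of what "resist malicious content" can mean for a binary format: lengths are declared up front, so a parser should check a declared length before trusting it. The sketch below is only an illustration of that one measure. The framing (a 4-byte length followed by UTF-8 bytes), the class name and the 64 KB cap are all invented here and are not the FI encoding.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class BoundedReadSketch {

    // Hypothetical cap on one string field; a real parser would make it configurable.
    static final int MAX_FIELD_BYTES = 64 * 1024;

    // Read a 4-byte big-endian length followed by that many UTF-8 bytes.
    // NOTE: this framing is invented for illustration; it is not the FI wire format.
    static String readBoundedString(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_FIELD_BYTES) {
            // Reject before allocating, so a forged length cannot exhaust memory.
            throw new IOException("declared length out of bounds: " + length);
        }
        byte[] data = new byte[length];
        in.readFully(data); // throws EOFException if the stream ends early
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

A well-formed field is returned as a string, while a message that claims a field of two gigabytes is rejected with an IOException instead of triggering a huge allocation.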

The genuine sacrifices would be human readability and XML Schema validation.

Reference Implementation

A Java implementation of the FI specification [http://fi.dev.java.net/] is available as part of the GlassFish project. The library is open source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation, including the reference implementations of JAX-RPC and JAX-WS.

About Performance
In addition to a significant reduction in document size compared with standard XML 1.0, Fast Infoset offers much better SAX-type parsing performance. Typical increases in parsing speed observed for the reference Java implementation are a factor of 10 compared to Java Xerces, and a factor of 4 compared to the Piccolo driver (one of the fastest Java-based XML parsers).

Typical Applications

* Portable Devices - Mobile devices typically have access only to low-bandwidth data connections and have slower CPUs. This can make Fast Infoset a better choice, as it lowers both data transmission and data processing times.

* Persisting Large Volumes of Data - When persisting XML either to file or a database, the volume of data your system produces can often get out of hand. This has a number of detrimental effects: access times go up as you're reading more data, CPU load goes up as XML data takes more effort to process, and your storage costs go up. By persisting your XML data in Fast Infoset format, it is possible to reduce the data volume by up to 80 percent.

* Passing XML via the internet - As soon as an application starts passing information over the internet, one of the main bottlenecks is bandwidth. If you send reasonably large chunks of data, this bottleneck can seriously degrade the performance of your client applications and limit your server's ability to process requests. Reducing the amount of data moving across the internet reduces the time it takes a message to be sent or received, while increasing the number of transactions a server can process per hour.
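As a rough back-of-the-envelope illustration of the bandwidth argument, the sketch below computes transfer time before and after a size reduction. The message size and link speed are hypothetical, the 80 percent figure is the best case quoted above rather than a typical result, and latency and protocol overhead are ignored.

```java
class TransferTimeSketch {

    // Seconds needed to move `bytes` over a link of `bitsPerSecond`,
    // ignoring latency and protocol overhead.
    static double transferSeconds(long bytes, long bitsPerSecond) {
        return (bytes * 8.0) / bitsPerSecond;
    }

    // Size after shrinking a document by the given fraction
    // (0.8 means 80 percent smaller).
    static long reducedSize(long bytes, double reduction) {
        return Math.round(bytes * (1.0 - reduction));
    }

    public static void main(String[] args) {
        long xmlBytes = 1_000_000; // hypothetical 1 MB XML message
        long linkBps = 1_000_000;  // hypothetical 1 Mbit/s connection
        long fiBytes = reducedSize(xmlBytes, 0.8); // best case from the text

        System.out.println("XML seconds: " + transferSeconds(xmlBytes, linkBps));
        System.out.println("FI seconds:  " + transferSeconds(fiBytes, linkBps));
    }
}
```

With these assumed numbers the message takes about 8 seconds as text and under 2 seconds at the best-case reduction, so the same link could carry several times as many messages per hour.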

Supporting Java WS Stacks:
Metro, Axis2 & CXF
