Google today released Protocol Buffers. Protocol Buffers is their “language-neutral, platform-neutral, extensible mechanism for serializing structured data”. In general it’s pretty interesting stuff, and looking over the docs, it seems pretty well thought out.
I agree XML is bulky and wasteful for the task. There’s a reason why many web developers prefer JSON to actual XML when using XMLHttpRequest: XML parsing can be a real performance killer. JSON in my mind is currently the winner in this department since it’s lightweight, simple, and can be interpreted by pretty much any language on the planet (you may need to install a module, gem, or extension, or include a class). The downside to JSON is that it doesn’t really allow you to define structure. JSON is also still not a binary format, so you pay a performance penalty to parse the string. The upside is that JSON is rather easy for humans to read (great for debugging). The NY Times even made a database abstraction layer called DBSlayer that interfaces using JSON.
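To make the text-versus-binary tradeoff a bit more concrete, here is a rough Python sketch. The record, its field names, and the byte layout are all made up for illustration, and the binary side uses Python's `struct` module with a hand-picked fixed layout. It is not the Protocol Buffers wire format, just a stand-in for "some compact binary encoding with an agreed structure".

```python
import json
import struct

# Hypothetical record, used only for illustration.
record = {"id": 150, "name": "test", "active": True}

# JSON: readable text, but nothing in the format itself enforces
# which fields exist or what types they hold.
as_json = json.dumps(record, separators=(",", ":"))
decoded = json.loads(as_json)

# An assumed fixed binary layout (NOT the Protocol Buffers wire format):
# little-endian unsigned 32-bit id, 1-byte active flag, 1-byte name length,
# followed by the UTF-8 bytes of the name.
HEADER = "<IBB"
name_bytes = record["name"].encode("utf-8")
as_binary = struct.pack(
    HEADER, record["id"], int(record["active"]), len(name_bytes)
) + name_bytes


def unpack_record(data):
    """Decode the assumed layout above back into a dict."""
    rec_id, active, name_len = struct.unpack_from(HEADER, data, 0)
    start = struct.calcsize(HEADER)
    name = data[start:start + name_len].decode("utf-8")
    return {"id": rec_id, "name": name, "active": bool(active)}


print(as_json)
print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes packed")
```

The packed form is smaller and needs no string parsing, but it is unreadable without knowing the layout, which is exactly the debugging convenience JSON gives you for free.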
Serialized PHP has become somewhat popular (Yahoo Developer Network APIs support it). It’s language specific, though interpreters that can read and write it exist for other languages, including Perl, Python and Java. It’s also somewhat complicated for what it provides. At a glance it’s a string of garbage until you break it down.
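For anyone who hasn't looked at serialized PHP, the following Python sketch produces a small subset of the format so you can see what "a string of garbage" means in practice. The function name `php_serialize` here is my own helper, not a library call, and it only handles booleans, integers, strings, and dicts (as PHP associative arrays). Real PHP also covers floats, nulls, objects, and converts numeric string keys to integers, none of which is handled here.

```python
def php_serialize(value):
    """Encode a small subset of Python values in PHP's serialize() text format."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        # PHP records the length in bytes, not in characters.
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, dict):
        # a:<count>:{<key><value><key><value>...}
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return "a:%d:{%s}" % (len(value), body)
    raise TypeError("unsupported type in this sketch: %r" % type(value))


print(php_serialize({"id": 150, "name": "test"}))
# prints: a:2:{s:2:"id";i:150;s:4:"name";s:4:"test";}
```

Every value carries a type tag and, for strings and arrays, an explicit length, which is what makes the output hard to read by eye.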
It looks like Google already has support for Java, Python, and C++. It’s only a matter of time before Perl, PHP, and Ruby get support for Protocol Buffers as well.
I could see Protocol Buffers being pretty useful in combination with Memcached.
It’s great to see Google open sourcing stuff like this.
PHP support is crucial if the PB format is to see broad adoption and eventually become a (de facto) standard.