Thursday, December 29, 2005

Developer Perspective: The painful evolution of SIP

Several years ago, I was the lead architect for a team that was building an H.323 Gatekeeper. I remember attending a conference where a mild-mannered, bearded professor talked excitedly about something called ‘ZIP’, which seemed to be an IETF initiative pitched as an alternative to the complexities of H.323. When I got back to my office, I researched this new thing called ‘ZIP’ and the promise it held, but found nothing. I then realized that ‘ZIP’ was how a German speaker pronounces the English ‘SIP’. I still remember reading the first IETF draft on SIP – it was a breath of fresh air compared to what I was building, as far as protocols went. Actually, even though the first draft was out in 1996, I got involved only in 1998, when the SIP draft was still a breezy 101 pages cover to cover. In fact, I was so convinced about SIP that I started a SIP group with no budget, working on it in spare time. That lasted until we made our first release and sold it to a whole bunch of customers. That's when the company woke up to it and suddenly everyone was interested.

Well, that was then. This is now. After having spent several years building SIP components, proxies and app server frameworks, and then talking to customers who built on top of them (my past company is one of the most successful SIP infrastructure vendors in the market), I cannot help but wonder whether SIP only looked ‘simple’ to start with because the standards never really encompassed everything that a real-life telecom deployment needed (which, to a great degree, H.323 did). Over the years, SIP has grown into a veritable behemoth (269 pages in the SIP standard, 25 pages for the SDP offer/answer model and over 150 drafts related to all things SIP), and I bet no one really thinks of it as simple anymore.

In fact, I think SIP has pretty much become a victim of its own market hype, as well as of the lack of completeness of thought in the original ideas espoused. As far as I am concerned, SIP today stands in the market as an ‘HTTP similar expandable protocol’, but to the developer it stands as a ‘Massive hack of spaghetti headers and rules’.

Of course, this does not really affect the top-level app developers. The folks making middleware are the ones who deal with this mess. So if you are building a great new service on SIP, you probably have no idea what I am talking about. Heck, you probably don’t even need to know much about SIP. You probably just invoke APIs like CreateConference(,, and ta-dah! You are done! I’m not trivializing what you are doing – moving to a protocol-independent, service-based architecture is what it should eventually come down to. But that still does not take away the fact that the folks building application and stack platforms (BEA, Oracle, Radvision, Flextronics Software Systems, Ubiquity) are still cussing out loud. And their cussing matters, because if your infrastructure ain’t up to it, your pretty apps will eventually come crashing down.

So here are my thoughts on what’s wrong in SIP:

Text based protocol: Yes, a text based protocol is great to read and see as it flies over the wire. But who on earth needs to? Do you really want me to believe that a person debugging a protocol cannot have a simple binary decoder built into his analyzer? While being text based and touting that it’s ‘similar to HTTP’ excites the market, it is meaningless to a developer. Thanks to it being a text protocol with no message boundaries, we worry about buffer underflows and overflows, about the Content-Length header not having arrived yet in the chunk of bytes we just received, and similar problems. Not to mention that we need expensive parsers that try to detect all the gotchas, to ensure that some sleazy string trick does not crash the parser. (Show me a simple SIP parser and I will show you 10 ways to break it.) And then you need stuff like SigComp to compress it so the wireless world can use it on the air interface.
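To make the framing worry concrete, here is a minimal sketch of what a stack has to do just to find where one SIP message ends on a stream transport such as TCP. It is an illustration under stated assumptions, not a real parser: `extract_messages`, `FramingError` and the size limits are hypothetical names and values chosen for this example, and header line folding and other legal syntax variations are deliberately ignored.

```python
from typing import List, Tuple

HEADER_END = b"\r\n\r\n"  # blank line separating headers from body


class FramingError(ValueError):
    """Raised when a message cannot be framed safely (hypothetical name)."""


def extract_messages(buffer: bytes, max_header: int = 8192,
                     max_body: int = 65536) -> Tuple[List[bytes], bytes]:
    """Split complete messages off the front of `buffer`.

    Returns (complete_messages, leftover_bytes). The leftover must be kept
    and prepended to the next chunk read from the socket.
    """
    messages = []
    while True:
        end = buffer.find(HEADER_END)
        if end == -1:
            # Headers not complete yet; refuse to buffer without limit.
            if len(buffer) > max_header:
                raise FramingError("header section too large")
            return messages, buffer
        if end > max_header:
            raise FramingError("header section too large")

        length = None
        # Skip the start line; look for Content-Length or its compact form "l".
        for line in buffer[:end].split(b"\r\n")[1:]:
            name, sep, value = line.partition(b":")
            if sep and name.strip().lower() in (b"content-length", b"l"):
                value = value.strip()
                if not value.isdigit() or len(value) > 9:
                    raise FramingError("bad Content-Length")
                length = int(value)
                break
        if length is None:
            raise FramingError("no Content-Length on a stream transport")
        if length > max_body:
            raise FramingError("body too large")

        total = end + len(HEADER_END) + length
        if len(buffer) < total:
            return messages, buffer  # body has not fully arrived yet
        messages.append(buffer[:total])
        buffer = buffer[total:]
```

Even this toy version needs three separate size and sanity checks before it can trust a single header, which is the point of the complaint above: a length-prefixed binary framing would reduce all of this to reading a fixed-size field.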

The refusal of the IETF to standardize services: ‘We would like to encourage multiple ways to do one thing. It encourages development.’ Yes, I agree, when it comes to geeks in labs. In real deployments, we want two phones to be able to transfer a call successfully. Little wonder that it took until 2003 (SIPit) for two different SIP phones to actually do a successful blind transfer. Had there been a recommended profile to begin with, we would not have had unsuccessful half-calls for the next 2 years, and by now we would have moved on to better interop in more advanced scenarios such as conferencing (which is still progressing at a dreadfully slow rate over at XCON – mostly still stuck at the requirements and framework level).

The Royal Routing Mess: SIP’s routing logic is probably the best example of how it has evolved to become a hack protocol. Strict routing evolved into loose routing. To make sure implementations don’t break, it was decided to support both. Then came the logic for loop detection, which got so painful that it was then suggested to do away with it and just fall back to Max-Forwards (keep looping till a count goes to 0, then break the loop). As if that was not enough, the recently introduced concept of the Globally Routable UA URI (GRUU), essentially another hacked mechanism to differentiate between a specific instance of a UA and the general AOR, which could reach any instance of it (complicated? don’t worry, most of us find it messy too), makes this space even more miserable.
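A small sketch of two of the checks a proxy ends up carrying because of this history. The function names are hypothetical; the behaviour follows my reading of RFC 3261 (a missing Max-Forwards is added with the recommended value 70, a value of 0 is answered with 483 Too Many Hops, and a loose router is recognised by the `lr` parameter on its Route URI). The URI handling is deliberately naive and assumes no `;` in the user part.

```python
from typing import Optional, Tuple

DEFAULT_MAX_FORWARDS = 70   # value RFC 3261 recommends when the header is absent
TOO_MANY_HOPS = 483         # status code returned when the hop budget is spent


def next_max_forwards(received: Optional[int]) -> Tuple[bool, int]:
    """Decide whether a proxy may forward a request.

    Returns (True, value_to_send) if the request may be forwarded,
    or (False, status_code) if it must be rejected.
    """
    if received is None:
        return True, DEFAULT_MAX_FORWARDS
    if received <= 0:
        return False, TOO_MANY_HOPS
    return True, received - 1


def is_loose_router(route_uri: str) -> bool:
    """Return True if the Route URI carries the 'lr' parameter."""
    uri = route_uri.strip().strip("<>")
    uri = uri.split("?", 1)[0]       # drop any URI headers
    params = uri.split(";")[1:]      # everything after the first ';'
    return any(p.split("=", 1)[0].strip().lower() == "lr" for p in params)


print(next_max_forwards(None))                      # header absent
print(next_max_forwards(0))                         # hop budget exhausted
print(is_loose_router("<sip:p1.example.com;lr>"))   # loose router
print(is_loose_router("<sip:p2.example.com>"))      # strict router
```

Note that Max-Forwards does not detect a loop; it only bounds how long one can run, and the strict/loose check has to stay in every proxy for as long as both styles are allowed on the wire.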

Bloat Bloat Bloat: SIP is a bloated protocol. As I said before, show me a minimal SIP parser and I will show you 10 ways to break it. As the market rides the hype of SIP, our poor friends at the cell phone companies are struggling to fit good, robust ‘text-based-HTTP-similar’ SIP protocol stacks into their phones while making sure a simple message does not reboot the phone. On top of that, the complex 300-page RFC with its processing rules just adds fuel to the fire.

‘This is not what SIP is meant for’: Unless you are blind to the market, you would notice that 90% of the SIP deployments that make money are in networks that choose to do in SIP what the PSTN did. Continuous harping on ‘This is not what SIP is meant for’ is meaningless. If it is true, then close the case for SIP. Don’t complain about mapping the PSTN to SIP when almost all deployments making money want exactly that. This very ‘holier than thou’ attitude is one strong reason why SIP deployment is only now nearing the mature stage it should have reached several years ago. (Do you know that doing CLIP/CLIR in SIP was fairly undefined till a year ago?) Every time folks from groups such as TISPAN post in the SIP working groups, they are met by technology zealots who repeat ‘This is not what SIP is meant for’ to the point of fatigue.

Forking: This one construct is the Achilles heel that adds unnecessary complications to message handling. Who really needs it, and why does it have to be an integral part of the base specification? If it is part of the base spec, everyone has to implement it, whether they need it or not. Move it to application scope and don’t force 99% of the implementation world to worry about providing for situations such as HERFP (the Heterogeneous Error Response Forking Problem).
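For readers who have not run into HERFP, a rough sketch of where it comes from. A forking proxy collects the final responses from all branches and may pass only one non-2xx response upstream. The hypothetical `pick_final_response` below implements one choice I believe RFC 3261 permits (prefer any 6xx, otherwise pick from the lowest response class) and leaves out the RFC’s finer preferences within the 4xx class.

```python
from typing import Iterable


def pick_final_response(branch_codes: Iterable[int]) -> int:
    """Pick the single non-2xx final response a forking proxy sends upstream.

    `branch_codes` holds the final status code received on each branch.
    Simplified: any 6xx wins; otherwise the numerically lowest code, which
    necessarily belongs to the lowest response class present.
    """
    finals = [code for code in branch_codes if code >= 300]
    if not finals:
        raise ValueError("no non-2xx final responses to choose from")
    global_failures = [code for code in finals if code >= 600]
    if global_failures:
        return min(global_failures)
    return min(finals)


# One branch could have been repaired by the caller (415 Unsupported Media
# Type), the other was simply busy (486). Only one answer survives.
print(pick_final_response([486, 415]))
```

Whatever rule the proxy applies, the caller sees one error and never learns about the others, so it cannot fix the problem on the branch whose response was discarded. Every implementation of the base spec has to carry this aggregation logic, whether or not its deployment ever forks.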

‘End-End Architecture’: The End-End Architecture and its concepts are great. I understand that this is what drives the IETF. But realize that SIP is being used in many networks that do not strictly follow the purist Internet architecture. So instead of shunning. I’m not knocking the End-End Architecture – it’s great, but even the original authors of such architectural ideas talk about situations where the purity of the end-end architecture is not realistic.

Losing touch with reality while dreaming about ‘Web’, ‘Presence’ and ‘Revolution’: SIP enables a lot of new services – but losing sight of reality while dreaming of next-wave solutions such as servlets and SIP-CGI is, again, not being realistic (for example, the long argument at SIPPING about draft-stein-great, which raises real problems that are seemingly not important enough for SIP). Again, please look around you at the deployed networks and see what they are doing. And while you are looking, look at IMS too. Hopefully you will wake up then. It’s all traditional networking folks. As an example, it’s unbelievable that the Working Group has such a hard time accepting the fact that Session Border Controllers (SBCs) are a reality. In fact, when Skype first announced that SIP was not ‘good enough’ (this is one of the later links – I lost the original post by the Skype architect about SIP not being good enough), the SIP stalwarts responded by blasting the Skype architecture and explaining why it was not ‘scalable’ (again, lost that link). Well, today Skype has done in one year what SIP could not do in five. The SIP folks finally woke up to reality and are now trying to define P2P overlays on top of SIP. I applaud the effort, but it’s a little too late.

Backward-compatibility burden: SIP is burdened by the load of backward compatibility more than it should be. While this makes good talking material for the press (‘We designed a protocol that is always backward compatible’), it’s a living nightmare for developers. If there was short-sightedness in a specification, admit it and move on, and if backward compatibility is expensive, drop it. Developers will fix their code. Really. It’s much better than having a brittle implementation with a long ‘case’ statement.

In Conclusion

The SIP that the marketing world sees is very different from the ‘under-the-hood’ look. I just hope that the process of SIP evolution does not come crashing down. In fact, the only way to stop that eventuality is to move to a services-based architecture where the protocol becomes irrelevant (once the bad work is done, let it hide deep inside, never to be seen again). So yay for SIP servlets! I just hope the working groups move fast enough and do not pursue their goal of complete end-end purity at every step, because if they did, by the time they had a perfect solution, teleportation would be a common means of travel, the single pill that cures all diseases known to mankind would have been discovered, and we would all have bought vacation homes on Mars – no one would really care much about SIP then.


  1. I think you forgot to mention a major SIP stack vendor - this is RADVISION

  2. I did mention Radvision - please check again :-)

  3. Two of the most important things that I personally dislike about SIP and I am glad you mentioned both. 1) Why the heck have a text based protocol? Text based protocol in an embedded solution? Boy. 2) No architectural outlines.