Gaaah. I happened to bump into Slashdot today and read this:
“…is that when you install the Skype client, it will drain system resources by running as a supernode from time to time”
with the author finally concluding that they will more likely use SIP over Skype.
The implication being that with SIP, you are free from such issues!
Let’s get the facts straight:
- P2P is an architecture, SIP is a protocol. Skype is a product, and Skype uses its own proprietary protocol (you can call it ‘Skype Protocol’ if you want).
- A SuperNode system is a fundamental design choice of many existing P2P networks, including Skype, Kazaa, Grokster and several other massively scaled networks.
- Today, most SIP deployments use a centralized architecture. In other words, all your SIP phones register with some central server and some central proxy, and your calls are routed through them. If they fail, you cannot reach other users, or you will have to attempt to call them directly. That is not as simple as it sounds, because the person who is sitting in your buddy list as email@example.com may actually be firstname.lastname@example.org, and this complex ID is mapped to its simpler one by the proxy/location server that went down.
- There is frantic work going on right now on the p2psip mailing list, which is attempting to answer the following questions:
- How does one map a SIP flow over a P2P network?
- Does it make sense to deploy a SIP overlay over a P2P network using the common architectural principles of existing P2P networks (like DHT, for instance), or,
- Does it make sense to deploy a P2P network over the SIP protocol?
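To make the centralized SIP point concrete, here is a minimal sketch of a central location service. Everything in it is hypothetical (the `LocationServer` class, the `place_call` helper and the addresses are mine, not taken from any SIP stack); it only shows why losing the server also loses the mapping from a friendly ID to the address where the user can actually be reached.

```python
class LocationServer:
    """Hypothetical central registrar/location service (illustration only)."""

    def __init__(self):
        self.bindings = {}  # friendly ID -> current contact address
        self.up = True      # flip to False to simulate an outage

    def register(self, public_id, contact):
        # A phone tells the server where it can currently be reached.
        self.bindings[public_id] = contact

    def resolve(self, public_id):
        if not self.up:
            raise ConnectionError("location server unreachable")
        return self.bindings.get(public_id)


def place_call(server, callee_id):
    """Return the contact address to call, or None if it cannot be resolved."""
    try:
        return server.resolve(callee_id)
    except ConnectionError:
        # Without the server, the caller only knows the friendly ID.
        return None


server = LocationServer()
server.register("sip:joe@example.com", "sip:joe@192.0.2.10:5060")
reachable = place_call(server, "sip:joe@example.com")

server.up = False
after_outage = place_call(server, "sip:joe@example.com")
```

While the server is up, the friendly ID resolves to a contact; once it goes down, the caller is left holding an ID it cannot route.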
Supernodes are often described as a necessary evil for a large-scale P2P network. Let us first spend a bit of time understanding how P2P networks differ from centralized networks.
The biggest difference is that in a pure P2P network, there is no well-known, centralized node that is almost always available. P2P networks are plagued by problems of churn (a node may be in the network at one moment and disappear the next because the user logs out) and of location and routing (how do you locate a user, Joe, if you only know his name, but not how to get to him?).
To address such elementary issues, which do not exist in centralized networks, several implementations use very effective algorithms, such as DHTs (Distributed Hash Tables). A DHT establishes a mapping between a uniquely encoded key and the content to be retrieved, in such a way that the key can be used as the primary identifier to locate the data. A client trying to locate that data generates the key, and the lookup for this key traverses the P2P network, using a defined protocol, until the node(s) that store the data related to the key respond. Of course, this is an oversimplification. There have been several improvements to optimize DHT routing, including alternative architectural suggestions for routing and location in P2P networks.
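The key-to-node idea can be sketched in a few lines. This is a toy, assuming SHA-1 as the key encoding and a ring where each key belongs to the first node whose ID is greater than or equal to the key's ID; the `ToyDHT` class and the node names are made up for illustration. Note that it cheats by keeping a global view of all nodes, whereas in a real DHT each node knows only a few neighbours and the lookup hops between them.

```python
import hashlib
from bisect import bisect_left


def key_id(text):
    """Encode a name as a large integer using SHA-1 (an assumed choice)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class ToyDHT:
    """Toy hash ring with a global view of the nodes (illustration only)."""

    def __init__(self, node_names):
        # Place every node on the ring at the position given by its hashed name.
        self.ring = sorted((key_id(name), name) for name in node_names)
        self.store = {name: {} for name in node_names}

    def responsible_node(self, key):
        # The first node at or after the key's position owns the key;
        # the modulo wraps around to the start of the ring.
        ids = [node_id for node_id, _ in self.ring]
        index = bisect_left(ids, key_id(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key, value):
        self.store[self.responsible_node(key)][key] = value

    def get(self, key):
        return self.store[self.responsible_node(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("joe", "198.51.100.7:5060")
joe_contact = dht.get("joe")
```

Anyone who knows the name "joe" can compute the same key and so arrive at the same node, without asking a central directory.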
Now enter supernodes. Why do we need them?
Well, let’s put it this way. Networks are not made the same around the world. At any point in time, there will be users on high-speed cable, medium-speed DSL, or low-speed dial-up. And they all want to communicate with, and locate, each other effectively. Supernodes are intermediate nodes, between the source and the destination, which can provide additional services to other clients. Here are some things a supernode could do:
- If User A wants to reach User B, and a SuperNode (SN) knows a shorter way to reach B, it may act as a router and route User A directly to B, sparing the message from hopping across multiple networks.
- If User A is behind a firewall and wants to talk to User B, but needs a Media Relay server outside its firewall to route media through, the SN, if it has enough spare CPU cycles, may agree to serve as A’s external relay server for the session. If the SN’s CPU gets busy during the session (say the owner of the SN decides to make a call), it can drop the role of relay server, and A will look for another.
- If User A tries to reach User B and finds B offline, the SN may agree to act as a ‘voice mail’ service for A: it receives A’s voice mail, sends it off to a central voice mail server, and deletes its own copy. Of course, the voice mail is typically encrypted with a key that the SN does not know, so it is, for the most part, storing a bunch of encrypted bits for A.
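The relay case is easy to sketch. The code below is a hypothetical model, not how Skype or any real client does it: the `SuperNode` class, the `find_relay` helper and the 0.5 CPU limit are all assumptions chosen for illustration. It shows a supernode agreeing to relay only while it has spare CPU, dropping its sessions when it gets busy, and the client going off to find another relay.

```python
class SuperNode:
    """Hypothetical supernode that relays media only while CPU load is low."""

    def __init__(self, name, cpu_load, relay_cpu_limit=0.5):
        self.name = name
        self.cpu_load = cpu_load                # fraction of CPU in use, 0.0 to 1.0
        self.relay_cpu_limit = relay_cpu_limit  # assumed cut-off for relaying
        self.sessions = set()

    def offer_relay(self, session_id):
        # Refuse new relay work when the machine is already busy.
        if self.cpu_load >= self.relay_cpu_limit:
            return False
        self.sessions.add(session_id)
        return True

    def update_load(self, cpu_load):
        """Record a new CPU load; return the sessions dropped because of it."""
        self.cpu_load = cpu_load
        if cpu_load < self.relay_cpu_limit:
            return []
        dropped = sorted(self.sessions)
        self.sessions.clear()
        return dropped


def find_relay(supernodes, session_id):
    """Ask each known supernode in turn; return the first that agrees, or None."""
    for supernode in supernodes:
        if supernode.offer_relay(session_id):
            return supernode
    return None


sn1 = SuperNode("sn1", cpu_load=0.2)
sn2 = SuperNode("sn2", cpu_load=0.3)
first_relay = find_relay([sn1, sn2], "call-1")

# The owner of sn1 starts a call of their own, so sn1 gives up relaying.
dropped = sn1.update_load(0.9)
second_relay = find_relay([sn1, sn2], "call-1")
```

The client sees a brief interruption while it switches relays, which is the price of borrowing someone else's machine.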
But that’s the challenge of a P2P architecture. People choose a P2P network because it is scalable and fault tolerant. In principle, there is no centralization: each node can take over any function required to keep the network running, and a discovery protocol is defined to find such ad-hoc nodes. However, in gaining distribution and fault tolerance, P2P networks need to deal with efficiency (how fast is the network?) and the effects of churn (if there is no accountability for nodes, and you cannot make guarantees about their availability in the network, how do you provide services like, say, voice mail? Who accepts the voice mail?).
If we don’t have supernodes, the network will need to re-discover itself each time, and will not be as efficient as users want it to be. If we don’t have supernodes, navigating networks and solving their challenges becomes harder (the firewall was one example above). If we don’t have supernodes, that is, if a client refuses to behave as anything more than a client, how do we provide services which, for example, need to kick in when some client is not online?
The problem with SuperNodes is not in architecture, it’s in…
…Implementation! The problem with supernodes is that some networks do not let you specify when you are willing to be a supernode. And not without reason. If every participant decides to switch off supernode functionality, network performance degrades substantially. On the other hand, if the network decides for you what the SuperNode rate threshold is, then you have no control over your computer’s resources. You need to trust the network.
An implementation can take this to extremes: you may find your CPU choked at times, which is typically the result of badly implemented thresholds.
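One possible middle ground, sketched here purely as an assumption of mine and not as something any existing network does, is to let the network request a CPU budget while the user sets a hard cap on it. The `SupernodePolicy` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SupernodePolicy:
    """Hypothetical policy: the network asks, the user's cap has the last word."""

    user_opted_in: bool = True
    user_cpu_cap: float = 0.25  # largest CPU share the user will donate

    def effective_budget(self, network_requested):
        # Opting out means donating nothing, whatever the network requests.
        if not self.user_opted_in:
            return 0.0
        return min(network_requested, self.user_cpu_cap)

    def accept_work(self, network_requested, current_share, extra_cost):
        # Take on more supernode work only if it still fits inside the budget.
        budget = self.effective_budget(network_requested)
        return current_share + extra_cost <= budget


policy = SupernodePolicy(user_cpu_cap=0.25)
fits = policy.accept_work(network_requested=0.8, current_share=0.10, extra_cost=0.05)
too_much = policy.accept_work(network_requested=0.8, current_share=0.20, extra_cost=0.10)
```

With a cap like this, a greedy network request of 80% of the CPU is simply clipped to what the user agreed to. The trade-off from above still applies: if everyone sets the cap to zero, the network suffers.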
Unfortunately, the industry is now calling SuperNodes malware! I’ve seen lawsuit applications, and I’ve seen marketing statements from new VoIP companies that say ‘We don’t do Supernodes’, as if they were evil.
You can’t have your cake and eat it too. Supernodes are an important piece of the puzzle of well-performing P2P networks. If you take them away, you may as well go back to the centralized model of operation, and keep ‘P2P’ in your marketing collateral to describe what happens when your clients call each other’s IP addresses directly in a peer-to-peer fashion. Hooray!