Wednesday, January 25, 2006

Lamppost Wi-Fi (They call it Mesh, Muni-Fi and other snazzy names)


Notice: The content of this article is based on publicly available knowledge. Wherever applicable, the author has provided links to articles that have published this information. Specifically, no confidential information that is unavailable on public forums is presented here.
I just got back from a week-long trip to the Bay Area (almost my second home). One of the nicest things about the Bay Area is the constant spirit of innovation and entrepreneurship that drives the people there. I remember meeting an old friend of mine three years ago, when he excitedly described an initiative he was working on to “unwire” large-scale (metro) areas by placing access points and implementing an efficient inter-access-point routing protocol to offer a paraNetwork. When I heard it then, I wished him the best of luck, but was fairly sure that the effort would be a disaster. After all, there is the Internet, there are cable companies, there are DSL providers. So why do we need this technology?

Today, I had lunch with him again. And now he is excited, not because he thinks it is a great concept, but because his dream is being implemented. Google partnered with them recently to unwire Mountain View (ref: here). Earthlink signed up with them and Motorola to provide city-wide Wi-Fi access to 5 major cities (ref: here), including the now famous ‘Philadelphia – City of Brotherly Love Wi-Fi initiative’ (ref: here).

The company is Tropos Networks, based in Sunnyvale, CA.

This article, however, is more about the implications of this technology. Until a technology is proven, there are always cynics (including me). I have been involved in the VoIP industry long enough to have seen the same cynicism when VoIP first promised to be an alternative to the PSTN. Until Vonage came in and proved it, no one really bought it. Today, pretty much every company is in this game. Broadband IP (wired) has come of age and matured. Unwired broadband is going through the same ‘the proof of the pudding is in the eating’ cycle that its wired sibling faced a couple of years ago.

The right technology, The right time, The right cost.

Verizon today talks about delivering FiOS with whopping megabit speeds, and so does cable. Compared to that, the ‘proven’ Wi-Fi technologies go to 54 Mbps. (I am not yet talking about the 200+ Mbps promises of 802.11n and similar. The IEEE has only just approved 802.16e and a first draft proposal for 802.11n, and it will be years before they are established in the consumer space. OEM buildout is happening today, obviously.) However, infrastructure is all about ‘doing the job well at the right time’. Ethernet’s popularity was not about being the most efficient protocol. It clearly was not, when introduced. Alternative protocols did a better job of collision management and guaranteed delivery. However, Ethernet met the requirements and was cheap and simple. SMTP is another example.

Slapping a Wi-Fi access point inside your house is a very different beast from a multi-access-point technology that keeps you connected as you move from one coverage area to another or, say, drive at 75 mph on your local highway. The biggest challenge is to provide the user with a reliable high-bandwidth connection and a constant IP address while at the same time addressing true mobility. This is what companies such as Tropos provide. There are many more in the fray, including Cisco and Nortel (more on that later).

So anyway, back to the fundamental question: “Why do we need lamp-post networking when we have cable, DSL and similar? They offer higher bandwidth at home. On the other hand, carriers like Verizon are deploying HSDPA and similar technologies that will also offer high-speed data access with mobility.”

The simple answer is timing. Today, Tropos is successfully deployed at over 300 sites. Cisco has seen this success and has put its entire weight behind its own version of mesh networking. Unlike Nortel, which chose to deploy OSPF as the routing protocol between access points, Cisco has developed its own ‘secret sauce routing protocol’, just as Tropos did years ago. Today, Tropos is clearly 2 years ahead of the game. Whether Cisco or others can catch up is another question. The point for consumers is that competition is healthy. Bigwigs such as Cisco will channel millions of dollars into perfecting the technology in this space, while smaller fish like Tropos will continue to innovate to stay in the lead. Net-Net: the consumer profits irrespective of which company wins.

Alternative solutions such as HSDPA, or mobility solutions from Verizon’s FiOS and similar, are not there yet. It takes years to perfect a technology and ensure that power, signal loss, interference and security are addressed appropriately before it becomes an acceptable solution for the market.

While such technology is being perfected, companies like Cisco and Tropos are proving that mesh works in a municipal area with the well-proven 802.11b and 802.11g technologies. The core of this technology is efficient routing. As 802.16e and 802.11n work their way through the IEEE, plugging these higher-bandwidth chips into the access points becomes a much simpler problem to solve incrementally. Most users today really don’t need more than 802.11g (typically 20 Mbps in real life). With the convergence of rich media, bandwidth requirements are expected to increase to 100 Mbps, but we are not there yet. By the time we are, I bet the newer .11x standards will be hardened as well to serve that market.

The cost of this technology is just about right as well. As an example, to unwire Mountain View, Google’s install cost for 400 nodes is approximately $1,000,000, with a recurring cost of $17,000 per year (ref: here).

In other words, for just a million bucks, you can set up a complete access and distribution infrastructure to offer high-speed bandwidth with mobility to around 70,000 residents. If 60% of the population uses the service (and why not) at, say, $15 per month, with say 30% of that going as profit to the ISP (after taking into account maintenance, support and data center costs), do the math. Recovering the investment is a non-issue. This same model can be applied to any large community.
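For anyone who wants to actually do the math, here is a quick back-of-the-envelope sketch in Python using only the figures quoted above. The variable names are mine, and I am assuming the $17,000 recurring cost is already folded into the 30% margin rather than subtracting it separately.

```python
# Back-of-the-envelope payback estimate from the figures quoted in the post.
INSTALL_COST = 1_000_000     # one-time install cost for ~400 nodes (USD)
RESIDENTS = 70_000           # approximate population covered
ADOPTION_PERCENT = 60        # share of residents who subscribe
PRICE_PER_MONTH = 15         # USD per subscriber per month
PROFIT_PERCENT = 30          # share of revenue kept as profit by the ISP

# Integer arithmetic keeps the intermediate figures exact.
subscribers = RESIDENTS * ADOPTION_PERCENT // 100
monthly_revenue = subscribers * PRICE_PER_MONTH
monthly_profit = monthly_revenue * PROFIT_PERCENT // 100
payback_months = INSTALL_COST / monthly_profit

print(f"subscribers:     {subscribers:,}")
print(f"monthly revenue: ${monthly_revenue:,}")
print(f"monthly profit:  ${monthly_profit:,}")
print(f"payback period:  {payback_months:.1f} months")
```

Under these assumptions, that is 42,000 subscribers and $189,000 of profit a month, so the install cost is recovered in a little over five months.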


Single vs. Dual radio

While Tropos uses a single radio channel in its access points (2.4 GHz), companies such as Nortel and Cisco chose to use dual radio channels in theirs (2.4 GHz and 5 GHz). This is all about claims of bandwidth. Cisco and Nortel claim that a dual-radio design gives them more spectrum, and hence more bandwidth, to offer their users. Folks like Tropos claim that dual radio is more hype than reality and that the cost of dual-radio interference negates its supposed benefit. This difference of opinion is documented all around the web, including here.

It was about time someone ran a practical test instead of relying on theoretical formulas. Rongdi Chan, working at the Network Research Center at Tsinghua University, recently published his findings on the throughput of the dual-radio Nortel solution vs. that of the single-radio Tropos solution. That report can be found here.

In summary, what he found was that as the number of hops n increases (access point to access point transfers), the Tropos throughput fell off as 1/n while the Nortel throughput decreased as (1/2)^n. Quite contrary to the theoretical assumption of a double advantage from double radio, eh? The reason? An efficient routing protocol affects throughput in a more significant way than adding another radio channel. I expect the Cisco performance to be better, though, assuming that their routing protocol is more efficient.
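To get a feel for how quickly those two curves diverge, here is a small Python sketch that tabulates them. It simply evaluates the two formulas quoted above, expressed as a fraction of some reference throughput; the function names and the normalization are my own assumptions, not something taken from the report.

```python
def inverse_falloff(hops: int) -> float:
    """Share of the reference throughput left after `hops` hops under the 1/n model."""
    return 1 / hops


def halving_falloff(hops: int) -> float:
    """Share of the reference throughput left after `hops` hops under the (1/2)^n model."""
    return 0.5 ** hops


# Tabulate both models side by side for the first few hop counts.
print("hops   1/n      (1/2)^n")
for n in range(1, 7):
    print(f"{n:>4}   {inverse_falloff(n):.4f}   {halving_falloff(n):.4f}")
```

The ratio between the two is 2^n / n, so the gap widens with every extra hop: at 4 hops the 1/n curve retains four times as much as the (1/2)^n curve.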


Net-Net (ElusiveCheese hates the term 'Net-Net' - he thinks it makes me a marketing dweeb)

To the consumer, such innovative technologies drive down cost and deliver incremental benefits quickly. The majority of today's application needs are met by 802.11g and even 802.11b throughput rates. Leveraging this technology and providing true mobility is a far better approach than waiting to offer a similar service only when there is a possibility of providing 500 Mbps. This way, infrastructure gets validated incrementally, people get used to this new freedom, and application providers can start offering innovative solutions over such a paranet right away and take quick baby steps to perfect the ecosystem, as opposed to waiting for a revolutionary change 3 years down the line only to see it come crashing down a year later because it is just not robust enough. There is a reason why any successful product rolls out as pre-alpha, beta, 0.1 and then 1.0, people! Starting with 2.0 right away is, well, a recipe for disaster.

Tuesday, January 17, 2006

The Truth Hits Home




We all get so caught up in the mega-trends of technology that we sometimes miss the simple events that remind us just how transforming IP is.

I had one such experience I want to share.

I came to the US in 1990 from India to go to grad school. In those days, AT&T used to charge me $2.65/min to call home. Communication was typically facilitated by having a network of expats who would carry home letters, gifts, and photographs! The Internet happened, at&t happened, and now I have literally free communication services to India. Great!

However, my parents' lives have been pretty much the same for the past decade. They live in a humble home in a decent neighborhood and spend their retirement years with friends and family on a very modest pension.

And then my Dad, who is 68, learned about the Internet. He bought a computer, got broadband access, taught himself a variety of web programming languages, set up a family website, and finally settled on Flash programming. I absent-mindedly encouraged him and assumed it was a passing fad. Boy, was I wrong!

Over the past year, my dad figured out that he could use his computer skills to rekindle his love for Math and Physics. He started building Flash movies on a variety of basic physics concepts. He built simulators, arcade games, etc. He joined the Flash developers' forum. He even helped a local (US) organization publish a car game for kids.

His work is beginning to get noticed in his community of interest. He's gotten emails complimenting his work from all over the world. Rafael, 75, a physics enthusiast and a mechanical engineer from Venezuela, wrote:



"I am a 75 year old retired mechanical engineer who uses Flash and Physics as a passtime. Have just come across your project and, believe me, I have spent some time observing the behaviour of the ball for different values of the elastic constants and find the whole very, very interesting. I sincerely congratulate you. I intend to keep playing with this because I am sure there is a lot to learn from it."


Recently, a Midwestern company contacted him and gave him a contract programming job to build some physics simulations. He set up a PayPal account and gets paid for having fun!

The truth just hit home: It is not all about companies attempting labor arbitrage or VCs demanding R&D centers in far flung places. It's not about jobs "moving"...

It is more about people all over the world empowered to express their creativity on a global scale and experience the Network. Broadband IP is just beginning.

Go Dad!

Friday, January 13, 2006

Life-graph of a Startup


There are hundreds of articles around the web from entrepreneurs on the 'gotchas' of a startup and what to expect. One of the best visuals I have seen anywhere is represented here. These folks are investors who work in the Indian market. I have personally known and worked with some of them in the past, and I mean it when I say that some of the folks running the show are the smartest I have ever worked with. This particular company, to the best of my knowledge, ran a job placement website along with some social networking.

So here is a lifecycle graph of their company, which began as a startup, crashed during 9-11 and the dot com bubble, refocused its priorities, got bought by a big company and eventually stabilized. Apart from the last part, I am sure many of us have seen this exact scenario in the Valley.

The graph represents "team" optimism over a time scale [reproduced after due permission from the author]


I think this graph has a lot of touchpoints for anyone who has gone through the pains and joys of starting a business (or being an integral part of one). Let's ignore the sharp decline between Sep 2000 and Dec 2000: no amount of planning and vision could have foreseen that 9-11 impact. Let's write it off as a sudden occurrence that simply couldn't be planned for.

I hope the authors of the article can help us answer some of these questions, which could be key learnings for us.

  • First signs of Dotcom burst: One of the most important things in a startup is to have people who have a finger on the pulse of their market. For this company, "first signs of dotcom burst" occurs at June 2000, and the sudden sharp decline begins at around Sep 2000 (which obviously is due to the 9-11 attacks adding to the sliding dot com burst). What were the 'first signs' of June 2000 as seen by their leaders? Was it the first 'customer hit directly impacting them'? Since this graph talks about the motivation of the entire team, it is hard to figure out when the leadership team really detected the 'burst', and what their strategy was for sensing the market pro-actively as opposed to being hit first and waking up afterwards.

  • 3 months from Death: This is one of the hardest times to retain employees who are not founders. The reason is actually quite simple and correlates directly with monetary gain. The financial gain a founding member gets from a successful business is usually many times higher than what anyone else can hope for (10x, 20x or even much more, depending on the company). Motivation can essentially be broken up into 'intrinsic' and 'extrinsic' motivation (HBR on Breakthrough Thinking). 'Intrinsic' motivation is fuelled from within the employee (the desire to solve), while 'extrinsic' motivation is fuelled by monetary rewards and the like. Founders and key employees usually have more intrinsic motivation (whether they like it or not, this is also significantly linked to the fact that they have more to gain or lose financially from the success or failure of the company, in addition, of course, to wanting to 'make a difference'). Intrinsic motivation, however, usually decreases as you go further down the ranks and is usually minimal in the 1-4 year experience range. If you need to recover from the "3 months from Death" stage, you need to chip away at the overheads and ensure that the folks directly contributing to building the product remain. This stage is tricky, because it is hard to offer extrinsic motivation at this point. So how do you go about retaining employees at this stage? What did the leaders of this company do differently? How different was the strategy that took them out of this stage from their original plan when the company started?

  • Monster acquires Jobahead: An acquisition could be either a great thing or a bad thing. Several companies here in the US get acquired for a pittance (I've seen companies here sold for a total of $20m when their burn rate was twice this amount), while others, like Skype, get acquired for ridiculously high amounts. Looking at the optimism graph here, one of two things might have happened: a) an acquisition was the exit strategy of this company to begin with, i.e. the company was modelled around selling out after a few years, or b) acquisition as an exit strategy evolved during the "3 months from Death" period. Either way, a job well done. However, one caveat at this stage: it is very unlikely that the level of optimism during an acquisition is the same across the company. An acquisition, being an exit criterion in a business plan, usually implies payouts and/or conversion (for the investing VCs, preferred series A holders and the founder stock pool). At this stage, it is hard for me to understand how this optimism can permeate the entire company; it seems more like 'euphoria' (I define 'euphoria' as 'possibly irrational optimism', while 'optimism' is 'possibly irrational reality' :-) )

  • Business Settles: It is interesting to note that the level of optimism seems to be a tad higher than when the company first started. Speaking from personal experience, this usually signifies a truly successful business and, most importantly, a business that has executed well. People start out with great ambitions, but very few can execute a business so that every employee feels a part of the whole as opposed to a hole in the part.
All in all, a very nice graph and trend.

Thursday, January 5, 2006

Security and Convergence in your Palm


Playing Devil's advocate (or when keeping 4 devices instead of one is a good thing):

One Device. In your pocket. Large screen, high-speed over-the-air connection. Your phone, your planner, your email, your presence, your Instant Messaging client, your gaming console, your shopping center. It remembers all your preferences and your details. Just ‘one click and go’. Watch TV, order PPV – just one click. Listen to music, buy new albums – one click. Read documents, interact.

In other words, that little PDA of yours is your home away from home. Always connected. How much better can it get ?

One Word: Security

As phones get more 'powerful' they morph into general-purpose machines, susceptible to the same remote exploits, DoS and security issues as an open PC on the Internet. To top it off, many phones run embedded OSes that cannot offer expensive virtual address space and address locking mechanisms, making it easier for one application to write over the address space of another (think heap and stack exploits). Proof-of-concept viruses for smart phones are already old news (here and here). Most of the attacks on PDA phones use the basic concept of buffer overflow, which is very powerful. The idea is this: whenever a function is called, the return address into the calling function is stored on the stack. When the function exits, the return address is popped off the stack and control transfers to that address. The trick, then, is to somehow overwrite that stored return address with one that points to malicious code. For example: if a badly written app does an strcpy(pFoo,fnData) and fnData is user input, I could craft a string for fnData that is large enough that, in trying to store it, the application overwrites its stack; that string is actually a binary-coded exploit that knows exactly where to place a modified return address pointing back to another place in that same string, which holds the malicious code. Ta-dah. We have an exploit. This is nothing new: techniques for buffer overflow exploits have existed for years. It is just that our phones were too 'specific purpose' for it to do much harm.
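To make the mechanics concrete, here is a toy model of the idea in Python. It is only a simulation: a bytearray stands in for a stack frame, laid out as a small buffer followed directly by the saved return address. The sizes, the addresses and the function names are all hypothetical, and a real frame has more in it (saved registers, alignment padding and so on), but the overwrite works the same way.

```python
import struct

BUF_SIZE = 8   # hypothetical size of the local buffer (the role of pFoo)
RET_SIZE = 4   # size of the saved return address in this toy model


def new_frame(return_address: int) -> bytearray:
    """Build a toy frame: [ 8-byte buffer | 4-byte saved return address ]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_address))


def saved_return(frame: bytearray) -> int:
    """Read back the return address stored just past the buffer."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic strcpy: copy the input without ever looking at BUF_SIZE."""
    n = min(len(data), len(frame))   # the toy frame is all the memory we model
    frame[:n] = data[:n]


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Bounds-checked variant: refuse input that does not fit the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input larger than buffer")
    frame[:len(data)] = data


frame = new_frame(0x08041000)                        # legitimate return address
payload = b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, payload)                       # 12 bytes into an 8-byte buffer
print(hex(saved_return(frame)))                      # the return address is now attacker-chosen
```

With `checked_copy` the same payload is rejected before a single byte is written, which is really all that a bounded copy buys you.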

Why is this a problem only for 'smarter phones'? Well, what is the worst that has happened to your ‘old’ phone? You downloaded a game and it crashed your system. That’s it, right? What is the worst that has happened to your email? You clicked on a link, it installed a trojan, and your mail server sent 100 Viagra emails to all your buddies, from you. Ouch. What can happen if your phone presence is compromised? Now put them all together as a converged application platform, where the compromise of one application can lead to a trojan compromising other installed applications. What happens if you click on something in the Outlook client on your phone and that installs a trojan that takes over control of your phone? How cool: hackers now have multiple application routes and ports through which they can attack your entire phone. Hey, it’s not a phone anymore; it’s a converged device! A powerful computer in your palm. And since all these applications sit next to each other in your phone, if one application is compromised, a trojan can attack the other applications within your phone, and no firewall can help, because it’s already in your phone! Hooray. And if you don't think this is real, there are already theories out there on how to achieve buffer overflow exploits via SMS messages.

So again, security. I hope application developers and phone OEMs realize that badly written applications lend themselves to easy exploits, and that as the convergence dream races ahead, so do security concerns. Gee, this was nothing. Here is a paranoid look at how wonderful it can get.


Tuesday, January 3, 2006

SPAM over Internet Telephony (SPIT, SPIM)


Happy New Year !!!
Click here to get your FREE iPod Nano !

No, wait! This is not a SPAM post. Rather, it's a post about SPAM!

There has been a lot of discussion in the past on how serious a problem Spam over Internet Telephony (SPIT) and Spam over Instant Messaging (SPIM) will become as VoIP deployments increase their market share. Spam itself, as we all know, is all-pervasive in the email world. I was reading an interesting report from Symantec which says that 67% of email is spam these days. While the percentage is staggering, there is some(?) comfort in knowing that it has ‘stabilized’, which might mean that spam filters/gateways are maturing at a rate that keeps up with the mutation of spam tricks.

The interesting thing is that even though pundits scream about the problem of spam over Internet Telephony, not too many carriers are biting, at least for now. It’s not that they don’t think it’s important; it just seems that they have other problems to solve (like it or not, security is the hardest and the last-solved problem in real life).

So here is my perspective of SPIT/SPIM: I am going to make sure I pepper this article with enough 4 lettered acronyms so that you are left dazed and impressed.

Incidentally, here is a screenshot of IM SPAM I received today:




Is SPIT a real problem ?

In short, I feel that SPIT is a problem that will eventually surface. The logic: the infrastructure needed for SPIT is very similar to the infrastructure needed for spam. The cost is very low, and there are no ‘Do-not-call’ registries on the Internet (at least for now). In addition, it is harder to pin down a valid source IP address than an originating phone number. All in all, the Internet offers SPITers better anonymity and lower costs, so why not? In fact, there are companies who already think it’s a big problem and have products out. Not to mention that this space will soon be chock-a-block with patents.

However, I must mention that I really don’t think SPIT is a huge problem for ‘walled-garden’ networks; it’s a much bigger problem when you have peer2peer networks.

Why is SPIT a bigger challenge than SPAM ?

There are a few reasons for this:

a) More intrusive: a telephone is intrusive by nature whereas email is not. When you get a phone call, it rings, and loudly. You cannot just ignore the ring, nor can you ‘move on to the next call’; you need to answer it. Compare that to a spam email, which you can choose to ignore, move past and eventually move to junk.

b) Harder to detect: Today’s spam filters are fairly advanced. In addition to white and black lists, a lot of them implement heuristics that try to detect language patterns in order to filter out possible spam disguised as valid content. With SPIT, you are not looking at text; it’s a voice. Applying heuristics to a voice stream is much harder. Add to that the fact that a single sentence can sound very different depending on who is speaking (accent, amongst other things).


SPIT in Walled Garden vs. Peer2Peer networks

One of the strongest and most effective ways to avoid SPIT is good authentication. In an ideal world, if the identity of every user could be authenticated, the chances that a spam bot gets to you would reduce greatly. Walled garden networks usually operate in ‘circles of trust’. Let me explain: let’s suppose bob@verizon.com were to call sue@att.net via SIP. For the att.net proxy to accept the call from Bob, it would most likely expect a digital authentication certificate from Verizon’s proxy telling the AT&T proxy that this call actually came from Verizon. In other words, AT&T trusts the Verizon network and expects the Verizon network to have authenticated its users. This sort of ‘trust circle’ between service providers is common. AT&T cannot go ahead and authenticate each and every user on a foreign domain, so instead it decides to trust the network, based on strong authentication. It is Verizon’s responsibility to make sure that Bob is a valid user. For a spammer to break such a network, it would have to a) manage to successfully authenticate with Verizon (which may well be a case of identity theft), or b) manage to hijack the network authentication certificate and key to be able to ‘fake its network identity’ (which is not easy to do), or c) be able to reach Sue’s phone directly, bypassing both proxies.

Case c) works only if the UA itself accepts calls from any source. A secure UA should be configured to reject any call that is not passed on to it by its inbound proxy via TLS. In other words, if bob@verizon.com were to call sue@att.net, Sue’s UA should check whether the call arrived with credentials from its authorized inbound SIP proxy and, if not, reject it, prompting the caller to go through the proxy.
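As a rough sketch of that acceptance rule, here is the decision written out in Python. This is not a real SIP stack: the `IncomingCall` record, its fields and the `proxy.att.net` hostname are all hypothetical stand-ins for whatever a real UA would learn from its TLS layer and the incoming INVITE.

```python
from dataclasses import dataclass

TRUSTED_INBOUND_PROXY = "proxy.att.net"   # hypothetical hostname of Sue's inbound proxy


@dataclass(frozen=True)
class IncomingCall:
    caller: str           # claimed identity, e.g. "bob@verizon.com"
    peer_host: str        # host that actually delivered the call to the UA
    over_tls: bool        # whether the last hop used TLS
    peer_cert_valid: bool # whether the peer's certificate checked out


def ua_should_accept(call: IncomingCall) -> bool:
    """Accept only calls handed over by the trusted inbound proxy on a valid TLS link."""
    return (
        call.over_tls
        and call.peer_cert_valid
        and call.peer_host == TRUSTED_INBOUND_PROXY
    )


via_proxy = IncomingCall("bob@verizon.com", "proxy.att.net", True, True)
direct = IncomingCall("bob@verizon.com", "203.0.113.7", False, False)

print(ua_should_accept(via_proxy))   # routed through the trusted proxy
print(ua_should_accept(direct))      # case c): tried to reach the phone directly
```

Note that the caller's claimed identity plays no part in the decision. The UA is trusting the path, and leaves it to the proxies to have authenticated Bob.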

If all UAs were indeed configured this way, then the problem of SPIT reduces a great deal (not eliminated, but significantly reduced). However, not all UAs are configured this way. In addition, while this can be enforced within a walled garden network, problems arise when:
  • The UAs are part of a peer2peer network where there may not be any ‘central agreements of trust’
  • An ad-hoc network tries to call into a walled garden network (say, your uncle in Korea has set up his own FWD SIP phone and tries to call you, a member of MCI’s SIP network, and MCI has no idea about @fwd.pulver.com; just an example)

One way around this is for UAs to allow caller authentication. In other words, even if a call comes via unknown channels, instead of rejecting it, challenge the caller for identity, while at the same time not making it cumbersome for the called user (for example, if every SPIT call rang your phone first and was only then challenged, you’d switch back to the PSTN within a day! At least in the PSTN, the FCC has done an effective job with the DoNotCall registry). For example, Vonage, which is an example of a walled garden implementation, requires authentication at the very least to let calls in.


Various ways to address SPIT

SPIT, just like its older sibling spam, is a prime example of ‘many partial solutions add up to a stronger overall solution’. In other words, there is no one mechanism that can effectively eliminate SPIT. The solution is to deploy various levels of defence in the hope that one of them catches the call before it reaches you. Here are some of them (since, as mentioned, the content filtering that works with spam is mostly useless with voice):

  • Strong Authentication – is an important first step in filtering SPIT. As discussed in detail above, one of the best ways to filter SPIT is by network and UA participation, where the UA accepts calls only over a TLS route from its inbound proxy and the network authenticates users. However, this is far from the current deployment situation (many UAs don’t yet support TLS).
  • Reputation Based Systems – in addition to network identity, a reputation system works by assigning scores to sources. The score is computed statistically from the source's history. As an example, if a SPIT-er ‘seller@tek.ru’ makes a lot of calls to a network, and the users ‘flag’ those calls as ‘bad calls’, then, depending on several factors, that identity could be marked as ‘bad’ and this reputation could be distributed across the network to warn others. There are of course several challenges to this: a) spam agents often change identities; b) it is easy to poison such a system, for example by forcing negative feedback even when it is not true. In other words, a good reputation system is a harder solution than it sounds.
  • Central black lists – Not a complete solution, but an effective one. Spammers will keep creating new addresses while the black lists will keep adding to their repository. The lists eventually get more complete and more effective. Such lists exist today, and are very helpful.
  • Puzzles – Another mechanism: when a call arrives from an unknown source, throw an automated challenge. An example: if I receive a call for the very first time from user “Joe”, instead of ringing my phone, redirect him to an IVR that asks “Joe” to press some random 4-digit sequence as a security verification. This is an irritant for Joe, especially if he is a valid user, but it needs to be done only once. For this to work, however, either the called UA or the called UA’s network server needs to remember whether this is a first call or whether Joe has called before. I personally think this is an effective solution, albeit with an irritant factor for the caller, but hopefully only once. The bigger concern, however, is who remembers whether this is the first call or the millionth, especially when thousands call you over the years. We need Google’s infinite database utilization techniques!
  • Payment systems – Folks like Microsoft pushed this hard for email. The idea is simple: to make a call, you first need to deposit a payment via some payment gateway for the called network. If the call is accepted, you get refunded; if not, you lose your money. I personally dislike this solution and don’t think it will work at all for end-user VoIP networks: the very idea of having a credit card on file or doing a transaction per new call to a new network is a turn-off for consumers.
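To show how a few of these layers might stack up, here is a minimal Python sketch that chains a black list, a naive reputation score and a one-time puzzle. Everything in it is an assumption made for illustration: the `CallScreen` class, the 0.5 threshold and the five-call minimum are invented, and it deliberately ignores the identity-churn and poisoning problems mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class CallScreen:
    blacklist: set = field(default_factory=set)
    total_calls: dict = field(default_factory=dict)   # caller -> calls seen
    bad_flags: dict = field(default_factory=dict)     # caller -> calls flagged 'bad'
    verified: set = field(default_factory=set)        # callers who solved the puzzle once
    bad_ratio_threshold: float = 0.5                  # hypothetical cut-off
    min_calls: int = 5                                # history needed before scoring

    def record_call(self, caller: str, flagged_bad: bool) -> None:
        """Store user feedback about one completed call."""
        self.total_calls[caller] = self.total_calls.get(caller, 0) + 1
        if flagged_bad:
            self.bad_flags[caller] = self.bad_flags.get(caller, 0) + 1

    def bad_ratio(self, caller: str) -> float:
        """Fraction of this caller's calls flagged bad; 0.0 until enough history exists."""
        total = self.total_calls.get(caller, 0)
        if total < self.min_calls:
            return 0.0
        return self.bad_flags.get(caller, 0) / total

    def challenge_passed(self, caller: str) -> None:
        """Remember that the caller answered the IVR puzzle correctly."""
        self.verified.add(caller)

    def screen(self, caller: str) -> str:
        """Return 'reject', 'challenge' or 'ring' for an incoming call."""
        if caller in self.blacklist:
            return "reject"
        if self.bad_ratio(caller) >= self.bad_ratio_threshold:
            return "reject"
        if caller not in self.verified:
            return "challenge"
        return "ring"


screen = CallScreen(blacklist={"seller@tek.ru"})
print(screen.screen("seller@tek.ru"))      # on the central black list
print(screen.screen("joe@example.org"))    # first call from Joe: send him to the puzzle
screen.challenge_passed("joe@example.org")
print(screen.screen("joe@example.org"))    # Joe is remembered, so the phone rings
```

The `verified` set is exactly the "who remembers the first call" state discussed under Puzzles. Here it lives in memory, which is the part that would not survive thousands of callers over years.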

Conclusion
There are, of course, many more solutions and more in-depth discussions we could get into, but that would run to many more pages. In short, it doesn't take much more than spam's infrastructure to do SPIT effectively. The problem, as I see it, is more pronounced for decentralized and peer2peer networks, but it can also plague walled garden networks (for example, even if the UA were to reject calls that were not routed by its trusted proxy, how many such requests are needed for a denial-of-service attack directly on the UA?). A lot of SPIT can be avoided if an 'onion ring' SPIT detection and filtering mechanism is deployed, where both the network and the client participate in the effort.

Hope this whets your appetite as you read more.

Further reading: here, here, here