
Friday, June 25, 2010

FaceTime on iPhone 4: Vanilla unencrypted STUN and SIP

(July 13: sorry for the downtime, looks like my bandwidth limits were exceeded. Upgraded my hosting package - fixed)

(Note: only the call part is vanilla SIP. The procedure for registering a FaceTime user with Apple's servers etc. is all non-SIP and encrypted/ciphered.)

(for my user review of the iphone4 and bumper read here)

Well heck, good job Apple! I just tested FaceTime and did a quick check on its protocol. No hacking needed, just an on-the-wire black-box inspection: it's plain SIP, with STUN for firewall discovery. Apple plans to make this protocol public, and they seem to have done an excellent job. And thanks for showing the world that you don't need complicated encryption and proprietary tunneling tricks for an excellent experience. For the most part, you need a good codec set, a good media stack that can adaptively switch codecs and manage buffers, and a good 'point-of-presence' network.

I am just going to restrict this post to an overview of the flow.

Enjoy:

click on each image for a larger size (if they are small)

This is a FaceTime call flow: good, plain SIP (they use MESSAGE for some proprietary data exchange during the call).

The rest is perfect SIP.



The protocols are here to see (besides SIP)



Ah, here is their 200 OK for the INVITE:



A quick look at their RTP stream:



Good job, Apple. Thanks for embedding an excellent-quality SIP client into your dialer experience.

47 comments:

  1. Hi,
    Does this mean that we can call an iPhone 4 user from a regular SIP client? Must every single iPhone 4 user have its own SIP address that we can contact?

    donnib

  2. Good point, what's the complete URI? I see it's blocked out in your packet captures. I see support for iLBC, which is good also; it could work with Skype.

  3. Also, I want to point out that this really isn't all that surprising. 3GPP is all about SIP; the proprietary data schema is in the IMS AS & HSS and the SCIM function of the CSCF.

  4. @jason: The complete req-URI in INVITE is user@myip:port - basically, facetime, in this version is sending it to the IP address+port of my iphone. On SIP/IMS: well, this is really plain SIP. It does not use any of the mandatory IMS extensions like PANI or others.

  5. @donnib - that would really depend on how apple will authenticate/admit users. As I mentioned, while most of this is vanilla SIP, there is proprietary stuff going on, along with new headers in the message exchange (primarily MESSAGE). Anyhow, I'd prefer for Apple to first publish their protocol formally before I post a blog on these details.

  6. It seems to me that the most impressive step occurs right before the SIP INVITE.

    They are doing a smooth transition from a 3G call into the VoIP session. Somehow, they are mapping a phone number to a visible IP address. Impressive enough in the simple cases, but downright amazing in the face of multiple operators, hidden caller-id, etc.

    How are they working this magic?

  7. Did you happen to spot how it maps the called party's telephone number to their current IP address?

  8. We...ll, only the call part is SIP. There are a lot of cipher/TLS/SSL exchanges going on to authenticate a FaceTime user, so don't expect to make a call to a FaceTime SIP client using X-lite anytime soon ;-)

  9. [...] of videoconferencing? The question remains open… but the subject clearly arouses a certain interest. And indeed, both Apple (indirectly) and the examination of the frames exchanged [...]

  10. Thanks! I've been looking for this for a while.

  11. @David, well, it seems pretty straightforward. Remember, FaceTime does not do both hand-out (from CS to WiFi) and hand-in (WiFi to CS). It only does hand-out. Once on WiFi, you can't get back to CS; that call is dropped. Doing a handover from CS to WiFi is pretty straightforward. Basically, a WiFi call can be set up in the background while the CS call is active. If the WiFi call fails for any reason, the CS call continues. I can't speak for the iPhone, but in many other phones (like Android), CS call media is handled at the baseband level, while for a VoIP call it would be at the media framework level. Trying to establish a VoIP call does not interfere with the CS call at all. In fact, the part FaceTime does *not* do is more complicated: handover back from WiFi to CS. That's the more challenging part with respect to smooth media transition.

    It's not hard for apple at all to map the PSTN # to a VoIP #. My strong guess is that Apple has already authenticated the binding with their facetime servers via their TLS/SSL exchanges (try it out, disable/enable facetime in settings and each time you do it, you will see these new security associations being set up)

    With the identity authenticated, all Apple really needs to do is send you an INVITE to the IP:port that is discovered by STUN (maybe they use other ICE procedures if STUN fails). As for the number that is displayed on your screen during FaceTime, that is just the From header text in the SIP INVITE (which is fine, because Apple has already authenticated the identity outside of SIP). Similarly, Apple can use the same PSTN number (which is unique to every phone) to differentiate VoIP users too. This is typical VoIP stuff; see the Contact header in the received INVITE, for example.
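
To make the From/Contact point concrete, here is a minimal Python sketch that pulls those headers out of an INVITE. The message below is a made-up placeholder (documentation IP addresses, fictional numbers, a port borrowed from the comments), not data from the actual capture, and `parse_sip_headers` is a hypothetical helper rather than part of any SIP library.

```python
def parse_sip_headers(message: str) -> tuple[str, dict[str, str]]:
    """Split a SIP message into its start line and a dict of headers.

    Header names are lower-cased. Repeated headers (e.g. several Via lines)
    overwrite each other, which is acceptable for this small illustration.
    """
    head = message.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    headers: dict[str, str] = {}
    for line in lines[1:]:
        # Only split on the first colon: header values contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers


# Hypothetical INVITE: every number and address here is a placeholder.
SAMPLE_INVITE = (
    "INVITE sip:user@192.0.2.10:16402 SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 198.51.100.7:16402;branch=z9hG4bKexample\r\n"
    "From: \"+15550100\" <sip:+15550100@198.51.100.7:16402>;tag=abc\r\n"
    "To: <sip:+15550199@192.0.2.10:16402>\r\n"
    "Contact: <sip:+15550100@198.51.100.7:16402>\r\n"
    "Call-ID: example-call-id\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

start_line, headers = parse_sip_headers(SAMPLE_INVITE)
print(start_line)          # request line with the user@ip:port request-URI
print(headers["from"])     # the text the callee's screen would show
print(headers["contact"])  # where the caller can be reached directly
```

Since the identity is asserted outside of SIP, a parser like this only tells you what the headers claim, not whether the claim is trustworthy.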

  12. Has anyone tried sending an INVITE to the phone to see if it answers?

  13. [...] its promise to publish the FaceTime video calling protocol, some details are starting to emerge. Arjun Roychowdhury did a little packet sniffing and reports that the calls seem to go over vanilla SIP and STUN. The [...]

  14. I don't think David's question is answered yet: How does Apple get the two phones' IP addresses from the phone call, unless every iPhone 4 pings Apple every time it makes a call to report its current IP?

  15. @Marcus, well, if you look at it, there are many ways Apple may know your phone's IP. The entire framework of push notifications in iphone is based on a foundation of the apple push servers maintaining a persistent TCP connection as much as possible with your phone. There is HTTP traffic that also flows between your iphone and apple - when connected through WiFi, that would be your WiFi IP address. I don't know in the case of Facetime, which channel it uses to get your IP, but my point is there are several channels as described above - to get the initial INVITE to your phone (apple uses a different port for SIP). Then STUN comes in before the media starts flowing (All of this is a guess - but I think it is reasonable)

  16. The SIP session is pretty standard, and then SDP will be sent in the INVITE to negotiate media endpoints and codecs to use to setup the call. This is where STUN comes in, as it allows media traversal through NAT.

    The FaceTime servers must have a media relay capability as well, because there will be many situations where two iPhones can't connect directly to one another and must use something in the cloud to pass the media between the two.

  17. Thanks for the interesting information you have provided. I have some remarks:

    The initial INVITE is sent once the call has been answered at the called side.
    Have a look at the timestamps of the SIP messages, especially 180 Ringing and 200 OK! The SIP message flow does not correspond to the real call states.

    There is only one port (16402) for SIP signalling, RTP streams and RTCP!

    Apple does not use a SIP registrar / proxy. The session is established directly between the user agents.

    STUN does not address a STUN server in the cloud, but is end-to-end too. It seems that it is used only to create the bindings in the NAT tables of the routers, simultaneously from both sides.

    Arjun, do you find any phone number involved in the call in the From, To, Via or Contact headers?
    Or are only IP addresses used?
    Is the FQDN of the XMPP server part of the SIP URIs?
    Is it possible to post the XMPP server name / IP address?
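
For readers who have not looked at STUN on the wire, the binding-creation step described above boils down to each side sending a small Binding Request. The Python sketch below builds one with the 20-byte header layout from RFC 5389. Whether FaceTime uses this STUN revision or the older RFC 3489 is not stated in the post, so treat the magic cookie as an assumption; the function names are hypothetical.

```python
import os
import struct
from typing import Optional

STUN_BINDING_REQUEST = 0x0001   # message type: Binding Request
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Return a 20-byte STUN Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # Header: type (16 bits), attribute length (16 bits), magic cookie (32 bits).
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id


def parse_stun_header(packet: bytes) -> tuple[int, int, int, bytes]:
    """Return (message type, attribute length, cookie, transaction ID)."""
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    return msg_type, length, cookie, packet[8:20]


request = build_binding_request(b"\x00" * 12)
print(len(request), parse_stun_header(request))
```

Sending such a packet from the media port toward the peer (for example with `socket.sendto`) is what opens the NAT binding; when both sides do it at roughly the same time, each router already has an outbound mapping when the other side's packet arrives.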

  18. @Matthias:
    1) Well, this session was when I received a FaceTime call (INVITE came to my iPhone 4). The INVITE is the invitation I got to answer the caller's FaceTime call. I looked at the call flow again, and it's absolutely in line with a standard SIP call: first I got an INVITE, then I sent 100, then I sent 180, then I sent 200, then I received ACK.

    2) Yes, the call is P2P as far as SIP goes, no proxy cuteness as far as I could see (looking at Via). I don't remember this fully and will check again, but I think that's correct. (As far as I could tell, Apple is using encrypted HTTP and potentially SMS to assert the identity and routing path to the user.)

    3) I'll take a look at the RTP packets again tomorrow, but no, I don't believe I saw SIP and RTP/RTCP on the same ports.

    4) Yes, I found phone numbers. From, To and Contact.

    5) No, I am not comfortable posting the server IPs - I really don't want to give out apple server IPs at this stage (I fully understand anyone with FaceTime can easily see a wireshark dump for themselves, just that I don't think it is kosher for me to post it)

  19. Where are the ICE attributes in the SDP answer in the 200 OK message? Did Apple skip some steps in ICE to optimize NAT discovery?

  20. @Arjun:

    1) The time between the initial INVITE and ACK is 146 msec, and between 180 and 200 OK only 19 msec. It is not possible to answer the call so quickly. That's because the real call establishment happens in XMPP (which is encrypted, so you have no chance to decode it). The SIP session is used only to establish the media streams.

    2) Sure, P2P is a very basic SIP scenario, but the 'normal' way is to use Registrar and Proxy.

    3) You can also look at the port information in the From header and the m-lines for audio and video in the SDP of the initial INVITE that you published.
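
The m-line check suggested in point 3 is easy to script. This Python sketch extracts media type, port and payload formats from an SDP body. The SDP below is a hypothetical stand-in (placeholder address and payload types, with port 16402 taken from the earlier comment), not the published capture, and `parse_media_lines` is an illustrative helper only.

```python
from typing import NamedTuple


class MediaLine(NamedTuple):
    media: str          # "audio", "video", ...
    port: int           # transport port offered for this stream
    proto: str          # e.g. "RTP/AVP"
    formats: list[str]  # payload type numbers as strings


def parse_media_lines(sdp: str) -> list[MediaLine]:
    """Collect every m= line of an SDP body."""
    result = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            media, port, proto, *formats = line[2:].split()
            # The port field may carry a "/<count>" suffix; keep the base port.
            result.append(MediaLine(media, int(port.split("/")[0]), proto, formats))
    return result


# Hypothetical SDP offer: address and payload types are placeholders.
SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 16402 RTP/AVP 104
m=video 16402 RTP/AVP 126
"""

media_lines = parse_media_lines(SAMPLE_SDP)
for m in media_lines:
    print(m.media, m.port, m.formats)
```

If the audio and video m-lines in a real capture carry the same port as the SIP signalling, that would support the single-port observation; if they differ, it would support Arjun's recollection instead.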

  21. Let's note that the IP addresses of both parties as well as the phone numbers are transmitted in clear text in the SIP packets.

    Also, the conversation is over RTP, not SRTP, which means it is not encrypted.

  22. [...] session management) and STUN (to get through proxies or firewalls) that are used in the implementation of Facetime. The RTP protocol is used to transport the packets, and the video encoding is [...]

  23. To answer the question of how Apple maps IP to phone number: when the phone is set up, there is communication with registration.ess.apple.com, and afterwards, in all calls, only with invitation.ess.apple.com. Further, the phone sends an SMS to Apple (in Europe via a UK number; you can check your bill) to link the phone to the number. If you change the SIM card, this happens again. Then, before call setup, the calling phone asks Apple for the IP of the target phone.

  24. [...] it should be possible to connect via FaceTime using other methods. iConverged discovered the video connection is done through SIP and STUN and routed through the IP address and [...]

  25. [...] FaceTime: SIP/STUN, + encrypted contact with Apple servers [...]

  26. [...] bit technical but very interesting you may read to understand how FaceTime work. Really work ;-) iConverged Facetime on Iphone 4: Vanilla unencrypted STUN and SIP Leaked: Apple Stealing All FaceTime Information, AT&T Locks Users via OTA Updates Regards, [...]

  27. [...] Facetime on Iphone 4: Vanilla unencrypted STUN and SIP – roychowdhury.org No hacking needed – just an on the wire black box inspection – its just plain SIP and STUN for firewall discovery. [...]

  28. [...] read this post about a guy who has sniffed the iPhone4’s FaceTime [...]

  29. how is the handover done exactly? are the CS and Wifi operating at the same time for a few seconds, ie CS audio + Wifi audio on mute, Wifi video running ?

  30. 3GPP has defined the Voice Call Continuity feature to allow voice calls to switch between WiFi and CS cellular, and it requires the handset to support 3GPP IMS SIP. I guess FT cannot support it now.

  31. [...] Here are some quick pictures showing the process from behind the scenes (picture credits): [...]

  32. [...] after FaceTime for iOS was released a few network dumps of FT calls were published online. What is clear from those dumps is that Apple built its own, proprietary peer [...]

  33. Correct. There are no VCC procedures here.

  34. Hello, would you mind letting me know which webhost you're using? I've loaded your blog in 3 completely different internet browsers and I must say this blog loads a lot quicker than most. Can you suggest a good hosting provider at a fair price? Thanks a lot, I appreciate it!

  35. After such a long time, has anyone been able to grab the video stream from the packets and verify that it is a valid H.264 stream? It seems to me the video has been encrypted or manipulated, so it does not look like a valid video stream.

  36. OK, but apart from all the tech details, if I make a Facetime call to my wife in the UK, while I am away in the USA, it won't use mobile roaming data services, but will be billed as a normal phone call?

  37. I use "A Small Orange"

  38. Yes, the folks at packetstan did a much deeper analysis and reported that the video feed was not encrypted. Note that this was an early version of FaceTime (they did the analysis a few weeks after I did mine), so I don't know if recent updates to FaceTime encrypted it. Read http://www.packetstan.com/2010/07/special-look-fa... for details.

  39. SRTP packets *can* look just like RTP packets, if the optional fields are not provided. So the fact that the conversation appears to be RTP is not conclusive evidence that the media payloads are not encrypted.

    See: http://images.apple.com/ipad/business/docs/iOS_Se...
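
To see why the header alone cannot settle the question, here is a small Python sketch that decodes the 12-byte fixed RTP header. SRTP leaves this header in the clear and encrypts only the payload, so the same parser succeeds on both. The sample packet is synthetic (arbitrary payload type, sequence number and SSRC), not taken from a FaceTime capture, and `parse_rtp_header` is a hypothetical helper.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (CSRC list and extensions ignored)."""
    if len(packet) < 12:
        raise ValueError("packet too short for an RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Synthetic packet: version 2, payload type 126, followed by 16 opaque bytes.
sample = struct.pack("!BBHII", 0x80, 126, 1, 3000, 0x1234ABCD) + b"\x00" * 16
rtp_header = parse_rtp_header(sample)
print(rtp_header)
```

Deciding whether the payload is encrypted therefore means inspecting the bytes after the header, for example by checking whether they parse as the codec named in the SDP.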

  40. An impressive share! I have just forwarded this onto a co-worker who had been conducting a little homework on this. And he actually bought me breakfast because I stumbled upon it for him... lol. So allow me to reword this... Thank YOU for the meal!! But yeah, thanks for spending time to talk about this topic here on your blog.

  41. The information about the iPhone 4 is unique here and I think you shared almost all the points here.
