/*
 * Copyright (C) 2008-2011 Teluu Inc. (http://www.teluu.com)
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */

/**
@defgroup nat_intro Introduction to Network Address Translation (NAT) and NAT Traversal
@brief This page describes NAT, the problems it causes, and the available solutions.

\section into Introduction to NAT

NAT (Network Address Translation) is a mechanism where a device modifies
the IP address and port number of a packet, mapping the address from one
realm to another (usually from a private IP address to a public IP address
and vice versa). The NAT device allocates a temporary port number on its
public side when it forwards an outbound packet from an internal host
towards the Internet, maintains this mapping for some predefined time, and
forwards inbound packets received from the Internet on this public port
back to the internal host.
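
The mapping mechanism described above can be pictured as a small table
lookup. The following is only a minimal sketch under simplifying
assumptions: all names (nat_table, nat_outbound, nat_inbound) and the port
pool start are hypothetical, the mapping timer is left out, and no
filtering of the remote address is performed. It does not describe any
particular NAT product.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_MAX_MAPPINGS 64
#define NAT_FIRST_PORT   40000   // hypothetical start of the public port pool

// One binding between an internal transport address and a public port.
typedef struct nat_mapping {
    uint32_t int_addr;   // internal (private) IPv4 address
    uint16_t int_port;   // internal port
    uint16_t pub_port;   // port allocated on the public side
    int      in_use;
} nat_mapping;

typedef struct nat_table {
    nat_mapping map[NAT_MAX_MAPPINGS];
    uint16_t    next_port;
} nat_table;

void nat_table_init(nat_table *t)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < NAT_MAX_MAPPINGS; ++i)
        t->map[i].in_use = 0;
    t->next_port = NAT_FIRST_PORT;
}

// Outbound packet: reuse the mapping of this internal address/port, or
// allocate a new public port. Returns the public port, 0 if the table is full.
uint16_t nat_outbound(nat_table *t, uint32_t int_addr, uint16_t int_port)
{
    size_t i;
    nat_mapping *slot = NULL;

    assert(t != NULL);
    for (i = 0; i < NAT_MAX_MAPPINGS; ++i) {
        nat_mapping *m = &t->map[i];
        if (m->in_use && m->int_addr == int_addr && m->int_port == int_port)
            return m->pub_port;
        if (!m->in_use && slot == NULL)
            slot = m;
    }
    if (slot == NULL)
        return 0;
    slot->int_addr = int_addr;
    slot->int_port = int_port;
    slot->pub_port = t->next_port++;
    slot->in_use = 1;
    return slot->pub_port;
}

// Inbound packet: forward only if the public port has a mapping.
// Returns 1 and fills the internal address/port, or 0 to drop the packet.
int nat_inbound(const nat_table *t, uint16_t pub_port,
                uint32_t *int_addr, uint16_t *int_port)
{
    size_t i;

    assert(t != NULL && int_addr != NULL && int_port != NULL);
    for (i = 0; i < NAT_MAX_MAPPINGS; ++i) {
        const nat_mapping *m = &t->map[i];
        if (m->in_use && m->pub_port == pub_port) {
            *int_addr = m->int_addr;
            *int_port = m->int_port;
            return 1;
        }
    }
    return 0;
}
```

Note that nat_inbound() can only succeed after nat_outbound() has been
called for the same internal host, which is the property that makes an
internal host unreachable until it transmits first.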

NAT devices are installed primarily to alleviate the exhaustion of the IPv4
address space by allowing multiple hosts to share one public (Internet)
address. In addition, because a mapping can only be created by a
transmission from an internal host, a NAT device is often installed even
when IPv4 address exhaustion is not a problem (for example when there is
only one host at home), to provide some degree of shielding for the
internal hosts against threats from the Internet.

Although NAT provides some shielding for the internal network, one must
distinguish a NAT solution from a firewall solution. NAT is not a firewall.
A firewall is a security solution designed to enforce the security policy
of an organization, while NAT is a connectivity solution that allows
multiple hosts to use a single public IP address. Understandably, the two
functions are difficult to separate at times, since many (typically
consumer) products claim to do both in the same device and simply label
the device a "NAT box". But we do want to make this distinction clear,
as PJNATH is a NAT traversal helper and not a firewall bypass solution (yet).

\section problems The NAT traversal problems

NAT works well for typical client-server communication (such as web and
email), since it is always the client that initiates the conversation and
the client normally does not need to maintain the connection for a long
time. For peer-to-peer communication, however, and especially for VoIP,
NAT causes major problems. These problems are explained in more detail
below.

\subsection peer_addr Peer address problem

In VoIP, we normally want the media (audio and video) to flow directly
between the clients, since relaying is costly (both in terms of bandwidth
cost for the service provider and the additional latency introduced by
relaying). To do this, each client informs the other client of its media
transport address by sending it over the VoIP signaling path, and the
other side then sends its media to this transport address.

And there lies the problem. If the client software is not NAT aware, it
sends its private IP address to the other client, and the other client is
not able to send media to this address.

Traditionally this was solved by using STUN. With this mechanism, the client
first finds out its public IP address/port by querying a STUN server, then
sends this public address instead of its private address to the other
client. When both sides use this mechanism, they can send media packets to
these addresses, thereby creating a mapping in the NAT (also called opening
a "hole", hence this mechanism is popularly called "hole punching"), and
both can then communicate with each other.

But this mechanism does not work in all cases, as will be explained below.
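
To make the STUN query step more concrete, the sketch below builds a
Binding Request and decodes the mapped address from a response attribute.
It assumes the RFC 5389 wire format (20-byte header with magic cookie,
XOR-MAPPED-ADDRESS attribute); the older RFC 3489 format differs. The
function names are hypothetical and are not part of the PJNATH API, and
socket I/O, retransmission and full message parsing are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STUN_BINDING_REQUEST 0x0001u
#define STUN_MAGIC_COOKIE    0x2112A442u
#define STUN_HEADER_LEN      20
#define STUN_TSX_ID_LEN      12

// Writes a Binding Request without attributes into buf. tsx_id points to
// 12 bytes chosen by the caller (they should be random).
// Returns the number of bytes written, or 0 if buf is too small.
size_t stun_build_binding_request(uint8_t *buf, size_t buf_len,
                                  const uint8_t tsx_id[STUN_TSX_ID_LEN])
{
    if (buf == NULL || tsx_id == NULL || buf_len < STUN_HEADER_LEN)
        return 0;
    buf[0] = (uint8_t)(STUN_BINDING_REQUEST >> 8);
    buf[1] = (uint8_t)(STUN_BINDING_REQUEST & 0xFF);
    buf[2] = 0;                                  // message length: the
    buf[3] = 0;                                  // request has no attributes
    buf[4] = (uint8_t)(STUN_MAGIC_COOKIE >> 24);
    buf[5] = (uint8_t)((STUN_MAGIC_COOKIE >> 16) & 0xFF);
    buf[6] = (uint8_t)((STUN_MAGIC_COOKIE >> 8) & 0xFF);
    buf[7] = (uint8_t)(STUN_MAGIC_COOKIE & 0xFF);
    memcpy(buf + 8, tsx_id, STUN_TSX_ID_LEN);
    return STUN_HEADER_LEN;
}

// Decodes the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute
// (reserved, family, X-Port, X-Address). Returns 1 on success, 0 if the
// address family is not IPv4.
int stun_decode_xor_mapped_ipv4(const uint8_t val[8],
                                uint32_t *addr, uint16_t *port)
{
    uint16_t xport;
    uint32_t xaddr;

    assert(val != NULL && addr != NULL && port != NULL);
    if (val[1] != 0x01)
        return 0;
    xport = (uint16_t)((val[2] << 8) | val[3]);
    xaddr = ((uint32_t)val[4] << 24) | ((uint32_t)val[5] << 16) |
            ((uint32_t)val[6] << 8)  |  (uint32_t)val[7];
    // The port is XOR-ed with the top 16 bits of the magic cookie,
    // the address with the whole cookie.
    *port = (uint16_t)(xport ^ (STUN_MAGIC_COOKIE >> 16));
    *addr = xaddr ^ STUN_MAGIC_COOKIE;
    return 1;
}
```

The decoded address/port is what the client would then send to the other
client in place of its private address.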

\subsection hairpin Hairpinning behavior

Hairpinning is a behavior where a NAT device forwards packets from a host
in the internal network (let's call it host A) back to another host (host B)
in the same internal network, when it detects that the (public IP address)
destination of the packet is actually a mapped IP address that was created
for the internal host (host B). This is a desirable behavior of a NAT,
but unfortunately not all NAT devices support it.

Without this behavior, two (internal) hosts behind the same NAT will not
be able to communicate with each other if they exchange their public
addresses (resolved by STUN as above).

\subsection symmetric Symmetric behavior

NAT devices do not behave uniformly, and people have been trying to classify
their behavior into different classes. Traditionally NAT devices are
classified into Full Cone, Restricted Cone, Port Restricted Cone, and
Symmetric types, according to <A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
section 5. A more recent method of classification, as explained by
<A HREF="http://www.ietf.org/rfc/rfc4787.txt">RFC 4787</A>, describes
NAT behavior with two attributes: the mapping behavior attribute and the
filtering behavior attribute. Each attribute can be one of three types:
<i>Endpoint-Independent</i>, <i>Address-Dependent</i>, or
<i>Address and Port-Dependent</i>. With this new classification method,
a Symmetric NAT is a NAT with Address and Port-Dependent mapping.

Among these types, the Symmetric type is the hardest one to work with.
The problem is that the NAT allocates a different mapping (for the same
internal host) for the communication with the STUN server and for the
communication with other (external) hosts, so the IP address/port that
one host sends to the other is meaningless to the recipient, since it is
not the actual IP address/port mapping that the NAT device creates for
that recipient. As a result, when the recipient tries to send a packet to
this address, the NAT device drops the packet because it does not
recognize the sender as a host "authorized" to send to this address.
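
The effect of the mapping behavior can be illustrated with a small
simulation. This is a sketch only: the nat_sim type and its functions are
hypothetical names invented for this example, filtering behavior and
timers are left out, and the starting port is arbitrary. It shows that
with Endpoint-Independent mapping the port learned from the STUN server is
also the port used towards the peer, while with Address and Port-Dependent
(symmetric) mapping it is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum nat_mapping_behavior {
    NAT_MAP_ENDPOINT_INDEPENDENT,
    NAT_MAP_ADDRESS_DEPENDENT,
    NAT_MAP_ADDRESS_AND_PORT_DEPENDENT   // "symmetric" in RFC 3489 terms
} nat_mapping_behavior;

typedef struct nat_binding {
    uint32_t int_addr;   // internal source address
    uint16_t int_port;   // internal source port
    uint32_t dst_addr;   // destination that created the binding
    uint16_t dst_port;
    uint16_t pub_port;   // public port allocated for the binding
} nat_binding;

#define NAT_SIM_MAX 32

typedef struct nat_sim {
    nat_mapping_behavior behavior;
    nat_binding          bindings[NAT_SIM_MAX];
    size_t               count;
    uint16_t             next_port;
} nat_sim;

void nat_sim_init(nat_sim *n, nat_mapping_behavior behavior)
{
    assert(n != NULL);
    n->behavior = behavior;
    n->count = 0;
    n->next_port = 50000;   // arbitrary start of the simulated port pool
}

// Decides whether an existing binding is reused for a new outbound packet.
static int nat_sim_reuses(const nat_sim *n, const nat_binding *b,
                          uint32_t int_addr, uint16_t int_port,
                          uint32_t dst_addr, uint16_t dst_port)
{
    if (b->int_addr != int_addr || b->int_port != int_port)
        return 0;
    switch (n->behavior) {
    case NAT_MAP_ENDPOINT_INDEPENDENT:
        return 1;
    case NAT_MAP_ADDRESS_DEPENDENT:
        return b->dst_addr == dst_addr;
    default:
        return b->dst_addr == dst_addr && b->dst_port == dst_port;
    }
}

// Sends one outbound packet through the simulated NAT and returns the
// public source port it gets, or 0 if the binding table is full.
uint16_t nat_sim_send(nat_sim *n, uint32_t int_addr, uint16_t int_port,
                      uint32_t dst_addr, uint16_t dst_port)
{
    size_t i;
    nat_binding *b;

    assert(n != NULL);
    for (i = 0; i < n->count; ++i) {
        b = &n->bindings[i];
        if (nat_sim_reuses(n, b, int_addr, int_port, dst_addr, dst_port))
            return b->pub_port;
    }
    if (n->count == NAT_SIM_MAX)
        return 0;
    b = &n->bindings[n->count++];
    b->int_addr = int_addr;
    b->int_port = int_port;
    b->dst_addr = dst_addr;
    b->dst_port = dst_port;
    b->pub_port = n->next_port++;
    return b->pub_port;
}
```

Sending from the same internal socket first to a STUN server and then to a
peer returns the same port twice in the Endpoint-Independent case and two
different ports in the symmetric case, which is why the address learned
from STUN is useless to the peer there.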

There are two solutions for this. First, we could make the client
smarter by having it switch transmission of the media to the source address
of the media packets it receives. This works because clients normally use a
well known trick called symmetric RTP, where they use one socket for both
transmitting and receiving RTP/media packets. We also use this
mechanism in the PJMEDIA media transport. But this solution only works
if the client behind a symmetric NAT is not communicating with another
client behind either a symmetric NAT or a port-restricted NAT.

The second solution is to use a media relay, but as mentioned above,
relaying is costly, both in terms of bandwidth cost for the service
provider and the additional latency introduced by relaying.

\subsection binding_timeout Binding timeout

When a NAT device creates a binding (a public-private IP address
mapping), it associates a timer with it. The timer is used to destroy the
binding once there has been no activity/traffic on the binding for some
time. Because of this, a NAT aware application that wishes to keep the
binding open must periodically send outbound packets, a mechanism known
as keep-alive. Otherwise it will eventually lose the binding and be
unable to receive incoming packets from the Internet.
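
The relation between the keep-alive interval and the NAT binding timer can
be sketched as below. The names (ka_state, ka_is_due, ka_binding_survives)
are hypothetical, time is counted in whole seconds, and the simulation
assumes that only outbound packets refresh the binding. Real binding
timeouts vary between devices and are not known to the application, so
this is an illustration rather than a recipe for choosing the interval.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical per-binding state kept by a NAT aware application.
typedef struct ka_state {
    unsigned interval;   // seconds between keep-alive packets
    unsigned last_tx;    // time of the last outbound packet, in seconds
} ka_state;

// Returns 1 when a keep-alive packet should be sent at time now.
int ka_is_due(const ka_state *ka, unsigned now)
{
    assert(ka != NULL);
    return now - ka->last_tx >= ka->interval;
}

// Simulates one binding, second by second, for duration seconds.
// nat_timeout is the idle time after which the NAT destroys the binding.
// Returns 1 if the binding is still alive at the end, 0 if it was lost.
int ka_binding_survives(unsigned ka_interval, unsigned nat_timeout,
                        unsigned duration)
{
    ka_state ka;
    unsigned now;

    ka.interval = ka_interval;
    ka.last_tx = 0;   // the packet that created the binding is sent at t = 0
    for (now = 1; now <= duration; ++now) {
        if (now - ka.last_tx > nat_timeout)
            return 0;            // idle too long: the NAT dropped the binding
        if (ka_is_due(&ka, now))
            ka.last_tx = now;    // the outbound keep-alive refreshes the binding
    }
    return 1;
}
```

In this model the binding survives only when the keep-alive interval does
not exceed the NAT timeout, for example ka_binding_survives(15, 30, 300)
succeeds while ka_binding_survives(45, 30, 300) does not.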

\section solutions The NAT traversal solutions

\subsection stun Old STUN (RFC 3489)

The original STUN (Simple Traversal of User Datagram Protocol (UDP)
Through Network Address Translators (NATs)) as defined by
<A HREF="http://www.ietf.org/rfc/rfc3489.txt">RFC 3489</A>
(published in 2003, but the work was started as early as 2001) was
meant to be a standalone, standard-based solution for the NAT
connectivity problems above. It is equipped with a NAT type detection
algorithm and methods to hole-punch the NAT in order to let traffic
get through. It has proven to be quite successful in traversing many types
of NATs, and has therefore gained a lot of popularity as a simple and
effective NAT traversal solution.

But since then the smart people at the IETF have realized that STUN alone
is not going to be enough. Besides the inherent limitation that STUN cannot
solve the symmetric-to-symmetric or symmetric-to-port-restricted case,
people have also discovered that NAT behavior can change for different
traffic (or for the same traffic over time). It was therefore concluded
that NAT type detection can produce unreliable results and should not be
relied on too much.

Because of this, STUN has since moved to a different strategy. Instead of
attempting to provide a standalone solution, it now provides a partial
solution and a framework on which other (STUN based) protocols, such as
TURN and ICE, can be built.

\subsection stunbis STUN/STUNbis (RFC 5389)

Session Traversal Utilities for NAT (STUN) is the further development
of the old STUN. While it still provides a mechanism for a client to
query its public/mapped address from a STUN server, it has deprecated
the use of NAT type detection, and it now serves as a framework on which
other protocols (such as TURN and ICE) are built.

\subsection midcom_turn Old TURN (draft-rosenberg-midcom-turn)

Traversal Using Relay NAT (TURN), a standard-based effort started as early
as November 2001, was meant to be the complementary method for the
(old) STUN to complete the solution. The original idea was for the host to
use STUN to detect the NAT type, and when it found that the NAT type was
symmetric, to use TURN to relay the traffic. But as stated above,
this approach was deemed unreliable, and the preferred way to use
TURN now (with a new TURN specification as well) is to combine it with ICE.

\subsection turn TURN (draft-ietf-behave-turn)

Traversal Using Relays around NAT (TURN) is the latest development of TURN.
While the protocol details have changed a lot, the objective is still
the same, that is to provide relaying control for the application.
As mentioned above, TURN should preferably be used with ICE: since relaying
is costly in terms of both bandwidth and latency, it should be used
as the last resort.

\subsection b2bua B2BUA approach

A SIP Back-to-Back User Agent (B2BUA) is a SIP entity that sits in the
middle of SIP traffic and acts as a SIP user agent on both call legs.
The primary motivations for having a B2BUA are to be able to provision
the call (e.g. billing, enforcing policy) and to help with NAT traversal
for the clients. Normally a B2BUA is equipped with media relaying,
otherwise it would not be very useful.

Products that fall into this category include SIP Session Border
Controllers (SBC); PBXs such as Asterisk are technically B2BUAs as well.

The benefit of a B2BUA with regard to NAT traversal is that it does not
require any modifications to the client to make it go through NATs.
And since it is basically a relay, it should be able to traverse
symmetric NAT successfully.

However, since it is a relay, the usual relaying drawbacks apply,
namely the bandwidth and latency issues. Moreover, since a B2BUA acts
as a user agent on both call legs (i.e. it terminates the SIP
signaling/call on one leg and creates another call on the other
leg), it may also introduce serious issues with end-to-end SIP signaling.

\subsection alg ALG approach

Nowadays many NAT devices (such as consumer ADSL routers) are equipped
with intelligence to inspect and fix VoIP traffic in an effort to help
it with NAT traversal. This feature is called Application Layer
Gateway (ALG) intelligence. The idea is that since the NAT device knows
about the mapping, it might as well try to fix the application traffic so
that the traffic can better traverse the NAT. The tricks performed include,
for example, replacing the private IP addresses/ports in the SIP/SDP packet
with the mapped public address/port of the host that sends the packet.

Despite many claims about its usefulness, in reality this has given us
more problems than fixes. Too many of these devices break the
SIP signaling and, in more advanced cases, ICE negotiation. Some
examples of bad situations that we have encountered in the past:
 - The NAT device alters the Via address/port fields in the SIP response
   message, making the response fail the SIP response verification
   defined by the SIP RFC.
 - In other cases, the modifications to the Via headers of the SIP
   response hide important information from the SIP server,
   namely the actual IP address/port of the client as seen by the SIP
   server.
 - Modifications to the Contact URI of the REGISTER request/response make
   the client unable to detect its registered binding.
 - Modifications to the IP addresses/ports in the SDP cause ICE
   negotiation to fail with ice-mismatch status.
 - The complexity of the ALG processing in itself seems to have caused
   the device to behave erratically in managing the address bindings
   (e.g. it creates a new binding for the second packet sent by the
   client, even when the previous packet was sent just a second ago, or
   it just sends inbound packets to the wrong host).

Many man-months of effort have been spent just troubleshooting issues
caused by these ALG (mal)functions, and as ALG adds complexity to
the problem rather than solving it, in general we do not like this
approach at all and would prefer it to go away.

\subsection upnp UPnP

Universal Plug and Play (UPnP) is a set of protocol specifications
for controlling network appliances, and one of its specifications covers
the control of NAT devices. With this protocol, a client can instruct the
NAT device to open a port on the NAT's public side and use this port
for its communication. UPnP has gained popularity due to its
simplicity, and one can expect it to be available on the majority of
NAT devices.

The drawback of UPnP is that, since it uses multicast in its communication,
it only allows the client to control a NAT device that is in the
same multicast domain. While this normally is not a problem in
household installations (where people normally have only one NAT
router), it will not work if the client is behind a cascaded router
installation. Moreover, UPnP has serious security issues due to
its lack of authentication, so it is probably not the preferred solution
for organizations.

\subsection other Other solutions

Other solutions to NAT traversal include:
 - SOCKS, which supports UDP since SOCKS5.

\section ice ICE Solution - The Protocol that Works Harder

A new protocol is being standardized (it's in Working Group Last Call/WGLC
stage at the time this article was written) by the IETF, called
Interactive Connectivity Establishment (ICE). ICE is the ultimate
weapon a client can have in its NAT traversal arsenal,
as it promises that if there is indeed a path for two clients
to communicate, then ICE will find this path. And if there is
more than one path over which the clients can communicate, ICE will
use the best/most efficient one.

ICE works by combining several protocols (such as STUN and TURN)
and offering several candidate paths for the communication,
thereby maximising the chance of success. At the same time it
has the capability to prioritize the candidates, so that the more
expensive alternative (namely relay) will only be used as the last
resort when all else fails. The ICE negotiation process involves several
stages:
 - candidate gathering, where the client finds out all the possible
   addresses that it can use for the communication. It may find
   three types of candidates: host candidates representing its
   physical NICs, server reflexive candidates for the addresses that
   have been resolved from STUN, and relay candidates for the addresses
   that the client has allocated from a TURN relay.
 - prioritizing these candidates. Typically the relay candidate will
   have the lowest priority since it's the most expensive.
 - encoding these candidates, sending them to the remote peer, and
   negotiating them with offer-answer.
 - pairing the candidates, where it pairs every local candidate
   with every remote candidate that it receives from the remote peer.
 - checking the connectivity of each candidate pair.
 - concluding the result. Since every possible path combination is
   checked, if there is a path to communicate ICE will find it.
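
The prioritizing and pairing stages above can be sketched as follows. The
formulas and the type preference values (126 for host, 100 for server
reflexive, 0 for relayed) are the ones recommended by RFC 5245, the
published version of ICE, which may differ from the draft referred to
in this article. All type and function names are hypothetical and are not
the PJNATH API. A real implementation also restricts pairing to candidates
of the same component and address family and prunes redundant pairs; this
sketch leaves those steps out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum ice_cand_type {
    ICE_CAND_HOST,
    ICE_CAND_SRFLX,     // server reflexive, learned from STUN
    ICE_CAND_RELAYED    // allocated on a TURN relay
} ice_cand_type;

// Type preference values recommended by RFC 5245.
static uint32_t ice_type_pref(ice_cand_type type)
{
    switch (type) {
    case ICE_CAND_HOST:  return 126;
    case ICE_CAND_SRFLX: return 100;
    default:             return 0;
    }
}

// priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)
uint32_t ice_cand_priority(ice_cand_type type, uint16_t local_pref,
                           unsigned comp_id)
{
    assert(comp_id >= 1 && comp_id <= 256);
    return (ice_type_pref(type) << 24) + ((uint32_t)local_pref << 8) +
           (uint32_t)(256 - comp_id);
}

// g is the priority of the controlling agent's candidate, d the priority
// of the controlled agent's candidate:
// pair priority = 2^32 * MIN(g,d) + 2 * MAX(g,d) + (g > d ? 1 : 0)
uint64_t ice_pair_priority(uint32_t g, uint32_t d)
{
    uint64_t lo = g < d ? g : d;
    uint64_t hi = g > d ? g : d;
    return (lo << 32) + 2 * hi + (g > d ? 1 : 0);
}

typedef struct ice_pair {
    size_t   local_idx;    // index into the local candidate array
    size_t   remote_idx;   // index into the remote candidate array
    uint64_t priority;
} ice_pair;

static int ice_pair_cmp_desc(const void *a, const void *b)
{
    const ice_pair *pa = (const ice_pair *)a;
    const ice_pair *pb = (const ice_pair *)b;
    if (pa->priority == pb->priority)
        return 0;
    return pa->priority < pb->priority ? 1 : -1;
}

// Pairs every local candidate priority with every remote one and sorts
// the pairs by descending pair priority. Returns the number of pairs, or
// 0 if out cannot hold them all.
size_t ice_form_pairs(const uint32_t *local, size_t n_local,
                      const uint32_t *remote, size_t n_remote,
                      int local_is_controlling,
                      ice_pair *out, size_t max_pairs)
{
    size_t i, j, n = 0;

    if (n_local * n_remote > max_pairs)
        return 0;
    for (i = 0; i < n_local; ++i) {
        for (j = 0; j < n_remote; ++j) {
            uint32_t g = local_is_controlling ? local[i] : remote[j];
            uint32_t d = local_is_controlling ? remote[j] : local[i];
            out[n].local_idx = i;
            out[n].remote_idx = j;
            out[n].priority = ice_pair_priority(g, d);
            ++n;
        }
    }
    qsort(out, n, sizeof(out[0]), ice_pair_cmp_desc);
    return n;
}
```

Connectivity checks would then be performed on the pairs in the sorted
order, so that host-to-host paths are tried first and relayed paths last.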

There are many benefits of ICE:
 - it's standard based.
 - it works where STUN works (and more).
 - unlike the standalone STUN solution, it solves the hairpinning issue,
   since it also offers host candidates.
 - just like relaying solutions, it works with symmetric NATs. But unlike
   plain relaying, the relay is only used as the last resort, thereby
   minimizing the bandwidth and latency issues of relaying.
 - it offers a generic framework for offering and checking address
   candidates. While the ICE core standard only talks about using STUN
   and TURN, implementors can add more types of candidates to the ICE
   offer, for example UDP over TCP or HTTP relays, or even UPnP
   candidates, and this can be done transparently to the remote
   peer, hence it's compatible and usable even when the remote peer
   does not support these.
 - it also adds some security, particularly against DoS attacks,
   since a media address must be acknowledged before it can be used.

Having said that, ICE is a complex protocol to implement, making
interoperability an issue, and at the time of this writing we don't see
many implementations of it yet. Fortunately, PJNATH has been one of
the first, and hence one of the more mature, ICE implementations, being
first released in mid-2007, and we have been testing our implementation at
<A HREF="http://www.sipit.net">SIP Interoperability Test (SIPit)</A>
events regularly, so hopefully it is one of the most stable as well.

\section pjnath PJNATH - The building blocks for effective NAT traversal solution

PJSIP NAT Helper (PJNATH) is a library which contains implementations
of standard based NAT traversal solutions. PJNATH can be used as a
stand-alone library for your software, or you may use the PJSUA-LIB library,
a very high level library integrating PJSIP, PJMEDIA, and PJNATH into
simple to use APIs.

PJNATH has the following features:
 - STUNbis implementation, providing both a ready to use STUN-aware socket
   and a framework to implement higher level STUN based protocols such as
   TURN and ICE.
 - NAT type detection, useful for troubleshooting purposes.
 - TURN implementation.
 - ICE implementation.

More protocols will be implemented in the future.

Go back to \ref index.

*/