This was solved by using STUN. With this mechanism, a client first discovers its public IP address and port by querying a STUN server, then sends this public address, instead of its private address, to the other client. When both sides do this, each can send media packets to the other's public address. The outgoing packets create a mapping in each NAT (also called opening a "hole", which is why the technique is popularly known as "hole punching"), after which both clients can communicate with each other.
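
The exchange described above can be sketched in Python as follows. This is a minimal illustration rather than a complete STUN client: it assumes IPv4 only, reads just the XOR-MAPPED-ADDRESS attribute of a Binding success response (as laid out in RFC 5389), and skips retransmission and authentication. The function names, the `b"punch"` payload, and the `stun_server` argument are hypothetical and chosen only for this example.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id


def parse_binding_response(data, transaction_id):
    """Extract (public_ip, public_port) from a Binding success response."""
    if len(data) < 20:
        raise ValueError("response shorter than a STUN header")
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    if data[8:20] != transaction_id:
        raise ValueError("transaction ID mismatch")

    pos, end = 20, min(20 + length, len(data))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # Family 0x01 means IPv4; this sketch ignores IPv6.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attribute values are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


def discover_public_address(sock, stun_server, timeout=2.0):
    """Ask stun_server (a (host, port) tuple, assumed) for our public address."""
    transaction_id = os.urandom(12)
    sock.settimeout(timeout)
    sock.sendto(build_binding_request(transaction_id), stun_server)
    data, _ = sock.recvfrom(2048)
    return parse_binding_response(data, transaction_id)


def punch_hole(sock, peer_public_addr, attempts=5):
    """Send a few datagrams to the peer so the local NAT creates a mapping."""
    for _ in range(attempts):
        sock.sendto(b"punch", peer_public_addr)
```

In this sketch the same UDP socket would be used for the STUN query, for `punch_hole`, and for the media that follows, so that the address learned from the STUN server corresponds to the mapping the peer will send to. How the two clients exchange their public addresses (for example, through a signalling channel) is outside the scope of the example.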