Lec 20: Case study: Tor

Past few lectures: security in the context of storage systems
Today: case study of security in a communication system
  - general idea: mixnet, which most anonymous communication tools are based on
  - Tor: a specific implementation of these ideas

Goal: anonymous communication
  e.g., hide the IP address a user is browsing from
  why?
    protect user from being identified by a web service
      to maintain privacy from a private entity
    protect user from oppressive government, employer, etc.
    intellectually challenging problem

General method: mixnet
  if Alice wants to send a message to Bob without anyone (including Bob)
  knowing who Alice is, she can send the message through a chain of nodes,
  each of which knows only the next node to send the data to.
  large network of mixes
  sender:
    choose a random subset of all mixes
    create message
      put it in envelope 0 and put recipient's address on it
      put envelope 0 in envelope 1 with the address of random mix 1
      put envelope 1 in envelope 2 with the address of random mix 2
      etc.
    send message:
      send message to mix n
      mix n opens envelope n and forwards envelope n-1 to mix n-1
      etc.
  potential additional measures:
    ensure that a mix cannot open or modify envelopes other than its own
      sender and mix i set up keys to encrypt and authenticate envelope i
      (and its content)
    thwart traffic analysis by a global observer, who gets to see all messages
      send all data in fixed-size messages (pad the message): that way an
      observer can't track a message around the network by predicting what
      size it will be once a node removes the outer envelope and sends it on.
      delay and reorder messages: don't put a FIFO queue at each node, or
      else an observer would know which outgoing packet corresponds to
      which incoming packet
      generate cover traffic: nodes generate random traffic so that it's
      harder for a global observer to track any one message
  recipient or adversary:
    knows only that the message came from the last mix (mix 1), but
    (hopefully) cannot find the path through the mixnet and identify the
    IP address of the sender.
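The envelope construction above can be sketched in a few lines of Python.
This is a minimal illustration, not a real mixnet: the names wrap, unwrap,
and deliver are hypothetical, each mix is assumed to already share a
symmetric key with the sender, and the "cipher" is a toy XOR keystream
that provides no real security, no authentication, and no padding.

```python
import hashlib
import json

def keystream(key, n):
    """Toy keystream (SHA-256 in counter mode). Illustrative only, NOT secure."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_crypt(key, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message, recipient, path):
    """Build nested envelopes (sender side).

    path is [(name, key), ...] ordered mix 1, mix 2, ..., mix n, so the
    last entry is the first mix the sender contacts.
    Returns (name of mix n, envelope n).
    """
    next_hop, payload = recipient, message  # envelope 0: message + recipient
    for name, key in path:
        inner = json.dumps({"next": next_hop, "body": payload.hex()}).encode()
        payload = xor_crypt(key, inner)     # envelope i, readable only by mix i
        next_hop = name
    return next_hop, payload

def unwrap(key, payload):
    """What one mix does: open its own envelope and learn only the next hop."""
    inner = json.loads(xor_crypt(key, payload).decode())
    return inner["next"], bytes.fromhex(inner["body"])

def deliver(first_hop, payload, keys):
    """Simulate forwarding until the next hop is not a mix (the recipient)."""
    hop, seen = first_hop, []
    while hop in keys:
        seen.append(hop)
        hop, payload = unwrap(keys[hop], payload)
    return hop, payload, seen
```

Calling wrap(b"hello Bob", "bob", path) with a three-mix path returns the
name of mix 3 and the outermost envelope; deliver then peels one layer per
mix. Each mix sees only the name of the next hop, which is the property the
mixnet relies on.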
  attacks
    what if adversary compromises/provides some mixes?
      - for each node they own, all they can tell is the previous and next
        nodes.
    what if adversary can change content of some messages?
    what if adversary can modify message?
      - can't do so undetected: messages are encrypted and authenticated.
        the worst they can do is make the message checksum fail, send the
        message in the wrong direction, or simply drop messages
    what if adversary can observe some traffic in network?
      - unless they own all nodes on the path, they can't follow the packet
        around
      - unless you encrypt the message in an upper-level protocol, the exit
        node can read your traffic to the destination. So use https, or
        something like it, to ensure exit nodes can't spy on you.
    what if adversary can observe all incoming and outgoing traffic?
    what if adversary can generate traffic?
      - they might send traffic in the reverse direction to try to get a
        response from Alice and see what she replies
    etc.
    weakness: need a central location to store all keys for mix nodes. So
    if an adversary can get control of that file, they can control the
    system.

Tor: The Second-Generation Onion Router
  Dingledine, Mathewson, and Syverson
  goals:
    low-latency anonymous TCP connections
      - delaying packets (as a mixnet does) is hard to reconcile with low
        latency
      - an observer who can time even one end of the channel can profile
        packets as they are sent/received
    deployability
    usability
      - together with deployability, helps increase anonymity: more users
        make it harder to profile any one user's traffic
    flexibility
    simple threat model; no global adversary!
      no defense against end-to-end confirmation attacks
      no defense against intersection attacks
      possibly reasonable: if the network is truly global, no one
      government or organization can view all packets
    traffic analysis attacks by observing some fraction of network traffic
      make it difficult for attacker to learn which nodes to attack
        e.g., for a confirmation attack, where start and end nodes are
        controlled by collaborators
    no insertion attacks: can't observe what's connected in the system
    modify, delete, delay traffic
    adversary can operate a few mixes (not all of them)
    adversary can compromise a few mixes at a time
      - nodes must change encryption keys pretty frequently

  To achieve usability, make it easy to install Tor and use it as a
  standard web proxy that you can configure your system to use.
  No cover traffic: if routers generated random traffic, you'd waste their
  bandwidth
  Sysadmin can restrict which ports/IPs to support, so that an OR can
  refuse to act as an email proxy for spammers, etc.

  basic plan:
    network of onion routers (ORs)
      connected in a full mesh (all routers talk to all other routers)
      with TLS-encrypted links (TLS is the successor to SSL)
      ORs know the public keys of all other ORs. This is maintained in a
      directory service with a list of all ORs and their keys.
      Can an OR trust a public key? Multiple directory services are
      available; directory services can be rated, and gain some extra
      trust through back channels.
      all traffic is sent in fixed-size cells (512 bytes)
    user configures browser to use an onion proxy (OP) as Web proxy.
      OP speaks SOCKS
    OP sets up and tears down a circuit through a path of ORs
      OP knows public keys of all ORs
      OP incrementally builds up circuit
        sets up a shared key with each OR: uses symmetric-key encryption
        with each OR along the path, to ensure that routing tables at each
        OR don't contain identifying information for each session
      Frequently rebuild circuit. why? to make sure a circuit isn't
      identified
    user enters URL
      OP receives URL from browser
        danger: browser resolved DNS name!
        avoid problems: DNS must run through Tor or use an anonymizing
        service.
    OP sets up stream to an exit OR on a circuit
      stream control cells are relayed like data cells
    OP relays cells along circuit to chosen exit OR
      OP encrypts messages recursively. why? So no OR can see/change what
      other ORs will read.
      messages have a digest. why? To make sure nothing was modified along
      the way.
      only exit OR checks digest. why? speed
    exit OR connects to Web server and tells OP when successful
    OP accepts data from browser and relays it through the stream
    Web server knows IP address of exit OR
      sends response to exit OR, which relays it back through the stream
    multiple TCP connections are multiplexed over a single circuit, for
    low latency
    several users may share an OR

  features:
    forward secrecy. why? if you change keys frequently, then an adversary
    who steals/breaks them can't keep monitoring traffic.
    no mixing or traffic shaping, but they do padding, fixed-size packets.
    why no mixing/traffic shaping? would increase latency
    leaky-pipe topology. why?
    congestion control. why? so that no one can flood other nodes to make
    them more overloaded than their own
    trusted directory servers. why? this is self-policing
    variable exit policies. why?
    rendezvous points and hidden services. why? so a webserver can provide
    a non-DNS name and advertise itself to ORs, by which a client can find
    it.

  Attacks
    - adversary controls mixnet? can only control a few
    - adversary controls webserver? bad: can send malicious javascript or
      cookies to track you. This is bad for usability, since the only
      solution is to turn off these features
    - adversary modifies messages? hard due to encryption/digest
    - adversary OR generates traffic? the adversary can generate traffic
      to slow down another OR, while sending packets at a known rate
      through the victim's circuit, and see whether those packets are
      delayed. If they are, the slowed OR is on the path. The adversary
      can keep following slow links until the whole circuit is mapped out.
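The recursive encryption and digest check from the basic plan can be
sketched as follows. This is a simplified illustration under loud
assumptions: the payload layout (6-byte digest, 2-byte length, padded data
in a 509-byte payload) is a stand-in and not the exact Tor cell format, the
separate digest_key is hypothetical, and the XOR keystream is a toy in
place of a real stream cipher. Function names are invented for this
example.

```python
import hashlib
import hmac

CELL_PAYLOAD = 509   # assumed: 512-byte cell minus a 3-byte header
DIGEST_LEN = 6       # assumed digest size for this sketch
LEN_FIELD = 2
MAX_DATA = CELL_PAYLOAD - DIGEST_LEN - LEN_FIELD

def keystream(key, n):
    """Toy keystream (SHA-256 in counter mode). Illustrative only, NOT secure."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def peel_layer(key, payload):
    """XOR one layer on or off; what a single OR does with its circuit key."""
    return bytes(a ^ b for a, b in zip(payload, keystream(key, len(payload))))

def make_payload(data, digest_key):
    """Pad data to a fixed size and prepend a truncated digest."""
    if len(data) > MAX_DATA:
        raise ValueError("data does not fit in one cell")
    body = len(data).to_bytes(LEN_FIELD, "big") + data.ljust(MAX_DATA, b"\x00")
    digest = hashlib.sha256(digest_key + body).digest()[:DIGEST_LEN]
    return digest + body

def onion_encrypt(payload, hop_keys):
    """OP side. hop_keys is ordered entry OR first, exit OR last.

    The exit's layer is applied first, so the entry OR's layer is outermost.
    """
    for key in reversed(hop_keys):
        payload = peel_layer(key, payload)
    return payload

def exit_check(payload, digest_key):
    """Exit OR side: verify the digest, return the data or None if it fails."""
    digest, body = payload[:DIGEST_LEN], payload[DIGEST_LEN:]
    expected = hashlib.sha256(digest_key + body).digest()[:DIGEST_LEN]
    if not hmac.compare_digest(digest, expected):
        return None
    n = int.from_bytes(body[:LEN_FIELD], "big")
    return body[LEN_FIELD:LEN_FIELD + n]
```

Every cell has the same length regardless of how much data it carries, and
an intermediate OR that flips bits cannot produce a payload that passes the
check at the exit without knowing the digest key.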

References
  http://en.wikipedia.org/wiki/Tor_(anonymity_network)
  http://www.cl.cam.ac.uk/users/sjm217/papers/oakland05torta.pdf