Profiling BitTorrent

INTRODUCTION

BitTorrent is a protocol used to distribute large amounts of data over the Internet using peer-to-peer file sharing. It has emerged as one of the most popular methods for transferring large files over the Internet.

A BRIEF NOTE ON PROTOCOLS AND P2P

In the realm of communications, the term protocol refers to a system of digital rules for data exchange within or between computers. Protocols are to communications as programming languages are to computations.

Protocols exist at several levels in a telecommunication connection. For example, there are protocols for the data interchange at the hardware device level and protocols for data interchange at the application program level. In the standard model known as Open Systems Interconnection (OSI), there are one or more protocols at each layer in the telecommunication exchange that both ends of the exchange must recognize and observe. Protocols are often described in an industry or international standard.

The TCP/IP Internet protocols, a common example, consist of:
• Transmission Control Protocol (TCP), which uses a set of rules to exchange messages with other Internet points at the information packet level
• Internet Protocol (IP), which uses a set of rules to send and receive messages at the Internet address level
• Additional protocols that include the Hypertext Transfer Protocol (HTTP) and File Transfer Protocol (FTP), each with defined sets of rules to use with corresponding programs elsewhere on the Internet

So, for instance, when a user accesses a website, the browser uses HTTP to transfer content from the server. Files can also be downloaded from a file server using FTP, although downloads offered through websites usually go over HTTP as well.
Conventionally, the client-server model running over TCP/IP has been the most popular way to exchange data. However, more recently, especially since Napster, the Peer-to-Peer (P2P) model is increasingly being used.

P2P NETWORKING

The traditional protocols of data exchange are based on a client-server model: a single server hosts the data and provides all the upload bandwidth, and clients simply access this server and download the data. This approach is both expensive and relatively slow: expensive because the server must be maintained at all times, and slow because of the bandwidth limitations of a single system.

P2P or peer to peer is a decentralized communications model in which each party has the same capabilities and either party can initiate a communication session. Unlike the client/server model, in which the client makes a service request and the server fulfills the request, the P2P network model allows each node to function as both a client and server.

P2P systems can be used to provide anonymized routing of network traffic, massive parallel computing environments, distributed storage and other functions. Most P2P programs are focused on media sharing and P2P is therefore often associated with software piracy and copyright violation.

p2p-networks.jpg

Here is a link to an interactive video which explains how P2P and BitTorrent work: https://www.youtube.com/watch?v=6PWUCFmOQwQ

SOME TYPICAL BITTORRENT TERMS

Client: The BitTorrent client is the application you use to “load” the .torrent file so that you can connect to other people. There are a lot of different torrent clients available; popular ones include BitComet, Azureus and uTorrent.

Indexer: Indexers are websites which list (index) .torrent files (myBitTorrent, Torrentz, Mininova etc.).

Leecher: A leecher is someone who is downloading (and uploading) a file. You are a leecher if you do not have a complete copy of the file you’re trying to get. Note that outside BitTorrent, “leecher” normally means someone who does not upload; that is not the case in BitTorrent jargon.

Peer: A peer is the same as a leecher, but without the negative connotation.

Ratio: The data you uploaded divided by the data you downloaded. A ratio higher than 1.00 means that you upload more than you download, which is a good thing. Most private trackers keep track of your ratio and will ban or block you if you have a bad ratio. Try to get at least a 1.00 or higher ratio.

Scrape: Scraping means that your BitTorrent client is requesting info from the “tracker” about other people who are down- or uploading the file. This is important because you need to know who has pieces of the file you still need.

Seeder: A seeder is someone who has a complete version of the file you are downloading. If there are no seeders, you probably won’t be able to get the file. Seeders are therefore extremely important, so make sure to “seed” the torrent once you have finished downloading.

Tracker: The tracker is a server that has all the info about the people who are down- and uploading the file. The tracker itself does not have a copy of the file; it only tracks the up- and downloaders and makes sure people are able to connect to each other. A tracker is not the same as a website that hosts torrents. Mininova, for example, is not a tracker, just a “torrent-site”.

Super-Seed: Some clients have the option to “super-seed”. Super seeding is different from seeding because it tries to send out pieces of the file that have not been sent before. So instead of sending the same piece to several peers, it tries to send a unique piece to everyone so that other peers can swap those pieces.

Swarm: The swarm is the set of all seeds and peers that are connected together. So if your client shows 5 seeds and 10 peers, then that’s your swarm.

DHT: DHT stands for “Distributed Hash Table”. DHT layers “decentralize” torrents, which makes them more stable and less reliant on web-based trackers. If a web-based tracker goes down, the torrents stay alive because peers can act as “nodes”, keeping the swarm intact.

HOW DOES IT WORK?

Rather than downloading a file from a single source server, the BitTorrent protocol allows users to join a "swarm" of hosts to download and upload from each other simultaneously. Using the BitTorrent protocol, several basic computers, such as home computers, can replace large servers while efficiently distributing files to many recipients. Because the load is spread across many hosts instead of being concentrated on one server, this also helps prevent large spikes in internet traffic in a given area, keeping internet speeds higher for all users in general, regardless of whether or not they use the BitTorrent protocol.

A user who wants to upload a file first creates a small torrent descriptor file that they distribute by conventional means (web, email, etc.). They then make the file itself available through a BitTorrent node acting as a seed. Those with the torrent descriptor file can give it to their own BitTorrent nodes, which, acting as peers or leechers, download it by connecting to the seed and/or other peers.

The file being distributed is divided into segments called pieces. As each peer receives a new piece of the file it becomes a source (of that piece) for other peers, relieving the original seed from having to send that piece to every computer or user wishing a copy. With BitTorrent, the task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself and eventually distribute to an unlimited number of peers.

Each piece is protected by a cryptographic hash contained in the torrent descriptor. This ensures that any modification of the piece can be reliably detected, and thus prevents both accidental and malicious modifications of any of the pieces received at other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the entire file it receives.

Pieces are typically downloaded non-sequentially and are rearranged into the correct order by the BitTorrent Client, which monitors which pieces it needs, and which pieces it has and can upload to other peers. Pieces are of the same size throughout a single download (for example a 10 MB file may be transmitted as ten 1 MB pieces or as forty 256 KB pieces). Due to the nature of this approach, the download of any file can be halted at any time and be resumed at a later date, without the loss of previously downloaded information, which in turn makes BitTorrent particularly useful in the transfer of larger files. This also enables the client to seek out readily available pieces and download them immediately, rather than halting the download and waiting for the next (and possibly unavailable) piece in line, which typically reduces the overall time of the download.
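The piece and hash mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are our own, an in-memory buffer stands in for the file, and SHA-1 is used because that is the hash in the original BitTorrent specification. A real client reads the expected hashes from the torrent descriptor instead of computing them locally.

```python
import hashlib


def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """Cut data into fixed-size pieces; only the last piece may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list[bytes]) -> list[bytes]:
    """Compute one SHA-1 digest per piece, as a torrent descriptor would list them."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def verify_piece(index: int, piece: bytes, expected: list[bytes]) -> bool:
    """Check a received piece against the hash recorded for its index."""
    return hashlib.sha1(piece).digest() == expected[index]


# Hypothetical 1 MiB "file" split into 256 KiB pieces.
data = bytes(i % 251 for i in range(1024 * 1024))
pieces = split_into_pieces(data, 256 * 1024)
hashes = piece_hashes(pieces)

print(len(pieces))                                   # 4
print(verify_piece(2, pieces[2], hashes))            # True
print(verify_piece(2, b"x" + pieces[2][1:], hashes))  # False: piece was modified
```

Because every piece is checked independently, pieces can arrive in any order and from any peer, and a corrupted piece only needs to be fetched again on its own.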

a.jpg.gif

When a peer completely downloads a file, it becomes an additional seed. This eventual shift from peers to seeders determines the overall "health" of the file (as determined by the number of times a file is available in its complete form).

The distributed nature of BitTorrent can lead to a flood-like spreading of a file throughout many peer computer nodes. As more peers join the swarm, the likelihood of a completely successful download by any particular node increases. Relative to traditional Internet distribution schemes, this permits a significant reduction in the original distributor's hardware and bandwidth resource costs.

So if there is a popular file which everyone is trying to download, the server won't get overloaded in this type of system. On the contrary, the system works more efficiently when there are more peers, since each peer also acts as a source for the others.

TIT FOR TAT SYSTEM

In a peer-to-peer network, one can run into the free-rider problem. This is when a user consumes resources without contributing back to the community. In the case of P2P, such a user only leeches (downloads) files without seeding (uploading) any of his own or downloaded files. Without any incentive to seed or any penalty for purely leeching, there is no reason for the user not to free-ride, which is detrimental to the whole community.

After a user has downloaded from its peers, it is selective about whom it will upload to (unchoke). The user usually unchokes whoever has uploaded to it. A peer makes a new decision every 10 seconds and chooses which neighbors to unchoke (the k-1 peers from which it got the highest download rate; the default is 4 peers) based on the activity of the last 20 seconds. The peer then splits its upload bandwidth into equal parts among these neighbors (equal-split rate). It waits 10 seconds between decisions to give TCP enough time to let transfers reach their full capacity. This is how BitTorrent uses a strategy similar to tit-for-tat. If client A does not upload to client B, then client B will not unchoke client A, and thus client A will not be able to download from client B. On the flip side, if client A does upload to client B at a good speed, client B will unchoke A and A will get better download speeds.
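One round of the unchoke decision described above can be sketched as follows. This is a simplified illustration, not the code of any real client: the function names, the peer labels and the rate values are hypothetical, and the download rates are assumed to have been measured over the last 20 seconds already.

```python
def choose_unchoked(download_rates: dict[str, float], k: int = 5) -> list[str]:
    """Return the k-1 peers that gave us the highest download rate.

    With the assumed default k = 5, this yields the 4 peers mentioned above.
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    return ranked[:k - 1]


def equal_split(upload_kbps: float, unchoked: list[str]) -> dict[str, float]:
    """Divide the upload bandwidth equally among the unchoked peers."""
    if not unchoked:
        return {}
    share = upload_kbps / len(unchoked)
    return {peer: share for peer in unchoked}


# Hypothetical download rates (kB/s) observed from each neighbor.
rates = {"A": 120.0, "B": 0.0, "C": 45.0, "D": 80.0, "E": 10.0, "F": 60.0}

unchoked = choose_unchoked(rates)
print(unchoked)                     # ['A', 'D', 'F', 'C']
print(equal_split(40.0, unchoked))  # each unchoked peer gets 10.0 kB/s
```

Peer B, which uploaded nothing, is never selected. A client would repeat this selection every 10 seconds with fresh measurements.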

SOME METHODS TO INCREASE TORRENT EFFICIENCY

We arrived at all of these parameters by trying different settings in BitTorrent clients and by following suggestions found on various websites. The appropriate settings for a given computer may vary slightly depending on Internet speed and computer specifications.

INCREASING NUMBER OF CONNECTIONS

Some operating systems (Windows XP SP2, for example) limit half-open TCP connections to a maximum of 10. This might hurt your download speed because it won't let you connect to as many peers as you want. The limit is supposed to slow down viruses, because their spreading strategy is to connect to a large number of IP addresses, but it can also cripple your torrent downloads. You need to configure your torrent client to allow a maximum of 50-100 half-open TCP connections.

uTorrent: Options > Preferences > Advanced options > net.max_halfopen

Set it to around 100 as shown in the image below.

Screenshot (4).png

MAXIMUM UPLOAD SPEED

Your connection is (sort of) like a pipeline: if you use your maximum upload speed, there’s not enough space left for the files you are downloading. So you have to cap your upload speed.

Use the following formula to determine your optimal upload speed…

80% of your maximum upload speed

So if your maximum upload speed is 40 kB/s, the optimal upload rate is 32 kB/s.

But keep seeding!

Screenshot (6).png

MAXIMUM DOWNLOAD SPEED

Although setting your maximum download speed to unlimited may sound interesting, in reality it will only hurt your connection. If you still want to be able to browse properly, set your maximum download speed to:

95% of your maximum download speed

So if your maximum download speed is 400 kB/s, the optimal download speed is 380 kB/s.
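The two speed caps (80% of the maximum upload speed and 95% of the maximum download speed) are simple arithmetic. A minimal Python sketch follows, with function names that are our own and not part of any client. The percentages are applied as integer ratios so that the worked examples come out exact.

```python
def upload_cap(max_upload_kbps: float) -> float:
    """Suggested upload limit: 80% of the measured maximum upload speed."""
    return max_upload_kbps * 80 / 100


def download_cap(max_download_kbps: float) -> float:
    """Suggested download limit: 95% of the measured maximum download speed."""
    return max_download_kbps * 95 / 100


print(upload_cap(40))     # 32.0 kB/s
print(download_cap(400))  # 380.0 kB/s
```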

Screenshot (7).png

MAXIMUM CONNECTED PEERS PER TORRENT

We experimented quite a lot with the maximum connected peers setting and came to the conclusion that both very high and very low numbers hurt the download speed of a torrent. The following setting worked best for us.

Upload speed * 1.4

So if your maximum upload speed is 40 kB/s, the optimal number of connected peers per torrent is

40 * 1.4 = 56

Screenshot (8).png

MAXIMUM UPLOAD SLOTS

Maximum upload slots = 1 + (upload speed / 6)

So if your maximum upload speed is 30 kB/s, the optimal number of upload slots is

1 + (30 / 6) = 6
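The formulas for connected peers and upload slots can be combined into one small helper. This is a sketch under our own assumptions: the function names are illustrative, the peer count is rounded to the nearest whole number, and the slot count is rounded down when the upload speed is not a multiple of 6 (the text above does not say how to round).

```python
def max_peers_per_torrent(upload_kbps: float) -> int:
    """Upload speed * 1.4, rounded to the nearest whole peer (assumed rounding)."""
    return round(upload_kbps * 14 / 10)


def max_upload_slots(upload_kbps: float) -> int:
    """1 + (upload speed / 6), rounded down (assumed rounding)."""
    return 1 + int(upload_kbps // 6)


print(max_peers_per_torrent(40))  # 56
print(max_upload_slots(30))       # 6
```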

Screenshot (9).png

CHECK SEEDS AND PEERS

Always look for torrents with the best seed/peer ratio. The more seeds (compared to peers) the better (in general). So 50 seeds and 50 peers is better than 500 seeds and 1000 peers.
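The comparison above comes down to dividing seeds by peers. A small Python sketch, where the function name and the torrent labels are hypothetical:

```python
def seed_peer_ratio(seeds: int, peers: int) -> float:
    """Seeds divided by peers; a torrent with no peers gets an infinite ratio."""
    if peers == 0:
        return float("inf")
    return seeds / peers


# The two swarms from the example above, as (seeds, peers).
torrents = {"small swarm": (50, 50), "large swarm": (500, 1000)}

best = max(torrents, key=lambda name: seed_peer_ratio(*torrents[name]))
print(seed_peer_ratio(50, 50))     # 1.0
print(seed_peer_ratio(500, 1000))  # 0.5
print(best)                        # small swarm
```

As the text says, this is only a general guideline; the ratio ignores how fast the individual seeds actually are.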

Screenshot (10).png

CHANGE THE DEFAULT PORT

By default, BitTorrent uses ports in the range 6881-6999. BitTorrent generates a lot of traffic, so ISPs like to limit the bandwidth offered on these ports. You should therefore change the port to one outside this range. Good clients allow you to do this; just choose any port you like. If you’re behind a router, make sure you have your ports forwarded or UPnP enabled.
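If you would rather not pick a number by hand, a port outside the default range can be chosen at random. The sketch below is only an illustration: the function name is our own, and drawing from 49152-65535 (the IANA dynamic port range) is our assumption, not a requirement of BitTorrent. The chosen port still has to be entered in your client and forwarded on your router.

```python
import random

# The default BitTorrent range from the text above: 6881-6999 inclusive.
DEFAULT_PORTS = range(6881, 7000)


def pick_listen_port(low: int = 49152, high: int = 65535) -> int:
    """Pick a random port in [low, high] that is not a default BitTorrent port."""
    while True:
        port = random.randint(low, high)
        if port not in DEFAULT_PORTS:
            return port


print(pick_listen_port())
```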

Screenshot (11).png

TURN ON ENCRYPTION

Encrypting your BitTorrent traffic makes it harder for ISPs that throttle to identify and limit it.

CONTRIBUTORS

ANANT JAIN
DHRUV PARGAI

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License