
BitTorrent For HTTP Failover

There is a proposal circulating around the web to create an X-Torrent HTTP header that points to a torrent file as an alternative way to download a file from an overloaded server. I’ve been an advocate of implementing BitTorrent in browsers, Firefox in particular, since at least 2004 according to this blog. I like the idea in principle but don’t like the proposed implementation.

The way the proposal would work is that a server would send the X-Torrent HTTP header, and if the browser chose to use BitTorrent it would do that rather than leech the server’s bandwidth. This, however, fails if the server is already overloaded.

Unnecessary Header

This is also a little unnecessary, since browsers automatically send an Accept-Encoding request header which could advertise support for torrents, removing the need for a server to send this by default. Regardless, the system still fails if the server is overloaded.

Doesn’t Failover

A nicer way would be to also utilize DNS which is surprisingly good at scaling for these types of tasks. It’s already used for similar things like DNSBL and SPF.

Example

Assume my browser supports the BitTorrent protocol and I visit the following URL for a download:

http://dl.robert.accettura.com/pub/myfile.tar.gz

My request would look something like this:

GET /pub/myfile.tar.gz HTTP/1.1
Host: dl.robert.accettura.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate,torrent
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://robert.accettura.com/download

The server’s response would look something like this:

Date: Sun, 18 Jan 2009 00:25:54 GMT
Server: Apache
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: application/x-bittorrent

The content would be the actual torrent. The browser would handle it as appropriate, either by opening a helper application or by handling it internally. If I didn’t have torrent in my Accept-Encoding header, I would have been served via HTTP as we are all accustomed to.
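The server-side decision described above can be sketched roughly as follows. This is only an illustration: the "torrent" token is the hypothetical value proposed in this post (it is not a registered content coding), and the function names accepts_torrent and choose_response are made up for the example.

```python
def accepts_torrent(accept_encoding: str) -> bool:
    """Return True if the header value lists the hypothetical 'torrent' token."""
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "torrent":
            continue
        # A quality value of 0 means the client explicitly refuses the coding.
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        return q > 0
    return False


def choose_response(accept_encoding: str) -> str:
    """Pick what to send: the .torrent, or the plain file over HTTP."""
    return "torrent" if accepts_torrent(accept_encoding) else "file"


print(choose_response("gzip,deflate,torrent"))
print(choose_response("gzip,deflate"))
```

A client that omits the token, or sends it with q=0, simply gets the ordinary HTTP download.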

Now what happens if the server is not responding? A fallback to the DNS level could be done.

First, take the GET and generate a SHA1 checksum for it. In my example that would be:

438296e855494825557824b691a09d06a86a21f1

Now generate a DNS query in the format [hash]._torrent.[server]:

438296e855494825557824b691a09d06a86a21f1._torrent.dl.robert.accettura.com
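As a rough sketch, building that query name could look like the following. The post does not pin down exactly which bytes of the GET are hashed, so this example assumes the request path encoded as UTF-8; with a different input the digest will differ from the one shown above. The function name fallback_dns_name is hypothetical.

```python
import hashlib


def fallback_dns_name(request_target: str, host: str) -> str:
    """Build [hash]._torrent.[server] for a request.

    Assumption: the hash input is the request path. The original proposal
    only says "take the GET", so the exact input is a guess.
    """
    digest = hashlib.sha1(request_target.encode("utf-8")).hexdigest()
    return f"{digest}._torrent.{host}"


print(fallback_dns_name("/pub/myfile.tar.gz", "dl.robert.accettura.com"))
```

A SHA1 hex digest is 40 characters, which fits within the 63-byte limit for a single DNS label.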

The response would look something like a Base64 encoded .torrent file served as TOR (a new record type) or TXT records. Should the string not fit in one record (a single TXT string is capped at 255 bytes, and a classic UDP DNS response at 512 bytes), the response could be broken up into multiple records and concatenated by the client to reassemble.
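The split-and-reassemble step might be sketched like this. The helper names are hypothetical, and the 255-character chunk size is chosen because that is the most a single TXT character-string can carry. The sketch also assumes the client receives the chunks in order. Strings inside one TXT record keep their order, but separate records in a response do not, so a real design would need an index prefix or similar.

```python
import base64

MAX_TXT_STRING = 255  # maximum length of one TXT character-string


def torrent_to_txt_chunks(torrent_bytes: bytes, chunk_size: int = MAX_TXT_STRING) -> list[str]:
    """Base64-encode a .torrent and cut it into TXT-sized strings."""
    encoded = base64.b64encode(torrent_bytes).decode("ascii")
    return [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]


def txt_chunks_to_torrent(chunks: list[str]) -> bytes:
    """Concatenate the chunks (assumed to be in order) and decode them."""
    return base64.b64decode("".join(chunks))


sample = bytes(range(256)) * 4  # stand-in for a roughly 1 KB .torrent
chunks = torrent_to_txt_chunks(sample)
print(len(chunks), "chunks; round trip ok:", txt_chunks_to_torrent(chunks) == sample)
```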

The odds of a collision with existing DNS space are low due to the use of a SHA1 hash and the _torrent subdomain. It coexists peacefully.
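Putting the example together, the client-side control flow could look roughly like the sketch below. The network pieces are passed in as callables (http_get and dns_txt_lookup are hypothetical placeholders, not real library functions) so that only the order of operations is shown: try HTTP first, and query DNS only when the server cannot be reached. As before, hashing the request path is an assumption.

```python
import hashlib
from typing import Callable, Optional


def fetch_torrent(
    host: str,
    path: str,
    http_get: Callable[[str, str], bytes],
    dns_txt_lookup: Callable[[str], Optional[bytes]],
) -> tuple[str, Optional[bytes]]:
    """Try the HTTP server first; fall back to the DNS lookup if it fails."""
    try:
        return "http", http_get(host, path)
    except OSError:
        # Connection refused, timeout, and similar errors: server is unavailable.
        pass
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()
    return "dns", dns_txt_lookup(f"{digest}._torrent.{host}")


def broken_server(host: str, path: str) -> bytes:
    raise ConnectionError("server overloaded")


source, payload = fetch_torrent(
    "dl.robert.accettura.com",
    "/pub/myfile.tar.gz",
    http_get=broken_server,
    dns_txt_lookup=lambda name: b"torrent-from-dns",
)
print(source, payload)
```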

Downside

The downside here is that if your server fails, your DNS is going to take an extra query from any client capable of doing this. There is also slight latency in this process.

Upside/Conclusion

The upside is that DNS scaling has come a long way and is rarely an issue for popular sites and web hosts. DNS can be (and often is) cached by ISPs, resulting in an automatic edge CDN. ISPs can also mitigate traffic on their networks by caching on their side (something I also suggested in 2004).

BitTorrent may be used for illegal content, but so is HTTP. I think costs for ISPs and websites could be significantly cut by making BitTorrent more transparent as a data transfer protocol.

13 replies on “BitTorrent For HTTP Failover”

Is “torrent” really an encoding? When I see that header, I assume that the data is all contained in the following stream, encoded by whatever happens to be in that field.

It might be better to put it in the Accept header:

Accept: */*, application/x-bittorrent

@Shawn Wilsher: Not quite. metalink is trying to reinvent downloads. This is bootstrapping BitTorrent into an HTTP download interaction users are already familiar with.

That said, I personally have slight reservations with abstracting yet another thing with XML.

Matt: Accept-Encoding only serves to tell the server which compression techniques we can handle (the server can compress the data stream for faster transfers) but the server can still choose whether or not it actually uses one of those.

Also your Accept idea isn’t really all that great, since right now servers would go check any media type they’re sending against that field (ideally) and check to see if that media type can be handled by the client. If not, it might send a different media type. The wildcards there mean that the browser wants the server to send whatever it’s got… an explicit declaration of application/x-bittorrent would seem to be redundant, not to mention it changes the way that header is expected to be parsed by the server (*/* now means everything… EXCEPT bittorrent! Not exactly intuitive.).

I like the Accept-Encoding method if only because with X-Torrent, the browser would likely send GET and expect to get the full file via HTTP… which the server would then send along with the X-Torrent! A bit of a waste of bandwidth when the server is just going to cut it off and use the torrent.

Otherwise the client should send a separate X-Header of its own to indicate BitTorrent support, to get a torrent file in return.

Or we could go and add yet another string to the User Agent to indicate built-in BitTorrent support. 🙂 Perhaps a JavaScript method too. 😛

The DNS thing seems to be a bit weird to me, and I don’t think it’ll work out.

1) It’s not what DNS is designed for
2) I don’t know how many hosting providers allow you to declare your own DNS entries, but mine doesn’t. I’m limited to making subdomains on the same server as my main site. I can’t even make one that points to my home computer or a dynamic DNS (no-IP.com, etc) entry. No one may be able to use it.
3) Even without your DNS solution we’re still much better off than before. Not to mention, if the server does go down, our solution at getting users to use BitTorrent to download the file has already clearly failed. The whole point is to keep the server up. 🙂

Not sure how else you would do it though. Redirect all traffic to a mirror of the torrent file on TPB or elsewhere, maybe… some servers that are failing to serve too many users with full web pages may improve if all they have to do is reply with a Location header.

@Dan:
1. DNS has been used for similar tasks for years. SPF and DNSBL are great examples. It wasn’t designed for either. This design is very similar to them.
2. Most hosting providers I’ve seen do. I know mine does, I tweak it from time to time. Not doing this because a few hosting providers still don’t provide any DNS control is silly. Besides, their customers aren’t likely to need this functionality anyway.
3. DNS is redundant since you should have at least 2 DNS servers in separate facilities on different networks (see RFC 2182). This particular blog has 6 DNS servers behind it. All sites have at least 2. ISPs cache. There’s better protection on the DNS level than anywhere else if the site is hosted correctly.

DNS servers can also handle a ton of traffic without much effort. They are often one of the older boxes a hosting provider has running.

@Robert: Is that reinvent w/ a bad connotation? 🙂

We are trying to get torrents and advanced download features into an HTTP download interaction users are already familiar with… in the sense that they don’t need to know it’s being used, because it all happens automatically in the background. Yes, you can list torrents and all mirrors for a file w/ a metalink (XML). This is what things like the openSUSE download redirector (MirrorBrain) and other Linux distributions are using metalink for. Some clients download from torrents and mirrors at the same time, then share over BitTorrent when finished.

For metalink, we’ve been using transparent content negotiation which is apparently not the correct thing to do in some people’s opinion, but works. I’ve been told the HTTP Link header is the correct thing for us, so I’m hoping it’d be good for torrents too.

If anyone is interested in this, feel free to help us improve what we’re doing by dialoging and collaborating.

@Dan (the first):
Umm, my Firefox right now sends “Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8” – i.e. there are preferences. Of course adding anything to any header is costly (it gets sent with every request), and all that.

Robert:
For the fallback, might just having a pointer to a magnet: URI be better? Not so much data to cram onto your DNS servers, and all that.

@Mook: The advantage of DNS is that you avoid overloading yet another HTTP server and can take advantage of how well DNS scales thanks to UDP and being cached by ISPs.

I added the following clarification via a comment to the Ajaxian post about this:

1. Accept-Encoding is correct since the server would serve a .torrent (over HTTP) as the response should it find the browser can accept it. The DNS layer is a fallback initiated by the browser should the HTTP download server fail. This is by far the most common point of failure in a high traffic environment. Accept-Encoding doesn’t switch protocols as you claim; it merely tells the server whether it can offer a .torrent or not.

2. DNS would really hold only 1-2 KB at most. It would hold the .torrent, which generally runs about 1 KB for essential info only. As I mentioned, this is not really that different from storing SPF data in a TXT or SPF record. Downloading the actual file is up to the torrent network. ISPs could cache the torrent itself if they wanted to keep it on their network and offer faster downloads to their customers.

I don’t see this working out too well either, but I do like the idea. There is just too much **** to deal with for something like this to be worth it.

* You have to worry about things such as people going to site.com or http://www.site.com. How do you handle things such as pagination and pages with ?page=2?

* Depending on the page, this torrent is gonna be changing every 10 minutes, and ISP DNS caching isn’t gonna help much at all; you will just be giving people old useless pages. You can keep the TTL low, but then your DNS server is gonna get thrashed no matter what.

* I don’t know the BitTorrent protocol at all, but last I checked, trackers run over HTTP. So if your web server is getting hammered, you have yet another server to worry about getting hammered after that.

* How long do you propose people’s browsers keep seeding a page? Ratio of 1? 2? As long as I’m at that site? As long as my browser is open? As long as it’s in the cache? If you have peers connecting to a tracker and not staying on any longer than 10 seconds, that’s gonna render attempting to get the page almost useless.

@Zac: You’ve got several things wrong:

1. You would likely dedicate a subdomain.
2. Pagination is irrelevant since this is DNS.
3. DNS thrashing is hardly a big deal considering how low resource they really are by today’s standards. Likely unnecessary for most anyway.
4. Trackers could be done on any protocol in theory, HTTP is ubiquitous.
5. That’s really up to the browser. There’s no standard for BitTorrent clients either and they managed. Could even have a separate process to mimic that of a typical BitTorrent client, though considering how long most people keep browsers open it’s likely not needed.
