I want to return to my prior post from Tuesday questioning the utility of peer-to-peer file distribution. That post has spurred a number of responsive posts (from my colleague Lior Strahilevitz here, from Ed Felten here and from Brett Frischmann here), plus extensive comments from Tim Wu and others. (Tim and I co-taught an innovation policy seminar in the Spring, so this is part of a continuing conversation.)
Both Lior and Ed focus on the question of control. I think that is exactly the right issue, but we need to figure out what to make of it. Unfortunately, I think we need to start with questions of telecommunications and computer engineering before we can turn to law and economics. I say unfortunately because I suspect that I’m at a comparative disadvantage relative to Ed (who, as you may know, is a Princeton computer science prof) and Tim, who spent a number of years in the Valley at a network equipment firm. Nonetheless, in the great tradition of lunch at the University of Chicago Law School, I will plunge ahead fearlessly.
Ed Felten focuses on the question of whether even “centralized” sites such as Google are really centralized. He notes that Google’s site probably uses a distributed computing architecture. By that he means that Google is not just one giant hard disk somewhere. Instead, Google has racks and racks of servers, and these servers are at locations throughout the country. This is a distributed architecture, even though, from the user’s standpoint, it acts as if the site is run by one giant computer.
How Google is organized is an interesting question of computer engineering and manufacturing costs. Think of this as the question: how should we organize computer storage? What size should a component be, given how it can be produced most efficiently and given the cost of communicating among the components? Is it more expensive to communicate within a component or across components? All interesting questions, but I don’t think any of them is really the issue for our discussion of the uses of peer-to-peer file distribution.
Instead, and Ed and Lior both make this point, the real issue is control. However Google is engineered, we have a single point of control. If the boys at Google—Larry Page and Sergey Brin—get up one day and decide to flick a switch, they can turn off Google. This is one of the key differences between a product and a service. Even Microsoft, with all of its vaunted power, can’t turn off the already-distributed copies of Windows.
We saw an interesting example of this control over the last few days, and this takes us from peer-to-peer to peering. The backbone of the Internet is a series of interconnected networks. Packets move about the network, just as cars do on the interstate, but the interconnections between the networks are done through contracts. Two contract types are important: peering and transit. In a peering contract, no money changes hands between the networks. Instead, the deal is: “You take my packets, I will take yours, and we will call it a wash.” Peering avoids the transaction costs of metering. The other arrangement is transit: count the number of packets exchanged and assess a fee.
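To make the two arrangements concrete, here is a stylized sketch. The traffic volumes and the per-gigabyte fee are invented for illustration, and real transit contracts are more complicated (95th-percentile billing and the like), but the basic logic is the same:

```python
# Stylized comparison of the two interconnection contracts described above.
# All numbers are hypothetical.

def settlement(gb_a_sends_to_b, gb_b_sends_to_a, contract, fee_per_gb=0.0):
    """What network A owes network B for the month under each contract type."""
    if contract == "peering":
        # Settlement-free: each side carries the other's packets and calls it a wash.
        return 0.0
    # Transit: meter the traffic A hands to B and assess a fee on it.
    return gb_a_sends_to_b * fee_per_gb

# Roughly symmetric flows: peering costs nothing and avoids the metering overhead.
print(settlement(100, 95, contract="peering"))                    # 0.0
# Lopsided flows: the larger network may insist on converting to paid transit.
print(settlement(100, 20, contract="transit", fee_per_gb=2.0))    # 200.0
```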
For peering to work, the traffic flows between the contracting networks need to be relatively symmetric. If that symmetry is broken, problems may result, and this is what we saw this week. As described by c|net news.com (“Blackout shows Net’s fragility”), Level 3 Communications had a peering arrangement in place with Cogent Communications. The peering became unbalanced, and Level 3 told Cogent that it needed to start paying. When no agreement was reached, Level 3 turned off the connection, and Cogent’s customers, including the Museum of Fine Arts in Boston, couldn’t get to parts of the Internet.
Control is key, and as both Ed and Lior note, peer-to-peer means decentralized control. As the Level 3 example suggests, even peer-to-peer will be dependent on the underlying rules for organizing the Internet, but with true p2p, we will avoid another single switch that can be turned on or off. So, as Lior notes, Ian Clarke’s freenet software is designed with the idea of avoiding centralized control. The fear is China, where governmental authorities seek to exercise broad control over the distribution of ideas and content. If that is the fear, then we want to spread control, and peer-to-peer software is a good approach to that. But as Bruce Boyden notes in his comment on Lior’s post, the U.S. isn’t China.
So I circle back to my original question: what content should we distribute p2p, and why? In that context, I should say a few words about BitTorrent, raised by Tim Wu in his comments. This takes us from organizing computer storage to organizing bandwidth.
The idea behind BitTorrent is simple, at least if I understand it. For consumers, think of broadband as two different pipes running into your home. One is a downloading pipe, the other an uploading pipe. For most consumers, the downloading pipe is much larger than the uploading pipe. That creates a problem for peer-to-peer distribution. Say, to make the numbers simple, the downloading pipe is 10 times the size of the uploading pipe. If I am at home downloading a song from Lior’s home computer, 90% of my downloading pipe sits idle. BitTorrent recognizes that, so instead of downloading the entire song from Lior’s computer, it finds 10 computers with the song and downloads 1/10th of the song from each of the computers. I use their entire uploading bandwidth, while using all of my downloading bandwidth.
The BitTorrent software obviously needs to make sure that it is not downloading the same snippet of the song from each computer, so that I get the full song rather than 10 copies of one part of it. (In truth, for most top 40 music, it isn’t obvious it would make a difference.) (Further parenthetical: that is just a line really, as I listen to nothing but top 40 (at least of various decades) and its ilk, unlike most of my colleagues, who seem to see each other constantly at the opera.)
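To make the chunking idea concrete, here is a minimal sketch of the logic. The peer names, the ten-way split, and the helper function are purely illustrative; this is not how any real client is written, just the shape of the idea:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(peer, chunk_index):
    """Stand-in for pulling one chunk of the file from one peer's upload pipe."""
    return f"<bytes of chunk {chunk_index} from {peer}>"

def parallel_download(peers, num_chunks):
    # Assign each distinct chunk to a different peer, so no peer is asked for
    # the same snippet twice and together the chunks cover the whole file.
    assignments = [(peers[i % len(peers)], i) for i in range(num_chunks)]
    # Pull all the chunks at once, using every peer's small upload pipe in parallel.
    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        chunks = pool.map(lambda job: fetch_chunk(*job), assignments)
    return "".join(chunks)   # reassemble the chunks in order

song = parallel_download(peers=[f"peer{i}" for i in range(10)], num_chunks=10)
```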
But as I hope that discussion makes clear, the utility of BitTorrent rests on an artifact of consumer broadband: the asymmetry between uploading and downloading capacity. I assume—but don’t know for sure and would be delighted to learn more—that this asymmetry is not an important organizing principle for centralized download sites, such as iTunes. This is not to say that I can’t imagine commercial companies using BitTorrent for distribution: if they can economize on their own bandwidth costs by taking advantage of consumer bandwidth, they will do so. I take it this is exactly Tim’s point when he suggests that a peer-to-peer infrastructure democratizes distribution costs.
But that takes me back to my original point, namely, that I don’t see us doing that currently for for-fee copyrighted content, and I was surprised to see that we really aren’t doing that for public domain content or even for photographs. Maybe we will do that for fat files, such as sharing our home movies, but then we have to switch from questions of technology and supply to the issue of demand. There is no easier way to clear a room than to break out the home movies, and I suspect that is true whether we are in one physical room or one giant virtual room.
What's the point of this discussion? It seems like the real question is whether P2P technologies should be banned or not. There's a very simple answer to that question -- No.
Posted by: Doug Lay | October 07, 2005 at 10:59 AM
The bandwidth organizing benefits of p2p are being recognized more widely all the time, including by distributors of "for fee" content. For example, the video game business has begun relying on this feature, employing p2p technologies developed by companies like Red Swoosh and Kontiki.
This discussion brings to light an important dichotomy buried in the term "p2p". The best known p2p apps generally accomplish two distinct functions: search and file transfer. Napster handled the former in a centralized way and the latter in a decentralized way. Later p2p apps, like eDonkey and Kazaa, have implemented both functions in decentralized fashion. BitTorrent does not attempt to do search at all, focusing solely on file transfer (though it relies on a centralized "tracker" for each file to handle coordination).
With respect to the question of whether decentralization is a good idea for bandwidth sharing, I think the posts reflect a consensus that it is.
The more interesting question, in my view, is what case can be made for implementing search in a decentralized (i.e., not controlled by one entity) fashion. Of course, it seems to me exactly the kind of question that the free market should decide, with minimal interference from a priori copyright restrictions in the form of secondary liability.
Posted by: Fred von Lohmann | October 07, 2005 at 11:27 AM
Prof. Picker-
On BitTorrent and asymmetry. You are right that BitTorrent works by balancing upload/download asymmetry. But the asymmetry that it balances is the aggregate of the network, not that of the individual host.
For instance, imagine that iVideo (a hypothetical video distribution service) is distributing a 1000 megabyte file to 1000 peers. And assume, too, that each peer has a symmetric 10MB/sec connection (highly unrealistic, about 100-megabit ethernet), and that iVideo has a symmetric 1000MB/sec connection (i.e., 10-gigabit ethernet). Under a traditional distribution model, each of those thousand peers is going to compete for a "fair" share of iVideo's 1000MB/sec: about 1MB/sec. (As an aside, in TCP/IP jargon, the meaning of fair is just as ambiguous as in law and economics; under the standard TCP/IP implementation, each peer would ideally get about 0.75MB/sec, with the other 25% being lost to transaction costs incurred by congestion control. But I'll call it 1MB/sec.) Distribution of the 1000MB file would therefore take 1000 seconds for each peer.
Note that this wastes a great deal of bandwidth: each peer has some amount of bandwidth that can be used for uploading--in this model as much as for downloading. 1000 peers with 10MB/sec upload capacity is 10000MB/sec of wasted capacity; a full order of magnitude more than iVideo has.
Under the BitTorrent model, each of the 1000 peers would download a separate piece of the 1000MB file. This would take 1 second. In the next second, each peer would download another chunk from iVideo, and a third chunk from another peer (perhaps "peer n" would download from "peer n+1"). In second 1, 1000MB is transferred; in second 2, 2000MB. The amount transferred in each second grows roughly exponentially from one second to the next (with an extra 1000MB coming from the iVideo server each second) (second 3: 3000MB; second 4: 5000MB; second 5: 9000MB; &c), up to the point of network saturation (11000MB/sec). To figure out the total transfer, we look at the total data to be transferred: 1000 peers times 1000MB = 1000000MB (1TB). I'll spare you the math and say that, with the exponential growth to saturation, the total transfer, in theory, would take about 90 seconds.
Clearly, 90 seconds is much better than 1000 seconds.
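If you want to check my arithmetic, here is a rough back-of-envelope simulation. Everything in it is the stylized assumption set from the example above, including the second-by-second doubling of the peer contribution, which is my simplification and not a property of the real protocol:

```python
# Back-of-envelope check of the iVideo example: 1000 peers, a 1000MB file,
# 10MB/sec of upload capacity per peer, 1000MB/sec at the hypothetical server.
PEERS = 1000
FILE_MB = 1000
PEER_UP_MB = 10        # MB/sec each peer can upload
SERVER_UP_MB = 1000    # MB/sec the iVideo server can upload

def naive_seconds():
    """Every peer pulls the whole file straight from the server."""
    fair_share = SERVER_UP_MB / PEERS      # ~1MB/sec per peer
    return FILE_MB / fair_share            # 1000 seconds

def swarm_seconds():
    """Peer contribution roughly doubles each second until the network
    saturates at server capacity plus aggregate peer upload capacity."""
    total_needed = PEERS * FILE_MB                     # 1,000,000MB in all
    saturation = SERVER_UP_MB + PEERS * PEER_UP_MB     # 11,000MB/sec
    delivered, peer_rate, seconds = 0, 0, 0
    while delivered < total_needed:
        delivered += min(SERVER_UP_MB + peer_rate, saturation)
        peer_rate = max(SERVER_UP_MB, 2 * peer_rate)   # ramp-up phase
        seconds += 1
    return seconds

print(naive_seconds())   # 1000.0
print(swarm_seconds())   # ~95, i.e., on the order of the 90 seconds above
```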
Returning to the original question of asymmetry, what this is doing is "balancing" the asymmetry of the earlier, non-peer-to-peer model by making use of the available upload capacity of the peers. So the asymmetry isn't the 90/10 upload/download per-peer ratio, but the 100/0 ratio of downloading from iVideo to downloading from other peers. A simpler way to think of this is that, under the previous model, the peers' available capacity wasn't being used: it was a deadweight loss incurred by that protocol. BitTorrent allows us to capitalise on what would otherwise be waste.
Now, to address your underlying question: why do we need p2p? I hope that I only need to address the converse--the gains are obvious from the above example--why don't we need it? The answer, as always, is transaction costs. Consumers don't frequently transfer huge files, and the researchers that do aren't numerous enough to need it (the traditional protocols--ftp and scp, for instance--are simpler to administer for frequent transfers by a few users or for infrequent transfers of many files). In the case of the consumers, transferring a 10MB video or music file in toto might take a minute; with a streaming model, it might take 5 seconds to transfer enough to start playing. The BitTorrent model doesn't allow for streaming (because the data might not arrive in order), and it incurs substantial transaction costs (coordinating the optimal p2p transfer between an arbitrary number of peers with grossly different bandwidth capacities is non-trivial; indeed, it is impossible other than by chance without a priori knowledge). (Note that, even if it's only 10% efficient, in the above example 900 seconds is still better than 1000 seconds; and there are network effect benefits, too, because iVideo could get away with a much smaller pipe without grossly affecting total transfer times; an n-fold decrease in iVideo's capacity should, if I recall correctly, yield only an O(log(n)) increase in total transfer time.)
So, BitTorrent needs either to allow complete transfers faster than data-to-start downloads for streaming, or to be used in an environment where complete downloads are the norm (eg, music for your iPod). And it needs to be used in an environment where the network effect benefits, as limited by the necessary inefficiencies of the protocol, yield a net decrease in average transfer times. An alternative, more Chicago-like understanding of where we should use P2P a la BitTorrent is this: where the total cost difference between x-mbit/sec and (n*x)-mbit/sec Internet connections for iVideo is greater than the costs incurred by the peers from a k*O(log(n)) increase in download times (where k is the coefficient of efficiency for the P2P protocol).
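Rendered as a toy calculation, that condition looks something like the sketch below; the function and every number in the example call are invented purely to show the shape of the comparison, not to estimate anything real:

```python
import math

# Adopt the p2p model when the distributor's bandwidth savings exceed the
# time costs that slower downloads impose on the peers. All inputs are
# hypothetical placeholders.
def p2p_pays_off(cost_nx_pipe, cost_x_pipe,     # distributor's big vs. small connection
                 n,                             # how much smaller the small pipe is
                 k,                             # efficiency coefficient of the p2p protocol
                 base_seconds,                  # download time with the big pipe
                 peers, value_per_second):      # peers affected, and what their time is worth
    bandwidth_savings = cost_nx_pipe - cost_x_pipe
    extra_seconds = k * math.log(n) * base_seconds          # the k*O(log n) slowdown
    peer_time_costs = peers * value_per_second * extra_seconds
    return bandwidth_savings > peer_time_costs

# Hypothetical example: a $45,000 saving against 1000 peers each waiting a few
# extra minutes.
print(p2p_pays_off(50000, 5000, n=10, k=1.0, base_seconds=95,
                   peers=1000, value_per_second=0.001))      # True
```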
And, as Fred notes, we are seeing companies starting to switch to these models, particularly for prototype video-distribution systems. Probably both because the download times are otherwise too large, and because the needed centralised-distribution bandwidth requirements are prohibitive.
I hope that this helps with the discussion--by which I mean that the costs incurred by reading this are less than the net of the value of the ideas imparted and the consequent discussion.
--Gus
Posted by: Gus | October 07, 2005 at 11:57 AM
Gus, I don't know who you are, but I agree with everything you say.
Asking what use is BitTorrent is like asking "what's the use of cheaper airline tickets?" BitTorrent-style technologies just, in effect, make bandwidth cheaper, and to paraphrase Lenin, cheapness has a character all of its own.
I think, to agree with Fred, that the really interesting question is: what's the use of decentralized search, as first implemented in Gnutella? And here I am inclined to agree with Randy and say my first instinct is that it's useful for avoiding unwanted control.
But maybe we're wrong, and maybe Google will be supplemented any day by a purely decentralized search system (though in a sense Google is already somewhat decentralized, as far as I can tell). Anyone have an idea?
Posted by: Tim Wu | October 07, 2005 at 11:02 PM
I'm late to this party, but Fred, the type of "for fee" game content you're talking about is patches, updates, demos, add-on levels, mods, etc., right? I.e., stuff that's distributed for free, but requires the original game to make use of. Or are there games out there that can be downloaded for a fee from a peer-to-peer network? If so, I'm curious what those are.
Posted by: Bruce | October 09, 2005 at 08:33 PM
I have a followup post on this over at Freedom to Tinker, at http://www.freedom-to-tinker.com/?p=907
Posted by: Ed Felten | October 10, 2005 at 09:56 AM