Jigdo - Spreading the load of CD and DVD downloads
==================================================

(c) 2005 Steve McIntyre
Distribute freely using the terms of the GNU General Public License,
version 2.

What is Jigdo?
==============

Jigsaw Download (jigdo) is a tool designed to ease the distribution of
very large files over the internet, for example CD or DVD images. Its
aim is to make downloading the images as easy for users as a click on
a direct download link in a browser, while avoiding the problems that
server administrators have with hosting and serving such large files.

History
=======

Debian has long been distributing CD images of its software. As time
has gone on, this has become more and more difficult. Potato (v2.2)
took 3 CDs for each of its 7 architectures, and Woody (v3.0) took 6 or
7 CDs for each of its 11 architectures. For many users, even those
with broadband connections, keeping an FTP or HTTP connection alive
long enough to download a 650MB file was a problem. Also, a
significant number of users (especially Debian developers) would
already have most of the contents of those CDs available locally,
either from local mirrors of the Debian archive or from older versions
of the CDs. Downloading all that data again would not be ideal.
Lastly, the mirror space needed was growing exponentially.

The first solution to the CD download problem was the Pseudo Image Kit
(PIK), written by Anne Bezemer at about the time of the Potato
release. He wrote two simple scripts, one to generate a list of the
files included in a CD image and a second to simply stick those files
together into a "pseudo-image". To convert this pseudo-image into an
exact copy of the distributed image, you would run rsync to fill in
the gaps.

This method worked, but still had a few drawbacks, all due to that
final rsync step. The central server had to keep full copies of all
the ISO images, and had to cope with the load caused by clients
syncing their images. Rsync can be _very_ expensive!
Also, clients had to have a net connection available to be able to
generate ISO images. For many people, that could be a big problem -
imagine a user with a laptop on the road, or a user at a Linux show
with limited disk space, writing CDs to sell.

In 2002, Richard Atterer developed jigdo to solve those issues. It
caught on quickly, and ever since then it has been used by Debian as
the primary method of distribution for ISO images. It generally works
well, and is an effective way to allow people to download these ISO
images quickly and efficiently.

How does Jigdo work?
====================

Jigdo is more generic than the PIK. It works on the assumption that a
large file for distribution is primarily made up of smaller files,
along with some extra "padding". This works for CD/DVD ISO images (of
course), but could also be used for tar files and other applications
where the same assumptions hold.

Jigdo splits its data into two files, the "jigdo" file and the
"template" file. The template file is the piece that contains most of
the information needed to recreate the output file: