Derivative/collective works and OSL

David Dillard david.dillard at
Sun Feb 6 23:52:41 UTC 2005

Reply inline below...

> -----Original Message-----
> From: Chuck Swiger [mailto:chuck at] 
> Sent: Sunday, February 06, 2005 5:09 PM
> To: John Cowan
> Cc: license-discuss at
> Subject: Re: Derivative/collective works and OSL
> John Cowan wrote:
> [ ... ]
> > There is a similar dispute, though not so problematic, at the other
> > end: does mere compiling of source code create a derivative work, or
> > is the object code the original work in a different medium, as a
> > paperback book is the same work as a hardback original?
> > Nobody knows the answer to that either.
> It may well be true that the courts have not considered a 
> dispute involving this specific issue and resolved it in a 
> way that sets a clear standard or precedent.
> However, there exists a branch of software engineering known 
> as compiler design, and there exist experts in that field who 
> have written working compilers who share a common 
> understanding of how a compiler toolchain
> operates: compilers perform a mechanical, deterministic, and 
> reversible transformation on source code to produce object code.
> By definition, this transformation does not change the 
> semantic meaning of the program and does not involve human 
> decision-making or any possibility of creativity. [1]
> To use your analogy as a starting point, consider taking a 
> book and translating it to another language.  For human 
> languages, this is a creative process since there can be many 
> ways to translate something, the process is not deterministic 
> since two translators often produce output which is noticeably 
> different, and the process is not reversible: if you 
> translate a sentence from English to Russian, and then from 
> Russian back to English, it is very likely that what you get 
> back is not the same as the original work.
> [ A classic example from NLP was: "The spirit is willing, but 
> the flesh is weak." became "The vodka is good but the meat is 
> rotten." ]
> Computer languages are unlike human languages: they possess 
> well-defined semantics, enforced by compiler parsing rules 
> like LR(1) or LALR(1) which forbid ambiguity and ensure that 
> well-defined source code has one and only one meaning when 
> compiled.  You can compile a source code file with one 
> compiler into an object file, decompile the object file via a 
> disassembler or debugger like gdb, and then recompile that 
> result into a new object file using a different compiler.  
> You will end up with a program that has the same exact 
> behavior and meaning as the original program.

It has the same behavior, but disassembling isn't going to give you
something equivalent to the source you started with (assuming some sort of
high-level language).  Thus, what you have after disassembly isn't nearly
as easy to successfully modify as the original.
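
As a minimal sketch of why the trip back is lossy, assume CPython's built-in
compile() as a stand-in for a real compiler toolchain (an assumption for
illustration only; the two source strings are hypothetical).  Two different
source texts compile to the same bytecode, so no tool working from the
output alone can recover which text was the original:

```python
# Hypothetical sources: they differ only in a comment.
SOURCE_WITH_COMMENT = "total = 1 + 2  # sum of two constants\n"
SOURCE_WITHOUT_COMMENT = "total = 1 + 2\n"


def compile_unit(source):
    """Compile source text to a CPython code object (stand-in for object code)."""
    return compile(source, "<example>", "exec")


def run_unit(code):
    """Execute a code object and return the value it binds to 'total'."""
    namespace = {}
    exec(code, namespace)
    return namespace["total"]


with_comment = compile_unit(SOURCE_WITH_COMMENT)
without_comment = compile_unit(SOURCE_WITHOUT_COMMENT)

# Deterministic: compiling the same text twice yields the same bytecode.
assert compile_unit(SOURCE_WITH_COMMENT).co_code == with_comment.co_code

# Many-to-one: both sources compile to identical bytecode, so the comment
# cannot be reconstructed from the compiled form.
same_bytecode = with_comment.co_code == without_comment.co_code
same_behavior = run_unit(with_comment) == run_unit(without_comment)
print(same_bytecode, same_behavior)
```

Both comparisons should hold: the behavior survives compilation, but the
comment (and with it part of what makes the source easy to modify) does not.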

> The process of compiling software is thus very similar to 
> photocopying an original document, and then photocopying the 
> copy.  With analog photometric reproduction, the process is 
> lossy (the "Xerox" effect where a second copy becomes blurry 
> compared with the original), but a digital process does not 
> suffer generation loss.

Only if you consider the compiled output to be the original work.

> > The reason it matters is that pretty much everyone agrees that a 
> > tarball is a collective work,
> If I put a book-- a single work, written by a single author-- 
> into a box and mail that box, the box only contains a single 
> work.  If I put two books into the box, then there are two 
> works in the box, but that does not mean the box is a 
> collective work: it is a mere aggregation of two components 
> which are distinct and can be handled separately without any 
> confusion.
> The tape archive format, or tarball, is a method of packaging 
> content for shipment over the network or for convenient 
> long-term storage, just as the box used for the sake of example
> is a convenient method for packaging content for shipment via
> the postal service.
> A tarball of a single work is an archive containing a single 
> work, not a collective work.  A tarball of two separate works 
> is an archive of two separate works, which is a simple 
> aggregation and not a collective work. [2]
> > ...and if when compiled it is still a collective work, then it is
> > not derivative of any of the works contained in the tarball.
> You can't compile a tarball without extracting the contents 
> any more than one could read a book mailed to you in a box 
> without opening that box, first.

I don't think you want to go there.  Someone could easily write a compiler
or a compiler preprocessor so that you could indeed compile a tarball.
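
A rough sketch of such a front end, using Python's standard tarfile module
and the built-in compile() in place of a real compiler (both choices, and
the names build_tarball, compile_tarball, and the member paths, are
assumptions made for illustration):

```python
import io
import tarfile


def build_tarball(files):
    """Return the bytes of an in-memory tar archive holding the given files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def compile_tarball(tar_bytes):
    """Compile every .py member straight from the archive, writing nothing to disk."""
    compiled = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".py"):
                source = archive.extractfile(member).read().decode("utf-8")
                compiled[member.name] = compile(source, member.name, "exec")
    return compiled


# Two hypothetical works, kept in separate directory trees inside one archive.
tar_bytes = build_tarball({
    "work_a/hello.py": "greeting = 'hello'\n",
    "work_b/answer.py": "answer = 6 * 7\n",
})
units = compile_tarball(tar_bytes)

namespace = {}
exec(units["work_b/answer.py"], namespace)
print(sorted(units), namespace["answer"])
```

The tool still reads each member out of the archive internally, so one could
argue the box is merely being opened out of sight; the sketch only shows
that the user never has to perform a separate extraction step.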

> Is a photocopy of a document considered a derived work, or is 
> it considered to be the same thing as the original work for 
> practical and legal purposes?
> --
> -Chuck
> [1]: If the code being compiled has a bug that results in 
> undefined behavior, the compiler is allowed to produce 
> different results when invoked with different optimization 
> flags or compared with the output generated by another 
> compiler.  While true, this does not refute my argument: what 
> the compiler is allowed to do when compiling/optimizing 
> source code is required not to diverge for code which does 
> not involve undefined behavior.
> Page 586 of _Compilers: Principles, Techniques, and Tools_ by 
> Aho, Sethi, and Ullman states:  "First, a transformation must 
> preserve the meaning of programs.  That is, an 'optimization' 
> must not change the output produced by a program for a given 
> input, or cause an error, such as a division by zero, that 
> was not present in the original program.  The influence of 
> this criterion pervades this chapter; at all times we take 
> the 'safe' approach of missing an opportunity to apply a 
> transformation rather than risk changing what the program does."
> [2]: The vast majority of archives found on various FTP and 
> websites contain a single work comprised of one or more 
> source code files.  There are a few cases where a tarball 
> contains several works, such as nmap shipping with libpcap, 
> or Python coming with expat, but it is easy to see that these 
> are two separate works because the archive keeps them in two 
> separate directory tree hierarchies.
> I suppose it would be possible to rip out all of the pages 
> from two books, and mix them together on a chapter-by-chapter 
> or page-by-page basis to form a new work which actually was a 
> single indivisible compilation, just as it would be possible 
> to mix all of the files of two software projects together to 
> form a new work, but that is certainly not the normal case.
