# [License-discuss] A modest proposal to reduce the number of BSD licenses

Atwood, Mark atwoodm at amazon.com
Fri Aug 21 22:07:01 UTC 2020

That's an impressively huge pile of BSD license variants, and it is the central
example of the problem with the license-as-used.

The BSD licenses in the SPDX database have some "matching rules", so that it
can cover common variants that the SPDX reviewers decided are legally
identical.

Have you tried running that dataset through those matching rules to see
whether they all match or if any don't?

This tool [ https://github.com/jpeddicord/askalono ] (written by a
recovering Amazonian who used to be on my team) makes it pretty easy to do.
When built, it downloads the entire SPDX database and then compiles it into
the executable.  Feed it the text of a license, and it will tell you which
SPDX license it matches, and if it doesn't perfectly match any of them,
which one is the nearest edit distance away, and what the edit distance is.
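
To make the "nearest edit distance" idea concrete, here is a minimal Python sketch. It is not askalono's code and does not claim to reproduce its matching algorithm or the SPDX matching rules; it only shows a plain word-level Levenshtein distance picking the closest template. The template names and texts (`Example-A`, `Example-B`) are made-up placeholders, not real SPDX entries.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here, lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def normalize(text):
    """Lowercase and split on whitespace so layout differences are ignored."""
    return text.lower().split()


def nearest_license(candidate, templates):
    """Return (name, distance) of the template closest to the candidate text."""
    words = normalize(candidate)
    scored = [(edit_distance(words, normalize(body)), name)
              for name, body in templates.items()]
    distance, name = min(scored)
    return name, distance


# Hypothetical placeholder templates, for illustration only.
TEMPLATES = {
    "Example-A": "Redistribution and use in source and binary forms are permitted",
    "Example-B": "Permission to use, copy, modify, and distribute this software is granted",
}

if __name__ == "__main__":
    text = "Redistribution and use in source and binary forms is permitted"
    print(nearest_license(text, TEMPLATES))
```

A distance of 0 means an exact match after normalization; anything larger tells you how many word edits separate the text from its closest template.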

..m

Mark Atwood <atwoodm at amazon.com>
Principal, Open Source
+1-206-604-2198

-----Original Message-----
Behalf Of Jeremy C. Reed
Sent: Friday, August 21, 2020 2:28 PM
Subject: RE: [EXTERNAL] [License-discuss] A modest proposal to reduce the


On Fri, 21 Aug 2020, Josh Berkus wrote:

> > Amazon's preferred permissive license is Apache 2.0. In part
> > because it doesn't have this "dozens and dozens of pointless minor
> > variants" problem.
>
> For such a short license, BSD has an awful lot of variations.

I published a printed two-volume set of NetBSD sysadmin manuals.

As part of that work I identified hundreds of unique licenses.
My printed License acknowledgements include 26 different statements "This
product includes software developed ..."

And the included licenses began on labeled page 1461 (volume 2 physical page
716) and ended on page 1529 (volume 2 page 784).

t1:reed$ ls -l /home/reed/book/netbsd-documents/copyrights.tex
-rw-r--r-- 1 reed reed 281396 Jun 14 2010 /home/reed/book/netbsd-documents/copyrights.tex
t1:reed$ wc /home/reed/book/netbsd-documents/copyrights.tex

t1:reed$ grep '\\hline' /home/reed/book/netbsd-documents/copyrights.tex | wc -l
109
(separators between unique licenses)

t1:reed$ grep '\\textbf' /home/reed/book/netbsd-documents/copyrights.tex | wc -l
620
(unique files)

t1:reed$ grep '^Copyri' /home/reed/book/netbsd-documents/copyrights.tex | wc -l
683
(unique Copyright lines)

t1:reed$ grep "^Redistribution and use"
97

t1:reed$ grep -v "^Permission to use"
13

Every license (disclaimer, etc.) included was unique because of some wording
difference, even if only a single word, not counting the copyright owners.
As part of the listing I bundled all the copyright
statements/dates/owners and the filenames with the single corresponding
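
The bundling described above can be sketched roughly as follows, in Python. This is only a guess at the general approach, not the script actually used for the books: it assumes copyright lines start with the word "Copyright" and treats everything else as the license body, so two bodies that differ by a single word stay separate groups. The filenames and notice texts are invented for the example.

```python
from collections import defaultdict


def split_notice(notice):
    """Separate 'Copyright' lines from the remaining license body."""
    copyrights, body = [], []
    for line in notice.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("copyright"):
            copyrights.append(stripped)
        elif stripped:
            body.append(stripped)
    return copyrights, " ".join(body)


def group_by_license(notices):
    """Map each distinct license body to the files and copyright lines using it."""
    groups = defaultdict(lambda: {"files": [], "copyrights": []})
    for filename, notice in notices.items():
        copyrights, body = split_notice(notice)
        groups[body]["files"].append(filename)
        groups[body]["copyrights"].extend(copyrights)
    return dict(groups)


# Invented sample data, for illustration only.
NOTICES = {
    "a.c": "Copyright (c) 1990 Example One\nRedistribution and use are permitted.",
    "b.c": "Copyright (c) 1995 Example Two\nRedistribution and use are permitted.",
    "c.c": "Copyright (c) 2000 Example Three\nRedistribution and use is permitted.",
}

if __name__ == "__main__":
    for body, info in group_by_license(NOTICES).items():
        print(len(info["files"]), "file(s):", body)
```

Here `a.c` and `b.c` share one license body under two copyright owners, while `c.c` lands in its own group because of a one-word difference.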

That was for only that small two volume collection. (I say "small" since
there are around 15 more volumes and hundreds more unique licenses.)

As part of this work I identified many missing licenses -- unknown
provenance of code or no contact with original developers -- and also worked

> Gotta be the bikeshed problem.

_______________________________________________
The opinions expressed in this email are those of the sender and not
necessarily those of the Open Source Initiative. Official statements by the
Open Source Initiative will be sent from an opensource.org email address.