[License-discuss] objective criteria for license evaluation

Sun Dec 9 22:14:51 UTC 2012

Hi Luis,

There are many useful ways to cut the data. Even raw statistics can be
interesting: the number of lines of code under each license, the number of
independent foundations or projects that have adopted each license, the
types of software under each license, and so on. I'd like to know which
licenses are used by government agencies, for-profit software companies, and
non-profits. Most useful would be a way of listing "large" or "important"
projects and the licenses they use, as long as the list of such projects is
broad and comprehensive.

I have no idea how Black Duck or others calculate their statistics nor what
is included in their samples, so the lack of methodological openness is more
of a problem than the availability of "statistics". I hope that OSI can
address these questions as scientists would, rather than as religious
zealots for one sect or another.

Regarding the classification of licenses, I think it is most important to
categorize licenses in business terminology that maps onto business models.
So you need to identify which licenses ignore patents or have antiquated
patent provisions, and why that might matter; which licenses require
reciprocity; whether that reciprocity extends to use by third parties over a
network, and whether it is "strong" or "weak"; which licenses contain
defensive suspension provisions (patent only or copyright also) that require
due diligence before reliance on that software; which licenses are
definitely incompatible with each other for derivative work purposes; which
licenses are approved for use by the US or other governments; which contain
attribution requirements beyond a subset of basic requirements; which
contain jurisdiction or governing law provisions; etc. Of course, OSI should
identify licenses that have been superseded or withdrawn by the author.
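
As a rough illustration only, the criteria listed above could be recorded as
structured data so that licenses can be filtered by business-relevant
attributes. This is a minimal sketch: the field names, the enum values, and
the two placeholder entries are assumptions made for illustration, not an
OSI schema and not a statement about any real license.

```python
from dataclasses import dataclass
from enum import Enum


class Reciprocity(Enum):
    NONE = "none"
    WEAK = "weak"
    STRONG = "strong"


class Suspension(Enum):
    NONE = "none"
    PATENT_ONLY = "patent only"
    PATENT_AND_COPYRIGHT = "patent and copyright"


@dataclass(frozen=True)
class LicenseProfile:
    """One row of a hypothetical license classification table."""
    name: str
    explicit_patent_grant: bool
    reciprocity: Reciprocity
    network_use_triggers_reciprocity: bool
    defensive_suspension: Suspension
    governing_law_clause: bool
    superseded_or_withdrawn: bool


def needs_due_diligence(profile):
    """True if the license has any defensive suspension provision."""
    return profile.defensive_suspension is not Suspension.NONE


def filter_profiles(profiles, **criteria):
    """Return the profiles whose fields equal every given criterion."""
    return [
        p for p in profiles
        if all(getattr(p, field) == value for field, value in criteria.items())
    ]


# Placeholder entries; the values are invented and describe no real license.
PROFILES = [
    LicenseProfile(
        name="Example-License-A",
        explicit_patent_grant=True,
        reciprocity=Reciprocity.STRONG,
        network_use_triggers_reciprocity=True,
        defensive_suspension=Suspension.PATENT_ONLY,
        governing_law_clause=False,
        superseded_or_withdrawn=False,
    ),
    LicenseProfile(
        name="Example-License-B",
        explicit_patent_grant=False,
        reciprocity=Reciprocity.NONE,
        network_use_triggers_reciprocity=False,
        defensive_suspension=Suspension.NONE,
        governing_law_clause=False,
        superseded_or_withdrawn=True,
    ),
]
```

A query such as filter_profiles(PROFILES, reciprocity=Reciprocity.STRONG)
would then list the strongly reciprocal licenses. Filling in the fields for
real licenses is the part that needs legal judgment.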

Good luck doing this with scientific precision.

/Larry 

Lawrence Rosen
Rosenlaw & Einschlag, a technology law firm (www.rosenlaw.com)
3001 King Ranch Rd., Ukiah, CA 95482
Office: 707-485-1242

-----Original Message-----
From: Luis Villa [mailto:luis at tieguy.org] 
Sent: Sunday, December 09, 2012 10:47 AM
To: Karl Fogel; License Discuss
Subject: Re: [License-discuss] objective criteria for license evaluation

I'm a little surprised at how quiet this thread has been, especially since I
know some members of this list have been calling for objective criteria for
a while.

So let me restate the question to broaden it a bit. If you had a *blue-sky
dream*, what objective information would you look at?

For example, if you had the resources to scan huge numbers of code
repositories, what numbers would you look for?

* ranking by LoC under each license
* ranking by "projects" under each license
* ... ?
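
To make the two rankings above concrete, here is a minimal sketch. It
assumes a scan has already produced one (project, license, lines of code)
record per project; that record layout, the function name rank_licenses,
and the sample data are all hypothetical.

```python
from collections import Counter


def rank_licenses(records):
    """Rank licenses by total lines of code and by number of projects.

    records: iterable of (project_name, license_id, lines_of_code) tuples.
    Returns (by_loc, by_projects), each a list of license ids ordered from
    most to least; ties are broken alphabetically by license id.
    """
    loc_totals = Counter()
    project_counts = Counter()
    for _project, license_id, lines in records:
        loc_totals[license_id] += lines
        project_counts[license_id] += 1

    def ordered(counter):
        return [lic for lic, _ in
                sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))]

    return ordered(loc_totals), ordered(project_counts)


# Invented sample data, for illustration only.
SAMPLE = [
    ("proj1", "License-A", 1000),
    ("proj2", "License-B", 200),
    ("proj3", "License-B", 300),
    ("proj4", "License-C", 50),
]

by_loc, by_projects = rank_licenses(SAMPLE)
```

On the sample data the two rankings disagree about the top license: one
large project puts License-A first by LoC, while License-B leads by project
count. That gap is one reason the choice of measure matters.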

Similarly, if you could declare objective criteria for textual license
analysis and had the time/resources to read all of them, what would those
criteria be? e.g.,

* has/has not been retired by the author
* has/has not been obsoleted by a new license published by the same author
* has/doesn't have an explicit patent grant
* ... ?

These examples assume that quantitative measures of adoption, the text, and
the explicit actions of the author are the only things about a license that
can actually be measured, but I am probably thinking small; other examples
are welcome.

[As a reminder, this is not a purely theoretical exercise: I agree with many
on this list that a license process based on more objective criteria would
be a good thing, and this thread is an effort to explore that issue and
start thinking about what such a list might look like.]

Luis

On Thu, Dec 6, 2012 at 3:35 PM, Karl Fogel <kfogel at red-bean.com> wrote:
> Matthew Flaschen <matthew.flaschen at gatech.edu> writes:
>>On 12/05/2012 10:23 AM, Karl Fogel wrote:
>>> Luis Villa <luis at tieguy.org> writes:
>>>> Anyone else have other suggestions for objective criteria we could 
>>>> use? I know some folks here have been thinking about this issue for 
>>>> some time.
>>>
>>> Number of "forks" of software under a given license on GitHub, 
>>> adjusted for license popularity across GitHub?  (And the equivalent 
>>> calculation for other sites, where possible.)
>>
>>That could be misleading, depending on what we want to measure.  There 
>>are a lot of forks doing real work (either true forks, or those that 
>>do ongoing pull requests to keep synced).
>>
>>However, there are also people that fork and make one or two changes, 
>>or none at all.  There's nothing wrong with that, it just might not be 
>>a meaningful metric for this purpose.
>
> Of course.  I meant that as a direction to look in, not as a literal 
> suggestion of methodology.  By number of forks at GitHub, I meant 
> "look at the forks, using some kind of intelligent criteria, 
> statistical methods, etc".
>
> This is non-trivial work, of course.  Which is why it is so hard to 
> get good stats on license popularity and why the notion is rife with 
> fundamental definitional questions.
> _______________________________________________
> License-discuss mailing list
> License-discuss at opensource.org
> http://projects.opensource.org/cgi-bin/mailman/listinfo/license-discuss
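
One way to read the fork-count idea quoted above ("adjusted for license
popularity") is forks per repository under each license, counting only forks
that show some activity. The sketch below assumes that reading. The input
shapes, the function name, and the min_commits threshold are hypothetical
stand-ins for the "intelligent criteria" the thread leaves open.

```python
from collections import Counter


def adjusted_fork_rate(forks, repos_per_license, min_commits=3):
    """Active forks per repository, for each license.

    forks: iterable of (license_id, commits_ahead_of_parent) tuples.
    repos_per_license: dict mapping license_id to number of repositories.
    min_commits: arbitrary cutoff (an assumption) below which a fork is
        treated as inactive and ignored.
    """
    active = Counter(
        license_id for license_id, commits in forks if commits >= min_commits
    )
    return {
        license_id: active[license_id] / repo_count
        for license_id, repo_count in repos_per_license.items()
        if repo_count > 0
    }


# Invented sample data, for illustration only.
FORKS = [
    ("License-A", 10),
    ("License-A", 0),
    ("License-B", 5),
    ("License-B", 4),
]
REPOS = {"License-A": 4, "License-B": 2}

rates = adjusted_fork_rate(FORKS, REPOS)
```

Dividing by the repository count keeps a widely used license from looking
fork-heavy simply because more code exists under it.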