[License-review] Please review revised ModelGo licenses

Tue May 20 02:34:58 UTC 2025

Hi Pam and Richard,

I also agree that this topic warrants further discussion. I’ve spoken with my legal teammate and would like to share their comments on the matter below:

Response to Pam’s comments: I agree on the patent termination - isn't there at least one license that does include derivative works though? 
[R: per my earlier comments, I think we should clarify the concern around patent termination - I am still not quite sure whether we are concerned with licensors suing licensees, or licensees suing licensors (I suspect it is the former). And if we are indeed concerned with licensors suing licensees, then I do not necessarily see an issue with the original termination provision, since it “prevents the assertion of patents against downstream modifiers of the upstream licensor’s patents”. While OSI suggests that this could affect uptake, I’m not sure that we should be concerned with uptake by users who are likely to have patents (i.e., commercial parties). I do note, however, that foundations could also maintain patents, but I’m not sure if that is our target audience.]

Response to Pam’s comments: Not necessarily a bar to license approval, but I am skeptical that copyleft is a workable concept for models, particularly where the Model is used for training of new models through distillation or generating synthetic data. This is a known problem for databases and I expect it will be even more challenging for models. It can easily become unmanageable. 
[R: this is a difficult issue, and I agree with Pam that it will be challenging to actually implement copyleft for models, which in turn could expose downstream users to liability from the Licensor. But as a practical matter, how many Licensors would actually sue users for failing to comply with copyleft? Only (a small number of) commercial users would be worth suing for failing to comply with copyleft. This means that in practice, including a copyleft obligation only makes the license risky for commercial users, but probably makes no difference to other users. Further, the copyleft obligations in Section 2.2(a) only apply to Licensed Materials and Derivative Materials - I would argue that once something is sufficiently transformed or changed such that it cannot be considered Licensed Materials or Derivative Materials, copyleft no longer applies. This is especially so for models, which, unlike software, cannot be easily decomposed into their component libraries/routines/functions etc. In my view, this means that the copyleft obligation will only catch the most egregious violations/copying by commercial users (who, arguably, should know better). Overall, I would include the copyleft obligation, but I’m interested to hear OSI’s comments on my analysis.]

Response to the Model Output Provision and the Editor Hypothetical:
[R: I agree with Moming that the model output provision does not violate any of the 10 Open-Source Definition criteria. The OSI’s responses appear to be against this model output provision, mostly on the basis that it restricts the conditions of use of the output (i.e., the “editor argument”).

I think I would counter that open-source software is not free (or ‘libre’) software (in FSF parlance). To quote “Why Open Source Misses the Point of Free Software” (https://www.gnu.org/philosophy/open-source-misses-the-point.html). Further, “the criteria for open source are concerned solely with the use of the source code. Indeed, almost all the items in the Open Source Definition are formulated as conditions on the software's source license rather than on what users are free to do.”

If so, then imposing conditions on use of the output is not against the 10 OSD criteria (does not appear to be in violation of OSD6 and OSD8).

I would even say that an editor which required any file created with the editor to have an attribution notice could be open-source, but not free. Of course, in practice, such an attribution requirement is silly for editors, but as Moming pointed out, an LLM is not a code editor.

More importantly, while the principles of ‘free’ software point in favour of having no conditions on use/output, the unique nature of LLMs also means that some conditions on use/output are desirable. And although this is a value-judgment (on how AI-created content should be treated), I would say that (1) adherence to the 4 FSF freedoms is itself a value-judgment in favour of freedom (understood a certain way) in software, and (2) labelling AI-generated content is something that is compatible with, and necessary for, the 4 FSF freedoms to be desirable. I would analogise this to how even the right of free speech is limited in specific contexts. This point is more germane to the “advertisement” requirement (previously 2.4(b), which had the purpose of targeting misinformation and misleading claims), but also applies to the current Section 2.2(b) (which has the narrower purpose of ensuring sustainability in open-source LLM development).

In the most recent threads, there was also some discussion about whether copyright or patent law grants the Licensor rights/control over the output. My view is that copyright/patent law does not need to grant the Licensor rights/control over the output, since the License is imposing conditions on the use of the Model (i.e., Licensed Materials), and is making the License grant conditional on compliance with these conditions of use. I would suggest not approaching the issue from the point of view of IP rights over output (since that is a thorny issue that requires further analysis), but merely as an issue of whether imposing certain conditions on use is OSD-compliant.

Finally, I agree entirely with your response to Pam that although “attribution information may be lost over several generations... it would be unreasonable to respond to this challenge by altering data licenses to allow unrestricted reuse and removal of attribution simply for the sake of convenience or ease of crawling”. In the case of downstream reuse of the Output, the difficulty in knowing whether “the original Output is still there several generations later” in fact works in our favour, because it means that the attribution obligation is not overly onerous and will only catch the most egregious violators/copiers.

Thanks,
Moming
