[License-review] Please review revised ModelGo licenses
Moming Duan
duanmoming at gmail.com
Sun May 25 05:55:12 UTC 2025
Hi Pam,
Thank you for contributing your insights. I realize that our output provision may not be effective or function as intended, as I have not clearly defined the “knowledge” embedded in the output. While AI-generated outputs can indeed be used to improve other models, which might justify placing conditions on such behavior, there is little benefit in doing so if the corresponding clause is highly ambiguous.
I will continue to investigate whether it's possible to distinguish between mere outputs and outputs that embed knowledge. Otherwise, we may consider removing this provision and leaving the concern for later consideration.
Best,
Moming
> On 25 May 2025, at 7:23 AM, Pamela Chestek <pamela at chesteklegal.com> wrote:
>
>
> On 5/19/2025 7:34 PM, Moming Duan wrote:
>> Hi Pam and Richard,
> <snip>
>>
>>> Response to Pam’s comments: Not necessarily a bar to license approval, but I am skeptical that copyleft is a workable concept for models, particularly where the Model is used for training of new models through distillation or generating synthetic data. This is a known problem for databases and I expect it will be even more challenging for models. It can easily become unmanageable.
>> [R: this is a difficult issue, and I agree with Pam that it will be challenging to actually implement copyleft for models, which in turn could expose downstream users to liability from the Licensor. But as a practical matter, how many Licensors would actually sue users for failing to comply with copyleft? Only (a small number of) commercial users would be worth suing for failing to comply with copyleft. This means that in practice, including a copyleft obligation only makes the license risky for commercial users, but probably makes no difference to other users. Further, the copyleft obligations in Section 2.2(a) only apply to Licensed Materials and Derivative Materials - I would argue that once something is sufficiently transformed or changed such that it cannot be considered Licensed Materials or Derivative Materials, copyleft no longer applies. This is especially so for models, which, unlike software, cannot be easily decomposed into its component libraries/routines/functions etc. In my view, this means that the copyleft obligation will only catch the most egregious violations/copying by commercial users (who, arguably, should know better). Overall, I would include the copyleft obligation, but I’m interested to hear OSI’s comments on my analysis.]
> I continue to be troubled by the position that it's ok to put in a provision because it will be ignored. We want people to do their best to comply with licenses, which means that the license should be as clear as possible and it should be possible to comply with the license requirements. It's harmful from policy and compliance standpoints to set up a situation where you expect and ratify people not holding up their end of the bargain.
>
> You say "I would argue that once something is sufficiently transformed or changed such that it cannot be considered Licensed Materials or Derivative Materials, copyleft no longer applies. This is especially so for models, which, unlike software, cannot be easily decomposed into its component libraries/routines/functions etc. In my view, this means that the copyleft obligation will only catch the most egregious violations/copying by commercial users (who, arguably, should know better)."
>
> This is exactly the problem -- when copyleft will apply is subject to interpretation, and even perhaps entirely unknowable. This is perfectly setting up a troll, which you seem to find acceptable because it will only catch commercial companies - but why are you so dismissive of commercial companies? I might go so far as to say that a provision that is designed to trap some categories of users, but not others, is a violation of OSD 6, No Discrimination Against Field of Endeavor. You know who else it harms? Individuals who want to do the right thing and so will not use the model because the license compliance obligations are too ambiguous.
>
>>
>>
>>
>>> Response to the Model Output Provision and the Editor Hypothetical:
>>
>> [R: I agree with Moming that the model output provision does not violate any of the 10 Open-Source Definition criteria. The OSI’s responses appear to be against this model output provision, mostly on the basis that it restricts the conditions of use of the output (ie the “editor argument”).
>>
>> I think I would counter that open-source software is not free (or ‘libre’) software (in FSF parlance). To quote, "Why Open Source Misses the Point of Free Software" https://www.gnu.org/philosophy/open-source-misses-the-point.html . Further, "the criteria for open source are concerned solely with the use of the source code. Indeed, almost all the items in the Open Source Definition are formulated as conditions on the software's source license rather than on what users are free to do.”
>>
>> If so, then imposing conditions on use of the output is not against the 10 OSD criteria (does not appear to be in violation of OSD6 and OSD8).
> I agree with Carlo's email on this point. We look at it holistically too, particularly where it's a license for new subject matter, an AI model.
>>
>> I would even say that an editor which required any file created with the editor to have an attribution notice could be open-source, but not free. Of course, in practice, such an attribution requirement is silly for editors, but as Moming pointed out, an LLM is not a code editor.
>>
>> More importantly, while the principles of ‘free’ software point in favour of having no conditions on use/output, the unique nature of LLMs also means that some conditions on use/output are desirable. And although this is a value-judgment (on how AI-created content should be treated), I would say that (1) adherence to the 4 FSF freedoms is itself a value-judgment in favour of freedom (understood a certain way) in software, and (2) labelling AI-generated content is something that is compatible with, and necessary for, the 4 FSF freedoms to be desirable. I would analogise this to how even the right of free speech is limited in specific contexts. This point is more germane to the “advertisement” requirement (previously 2.4(b), which had the purpose of targeting misinformation and misleading claims), but also applies to the current Section 2.2(b) (which has the narrower purpose of ensuring sustainability in open-source LLM development).
> As you correctly state, "the principles of ‘free’ software point in favour of having no conditions on use/output." For the few conditions that exist, there was indeed a value judgment made that the benefit gained was worth the impairment. I don't see anything in your answer that explains what benefit there is to your proposed impairment on output, much less that it's significant enough that a new condition should be allowed. What is its purpose?
>>
>> In the most recent threads, there was also some discussion about whether copyright or patent law grants the Licensor rights/control over the output. My view is that copyright/patent law does not need to grant the Licensor rights/control over the output, since the License is imposing conditions on the use of the Model (ie Licensed Materials), and is making the License grant conditional on compliance with these conditions of use. I would suggest not approaching the issue from the point of view of IP rights over output (since that is a thorny issue - requires further analysis), but merely as an issue of whether imposing certain conditions on use is OSD-compliant.
> As mentioned, we don't allow any conditions lightly and I haven't seen an adequate justification for this one.
>>
>> Finally, I agree entirely with your response to Pam that although "attribution information may be lost over several generations... it would be unreasonable to respond to this challenge by altering data licenses to allow unrestricted reuse and removal of attribution simply for the sake of convenience or ease of crawling”. In the case of downstream reuse of the Output, the difficulty in knowing whether “the original Output is still there several generations later”, in fact works in our favour, because it means that the attribution obligation is not overly onerous and will only catch the most egregious violators/copiers.
> As explained above, this is a highly problematic outlook on licenses.
>
> Pam
>
>
> Pamela S. Chestek
> Chestek Legal
> PLEASE NOTE OUR NEW MAILING ADDRESS
> 4641 Post St.
> Unit 4316
> El Dorado Hills, CA 95762
> +1 919-800-8033
> pamela at chesteklegal
> www.chesteklegal.com <http://www.chesteklegal.com/>
> The opinions expressed in this email are those of the sender and not necessarily those of the Open Source Initiative. Communication from the Open Source Initiative will be sent from an opensource.org email address.
>
> License-review mailing list
> License-review at lists.opensource.org
> http://lists.opensource.org/mailman/listinfo/license-review_lists.opensource.org