[License-review] [off-topic] OSAID requires the training data (was Re: ModelGo Zero License, Version 2.0)
Stefano Maffulli
stefano at opensource.org
Fri Feb 14 10:15:48 UTC 2025
On Fri, Feb 14, 2025 at 3:57 AM Eric Schultz <eric at wwahammy.com> wrote:
> PS: While the Open Source AI definition says you don't have to include the
> source data to be an "Open Source AI", I would disagree with that
> conclusion. But that's my own two cents.
>
This is a common misunderstanding that I'd like to correct for the record.
If you read carefully, the OSAID clearly states that the training data *is*
required. That's what these sentences in the Definition mean:
"Sufficiently detailed information about the data used to train the system
so that a skilled person can build a substantially equivalent system." and
"The Code shall represent the full specification of how the data was
processed and filtered, and how the training was done."
In hindsight, could the text have been clearer? We tried to draft a text
that would cover all kinds of machine learning systems, not just LLMs.
We'll fix the text in later versions.
The FAQ has more details; you may start here:
https://opensource.org/ai/faq#what-kind-of-data-should-be-required-in-the-open-source-ai-definition
/stef
PS: This conversation is off-topic for this list. Please continue it on
https://discuss.opensource.org, if you're interested.