<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Hi <span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Carlo and </span>Pam,<div><br></div><div><br></div><div>Thanks for your valuable insights. I will discuss them with our teammates. Here are three points I’d like to clarify.</div><div><br></div><div><b>First</b>, I cannot agree that distilled content is similar in meaning to pre-training data. <span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">In fact, human-generated data is expected to be exhausted soon (https://epoch.ai/blog/will-we-run-out-of-data-limits-of-llm-scaling-based-on-human-generated-data), effectively setting an upper limit on the amount of available pre-training data</span>. However, the main reason developers perform distillation is not just data scarcity; it is that model-generated data can significantly improve the robustness and generalization ability of new models. For example, distilled data from DeepSeek R1 has been shown to enhance reasoning performance.
This kind of “generalization ability” is precisely what people aim to transfer.</div><div><br></div><div><strong data-start="131" data-end="141" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Second</strong><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">, I understand that after multiple generations, attribution information may be lost. 
</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">However, </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">the </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">intention </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">behind </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">the </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">output </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">provision </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">is </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">simply </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">to </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">remind </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">sub-</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">users </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">at </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">the </span><em data-start="314" data-end="332" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"><span class="_fadeIn_m1hgl_8">first </span><span class="_fadeIn_m1hgl_8">generation</span></em><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"> </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">to </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">respect </span><span class="_fadeIn_m1hgl_8" 
style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">attribution. </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">For </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">example, </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">if </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">a </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">model </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">owner </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">releases </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">their </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">model </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">under </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">MG-</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">BY, </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">and </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">someone (</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">possibly </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">even </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">the </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">model </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">owner) </span><span class="_fadeIn_m1hgl_8" 
style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">distills </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">outputs </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">from </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">it </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">into </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">a </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">dataset </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">and </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">publishes </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">that </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">dataset </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">without </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">any </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">attribution </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">notice, </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">then </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">a </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">downstream </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">user </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">who </span><span class="_fadeIn_m1hgl_8" 
style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">trains </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">a </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">new </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">model </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">on </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">that </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">dataset </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">may </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">unknowingly </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">breach </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">the </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">MG-</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">BY </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">license. 
</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Even if their use does </span><em data-start="686" data-end="691" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">not</em><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"> result in a derivative model, it still undermines the licensor’s original intention in choosing MG-BY. In such a case, distilling and republishing as a dataset effectively bypasses the conditions set by the ModelGo </span><span 
class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">License. From an ML perspective, a single generated artwork does not constitute a dataset. Some distilled datasets can be found at: https://huggingface.co/datasets?sort=trending&search=distill</span></div><div><br></div><div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Lastly, </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">I </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">still </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">have </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">not </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">identified </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">which </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">specific </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">clause </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">of OSD</span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"> </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">this </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">output </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">provision </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">directly </span><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">violates.</span></div><div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 
0);"><br></span></div><div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);"><br></span></div><div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Best,</span></div><div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">Moming</span></div><div><br id="lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On 18 May 2025, at 5:51 AM, Pamela Chestek <pamela@chesteklegal.com> wrote:</div><br class="Apple-interchange-newline"><div>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<div><p>Hi Moming,</p><p>I realize you're not trying to impose any license on the use, but
you are imposing an obligation that runs with the output, which has
never been acceptable in an open source license. I also realize
that this is new territory, and just because it wasn't done for
software doesn't mean it can't be done for model output, but it is
something that needs to be thoughtfully considered.</p><p>What is the justification for it? Why is attribution to the
original model something important enough that it has to be said?
Is it because so much work went into training the model? The
attribution for software is, I suspect, a nod to the concept of
attribution of authors in copyrighted works that exists in some
countries. But is that rationale appropriate for models, where
there is likely no copyrightable authorship in the output?<br>
</p><p>I am most concerned about the implications for individual works.
As I mentioned in my original email, the words "collection" and
"dataset" suggest your intention may have been to limit the duty
to downstream models, not generated works, but that is not at all
clear in the license. If I generate a single artistic work from a
model under this license, do I have to provide attribution
information on my Output? Caution would suggest that is the case,
which I think is quite problematic.<br>
</p><p>I am troubled by your statement:</p><div>
<br class="webkit-block-placeholder"></div><blockquote type="cite">
<div>I recognize that attribution information may be lost over
several generations, just as licensing information is often
lost when data is crawled from the web and later used to train
models. However, it would be unreasonable to respond to this
challenge by altering data licenses to allow unrestricted
reuse and removal of attribution simply for the sake of
convenience or ease of crawling. </div>
</blockquote>
Licensing information <i>shouldn't</i> be lost in the licensing
of software, and a great deal of effort goes into making sure that
it isn't. To say that "oh, we know that you'll be out of
compliance with the license at some point and we're cool with
that" isn't how contracts do or should work. Most people try to
abide by their legal obligations and will try to comply, so they
will be heavily burdened by this requirement because it will be
impossible to figure out after only a generation or two. And you
may be cool with it, but it is a way for someone less forgiving
than you to opportunistically claim a breach of the license,
putting users at risk of expensive lawsuits. <br><div><br class="webkit-block-placeholder"></div><p>So this obligation puts a lot of burden on users, and I am
looking for a reason why it's justified.</p><p>Pam<br>
</p>
<div class="moz-signature">Pamela S. Chestek<br>
Chestek Legal<br>
PLEASE NOTE OUR NEW MAILING ADDRESS<br>
4641 Post St.<br>
Unit 4316<br>
El Dorado Hills, CA 95762<br>
+1 919-800-8033<br>
pamela@chesteklegal<br>
<a class="moz-txt-link-abbreviated" href="http://www.chesteklegal.com/">www.chesteklegal.com</a><br>
<br>
<br>
</div>
<div class="moz-cite-prefix">On 5/15/2025 7:14 AM, Moming Duan
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CD03810C-99F8-4020-9508-DE1288CB46BB@gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
Hi Pam,
<div><br>
</div>
<div><br>
</div>
<div>ModelGo Licenses (MG-0, MG-BY, and MG-BY-OS) clearly grant
the right to create Derivative Materials, including new models
developed via techniques such as distillation. As stated in
Section 2.2(b), attribution is not required for internal use of
generated content; the obligation only applies when generated
datasets are Distributed.</div>
<div>This is a lightweight, attribution-style requirement that is
easy to comply with, for example, by including proper credit in
the dataset README, as commonly seen on: <a href="https://huggingface.co/datasets" moz-do-not-send="true" class="moz-txt-link-freetext">https://huggingface.co/datasets</a></div>
<div>Importantly, this does not mean that the generated dataset
must adopt the same license as the original model. </div>
<div>I recognize that attribution information may be lost over
several generations, just as licensing information is often lost
when data is crawled from the web and later used to train
models. However, it would be unreasonable to respond to this
challenge by altering data licenses to allow unrestricted reuse
and removal of attribution simply for the sake of convenience or
ease of crawling. </div>
<div><br>
</div>
<div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0);">The question of who owns generated content remains a legal issue yet to be fully resolved.
But if model-generated content (at least when collected in significant quantities) didn’t contain knowledge or reasoning patterns akin to “source code,” why would there be such widespread enthusiasm for model distillation?
We don’t see people transferring knowledge from books they wrote in Word into plain text with the same motivation.
A more fitting analogy is users copying code from one repository to another; that, to me, better captures what’s happening. This is my personal opinion.</span></div>
<div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0);"><br>
</span></div>
<div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0);"><br>
</span></div>
<div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0);">Best,</span></div>
<div><span class="_fadeIn_m1hgl_8" style="caret-color: rgb(0, 0, 0);">Moming</span></div>
<div><br>
</div>
<div><br id="lineBreakAtBeginningOfMessage">
<div><br>
<blockquote type="cite">
<div>On 15 May 2025, at 11:38 AM, Pamela Chestek
<a class="moz-txt-link-rfc2396E" href="mailto:pamela@chesteklegal.com"><pamela@chesteklegal.com></a> wrote:</div>
<br class="Apple-interchange-newline">
<div>
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8">
<div><p>This appears to be an attempt at making it a
restriction for distillation or synthetic data
generation, not, for example, an individual work ("a <i><b>collection
</b></i>of Output <i><b>as a dataset</b></i>"), and
I don't doubt that it's well-intended, but I agree
that the limitation on Output is inconsistent with
open source principles. It also seems unworkable as
the original Output is further reused downstream. How
would one know if the original Output is still there
several generations later?<br>
</p><p>Pam<br>
</p>
<div class="moz-signature">Pamela S. Chestek<br>
Chestek Legal<br>
4641 Post St.<br>
Unit 4316<br>
El Dorado Hills, CA 95762<br>
+1 919-800-8033<br>
pamela@chesteklegal<br>
<a class="moz-txt-link-abbreviated" href="http://www.chesteklegal.com/" moz-do-not-send="true">www.chesteklegal.com</a><br>
<br>
<br>
</div>
<div class="moz-cite-prefix">On 5/14/2025 7:09 AM, Carlo
Piana wrote:<br>
</div>
<blockquote type="cite" cite="mid:1375873151.44917731.1747231771304.JavaMail.zimbra@piana.eu">
<pre class="moz-quote-pre" wrap="">Josh,
sorry for long silence.
I think that the new version of the ModelGo license does not seem to
address the concern I have expressed against it, following up on your own
comment on output (now in 2.bb). I think that imposing anything on the
output of the model is against the OSD as it is a restriction on the use
of the licensed subject matter.
So no, I am confused at how this new text should be addressing the above
concern.
In a separate thread I have expressed perplexity on certain clauses, these
seem to have been removed, so no issue on that end.
This applies to the updated versions.
Cheers
Carlo
----- Original message -----
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">From: "Josh Berkus" <a class="moz-txt-link-rfc2396E" href="mailto:josh.berkus@opensource.org" moz-do-not-send="true"><josh.berkus@opensource.org></a>
To: "License submissions for OSI review" <<a class="moz-txt-link-abbreviated moz-txt-link-freetext" href="mailto:license-review@lists.opensource.org" moz-do-not-send="true">license-review@lists.opensource.org</a>>
Sent: Tuesday, 15 April 2025 1:46:45
Subject: [License-review] Please review revised ModelGo licenses
</pre>
</blockquote>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Carlo, Pam, Eric, Shuji,

Moming has re-submitted revised versions of his licenses based on your
feedback. Please check them when you can and make sure that your
concerns about the licenses have been addressed.

--
-- Josh Berkus
OSI Board Member

_______________________________________________
The opinions expressed in this email are those of the sender and not necessarily
those of the Open Source Initiative. Communication from the Open Source
Initiative will be sent from an opensource.org email address.

License-review mailing list
<a class="moz-txt-link-abbreviated moz-txt-link-freetext" href="mailto:License-review@lists.opensource.org" moz-do-not-send="true">License-review@lists.opensource.org</a>
<a class="moz-txt-link-freetext" href="http://lists.opensource.org/mailman/listinfo/license-review_lists.opensource.org" moz-do-not-send="true">http://lists.opensource.org/mailman/listinfo/license-review_lists.opensource.org</a>
</pre>
</blockquote>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
</div>
</div></blockquote></div><br></div></body></html>