Back in January, Ed Newton-Rex – a former exec at Stability AI and founder of music-making AI platform Jukedeck – launched Fairly Trained, a non-profit whose mission is to evaluate and certify AI tech based on its respect for creators’ rights.
The non-profit is something of a crusade for Newton-Rex, who resigned from Stability AI on principle after the company submitted a statement to the US Copyright Office asserting that scraping copyrighted materials without permission to train AI should be considered “fair use” under the law.
That’s a stance held by many other AI companies, including Anthropic, the company facing a copyright infringement suit from Universal Music Group over its alleged unauthorized use of copyrighted lyrics to train its Claude AI chatbot.
Newton-Rex disagrees with that stance “because one of the factors affecting whether the act of copying is fair use, according to Congress, is ‘the effect of the use upon the potential market for or value of the copyrighted work’,” he wrote in a guest column for MBW last fall.
“Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on. So I don’t see how using copyrighted works to train generative AI models of this nature can be considered fair use.”
Two months after that column was penned, Fairly Trained was up and running. Now the non-profit appears to have done considerable damage to a key claim made by many AI companies: that it’s not possible to train a large language model (LLM) without using reams and reams of copyrighted material.
Fairly Trained announced on Wednesday (March 20) that it had certified four new AI technologies as being free of copyright infringement, including a large language model: KL3M from 273 Ventures, the first LLM to be certified by the non-profit.
“One of the most frequent questions we were asked when we launched in January was whether it was realistic to think that we would be able to certify any large language models,” Fairly Trained said in a statement.
“We were optimistic that we would, as there is no fundamental reason that large language model developers can’t work in a way that respects creators’ rights. Today’s announcement answers this question, and strengthens our belief in a future in which a fair approach to training data is the norm.”
“The certification incentivizes AI companies to train on licensed data and centers human creators in the AI landscape,” The Authors Guild said. “It is an important step towards ensuring human authors have consent and receive compensation from the use of their works in AI training.”
Fairly Trained also announced it had certified Voicemod, the first company offering AI speech and singing models to receive the certification.