Microsoft is likely to be unhappy when it discovers that its archrival Amazon is leaning hard on GitHub for AI training data.
"Our LLMs are trained on data from a variety of sources, including licensed and proprietary data, open-source datasets, and publicly available data where appropriate.
It also said Amazon employees should create a "classic personal token," not a "fine-grained personal token," when signing up.
Tech companies, hungry for even more training data, are also granting themselves new permissions to use a lot more of consumers' information.
Though Amazon's legal team has approved the GitHub data scraping workaround, the move could put Amazon in a tricky position.