Open source software – where developers release a product’s source code and allow anyone else to reuse and remix it to their liking – is the foundation of Google’s Android, Apple’s iOS and all four major web browsers. Encryption of WhatsApp chats, compression of Spotify streams, and formatting of saved screenshots are all controlled by open source code.
Although the open source movement has its roots in the post-hippie utopianism of 1980s California, it still thrives today, in part because its ethos is not entirely altruistic. Giving software away for free allows developers to get help making their code more powerful; prove its trustworthiness; win the plaudits of their peers; and in some cases, make money by providing support to people who use the product for free.
Several model makers in the artificial intelligence (AI) space, including social media giant Meta, are invoking this open source tradition as they develop their own powerful products, hoping to rally hobbyists and startups into a force that rivals billion-dollar labs while burnishing their own reputations.
Unfortunately for them, guidance released last week by the Open Source Initiative (OSI), a US non-profit, suggests that the tech giants have stretched the term to the point of meaninglessness. The OSI argues that these free products are hemmed in by restrictions and developed in secret, and will never drive a real wave of innovation unless something changes. It is the latest salvo in a heated debate: what does open source actually mean in the age of artificial intelligence?
In traditional software, the term is clearly defined. Developers publish the original lines of code in which the software was written. Crucially, in doing so they give up most of their rights: any other developer can download the code and adapt it to their own purposes. Often the original developer attaches a so-called “copyleft” license, requiring that modified versions be shared in turn.
In a nod to this tradition, US technology giant Meta proudly declares its large language model (LLM) Llama 3 to be “open source”, sharing the finished product for free with anyone who wants to build on it. But it imposes restrictions, including a ban on using the model to build products with more than 700 million monthly active users.
What Meta freely shares – the connection weights between the artificial neurons in its LLM, rather than all the source code and data that made it – is certainly not enough for someone to build their own version of Llama 3 from scratch, as open source purists typically demand. That is because training an AI model is very different from conventional software development. Engineers gather data and draw up a rough blueprint of the model, but the system then assembles itself, churning through the training data and updating its own structure until its performance is acceptable.
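To make that concrete, consider a minimal, purely illustrative sketch in Python (emphatically not Meta’s code): the engineer writes only the blueprint and training loop below, while the numbers that define the finished model, its weights, emerge from the data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# The blueprint: a tiny linear model whose weights begin as random noise.
weights = rng.normal(size=4)

# Toy training data: inputs paired with targets produced by a hidden rule.
X = rng.normal(size=(256, 4))
true_weights = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_weights

# The training loop: each step nudges the weights toward lower error.
for step in range(500):
    error = X @ weights - y
    gradient = 2 * X.T @ error / len(X)
    weights -= 0.1 * gradient  # the data, not the engineer, sets these values

print(weights)  # close to [1.0, -2.0, 0.5, 3.0]: learned, never hand-written
```

What Meta publishes is, in effect, the final weights array at the end of such a loop, scaled up to tens of billions of numbers; the data and much of the code that produced it stay in-house.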
And because each training step adjusts the model in fundamentally unpredictable ways, converging on a workable solution only over time, a model trained with the same data, the same code, and the same hardware as Llama 3 would end up very similar to the original, yet not identical to it. That eliminates some of the purported benefits of the open source approach: inspect all the code you want, but you can never be sure that what you are running is the same as what the company offers.
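One root cause is easy to demonstrate. Floating-point arithmetic is not associative, so when parallel hardware sums billions of gradient contributions in whatever order its threads happen to finish, each run picks up tiny rounding differences that compound from step to step. The sketch below (again illustrative, assuming nothing about any particular lab’s pipeline) shows the effect in miniature:

```python
import random

rng = random.Random(42)

# The same 100,000 gradient contributions...
grads = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]

# ...summed in two different orders, as parallel hardware is free to do
# from one run to the next.
in_order = sum(grads)
shuffled = grads[:]
rng.shuffle(shuffled)
out_of_order = sum(shuffled)

print(in_order == out_of_order)      # typically False
print(abs(in_order - out_of_order))  # a tiny, nonzero discrepancy
```

A discrepancy of one part in a trillion sounds harmless, but each training step feeds its output into the next, so over billions of operations two nominally identical runs drift into models that behave alike without matching bit for bit.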
Other obstacles stand in the way of truly open source AI. Training a “cutting-edge” model to go head-to-head with the latest offerings from OpenAI and its peers would cost at least $1 billion, a sum that leaves those who spend it with little appetite for letting others profit from the result. Safety is another concern: in the wrong hands, the most powerful models could teach users to build biological weapons or churn out unlimited child-abuse imagery. Locking models behind tightly restricted access points lets AI labs control what they can be asked to do and decide who gets to ask.
Open and shut
The complexity of the technology has fueled debate over what “open source artificial intelligence” actually means, said Rob Sherman, Meta’s vice president of policy.
In a recent report, the OSI went to great lengths to define the term. It holds that, to earn the label, AI systems must offer “four freedoms”: they should be free to use, study, modify and share. Notably, a model’s maker need not release the training data itself, only enough detail about it to allow the construction of “substantially equivalent” systems. Sharing all of a model’s training material is not always desirable anyway: it would, for example, prevent the creation of open source medical AI tools, since health records belong to patients and cannot be shared without their permission.
For those building on top of Llama 3, the question of whether it can be labeled open source matters less, since no other major lab is as generous as Meta. Vincent Weisser, founder of Prime Intellect, an AI lab based in San Francisco, would prefer the model to be “completely open in every dimension”, but still believes Meta’s approach will have a long-term positive impact by driving down the cost of access. Enthusiasts have already compressed the model enough to run on mobile phones; and its reported repurposing for military ends as part of a Chinese army project shows that its downsides are more than theoretical.
Not everyone is so eager to adopt it. Ben Maling, a patent expert at EIP, a London law firm, notes that, legally speaking, using truly open source software should be “frictionless”. That is not yet true of AI: companies such as Getty Images and Adobe have sworn off using certain AI products for fear of legal risk, and others may follow suit.
A precise definition of open source AI will have wide-ranging implications. Just as vineyards live or die by whether they can call their produce champagne rather than mere sparkling wine, the open source label could be crucial to the future of tech companies. Mark Surman, president of the Mozilla Foundation, an open source body, says that countries lacking a homegrown AI superpower may want to back an open source industry as a check on American dominance. The EU’s Artificial Intelligence Act, for example, already contains carve-outs that relax testing requirements for open source models; other regulators around the world may follow. As governments seek to establish tight controls on how AI is built and operated, they will be forced to decide: do they want to squeeze bedroom tinkerers out of the field, or to free them from costly burdens?
For now, the closed labs can afford to be sanguine. Even Llama 3, the most capable of the nearly open source contenders, has been playing catch-up with models released by OpenAI, Anthropic and Google. A senior director at one major lab told The Economist that the economics involved make this inevitable. Releasing a powerful model free of charge lets Meta undercut its competitors’ businesses without putting its own at risk, but the lack of direct revenue also limits its appetite for the spending needed to be a leader rather than a fast follower. Freedom is rarely truly free.
© 2024, The Economist Newspapers Limited. From The Economist, published with permission. Original content can be found at www.economist.com