Ever since ChatGPT launched a year ago, we have heard the same question again and again: What's going on? Judging by the explosion of chatbots and countless "AI-powered" apps, artificial intelligence (AI) is obviously going to change everything, or at least something. And yet even the experts in AI are left with a disorienting sense that, for all the talk of its revolutionary potential, so much about this technology remains shrouded in mystery.
It's more than a feeling. Because corporations are so opaque about what their AI models can do and how they are developed, a technology that was once built through open research has become all but hidden from view. Secrecy is troubling, and yet transparency is not legally required: it was only revealed earlier this year, for instance, that Meta and others had trained their AI models on nearly 200,000 books without the authors' permission or compensation.
We can now quantify just how severe AI's secrecy problem is. Stanford University's Center for Research on Foundation Models has introduced a new transparency index that rates the openness of ten major AI companies, including OpenAI, Google, and Anthropic. The researchers graded each company's flagship model on whether it publicly discloses 100 distinct pieces of information, such as the data used for training, the compensation of the data and content-moderation workers involved in the model's development, and the situations in which the model should not be used. Each disclosure was worth one point. The ten companies averaged a score of 37, and even the highest scorer barely exceeded 50 out of a possible 100. In other words, every company earns a dismal F.
Consider OpenAI, whose very name signals a commitment to openness. Its flagship model, GPT-4, scored 48, losing crucial points for not disclosing the data it was fed, how it handled personally identifiable information that may have been swept up in that data, and how much energy was used to create the model. Even Meta, which has made a point of letting users download and modify its model, scored just 54. Deborah Raji, an AI-accountability researcher at UC Berkeley who was not involved in the study, offered one way to think about it: you are handed a baked cake, and you can add decorations or extra layers to it, but you never get the recipe for what actually went into it.
Many companies, including Anthropic and OpenAI, have maintained that they withhold such information to preserve a competitive edge, to prevent their technology from being dangerously misused, or both. A representative for Amazon said the company is eager to examine the index closely. Margaret Mitchell, a researcher and the chief ethics scientist at Hugging Face, said the index misrepresents BLOOMZ as the company's model; it was in fact created by the BigScience project, an international research collaboration that Hugging Face co-organized.
In choosing the 100 criteria, the Stanford researchers drew on years of prior AI research and policy work, focusing on the inputs to each model, the details of the model itself, and the downstream effects of the final product. For instance, to support its position that companies should disclose whether they directly employ data workers and what labour protections they provide, the index cites academic and journalistic investigations into the low pay of the workers who help refine AI models. Rishi Bommasani and Kevin Klyman, the index's primary creators, told me that in building it they tried to consider which disclosures would be most useful to a range of groups: consumers deciding whether to use a model in a given situation, policy makers drafting regulations around AI, and scientists conducting independent research on these models.
Beyond assessing particular models, the index also highlights information gaps across the industry. None of the companies evaluated disclose whether their training data contains copyrighted material or is subject to other restrictions on use. Nor do any of them provide enough detail about the writers, artists, and other people whose creations were taken and used as training material. Most companies also keep quiet about their models' flaws, such as ingrained biases or how often they fabricate information.
That every company performs so poorly is an indictment of the industry as a whole. Amba Kak, the executive director of the AI Now Institute, told me that the index in fact does not set a high enough bar: the opacity in the industry is so deep-rooted and ubiquitous, she said, that even 100 criteria cannot fully expose the problems. And Raji told me that without full disclosures from companies, the public gets only part of the story about these technologies, and the part that does get told is nearly always flattering.
In 2019, Raji co-authored a paper demonstrating that various facial-recognition products, including some sold to law enforcement, performed poorly on women and people of colour. The study made clear the risk of police relying on substandard technology. As of August, six cases have been reported of police in the United States falsely accusing people of crimes based on faulty facial recognition; all of the accused are Black. These new AI models carry comparable risks, Raji said. By withholding from policy makers and independent researchers the information they need to audit and validate corporate claims, AI companies can easily exaggerate their capabilities in ways that lead users or third-party app developers to deploy subpar or defective technology in critical contexts such as criminal justice and health care.
The industry-wide opacity has rarely been broken. One model not included in the index is BLOOM, a relative of BLOOMZ (though not the same model) that was likewise created by the BigScience project. The BLOOM researchers documented the training data's creators, copyright status, personally identifiable information, and source licenses, and they conducted one of the few existing studies of the broader environmental impact of large-scale AI models. This is proof that such openness is achievable. But changing industry norms, Kak argued, will require regulatory mandates. We cannot rely on the public and researchers to piece together this informational map, she said.
Perhaps the most telling finding, and the biggest deal-breaker, is that every company discloses particularly little against the "impact" criteria: how many users a product has, which applications are being built on the technology, and where geographically those technologies are being deployed. That makes it far harder for regulators to track, let alone hold accountable, each company's sphere of influence and control. It is harder for consumers, too: you might not even be aware that OpenAI technology is supporting your family doctor, your child's teacher, or your office productivity tools. Put another way, we don't know enough about these technologies even to know how much we rely on them.
Of course, secrecy is nothing new in Silicon Valley. Almost ten years ago, the tech and law scholar Frank Pasquale introduced the term "black-box society" to describe how digital platforms were growing ever more opaque as they consolidated their dominance over people's lives. Important decisions are being made out of our sight, he wrote, and secrecy is approaching a critical mass. Despite the many cautionary tales from social media and other AI technologies, many people have grown accustomed to using black boxes. Silicon Valley spent years building a new, murky standard, and it has simply come to be seen as normal.