How AI is learning to improve itself

For this reason, Mirhoseini has been using AI to optimize AI chips. Back in 2021, she and her Google colleagues developed a non-LLM AI system that could determine the best placement of components on a computer chip to maximize performance.
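At its core, that system treated placement as an optimization problem: propose a layout, score it, and keep improving. The toy sketch below illustrates only that general shape, not Google's actual method, which used reinforcement learning. The block names, connections, wirelength objective, and random-search strategy are all invented for the illustration.

```python
import random

# Toy netlist: blocks and the pairs of blocks that must be wired together.
# These names and connections are made up for illustration.
BLOCKS = ["alu", "cache", "io", "dsp"]
NETS = [("alu", "cache"), ("alu", "dsp"), ("cache", "io")]
GRID = 8  # place blocks on an 8x8 grid

def wirelength(placement):
    """Total Manhattan distance across all nets (lower is better)."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in NETS
    )

def random_placement():
    cells = random.sample([(x, y) for x in range(GRID) for y in range(GRID)], len(BLOCKS))
    return dict(zip(BLOCKS, cells))

# Random search: keep the best layout seen so far. Google's system instead
# trained a reinforcement-learning policy to propose placements.
best = random_placement()
for _ in range(10_000):
    candidate = random_placement()
    if wirelength(candidate) < wirelength(best):
        best = candidate

print(wirelength(best), best)
```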

Although some other researchers were unable to replicate its findings, Mirhoseini says that Nature investigated the study and upheld its validity. She also notes that Google has used the system’s designs across several generations of its own AI chips.


More recently, Mirhoseini has applied LLMs to the problem of writing kernels, the low-level functions that control how operations such as matrix multiplication are carried out on chips. She has found that even general-purpose LLMs can, in some cases, write kernels that run faster than human-designed versions.
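A generated kernel is only useful if it is both correct and fast, so a natural workflow is to check any candidate against a trusted baseline before timing it. The harness below is a simplified sketch of that idea: `candidate_matmul` is a stand-in for LLM-generated code (in practice, compiled GPU code rather than a NumPy call), and the matrix sizes and tolerances are arbitrary.

```python
import time
import numpy as np

def reference_matmul(a, b):
    """Trusted baseline implementation."""
    return a @ b

def candidate_matmul(a, b):
    """Stand-in for an LLM-generated kernel; a real pipeline would
    compile and load generated code (e.g., CUDA) here."""
    return np.einsum("ij,jk->ik", a, b)

def benchmark(kernel, a, b, reps=10):
    kernel(a, b)  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(reps):
        kernel(a, b)
    return (time.perf_counter() - start) / reps

rng = np.random.default_rng(0)
a, b = rng.standard_normal((256, 256)), rng.standard_normal((256, 256))

# Correctness gate first: a fast-but-wrong kernel is useless.
assert np.allclose(candidate_matmul(a, b), reference_matmul(a, b), atol=1e-6)

print("baseline :", benchmark(reference_matmul, a, b))
print("candidate:", benchmark(candidate_matmul, a, b))
```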

Elsewhere at Google, scientists built a system called AlphaEvolve, which they used to optimize several parts of the company’s LLM infrastructure. AlphaEvolve prompts Google’s Gemini LLM to write algorithms for solving a given problem, evaluates those algorithms, and asks Gemini to improve the most successful ones, repeating that cycle many times over. AlphaEvolve devised a new way of running datacenters that saved 0.7% of Google’s compute resources, made further improvements to Google’s custom chip design, and created a new kernel that sped up Gemini’s training by 1%.
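The loop AlphaEvolve runs is a propose-evaluate-refine cycle. The toy sketch below mimics only that shape: a random mutator stands in where the real system calls Gemini, and the task (fitting a small arithmetic expression to data) replaces production code. Every name and parameter here is illustrative, not Google's implementation.

```python
import random

# Evolve a Python expression that approximates f(x) = x**2 + x on sample
# points. A real system would ask an LLM to draft and refine candidates;
# here a random mutator plays that role so the sketch runs on its own.

TERMS = ["x", "x*x", "1", "2*x", "x*x*x"]
SAMPLES = [(x, x * x + x) for x in range(-5, 6)]

def propose(parent=None):
    """Stand-in for an LLM call: mutate a parent or draft from scratch."""
    base = parent.split(" + ") if parent else []
    if base and random.random() < 0.5:
        base.pop(random.randrange(len(base)))  # drop a term
    else:
        base.append(random.choice(TERMS))      # add a term
    return " + ".join(base) or "0"

def score(expr):
    """Negative squared error over the samples (higher is better)."""
    return -sum((eval(expr, {"x": x}) - y) ** 2 for x, y in SAMPLES)

pool = [propose() for _ in range(8)]
for _ in range(40):
    best = sorted(pool, key=score, reverse=True)[:2]            # keep the fittest
    pool = best + [propose(p) for p in best for _ in range(3)]  # refine them

print(max(pool, key=score))  # typically converges to something equivalent to x*x + x
```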

Automating training

Because LLMs are notoriously data-hungry, every stage of training them is expensive. In some domains, such as unusual programming languages, real-world data is too scarce for LLMs to be trained effectively. Reinforcement learning from human feedback, a technique in which humans score LLM responses to prompts and the models are then trained on those scores, has been key to creating models that behave in line with human norms and preferences. Gathering that human feedback, however, is slow and costly.
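At the heart of that technique is a reward model fitted to human preference judgments: shown two responses, it should score the one humans chose more highly. The toy below fits such a model with a pairwise logistic (Bradley-Terry) loss on simulated preference data; the feature vectors and dataset are synthetic stand-ins for a real neural reward model over text.

```python
import numpy as np

# Toy reward model of the kind used in RLHF: humans pick the better of two
# responses, and we fit r(x) = w . x so chosen responses score higher.

rng = np.random.default_rng(0)
dim = 4
true_w = np.array([1.0, -2.0, 0.5, 0.0])  # hidden "human preference" direction

# Simulated dataset: pairs of response feature vectors, plus a noisy label
# for which response the human preferred.
a, b = rng.standard_normal((500, dim)), rng.standard_normal((500, dim))
prefers_a = (a - b) @ true_w + 0.1 * rng.standard_normal(500) > 0

w = np.zeros(dim)
for _ in range(2000):  # plain gradient ascent on the pairwise log-likelihood
    margin = np.where(prefers_a[:, None], a - b, b - a)  # chosen minus rejected
    p = 1 / (1 + np.exp(-(margin @ w)))                  # P(model agrees with human)
    w += 0.1 * (margin * (1 - p)[:, None]).mean(axis=0)

print(np.round(w, 2))  # recovers the direction of true_w (up to scale)
```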


Increasingly, LLMs are being used to fill the gaps. Given plenty of examples, LLMs can generate plausible synthetic data in domains they weren’t trained on, and that synthetic data can then be used for training. LLMs are also useful in reinforcement learning: in an approach known as “LLM as a judge,” LLMs rather than humans score the outputs of models being trained. That approach is central to the influential “Constitutional AI” framework introduced by Anthropic researchers in 2022, in which feedback from one LLM is used to train another LLM to be less harmful.
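The data flow is simple even if the models involved are not: generate several candidate responses, have a judge score each one against a written principle, and keep only the approved ones as training examples. The sketch below uses stub functions in place of real LLM calls; the constitution text, scoring rule, and threshold are all invented for illustration.

```python
# Sketch of "LLM as a judge" used to build training data without human raters.

CONSTITUTION = "Responses must be polite and must not give harmful advice."

def generate_response(prompt: str, i: int) -> str:
    """Stand-in for the model being trained."""
    return f"draft answer #{i} to: {prompt}"

def judge(response: str) -> float:
    """Stand-in for the judge LLM: score 0-1 against the constitution.
    A real judge would be prompted with CONSTITUTION and the response."""
    return 1.0 if "harmful" not in response else 0.0

training_set = []
for prompt in ["How do I fix a flat tire?", "Write a friendly greeting."]:
    candidates = [generate_response(prompt, i) for i in range(4)]
    best = max(candidates, key=judge)
    if judge(best) > 0.5:  # keep only prompt/response pairs the judge approves
        training_set.append({"prompt": prompt, "response": best})

print(len(training_set), "examples ready for fine-tuning")
```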

Data scarcity is an especially serious problem for AI agents. Effective agents need to be able to carry out multistep plans to accomplish particular tasks, but examples of successful step-by-step task completion are scarce online, and using humans to generate new ones would be costly. To get around this limitation, Mirhoseini and her colleagues at Stanford recently piloted a technique in which an LLM agent generates a possible step-by-step solution to a problem, an LLM judge evaluates whether each step is valid, and a new LLM agent is then trained on those steps. According to Mirhoseini, “you’re no longer constrained by data because the model can just arbitrarily generate more and more experiences.”
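A minimal sketch of that pipeline, with toy stand-ins for both the agent and the judge (the task strings, step format, and retry count are all invented), might look like this:

```python
# An agent proposes a multistep plan, a judge checks each step, and only
# fully validated trajectories become training data for a fresh agent.

def propose_plan(task: str) -> list[str]:
    """Stand-in for the agent LLM drafting a step-by-step solution."""
    return [f"step {i}: work on '{task}'" for i in range(1, 4)]

def judge_step(task: str, step: str) -> bool:
    """Stand-in for the judge LLM checking one step's validity."""
    return step.startswith("step")  # a real judge would reason about the step

def collect_trajectories(tasks, attempts=5):
    data = []
    for task in tasks:
        for _ in range(attempts):
            plan = propose_plan(task)
            if all(judge_step(task, s) for s in plan):   # every step must pass
                data.append({"task": task, "plan": plan})
                break                                    # one good plan per task
    return data  # a new agent would be fine-tuned on these validated plans

print(collect_trajectories(["book a flight", "summarize a report"]))
```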

Perfecting agent design

One area where LLMs have yet to make a major contribution is the design of LLMs themselves. Today’s LLMs are all based on a neural-network architecture called the transformer, which was proposed by human researchers in 2017, and the notable improvements made to the architecture since then were also devised by humans.
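The transformer’s defining operation is scaled dot-product self-attention, in which every position in a sequence decides how much to draw from every other position. Below is a minimal single-head version in NumPy; real models add learned per-layer projections, multiple heads, residual connections, and normalization.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """x: (seq_len, d_model); wq/wk/wv: (d_model, d_head) projection matrices."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ v                               # mix values by attention

rng = np.random.default_rng(0)
seq, d_model, d_head = 5, 16, 8
x = rng.standard_normal((seq, d_model))
out = self_attention(x, *(rng.standard_normal((d_model, d_head)) for _ in range(3)))
print(out.shape)  # (5, 8): one attended vector per input position
```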
