Can LLMs Lead Us To Superintelligence?



In this article, I share a few thoughts on a popular futuristic view of AI: the future in which AI helps us solve many of the hard problems in science, allowing us, for example, to unlock boundless energy, bring adverse climate change under control and cure diseases. This is a future I would certainly love to strive for, but I find many of the pronouncements about how close we are to it curious, given their reliance on large language models and what we know about how LLMs are trained and how they perform. I walk through my reasoning for why I believe LLMs are not enough, and conclude with what I think some of the missing pieces are.

LLMs are excellent at emulating and exemplifying the use of human language, and since we use language to communicate everything we know, experience and feel, it follows that LLMs, trained on billions of examples of language in use, can paint a very convincing picture of exceptional intelligence just by how they communicate. LLMs not only sound smart; by many conventional measures, they are. State-of-the-art LLMs today can score well on PhD-level tests, meaning that the content of their responses is actually accurate, in addition to being well communicated. In fact, encoded within a trained LLM are parameters that represent many facts about a wide range of topics, wired together in a way that compels the LLM to respond with very specific content after seeing a particular piece of text. You can type the first few lyrics of your favourite song into an LLM to see this in action. This is by design, because fundamentally, the job of a language model is to predict the next word after seeing a particular sequence of words.
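To make the "predict the next word" idea concrete, here is a deliberately tiny sketch: a bigram model, the simplest possible language model, fit on a toy corpus. The corpus and the model are stand-ins; real LLMs train on billions of sequences with billions of parameters, but the memorize-and-complete behaviour (type a lyric, get the continuation) falls out of the same objective.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the training data; real LLMs see billions of such sequences.
corpus = "twinkle twinkle little star how i wonder what you are".split()

# Count which word follows each word (a bigram model, the simplest language model).
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word):
    """Return the continuation most frequently seen in training, or None."""
    counts = next_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

# Having "memorized" the lyric, the model completes it deterministically.
print(predict_next("little"))  # -> star
```

The point of the sketch is that nothing here resembles understanding: the model reproduces whatever statistical regularities its training text contained, which is exactly why the lyric completion works.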

Given that LLMs are “smart” in this way, how can they actually help us achieve breakthroughs in science and technology, as has been suggested? Since LLMs only know what humans have put on the Internet, is there a scenario in which they can discover or teach us something that humanity as a collective does not already know? One plausible scenario is that they connect the dots between different concepts, as captured by language, in new and meaningful ways that shed light on previously poorly understood ideas. The research on LLM interpretability is fascinating in this regard, and understanding the emergent capabilities of LLMs may well give us new insight on some topics. But aside from uncovering latent knowledge and underlying relationships that may inspire new human knowledge, it does not seem obvious that an LLM can truly know or discover anything beyond what it has been trained on, which is also all we already know as a collective. No matter how much data we throw at an LLM, the quality of its responses may increase, but its actual knowledge will remain bounded by the knowledge contained in its training data.

Despite this limit on LLMs’ intelligence, their performance today is so good that it certainly seems there must be a significant role for them to play in ushering in this new future. But if they cannot make accurate scientific conjectures beyond what they already know, and beyond the patterns of reasoning they have been trained on, how can they help us crack problems that have evaded even us, the original authors of that very knowledge and deductive reasoning? Consider that although we humans do not know everything there is to know about the universe, we do know more today than we did 100 years ago. What we know actually does increase, and by and large, the pathway to increasing it is experimentation and observation. Without the ability to observe experiments, whether naturally occurring or designed in a lab, and to draw empirical interpretations from those observations, we would not understand and model the world to the extent that we do now.

For a long time there was a debate over whether light was a particle or a wave, based on experimental data and observations that made a strong case for both. If, however, all we had ever observed were experiments supporting the wave theory of light, we likely would never have questioned that light was anything but a wave. LLMs, which know all we collectively know, which is in turn based on all we have collectively observed, experimented with and reasoned about, similarly have no reason to think beyond what has been presented to them as empirical evidence of the state of the world and humanity’s place in it.

For AI to make truly novel contributions in science that could lead to significant breakthroughs, we need to give it the ability to perceive and observe the real world. This includes everything we can readily perceive with fully functioning human senses, as well as the many things we cannot perceive without specialized instruments: ultra-low or ultra-high sound frequencies, subatomic particles or chemical compositions, for example. All these signals make up part of an effectively infinite dataset on which AI models can train, learning to predict and thereby independently discovering the relationships that govern our physical, chemical and biological world. This would give AI its best shot at learning something we did not already collectively know, unconstrained by the rate at which human knowledge grows.
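The train-to-predict idea above can be sketched in miniature. Here a simulated decaying signal stands in for a raw instrument stream, and a one-parameter autoregressive model stands in for a large world model; everything about the setup is a toy assumption. The point is that with no human labels at all, simply learning to predict the next reading recovers the constant governing the signal's "physics".

```python
# Simulated sensor stream: an exponentially decaying signal, e.g. a cooling reading.
signal = [100 * (0.9 ** t) for t in range(50)]

# Self-supervised objective: fit x[t+1] ≈ a * x[t] by least squares (closed form).
num = sum(x_t * x_next for x_t, x_next in zip(signal, signal[1:]))
den = sum(x_t * x_t for x_t in signal[:-1])
a = num / den

# The learned coefficient recovers the decay rate that generated the data.
print(round(a, 3))  # -> 0.9
```

A real system would face noisy, high-dimensional, multimodal streams rather than a clean scalar series, but the learning signal is the same: prediction error against observations of the world, not against human-written text.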

LLMs remain crucially important; in some ways they are both the first and the last piece of the puzzle for AI that is actually smarter than the collective “us”. LLMs give us an effective way to communicate with machines using natural language. They could also translate our prompts into requests understood by models that specialize in different aspects of the world. For example, in response to a prompt to propose a material that can withstand the harsh conditions on Mars for 100 years, an LLM could pass a request to a Large Chemistry Model, which predicts a formulation for a new material. A Large Physics Model could then make predictions about that material, perhaps the maximum weight it can bear. Finally, the outputs of the physics and chemistry models could be fed back into the LLM, which explains in simple natural language how to achieve a technology that was previously genuinely beyond our capability, given our current level of knowledge and understanding.
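The orchestration loop described above can be sketched as follows. Everything here is hypothetical: the "Large Chemistry Model" and "Large Physics Model" are stubs standing in for systems that do not exist today, and the function names and return shapes are illustrative assumptions, not real APIs.

```python
def chemistry_model(goal: str) -> dict:
    """Stub for a hypothetical 'Large Chemistry Model': proposes a formulation."""
    return {"material": "hypothetical-alloy-X", "designed_for": goal}

def physics_model(material: str) -> dict:
    """Stub for a hypothetical 'Large Physics Model': predicts properties."""
    return {"material": material, "max_load_kg": 12000}

def llm_orchestrate(prompt: str) -> str:
    """Stub LLM front end: decomposes the prompt, calls the specialists,
    and narrates their combined output back in natural language."""
    chem = chemistry_model(prompt)           # step 1: propose a candidate material
    phys = physics_model(chem["material"])   # step 2: predict its physical limits
    return (f"Proposed {chem['material']} for '{prompt}'; "
            f"predicted maximum load {phys['max_load_kg']} kg.")

print(llm_orchestrate("a material that withstands Mars conditions for 100 years"))
```

The design choice worth noting is the division of labour: the LLM handles only language, i.e. decomposition and summarization, while the world models, trained on observation rather than text, supply the predictions.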

This may be as futuristic as any other proposal put forward, if only because the sheer amount of compute required to train trillion-parameter models on potentially unbounded observational data from the real world may be unattainable under current conditions. But since we are talking about superintelligence, which is already futuristic and fanciful, this seems to me the more likely path, at least when pitted against LLMs alone.