💡 I indeed felt the AGI in Singapore.
Artificial Intelligence has gone through significant waves of progress, from neural networks and hardware innovations to tooling and algorithmic breakthroughs that shape the way we think about intelligence itself. I attended the Google Gemini Symposium at the Google office in Singapore, where leading minds like Jeff Dean, Quoc Le, and others shared their journeys, lessons, and visions of what comes next.
Keynote: Jeff Dean on the evolution of AI Infrastructure
Early Days of Neural Networks
- For years, only a few dedicated researchers were pushing the field forward. Then Google Brain combined model parallelism with data distribution and used unsupervised learning on video frames, opening up new possibilities.
Development of AI-First hardware
- 2013: Specialized chips for neural inference sparked the first AI Hardware wave
- 2016: TPU Pods connected through 3D mesh interconnects created specialized supercomputers for training.
- Today: Hardware scaling continues.
Role of Open source
Tools like TensorFlow, PyTorch, and JAX made experimentation accessible: we no longer need to spend hours on setup to get a working training loop. Google’s early “DistBelief” system planted the seeds for these frameworks, and PyTorch’s Python-first design further lowered the entry barrier for developers worldwide.
Transformers
- 2017: “Attention Is All You Need” solved the bottleneck of LSTMs’ sequential dependency: by expressing attention over the whole sequence as matrix operations, it enabled full parallelization during training. The paper put the attention mechanism center stage, giving models the ability to attend to the relevant parts of the input sequence based on context.
- 2018: Language models began to scale. Self-supervised learning on massive text corpora became the backbone of modern LLMs.
- Sparse models (Mixture of experts): Outperformed dense models by activating only a few experts per inference.
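The core operation behind the Transformer can be sketched in a few lines of NumPy. This is a toy illustration of scaled dot-product attention, not Google’s implementation; the shapes and random inputs are made up, but it shows how every position attends to every other in a single matrix multiply instead of a sequential LSTM loop:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention (single head, no mask).

    All query positions attend to all key positions at once, so the
    whole sequence is processed in parallel, with no recurrence.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq_q, seq_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V, weights                        # weighted mix of values

# Hypothetical example: 4 positions, embedding dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, attn = scaled_dot_product_attention(Q, K, V)      # out: (4, 8)
```

Each row of `attn` is a probability distribution over input positions, which is exactly the “ability to attend to certain parts of the input” described above.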
Key Innovations
- Pathways abstracted distributed ML computation, turning thousands of chips into “one giant chip”.
- Thinking Longer at Inference (2022): coaxing models to reason step by step improved performance without retraining, and Chain of Thought emerged.
- Knowledge Distillation (2014): Teacher-student training accelerated efficiency and performance.
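Of these, knowledge distillation is simple enough to sketch directly. The snippet below is a minimal illustration of the teacher-student idea, loosely following the Hinton et al. recipe (temperature-softened distributions, KL divergence scaled by T²); the logits are invented for the example:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Softmax with temperature T; higher T gives softer distributions."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    The soft targets carry the teacher's 'dark knowledge' about how
    classes relate, not just the single hard label. The T*T factor
    keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student's current distribution
    return float(T * T * np.sum(p * np.log(p / q)))

# Hypothetical logits over 3 classes.
teacher = [4.0, 1.0, 0.2]
student = [2.5, 0.8, 0.3]
loss = distillation_loss(student, teacher)   # positive; zero iff they match
```

Training the student to minimize this loss (usually mixed with the ordinary hard-label loss) lets a small model approach a large teacher’s accuracy at a fraction of the inference cost.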
Dean closed by emphasizing two points: (1) pushing the quality/price Pareto frontier of AI systems, and (2) ensuring open-source AI continues to thrive.
Gemini X Reasoning Panel: Why are my LLMs reasoning better than me?
The panelists shared their thoughts on reasoning.
- Where does reasoning come from?
- About 70% of reasoning ability seems to come from post-training (reinforcement learning, fine-tuning), but pretraining lays the critical groundwork. Reinforcement learning shines when verifiable reward data exists, but struggles on less-structured tasks.
- Are LLMs able to generate novel knowledge?
- LLMs are primarily a compression of the internet. Most of their “creativity” is interpolation, whereas generating genuinely novel knowledge requires extrapolation. Yet exceptions exist: AlphaEvolve, for example, discovered a new algorithm for matrix multiplication.
- Defining AGI
- Human Cognitive Benchmark: AI should be able to handle any cognitive task humans can, particularly those done via computers.
- Human Level AI: Embodied intelligence that can also act in the physical world
- Current Bottlenecks:
- Limited ability to reason reliably over long-duration tasks.
- Difficulty in domains where reinforcement learning struggles.
- Weaknesses in games and physical world interaction.
Quoc Le’s Journey: From Seq2Seq to Reasoning
Quoc Le gave us a look at his path through AI Research.
- Early Contributions
- Seq2Seq for machine translation
- Neural Conversational Models
- Semi-supervised sequence learning introduced the now-standard idea of pretraining then fine-tuning, though at the time GANs stole the spotlight.
- Scaling and LaMDA: In 2020, he helped build a 137B-parameter model (LaMDA) just as GPT-3 emerged. It went viral but raised safety and hallucination concerns.
- He also mentioned that Chain of Thought was discovered by accident.
- It’s funny how an accidental discovery can lead to a whole new paradigm for LLMs.
- His key takeaway: “Compression will give way to intelligence.” When a model compresses information well, understanding and intelligence emerge.
Danny Zhou on Reasoning
Danny Zhou drilled into the mechanics of reasoning.
- Why Intermediate Tokens Matter: Problems solvable by Boolean circuits of size T can be solved by transformers generating O(T) intermediate tokens. Directly producing the answer often fails; reasoning requires steps.
- Methods for Better Reasoning:
- Chain-of-Thought Prompting: asking for step-by-step explanations boosts accuracy, but requires in-context examples.
- Supervised Fine-Tuning (SFT): Human-annotated step-by-step solutions.
- Self-Improvement + RL: Models generate reasoning traces, refine them, and improve iteratively.
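The prompting approach above is easy to see concretely. Here is a minimal sketch of building a few-shot chain-of-thought prompt; the worked example is the canonical one from the original Chain-of-Thought paper, and the second question is invented for illustration (no model call is made here):

```python
# Direct prompting: ask for the answer outright. Models often fail on
# multi-step arithmetic when forced to emit the answer immediately.
question = (
    "Q: A farmer has 15 sheep and buys 3 pens of 4 sheep each. "
    "How many sheep does the farmer have now?\nA:"
)

# Chain-of-thought prompting: prepend a worked example whose answer is
# spelled out step by step, inviting the model to reason the same way.
cot_example = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

cot_prompt = cot_example + question
```

The only difference between the two prompts is the demonstrated reasoning trace, which is what nudges the model into generating the intermediate tokens Zhou argues are necessary.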
Fireside Chat with Benoit Schillings: Expanding Beyond Code
Benoit Schillings reflected on his career: learning assembly and writing a game in it at 16 to impress a girlfriend, and later realizing that coding is necessary but not sufficient for AGI.
- His advice: Don’t confine yourself to computer science. Great breakthroughs come from combining disciplines.
- His big bet for the next decade: AGI/ASI systems mining humanity’s accumulated knowledge for transformative insights.
Closing Thoughts
Across the keynotes and panels, one theme stood out: AI is scaling, reasoning, and evolving in unexpected ways. From Jeff Dean’s hardware and open-source perspective, to Quoc Le’s discoveries in reasoning, to Danny Zhou’s deep dive on intermediate steps, and Benoit Schillings’ call to expand beyond code, AI is not just about bigger models, but better thinking systems.
The next frontier isn’t only about raw power; it’s about harnessing reasoning, reliability, and cross-disciplinary intelligence to create tools that truly amplify human potential.
Thanks to Sree for proofreading this text.