Practical Problem Solving: Applying GenAI and Classic Computer Science
Learn how Generative AI and classic CS techniques combine to solve real-world business challenges and drive growth. Understand how AI's real potential goes beyond the hype and is transforming industries today.
Hello, fellow tech visionaries and AI enthusiasts! Welcome to another electrifying episode of Tech Trendsetters, where we learn and explore the cutting-edge developments shaping our technological future.
Today, we're exploring a fascinating convergence that could dramatically accelerate not only the path to artificial general intelligence but also progress on your own problem sets – whether those are everyday tasks or complex challenges that can give you a competitive edge in the market. In short, today's episode is about the fusion of generative AI with classic computer science techniques.
Today, you'll discover how to apply generative AI to take practical problems and tasks to a new level. These approaches can be extended to virtually anything you're working on, demonstrating that this is not rocket science but simply an effective application of Generative AI to the problems at hand. By understanding and leveraging these techniques, you can harness the full potential of Generative AI to drive innovation and efficiency in all your projects.
The Power of Combining Generative AI and Traditional Computer Science
As someone who's been closely following AI developments for years, I've seen countless breakthroughs come and go. But the research we're discussing today has me more excited than most: we might be on the cusp of unlocking capabilities that rival or surpass the latest frontier models, using much smaller open-source models combined with clever search algorithms.
In a recent study, “Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B: A Technical Report”, published by the Shanghai Artificial Intelligence Laboratory, a team of researchers developed a method they call "MCT Self-Refine" (MCTSr) that combines large language models (LLMs) with Monte Carlo Tree Search (MCTS) – a technique you might recognize from groundbreaking systems like AlphaGo, or, if you come from software engineering, one you've almost certainly heard of.
The experimental results demonstrated significant improvements in problem-solving success rates across various datasets. One notable area of success was Olympiad-level mathematical challenges, which are typically highly complex and demanding.
This fusion of generative AI with classic computer science techniques represents a significant leap forward in problem-solving capabilities. This hybrid approach leverages the strengths of both domains, potentially unlocking new levels of performance and efficiency.
So, how did they do it? And why is this combination so powerful? Last but not least, how can it be applied to real-world challenges?
Understanding the MCTSr Algorithm
The MCT Self-Refine (MCTSr) algorithm, as proposed by the Shanghai Artificial Intelligence Laboratory team, is a prime example of this synergistic approach. In short, MCTSr integrates Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), a technique that has proven to be highly effective in complex decision-making scenarios.
The idea is quite simple – and that simplicity is exactly why it is so powerful. Here's a breakdown of how MCTSr works:
Initial Solution Generation
The process begins with the LLM generating an initial solution to a given problem. Simultaneously, a "dummy" answer (e.g., "I don't know") is added to create an original set of potential solutions.

Self-Evaluation

The model then evaluates the quality of each solution, assigning scores on a scale from -100 to 100. This self-evaluation step is crucial, as it allows the model to critically assess its own outputs.

Selection

The highest-scoring solution is chosen for further refinement. This selection process ensures that the most promising ideas are explored more deeply.

Self-Refine

The LLM is prompted to critique the chosen solution, identifying potential flaws or areas for improvement. Based on this critique, a new, potentially improved solution is generated.

Tree Building

As this process repeats, it creates a tree structure of solutions. Each new solution becomes a branch, with the most promising branches spawning further refinements.

Backpropagation

The scores of new solutions are propagated back up the tree, updating the potential of earlier solutions based on their "descendants." This step allows the algorithm to learn from later iterations and adjust its strategy accordingly.

Iteration

The process continues, with the algorithm exploring the most promising branches of the solution tree until a satisfactory result is achieved or a predetermined stopping point is reached.
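To make these steps concrete, here's a minimal Python sketch of the loop described above. The helpers `llm_answer`, `llm_critique`, and `llm_score` are hypothetical stand-ins for prompts to the language model, and the selection and reward details are simplified compared to the paper's actual implementation:

```python
class Node:
    """One candidate answer in the MCTSr solution tree."""
    def __init__(self, answer, parent=None):
        self.answer = answer
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_score = 0.0  # accumulated self-evaluation scores

    def value(self):
        return self.total_score / self.visits if self.visits else 0.0

def all_nodes(root):
    """Yield every node in the tree."""
    yield root
    for child in root.children:
        yield from all_nodes(child)

def mctsr(problem, llm_answer, llm_critique, llm_score, rollouts=8):
    """Simplified MCT Self-refine loop.

    llm_answer(prompt)            -> answer string (an LLM call)
    llm_critique(problem, answer) -> critique text (an LLM call)
    llm_score(problem, answer)    -> float in [-100, 100] (self-evaluation)
    """
    # Initial solutions: a real attempt plus a "dummy" answer.
    root = Node(llm_answer(problem))
    root.children.append(Node("I don't know.", parent=root))

    for _ in range(rollouts):
        # Selection: refine the highest-value node (the paper uses a
        # UCB-style formula to balance exploration and exploitation).
        node = max(all_nodes(root), key=lambda n: n.value())

        # Self-refine: critique the chosen answer, then rewrite it.
        critique = llm_critique(problem, node.answer)
        refined = llm_answer(
            f"{problem}\nPrevious attempt: {node.answer}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
        child = Node(refined, parent=node)
        node.children.append(child)

        # Self-evaluation of the new answer, then backpropagation.
        score = llm_score(problem, refined)
        while child is not None:
            child.visits += 1
            child.total_score += score
            child = child.parent

    return max(all_nodes(root), key=lambda n: n.value()).answer
```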
The most amazing thing is that the 8B model nearly matches the results of GPT-4 (reportedly a 1.76T-parameter model). I can only wonder what will happen if we run this on more powerful models.
Understanding the Monte Carlo Tree Search (MCTS) Algorithm
A quick note about the algorithms the researchers applied to the problem. Monte Carlo Tree Search (MCTS) is a heuristic search algorithm used for decision-making processes, particularly in game theory and AI applications. It has gained significant popularity due to its effectiveness in handling large and complex decision spaces, such as those found in board games like Go and Chess, as well as various other domains requiring strategic planning.
MCTS is particularly powerful because it combines the precision of tree search algorithms with the generality of random sampling. It balances exploration of unexplored nodes and exploitation of nodes that have shown promise in the past.
Selection: Starting from the root node, the algorithm navigates through the tree, choosing child nodes based on a balance between exploitation (selecting nodes with good known outcomes) and exploration (investigating less-visited nodes).

Expansion: When the algorithm reaches a leaf node that isn't a terminal state, it expands the tree by adding one or more child nodes.

Simulation: From the new node, the algorithm performs a simulation (often called a "rollout") to estimate the value of the node. This usually involves making random moves until reaching a terminal state.

Backpropagation: The result of the simulation is then propagated back up the tree, updating the statistics of each node traversed.
By iterating through these steps many times, MCTS builds an asymmetric tree that focuses on the most promising lines of play. This approach allows it to handle extremely large search spaces efficiently, making it particularly well-suited for complex problems.
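For readers who'd like to see the classic algorithm itself, here's a compact, generic MCTS sketch in Python. The `state` interface (`legal_moves()`, `play()`, `is_terminal()`, `result()`) is a hypothetical one chosen for illustration, and the starting state is assumed to be non-terminal:

```python
import math, random

def mcts(root_state, iterations=1000, c=1.4):
    """Generic Monte Carlo Tree Search with UCB1 selection."""
    root = {"state": root_state, "parent": None, "children": [],
            "untried": root_state.legal_moves(), "visits": 0, "wins": 0.0}

    def ucb1(node):
        # Exploitation term plus exploration bonus (the UCB1 formula).
        return (node["wins"] / node["visits"]
                + c * math.sqrt(math.log(node["parent"]["visits"]) / node["visits"]))

    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while not node["untried"] and node["children"]:
            node = max(node["children"], key=ucb1)
        # 2. Expansion: add one child node for an untried move.
        if node["untried"]:
            move = node["untried"].pop()
            state = node["state"].play(move)
            child = {"state": state, "parent": node, "children": [],
                     "untried": state.legal_moves(), "visits": 0, "wins": 0.0}
            node["children"].append(child)
            node = child
        # 3. Simulation ("rollout"): random moves to a terminal state.
        state = node["state"]
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        reward = state.result()
        # 4. Backpropagation: update statistics along the visited path.
        while node is not None:
            node["visits"] += 1
            node["wins"] += reward
            node = node["parent"]

    # Return the most-visited child, a common final-move criterion.
    return max(root["children"], key=lambda n: n["visits"])
```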
Results and Implications of MCTSr
In the context of the MCTSr algorithm, MCTS is adapted to work with language models. Instead of game moves, the "actions" are refinements to the generated text. The "simulation" step is replaced by the language model's self-evaluation of the refined text. This clever adaptation allows the strengths of MCTS to be applied to language generation and problem-solving tasks.
The integration of MCTS with language models in MCTSr showcases how traditional computer science algorithms can be creatively applied to enhance the capabilities of generative AI. This mixed approach leverages the strengths of both GenAI and Search Algorithms: the structured exploration of Monte Carlo Tree Search (MCTS) and the flexible, context-aware generation of language models.
Performance on Mathematical Benchmarks
The researchers tested the MCTSr-enhanced LLaMA-3 8B model on several challenging mathematical datasets, including GSM8K, GSM Hard, MATH, AIME, Math Odyssey, and OlympiadBench. These datasets range from typical high school math problems to Olympiad-level challenges that stump even the most advanced AI systems.
On the GSM8K dataset, which consists of grade school math word problems, the MCTSr approach showed impressive gains. The baseline LLaMA-3 8B model with zero-shot chain-of-thought reasoning solved about 74% of the problems. With the addition of MCTSr and 8 rollouts, this jumped to an astounding 96.66% success rate. This level of performance is competitive with much larger models like GPT-4, which reportedly achieves around 97% accuracy on this dataset.
The gains on more challenging datasets were even more striking. On the MATH dataset, which covers a wide range of high school and early undergraduate-level problems, the MCTSr approach more than doubled the baseline performance. The zero-shot model solved about 24% of problems, while the 8-rollout MCTSr version achieved 58.24% accuracy. This is a massive leap forward, bringing the 8B parameter model into the ballpark of models hundreds of times its size.
Perhaps most impressively, on Olympiad-level problems from datasets like AIME and OlympiadBench, the MCTSr approach showed substantial improvements. While the absolute numbers might seem low (e.g., improving from 1.25% to 7.76% on OlympiadBench), it's important to remember that these are extremely difficult problems that challenge even human experts.
The fact that a relatively small model – one you can run on your local computer – can solve any of these problems is remarkable, and the multi-fold improvement with MCTSr is truly significant.
Apple's Clever Use of Classic CS in LLMs
After discussing the impressive results of MCTSr on mathematical benchmarks, let's shift our focus to another fascinating example of how classic computer science techniques are being applied to modern AI challenges. While MCTSr demonstrates the power of combining search algorithms with language models for problem-solving, other researchers are tackling different aspects of AI implementation.
One particularly interesting case comes from Apple, where scientists have adapted a fundamental computer science concept to address the resource constraints of running large language models on mobile devices.
In a recent study titled "LLM in a flash: Efficient Large Language Model Inference with Limited Memory", Apple scientists demonstrated a clever application of a classic computer science technique to solve a very modern problem. Before we delve into the details, it's worth noting that this approach is another excellent example of how fundamental computer science concepts can be repurposed to overcome challenges in AI deployment.
The challenge Apple faced was this: how can we bring the power of LLMs to mobile devices without compromising performance? The solution they devised adapts the sliding window technique, a concept that has been well known in computer science for decades, typically used in areas like data compression and network protocols.
Here's a quick overview of how they adapted it for LLMs (a simplified sketch follows the list):
Instead of processing the entire input at once, the model breaks it into smaller, manageable chunks.
The model maintains a "window" of tokens, including both a portion of the input and the generated output.
As the model generates new tokens, the window slides forward, discarding older tokens and incorporating new ones from the input.
By carefully managing the window size and sliding process, the model maintains enough context for coherent generation while significantly reducing memory usage.
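As a toy illustration of the windowing idea in this list – and only an illustration, since Apple's actual system applies the technique to model data loaded from flash memory, with far more engineering around it – here's a hedged Python sketch where `model_step` is a hypothetical stand-in for one forward pass of the LLM:

```python
from collections import deque

def generate_with_window(model_step, prompt_tokens, max_new_tokens, window=512):
    """Toy sliding-window generation loop.

    model_step(tokens) -> next_token is a hypothetical stand-in for one
    forward pass of the LLM over the current window of tokens.
    """
    # The window holds at most `window` tokens; deque(maxlen=...) discards
    # the oldest token automatically as new ones arrive.
    ctx = deque(prompt_tokens[-window:], maxlen=window)
    output = []
    for _ in range(max_new_tokens):
        next_token = model_step(list(ctx))  # only the window is processed
        output.append(next_token)
        ctx.append(next_token)  # the window slides forward by one token
    return output
```

The design point is simply that memory use is bounded by the window size rather than by the full sequence length, which is what makes the approach attractive on constrained hardware.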
This approach offers several key benefits, including a reduced memory footprint and improved real-time performance. Most importantly, it allows more advanced LLMs to run on hardware people already use, without the need for everyone to update their mobile devices.
The result they achieved is impressive: These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively.
This case study serves as another testament to the power of combining classic computer science techniques with modern AI challenges, much like the MCTSr approach we discussed earlier.
Practical Application of MCTSr for Businesses
Now that we've explored two different examples of how computer science techniques can enhance AI capabilities, let's consider a practical application of MCTSr that businesses could leverage today.
Take a manufacturing company, which operates a complex production line with multiple interconnected machines. Each machine has different maintenance requirements, and unplanned downtime can be extremely costly. The company wants to optimize its maintenance schedule to minimize downtime while also avoiding unnecessary maintenance.
Here's how MCTSr could be applied to this problem:
Initial Schedule Generation:
The LLM component generates an initial maintenance schedule based on historical data, manufacturer recommendations, and current equipment status.
Node Creation and Evaluation:
Each node in the Monte Carlo tree represents a specific maintenance action or set of actions.
Nodes are evaluated based on the following business metrics:
Predicted downtime reduction
Maintenance costs
…
Risk of equipment failure
Self-Refinement:
The LLM critiques each node, considering factors like resource availability, production schedules, and potential cascading effects on other equipment.
Based on this critique, new nodes (refined maintenance plans) are generated.
Tree Exploration:
The MCTS algorithm explores the most promising branches of the maintenance decision tree.
As more simulations are run, the algorithm learns which types of maintenance strategies tend to yield the best results.
Continuous Integration:
As maintenance actions are carried out in the real world, their actual outcomes are fed back into the system, allowing it to learn and improve over time.
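To make the node-evaluation step tangible, here's a small Python sketch with entirely hypothetical metric names and weights – a real system would calibrate these against historical data:

```python
def evaluate_maintenance_node(plan, weights=None):
    """Score a candidate maintenance plan on the business metrics above.

    `plan` is a hypothetical dict produced by the LLM, e.g.
    {"downtime_saved_hours": 12, "cost_usd": 8000, "failure_risk": 0.05}.
    The weights below are purely illustrative.
    """
    w = weights or {"downtime": 50.0, "cost": -0.5, "risk": -2000.0}
    return (w["downtime"] * plan["downtime_saved_hours"]
            + w["cost"] * plan["cost_usd"] / 100
            + w["risk"] * plan["failure_risk"])

# Example: compare two candidate plans (tree nodes) from the search.
plan_a = {"downtime_saved_hours": 12, "cost_usd": 8000, "failure_risk": 0.05}
plan_b = {"downtime_saved_hours": 6,  "cost_usd": 2500, "failure_risk": 0.15}
best = max([plan_a, plan_b], key=evaluate_maintenance_node)
```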
This example demonstrates how MCTSr can be applied to a complex industrial problem, generating solutions that can be evaluated using clear business metrics. The power of this approach lies in its ability to consider multiple factors simultaneously, learn from actual outcomes, and continuously refine its strategies to improve key performance indicators.
The Rise of Automated AI Research
As we wrap up this episode of Tech Trendsetters, let's revisit a topic that's thrilling and unsettling at the same time: the potential for automated AI research that's within our grasp right now. We talked about "automated AI research" as a main driving force for gaining superiority in all aspects of the future world in one of our recent episodes.
We're standing on the precipice of a major shift in how AI development itself occurs. And to understand this, we need to revisit a classic piece of AI philosophy: Rich Sutton's "The Bitter Lesson."
For those unfamiliar, Rich Sutton's "The Bitter Lesson" posits that the most effective approach in AI research has consistently been to leverage raw computation power rather than trying to encode human knowledge into systems. Sutton argues that methods that scale with increased computation ultimately triumph over those relying on human-engineered features or domain-specific knowledge.
Now, you might be thinking, "Wait a minute, isn't that counterintuitive? Shouldn't our understanding of cognition and specialized knowledge help?" And that's precisely why Sutton calls it a "bitter" lesson. It's a hard pill to swallow for many researchers who've invested years in developing intricate, knowledge-based systems. But here's where it gets really interesting:
What if we apply this lesson to AI research itself?
The combination of generative AI and classic computer science techniques we've discussed throughout this episode isn't just applicable to solving math problems or running models on mobile devices. It opens up the possibility of AI systems that can conduct their own research and development. Crazy, right?
Imagine an AI system that:
Generates hypotheses about potential improvements to its own architecture;
Designs experiments to test these hypotheses;
Implements and runs these experiments;
Analyzes the results and iterates on the process.
This isn't science fiction. The building blocks are already here:
Large language models can generate coherent, creative ideas in natural language;
Code generation models can translate these ideas into executable experiments;
Automated testing and benchmarking tools can evaluate the results;
The whole process can be guided by techniques like Monte Carlo Tree Search (MCTS) to efficiently explore the vast space of possible improvements.
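None of this requires exotic tooling. Here's a hedged, skeleton-level Python sketch of such a loop, where every component (`propose`, `implement`, `run_benchmark`) is a hypothetical stand-in for the building blocks just listed:

```python
def automated_research_loop(propose, implement, run_benchmark, iterations=10):
    """Hypothetical self-improvement loop; each argument is a stand-in:

    propose(history)      -> natural-language hypothesis (an LLM call)
    implement(hypothesis) -> runnable experiment code (a code model call)
    run_benchmark(code)   -> scalar score from automated evaluation
    """
    history = []
    for _ in range(iterations):
        hypothesis = propose(history)        # generate an idea
        experiment = implement(hypothesis)   # turn it into code
        score = run_benchmark(experiment)    # evaluate automatically
        history.append((hypothesis, score))  # feed results back in
    # Return the best-scoring hypothesis found so far.
    return max(history, key=lambda item: item[1])
```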
If we can successfully automate significant portions of AI research, we could see an explosion in the pace of advancement. It's a form of meta-learning that could lead to rapid, compounding improvements in AI capabilities.
But remember the Bitter Lesson. The most successful automated AI research systems might not resemble human research processes at all. They might discover optimization techniques or architectural innovations that seem counterintuitive or even incomprehensible to human researchers.
This brings us to a crucial point: The goal isn't to replicate human research methods, but to find the most effective ways for machines to improve themselves, leveraging their unique strengths in computation and pattern recognition.
From Practical Solutions to Paradigm Shifts
Instead of a traditional conclusion, I've deliberately structured this episode to offer value to a diverse audience. For the software engineers among you, we've delved into the technical details of algorithms like MCTS and how they can be integrated with language models. Business leaders will find insights into how these technologies can be applied to real-world problems and potentially transform industries. And for the visionaries and scientists tuning in, we've explored the cutting-edge possibilities that come with pushing the boundaries of AI capabilities.
Let's take a moment to reflect on how our journey through this episode – from MCTSr's mathematical problem-solving to Apple's mobile LLM innovations, to practical industrial applications – illustrates these diverse perspectives:
For Software Engineers: The MCTSr algorithm demonstrates how classic techniques like Monte Carlo Tree Search can be creatively combined with modern LLMs to achieve remarkable results. This opens up new avenues for algorithm design and AI system architecture.
For Business Leaders: Our industrial maintenance example showcases the tangible benefits of these technologies. By optimizing maintenance schedules, businesses can significantly reduce downtime, cut costs, and improve overall equipment effectiveness. This is just one of many potential applications that could transform various industries.
For Visionaries and Scientists: The fusion of generative AI with classic computer science techniques points towards a future where AI systems might drive their own evolution. AI that can analyze its own performance, generate hypotheses for improvement, and conduct its own experiments.
You may now see how the fusion of generative AI with classic computer science techniques opens up a two-fold opportunity. On one hand, it provides powerful tools to expand your problem-solving capabilities right here on Earth. On the other hand, the potential for automated AI research to accelerate progress is enormous.
The Bitter Lesson reminds us that the most effective paths in AI development may not align with our intuitions or preferences. We must be prepared for surprises, challenges, and yes, some bitterness as we navigate this new frontier.
As I sign off, I encourage you all to consider how you might apply these ideas in your own work and thinking. The beauty of combining generative AI with classic computer science lies in its versatility – it's a toolkit that can be adapted to an incredibly wide range of problems and domains.
The future of AI is being written now, and each of us has a role to play in it. Until next time!