Effective use of LLMs: LLM routing and complex query decomposition - Group 2

Term: 
2025-2026 Fall
Faculty Department of Project Supervisor: 
Faculty of Engineering and Natural Sciences
Number of Students: 
5

This project aims to investigate and design a cost-efficient framework for implementing Large Language Models (LLMs), with a focus on optimizing both performance and resource usage. Specifically, we will study the following key aspects:

1. Cost-Accuracy Trade-off:

We will explore how to optimize query routing across a set of LLMs that vary in cost, latency, and response quality. The objective is to:

Develop routing strategies that minimize computational and financial costs.

Ensure that selected strategies meet predefined quality or accuracy thresholds.

Understand the trade-offs between using smaller, cheaper models and larger, more capable (but costlier) ones.
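As an illustration of the routing objective above, the following is a minimal sketch of threshold-based routing: pick the cheapest model whose predicted quality meets a required threshold. The model names, costs, and quality scores are hypothetical placeholders, and a real router would predict quality per query rather than use one fixed score per model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                 # hypothetical model label
    cost_per_query: float     # hypothetical cost units
    predicted_quality: float  # hypothetical score in [0, 1]


def route(models, quality_threshold):
    """Return the cheapest model meeting the threshold.

    If no model meets it, fall back to the highest-quality model.
    """
    eligible = [m for m in models if m.predicted_quality >= quality_threshold]
    if eligible:
        return min(eligible, key=lambda m: m.cost_per_query)
    return max(models, key=lambda m: m.predicted_quality)


# Illustrative pool only; the numbers are assumptions, not measurements.
MODELS = [
    Model("small", 0.1, 0.62),
    Model("medium", 1.0, 0.80),
    Model("large", 10.0, 0.93),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.75, 0.99):
        print(threshold, "->", route(MODELS, threshold).name)
```

Raising the threshold pushes queries toward costlier models, which is the trade-off this part of the project sets out to study.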

2. Query Decomposition:

This component investigates the possibility of breaking down complex queries into smaller, simpler sub-queries that can be:

Solved more efficiently,

Routed to specialized or smaller LLMs, each of which may hold only partial information,

And later recombined into a coherent final response.

We aim to define a general framework for query decomposition, including:

Query analysis and segmentation techniques,

Assignment policies for sub-queries,

Aggregation mechanisms for the final output.
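The three framework components listed above (segmentation, assignment, aggregation) can be sketched as a small pipeline. This is a toy illustration under stated assumptions: splitting on "and" or ";" stands in for real query analysis, keyword matching stands in for a learned assignment policy, and `call_model` is a hypothetical callable supplied by the caller rather than any real LLM API.

```python
import re


def decompose(query):
    """Toy segmentation: split on ' and ' or ';' (an assumed heuristic)."""
    parts = re.split(r"\s+and\s+|;", query.rstrip("?. "))
    return [p.strip() for p in parts if p.strip()]


def assign(sub_query, specialists, default):
    """Toy assignment policy: first matching keyword picks the model."""
    lowered = sub_query.lower()
    for keyword, model_name in specialists.items():
        if keyword in lowered:
            return model_name
    return default


def aggregate(answers):
    """Toy aggregation: concatenate partial answers in order."""
    return " ".join(answers)


def answer_query(query, specialists, default, call_model):
    """Decompose, route each sub-query, then recombine the answers."""
    sub_queries = decompose(query)
    answers = [
        call_model(assign(sq, specialists, default), sq) for sq in sub_queries
    ]
    return aggregate(answers)


if __name__ == "__main__":
    # Hypothetical specialist names; the stub below just echoes its input.
    specialists = {"capital": "geo-small", "+": "math-small"}
    stub = lambda model_name, sq: f"[{model_name}] {sq}"
    print(answer_query("What is the capital of France and what is 2 + 2?",
                       specialists, "general", stub))
```

In a real system each stage would likely be model-driven, and aggregation would need to resolve dependencies between sub-answers rather than simply joining them.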

3. Unified End-to-End Framework:

The final step is to design a comprehensive end-to-end framework that integrates both:

Cost-aware routing strategies (from part 1), and

Query decomposition techniques (from part 2).

This unified system should:

Dynamically route queries (or sub-queries) to the most suitable models,

Balance performance and cost.
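One way to picture such a unified system is a planner that first splits a query and then picks the cheapest sufficiently capable model for each part. The sketch below is only illustrative: the two-model pool, its costs and capability scores, and the word-count difficulty proxy are all assumptions made for the example, not part of the project design.

```python
import re

# Hypothetical model pool: (name, cost per call, capability in [0, 1]).
MODELS = [("small", 1.0, 0.4), ("large", 10.0, 1.0)]


def estimate_difficulty(sub_query):
    """Toy proxy: more words means harder, capped at 1.0."""
    return min(len(sub_query.split()) / 10.0, 1.0)


def plan(query):
    """Split the query, then choose the cheapest capable model per part."""
    parts = [
        p.strip()
        for p in re.split(r"\s+and\s+", query.rstrip("?. "))
        if p.strip()
    ]
    steps = []
    for part in parts:
        difficulty = estimate_difficulty(part)
        # "large" has capability 1.0, so this list is never empty.
        capable = [m for m in MODELS if m[2] >= difficulty]
        name, cost, _ = min(capable, key=lambda m: m[1])
        steps.append((part, name, cost))
    return steps


def total_cost(steps):
    """Sum the per-call costs of a plan."""
    return sum(cost for _, _, cost in steps)


if __name__ == "__main__":
    query = ("Define entropy and explain step by step how entropy "
             "relates to the second law")
    steps = plan(query)
    for part, name, cost in steps:
        print(f"{name:>5} ({cost:>4}): {part}")
    baseline = len(steps) * max(cost for _, cost, _ in MODELS)
    print("plan cost:", total_cost(steps), "vs. always-large:", baseline)
```

Comparing the plan cost against an always-use-the-largest-model baseline gives a simple first measure of how much decomposition plus routing saves, to be weighed against any loss in answer quality.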

Research Objectives:

The main goals of the project include:

Literature Survey: Analyze and classify existing research related to cost-efficient LLMs, routing strategies, and query decomposition.

Comparative Analysis: Identify strengths and limitations of current approaches.

Framework Proposal: Outline a novel framework (a prototype) with clear design principles.
 
Some Related Works:
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute, ICML 2025
RouteLLM: Learning to Route LLMs with Preference Data, ICLR 2025
 
In the scope of this research, there will be a possible collaboration with Imperial College London and the University of Birmingham.
 
 

Related Areas of Project: 
Computer Science and Engineering
Electronics Engineering
Industrial Engineering
Mathematics

About Project Supervisors

Emre Özfatura
Çağlar Tunç