AgentRank Reranker
Cross-encoder reranker for AI agent memory retrieval - the second stage of a two-stage retrieval pipeline.
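What Is This?
AgentRank Reranker is a cross-encoder: it reads a query and a candidate memory together in a single forward pass and outputs one relevance score for the pair. This is slower than comparing precomputed embeddings, so use it after fast vector retrieval to rerank a short list of candidates for higher accuracy.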
Two-Stage Pipeline
| Stage | Model | Speed | Accuracy |
|---|---|---|---|
| 1. Retrieve | agentrank-base | Fast | Good |
| 2. Rerank | This model | Slower | Best |
Performance
| Metric | Value |
|---|---|
| Validation Accuracy | 89.11% |
| Best Val Loss | 0.2554 |
| Base Model | ModernBERT-base |
| Parameters | ~149M |
Quick Start
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
model.eval()  # inference mode: disables dropout

# Score the relevance of one query-memory pair
query = "What did we discuss about Python?"
memory = "User mentioned they prefer Python for backend development"
inputs = tokenizer(query, memory, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    # Sigmoid maps the output logit to a score in (0, 1)
    score = torch.sigmoid(model(**inputs).logits).item()
print(f"Relevance: {score:.2%}")
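The Quick Start passes the model's raw logit through a sigmoid to get a relevance score. As a rough illustration of that mapping, here is a minimal pure-Python sketch. The logit values are made up for the example and are not outputs of the real model, and `sigmoid` is a helper defined here, not part of any library.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a value in (0, 1) without overflowing for large |logit|."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Illustrative logits only; these are not outputs of the real model.
for logit in (-2.0, 0.0, 2.0):
    print(f"logit={logit:+.1f} -> relevance={sigmoid(logit):.2%}")
```

A logit of 0 maps to exactly 50%, so positive logits land above 50% and negative logits below it.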
Rerank Top-K Candidates
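def rerank(query, candidates, top_k=10):
    """Score every candidate against the query and return the top_k (score, memory) pairs."""
    scores = []
    for memory in candidates:
        inputs = tokenizer(query, memory, return_tensors="pt", truncation=True)
        with torch.no_grad():
            score = torch.sigmoid(model(**inputs).logits).item()
        scores.append((score, memory))
    # Sort on the score alone, highest first
    return sorted(scores, key=lambda pair: pair[0], reverse=True)[:top_k]

# Example usage
top_50_from_vector_db = [...]  # placeholder: candidate memory strings from stage 1
top_10 = rerank("What is my Python preference?", top_50_from_vector_db)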
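The `rerank` function above scores one pair per forward pass. A common variation is to score candidates in batches. The sketch below shows only the batching and sorting logic, with the scorer passed in as a function so it runs without downloading the model. `rerank_batched`, `ScoreBatch`, and `overlap_scorer` are hypothetical names introduced for this example, and the word-overlap scorer is a toy stand-in, not how the model scores.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: one query plus a batch of memories in,
# one relevance score per memory out.
ScoreBatch = Callable[[str, Sequence[str]], List[float]]


def rerank_batched(
    query: str,
    candidates: Sequence[str],
    score_batch: ScoreBatch,
    top_k: int = 10,
    batch_size: int = 16,
) -> List[Tuple[float, str]]:
    """Score candidates in fixed-size batches and return the top_k (score, memory) pairs."""
    scored: List[Tuple[float, str]] = []
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        scores = score_batch(query, batch)
        if len(scores) != len(batch):
            raise ValueError("score_batch must return one score per memory")
        scored.extend(zip(scores, batch))
    # Sort on the score alone, highest first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def overlap_scorer(query: str, memories: Sequence[str]) -> List[float]:
    """Toy stand-in for the model: fraction of query words found in each memory."""
    q = set(query.lower().split())
    return [len(q & set(m.lower().split())) / max(len(q), 1) for m in memories]


memories = [
    "User prefers Python for backend work",
    "User owns a cat",
    "Python preference noted",
]
top_2 = rerank_batched("python preference", memories, overlap_scorer, top_k=2, batch_size=2)
print(top_2)
```

With the real model, `score_batch` would presumably tokenize the query repeated once per memory alongside the batch of memories with `padding=True`, run one forward pass, and apply the sigmoid to the resulting logits. That wiring is an assumption and is left out here.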
Full Two-Stage Pipeline
from agentrank import AgentRankEmbedder
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

query = "What is my Python preference?"

# Stage 1: Fast retrieval with embedder
embedder = AgentRankEmbedder.from_pretrained("vrushket/agentrank-base")
query_embedding = embedder.encode([query])
# ... search vector DB for top-50 candidates ...
top_50_candidates = [...]  # placeholder: memory strings returned by the vector search

# Stage 2: Accurate reranking
reranker = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
reranker.eval()  # inference mode: disables dropout

scored = []
for memory in top_50_candidates:
    inputs = tokenizer(query, memory, return_tensors="pt", truncation=True)
    with torch.no_grad():
        score = torch.sigmoid(reranker(**inputs).logits).item()
    scored.append((score, memory))

# Return top-10, sorted on the score alone
final_results = sorted(scored, key=lambda pair: pair[0], reverse=True)[:10]
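The pipeline above leaves the vector search step open. As a stand-in for experimenting without a vector database, stage 1 can be approximated by a brute-force cosine-similarity search over stored embeddings. This is a sketch under assumptions: `cosine` and `retrieve_top_n` are hypothetical helpers, and the 2-D vectors are toy values rather than real `embedder.encode` outputs.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_n(
    query_vec: Sequence[float],
    memory_vecs: Sequence[Sequence[float]],
    memories: Sequence[str],
    n: int = 50,
) -> List[str]:
    """Return the n memories whose embeddings are most similar to the query embedding."""
    ranked = sorted(
        zip(memories, memory_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [memory for memory, _ in ranked[:n]]


# Toy 2-D embeddings for illustration only.
memories = ["prefers Python", "owns a cat", "likes backend work"]
memory_vecs = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
candidates = retrieve_top_n([1.0, 0.0], memory_vecs, memories, n=2)
print(candidates)
```

The returned list plays the role of `top_50_candidates` in stage 2. A linear scan like this is only practical for small memory stores; larger stores would typically use an approximate nearest-neighbor index.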
Related Models
| Model | Type | Link |
|---|---|---|
| AgentRank-Base | Embedder | https://huggingface.co/vrushket/agentrank-base |
| AgentRank-Small | Embedder | https://huggingface.co/vrushket/agentrank-small |
| AgentRank-Reranker | Cross-encoder | This model |
Training Details
- Base Model: ModernBERT-base
- Epochs: 3
- Batch Size: 16
- Learning Rate: 2e-5
- Loss: Binary Cross-Entropy
- Hardware: 2x RTX 6000 Ada (48GB)
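The training loss listed above is binary cross-entropy. For reference, here is a minimal sketch of that loss for a single example, written in the numerically stable "with logits" form. Whether training used this exact form, and how labels were encoded, is not stated above, so treat both as assumptions. `bce_with_logits` is a helper defined here for illustration.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy for one example, computed from the raw logit.

    Equivalent to -[y * log(s) + (1 - y) * log(1 - s)] with s = sigmoid(logit),
    rearranged so it does not overflow for large |logit|.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# Illustrative values: label 1.0 = relevant pair, 0.0 = irrelevant pair (assumed encoding).
print(f"{bce_with_logits(2.0, 1.0):.4f}")  # confident and correct: small loss
print(f"{bce_with_logits(2.0, 0.0):.4f}")  # confident and wrong: large loss
```

The loss shrinks toward zero as the logit moves in the direction of the true label and grows roughly linearly when the model is confidently wrong.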
Contact
- Author: Vrushket More
- LinkedIn: https://www.linkedin.com/in/vrushket-more-07b38b25b/
License
Apache 2.0