AgentRank Reranker

Cross-encoder reranker for AI agent memory retrieval - the second stage of a two-stage retrieval pipeline.

What Is This?

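AgentRank Reranker is a cross-encoder: it reads a query and a candidate memory together and outputs a single relevance score for the pair. Because every pair needs its own forward pass, it is slower than embedding search, so use it after fast vector retrieval to rerank a short list of candidates for higher accuracy.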

Two-Stage Pipeline

| Stage | Model | Speed | Accuracy |
|---|---|---|---|
| 1. Retrieve | agentrank-base | Fast | Good |
| 2. Rerank | This model | Slower | Best |
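
The two stages in the table can be sketched as a small driver function. This is a minimal sketch rather than any library's API: `two_stage_search`, `retrieve`, and `score` are hypothetical names, and the toy callables below only stand in for a real vector search and for the cross-encoder.

```python
from typing import Callable, List, Tuple


def two_stage_search(
    query: str,
    retrieve: Callable[[str, int], List[str]],
    score: Callable[[str, str], float],
    retrieve_k: int = 50,
    top_k: int = 10,
) -> List[Tuple[float, str]]:
    """Retrieve a broad candidate set cheaply, then rerank it with a slower scorer."""
    # Stage 1: fast, approximate recall (e.g. embedding search in a vector DB).
    candidates = retrieve(query, retrieve_k)
    # Stage 2: score every (query, candidate) pair with the more accurate model.
    scored = [(score(query, memory), memory) for memory in candidates]
    # Highest score first; keep only the best top_k.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy stand-ins so the sketch runs without any model (hypothetical data).
memories = ["likes Python", "owns a cat", "prefers Python for backend"]


def toy_retrieve(query: str, k: int) -> List[str]:
    return memories[:k]


def toy_score(query: str, memory: str) -> float:
    # Word-overlap count; a real pipeline would call the reranker here.
    return float(len(set(query.lower().split()) & set(memory.lower().split())))


results = two_stage_search("python backend", toy_retrieve, toy_score, top_k=2)
```

Keeping the two stages behind plain callables makes it easy to swap the toy functions for the embedder and the reranker without changing the driver.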

Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 89.11% |
| Best Val Loss | 0.2554 |
| Base Model | ModernBERT-base |
| Parameters | ~149M |

Quick Start

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model
model = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
model.eval()

# Score relevance
query = "What did we discuss about Python?"
memory = "User mentioned they prefer Python for backend development"

inputs = tokenizer(query, memory, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = torch.sigmoid(model(**inputs).logits).item()

print(f"Relevance: {score:.2%}")

Rerank Top-K Candidates

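def rerank(query, candidates, top_k=10):
    """Score each candidate against the query; return the top_k (score, memory) pairs."""
    # Uses the `model` and `tokenizer` loaded in Quick Start.
    scores = []
    for memory in candidates:
        inputs = tokenizer(query, memory, return_tensors="pt", truncation=True)
        with torch.no_grad():
            score = torch.sigmoid(model(**inputs).logits).item()
        scores.append((score, memory))
    # Sort by score, highest first
    return sorted(scores, reverse=True)[:top_k]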

# Example usage
top_50_from_vector_db = [...]
top_10 = rerank("What is my Python preference?", top_50_from_vector_db)
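
Scoring one pair per forward pass is simple but leaves throughput on the table. The following is a hedged sketch of a batched variant: `rerank_batched` and its `batch_size` parameter are hypothetical names, and it assumes the model emits a single logit per pair, as the `.item()` call in Quick Start implies.

```python
import torch


def rerank_batched(model, tokenizer, query, candidates, top_k=10,
                   batch_size=16, max_length=512):
    """Score candidates in padded batches; return the top_k (score, memory) pairs."""
    scores = []
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        # Pair the same query with every memory in the batch; pad to equal length.
        inputs = tokenizer(
            [query] * len(batch),
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumes one logit per pair: flatten (batch, 1) to (batch,).
        scores.extend(torch.sigmoid(logits).reshape(-1).tolist())
    # Sort by score only, highest first.
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]
```

With the objects from Quick Start, a call would look like `rerank_batched(model, tokenizer, "What is my Python preference?", top_50_from_vector_db)`. The best batch size depends on memory length and available hardware.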

Full Two-Stage Pipeline

from agentrank import AgentRankEmbedder
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

query = "What is my Python preference?"

# Stage 1: Fast retrieval with embedder
embedder = AgentRankEmbedder.from_pretrained("vrushket/agentrank-base")
query_embedding = embedder.encode([query])
# ... search vector DB for top-50 candidates ...
top_50_candidates = [...]  # placeholder: memory strings returned by the vector search

# Stage 2: Accurate reranking
reranker = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
reranker.eval()  # inference mode (disables dropout)

scored = []
for memory in top_50_candidates:
    inputs = tokenizer(query, memory, return_tensors="pt", truncation=True)
    with torch.no_grad():
        score = torch.sigmoid(reranker(**inputs).logits).item()
    scored.append((score, memory))

# Return top-10
final_results = sorted(scored, reverse=True)[:10]
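
The vector DB search in Stage 1 is elided above. As a stand-in for small collections, a brute-force cosine-similarity search can fill that step. This is only an illustrative sketch: `cosine` and `top_k_by_cosine` are hypothetical helpers, the vectors are assumed to be plain sequences of floats, and a production system would normally use a vector database or an approximate nearest-neighbour index instead.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(
    query_vec: Sequence[float],
    memory_vecs: Sequence[Sequence[float]],
    memories: Sequence[str],
    k: int = 50,
) -> List[str]:
    """Return the k memories whose embeddings are most similar to the query embedding."""
    sims = [(cosine(query_vec, vec), memory) for vec, memory in zip(memory_vecs, memories)]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in sims[:k]]


# Toy 2-D embeddings (hypothetical data) to show the call shape.
candidates = top_k_by_cosine([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], ["a", "b", "c"], k=2)
```

The list returned here plays the role of `top_50_candidates` in the pipeline; the reranker then reorders it.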

Related Models

| Model | Type | Link |
|---|---|---|
| AgentRank-Base | Embedder | https://huggingface.co/vrushket/agentrank-base |
| AgentRank-Small | Embedder | https://huggingface.co/vrushket/agentrank-small |
| AgentRank-Reranker | Cross-encoder | This model |

Training Details

  • Base Model: ModernBERT-base
  • Epochs: 3
  • Batch Size: 16
  • Learning Rate: 2e-5
  • Loss: Binary Cross-Entropy
  • Hardware: 2x RTX 6000 Ada (48GB)
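
For readers unfamiliar with the loss listed above, here is a minimal sketch of binary cross-entropy for a single query-memory pair. It assumes the loss is applied to the raw logit with label 1 for a relevant pair and 0 for an irrelevant one, which matches the sigmoid used at inference but is not confirmed by the training details; `bce_with_logits` is a hypothetical helper, not the training code.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy between sigmoid(logit) and a 0/1 label.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)),
    which equals -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(x).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident, correct prediction costs little; a confident, wrong one costs a lot.
loss_correct = bce_with_logits(5.0, 1.0)
loss_wrong = bce_with_logits(-5.0, 1.0)
```

In PyTorch the same quantity is available as `torch.nn.BCEWithLogitsLoss`.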

License

Apache 2.0
