AgentRank Reranker

Cross-encoder reranker for AI agent memory retrieval, used as the second stage of a two-stage retrieval pipeline.

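What Is This?

AgentRank Reranker is a cross-encoder that scores query-memory pairs for relevance. It reads the query and the memory together in a single forward pass, which makes it more accurate but slower than comparing precomputed embeddings. Use it after fast vector retrieval to rerank a short list of candidates.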

Two-Stage Pipeline

Stage	Model	Speed	Accuracy
1. Retrieve	agentrank-base	Fast	Good
2. Rerank	This model	Slower	Best

Performance

Metric	Value
Validation Accuracy	89.11%
Best Val Loss	0.2554
Base Model	ModernBERT-base
Parameters	~149M

Quick Start

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model
model = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
model.eval()

# Score relevance
query = "What did we discuss about Python?"
memory = "User mentioned they prefer Python for backend development"

inputs = tokenizer(query, memory, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = torch.sigmoid(model(**inputs).logits).item()

print(f"Relevance: {score:.2%}")

Rerank Top-K Candidates

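def rerank(query, candidates, top_k=10):
    """Score each candidate against the query and return the top_k (score, memory) pairs."""
    scores = []
    for memory in candidates:
        # max_length=512 matches the Quick Start example
        inputs = tokenizer(query, memory, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            score = torch.sigmoid(model(**inputs).logits).item()
        scores.append((score, memory))
    # Sort by score only, highest first
    return sorted(scores, key=lambda pair: pair[0], reverse=True)[:top_k]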

# Example usage
top_50_from_vector_db = [...]
top_10 = rerank("What is my Python preference?", top_50_from_vector_db)
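
The loop above runs one forward pass per candidate. A minimal sketch of a batched variant follows, assuming the model has a single-logit head as the Quick Start implies. `make_hf_scorer`, `rerank_batched`, and `BatchScorer` are hypothetical names for illustration, not part of any AgentRank package, and the batch size of 16 is an arbitrary starting point.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical alias: takes a query and a batch of memories, returns one score per memory.
BatchScorer = Callable[[str, Sequence[str]], List[float]]


def make_hf_scorer(model, tokenizer, max_length: int = 512) -> BatchScorer:
    """Wrap an already-loaded reranker and tokenizer as a batch scorer."""
    import torch  # imported here so rerank_batched itself has no torch dependency

    def score_batch(query: str, memories: Sequence[str]) -> List[float]:
        # Pair the same query with every memory and pad to the longest pair.
        inputs = tokenizer(
            [query] * len(memories),
            list(memories),
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumption: one logit per pair, so (batch, 1) flattens to (batch,).
        return torch.sigmoid(logits).reshape(-1).tolist()

    return score_batch


def rerank_batched(
    query: str,
    candidates: Sequence[str],
    score_batch: BatchScorer,
    top_k: int = 10,
    batch_size: int = 16,
) -> List[Tuple[float, str]]:
    """Score candidates in fixed-size batches and return the top_k (score, memory) pairs."""
    scored: List[Tuple[float, str]] = []
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        scored.extend(zip(score_batch(query, batch), batch))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

With the model and tokenizer from the Quick Start loaded, a call might look like `rerank_batched(query, candidates, make_hf_scorer(model, tokenizer))`. Keeping the scorer as a separate callable lets the batching and sorting logic be exercised without loading the model.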

Full Two-Stage Pipeline

from agentrank import AgentRankEmbedder
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Stage 1: Fast retrieval with embedder
query = "What is my Python preference?"
embedder = AgentRankEmbedder.from_pretrained("vrushket/agentrank-base")
query_embedding = embedder.encode([query])
# ... search vector DB for top-50 candidates ...
top_50_candidates = [...]  # placeholder: memory strings returned by the vector search

# Stage 2: Accurate reranking
reranker = AutoModelForSequenceClassification.from_pretrained("vrushket/agentrank-reranker")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-reranker")
reranker.eval()  # inference mode: disables dropout

scored = []
for memory in top_50_candidates:
    inputs = tokenizer(query, memory, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        score = torch.sigmoid(reranker(**inputs).logits).item()
    scored.append((score, memory))

# Return top-10
final_results = sorted(scored, reverse=True)[:10]
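
The vector search step is elided in the pipeline above. As a stand-in for small memory stores, a brute-force cosine search over precomputed embeddings could be sketched as follows. `cosine` and `top_k_by_cosine` are hypothetical helpers, and the sketch assumes embeddings are plain sequences of floats; a real deployment would more likely use a vector database.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_cosine(
    query_vec: Sequence[float],
    memory_vecs: Sequence[Sequence[float]],
    memories: Sequence[str],
    k: int = 50,
) -> List[str]:
    """Return the k memories whose embeddings are most similar to the query embedding."""
    scored = [(cosine(query_vec, vec), mem) for vec, mem in zip(memory_vecs, memories)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [mem for _, mem in scored[:k]]
```

The returned list of memory strings can then be passed to the reranking stage as its candidate set.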

Related Models

Model	Type	Link
AgentRank-Base	Embedder	https://huggingface.co/vrushket/agentrank-base
AgentRank-Small	Embedder	https://huggingface.co/vrushket/agentrank-small
AgentRank-Reranker	Cross-encoder	This model

Training Details

Base Model: ModernBERT-base
Epochs: 3
Batch Size: 16
Learning Rate: 2e-5
Loss: Binary Cross-Entropy
Hardware: 2x RTX 6000 Ada (48GB)
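
The training code itself is not shown here, only that the loss is binary cross-entropy. As a sketch of the standard definition for a single logit, the following shows how the loss relates to the sigmoid applied at inference time. The function names are illustrative, and the numerically stable form is a common rewrite, not necessarily what was used in training.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def bce_with_logit(logit: float, label: float) -> float:
    """Binary cross-entropy for one logit and a label in {0.0, 1.0}.

    Equivalent to -(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(logit),
    written in a form that avoids overflow for large |logit|.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
```

A relevant pair (label 1.0) with a high logit yields a loss near zero, while the same logit on an irrelevant pair (label 0.0) is penalized roughly in proportion to the logit.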

Contact

Author: Vrushket More
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/vrushket-more-07b38b25b/

License

Apache 2.0
