---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-small
  results:
  - task:
      type: retrieval
      name: Agent Memory Retrieval
    metrics:
    - type: mrr
      value: 0.6375
      name: MRR
    - type: recall
      value: 0.4460
      name: Recall@1
    - type: recall
      value: 0.9740
      name: Recall@5
    - type: ndcg
      value: 0.6797
      name: NDCG@10
---

# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval


**AgentRank** is the first embedding model family specifically designed for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank understands temporal context, memory types, and importance, all of which are critical for agents that need to remember past interactions.

## πŸš€ Key Results

| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|-------|-----|----------|----------|---------|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |

**+20% MRR improvement over the all-MiniLM-L6-v2 base model!**

## 🎯 Why AgentRank?

AI agents need memory that understands:

| Challenge | General Embedders | AgentRank |
|-----------|-------------------|-----------|
| "What did I say **yesterday**?" | ❌ No temporal awareness | βœ… Temporal embeddings |
| "What's my **preference**?" | ❌ Mixes preferences with events | βœ… Memory type awareness |
| "What's **most important**?" | ❌ No priority signal | βœ… Importance prediction |

## πŸ“¦ Installation

```bash
pip install transformers torch
```

## πŸ’» Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool over non-padding tokens only, then L2-normalize
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    return embeddings

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"

memory_embeddings = encode(memories)
query_embedding = encode([query])

# Compute similarities (dot product of L2-normalized vectors = cosine similarity)
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax().item()]}")
# Output: "Most relevant: User prefers Python over JavaScript"
```
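### Retrieving the Top-k Memories

In practice an agent's memory store holds far more than three entries, and downstream prompts usually consume several matches rather than a single argmax. The sketch below reuses the `encode` helper and `memories` list from the basic example above; `retrieve_top_k` is an illustrative helper, not part of this model's API.

```python
import torch

def retrieve_top_k(query, memory_texts, k=3):
    """Return the k most similar memories as (text, cosine score) pairs."""
    memory_embeddings = encode(memory_texts)  # (n, 384), L2-normalized
    query_embedding = encode([query])         # (1, 384), L2-normalized
    scores = (query_embedding @ memory_embeddings.T).squeeze(0)
    top = torch.topk(scores, k=min(k, len(memory_texts)))
    return [
        (memory_texts[i], score)
        for i, score in zip(top.indices.tolist(), top.values.tolist())
    ]

for text, score in retrieve_top_k("What does the user like?", memories, k=2):
    print(f"{score:.3f}  {text}")
```

Because the embeddings are already L2-normalized, the dot product equals cosine similarity, so no extra normalization is needed at query time.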
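### Loading with sentence-transformers

This card is tagged for sentence-transformers, so the checkpoint may also be loadable through that API. A minimal sketch, assuming the repository ships a compatible sentence-transformers configuration (not verified here); if loading fails, use the `transformers` snippet above.

```python
from sentence_transformers import SentenceTransformer

# Assumption: the checkpoint includes a sentence-transformers config.
model = SentenceTransformer("vrushket/agentrank-small")
embeddings = model.encode(
    ["User prefers Python over JavaScript"],
    normalize_embeddings=True,  # match the L2 normalization used above
)
```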
### With Temporal & Memory Type Metadata (Full Power)

```python
# For full AgentRank features including temporal awareness:
# pip install agentrank (coming soon!)
from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # Memory is 3 days old
    memory_type="semantic",  # It's a preference, not an event
)
```

## πŸ—οΈ Architecture

AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚    MiniLM Transformer Encoder (6 layers)    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
      ↓                 ↓                 ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Temporal   β”‚   β”‚ Memory     β”‚   β”‚ Importance β”‚
β”‚ Position   β”‚   β”‚ Type       β”‚   β”‚ Prediction β”‚
β”‚ Embed      β”‚   β”‚ Embed      β”‚   β”‚ Head       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      β”‚                 β”‚                 β”‚
      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                        ↓
             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
             β”‚ L2 Normalized     β”‚
             β”‚ 384-dim Embedding β”‚
             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

**Novel Features:**
- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- **Memory Type Embeddings**: Episodic, Semantic, Procedural
- **Importance Prediction Head**: Auxiliary task during training

## πŸŽ“ Training

- **Dataset**: 500K synthetic agent memory samples
- **Memory Types**: Episodic (40%), Semantic (35%), Procedural (25%)
- **Loss**: Multiple Negatives Ranking Loss + Importance MSE
- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
- **Hardware**: NVIDIA RTX 6000 Ada (48 GB) with FP16

## πŸ“Š Benchmarks

Evaluated on AgentMemBench (500 test samples, 8 candidates each):

| Metric | AgentRank-Small | MiniLM | Improvement |
|--------|-----------------|--------|-------------|
| MRR | 0.6375 | 0.5297 | **+20.4%** |
| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |

## πŸ”œ Coming Soon

- **AgentRank-Base**: 110M params, even better performance
- **AgentRank-Reranker**: Cross-encoder for top-k refinement
- **Python Package**: `pip install agentrank`

## πŸ“š Citation

```bibtex
@misc{agentrank2024,
  author = {Vrushket More},
  title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/vrushket/agentrank-small}
}
```

## πŸ“„ License

Apache 2.0. Free for commercial use!

## 🀝 Acknowledgments

Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).