---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-small
results:
- task:
type: retrieval
name: Agent Memory Retrieval
metrics:
- type: mrr
value: 0.6375
name: MRR
- type: recall
value: 0.4460
name: Recall@1
- type: recall
value: 0.9740
name: Recall@5
- type: ndcg
value: 0.6797
name: NDCG@10
---
# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval
<p align="center">
<img src="https://img.shields.io/badge/MRR-0.6375-brightgreen" alt="MRR">
<img src="https://img.shields.io/badge/Recall%405-97.4%25-blue" alt="Recall@5">
<img src="https://img.shields.io/badge/Parameters-33M-orange" alt="Parameters">
<img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
</p>
**AgentRank** is the first embedding model family designed specifically for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank understands temporal context, memory types, and importance, which is critical for agents that need to remember past interactions.
## Key Results
| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|-------|-----|----------|----------|---------|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |
**+20% MRR improvement over the all-MiniLM-L6-v2 baseline!**
## Why AgentRank?
AI agents need memory that understands:
| Challenge | General Embedders | AgentRank |
|-----------|-------------------|-----------|
| "What did I say **yesterday**?" | β No temporal awareness | β
Temporal embeddings |
| "What's my **preference**?" | β Mixes with events | β
Memory type awareness |
| "What's **most important**?" | β No priority | β
Importance prediction |
## Installation
```bash
pip install transformers torch
```
## Usage
### Basic Usage
```python
from transformers import AutoModel, AutoTokenizer
import torch
# Load model
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")
def encode(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool over non-padding tokens, then L2-normalize
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    return embeddings
# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"
memory_embeddings = encode(memories)
query_embedding = encode([query])
# Compute similarities
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax()]}")
# Output: "User prefers Python over JavaScript"
```
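To return more than one memory, the same similarity scores can be ranked with `torch.topk`. This short follow-up reuses the `encode` helper and the variables from the snippet above (`k=2` is an arbitrary choice for illustration):

```python
# Continues the basic-usage snippet: rank all memories for the query and
# keep the top-k. k=2 here is arbitrary, just for illustration.
top_scores, top_idx = torch.topk(similarities.squeeze(0), k=2)
for score, idx in zip(top_scores.tolist(), top_idx.tolist()):
    print(f"{score:.3f}  {memories[idx]}")
```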
### With Temporal & Memory Type Metadata (Full Power)
```python
# For full AgentRank features including temporal awareness:
# pip install agentrank (coming soon!)
from agentrank import AgentRankEmbedder
model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")
# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # Memory is 3 days old
    memory_type="semantic",  # It's a preference, not an event
)
```
## Architecture
AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:
```
┌─────────────────────────────────────────┐
│  MiniLM Transformer Encoder (6 layers)  │
└─────────────────────────────────────────┘
                    │
     ┌──────────────┼──────────────┐
     │              │              │
┌──────────┐  ┌──────────┐  ┌────────────┐
│ Temporal │  │  Memory  │  │ Importance │
│ Position │  │   Type   │  │ Prediction │
│  Embed   │  │  Embed   │  │    Head    │
└──────────┘  └──────────┘  └────────────┘
     │              │              │
     └──────────────┼──────────────┘
                    │
         ┌─────────────────────┐
         │    L2 Normalized    │
         │  384-dim Embedding  │
         └─────────────────────┘
```
**Novel Features:**
- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- **Memory Type Embeddings**: Episodic, Semantic, Procedural
- **Importance Prediction Head**: Auxiliary task during training
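These additions sit on top of the pooled MiniLM representation. Below is a minimal, hypothetical sketch of how such heads could be wired together; module names, the bucket granularity, and the way the metadata embeddings are combined with the text embedding are assumptions for illustration, not the released implementation:

```python
# Hypothetical sketch of the head layout described above; names and
# combination strategy are illustrative assumptions, not the released code.
import torch
import torch.nn as nn
from transformers import AutoModel

class AgentRankSketch(nn.Module):
    def __init__(self, hidden_dim=384, num_time_buckets=10, num_memory_types=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
        # Temporal position embeddings: one vector per recency bucket
        self.temporal_embed = nn.Embedding(num_time_buckets, hidden_dim)
        # Memory type embeddings: episodic / semantic / procedural
        self.type_embed = nn.Embedding(num_memory_types, hidden_dim)
        # Importance prediction head used as an auxiliary training task
        self.importance_head = nn.Linear(hidden_dim, 1)

    def forward(self, input_ids, attention_mask, time_bucket, memory_type):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Masked mean pooling over token embeddings
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
        # Inject temporal and memory-type context into the text embedding
        pooled = pooled + self.temporal_embed(time_bucket) + self.type_embed(memory_type)
        embedding = nn.functional.normalize(pooled, p=2, dim=1)  # L2-normalized, 384-dim
        importance = torch.sigmoid(self.importance_head(pooled)).squeeze(-1)
        return embedding, importance
```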
## Training
- **Dataset**: 500K synthetic agent memory samples
- **Memory Types**: Episodic (40%), Semantic (35%), Procedural (25%)
- **Loss**: Multiple Negatives Ranking Loss + Importance MSE
- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
- **Hardware**: NVIDIA RTX 6000 Ada (48GB) with FP16
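The two loss terms can be combined in the usual way: in-batch Multiple Negatives Ranking Loss over (query, positive memory) pairs plus an MSE term on the predicted importance. The sketch below assumes L2-normalized embeddings; the temperature and loss weighting are illustrative assumptions, not the actual training hyperparameters:

```python
# Sketch of the combined objective; scale and importance_weight are
# illustrative assumptions, not the values used in training.
import torch
import torch.nn.functional as F

def combined_loss(query_emb, memory_emb, pred_importance, true_importance,
                  scale=20.0, importance_weight=0.1):
    # Row i of memory_emb is the positive for query i; every other row in
    # the batch serves as an in-batch negative.
    scores = scale * query_emb @ memory_emb.T                  # (batch, batch)
    labels = torch.arange(scores.size(0), device=scores.device)
    ranking_loss = F.cross_entropy(scores, labels)             # Multiple Negatives Ranking Loss
    importance_loss = F.mse_loss(pred_importance, true_importance)
    return ranking_loss + importance_weight * importance_loss
```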
## Benchmarks
Evaluated on AgentMemBench (500 test samples, 8 candidates each):
| Metric | AgentRank-Small | MiniLM | Improvement |
|--------|-----------------|--------|-------------|
| MRR | 0.6375 | 0.5297 | **+20.4%** |
| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |
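With a single relevant memory among the 8 candidates per query, the reported metrics reduce to simple functions of the rank of that memory. A small sketch (the helper name and toy inputs are hypothetical):

```python
# Sketch of the metric definitions used in the table above, assuming one
# relevant memory per query; the helper and example ranks are hypothetical.
import math

def retrieval_metrics(ranks, recall_ks=(1, 5), ndcg_k=10):
    # ranks: 1-based position of the relevant memory in each query's ranking
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    recall = {k: sum(r <= k for r in ranks) / n for k in recall_ks}
    # With a single relevant item, NDCG@k is 1/log2(rank + 1) when rank <= k, else 0
    ndcg = sum(1.0 / math.log2(r + 1) if r <= ndcg_k else 0.0 for r in ranks) / n
    return mrr, recall, ndcg

print(retrieval_metrics([1, 2, 5, 1]))  # toy example with four queries
```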
## Coming Soon
- **AgentRank-Base**: 110M params, even better performance
- **AgentRank-Reranker**: Cross-encoder for top-k refinement
- **Python Package**: `pip install agentrank`
## Citation
```bibtex
@misc{agentrank2024,
author = {Vrushket More},
title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/vrushket/agentrank-small}
}
```
## License
Apache 2.0 - Free for commercial use!
## Acknowledgments
Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).