Upload AgentRank model

Files changed (8)

README.md +207 -0
agentrank_components.pt +3 -0
config.json +25 -0
model.safetensors +3 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +65 -0
vocab.txt +0 -0

README.md ADDED

	@@ -0,0 +1,207 @@

+---
+license: apache-2.0
+language:
+- en
+tags:
+- sentence-transformers
+- embeddings
+- retrieval
+- agents
+- memory
+- rag
+- semantic-search
+library_name: transformers
+pipeline_tag: sentence-similarity
+datasets:
+- custom
+metrics:
+- mrr
+- recall
+- ndcg
+model-index:
+- name: agentrank-small
+  results:
+  - task:
+      type: retrieval
+      name: Agent Memory Retrieval
+    metrics:
+    - type: mrr
+      value: 0.6375
+      name: MRR
+    - type: recall
+      value: 0.4460
+      name: Recall@1
+    - type: recall
+      value: 0.9740
+      name: Recall@5
+    - type: ndcg
+      value: 0.6797
+      name: NDCG@10
+---
+# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval
+<p align="center">
+  <img src="https://img.shields.io/badge/MRR-0.6375-brightgreen" alt="MRR">
+  <img src="https://img.shields.io/badge/Recall%405-97.4%25-blue" alt="Recall@5">
+  <img src="https://img.shields.io/badge/Parameters-23M-orange" alt="Parameters">
+  <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
+</p>
+**AgentRank** is the first embedding model family designed specifically for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank models temporal context, memory type, and importance, all of which matter for agents that need to recall past interactions.
+## 🚀 Key Results
+| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
+|-------|-----|----------|----------|---------|
+| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
+| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
+| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |
+**+20.4% relative MRR improvement over the base all-MiniLM-L6-v2 model.**
+## 🎯 Why AgentRank?
+AI agents need memory that understands:
+| Challenge | General Embedders | AgentRank |
+|-----------|-------------------|-----------|
+| "What did I say **yesterday**?" | ❌ No temporal awareness | ✅ Temporal embeddings |
+| "What's my **preference**?" | ❌ Mixes with events | ✅ Memory type awareness |
+| "What's **most important**?" | ❌ No priority | ✅ Importance prediction |
+## 📦 Installation
+```bash
+pip install transformers torch
+```
+## 💻 Usage
+### Basic Usage
+```python
+from transformers import AutoModel, AutoTokenizer
+import torch
+# Load the encoder and tokenizer
+model = AutoModel.from_pretrained("vrushket/agentrank-small")
+tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")
+def encode(texts):
+    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
+    with torch.no_grad():
+        outputs = model(**inputs)
+    # Mean-pool over real tokens only, so padding does not skew the embedding
+    mask = inputs["attention_mask"].unsqueeze(-1).to(outputs.last_hidden_state.dtype)
+    summed = (outputs.last_hidden_state * mask).sum(dim=1)
+    embeddings = summed / mask.sum(dim=1).clamp(min=1e-9)
+    # L2-normalize so the dot product below equals cosine similarity
+    return torch.nn.functional.normalize(embeddings, p=2, dim=1)
+# Encode memories and query
+memories = [
+    "User prefers Python over JavaScript",
+    "User asked about machine learning yesterday",
+    "User is working on a web project",
+]
+query = "What programming language does the user like?"
+memory_embeddings = encode(memories)
+query_embedding = encode([query])
+# Compute cosine similarities, shape (1, len(memories))
+similarities = torch.mm(query_embedding, memory_embeddings.T)
+best = similarities.argmax().item()
+print(f"Most relevant: {memories[best]}")
+# Expected top match: "User prefers Python over JavaScript"
+```
+### With Temporal & Memory Type Metadata (Full Power)
+```python
+# For full AgentRank features including temporal awareness:
+# pip install agentrank  (coming soon!)
+from agentrank import AgentRankEmbedder
+model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")
+# Encode with metadata
+embedding = model.encode(
+    "User mentioned they prefer morning meetings",
+    days_ago=3,           # Memory is 3 days old
+    memory_type="semantic" # It's a preference, not an event
+)
+```
+## 🏗️ Architecture
+AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:
+```
+┌─────────────────────────────────────────┐
+│  MiniLM Transformer Encoder (6 layers)  │
+└─────────────────────────────────────────┘
+                    │
+    ┌───────────────┼───────────────┐
+    ↓               ↓               ↓
+┌─────────┐   ┌──────────┐   ┌───────────┐
+│ Temporal │   │ Memory   │   │ Importance│
+│ Position │   │ Type     │   │ Prediction│
+│ Embed    │   │ Embed    │   │ Head      │
+└─────────┘   └──────────┘   └───────────┘
+    │               │               │
+    └───────────────┼───────────────┘
+                    ↓
+         ┌─────────────────┐
+         │ L2 Normalized   │
+         │ 384-dim Embedding│
+         └─────────────────┘
+```
+**Novel Features:**
+- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
+- **Memory Type Embeddings**: Episodic, Semantic, Procedural
+- **Importance Prediction Head**: Auxiliary task during training
+## 🎓 Training
+- **Dataset**: 500K synthetic agent memory samples
+- **Memory Types**: Episodic (40%), Semantic (35%), Procedural (25%)
+- **Loss**: Multiple Negatives Ranking Loss + Importance MSE
+- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
+- **Hardware**: NVIDIA RTX 6000 Ada (48GB) with FP16
+## 📊 Benchmarks
+Evaluated on AgentMemBench (500 test samples, 8 candidates each):
+| Metric | AgentRank-Small | all-MiniLM-L6-v2 | Relative improvement |
+|--------|-----------------|------------------|----------------------|
+| MRR | 0.6375 | 0.5297 | **+20.4%** |
+| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
+| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
+| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |
+## 🔜 Coming Soon
+- **AgentRank-Base**: 110M params, even better performance
+- **AgentRank-Reranker**: Cross-encoder for top-k refinement
+- **Python Package**: `pip install agentrank`
+## 📚 Citation
+```bibtex
+@misc{agentrank2024,
+  author = {Vrushket More},
+  title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
+  year = {2024},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/vrushket/agentrank-small}
+}
+```
+## 📄 License
+Apache 2.0 - Free for commercial use!
+## 🤝 Acknowledgments
+Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).
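
The README's benchmark table reports MRR, Recall@k, and NDCG@k but does not show how they are computed. The following is a minimal sketch of the standard definitions, assuming binary relevance and one ranked candidate list per query. The function names and the toy memory ids are hypothetical and are not taken from AgentMemBench or from the author's evaluation code.

```python
import math

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant candidate, or 0.0 if none is found."""
    for pos, cand in enumerate(ranked, start=1):
        if cand in relevant:
            return 1.0 / pos
    return 0.0

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant candidates that appear in the top k."""
    hits = sum(1 for cand in ranked[:k] if cand in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 1)
        for pos, cand in enumerate(ranked[:k], start=1)
        if cand in relevant
    )
    ideal = sum(
        1.0 / math.log2(pos + 1)
        for pos in range(1, min(k, len(relevant)) + 1)
    )
    return dcg / ideal if ideal else 0.0

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline

# Toy data (hypothetical): a ranked candidate list and the relevant set per query.
queries = [
    (["m3", "m1", "m7"], {"m1"}),  # relevant memory ranked second
    (["m2", "m5", "m4"], {"m2"}),  # relevant memory ranked first
]

mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
recall_1 = sum(recall_at_k(r, rel, 1) for r, rel in queries) / len(queries)
ndcg_10 = sum(ndcg_at_k(r, rel, 10) for r, rel in queries) / len(queries)

print(mrr)       # 0.75
print(recall_1)  # 0.5
print(ndcg_10)
```

Applied to the reported MRR values, `relative_improvement(0.6375, 0.5297)` gives about 0.2035, which matches the +20.4% shown in the table.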

agentrank_components.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f2116b40785efd6d5ad2e5303327afd730b089b4768e14f6c9d27524fa1fb17e
+size 912924
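
The three lines above are a Git LFS pointer, not the binary file itself: they record the spec version, the SHA-256 of the real object, and its size in bytes. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical and this is not the Git LFS client's own code.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

# Pointer text copied from the diff above, without the leading "+" markers.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:f2116b40785efd6d5ad2e5303327afd730b089b4768e14f6c9d27524fa1fb17e
size 912924
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # sha256 912924
```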

config.json ADDED

	@@ -0,0 +1,25 @@

+{
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "dtype": "float32",
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 384,
+  "initializer_range": 0.02,
+  "intermediate_size": 1536,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 6,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "transformers_version": "4.57.3",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}
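
The encoder's parameter count can be estimated from this config alone. The sketch below assumes the standard `BertModel` layout (embeddings, `num_hidden_layers` encoder layers, and a pooler) and counts weights and biases by hand. The function name `bert_param_count` is hypothetical, and the weights stored separately in `agentrank_components.pt` are not included.

```python
def bert_param_count(cfg):
    """Estimate trainable parameters of a standard BertModel from its config."""
    h = cfg["hidden_size"]
    ffn = cfg["intermediate_size"]
    layer_norm = 2 * h  # weight + bias

    # Word, position, and token-type embeddings, followed by a LayerNorm.
    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm

    # Query, key, value, and output projections (weights + biases), then LayerNorm.
    attention = 4 * (h * h + h) + layer_norm

    # Two feed-forward projections (weights + biases), then LayerNorm.
    feed_forward = (h * ffn + ffn) + (ffn * h + h) + layer_norm

    # Pooler dense layer applied to the [CLS] position.
    pooler = h * h + h

    return embeddings + cfg["num_hidden_layers"] * (attention + feed_forward) + pooler

# Values copied from the config.json diff above.
config = {
    "hidden_size": 384,
    "intermediate_size": 1536,
    "max_position_embeddings": 512,
    "num_hidden_layers": 6,
    "type_vocab_size": 2,
    "vocab_size": 30522,
}

params = bert_param_count(config)
print(params)      # 22713216
print(params * 4)  # 90852864 bytes at float32
```

At 4 bytes per float32 weight this comes to 90,852,864 bytes. The `model.safetensors` pointer in this commit lists 90,864,192 bytes; the remaining ~11 KB would be consistent with file header overhead, though that is an assumption and has not been verified against the file.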

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fdb9fcdb851dbc7bba1f70a3ac3bf3bc1de46a4a8e0284d50b0270e8d69869f0
+size 90864192

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,65 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "max_length": 128,
+  "model_max_length": 512,
+  "never_split": null,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "stride": 0,
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]"
+}

vocab.txt ADDED

The diff for this file is too large to render. See raw diff