---
library_name: pytorch
license: apache-2.0
tags:
  - vision
  - image-classification
  - geometric-deep-learning
  - vit
  - cantor-routing
  - pentachoron
  - multi-scale
---

# 🫘💎 DavidBeans: Unified Vision-to-Crystal Architecture

This repository contains training runs for DavidBeans, a unified geometric deep learning architecture that combines:

- **BEANS (ViT Backbone)**: Cantor-routed sparse attention
- **DAVID (Classifier)**: Multi-scale crystal projection with Cayley-Menger geometric regularization
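
For readers unfamiliar with the Cayley-Menger term, the sketch below computes the volume of a simplex from its pairwise squared distances using the general Cayley-Menger determinant; for five vertices this is a pentachoron (4-simplex). It only illustrates the textbook formula. How DavidBeans turns this quantity into a regularization loss is not described in this README, and the function names `determinant` and `cayley_menger_volume` are hypothetical, not part of the `david_beans` package.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Choose the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def cayley_menger_volume(points):
    """Volume of the n-simplex spanned by n + 1 points (illustrative only)."""
    k = len(points)  # number of vertices
    n = k - 1        # simplex dimension
    # Bordered matrix: first row/column are ones (with a zero corner),
    # the remaining block holds pairwise squared distances.
    cm = [[0.0] * (k + 1) for _ in range(k + 1)]
    for i in range(k):
        cm[0][i + 1] = cm[i + 1][0] = 1.0
        for j in range(k):
            cm[i + 1][j + 1] = sum(
                (p - q) ** 2 for p, q in zip(points[i], points[j])
            )
    coeff = (-1) ** (n + 1) / (2 ** n * math.factorial(n) ** 2)
    vol_sq = coeff * determinant(cm)
    return math.sqrt(max(vol_sq, 0.0))  # clamp tiny negative round-off


# A pentachoron: the origin plus the four unit vectors of R^4.
pentachoron = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
volume = cayley_menger_volume(pentachoron)
```

A volume near zero means the five vertices have collapsed into a lower-dimensional subspace, which is the kind of degeneracy a geometric regularizer would typically penalize.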

## Repository Structure

```
AbstractPhil/geovit-david-beans/
├── README.md (this file)
└── weights/
    ├── run_001_baseline_YYYYMMDD_HHMMSS/
    │   ├── best.safetensors
    │   ├── epoch_010.safetensors
    │   ├── config.json
    │   ├── training_config.json
    │   └── tensorboard/
    ├── run_002_5expert_5scale_YYYYMMDD_HHMMSS/
    │   └── ...
    └── ...
```
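
The run directory names appear to follow the pattern `run_<number>_<name>_<YYYYMMDD>_<HHMMSS>`. A minimal sketch for splitting such a name into its parts is shown below. The pattern is inferred from the names listed in this README, and `RunInfo` and `parse_run_dir` are hypothetical helpers, not part of the repository.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout: run_<number>_<name>_<YYYYMMDD>_<HHMMSS>
RUN_DIR_PATTERN = re.compile(r"^run_(\d+)_(.+)_(\d{8})_(\d{6})$")


class RunInfo(NamedTuple):
    number: int
    name: str
    timestamp: datetime


def parse_run_dir(dirname: str) -> Optional[RunInfo]:
    """Parse a run directory name; return None if it does not match."""
    match = RUN_DIR_PATTERN.match(dirname)
    if match is None:
        return None
    number, name, date, time = match.groups()
    stamp = datetime.strptime(date + time, "%Y%m%d%H%M%S")
    return RunInfo(int(number), name, stamp)


info = parse_run_dir("run_002_5expert_5scale_20251129_171229")
```

Names that still carry the literal `YYYYMMDD_HHMMSS` placeholder do not match the pattern and yield `None`.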

## Usage

```python
import json

import torch
from safetensors.torch import load_file
from david_beans import DavidBeans, DavidBeansConfig

# Pick a run
run_path = "weights/run_002_5expert_5scale_20251129_171229"

# Load config
with open(f"{run_path}/config.json") as f:
    config_dict = json.load(f)
config = DavidBeansConfig(**config_dict)

# Load model
model = DavidBeans(config)
state_dict = load_file(f"{run_path}/best.safetensors")
model.load_state_dict(state_dict)

# Inference: `images` is a float tensor batch of shape [B, 3, 32, 32]
# (a random placeholder here; substitute real, preprocessed images)
images = torch.randn(1, 3, 32, 32)
model.eval()
with torch.no_grad():
    output = model(images)
    predictions = output['logits'].argmax(dim=-1)
```

## Training Runs

| Run | Name | Accuracy | Notes |
|-----|------|----------|-------|
| 001 | baseline | 70.05% | Initial CIFAR-100 run |
| 002 | 5expert_5scale | 68.34% | 5 experts, 5 scales |

## Architecture

```
Image [B, 3, 32, 32]
       │
       ▼
┌─────────────────────────────────────────┐
│  BEANS BACKBONE                         │
│  ├─ Patch Embed → [64 patches, dim]     │
│  ├─ Hybrid Cantor Router                │
│  ├─ N × Attention Blocks                │
│  └─ N × Pentachoron Expert Layers       │
└─────────────────────────────────────────┘
       │
       ▼
┌─────────────────────────────────────────┐
│  DAVID HEAD                             │
│  ├─ Multi-scale projection              │
│  ├─ Per-scale Crystal Heads             │
│  └─ Geometric Fusion                    │
└─────────────────────────────────────────┘
       │
       ▼
    [num_classes]
```
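
The diagram gives a 32×32 input and 64 patches but does not state the patch size. Assuming square, non-overlapping patches (an assumption, not something this README confirms), a patch size of 4 is the value consistent with those numbers, as the small sketch below checks. `patch_grid` is a hypothetical helper for illustration.

```python
from typing import Tuple


def patch_grid(image_size: int, patch_size: int) -> Tuple[int, int]:
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# 32x32 image with assumed 4x4 patches: an 8x8 grid of patches.
per_side, total = patch_grid(32, 4)
```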

## License

Apache 2.0