muzakkirhussain011 committed
Commit 8bab08d · 1 Parent(s): fa8f1a7

Add application files (text files only)

This view is limited to 50 files because it contains too many changes.   See raw diff
.dockerignore ADDED
@@ -0,0 +1,45 @@
+ # Git
+ .git
+ .gitignore
+
+ # Python
+ __pycache__
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+
+ # IDE
+ .vscode
+ .idea
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Logs
+ *.log
+ logs/
+
+ # Test
+ tests/
+ pytest_cache/
+ .pytest_cache/
+ .coverage
+ htmlcov/
+
+ # Documentation
+ *.md
+ !README.md
+
+ # Build artifacts
+ dist/
+ build/
+ *.egg-info/
.env.example ADDED
@@ -0,0 +1,49 @@
+ # file: .env.example
+
+ # =============================================================================
+ # CX AI Agent Configuration
+ # =============================================================================
+
+ # Hugging Face Configuration (REQUIRED)
+ HF_API_TOKEN=your_huggingface_api_token_here
+ MODEL_NAME=Qwen/Qwen2.5-7B-Instruct
+ MODEL_NAME_FALLBACK=mistralai/Mistral-7B-Instruct-v0.2
+
+ # Web Search Configuration
+ # Uses Serper API (serper.dev) - Low-cost Google Search API
+ # Get your free API key from: https://serper.dev/ (2,500 free searches/month)
+ SERPER_API_KEY=your_serper_api_key_here
+
+ # SKIP_WEB_SEARCH: Set to "true" to skip web search and use intelligent fallback data
+ # Recommended for demo environments, or when SERPER_API_KEY is not available
+ SKIP_WEB_SEARCH=false
+
+ # MCP Mode (for deployment)
+ # Set to "true" for Hugging Face Spaces (uses in-memory services)
+ # Set to "false" for local development (uses separate MCP servers)
+ USE_IN_MEMORY_MCP=true
+
+ # Paths
+ COMPANY_FOOTER_PATH=./data/footer.txt
+ VECTOR_INDEX_PATH=./data/faiss.index
+ COMPANIES_FILE=./data/companies.json
+ SUPPRESSION_FILE=./data/suppression.json
+
+ # Vector Store
+ EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
+ EMBEDDING_DIM=384
+
+ # MCP Server Ports
+ MCP_SEARCH_PORT=9001
+ MCP_EMAIL_PORT=9002
+ MCP_CALENDAR_PORT=9003
+ MCP_STORE_PORT=9004
+
+ # Compliance Flags
+ ENABLE_CAN_SPAM=true
+ ENABLE_PECR=true
+ ENABLE_CASL=true
+
+ # Scoring Thresholds
+ MIN_FIT_SCORE=0.5
+ FACT_TTL_HOURS=168
.gitignore ADDED
@@ -0,0 +1,32 @@
+ # Ignore Python virtual environment
+ .venv/
+
+ # Ignore Python cache files
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+
+ # Ignore database files
+ *.db
+ *.sqlite
+ *.sqlite3
+
+ # Ignore environment files
+ .env
+ .env.local
+
+ # Ignore IDE files
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # Ignore OS files
+ .DS_Store
+ Thumbs.db
+ nul
+
+ # Ignore Claude Code local settings
+ .claude/settings.local.json
DEMO_SCRIPT.md ADDED
@@ -0,0 +1,258 @@
+ # CX AI Agent - Demo Video Script (Silent Screen Recording)
+
+ ## Video Details
+ - **Duration**: 3-5 minutes recommended
+ - **Format**: Screen recording with on-screen text/captions
+ - **No narration**: Use text overlays to explain each step
+
+ ---
+
+ ## SCENE 1: Title Card (5 seconds)
+ **On-screen text:**
+ ```
+ CX AI Agent
+ AI-Powered B2B Sales Intelligence Platform
+
+ MCP in Action Track - Enterprise Applications
+ Gradio Agents & MCP Hackathon 2025
+ ```
+
+ ---
+
+ ## SCENE 2: Landing Page Overview (10 seconds)
+ **Action**: Show the app's main interface with sidebar navigation
+
+ **On-screen text:**
+ ```
+ Built with:
+ • Model Context Protocol (MCP)
+ • Gradio 5.x
+ • HuggingFace AI (Qwen2.5-72B)
+ • Autonomous AI Agents
+ ```
+
+ ---
+
+ ## SCENE 3: Setup Page (20 seconds)
+ **Action**:
+ 1. Click on "Setup" in the sidebar (should already be selected)
+ 2. Enter your HuggingFace token (paste it in)
+ 3. Enter your Serper API key (optional - paste if available)
+ 4. Type a company name: "TechFlow Solutions"
+ 5. Click the "Setup Company" button
+ 6. Wait for the AI to research the company
+
+ **On-screen text:**
+ ```
+ Step 1: Setup Your Company
+
+ • Enter API credentials
+ • AI automatically researches your company
+ • Builds knowledge base for prospect matching
+ ```
+
+ ---
+
+ ## SCENE 4: Dashboard Overview (15 seconds)
+ **Action**:
+ 1. Click "Dashboard" in the sidebar
+ 2. Show the stats cards (Prospects: 0, Contacts: 0, Emails: 0)
+ 3. Show the company status indicator
+
+ **On-screen text:**
+ ```
+ Dashboard: Real-time Pipeline Metrics
+
+ • Track prospects discovered
+ • Monitor contacts found
+ • View email drafts generated
+ ```
+
+ ---
+
+ ## SCENE 5: AI Discovery - The Core Feature (45 seconds)
+ **Action**:
+ 1. Click "Discovery" in the sidebar
+ 2. Set the number of prospects to find: 3
+ 3. Click the "Find Prospects" button
+ 4. Wait and watch the AI work (this is the main demo!)
+ 5. Observe the output showing discovered companies
+
+ **On-screen text (sequence):**
+ ```
+ Step 2: AI-Powered Discovery
+
+ [When clicking button]
+ Autonomous AI Agent activates...
+
+ [While processing]
+ MCP Tools in Action:
+ • search_web - Finding prospect companies
+ • save_prospect - Storing company data
+ • find_verified_contacts - Locating decision makers
+ • save_contact - Saving contact information
+
+ [When complete]
+ AI discovered 3 matching prospects with contacts!
+ ```
+
+ ---
+
+ ## SCENE 6: Prospects List (15 seconds)
+ **Action**:
+ 1. Click "Prospects" in the sidebar
+ 2. Scroll through discovered companies
+ 3. Show company details (name, industry, description)
+
+ **On-screen text:**
+ ```
+ Prospects: AI-Discovered Companies
+
+ • Automatically researched
+ • ICP-matched profiles
+ • Ready for outreach
+ ```
+
+ ---
+
+ ## SCENE 7: Contacts Found (15 seconds)
+ **Action**:
+ 1. Click "Contacts" in the sidebar
+ 2. Show the list of decision makers
+ 3. Point out titles (CEO, VP, Founder, etc.)
+
+ **On-screen text:**
+ ```
+ Contacts: Decision Makers Found
+
+ • C-level executives
+ • Department heads
+ • Verified contact info
+ • Title-based targeting
+ ```
+
+ ---
+
+ ## SCENE 8: AI-Drafted Emails (20 seconds)
+ **Action**:
+ 1. Click "Emails" in the sidebar
+ 2. Show the personalized email drafts
+ 3. Scroll to show email content
+ 4. Highlight personalization elements
+
+ **On-screen text:**
+ ```
+ Emails: AI-Personalized Outreach
+
+ • Tailored to each prospect
+ • Based on company research
+ • Ready to send
+ • One-click copy
+ ```
+
+ ---
+
+ ## SCENE 9: AI Chat Assistant (30 seconds)
+ **Action**:
+ 1. Click "AI Chat" in the sidebar
+ 2. Type: "What prospects have we found?"
+ 3. Wait for the AI response
+ 4. Type: "Tell me more about [first prospect name]"
+ 5. Show the response
+
+ **On-screen text:**
+ ```
+ AI Chat: Your Sales Assistant
+
+ • Ask about your pipeline
+ • Get prospect insights
+ • Request additional research
+ • Natural language interface
+ ```
+
+ ---
+
+ ## SCENE 10: Prospect Chat Demo (30 seconds)
+ **Action**:
+ 1. Stay on the "AI Chat" page, scroll to the "Prospect Chat Demo" section
+ 2. Type as if you're a prospect: "Hi, I'm interested in your services"
+ 3. Wait for the AI response
+ 4. Type: "What solutions do you offer for small businesses?"
+ 5. Click "Generate Handoff Packet"
+ 6. Show the generated packet
+
+ **On-screen text:**
+ ```
+ Prospect Chat Demo: Customer-Facing AI
+
+ • Qualifies leads automatically
+ • Answers product questions
+ • Generates handoff packets for sales team
+ • Escalation-ready workflows
+ ```
+
+ ---
+
+ ## SCENE 11: MCP Architecture Highlight (15 seconds)
+ **Action**:
+ 1. Click "About Us" in the sidebar
+ 2. Scroll to show the architecture or features section
+
+ **On-screen text:**
+ ```
+ Powered by Model Context Protocol (MCP)
+
+ MCP Servers:
+ • Search Server - Web & news research
+ • Store Server - Data persistence
+ • Email Server - Outreach management
+ • Calendar Server - Meeting scheduling
+ ```
+
+ ---
+
+ ## SCENE 12: Closing Card (10 seconds)
+ **On-screen text:**
+ ```
+ CX AI Agent
+ Autonomous B2B Sales Intelligence
+
+ Key Highlights:
+ ✓ MCP-powered tool orchestration
+ ✓ Autonomous AI agent architecture
+ ✓ End-to-end sales workflow automation
+ ✓ Real-time prospect discovery
+
+ Built for Gradio Agents & MCP Hackathon 2025
+ #mcp-in-action-track-enterprise
+
+ GitHub: [your-repo-url]
+ HuggingFace Space: [your-space-url]
+ ```
+
+ ---
+
+ ## Recording Tips
+
+ 1. **Resolution**: Record at 1920x1080 or higher
+ 2. **Browser**: Use Chrome/Edge in a clean window (no bookmarks bar)
+ 3. **Zoom**: Set browser zoom to 100% or 110% for readability
+ 4. **Cursor**: Use a cursor highlighter tool for visibility
+ 5. **Speed**: Move slowly; let viewers read the on-screen text
+ 6. **Pauses**: Pause 2-3 seconds on important screens
+ 7. **Loading**: If the AI is processing, add an "AI Processing..." text overlay
+
+ ## Text Overlay Tools (Free)
+ - **Kapwing** - Online video editor with text overlays
+ - **DaVinci Resolve** - Professional free editor
+ - **Clipchamp** - Windows 11 built-in editor
+ - **Canva Video** - Simple text animations
+
+ ## Suggested Background Music (Optional)
+ - Upbeat, corporate-friendly
+ - Low volume, non-distracting
+ - Royalty-free from YouTube Audio Library
+
+ ---
+
+ **Total Estimated Duration: ~3.5 minutes**
README.md CHANGED
@@ -1,12 +1,171 @@
  ---
- title: Cx Ai Agent V1
- emoji: 📚
- colorFrom: red
- colorTo: red
+ title: CX AI Agent - B2B Sales Intelligence
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
  sdk: gradio
- sdk_version: 6.0.2
+ sdk_version: 5.33.0
  app_file: app.py
  pinned: false
+ license: mit
+ short_description: AI-powered B2B sales automation with MCP tools
+ tags:
+ - mcp-in-action-track-enterprise
+ - mcp
+ - autonomous-agent
+ - b2b-sales
+ - prospect-discovery
+ - email-automation
+ - gradio
+ - huggingface
+ - qwen
+ - sales-intelligence
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🤖 CX AI Agent - B2B Sales Intelligence Platform
+
+ [![Enterprise Application](https://img.shields.io/badge/MCP-Enterprise%20Track-blue)](https://github.com)
+ [![Powered by AI](https://img.shields.io/badge/Powered%20by-HuggingFace-yellow)](https://huggingface.co)
+ [![Gradio](https://img.shields.io/badge/Built%20with-Gradio-orange)](https://gradio.app)
+
+ > **🏆 MCP in Action Track - Enterprise Applications**
+ >
+ > Tag: `mcp-in-action-track-enterprise`
+
+ ## 📹 Overview
+
+ An AI-powered B2B sales automation platform that helps sales teams discover prospects, find decision-makers, and draft personalized outreach emails—all powered by autonomous AI agents using the **Model Context Protocol (MCP)**.
+
+ ## 🎯 Key Features
+
+ | Feature | Description |
+ |---------|-------------|
+ | **🔍 AI Discovery** | Automatically find and research prospect companies matching your ideal customer profile |
+ | **👥 Contact Finder** | Locate decision-makers (CEOs, VPs, Founders) with verified email addresses |
+ | **✉️ Email Drafting** | Generate personalized cold outreach emails based on company research |
+ | **💬 AI Chat** | Interactive assistant for pipeline management and real-time research |
+ | **👤 Prospect Chat** | Demo of prospect-facing AI with handoff & escalation capabilities |
+ | **📊 Dashboard** | Real-time pipeline metrics and progress tracking |
+
+ ## 🚀 Quick Start
+
+ 1. **Setup**: Enter your HuggingFace token and company name
+ 2. **Discover**: Let AI find prospects matching your profile
+ 3. **Review**: Check discovered companies and contacts
+ 4. **Engage**: Use AI-drafted emails for outreach
+
+ ## 🏗️ Architecture
+
+ ```
+ ┌──────────────────────────────────────────────────────────┐
+ │                        CX AI Agent                       │
+ ├──────────────────────────────────────────────────────────┤
+ │  ┌────────────┐    ┌──────────────┐    ┌─────────────┐   │
+ │  │   Gradio   │────│  Autonomous  │────│     MCP     │   │
+ │  │     UI     │    │    Agent     │    │   Servers   │   │
+ │  └────────────┘    └──────────────┘    └─────────────┘   │
+ │        │                  │                  │           │
+ │        ▼                  ▼                  ▼           │
+ │  ┌────────────────────────────────────────────────────┐  │
+ │  │               MCP Tool Definitions                 │  │
+ │  │  • Search (Web, News)                              │  │
+ │  │  • Store (Prospects, Contacts, Facts)              │  │
+ │  │  • Email (Send, Thread Management)                 │  │
+ │  │  • Calendar (Meeting Slots, Invites)               │  │
+ │  └────────────────────────────────────────────────────┘  │
+ └──────────────────────────────────────────────────────────┘
+ ```
+
+ ## 🔧 MCP Tools Available
+
+ ### Search MCP Server
+ - `search_web` - Search the web for company information
+ - `search_news` - Find recent news about companies
+
+ ### Store MCP Server
+ - `save_prospect` / `get_prospect` / `list_prospects` - Manage prospects
+ - `save_company` / `get_company` - Store company data
+ - `save_contact` / `list_contacts_by_domain` - Manage contacts
+ - `discover_prospects_with_contacts` - Full discovery pipeline
+ - `find_verified_contacts` - Find decision-makers
+
+ ### Email MCP Server
+ - `send_email` - Send outreach emails
+ - `get_email_thread` - Retrieve conversation history
+
+ ### Calendar MCP Server
+ - `suggest_meeting_slots` - Generate available times
+ - `generate_calendar_invite` - Create .ics files
+
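+ The following is a minimal, illustrative sketch of calling these tools through the in-memory registry; the accessor and method names match the agents in this repo, but the exact signatures and return shapes shown here are simplified assumptions.
+
+ ```python
+ # Illustrative only: invoke MCP-backed tools via the in-memory registry.
+ import asyncio
+ from mcp.registry import MCPRegistry
+
+ async def demo():
+     registry = MCPRegistry()
+     await registry.connect()  # wire up the in-memory MCP servers
+
+     search = registry.get_search_client()
+     results = await search.query("TechFlow Solutions news")  # search_web-style call
+
+     calendar = registry.get_calendar_client()
+     slots = await calendar.suggest_slots()  # suggest_meeting_slots
+
+     print(len(results), "search results;", len(slots), "meeting slots")
+
+ asyncio.run(demo())
+ ```
+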
+ ## 🎭 Prospect Chat Demo
+
+ The **Prospect Chat Demo** showcases how prospects can interact with your company's AI:
+
+ - **Lead Qualification**: AI asks qualifying questions to understand prospect needs
+ - **Handoff Packets**: Generate comprehensive summaries for human sales reps
+ - **Escalation Flows**: Automatically escalate complex inquiries to humans
+ - **Meeting Scheduling**: Integrate with calendar for instant booking
+
+ ## 📊 Technology Stack
+
+ | Component | Technology |
+ |-----------|------------|
+ | **Frontend** | Gradio 5.x |
+ | **AI Model** | Qwen2.5-72B / Qwen3-32B via HuggingFace |
+ | **Protocol** | Model Context Protocol (MCP) |
+ | **Search** | Serper API |
+ | **Language** | Python 3.8+ |
+
+ ## 🔑 Environment Variables
+
+ Set these in your Space Secrets:
+
+ ```
+ HF_TOKEN=your_huggingface_token_here
+ SERPER_API_KEY=your_serper_api_key_here  # Optional
+ ```
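+
+ A minimal sketch of reading these secrets at runtime (variable names taken from the block above; note that `app/config.py` in this repo reads `HF_API_TOKEN` instead):
+
+ ```python
+ import os
+
+ hf_token = os.getenv("HF_TOKEN", "")
+ serper_key = os.getenv("SERPER_API_KEY", "")  # optional
+ if not hf_token:
+     raise RuntimeError("HF_TOKEN secret is not set")
+ ```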
+
+ ## 📁 Project Structure
+
+ ```
+ cx-ai-agent/
+ ├── app.py                  # Main Gradio application
+ ├── requirements.txt        # Python dependencies
+ ├── README.md               # This file
+ ├── app/
+ │   └── schema.py           # Pydantic data models
+ └── mcp/
+     ├── agents/             # Autonomous AI agents
+     ├── servers/            # MCP server implementations
+     └── tools/
+         └── definitions.py  # MCP tool definitions
+ ```
+
+ ## 📝 License
+
+ This project is open source and available under the MIT License.
+
+ ## 🙏 Acknowledgments
+
+ - **Anthropic** - Model Context Protocol specification
+ - **HuggingFace** - AI model hosting and inference
+ - **Gradio** - UI framework
+ - **Serper** - Web search API
+
+ ---
+
+ ## 👨‍💻 Developer
+
+ **Syed Muzakkir Hussain**
+
+ [![HuggingFace](https://img.shields.io/badge/HuggingFace-muzakkirhussain011-yellow?logo=huggingface)](https://huggingface.co/muzakkirhussain011)
+
+ ---
+
+ <div align="center">
+
+ **Built with ❤️ by [Syed Muzakkir Hussain](https://huggingface.co/muzakkirhussain011) for the Gradio Agents & MCP Hackathon 2025**
+
+ `mcp-in-action-track-enterprise`
+
+ </div>
agents/__init__.py ADDED
@@ -0,0 +1,14 @@
+ # file: agents/__init__.py
+ from .hunter import Hunter
+ from .enricher import Enricher
+ from .contactor import Contactor
+ from .scorer import Scorer
+ from .writer import Writer
+ from .compliance import Compliance
+ from .sequencer import Sequencer
+ from .curator import Curator
+
+ __all__ = [
+     "Hunter", "Enricher", "Contactor", "Scorer",
+     "Writer", "Compliance", "Sequencer", "Curator"
+ ]
agents/compliance.py ADDED
@@ -0,0 +1,92 @@
+ # file: agents/compliance.py
+ from pathlib import Path
+ from app.schema import Prospect
+ from app.config import (
+     COMPANY_FOOTER_PATH, ENABLE_CAN_SPAM,
+     ENABLE_PECR, ENABLE_CASL
+ )
+
+ class Compliance:
+     """Enforces email compliance and policies"""
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.store = mcp_registry.get_store_client()
+
+         # Load footer
+         footer_path = Path(COMPANY_FOOTER_PATH)
+         if footer_path.exists():
+             self.footer = footer_path.read_text()
+         else:
+             self.footer = "\n\n---\nLucidya Inc.\n123 Market St, San Francisco, CA 94105\nUnsubscribe: https://lucidya.example.com/unsubscribe"
+
+     async def run(self, prospect: Prospect) -> Prospect:
+         """Check compliance and enforce policies"""
+
+         if not prospect.email_draft:
+             prospect.status = "blocked"
+             prospect.dropped_reason = "No email draft to check"
+             await self.store.save_prospect(prospect)
+             return prospect
+
+         policy_failures = []
+
+         # Check suppression
+         for contact in prospect.contacts:
+             if await self.store.check_suppression("email", contact.email):
+                 policy_failures.append(f"Email suppressed: {contact.email}")
+
+             domain = contact.email.split("@")[1]
+             if await self.store.check_suppression("domain", domain):
+                 policy_failures.append(f"Domain suppressed: {domain}")
+
+         if await self.store.check_suppression("company", prospect.company.id):
+             policy_failures.append(f"Company suppressed: {prospect.company.name}")
+
+         # Check content requirements
+         body = prospect.email_draft.get("body", "")
+
+         # CAN-SPAM requirements
+         if ENABLE_CAN_SPAM:
+             if "unsubscribe" not in body.lower() and "unsubscribe" not in self.footer.lower():
+                 policy_failures.append("CAN-SPAM: Missing unsubscribe mechanism")
+
+             if not any(addr in self.footer for addr in ["St", "Ave", "Rd", "Blvd"]):
+                 policy_failures.append("CAN-SPAM: Missing physical postal address")
+
+         # PECR requirements (UK)
+         if ENABLE_PECR:
+             # Check for soft opt-in or existing relationship
+             # In production, would check CRM for prior relationship
+             if "existing customer" not in body.lower():
+                 # For demo, we'll be lenient
+                 pass
+
+         # CASL requirements (Canada)
+         if ENABLE_CASL:
+             if "consent" not in body.lower() and prospect.company.domain.endswith(".ca"):
+                 policy_failures.append("CASL: May need express consent for Canadian recipients")
+
+         # Check for unverifiable claims
+         forbidden_phrases = [
+             "guaranteed", "100%", "no risk", "best in the world",
+             "revolutionary", "breakthrough"
+         ]
+
+         for phrase in forbidden_phrases:
+             if phrase in body.lower():
+                 policy_failures.append(f"Unverifiable claim: '{phrase}'")
+
+         # Append footer to email
+         if not policy_failures:
+             prospect.email_draft["body"] = body + "\n" + self.footer
+
+         # Final decision
+         if policy_failures:
+             prospect.status = "blocked"
+             prospect.dropped_reason = "; ".join(policy_failures)
+         else:
+             prospect.status = "compliant"
+
+         await self.store.save_prospect(prospect)
+         return prospect
@@ -0,0 +1,105 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # file: agents/contactor.py
2
+ """
3
+ Contactor Agent - Discovers decision-makers at target companies
4
+ Now uses web search to find real contacts instead of generating mock data
5
+ """
6
+ from app.schema import Prospect, Contact
7
+ from app.config import SKIP_WEB_SEARCH
8
+ import logging
9
+ from services.prospect_discovery import get_prospect_discovery_service
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ class Contactor:
15
+ """
16
+ Discovers and validates decision-maker contacts
17
+
18
+ IMPROVED: Now uses web search to discover real decision-makers
19
+ Falls back to plausible generated contacts when search doesn't find results
20
+ """
21
+
22
+ def __init__(self, mcp_registry):
23
+ self.mcp = mcp_registry
24
+ self.store = mcp_registry.get_store_client()
25
+ self.prospect_discovery = get_prospect_discovery_service()
26
+
27
+ async def run(self, prospect: Prospect) -> Prospect:
28
+ """Discover decision-maker contacts"""
29
+
30
+ logger.info(f"Contactor: Finding contacts for '{prospect.company.name}'")
31
+
32
+ # Check domain suppression first
33
+ suppressed = await self.store.check_suppression(
34
+ "domain",
35
+ prospect.company.domain
36
+ )
37
+
38
+ if suppressed:
39
+ logger.warning(f"Contactor: Domain suppressed: {prospect.company.domain}")
40
+ prospect.status = "dropped"
41
+ prospect.dropped_reason = f"Domain suppressed: {prospect.company.domain}"
42
+ await self.store.save_prospect(prospect)
43
+ return prospect
44
+
45
+ # Get existing contacts to dedupe
46
+ seen_emails = set()
47
+ try:
48
+ existing = await self.store.list_contacts_by_domain(prospect.company.domain)
49
+ for contact in existing:
50
+ if hasattr(contact, 'email'):
51
+ seen_emails.add(contact.email.lower())
52
+ except Exception as e:
53
+ logger.error(f"Contactor: Error fetching existing contacts: {str(e)}")
54
+
55
+ # Discover contacts using web search
56
+ contacts = []
57
+ try:
58
+ # Determine number of contacts based on company size
59
+ max_contacts = 2 if prospect.company.size < 100 else 3
60
+
61
+ discovered_contacts = await self.prospect_discovery.discover_contacts(
62
+ company_name=prospect.company.name,
63
+ domain=prospect.company.domain,
64
+ company_size=prospect.company.size,
65
+ max_contacts=max_contacts,
66
+ skip_search=SKIP_WEB_SEARCH # Respect SKIP_WEB_SEARCH flag
67
+ )
68
+
69
+ # Filter out already seen emails and check individual email suppression
70
+ for contact in discovered_contacts:
71
+ email_lower = contact.email.lower()
72
+
73
+ # Skip if already seen
74
+ if email_lower in seen_emails:
75
+ logger.info(f"Contactor: Skipping duplicate email: {contact.email}")
76
+ continue
77
+
78
+ # Check email-level suppression
79
+ email_suppressed = await self.store.check_suppression("email", contact.email)
80
+ if email_suppressed:
81
+ logger.warning(f"Contactor: Email suppressed: {contact.email}")
82
+ continue
83
+
84
+ # Set prospect ID
85
+ contact.prospect_id = prospect.id
86
+
87
+ # Save and add to list
88
+ await self.store.save_contact(contact)
89
+ contacts.append(contact)
90
+ seen_emails.add(email_lower)
91
+
92
+ logger.info(f"Contactor: Added contact: {contact.name} ({contact.title})")
93
+
94
+ except Exception as e:
95
+ logger.error(f"Contactor: Error discovering contacts: {str(e)}")
96
+ # Continue with empty contacts list
97
+
98
+ # Update prospect
99
+ prospect.contacts = contacts
100
+ prospect.status = "contacted"
101
+ await self.store.save_prospect(prospect)
102
+
103
+ logger.info(f"Contactor: Found {len(contacts)} contacts for '{prospect.company.name}'")
104
+
105
+ return prospect
agents/curator.py ADDED
@@ -0,0 +1,40 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # file: agents/curator.py
2
+ from datetime import datetime
3
+ from app.schema import Prospect, HandoffPacket
4
+
5
+ class Curator:
6
+ """Creates handoff packets for sales team"""
7
+
8
+ def __init__(self, mcp_registry):
9
+ self.mcp = mcp_registry
10
+ self.store = mcp_registry.get_store_client()
11
+ self.email_client = mcp_registry.get_email_client()
12
+ self.calendar_client = mcp_registry.get_calendar_client()
13
+
14
+ async def run(self, prospect: Prospect) -> Prospect:
15
+ """Create handoff packet"""
16
+
17
+ # Get thread
18
+ thread = None
19
+ if prospect.thread_id:
20
+ thread = await self.email_client.get_thread(prospect.id)
21
+
22
+ # Get calendar slots
23
+ slots = await self.calendar_client.suggest_slots()
24
+
25
+ # Create packet
26
+ packet = HandoffPacket(
27
+ prospect=prospect,
28
+ thread=thread,
29
+ calendar_slots=slots,
30
+ generated_at=datetime.utcnow()
31
+ )
32
+
33
+ # Save packet
34
+ await self.store.save_handoff(packet)
35
+
36
+ # Update prospect status
37
+ prospect.status = "ready_for_handoff"
38
+ await self.store.save_prospect(prospect)
39
+
40
+ return prospect
agents/enricher.py ADDED
@@ -0,0 +1,137 @@
+ # file: agents/enricher.py
+ """
+ Enricher Agent - Enriches prospects with real-time web search data
+ Now uses actual web search instead of static/mock data
+ """
+ from datetime import datetime
+ from app.schema import Prospect, Fact
+ from app.config import FACT_TTL_HOURS, SKIP_WEB_SEARCH
+ import uuid
+ import logging
+
+ logger = logging.getLogger(__name__)
+
+
+ class Enricher:
+     """
+     Enriches prospects with facts from real web search
+
+     IMPROVED: Now uses actual web search to find:
+     - Company news and updates
+     - Industry trends and challenges
+     - Customer experience insights
+     - Recent developments
+     """
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.search = mcp_registry.get_search_client()
+         self.store = mcp_registry.get_store_client()
+
+     async def run(self, prospect: Prospect) -> Prospect:
+         """Enrich prospect with facts from web search"""
+
+         logger.info(f"Enricher: Enriching prospect '{prospect.company.name}'")
+
+         facts = []
+         seen_texts = set()  # Deduplication
+
+         # Only do web search if not skipped
+         if not SKIP_WEB_SEARCH:
+             logger.info("Enricher: Performing web search for facts")
+
+             # Enhanced search queries for better fact discovery
+             queries = [
+                 # Company news and updates
+                 f"{prospect.company.name} news latest updates",
+                 # Industry-specific challenges
+                 f"{prospect.company.name} {prospect.company.industry} customer experience",
+                 # Pain points and challenges
+                 f"{prospect.company.name} challenges problems",
+                 # Contact and support information
+                 f"{prospect.company.domain} customer support contact"
+             ]
+
+             for query in queries:
+                 try:
+                     logger.info(f"Enricher: Searching for: '{query}'")
+                     results = await self.search.query(query)
+
+                     # Process search results
+                     for result in results[:3]:  # Top 3 per query
+                         text = result.get("text", "").strip()
+                         title = result.get("title", "").strip()
+
+                         # Skip empty or very short results
+                         if not text or len(text) < 20:
+                             continue
+
+                         # Combine title and text for better context
+                         if title and title not in text:
+                             full_text = f"{title}. {text}"
+                         else:
+                             full_text = text
+
+                         # Deduplicate
+                         if full_text in seen_texts:
+                             continue
+                         seen_texts.add(full_text)
+
+                         # Create fact
+                         fact = Fact(
+                             id=str(uuid.uuid4()),
+                             source=result.get("source", "web search"),
+                             text=full_text[:500],  # Limit length
+                             collected_at=datetime.utcnow(),
+                             ttl_hours=FACT_TTL_HOURS,
+                             confidence=result.get("confidence", 0.75),
+                             company_id=prospect.company.id
+                         )
+                         facts.append(fact)
+                         await self.store.save_fact(fact)
+
+                         logger.info(f"Enricher: Added fact from {fact.source}")
+
+                 except Exception as e:
+                     logger.error(f"Enricher: Error searching for '{query}': {str(e)}")
+                     continue
+         else:
+             logger.info("Enricher: Skipping web search (SKIP_WEB_SEARCH=true)")
+
+         # Also add company pain points as facts (from discovery)
+         for pain in prospect.company.pains:
+             if pain and len(pain) > 10:  # Valid pain point
+                 fact = Fact(
+                     id=str(uuid.uuid4()),
+                     source="company_discovery",
+                     text=f"Known challenge: {pain}",
+                     collected_at=datetime.utcnow(),
+                     ttl_hours=FACT_TTL_HOURS * 2,  # Discovery data lasts longer
+                     confidence=0.85,
+                     company_id=prospect.company.id
+                 )
+                 facts.append(fact)
+                 await self.store.save_fact(fact)
+
+         # Add company notes as facts
+         for note in prospect.company.notes:
+             if note and len(note) > 10:  # Valid note
+                 fact = Fact(
+                     id=str(uuid.uuid4()),
+                     source="company_discovery",
+                     text=note,
+                     collected_at=datetime.utcnow(),
+                     ttl_hours=FACT_TTL_HOURS * 2,
+                     confidence=0.8,
+                     company_id=prospect.company.id
+                 )
+                 facts.append(fact)
+                 await self.store.save_fact(fact)
+
+         prospect.facts = facts
+         prospect.status = "enriched"
+         await self.store.save_prospect(prospect)
+
+         logger.info(f"Enricher: Added {len(facts)} facts for '{prospect.company.name}'")
+
+         return prospect
agents/hunter.py ADDED
@@ -0,0 +1,156 @@
+ # file: agents/hunter.py
+ """
+ Hunter Agent - Discovers companies dynamically
+ Now uses web search to find company information instead of static files
+ """
+ import json
+ from typing import List, Optional
+ from app.schema import Company, Prospect
+ from app.config import COMPANIES_FILE, SKIP_WEB_SEARCH
+ from services.company_discovery import get_company_discovery_service
+ import logging
+
+ logger = logging.getLogger(__name__)
+
+
+ class Hunter:
+     """
+     Discovers companies and creates prospects dynamically
+
+     NEW: Can now discover companies from user input (company names)
+     LEGACY: Still supports loading from seed file for backwards compatibility
+     """
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.store = mcp_registry.get_store_client()
+         self.discovery = get_company_discovery_service()
+
+     async def run(
+         self,
+         company_names: Optional[List[str]] = None,
+         company_ids: Optional[List[str]] = None,
+         use_seed_file: bool = False
+     ) -> List[Prospect]:
+         """
+         Discover companies and create prospects
+
+         Args:
+             company_names: List of company names to discover (NEW - dynamic mode)
+             company_ids: List of company IDs from seed file (LEGACY - static mode)
+             use_seed_file: If True, load from seed file instead of discovery
+
+         Returns:
+             List of Prospect objects
+         """
+         prospects = []
+
+         # Mode 1: Dynamic discovery from company names (NEW)
+         if company_names and not use_seed_file:
+             logger.info(f"Hunter: Dynamic discovery mode - discovering {len(company_names)} companies")
+
+             for company_name in company_names:
+                 try:
+                     logger.info(f"Hunter: Discovering '{company_name}'...")
+
+                     # Discover company information from web (or use fallback if configured)
+                     company = await self.discovery.discover_company(company_name, skip_search=SKIP_WEB_SEARCH)
+
+                     if not company:
+                         logger.warning(f"Hunter: Could not discover company '{company_name}'")
+                         # Create a minimal fallback company
+                         company = self._create_fallback_company(company_name)
+
+                     # Create prospect
+                     prospect = Prospect(
+                         id=company.id,
+                         company=company,
+                         status="new"
+                     )
+
+                     # Save to store
+                     await self.store.save_prospect(prospect)
+                     prospects.append(prospect)
+
+                     logger.info(f"Hunter: Successfully created prospect for '{company_name}'")
+
+                 except Exception as e:
+                     logger.error(f"Hunter: Error discovering '{company_name}': {str(e)}")
+                     # Create fallback and continue
+                     company = self._create_fallback_company(company_name)
+                     prospect = Prospect(
+                         id=company.id,
+                         company=company,
+                         status="new"
+                     )
+                     await self.store.save_prospect(prospect)
+                     prospects.append(prospect)
+
+         # Mode 2: Legacy mode - load from seed file (BACKWARDS COMPATIBLE)
+         else:
+             logger.info("Hunter: Legacy mode - loading from seed file")
+
+             try:
+                 # Load from seed file
+                 with open(COMPANIES_FILE) as f:
+                     companies_data = json.load(f)
+
+                 for company_data in companies_data:
+                     # Filter by IDs if specified
+                     if company_ids and company_data["id"] not in company_ids:
+                         continue
+
+                     company = Company(**company_data)
+
+                     # Create prospect
+                     prospect = Prospect(
+                         id=company.id,
+                         company=company,
+                         status="new"
+                     )
+
+                     # Save to store
+                     await self.store.save_prospect(prospect)
+                     prospects.append(prospect)
+
+                 logger.info(f"Hunter: Loaded {len(prospects)} companies from seed file")
+
+             except FileNotFoundError:
+                 logger.error(f"Hunter: Seed file not found: {COMPANIES_FILE}")
+                 # If no seed file and no company names provided, return empty
+                 if not company_names:
+                     return []
+             except Exception as e:
+                 logger.error(f"Hunter: Error loading seed file: {str(e)}")
+                 return []
+
+         return prospects
+
+     def _create_fallback_company(self, company_name: str) -> Company:
+         """Create a minimal fallback company when discovery fails"""
+         import re
+         import uuid
+
+         # Generate ID
+         slug = re.sub(r'[^a-zA-Z0-9]', '', company_name.lower())[:20]
+         company_id = f"{slug}_{str(uuid.uuid4())[:8]}"
+
+         # Create minimal company
+         company = Company(
+             id=company_id,
+             name=company_name,
+             domain=f"{slug}.com",
+             industry="Technology",
+             size=100,
+             pains=[
+                 "Customer experience improvement needed",
+                 "Operational efficiency challenges"
+             ],
+             notes=[
+                 "Company information discovery in progress",
+                 "Limited data available"
+             ]
+         )
+
+         logger.info(f"Hunter: Created fallback company for '{company_name}'")
+         return company
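
For reference, a seed-file entry consistent with the fields `Hunter` reads in legacy mode (mirroring the `Company` fields used by `_create_fallback_company` above; all values here are illustrative):

```python
# Illustrative shape of one data/companies.json entry.
seed_entry = {
    "id": "acme_1a2b3c4d",
    "name": "Acme Analytics",
    "domain": "acmeanalytics.com",
    "industry": "SaaS",
    "size": 250,
    "pains": ["customer retention issues", "support efficiency"],
    "notes": ["Recently expanded into the EU market"],
}
```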
agents/scorer.py ADDED
@@ -0,0 +1,75 @@
+ # file: agents/scorer.py
+ from datetime import datetime
+ from app.schema import Prospect
+ from app.config import MIN_FIT_SCORE
+
+ class Scorer:
+     """Scores prospects and drops low-quality ones"""
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.store = mcp_registry.get_store_client()
+
+     async def run(self, prospect: Prospect) -> Prospect:
+         """Score prospect based on various factors"""
+
+         score = 0.0
+
+         # Industry scoring
+         high_value_industries = ["SaaS", "FinTech", "E-commerce", "Healthcare Tech"]
+         if prospect.company.industry in high_value_industries:
+             score += 0.3
+         else:
+             score += 0.1
+
+         # Size scoring
+         if 100 <= prospect.company.size <= 5000:
+             score += 0.2  # Sweet spot
+         elif prospect.company.size > 5000:
+             score += 0.1  # Enterprise, harder to sell
+         else:
+             score += 0.05  # Too small
+
+         # Pain points alignment (keywords kept lowercase so they can match pain.lower())
+         cx_related_pains = ["customer retention", "nps", "support efficiency", "personalization"]
+         matching_pains = sum(
+             1 for pain in prospect.company.pains
+             if any(keyword in pain.lower() for keyword in cx_related_pains)
+         )
+         score += min(0.3, matching_pains * 0.1)
+
+         # Facts freshness
+         fresh_facts = 0
+         stale_facts = 0
+         now = datetime.utcnow()
+
+         for fact in prospect.facts:
+             age_hours = (now - fact.collected_at).total_seconds() / 3600
+             if age_hours > fact.ttl_hours:
+                 stale_facts += 1
+             else:
+                 fresh_facts += 1
+
+         if fresh_facts > 0:
+             score += min(0.2, fresh_facts * 0.05)
+
+         # Confidence from facts
+         if prospect.facts:
+             avg_confidence = sum(f.confidence for f in prospect.facts) / len(prospect.facts)
+             score += avg_confidence * 0.2
+
+         # Normalize score
+         prospect.fit_score = min(1.0, score)
+
+         # Decision
+         if prospect.fit_score < MIN_FIT_SCORE:
+             prospect.status = "dropped"
+             prospect.dropped_reason = f"Low fit score: {prospect.fit_score:.2f}"
+         elif stale_facts > fresh_facts:
+             prospect.status = "dropped"
+             prospect.dropped_reason = f"Stale facts: {stale_facts}/{len(prospect.facts)}"
+         else:
+             prospect.status = "scored"
+
+         await self.store.save_prospect(prospect)
+         return prospect
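
A worked example of the fit-score arithmetic above, with assumed inputs:

```python
# Assumed prospect: SaaS company, 500 employees, 1 CX-related pain,
# 2 fresh facts with confidences 0.8 and 0.7.
industry   = 0.3                    # "SaaS" is in high_value_industries
size       = 0.2                    # 100 <= 500 <= 5000
pains      = min(0.3, 1 * 0.1)      # 0.1
freshness  = min(0.2, 2 * 0.05)     # 0.1
confidence = (0.8 + 0.7) / 2 * 0.2  # 0.15
fit_score  = min(1.0, industry + size + pains + freshness + confidence)
print(round(fit_score, 2))  # 0.85 -> above MIN_FIT_SCORE (0.5), so status becomes "scored"
```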
agents/sequencer.py ADDED
@@ -0,0 +1,106 @@
+ # file: agents/sequencer.py
+ from datetime import datetime
+ from app.schema import Prospect, Message
+ import uuid
+
+ class Sequencer:
+     """Sequences and sends outreach emails"""
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.email_client = mcp_registry.get_email_client()
+         self.calendar_client = mcp_registry.get_calendar_client()
+         self.store = mcp_registry.get_store_client()
+
+     async def run(self, prospect: Prospect) -> Prospect:
+         """Send email and create thread"""
+
+         # Check if we have minimum requirements
+         if not prospect.contacts:
+             # Try to generate a default contact if none exist
+             from app.schema import Contact
+             default_contact = Contact(
+                 id=str(uuid.uuid4()),
+                 name=f"Customer Success at {prospect.company.name}",
+                 email=f"contact@{prospect.company.domain}",
+                 title="Customer Success",
+                 prospect_id=prospect.id
+             )
+             prospect.contacts = [default_contact]
+             await self.store.save_contact(default_contact)
+
+         if not prospect.email_draft:
+             # Generate a simple default email if none exists
+             prospect.email_draft = {
+                 "subject": f"Improving {prospect.company.name}'s Customer Experience",
+                 "body": f"""Dear {prospect.company.name} team,
+
+ We noticed your company is in the {prospect.company.industry} industry with {prospect.company.size} employees.
+ We'd love to discuss how we can help improve your customer experience.
+
+ Looking forward to connecting with you.
+
+ Best regards,
+ Lucidya Team"""
+             }
+
+         # Now proceed with sending
+         primary_contact = prospect.contacts[0]
+
+         # Get calendar slots
+         try:
+             slots = await self.calendar_client.suggest_slots()
+         except Exception:
+             slots = []  # Continue even if calendar fails
+
+         # Generate ICS attachment for first slot
+         ics_content = ""
+         if slots:
+             try:
+                 slot = slots[0]
+                 ics_content = await self.calendar_client.generate_ics(
+                     f"Meeting with {prospect.company.name}",
+                     slot["start_iso"],
+                     slot["end_iso"]
+                 )
+             except Exception:
+                 pass  # Continue without ICS
+
+         # Add calendar info to email
+         calendar_text = ""
+         if slots:
+             calendar_text = "\n\nI have a few time slots available this week:\n"
+             for slot in slots[:3]:
+                 calendar_text += f"- {slot['start_iso'][:16].replace('T', ' at ')}\n"
+
+         # Send email
+         email_body = prospect.email_draft["body"]
+         if calendar_text:
+             email_body = email_body.rstrip() + calendar_text
+
+         try:
+             result = await self.email_client.send(
+                 to=primary_contact.email,
+                 subject=prospect.email_draft["subject"],
+                 body=email_body,
+                 prospect_id=prospect.id  # Add prospect_id for thread tracking
+             )
+
+             # Update prospect with thread ID
+             # Handle both dict and string responses
+             if isinstance(result, dict):
+                 prospect.thread_id = result.get("thread_id", str(uuid.uuid4()))
+             elif isinstance(result, str):
+                 prospect.thread_id = result
+             else:
+                 prospect.thread_id = str(uuid.uuid4())
+             prospect.status = "sequenced"
+
+         except Exception as e:
+             # Even if email sending fails, don't block the prospect
+             prospect.thread_id = f"mock-thread-{uuid.uuid4()}"
+             prospect.status = "sequenced"
+             print(f"Warning: Email send failed for {prospect.company.name}: {e}")
+
+         await self.store.save_prospect(prospect)
+         return prospect
agents/writer.py ADDED
@@ -0,0 +1,261 @@
+ # file: agents/writer.py
+ import re
+ import logging
+ from typing import AsyncGenerator
+ from app.schema import Prospect
+ from app.config import MODEL_NAME, HF_API_TOKEN, MODEL_NAME_FALLBACK
+ from app.logging_utils import log_event
+ from vector.retriever import Retriever
+ from huggingface_hub import AsyncInferenceClient
+
+ logger = logging.getLogger(__name__)
+
+ class Writer:
+     """Generates outreach content with HuggingFace Inference API streaming"""
+
+     def __init__(self, mcp_registry):
+         self.mcp = mcp_registry
+         self.store = mcp_registry.get_store_client()
+         self.retriever = Retriever()
+         # Initialize HF client
+         self.hf_client = AsyncInferenceClient(token=HF_API_TOKEN if HF_API_TOKEN else None)
+
+     async def run_streaming(self, prospect: Prospect) -> AsyncGenerator[dict, None]:
+         """Generate content with streaming tokens"""
+
+         # IMPORTANT: Log contact information for debugging
+         if prospect.contacts:
+             for contact in prospect.contacts:
+                 log_event("writer", f"Using contact: {contact.name} ({contact.title}) - {contact.email}", "agent_log")
+                 logger.info(f"Writer: Using contact: {contact.name} ({contact.title}) - {contact.email}")
+         else:
+             log_event("writer", "WARNING: No contacts found for this prospect!", "agent_log")
+             logger.warning(f"Writer: No contacts found for prospect {prospect.company.name}")
+
+         # Get relevant facts from vector store
+         try:
+             relevant_facts = self.retriever.retrieve(prospect.company.id, k=5)
+         except Exception:
+             relevant_facts = []
+
+         # Build comprehensive context
+         context = f"""
+ COMPANY PROFILE:
+ Name: {prospect.company.name}
+ Industry: {prospect.company.industry}
+ Size: {prospect.company.size} employees
+ Domain: {prospect.company.domain}
+
+ KEY CHALLENGES:
+ {chr(10).join(f'• {pain}' for pain in prospect.company.pains)}
+
+ BUSINESS CONTEXT:
+ {chr(10).join(f'• {note}' for note in prospect.company.notes) if prospect.company.notes else '• No additional notes'}
+
+ RELEVANT INSIGHTS:
+ {chr(10).join(f'• {fact["text"]} (confidence: {fact.get("score", 0.7):.2f})' for fact in relevant_facts[:3]) if relevant_facts else '• Industry best practices suggest focusing on customer experience improvements'}
+ """
+
+         # Generate comprehensive summary first
+         summary_prompt = f"""{context}
+
+ Generate a comprehensive bullet-point summary for {prospect.company.name} that includes:
+ 1. Company overview (industry, size)
+ 2. Main challenges they face
+ 3. Specific opportunities for improvement
+ 4. Recommended actions
+
+ Format: Use 5-7 bullets, each starting with "•". Be specific and actionable.
+ Include the industry and size context in your summary."""
+
+         summary_text = ""
+
+         # Emit company header first
+         yield log_event("writer", f"Generating content for {prospect.company.name}", "company_start",
+                         {"company": prospect.company.name,
+                          "industry": prospect.company.industry,
+                          "size": prospect.company.size})
+
+         # Summary generation with HF Inference API
+         try:
+             # Use text generation with streaming
+             stream = await self.hf_client.text_generation(
+                 summary_prompt,
+                 model=MODEL_NAME,
+                 max_new_tokens=500,
+                 temperature=0.7,
+                 stream=True
+             )
+
+             async for token in stream:
+                 summary_text += token
+                 yield log_event(
+                     "writer",
+                     token,
+                     "llm_token",
+                     {
+                         "type": "summary",
+                         "token": token,
+                         "prospect_id": prospect.id,
+                         "company_id": prospect.company.id,
+                         "company_name": prospect.company.name,
+                     },
+                 )
+
+         except Exception as e:
+             # Fallback summary if generation fails
+             summary_text = f"""• {prospect.company.name} is a {prospect.company.industry} company with {prospect.company.size} employees
+ • Main challenge: {prospect.company.pains[0] if prospect.company.pains else 'Customer experience improvement'}
+ • Opportunity: Implement modern CX solutions to improve customer satisfaction
+ • Recommended action: Schedule a consultation to discuss specific needs"""
+             yield log_event("writer", f"Summary generation failed, using default: {e}", "llm_error")
+
+         # Generate personalized email
+         # If we have a contact, instruct the greeting explicitly with name and title
+         greeting_hint = ""
+         contact_context = ""
+         first_name = ""
+         if prospect.contacts:
+             contact = prospect.contacts[0]
+             name_parts = (contact.name or "").split()
+             first_name = name_parts[0] if name_parts else ""  # guard against empty names
+             full_name = contact.name
+             title = contact.title
+
+             if first_name:
+                 greeting_hint = f"IMPORTANT: Start the email EXACTLY with this greeting: 'Hi {first_name},'\n"
+             contact_context = f"\nTARGET RECIPIENT:\nName: {full_name}\nTitle: {title}\nEmail: {contact.email}\n"
+
+         email_prompt = f"""{context}
+ {contact_context}
+ Company Summary:
+ {summary_text}
+
+ Write a highly personalized outreach email from a CX AI platform provider to {prospect.contacts[0].name if prospect.contacts else 'leaders'} at {prospect.company.name}.
+ {greeting_hint}
+ Requirements:
+ - Subject line that mentions their company name and industry
+ - Body: 150-180 words, professional and friendly
+ - Reference their specific industry ({prospect.company.industry}) and size ({prospect.company.size} employees)
+ - Address them by their first name in the greeting (e.g., "Hi {first_name or 'there'},")
+ - Acknowledge their role as {prospect.contacts[0].title if prospect.contacts else 'a leader'} in the organization
+ - Clearly connect their challenges to AI-powered customer experience solutions
+ - One clear call-to-action to schedule a short conversation or demo next week
+ - Do not write as if the email is from the company to us
+ - No exaggerated claims
+ - Sign off as: "The CX Team"
+
+ Format response exactly as:
+ Subject: [subject line]
+ Body: [email body]
+ """
+
+         email_text = ""
+
+         # Emit email generation start
+         yield log_event("writer", f"Generating email for {prospect.company.name}", "email_start",
+                         {"company": prospect.company.name})
+
+         # Email generation with HF Inference API
+         try:
+             stream = await self.hf_client.text_generation(
+                 email_prompt,
+                 model=MODEL_NAME,
+                 max_new_tokens=400,
+                 temperature=0.7,
+                 stream=True
+             )
+
+             async for token in stream:
+                 email_text += token
+                 yield log_event(
+                     "writer",
+                     token,
+                     "llm_token",
+                     {
+                         "type": "email",
+                         "token": token,
+                         "prospect_id": prospect.id,
+                         "company_id": prospect.company.id,
+                         "company_name": prospect.company.name,
+                     },
+                 )
+
+         except Exception as e:
+             # Fallback email if generation fails - use contact name if available
+             contact_greeting = f"Hi {first_name}," if first_name else "Hi there,"
+
+             email_text = f"""Subject: Improve {prospect.company.name}'s Customer Experience
+
+ Body: {contact_greeting}
+
+ As a {prospect.company.industry} company with {prospect.company.size} employees, you face unique customer experience challenges. We understand that {prospect.company.pains[0] if prospect.company.pains else 'improving customer satisfaction'} is a priority for your organization.
+
+ Our AI-powered platform has helped similar companies in the {prospect.company.industry} industry improve their customer experience metrics significantly. We'd love to discuss how we can help {prospect.company.name} achieve similar results.
+
+ Would you be available for a brief call next week to explore how we can address your specific needs?
+
+ Best regards,
+ The CX Team"""
+             yield log_event("writer", f"Email generation failed, using default: {e}", "llm_error")
+
+         # Parse email
+         email_parts = {"subject": "", "body": ""}
+         if "Subject:" in email_text and "Body:" in email_text:
+             parts = email_text.split("Body:")
+             email_parts["subject"] = parts[0].replace("Subject:", "").strip()
+             email_parts["body"] = parts[1].strip()
+         else:
+             # Fallback with company details - personalize with contact name
+             contact_greeting = f"Hi {first_name}," if first_name else "Hi there,"
+
+             email_parts["subject"] = f"Transform {prospect.company.name}'s Customer Experience"
+             email_parts["body"] = email_text or f"""{contact_greeting}
+
+ As a leading {prospect.company.industry} company with {prospect.company.size} employees, we know you're focused on delivering exceptional customer experiences.
+
+ We'd like to discuss how our AI-powered platform can help address your specific challenges and improve your customer satisfaction metrics.
+
+ Best regards,
+ The CX Team"""
+
+         # Replace any placeholder tokens like [Team Name] with actual contact name if available
+         if prospect.contacts:
+             contact_name = prospect.contacts[0].name
+             if email_parts.get("subject"):
+                 email_parts["subject"] = re.sub(r"\[[^\]]+\]", contact_name, email_parts["subject"])
+             if email_parts.get("body"):
+                 email_parts["body"] = re.sub(r"\[[^\]]+\]", contact_name, email_parts["body"])
+
+         # Update prospect
+         prospect.summary = f"**{prospect.company.name} ({prospect.company.industry}, {prospect.company.size} employees)**\n\n{summary_text}"
+         prospect.email_draft = email_parts
+         prospect.status = "drafted"
+         await self.store.save_prospect(prospect)
+
+         # Emit completion event with company info
+         yield log_event(
+             "writer",
+             f"Generation complete for {prospect.company.name}",
+             "llm_done",
+             {
+                 "prospect": prospect,
+                 "summary": prospect.summary,
+                 "email": email_parts,
+                 "company_name": prospect.company.name,
+                 "prospect_id": prospect.id,
+                 "company_id": prospect.company.id,
+             },
+         )
+
+     async def run(self, prospect: Prospect) -> Prospect:
+         """Non-streaming version for compatibility"""
+         async for event in self.run_streaming(prospect):
+             if event["type"] == "llm_done":
+                 return event["payload"]["prospect"]
+         return prospect
alembic.ini ADDED
@@ -0,0 +1,43 @@
+ # Alembic configuration file for CX AI Agent database migrations
+
+ [alembic]
+ # Path to migration scripts
+ script_location = migrations
+
+ # Template used to generate migration files
+ file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
+
+ # Logging configuration
+ [loggers]
+ keys = root,sqlalchemy,alembic
+
+ [handlers]
+ keys = console
+
+ [formatters]
+ keys = generic
+
+ [logger_root]
+ level = WARN
+ handlers = console
+ qualname =
+
+ [logger_sqlalchemy]
+ level = WARN
+ handlers =
+ qualname = sqlalchemy.engine
+
+ [logger_alembic]
+ level = INFO
+ handlers =
+ qualname = alembic
+
+ [handler_console]
+ class = StreamHandler
+ args = (sys.stderr,)
+ level = NOTSET
+ formatter = generic
+
+ [formatter_generic]
+ format = %(levelname)-5.5s [%(name)s] %(message)s
+ datefmt = %H:%M:%S
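
Assuming Alembic is installed and a `migrations/` directory exists (per `script_location` above), migrations can also be applied programmatically; this sketch mirrors running `alembic upgrade head` from the repo root:

```python
# Minimal sketch: apply all migrations up to head using this alembic.ini.
from alembic.config import main as alembic_main

alembic_main(argv=["-c", "alembic.ini", "upgrade", "head"])
```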
app.py ADDED
The diff for this file is too large to render. See raw diff
 
app/__init__.py ADDED
@@ -0,0 +1,3 @@
+ # file: app/__init__.py
+ """Lucidya MCP Prototype - Core Application Package"""
+ __version__ = "0.1.0"
app/config.py ADDED
@@ -0,0 +1,51 @@
+ # file: app/config.py
+ import os
+ from pathlib import Path
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+ # Paths
+ BASE_DIR = Path(__file__).parent.parent
+ DATA_DIR = BASE_DIR / "data"
+
+ # Hugging Face Inference API
+ HF_API_TOKEN = os.getenv("HF_API_TOKEN", "")
+
+ # LLM Configuration - Optimized for FREE HF CPU Inference
+ # Primary: Qwen2.5-3B (3B params - 2.3x faster than 7B, better for CPU)
+ # Alternative options for CPU:
+ #   - "Qwen/Qwen2.5-3B-Instruct" (3B - fast, high quality)
+ #   - "microsoft/Phi-3-mini-4k-instruct" (3.8B - ultra efficient)
+ #   - "HuggingFaceTB/SmolLM2-1.7B-Instruct" (1.7B - fastest)
+ MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-3B-Instruct")
+ MODEL_NAME_FALLBACK = os.getenv("MODEL_NAME_FALLBACK", "microsoft/Phi-3-mini-4k-instruct")
+
+ # Web Search Configuration
+ # Set to "true" to skip web search and use fallback data (recommended for demo/rate-limited environments)
+ SKIP_WEB_SEARCH = os.getenv("SKIP_WEB_SEARCH", "false").lower() == "true"
+
+ # Vector Store
+ VECTOR_INDEX_PATH = os.getenv("VECTOR_INDEX_PATH", str(DATA_DIR / "faiss.index"))
+ EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
+ EMBEDDING_DIM = 384
+
+ # MCP Servers
+ MCP_SEARCH_PORT = int(os.getenv("MCP_SEARCH_PORT", "9001"))
+ MCP_EMAIL_PORT = int(os.getenv("MCP_EMAIL_PORT", "9002"))
+ MCP_CALENDAR_PORT = int(os.getenv("MCP_CALENDAR_PORT", "9003"))
+ MCP_STORE_PORT = int(os.getenv("MCP_STORE_PORT", "9004"))
+
+ # Compliance
+ COMPANY_FOOTER_PATH = os.getenv("COMPANY_FOOTER_PATH", str(DATA_DIR / "footer.txt"))
+ ENABLE_CAN_SPAM = os.getenv("ENABLE_CAN_SPAM", "true").lower() == "true"
+ ENABLE_PECR = os.getenv("ENABLE_PECR", "true").lower() == "true"
+ ENABLE_CASL = os.getenv("ENABLE_CASL", "true").lower() == "true"
+
+ # Scoring
+ MIN_FIT_SCORE = float(os.getenv("MIN_FIT_SCORE", "0.5"))
+ FACT_TTL_HOURS = int(os.getenv("FACT_TTL_HOURS", "168"))  # 1 week
+
+ # Data Files
+ COMPANIES_FILE = DATA_DIR / "companies.json"
+ SUPPRESSION_FILE = DATA_DIR / "suppression.json"
app/logging_utils.py ADDED
@@ -0,0 +1,25 @@
+ # file: app/logging_utils.py
+ import logging
+ from datetime import datetime
+ from typing import Optional
+ from rich.logging import RichHandler
+
+ def setup_logging(level=logging.INFO):
+     """Configure rich logging"""
+     logging.basicConfig(
+         level=level,
+         format="%(message)s",
+         datefmt="[%X]",
+         handlers=[RichHandler(rich_tracebacks=True)]
+     )
+
+ def log_event(agent: str, message: str, type: str = "agent_log", payload: Optional[dict] = None) -> dict:
+     """Create a pipeline event for streaming"""
+     return {
+         "ts": datetime.utcnow().isoformat(),
+         "type": type,
+         "agent": agent,
+         "message": message,
+         "payload": payload or {}
+     }
+
+ logger = logging.getLogger(__name__)
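
Every agent and the orchestrator emit events through this one helper, so the stream has a uniform shape. A quick usage sketch:

```python
# Sketch: how agents build pipeline events with the helper above.
from app.logging_utils import setup_logging, log_event

setup_logging()
event = log_event("enricher", "Added 3 facts", "agent_end", {"facts_count": 3})
# -> {"ts": "...", "type": "agent_end", "agent": "enricher",
#     "message": "Added 3 facts", "payload": {"facts_count": 3}}
```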
app/main.py ADDED
@@ -0,0 +1,223 @@
+ # file: app/main.py
+ import json
+ from datetime import datetime
+ from typing import AsyncGenerator
+ from fastapi import FastAPI, HTTPException
+ from fastapi.responses import StreamingResponse, JSONResponse
+ from fastapi.encoders import jsonable_encoder
+ from app.schema import PipelineRequest, WriterStreamRequest, Prospect, HandoffPacket
+ from app.orchestrator import Orchestrator
+ from app.config import MODEL_NAME, HF_API_TOKEN
+ from app.logging_utils import setup_logging
+ from mcp.registry import MCPRegistry
+ from vector.store import VectorStore
+
+ setup_logging()
+
+ app = FastAPI(title="CX AI Agent", version="1.0.0")
+ orchestrator = Orchestrator()
+ mcp = MCPRegistry()
+ vector_store = VectorStore()
+
+ @app.on_event("startup")
+ async def startup():
+     """Initialize connections on startup"""
+     await mcp.connect()
+
+ @app.get("/health")
+ async def health():
+     """Health check with HF API connectivity test"""
+     try:
+         # Check HF API
+         hf_ok = bool(HF_API_TOKEN)
+
+         # Check MCP servers
+         mcp_status = await mcp.health_check()
+
+         return {
+             "status": "healthy",
+             "timestamp": datetime.utcnow().isoformat(),
+             "hf_inference": {
+                 "configured": hf_ok,
+                 "model": MODEL_NAME
+             },
+             "mcp": mcp_status,
+             "vector_store": vector_store.is_initialized()
+         }
+     except Exception as e:
+         return JSONResponse(
+             status_code=503,
+             content={"status": "unhealthy", "error": str(e)}
+         )
+
+ async def stream_pipeline(request: PipelineRequest) -> AsyncGenerator[bytes, None]:
+     """
+     Stream NDJSON events from pipeline
+
+     Supports both dynamic (company_names) and legacy (company_ids) modes
+     """
+     async for event in orchestrator.run_pipeline(
+         company_ids=request.company_ids,
+         company_names=request.company_names,
+         use_seed_file=request.use_seed_file
+     ):
+         # Ensure nested Pydantic models (e.g., Prospect) are JSON-serializable
+         yield (json.dumps(jsonable_encoder(event)) + "\n").encode()
+
+ @app.post("/run")
+ async def run_pipeline(request: PipelineRequest):
+     """
+     Run the full pipeline with NDJSON streaming
+
+     NEW: Accepts company_names for dynamic discovery
+     LEGACY: Still supports company_ids for backwards compatibility
+
+     Example (Dynamic):
+         {"company_names": ["Shopify", "Stripe", "Zendesk"]}
+
+     Example (Legacy):
+         {"company_ids": ["acme", "techcorp"], "use_seed_file": true}
+     """
+     return StreamingResponse(
+         stream_pipeline(request),
+         media_type="application/x-ndjson"
+     )
+
+ async def stream_writer_test(company_id: str) -> AsyncGenerator[bytes, None]:
+     """Stream only Writer agent output for testing"""
+     from agents.writer import Writer
+
+     # Get company from store
+     store = mcp.get_store_client()
+     company = await store.get_company(company_id)
+
+     if not company:
+         yield (json.dumps({"error": f"Company {company_id} not found"}) + "\n").encode()
+         return
+
+     # Create a test prospect
+     prospect = Prospect(
+         id=f"{company_id}_test",
+         company=company,
+         contacts=[],
+         facts=[],
+         fit_score=0.8,
+         status="scored"
+     )
+
+     writer = Writer(mcp)
+     async for event in writer.run_streaming(prospect):
+         # Ensure nested Pydantic models (e.g., Prospect) are JSON-serializable
+         yield (json.dumps(jsonable_encoder(event)) + "\n").encode()
+
+ @app.post("/writer/stream")
+ async def writer_stream_test(request: WriterStreamRequest):
+     """Test endpoint for Writer streaming"""
+     return StreamingResponse(
+         stream_writer_test(request.company_id),
+         media_type="application/x-ndjson"
+     )
+
+ @app.get("/prospects")
+ async def list_prospects():
+     """List all prospects with status and scores"""
+     store = mcp.get_store_client()
+     prospects = await store.list_prospects()
+     return {
+         "count": len(prospects),
+         "prospects": [
+             {
+                 "id": p.id,
+                 "company": p.company.name,
+                 "status": p.status,
+                 "fit_score": p.fit_score,
+                 "contacts": len(p.contacts),
+                 "facts": len(p.facts)
+             }
+             for p in prospects
+         ]
+     }
+
+ @app.get("/prospects/{prospect_id}")
+ async def get_prospect(prospect_id: str):
+     """Get detailed prospect information"""
+     store = mcp.get_store_client()
+     prospect = await store.get_prospect(prospect_id)
+
+     if not prospect:
+         raise HTTPException(status_code=404, detail="Prospect not found")
+
+     # Get thread if exists
+     email_client = mcp.get_email_client()
+     thread = None
+     if prospect.thread_id:
+         thread = await email_client.get_thread(prospect.id)
+
+     return {
+         "prospect": prospect.dict(),
+         "thread": thread.dict() if thread else None
+     }
+
+ @app.get("/handoff/{prospect_id}")
+ async def get_handoff(prospect_id: str):
+     """Get handoff packet for a prospect"""
+     store = mcp.get_store_client()
+     prospect = await store.get_prospect(prospect_id)
+
+     if not prospect:
+         raise HTTPException(status_code=404, detail="Prospect not found")
+
+     if prospect.status != "ready_for_handoff":
+         raise HTTPException(status_code=400,
+                             detail=f"Prospect not ready for handoff (status: {prospect.status})")
+
+     # Get thread
+     email_client = mcp.get_email_client()
+     thread = None
+     if prospect.thread_id:
+         thread = await email_client.get_thread(prospect.id)
+
+     # Get calendar slots
+     calendar_client = mcp.get_calendar_client()
+     slots = await calendar_client.suggest_slots()
+
+     packet = HandoffPacket(
+         prospect=prospect,
+         thread=thread,
+         calendar_slots=slots,
+         generated_at=datetime.utcnow()
+     )
+
+     return packet.dict()
+
+ @app.post("/reset")
+ async def reset_system():
+     """Clear store, reload seeds, rebuild FAISS"""
+     store = mcp.get_store_client()
+
+     # Clear all data
+     await store.clear_all()
+
+     # Reload seed companies
+     from app.config import COMPANIES_FILE
+
+     with open(COMPANIES_FILE) as f:
+         companies = json.load(f)
+
+     for company_data in companies:
+         await store.save_company(company_data)
+
+     # Rebuild vector index
+     vector_store.rebuild_index()
+
+     return {
+         "status": "reset_complete",
+         "companies_loaded": len(companies),
+         "timestamp": datetime.utcnow().isoformat()
+     }
+
+ if __name__ == "__main__":
+     import uvicorn
+     uvicorn.run(app, host="0.0.0.0", port=8000)
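
Since `/run` streams `application/x-ndjson` (one JSON event per line), a client reads it line by line rather than waiting for a full body. A minimal consumer sketch, assuming the API is running locally on port 8000 as in the `__main__` block above:

```python
# Sketch: consuming the /run NDJSON stream with requests.
import json
import requests

with requests.post(
    "http://localhost:8000/run",
    json={"company_names": ["Shopify", "Stripe"]},
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if not line:
            continue  # skip keep-alive blank lines
        event = json.loads(line)
        print(event["agent"], event["type"], event["message"])
```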
app/orchestrator.py ADDED
@@ -0,0 +1,230 @@
+ # file: app/orchestrator.py
+ from typing import List, AsyncGenerator, Optional
+ from app.schema import Prospect
+ from app.config import MODEL_NAME
+ from app.logging_utils import log_event, logger
+ from agents import (
+     Hunter, Enricher, Contactor, Scorer,
+     Writer, Compliance, Sequencer, Curator
+ )
+ from mcp.registry import MCPRegistry
+
+ class Orchestrator:
+     def __init__(self):
+         self.mcp = MCPRegistry()
+         self.hunter = Hunter(self.mcp)
+         self.enricher = Enricher(self.mcp)
+         self.contactor = Contactor(self.mcp)
+         self.scorer = Scorer(self.mcp)
+         self.writer = Writer(self.mcp)
+         self.compliance = Compliance(self.mcp)
+         self.sequencer = Sequencer(self.mcp)
+         self.curator = Curator(self.mcp)
+
+     async def run_pipeline(
+         self,
+         company_ids: Optional[List[str]] = None,
+         company_names: Optional[List[str]] = None,
+         use_seed_file: bool = False
+     ) -> AsyncGenerator[dict, None]:
+         """
+         Run the full pipeline with streaming events and detailed MCP tracking
+
+         Args:
+             company_ids: Legacy mode - company IDs from seed file
+             company_names: Dynamic mode - company names to discover
+             use_seed_file: Force legacy mode with seed file
+         """
+
+         # Hunter phase
+         if company_names and not use_seed_file:
+             yield log_event("hunter", "Starting dynamic company discovery", "agent_start")
+             yield log_event("hunter", f"Discovering {len(company_names)} companies via web search", "mcp_call",
+                             {"mcp_server": "web_search", "method": "discover_companies", "count": len(company_names)})
+
+             prospects = await self.hunter.run(company_names=company_names, use_seed_file=False)
+
+             yield log_event("hunter", f"Discovered {len(prospects)} companies from web search", "mcp_response",
+                             {"mcp_server": "web_search", "companies_discovered": len(prospects)})
+         else:
+             yield log_event("hunter", "Starting prospect discovery (legacy mode)", "agent_start")
+             yield log_event("hunter", "Calling MCP Store to load seed companies", "mcp_call",
+                             {"mcp_server": "store", "method": "load_companies"})
+
+             prospects = await self.hunter.run(company_ids=company_ids, use_seed_file=True)
+
+             yield log_event("hunter", f"MCP Store returned {len(prospects)} companies", "mcp_response",
+                             {"mcp_server": "store", "companies_count": len(prospects)})
+         yield log_event("hunter", f"Found {len(prospects)} prospects", "agent_end",
+                         {"count": len(prospects)})
+
+         for prospect in prospects:
+             try:
+                 company_name = prospect.company.name
+
+                 # Enricher phase
+                 yield log_event("enricher", f"Enriching {company_name}", "agent_start")
+                 yield log_event("enricher", "Calling MCP Search for company facts", "mcp_call",
+                                 {"mcp_server": "search", "company": company_name})
+
+                 prospect = await self.enricher.run(prospect)
+
+                 yield log_event("enricher", "MCP Search returned facts", "mcp_response",
+                                 {"mcp_server": "search", "facts_found": len(prospect.facts)})
+                 yield log_event("enricher", f"Calling MCP Store to save {len(prospect.facts)} facts", "mcp_call",
+                                 {"mcp_server": "store", "method": "save_facts"})
+                 yield log_event("enricher", f"Added {len(prospect.facts)} facts", "agent_end",
+                                 {"facts_count": len(prospect.facts)})
+
+                 # Contactor phase
+                 yield log_event("contactor", f"Finding contacts for {company_name}", "agent_start")
+                 yield log_event("contactor", "Calling MCP Store to check suppressions", "mcp_call",
+                                 {"mcp_server": "store", "method": "check_suppression", "domain": prospect.company.domain})
+
+                 # Check suppression
+                 store = self.mcp.get_store_client()
+                 suppressed = await store.check_suppression("domain", prospect.company.domain)
+
+                 if suppressed:
+                     yield log_event("contactor", f"Domain {prospect.company.domain} is suppressed", "mcp_response",
+                                     {"mcp_server": "store", "suppressed": True})
+                 else:
+                     yield log_event("contactor", f"Domain {prospect.company.domain} is not suppressed", "mcp_response",
+                                     {"mcp_server": "store", "suppressed": False})
+
+                 prospect = await self.contactor.run(prospect)
+
+                 if prospect.contacts:
+                     yield log_event("contactor", f"Calling MCP Store to save {len(prospect.contacts)} contacts", "mcp_call",
+                                     {"mcp_server": "store", "method": "save_contacts"})
+
+                 yield log_event("contactor", f"Found {len(prospect.contacts)} contacts", "agent_end",
+                                 {"contacts_count": len(prospect.contacts)})
+
+                 # Scorer phase
+                 yield log_event("scorer", f"Scoring {company_name}", "agent_start")
+                 yield log_event("scorer", "Calculating fit score based on industry, size, and pain points", "agent_log")
+
+                 prospect = await self.scorer.run(prospect)
+
+                 yield log_event("scorer", "Calling MCP Store to save prospect with score", "mcp_call",
+                                 {"mcp_server": "store", "method": "save_prospect", "fit_score": prospect.fit_score})
+                 yield log_event("scorer", f"Fit score: {prospect.fit_score:.2f}", "agent_end",
+                                 {"fit_score": prospect.fit_score, "status": prospect.status})
+
+                 if prospect.status == "dropped":
+                     yield log_event("scorer", f"Dropped: {prospect.dropped_reason}", "agent_log",
+                                     {"reason": prospect.dropped_reason})
+                     continue
+
+                 # Writer phase with streaming
+                 yield log_event("writer", f"Drafting outreach for {company_name}", "agent_start")
+                 yield log_event("writer", "Calling Vector Store for relevant facts", "mcp_call",
+                                 {"mcp_server": "vector", "method": "retrieve", "company_id": prospect.company.id})
+                 yield log_event("writer", "Calling HuggingFace Inference API for content generation", "mcp_call",
+                                 {"mcp_server": "hf_inference", "model": MODEL_NAME})
+
+                 async for event in self.writer.run_streaming(prospect):
+                     if event["type"] == "llm_token":
+                         yield event
+                     elif event["type"] == "llm_done":
+                         yield event
+                         prospect = event["payload"]["prospect"]
+                         yield log_event("writer", "HuggingFace Inference completed generation", "mcp_response",
+                                         {"mcp_server": "hf_inference", "has_summary": bool(prospect.summary),
+                                          "has_email": bool(prospect.email_draft)})
+
+                 yield log_event("writer", "Calling MCP Store to save draft", "mcp_call",
+                                 {"mcp_server": "store", "method": "save_prospect"})
+                 yield log_event("writer", "Draft complete", "agent_end",
+                                 {"has_summary": bool(prospect.summary),
+                                  "has_email": bool(prospect.email_draft)})
+
+                 # Compliance phase
+                 yield log_event("compliance", f"Checking compliance for {company_name}", "agent_start")
+                 yield log_event("compliance", "Calling MCP Store to check email/domain suppressions", "mcp_call",
+                                 {"mcp_server": "store", "method": "check_suppression"})
+
+                 # Check each contact for suppression
+                 for contact in prospect.contacts:
+                     email_suppressed = await store.check_suppression("email", contact.email)
+                     if email_suppressed:
+                         yield log_event("compliance", f"Email {contact.email} is suppressed", "mcp_response",
+                                         {"mcp_server": "store", "suppressed": True})
+
+                 yield log_event("compliance", "Checking CAN-SPAM, PECR, CASL requirements", "agent_log")
+
+                 prospect = await self.compliance.run(prospect)
+
+                 if prospect.status == "blocked":
+                     yield log_event("compliance", f"Blocked: {prospect.dropped_reason}", "policy_block",
+                                     {"reason": prospect.dropped_reason})
+                     continue
+                 else:
+                     yield log_event("compliance", "All compliance checks passed", "policy_pass")
+                     yield log_event("compliance", "Footer appended to email", "agent_log")
+
+                 # Sequencer phase
+                 yield log_event("sequencer", f"Sequencing outreach for {company_name}", "agent_start")
+
+                 if not prospect.contacts or not prospect.email_draft:
+                     yield log_event("sequencer", "Missing contacts or email draft", "agent_log",
+                                     {"has_contacts": bool(prospect.contacts),
+                                      "has_email": bool(prospect.email_draft)})
+                     prospect.status = "blocked"
+                     prospect.dropped_reason = "No contacts or email draft available"
+                     await store.save_prospect(prospect)
+                     yield log_event("sequencer", f"Blocked: {prospect.dropped_reason}", "agent_end")
+                     continue
+
+                 yield log_event("sequencer", "Calling MCP Calendar for available slots", "mcp_call",
+                                 {"mcp_server": "calendar", "method": "suggest_slots"})
+
+                 calendar = self.mcp.get_calendar_client()
+                 slots = await calendar.suggest_slots()
+
+                 yield log_event("sequencer", f"MCP Calendar returned {len(slots)} slots", "mcp_response",
+                                 {"mcp_server": "calendar", "slots_count": len(slots)})
+
+                 if slots:
+                     yield log_event("sequencer", "Calling MCP Calendar to generate ICS", "mcp_call",
+                                     {"mcp_server": "calendar", "method": "generate_ics"})
+
+                 yield log_event("sequencer", f"Calling MCP Email to send to {prospect.contacts[0].email}", "mcp_call",
+                                 {"mcp_server": "email", "method": "send", "recipient": prospect.contacts[0].email})
+
+                 prospect = await self.sequencer.run(prospect)
+
+                 yield log_event("sequencer", "MCP Email created thread", "mcp_response",
+                                 {"mcp_server": "email", "thread_id": prospect.thread_id})
+                 yield log_event("sequencer", f"Thread created: {prospect.thread_id}", "agent_end",
+                                 {"thread_id": prospect.thread_id})
+
+                 # Curator phase
+                 yield log_event("curator", f"Creating handoff for {company_name}", "agent_start")
+                 yield log_event("curator", "Calling MCP Email to retrieve thread", "mcp_call",
+                                 {"mcp_server": "email", "method": "get_thread", "prospect_id": prospect.id})
+
+                 email_client = self.mcp.get_email_client()
+                 thread = await email_client.get_thread(prospect.id) if prospect.thread_id else None
+
+                 if thread:
+                     yield log_event("curator", "MCP Email returned thread with messages", "mcp_response",
+                                     {"mcp_server": "email", "has_thread": True})
+
+                 yield log_event("curator", "Calling MCP Calendar for meeting slots", "mcp_call",
+                                 {"mcp_server": "calendar", "method": "suggest_slots"})
+
+                 prospect = await self.curator.run(prospect)
+
+                 yield log_event("curator", "Calling MCP Store to save handoff packet", "mcp_call",
+                                 {"mcp_server": "store", "method": "save_handoff"})
+                 yield log_event("curator", "Handoff packet created and saved", "mcp_response",
+                                 {"mcp_server": "store", "saved": True})
+                 yield log_event("curator", "Handoff ready", "agent_end",
+                                 {"prospect_id": prospect.id, "status": "ready_for_handoff"})
+
+             except Exception as e:
+                 logger.error(f"Pipeline error for {prospect.company.name}: {e}")
+                 yield log_event("orchestrator", f"Error: {str(e)}", "agent_log",
+                                 {"error": str(e), "prospect_id": prospect.id})
app/schema.py ADDED
@@ -0,0 +1,90 @@
+ # file: app/schema.py
+ from datetime import datetime
+ from typing import List, Optional, Dict, Any
+ from pydantic import BaseModel, EmailStr
+
+ class Company(BaseModel):
+     id: Optional[str] = None  # Auto-generated if not provided
+     name: str
+     domain: str
+     industry: str
+     size: Optional[str] = None  # Changed to string to accept "500-1000 employees" format
+     pains: List[str] = []
+     notes: List[str] = []
+     summary: Optional[str] = None
+
+ class Contact(BaseModel):
+     id: str
+     name: str
+     email: EmailStr
+     title: str
+     prospect_id: str
+
+ class Fact(BaseModel):
+     id: str
+     source: str
+     text: str
+     collected_at: datetime
+     ttl_hours: int
+     confidence: float
+     company_id: str
+
+ class Prospect(BaseModel):
+     id: str
+     company: Company
+     contacts: List[Contact] = []
+     facts: List[Fact] = []
+     fit_score: float = 0.0
+     status: str = "new"  # new, enriched, scored, drafted, compliant, sequenced, ready_for_handoff, dropped
+     dropped_reason: Optional[str] = None
+     summary: Optional[str] = None
+     email_draft: Optional[Dict[str, str]] = None
+     thread_id: Optional[str] = None
+
+ class Message(BaseModel):
+     id: str
+     thread_id: str
+     prospect_id: str
+     direction: str  # outbound, inbound
+     subject: str
+     body: str
+     sent_at: datetime
+
+ class Thread(BaseModel):
+     id: str
+     prospect_id: str
+     messages: List[Message] = []
+
+ class Suppression(BaseModel):
+     id: str
+     type: str  # email, domain, company
+     value: str
+     reason: str
+     expires_at: Optional[datetime] = None
+
+ class HandoffPacket(BaseModel):
+     prospect: Prospect
+     thread: Optional[Thread]
+     calendar_slots: List[Dict[str, str]] = []
+     generated_at: datetime
+
+ class PipelineEvent(BaseModel):
+     ts: datetime
+     type: str  # agent_start, agent_log, agent_end, llm_token, llm_done, policy_block, policy_pass
+     agent: str
+     message: str
+     payload: Dict[str, Any] = {}
+
+ class PipelineRequest(BaseModel):
+     """
+     Pipeline request supporting both dynamic and static modes
+
+     NEW: company_names - List of company names to discover dynamically
+     LEGACY: company_ids - List of company IDs from seed file (backwards compatible)
+     """
+     company_names: Optional[List[str]] = None  # NEW: Dynamic discovery mode
+     company_ids: Optional[List[str]] = None  # LEGACY: Static mode
+     use_seed_file: bool = False  # Force legacy mode
+
+ class WriterStreamRequest(BaseModel):
+     company_id: str
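
These models are what the API endpoints serialize with `.dict()`. A small construction sketch, consistent with the `/run` dynamic example above:

```python
# Sketch: building the models above by hand.
from app.schema import Company, Prospect, PipelineRequest

company = Company(name="Acme Corporation", domain="acme.com",
                  industry="SaaS", size="500-1000 employees")
prospect = Prospect(id="acme_test", company=company,
                    fit_score=0.8, status="scored")
request = PipelineRequest(company_names=["Shopify", "Stripe", "Zendesk"])

print(prospect.dict()["company"]["domain"])  # -> acme.com
```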
app_mcp_autonomous.py ADDED
@@ -0,0 +1,242 @@
+ """
+ CX AI Agent - Autonomous MCP Demo
+
+ This is the PROPER MCP implementation where:
+ - AI (Claude 3.5 Sonnet) autonomously calls MCP tools
+ - NO hardcoded workflow
+ - AI decides which tools to use and when
+ - Full Model Context Protocol demonstration
+
+ Perfect for MCP hackathon!
+ """
+
+ import os
+ import gradio as gr
+ from dotenv import load_dotenv
+
+ # Load environment variables
+ load_dotenv()
+
+ # Set in-memory MCP mode for HF Spaces
+ os.environ["USE_IN_MEMORY_MCP"] = "true"
+
+ from mcp.registry import get_mcp_registry
+ from mcp.agents.autonomous_agent import AutonomousMCPAgent
+
+
+ # Initialize MCP registry
+ mcp_registry = get_mcp_registry()
+
+
+ async def run_autonomous_agent(task: str, api_key: str):
+     """
+     Run the autonomous AI agent with MCP tool calling.
+
+     Args:
+         task: The task for the AI to complete autonomously
+         api_key: Anthropic API key for Claude
+
+     Yields:
+         Progress updates from the agent
+     """
+
+     if not api_key:
+         yield "❌ Error: Please provide an Anthropic API key"
+         return
+
+     if not task:
+         yield "❌ Error: Please provide a task description"
+         return
+
+     # Create autonomous agent
+     try:
+         agent = AutonomousMCPAgent(mcp_registry=mcp_registry, api_key=api_key)
+     except Exception as e:
+         yield f"❌ Error initializing agent: {str(e)}"
+         return
+
+     # Run agent autonomously
+     output_text = ""
+
+     try:
+         async for event in agent.run(task, max_iterations=15):
+             event_type = event.get("type")
+             message = event.get("message", "")
+
+             # Format the message based on event type
+             if event_type == "agent_start":
+                 output_text += f"\n{'='*60}\n"
+                 output_text += f"{message}\n"
+                 output_text += f"Model: {event.get('model')}\n"
+                 output_text += f"{'='*60}\n\n"
+
+             elif event_type == "iteration_start":
+                 output_text += f"\n{message}\n"
+
+             elif event_type == "tool_call":
+                 tool_input = event.get("input", {})
+                 output_text += f"\n{message}\n"
+                 output_text += f"  Input: {tool_input}\n"
+
+             elif event_type == "tool_result":
+                 result = event.get("result", {})
+                 output_text += f"{message}\n"
+
+                 # Show some result details
+                 if isinstance(result, dict):
+                     if "count" in result:
+                         output_text += f"  → Returned {result['count']} items\n"
+                     elif "status" in result:
+                         output_text += f"  → Status: {result['status']}\n"
+
+             elif event_type == "tool_error":
+                 error = event.get("error")
+                 output_text += f"\n{message}\n"
+                 output_text += f"  Error: {error}\n"
+
+             elif event_type == "agent_complete":
+                 final_response = event.get("final_response", "")
+                 iterations = event.get("iterations", 0)
+                 output_text += f"\n{'='*60}\n"
+                 output_text += f"{message}\n"
+                 output_text += f"Iterations: {iterations}\n"
+                 output_text += f"{'='*60}\n\n"
+                 output_text += f"**Final Response:**\n\n{final_response}\n"
+
+             elif event_type == "agent_error":
+                 error = event.get("error")
+                 output_text += f"\n{message}\n"
+                 output_text += f"Error: {error}\n"
+
+             elif event_type == "agent_max_iterations":
+                 output_text += f"\n{message}\n"
+
+             yield output_text
+
+     except Exception as e:
+         output_text += f"\n\n❌ Agent execution failed: {str(e)}\n"
+         yield output_text
+
+
+ def create_demo():
+     """Create Gradio demo interface"""
+
+     with gr.Blocks(title="CX AI Agent - Autonomous MCP Demo", theme=gr.themes.Soft()) as demo:
+         gr.Markdown("""
+         # 🤖 CX AI Agent - Autonomous MCP Demo
+
+         This demo shows **true AI-driven MCP usage** where Claude 3.5 Sonnet:
+         - ✅ Autonomously decides which MCP tools to call
+         - ✅ Uses Model Context Protocol servers (Search, Store, Email, Calendar)
+         - ✅ NO hardcoded workflow - AI makes all decisions
+         - ✅ Proper MCP protocol implementation
+
+         ## Available MCP Tools:
+         - 🔍 **Search**: Web search, news search
+         - 💾 **Store**: Save/retrieve prospects, companies, contacts, facts
+         - 📧 **Email**: Send emails, track threads
+         - 📅 **Calendar**: Suggest meeting times, generate invites
+
+         ## Example Tasks:
+         - "Research Shopify and determine if they're a good B2B prospect"
+         - "Find 3 e-commerce companies and save them as prospects"
+         - "Create a personalized outreach campaign for Stripe"
+         - "Find recent news about AI startups and save as facts"
+         """)
+
+         with gr.Row():
+             with gr.Column():
+                 api_key_input = gr.Textbox(
+                     label="Anthropic API Key",
+                     type="password",
+                     placeholder="sk-ant-...",
+                     info="Required for Claude 3.5 Sonnet (get one at console.anthropic.com)"
+                 )
+
+                 task_input = gr.Textbox(
+                     label="Task for AI Agent",
+                     placeholder="Research Shopify and create a prospect profile with facts",
+                     lines=3,
+                     info="Describe what you want the AI to do autonomously"
+                 )
+
+                 # Example tasks dropdown
+                 example_tasks = gr.Dropdown(
+                     label="Example Tasks (click to use)",
+                     choices=[
+                         "Research Shopify and determine if they're a good B2B SaaS prospect",
+                         "Find recent news about Stripe and save as facts in the database",
+                         "Create a prospect profile for Notion including company info and facts",
+                         "Search for B2B SaaS companies in the e-commerce space and save top 3 prospects",
+                         "Research Figma's recent product launches and save relevant facts",
+                     ],
+                     interactive=True
+                 )
+
+                 def use_example(example):
+                     return example
+
+                 example_tasks.change(fn=use_example, inputs=[example_tasks], outputs=[task_input])
+
+                 run_btn = gr.Button("🚀 Run Autonomous Agent", variant="primary", size="lg")
+
+             with gr.Column():
+                 output = gr.Textbox(
+                     label="Agent Progress & Results",
+                     lines=25,
+                     max_lines=50,
+                     show_copy_button=True
+                 )
+
+         run_btn.click(
+             fn=run_autonomous_agent,
+             inputs=[task_input, api_key_input],
+             outputs=[output]
+         )
+
+         gr.Markdown("""
+         ## 🎯 How It Works
+
+         1. **You provide a task** - Tell the AI what you want to accomplish
+         2. **AI analyzes the task** - Claude understands what needs to be done
+         3. **AI decides which tools to use** - Autonomously chooses MCP tools
+         4. **AI executes tools** - Calls MCP servers (search, store, email, calendar)
+         5. **AI continues until complete** - Keeps working until task is done
+
+         ## 🏆 True MCP Implementation
+
+         This is **NOT** a hardcoded workflow! The AI:
+         - ✅ Decides which tools to call based on context
+         - ✅ Adapts to new information
+         - ✅ Can call tools in any order
+         - ✅ Reasons about what information it needs
+         - ✅ Stores data for later use
+
+         ## 💡 Tips
+
+         - Be specific about what you want
+         - The AI can search, save data, and reason about prospects
+         - Try multi-step tasks to see autonomous decision-making
+         - Check the progress log to see which tools the AI chooses
+
+         ---
+
+         **Powered by:** Claude 3.5 Sonnet + Model Context Protocol (MCP)
+         """)
+
+     return demo
+
+
+ if __name__ == "__main__":
+     demo = create_demo()
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         show_error=True
+     )
assets/.gitkeep ADDED
@@ -0,0 +1 @@
+
check_api_keys.py ADDED
@@ -0,0 +1,73 @@
+ """
+ Quick diagnostic to check if API keys are accessible
+ """
+ import os
+ from dotenv import load_dotenv
+
+ # Load .env file
+ load_dotenv()
+
+ print("=" * 80)
+ print("API KEY DIAGNOSTIC CHECK")
+ print("=" * 80)
+ print()
+
+ # Check SERPER_API_KEY
+ serper_key = os.getenv('SERPER_API_KEY')
+ print(f"SERPER_API_KEY: {'✓ FOUND' if serper_key else '✗ NOT FOUND'}")
+ if serper_key:
+     print(f"  Value: {serper_key[:10]}..." if len(serper_key) > 10 else f"  Value: {serper_key}")
+     print(f"  Length: {len(serper_key)} characters")
+ else:
+     print("  ⚠ This key is REQUIRED for real contact discovery!")
+     print("  Get it from: https://serper.dev")
+
+ print()
+
+ # Check HF_API_TOKEN
+ hf_token = os.getenv('HF_API_TOKEN')
+ print(f"HF_API_TOKEN: {'✓ FOUND' if hf_token else '✗ NOT FOUND'}")
+ if hf_token:
+     print(f"  Value: {hf_token[:10]}..." if len(hf_token) > 10 else f"  Value: {hf_token}")
+     print(f"  Length: {len(hf_token)} characters")
+ else:
+     print("  ⚠ This key is needed for AI email generation")
+
+ print()
+
+ # Check if running in HF Space
+ space_id = os.getenv('SPACE_ID')
+ space_author = os.getenv('SPACE_AUTHOR_NAME')
+ if space_id or space_author:
+     print("🚀 Running in HuggingFace Space")
+     print(f"  Space ID: {space_id}")
+     print(f"  Author: {space_author}")
+     print()
+     print("NOTE: In HF Spaces, secrets should be set in:")
+     print("  Settings → Repository secrets")
+     print("  Then restart the Space for changes to take effect")
+ else:
+     print("💻 Running locally")
+     print()
+     print("For local development, create a .env file with:")
+     print("  SERPER_API_KEY=your-key-here")
+     print("  HF_API_TOKEN=your-token-here")
+
+ print()
+ print("=" * 80)
+
+ # Test web search service
+ print("\nTesting WebSearchService initialization...")
+ try:
+     from services.web_search import get_search_service
+     search = get_search_service()
+     if search.api_key:
+         print("✓ WebSearchService initialized with API key")
+     else:
+         print("✗ WebSearchService initialized WITHOUT API key")
+         print("  Web search will fail!")
+ except Exception as e:
+     print(f"✗ Error initializing WebSearchService: {e}")
+
+ print()
+ print("=" * 80)
create_branding_images.py ADDED
@@ -0,0 +1,130 @@
+ """
+ Create placeholder branding images for OmniFlow CX
+ These are simple placeholder images that can be replaced with professional designs
+ """
+ from PIL import Image, ImageDraw, ImageFont
+
+ def create_logo():
+     """Create Logo.png - App logo"""
+     width, height = 400, 120
+     img = Image.new('RGB', (width, height), color='#1e3a8a')  # Dark blue
+     draw = ImageDraw.Draw(img)
+
+     # Try to use a nice font, fall back to the default bitmap font
+     try:
+         font = ImageFont.truetype("arial.ttf", 48)
+         small_font = ImageFont.truetype("arial.ttf", 20)
+     except OSError:
+         font = ImageFont.load_default()
+         small_font = ImageFont.load_default()
+
+     # Draw wave emoji and text
+     text = "🌊 OmniFlow CX"
+     bbox = draw.textbbox((0, 0), text, font=font)
+     text_width = bbox[2] - bbox[0]
+     text_height = bbox[3] - bbox[1]
+     x = (width - text_width) / 2
+     y = (height - text_height) / 2 - 10
+
+     draw.text((x, y), text, fill='white', font=font)
+
+     # Subtitle
+     subtitle = "MCP-Powered B2B Sales Automation"
+     bbox2 = draw.textbbox((0, 0), subtitle, font=small_font)
+     text_width2 = bbox2[2] - bbox2[0]
+     x2 = (width - text_width2) / 2
+
+     draw.text((x2, y + 60), subtitle, fill='#93c5fd', font=small_font)  # Light blue
+
+     img.save('Logo.png')
+     print("[OK] Created Logo.png")
+
+ def create_banner():
+     """Create Banner.png - Banner image"""
+     width, height = 1200, 300
+     img = Image.new('RGB', (width, height), color='#0f172a')  # Very dark blue
+     draw = ImageDraw.Draw(img)
+
+     try:
+         font = ImageFont.truetype("arial.ttf", 72)
+         subtitle_font = ImageFont.truetype("arial.ttf", 32)
+     except OSError:
+         font = ImageFont.load_default()
+         subtitle_font = ImageFont.load_default()
+
+     # Main title
+     text = "🌊 OmniFlow CX"
+     bbox = draw.textbbox((0, 0), text, font=font)
+     text_width = bbox[2] - bbox[0]
+     x = (width - text_width) / 2
+
+     draw.text((x, 60), text, fill='white', font=font)
+
+     # Subtitle
+     subtitle = "Intelligent B2B Sales Automation • Model Context Protocol"
+     bbox2 = draw.textbbox((0, 0), subtitle, font=subtitle_font)
+     text_width2 = bbox2[2] - bbox2[0]
+     x2 = (width - text_width2) / 2
+
+     draw.text((x2, 160), subtitle, fill='#60a5fa', font=subtitle_font)
+
+     # Bottom text
+     bottom_text = "🏆 Hugging Face + Anthropic MCP Hackathon 2024"
+     try:
+         bottom_font = ImageFont.truetype("arial.ttf", 24)
+     except OSError:
+         bottom_font = ImageFont.load_default()
+     bbox3 = draw.textbbox((0, 0), bottom_text, font=bottom_font)
+     text_width3 = bbox3[2] - bbox3[0]
+     x3 = (width - text_width3) / 2
+
+     draw.text((x3, 230), bottom_text, fill='#fbbf24', font=bottom_font)  # Yellow
+
+     img.save('Banner.png')
+     print("[OK] Created Banner.png")
+
+ def create_ai_chatbot_logo():
+     """Create AI_chatbot_logo.png - AI assistant avatar"""
+     width, height = 200, 200
+     img = Image.new('RGBA', (width, height), color=(30, 58, 138, 255))  # Dark blue, fully opaque RGBA
+     draw = ImageDraw.Draw(img)
+
+     # Draw a circle
+     draw.ellipse([20, 20, 180, 180], fill='#3b82f6', outline='white', width=4)
+
+     try:
+         font = ImageFont.truetype("arial.ttf", 80)
+     except OSError:
+         font = ImageFont.load_default()
+
+     # Robot emoji
+     text = "🤖"
+     bbox = draw.textbbox((0, 0), text, font=font)
+     text_width = bbox[2] - bbox[0]
+     text_height = bbox[3] - bbox[1]
+     x = (width - text_width) / 2
+     y = (height - text_height) / 2
+
+     draw.text((x, y), text, font=font)
+
+     img.save('AI_chatbot_logo.png')
+     print("[OK] Created AI_chatbot_logo.png")
+
+ if __name__ == "__main__":
+     print("Creating OmniFlow CX branding images...")
+     print()
+
+     create_logo()
+     create_banner()
+     create_ai_chatbot_logo()
+
+     print()
+     print("[SUCCESS] All branding images created successfully!")
+     print()
+     print("Images created:")
+     print("  - Logo.png (400x120) - Main application logo")
+     print("  - Banner.png (1200x300) - Header banner")
+     print("  - AI_chatbot_logo.png (200x200) - AI assistant avatar")
+     print()
+     print("These are placeholder images. Replace with professional designs for production.")
data/companies.json ADDED
@@ -0,0 +1,56 @@
+ [
+   {
+     "id": "acme",
+     "name": "Acme Corporation",
+     "domain": "acme.com",
+     "industry": "SaaS",
+     "size": 500,
+     "pains": [
+       "Low NPS scores in enterprise segment",
+       "Customer churn increasing 15% YoY",
+       "Support ticket volume overwhelming team",
+       "No unified view of customer journey"
+     ],
+     "notes": [
+       "Recently raised Series C funding",
+       "Expanding into European market",
+       "Current support stack is fragmented"
+     ]
+   },
+   {
+     "id": "techcorp",
+     "name": "TechCorp Industries",
+     "domain": "techcorp.io",
+     "industry": "FinTech",
+     "size": 1200,
+     "pains": [
+       "Regulatory compliance for customer communications",
+       "Multi-channel support inconsistency",
+       "Customer onboarding takes too long",
+       "Poor personalization in customer interactions"
+     ],
+     "notes": [
+       "IPO planned for next year",
+       "Heavy investment in AI initiatives",
+       "Customer base growing 40% annually"
+     ]
+   },
+   {
+     "id": "retailplus",
+     "name": "RetailPlus",
+     "domain": "retailplus.com",
+     "industry": "E-commerce",
+     "size": 300,
+     "pains": [
+       "Seasonal support spikes unmanageable",
+       "Customer retention below industry average",
+       "No proactive customer engagement",
+       "Reviews and feedback not actionable"
+     ],
+     "notes": [
+       "Omnichannel retail strategy",
+       "Looking to improve post-purchase experience",
+       "Current NPS score is 42"
+     ]
+   }
+ ]
data/companies_store.json ADDED
@@ -0,0 +1,56 @@
+ [
+   {
+     "id": "acme",
+     "name": "Acme Corporation",
+     "domain": "acme.com",
+     "industry": "SaaS",
+     "size": 500,
+     "pains": [
+       "Low NPS scores in enterprise segment",
+       "Customer churn increasing 15% YoY",
+       "Support ticket volume overwhelming team",
+       "No unified view of customer journey"
+     ],
+     "notes": [
+       "Recently raised Series C funding",
+       "Expanding into European market",
+       "Current support stack is fragmented"
+     ]
+   },
+   {
+     "id": "techcorp",
+     "name": "TechCorp Industries",
+     "domain": "techcorp.io",
+     "industry": "FinTech",
+     "size": 1200,
+     "pains": [
+       "Regulatory compliance for customer communications",
+       "Multi-channel support inconsistency",
+       "Customer onboarding takes too long",
+       "Poor personalization in customer interactions"
+     ],
+     "notes": [
+       "IPO planned for next year",
+       "Heavy investment in AI initiatives",
+       "Customer base growing 40% annually"
+     ]
+   },
+   {
+     "id": "retailplus",
+     "name": "RetailPlus",
+     "domain": "retailplus.com",
+     "industry": "E-commerce",
+     "size": 300,
+     "pains": [
+       "Seasonal support spikes unmanageable",
+       "Customer retention below industry average",
+       "No proactive customer engagement",
+       "Reviews and feedback not actionable"
+     ],
+     "notes": [
+       "Omnichannel retail strategy",
+       "Looking to improve post-purchase experience",
+       "Current NPS score is 42"
+     ]
+   }
+ ]
data/contacts.json ADDED
@@ -0,0 +1 @@
+ []
data/facts.json ADDED
@@ -0,0 +1 @@
+ []
data/footer.txt ADDED
@@ -0,0 +1,9 @@
+
+ ---
+ Lucidya Inc.
+ Prince Turki Bin Abdulaziz Al Awwal Rd
+ Al Mohammadiyyah, Riyadh 12362
+ Saudi Arabia
+
+ This email was sent by Lucidya's AI-powered outreach system.
+ To opt out of future communications, click here: https://lucidya.com/unsubscribe
data/handoffs.json ADDED
@@ -0,0 +1 @@
+ []
data/prospects.json ADDED
@@ -0,0 +1 @@
+ []
data/suppression.json ADDED
@@ -0,0 +1,16 @@
+ [
+   {
+     "id": "supp-001",
+     "type": "domain",
+     "value": "competitor.com",
+     "reason": "Competitor - do not contact",
+     "expires_at": null
+   },
+   {
+     "id": "supp-002",
+     "type": "email",
+     "value": "[email protected]",
+     "reason": "Bounced email",
+     "expires_at": "2024-12-31T23:59:59Z"
+   }
+ ]
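
Each entry carries a `type` (domain or email), a `value`, and an optional `expires_at`. The real lookup lives behind the MCP Store client (`store.check_suppression`), but a hypothetical standalone check over this file could look like:

```python
# Sketch: an illustrative suppression check over data/suppression.json.
# is_suppressed is a hypothetical helper, not the project's actual API.
import json
from datetime import datetime, timezone

def is_suppressed(kind: str, value: str, path: str = "data/suppression.json") -> bool:
    with open(path) as f:
        entries = json.load(f)
    now = datetime.now(timezone.utc)
    for e in entries:
        if e["type"] != kind or e["value"].lower() != value.lower():
            continue
        expires = e.get("expires_at")
        if expires is None:
            return True  # permanent suppression (e.g., competitor domain)
        if datetime.fromisoformat(expires.replace("Z", "+00:00")) > now:
            return True  # still within the suppression window
    return False

# is_suppressed("domain", "competitor.com")  -> True
```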
database/manager.py ADDED
@@ -0,0 +1,297 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Database Manager for B2B Sales AI Agent
3
+ Handles database initialization, migrations, and session management
4
+ """
5
+ from sqlalchemy import create_engine, event
6
+ from sqlalchemy.orm import sessionmaker, scoped_session
7
+ from sqlalchemy.pool import StaticPool
8
+ import os
9
+ import logging
10
+ from pathlib import Path
11
+ from contextlib import contextmanager
12
+
13
+ logger = logging.getLogger(__name__)
14
+
15
+
16
+ class DatabaseManager:
17
+ """
18
+ Manages SQLite database connections and sessions
19
+ """
20
+
21
+ def __init__(self, db_path: str = None):
22
+ """
23
+ Initialize database manager
24
+
25
+ Args:
26
+ db_path: Path to SQLite database file
27
+ """
28
+ if db_path is None:
29
+ # Default to data/cx_agent.db
30
+ # For HuggingFace Spaces, try /data first (persistent), fallback to /tmp
31
+ default_path = os.getenv('DATABASE_PATH', './data/cx_agent.db')
32
+
33
+ # Check if we're on HuggingFace Spaces
34
+ if os.path.exists('/data'):
35
+ # HF Spaces with persistent storage
36
+ default_path = '/data/cx_agent.db'
37
+ elif os.path.exists('/tmp'):
38
+ # Fallback to tmp if data dir not available
39
+ default_path = '/tmp/cx_agent.db'
40
+
41
+ db_path = default_path
42
+
43
+ self.db_path = db_path
44
+ self.engine = None
45
+ self.Session = None
46
+
47
+ def initialize(self):
48
+ """Initialize database connection and create tables"""
49
+ try:
50
+ print(f"📂 Initializing database at: {self.db_path}")
51
+ logger.info(f"Initializing database at: {self.db_path}")
52
+
53
+ # Ensure data directory exists
54
+ db_dir = Path(self.db_path).parent
55
+ db_dir.mkdir(parents=True, exist_ok=True)
56
+ print(f"📁 Database directory: {db_dir}")
57
+ logger.info(f"Database directory created/verified: {db_dir}")
58
+
59
+ # Create engine
60
+ self.engine = create_engine(
61
+ f'sqlite:///{self.db_path}',
62
+ connect_args={'check_same_thread': False},
63
+ poolclass=StaticPool,
64
+ echo=False # Set to True for SQL debugging
65
+ )
66
+
67
+ # Enable foreign keys for SQLite
68
+ @event.listens_for(self.engine, "connect")
69
+ def set_sqlite_pragma(dbapi_conn, connection_record):
70
+ cursor = dbapi_conn.cursor()
71
+ cursor.execute("PRAGMA foreign_keys=ON")
72
+ cursor.close()
73
+
74
+ # Create session factory
75
+ # expire_on_commit=False keeps objects accessible after commit
76
+ session_factory = sessionmaker(bind=self.engine, expire_on_commit=False)
77
+ self.Session = scoped_session(session_factory)
78
+
79
+ # Import models and create tables
80
+ try:
81
+ from models.database import Base as EnterpriseBase
82
+ EnterpriseBase.metadata.create_all(self.engine)
83
+ print("✅ Enterprise tables created")
84
+ logger.info("Enterprise tables created")
85
+ except ImportError as e:
86
+ print(f"⚠️ Could not import enterprise models: {e}")
87
+ logger.warning(f"Could not import enterprise models: {e}")
88
+
89
+ logger.info(f"Database initialized at {self.db_path}")
90
+
91
+ # Initialize with default data
92
+ self._initialize_default_data()
93
+
94
+ return True
95
+
96
+ except Exception as e:
97
+ logger.error(f"Failed to initialize database: {str(e)}")
98
+ raise
99
+
100
+ def _initialize_default_data(self):
101
+ """Insert default data for new databases"""
102
+ try:
103
+ from models.database import Setting, Sequence, SequenceEmail, Template
104
+
105
+ session = self.Session()
106
+
107
+ # Check if already initialized
108
+ existing_settings = session.query(Setting).first()
109
+ if existing_settings:
110
+ session.close()
111
+ return
112
+
113
+ # Default settings
114
+ default_settings = [
115
+ Setting(key='company_name', value='Your Company', description='Company name for email footers'),
116
+ Setting(key='company_address', value='123 Main St, City, State 12345', description='Physical address for CAN-SPAM compliance'),
117
+ Setting(key='sender_name', value='Sales Team', description='Default sender name'),
118
+ Setting(key='sender_email', value='[email protected]', description='Default sender email'),
119
+ Setting(key='daily_email_limit', value='1000', description='Max emails per day'),
120
+ Setting(key='enable_tracking', value='1', description='Enable email tracking'),
121
+ ]
122
+ session.add_all(default_settings)
123
+
124
+ # Default sequence template: Cold Outreach (3-touch)
125
+ cold_outreach = Sequence(
126
+ name='Cold Outreach - 3 Touch',
127
+ description='Standard 3-email cold outreach sequence',
128
+ category='outbound',
129
+ is_template=True
130
+ )
131
+ session.add(cold_outreach)
132
+ session.flush()
133
+
134
+ sequence_emails = [
135
+ SequenceEmail(
136
+ sequence_id=cold_outreach.id,
137
+ step_number=1,
138
+ wait_days=0,
139
+ subject='Quick question about {{company_name}}',
140
+ body='''Hi {{first_name}},
141
+
142
+ I noticed {{company_name}} is in the {{industry}} space with {{company_size}} employees.
143
+
144
+ Companies like yours often face challenges with {{pain_points}}.
145
+
146
+ We've helped similar companies reduce support costs by 35% and improve customer satisfaction significantly.
147
+
148
+ Would you be open to a brief 15-minute call to explore if we might be able to help?
149
+
150
+ Best regards,
151
+ {{sender_name}}'''
152
+ ),
153
+ SequenceEmail(
154
+ sequence_id=cold_outreach.id,
155
+ step_number=2,
156
+ wait_days=3,
157
+ subject='Re: Quick question about {{company_name}}',
158
+ body='''Hi {{first_name}},
159
+
160
+ I wanted to follow up on my previous email. I understand you're busy, so I'll keep this brief.
161
+
162
+ We recently helped a company similar to {{company_name}} achieve:
163
+ • 40% reduction in support ticket volume
164
+ • 25% improvement in customer satisfaction scores
165
+ • 30% faster response times
166
+
167
+ I'd love to share how we did it. Are you available for a quick call this week?
168
+
169
+ Best,
170
+ {{sender_name}}'''
171
+ ),
172
+ SequenceEmail(
173
+ sequence_id=cold_outreach.id,
174
+ step_number=3,
175
+ wait_days=7,
176
+ subject='Last attempt - {{company_name}}',
177
+ body='''Hi {{first_name}},
178
+
179
+ This is my last attempt to reach you. I completely understand if now isn't the right time.
180
+
181
+ If you're interested in learning how we can help {{company_name}} improve customer experience, I'm happy to send over some quick resources.
182
+
183
+ Otherwise, I'll assume this isn't a priority right now and won't bother you again.
184
+
185
+ Thanks for your time,
186
+ {{sender_name}}
187
+
188
+ P.S. If you'd prefer to be removed from my list, just reply "Not interested" and I'll make sure you don't hear from me again.'''
189
+ ),
190
+ ]
191
+ session.add_all(sequence_emails)
192
+
193
+ # Default email templates
194
+ templates = [
195
+ Template(
196
+ name='Meeting Request',
197
+ category='meeting_request',
198
+ subject='Meeting invitation - {{company_name}}',
199
+ body='''Hi {{first_name}},
200
+
201
+ Thank you for your interest! I'd love to schedule a call to discuss how we can help {{company_name}}.
202
+
203
+ Here are a few time slots that work for me:
204
+ • {{time_slot_1}}
205
+ • {{time_slot_2}}
206
+ • {{time_slot_3}}
207
+
208
+ Let me know which works best for you, or feel free to suggest another time.
209
+
210
+ Looking forward to speaking with you!
211
+
212
+ Best,
213
+ {{sender_name}}''',
214
+ variables='["first_name", "company_name", "time_slot_1", "time_slot_2", "time_slot_3", "sender_name"]'
215
+ ),
216
+ Template(
217
+ name='Follow-up After Meeting',
218
+ category='follow_up',
219
+ subject='Great speaking with you, {{first_name}}',
220
+ body='''Hi {{first_name}},
221
+
222
+ Thanks for taking the time to speak with me today about {{company_name}}'s customer experience goals.
223
+
224
+ As discussed, here are the next steps:
225
+ • {{next_step_1}}
226
+ • {{next_step_2}}
227
+
228
+ I'll follow up on {{follow_up_date}} as we agreed.
229
+
230
+ Please don't hesitate to reach out if you have any questions in the meantime.
231
+
232
+ Best regards,
233
+ {{sender_name}}''',
234
+ variables='["first_name", "company_name", "next_step_1", "next_step_2", "follow_up_date", "sender_name"]'
235
+ ),
236
+ ]
237
+ session.add_all(templates)
238
+
239
+ session.commit()
240
+ session.close()
241
+
242
+ logger.info("Default data initialized successfully")
243
+
244
+ except Exception as e:
245
+ logger.error(f"Failed to initialize default data: {str(e)}")
246
+ if session:
247
+ session.rollback()
248
+ session.close()
249
+
250
+
251
+ @contextmanager
252
+ def get_session(self):
253
+ """
254
+ Context manager for database sessions
255
+
256
+ Usage:
257
+ with db_manager.get_session() as session:
258
+ session.query(Contact).all()
259
+ """
260
+ session = self.Session()
261
+ try:
262
+ yield session
263
+ session.commit()
264
+ except Exception:
265
+ session.rollback()
266
+ raise
267
+ finally:
268
+ session.close()
269
+
270
+ def close(self):
271
+ """Close database connection"""
272
+ if self.Session:
273
+ self.Session.remove()
274
+ if self.engine:
275
+ self.engine.dispose()
276
+ logger.info("Database connection closed")
277
+
278
+
279
+ # Global database manager instance
280
+ _db_manager = None
281
+
282
+
283
+ def get_db_manager() -> DatabaseManager:
284
+ """Get or create global database manager instance"""
285
+ global _db_manager
286
+ if _db_manager is None:
287
+ _db_manager = DatabaseManager()
288
+ _db_manager.initialize()
289
+ return _db_manager
290
+
291
+
292
+ def init_database(db_path: str = None):
293
+ """Initialize database with custom path"""
294
+ global _db_manager
295
+ _db_manager = DatabaseManager(db_path)
296
+ _db_manager.initialize()
297
+ return _db_manager
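
Typical use of the manager, mirroring the `get_session` docstring above; `Contact` is the model class referenced in that docstring, assumed to live in `models.database` alongside the other imports:

```python
# Sketch: using the global manager and session context manager.
from database.manager import get_db_manager
from models.database import Contact  # assumption: model module per imports above

db = get_db_manager()  # first call creates tables and seeds default data
with db.get_session() as session:
    contacts = session.query(Contact).all()
    print(f"{len(contacts)} contacts")  # commit happens on clean exit
db.close()
```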
database/schema.sql ADDED
@@ -0,0 +1,358 @@
1
+ -- CX AI Agent - Enterprise Database Schema
2
+ -- SQLite Schema for Campaign Management, Contact Tracking, and Analytics
3
+
4
+ -- =============================================================================
5
+ -- COMPANIES
6
+ -- =============================================================================
7
+ CREATE TABLE IF NOT EXISTS companies (
8
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
9
+ name TEXT NOT NULL,
10
+ domain TEXT UNIQUE,
11
+ industry TEXT,
12
+ size TEXT,
13
+ revenue TEXT,
14
+ location TEXT,
15
+ description TEXT,
16
+ pain_points TEXT, -- JSON array
17
+ website TEXT,
18
+ linkedin_url TEXT,
19
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
20
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
21
+ );
22
+
23
+ CREATE INDEX idx_companies_domain ON companies(domain);
24
+ CREATE INDEX idx_companies_industry ON companies(industry);
25
+
26
+ -- =============================================================================
27
+ -- CONTACTS
28
+ -- =============================================================================
29
+ CREATE TABLE IF NOT EXISTS contacts (
30
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
31
+ company_id INTEGER,
32
+ first_name TEXT,
33
+ last_name TEXT,
34
+ email TEXT UNIQUE NOT NULL,
35
+ phone TEXT,
36
+ job_title TEXT,
37
+ department TEXT,
38
+ seniority_level TEXT, -- C-Level, VP, Director, Manager, Individual Contributor
39
+ linkedin_url TEXT,
40
+ twitter_url TEXT,
41
+ location TEXT,
42
+ timezone TEXT,
43
+
44
+ -- Scoring
45
+ fit_score REAL DEFAULT 0.0,
46
+ engagement_score REAL DEFAULT 0.0,
47
+ intent_score REAL DEFAULT 0.0,
48
+ overall_score REAL DEFAULT 0.0,
49
+
50
+ -- Status & Lifecycle
51
+ status TEXT DEFAULT 'new', -- new, contacted, responded, meeting_scheduled, qualified, lost, customer
52
+ lifecycle_stage TEXT DEFAULT 'lead', -- lead, mql, sql, opportunity, customer, churned
53
+
54
+ -- Tracking
55
+ source TEXT, -- discovery_agent, manual_import, api, referral
56
+ first_contacted_at TIMESTAMP,
57
+ last_contacted_at TIMESTAMP,
58
+ last_activity_at TIMESTAMP,
59
+
60
+ -- Metadata
61
+ tags TEXT, -- JSON array
62
+ notes TEXT,
63
+ custom_fields TEXT, -- JSON object for extensibility
64
+ is_suppressed BOOLEAN DEFAULT 0,
65
+ suppression_reason TEXT,
66
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
67
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
68
+
69
+ FOREIGN KEY (company_id) REFERENCES companies(id) ON DELETE SET NULL
70
+ );
71
+
72
+ CREATE INDEX idx_contacts_email ON contacts(email);
73
+ CREATE INDEX idx_contacts_company ON contacts(company_id);
74
+ CREATE INDEX idx_contacts_status ON contacts(status);
75
+ CREATE INDEX idx_contacts_lifecycle_stage ON contacts(lifecycle_stage);
76
+ CREATE INDEX idx_contacts_overall_score ON contacts(overall_score);
77
+
78
+ -- =============================================================================
79
+ -- CAMPAIGNS
80
+ -- =============================================================================
81
+ CREATE TABLE IF NOT EXISTS campaigns (
82
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
83
+ name TEXT NOT NULL,
84
+ description TEXT,
85
+ status TEXT DEFAULT 'draft', -- draft, active, paused, completed, archived
86
+
87
+ -- Targeting
88
+ target_industries TEXT, -- JSON array
89
+ target_company_sizes TEXT, -- JSON array
90
+ target_locations TEXT, -- JSON array
91
+ target_job_titles TEXT, -- JSON array
92
+
93
+ -- Configuration
94
+ sequence_id INTEGER,
95
+ goal_contacts INTEGER,
96
+ goal_response_rate REAL,
97
+ goal_meetings INTEGER,
98
+
99
+ -- Tracking
100
+ contacts_discovered INTEGER DEFAULT 0,
101
+ contacts_enriched INTEGER DEFAULT 0,
102
+ contacts_scored INTEGER DEFAULT 0,
103
+ contacts_contacted INTEGER DEFAULT 0,
104
+ contacts_responded INTEGER DEFAULT 0,
105
+ meetings_booked INTEGER DEFAULT 0,
106
+
107
+ -- Dates
108
+ started_at TIMESTAMP,
109
+ completed_at TIMESTAMP,
110
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
111
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
112
+ created_by TEXT,
113
+
114
+ FOREIGN KEY (sequence_id) REFERENCES sequences(id) ON DELETE SET NULL
115
+ );
116
+
117
+ CREATE INDEX IF NOT EXISTS idx_campaigns_status ON campaigns(status);
118
+
119
+ -- =============================================================================
120
+ -- CAMPAIGN CONTACTS (Many-to-Many with Stage Tracking)
121
+ -- =============================================================================
122
+ CREATE TABLE IF NOT EXISTS campaign_contacts (
123
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
124
+ campaign_id INTEGER NOT NULL,
125
+ contact_id INTEGER NOT NULL,
126
+ stage TEXT DEFAULT 'discovery', -- discovery, enrichment, scoring, outreach, responded, meeting, closed_won, closed_lost
127
+ stage_updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
128
+ added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
129
+ notes TEXT,
130
+
131
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE CASCADE,
132
+ FOREIGN KEY (contact_id) REFERENCES contacts(id) ON DELETE CASCADE,
133
+ UNIQUE(campaign_id, contact_id)
134
+ );
135
+
136
+ CREATE INDEX IF NOT EXISTS idx_campaign_contacts_campaign ON campaign_contacts(campaign_id);
137
+ CREATE INDEX IF NOT EXISTS idx_campaign_contacts_contact ON campaign_contacts(contact_id);
138
+ CREATE INDEX IF NOT EXISTS idx_campaign_contacts_stage ON campaign_contacts(stage);
139
+
140
+ -- =============================================================================
141
+ -- EMAIL SEQUENCES
142
+ -- =============================================================================
143
+ CREATE TABLE IF NOT EXISTS sequences (
144
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
145
+ name TEXT NOT NULL,
146
+ description TEXT,
147
+ category TEXT DEFAULT 'outbound', -- outbound, nurture, re-engagement
148
+ is_active BOOLEAN DEFAULT 1,
149
+ is_template BOOLEAN DEFAULT 0,
150
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
151
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
152
+ created_by TEXT
153
+ );
154
+
155
+ -- =============================================================================
156
+ -- SEQUENCE EMAILS (Steps in a sequence)
157
+ -- =============================================================================
158
+ CREATE TABLE IF NOT EXISTS sequence_emails (
159
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
160
+ sequence_id INTEGER NOT NULL,
161
+ step_number INTEGER NOT NULL,
162
+ wait_days INTEGER DEFAULT 0, -- Days to wait after previous email
163
+ subject TEXT NOT NULL,
164
+ body TEXT NOT NULL,
165
+ send_time_preference TEXT, -- morning, afternoon, evening, or specific time
166
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
167
+
168
+ FOREIGN KEY (sequence_id) REFERENCES sequences(id) ON DELETE CASCADE,
169
+ UNIQUE(sequence_id, step_number)
170
+ );
171
+
172
+ CREATE INDEX IF NOT EXISTS idx_sequence_emails_sequence ON sequence_emails(sequence_id);
173
+
174
+ -- =============================================================================
175
+ -- EMAIL ACTIVITIES (Tracking email interactions)
176
+ -- =============================================================================
177
+ CREATE TABLE IF NOT EXISTS email_activities (
178
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
179
+ contact_id INTEGER NOT NULL,
180
+ campaign_id INTEGER,
181
+ sequence_email_id INTEGER,
182
+ type TEXT NOT NULL, -- sent, delivered, opened, clicked, replied, bounced, unsubscribed, complained
183
+ subject TEXT,
184
+ preview TEXT,
185
+ link_url TEXT, -- For click tracking
186
+ meta_data TEXT, -- JSON for additional data
187
+ occurred_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
188
+
189
+ FOREIGN KEY (contact_id) REFERENCES contacts(id) ON DELETE CASCADE,
190
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE SET NULL,
191
+ FOREIGN KEY (sequence_email_id) REFERENCES sequence_emails(id) ON DELETE SET NULL
192
+ );
193
+
194
+ CREATE INDEX IF NOT EXISTS idx_email_activities_contact ON email_activities(contact_id);
195
+ CREATE INDEX IF NOT EXISTS idx_email_activities_campaign ON email_activities(campaign_id);
196
+ CREATE INDEX IF NOT EXISTS idx_email_activities_type ON email_activities(type);
197
+ CREATE INDEX IF NOT EXISTS idx_email_activities_occurred ON email_activities(occurred_at);
198
+
199
+ -- =============================================================================
200
+ -- MEETINGS
201
+ -- =============================================================================
202
+ CREATE TABLE IF NOT EXISTS meetings (
203
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
204
+ contact_id INTEGER NOT NULL,
205
+ campaign_id INTEGER,
206
+ title TEXT NOT NULL,
207
+ description TEXT,
208
+ scheduled_at TIMESTAMP NOT NULL,
209
+ duration_minutes INTEGER DEFAULT 30,
210
+ meeting_url TEXT,
211
+ location TEXT,
212
+ status TEXT DEFAULT 'scheduled', -- scheduled, completed, cancelled, no_show, rescheduled
213
+ outcome TEXT, -- interested, not_interested, needs_follow_up, closed_won
214
+ notes TEXT,
215
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
216
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
217
+
218
+ FOREIGN KEY (contact_id) REFERENCES contacts(id) ON DELETE CASCADE,
219
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE SET NULL
220
+ );
221
+
222
+ CREATE INDEX IF NOT EXISTS idx_meetings_contact ON meetings(contact_id);
223
+ CREATE INDEX IF NOT EXISTS idx_meetings_campaign ON meetings(campaign_id);
224
+ CREATE INDEX IF NOT EXISTS idx_meetings_scheduled ON meetings(scheduled_at);
225
+ CREATE INDEX IF NOT EXISTS idx_meetings_status ON meetings(status);
226
+
227
+ -- =============================================================================
228
+ -- ACTIVITIES (General activity log)
229
+ -- =============================================================================
230
+ CREATE TABLE IF NOT EXISTS activities (
231
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
232
+ contact_id INTEGER,
233
+ campaign_id INTEGER,
234
+ meeting_id INTEGER,
235
+ type TEXT NOT NULL, -- discovery, enrichment, email_sent, email_opened, reply_received, meeting_scheduled, meeting_completed, note_added, status_changed
236
+ description TEXT,
237
+ meta_data TEXT, -- JSON for additional context
238
+ performed_by TEXT, -- agent_name or 'user'
239
+ occurred_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
240
+
241
+ FOREIGN KEY (contact_id) REFERENCES contacts(id) ON DELETE CASCADE,
242
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE SET NULL,
243
+ FOREIGN KEY (meeting_id) REFERENCES meetings(id) ON DELETE SET NULL
244
+ );
245
+
246
+ CREATE INDEX IF NOT EXISTS idx_activities_contact ON activities(contact_id);
247
+ CREATE INDEX IF NOT EXISTS idx_activities_campaign ON activities(campaign_id);
248
+ CREATE INDEX IF NOT EXISTS idx_activities_type ON activities(type);
249
+ CREATE INDEX IF NOT EXISTS idx_activities_occurred ON activities(occurred_at);
250
+
251
+ -- =============================================================================
252
+ -- AB TESTS (for email sequences)
253
+ -- =============================================================================
254
+ CREATE TABLE IF NOT EXISTS ab_tests (
255
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
256
+ campaign_id INTEGER NOT NULL,
257
+ sequence_id INTEGER NOT NULL,
258
+ name TEXT NOT NULL,
259
+ description TEXT,
260
+ test_type TEXT NOT NULL, -- subject_line, body, send_time, from_name
261
+ variant_a TEXT NOT NULL, -- JSON configuration
262
+ variant_b TEXT NOT NULL, -- JSON configuration
263
+ winner TEXT, -- 'a', 'b', or null if test ongoing
264
+ status TEXT DEFAULT 'running', -- running, completed, cancelled
265
+ started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
266
+ completed_at TIMESTAMP,
267
+
268
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE CASCADE,
269
+ FOREIGN KEY (sequence_id) REFERENCES sequences(id) ON DELETE CASCADE
270
+ );
271
+
272
+ -- =============================================================================
273
+ -- AB TEST RESULTS
274
+ -- =============================================================================
275
+ CREATE TABLE IF NOT EXISTS ab_test_results (
276
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
277
+ ab_test_id INTEGER NOT NULL,
278
+ variant TEXT NOT NULL, -- 'a' or 'b'
279
+ emails_sent INTEGER DEFAULT 0,
280
+ emails_delivered INTEGER DEFAULT 0,
281
+ emails_opened INTEGER DEFAULT 0,
282
+ emails_clicked INTEGER DEFAULT 0,
283
+ emails_replied INTEGER DEFAULT 0,
284
+ meetings_booked INTEGER DEFAULT 0,
285
+
286
+ FOREIGN KEY (ab_test_id) REFERENCES ab_tests(id) ON DELETE CASCADE,
287
+ UNIQUE(ab_test_id, variant)
288
+ );
289
+
290
+ -- =============================================================================
291
+ -- TEMPLATES (Email templates)
292
+ -- =============================================================================
293
+ CREATE TABLE IF NOT EXISTS templates (
294
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
295
+ name TEXT NOT NULL,
296
+ category TEXT, -- cold_outreach, follow_up, meeting_request, thank_you
297
+ subject TEXT NOT NULL,
298
+ body TEXT NOT NULL,
299
+ variables TEXT, -- JSON array of variable names
300
+ is_active BOOLEAN DEFAULT 1,
301
+ usage_count INTEGER DEFAULT 0,
302
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
303
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
304
+ );
305
+
306
+ -- =============================================================================
307
+ -- ANALYTICS SNAPSHOTS (Daily/hourly aggregated metrics)
308
+ -- =============================================================================
309
+ CREATE TABLE IF NOT EXISTS analytics_snapshots (
310
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
311
+ campaign_id INTEGER,
312
+ date DATE NOT NULL,
313
+ hour INTEGER, -- null for daily snapshots
314
+
315
+ -- Metrics
316
+ contacts_discovered INTEGER DEFAULT 0,
317
+ contacts_enriched INTEGER DEFAULT 0,
318
+ emails_sent INTEGER DEFAULT 0,
319
+ emails_opened INTEGER DEFAULT 0,
320
+ emails_clicked INTEGER DEFAULT 0,
321
+ emails_replied INTEGER DEFAULT 0,
322
+ meetings_booked INTEGER DEFAULT 0,
323
+
324
+ -- Rates
325
+ open_rate REAL DEFAULT 0.0,
326
+ click_rate REAL DEFAULT 0.0,
327
+ response_rate REAL DEFAULT 0.0,
328
+ meeting_rate REAL DEFAULT 0.0,
329
+
330
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
331
+
332
+ FOREIGN KEY (campaign_id) REFERENCES campaigns(id) ON DELETE CASCADE,
333
+ UNIQUE(campaign_id, date, hour)
334
+ );
335
+
336
+ CREATE INDEX IF NOT EXISTS idx_analytics_campaign ON analytics_snapshots(campaign_id);
337
+ CREATE INDEX IF NOT EXISTS idx_analytics_date ON analytics_snapshots(date);
338
+
339
+ -- =============================================================================
340
+ -- SETTINGS (Application configuration)
341
+ -- =============================================================================
342
+ CREATE TABLE IF NOT EXISTS settings (
343
+ key TEXT PRIMARY KEY,
344
+ value TEXT NOT NULL,
345
+ description TEXT,
346
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
347
+ );
348
+
349
+ -- Insert default settings
350
+ INSERT OR IGNORE INTO settings (key, value, description) VALUES
351
+ ('company_name', 'Your Company', 'Company name for email footers'),
352
+ ('company_address', '123 Main St, City, State 12345', 'Physical address for CAN-SPAM compliance'),
353
+ ('sender_name', 'Sales Team', 'Default sender name for emails'),
354
+ ('sender_email', '[email protected]', 'Default sender email'),
355
+ ('daily_email_limit', '1000', 'Maximum emails to send per day'),
356
+ ('enable_tracking', '1', 'Enable email open and click tracking'),
357
+ ('auto_pause_on_low_score', '1', 'Automatically pause contacts with low engagement'),
358
+ ('min_engagement_score', '0.3', 'Minimum engagement score before auto-pause');
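
A minimal sketch of applying this schema and reading a campaign funnel from Python's built-in sqlite3 module; the database filename and campaign id below are assumptions for illustration, not part of the schema:

import sqlite3

# Apply the schema (re-runnable: tables and indexes are guarded with IF NOT EXISTS,
# and the default settings use INSERT OR IGNORE).
conn = sqlite3.connect("cx_agent.db")  # hypothetical database file
with open("database/schema.sql") as f:
    conn.executescript(f.read())

# Funnel snapshot for one campaign: contacts per pipeline stage.
rows = conn.execute(
    "SELECT stage, COUNT(*) FROM campaign_contacts WHERE campaign_id = ? GROUP BY stage",
    (1,),  # hypothetical campaign id
).fetchall()
print(dict(rows))
conn.close()
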
database/schema_extended.sql ADDED
@@ -0,0 +1,472 @@
1
+ -- CX Platform - Extended Database Schema
2
+ -- Adds tickets, knowledge base, chat, and customer interaction tracking
3
+
4
+ -- =============================================================================
5
+ -- CUSTOMERS (Enhanced from contacts)
6
+ -- =============================================================================
7
+ CREATE TABLE IF NOT EXISTS cx_customers (
8
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
9
+ email TEXT UNIQUE NOT NULL,
10
+ first_name TEXT,
11
+ last_name TEXT,
12
+ company TEXT,
13
+ phone TEXT,
14
+
15
+ -- Segmentation
16
+ segment TEXT DEFAULT 'standard', -- vip, standard, at_risk, churned
17
+ lifecycle_stage TEXT DEFAULT 'active', -- new, active, at_risk, churned
18
+
19
+ -- Metrics
20
+ lifetime_value REAL DEFAULT 0.0,
21
+ satisfaction_score REAL DEFAULT 0.0, -- CSAT average
22
+ nps_score INTEGER, -- Net Promoter Score
23
+ sentiment TEXT DEFAULT 'neutral', -- positive, neutral, negative
24
+
25
+ -- Tracking
26
+ first_interaction_at TIMESTAMP,
27
+ last_interaction_at TIMESTAMP,
28
+ total_interactions INTEGER DEFAULT 0,
29
+ total_tickets INTEGER DEFAULT 0,
30
+
31
+ -- Metadata
32
+ tags TEXT, -- JSON array
33
+ custom_fields TEXT, -- JSON object
34
+ notes TEXT,
35
+
36
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
37
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
38
+ );
39
+
40
+ CREATE INDEX IF NOT EXISTS idx_cx_customers_email ON cx_customers(email);
41
+ CREATE INDEX IF NOT EXISTS idx_cx_customers_segment ON cx_customers(segment);
42
+ CREATE INDEX IF NOT EXISTS idx_cx_customers_sentiment ON cx_customers(sentiment);
43
+
44
+ -- =============================================================================
45
+ -- TICKETS
46
+ -- =============================================================================
47
+ CREATE TABLE IF NOT EXISTS cx_tickets (
48
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
49
+ customer_id INTEGER NOT NULL,
50
+
51
+ -- Core fields
52
+ subject TEXT NOT NULL,
53
+ description TEXT,
54
+ status TEXT DEFAULT 'new', -- new, open, pending, resolved, closed
55
+ priority TEXT DEFAULT 'medium', -- low, medium, high, urgent
56
+ category TEXT, -- technical, billing, feature_request, etc.
57
+
58
+ -- Assignment
59
+ assigned_to TEXT, -- agent name/id
60
+ assigned_team TEXT,
61
+
62
+ -- SLA
63
+ sla_due_at TIMESTAMP,
64
+ first_response_at TIMESTAMP,
65
+ resolved_at TIMESTAMP,
66
+ closed_at TIMESTAMP,
67
+
68
+ -- Metrics
69
+ response_time_minutes INTEGER,
70
+ resolution_time_minutes INTEGER,
71
+ reopened_count INTEGER DEFAULT 0,
72
+
73
+ -- AI fields
74
+ sentiment TEXT, -- detected from description
75
+ ai_suggested_category TEXT,
76
+ ai_confidence REAL,
77
+ auto_resolved BOOLEAN DEFAULT 0,
78
+
79
+ -- Metadata
80
+ source TEXT DEFAULT 'manual', -- manual, email, chat, api, web_form
81
+ tags TEXT, -- JSON array
82
+ custom_fields TEXT, -- JSON
83
+
84
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
85
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
86
+
87
+ FOREIGN KEY (customer_id) REFERENCES cx_customers(id) ON DELETE CASCADE
88
+ );
89
+
90
+ CREATE INDEX IF NOT EXISTS idx_cx_tickets_customer ON cx_tickets(customer_id);
91
+ CREATE INDEX IF NOT EXISTS idx_cx_tickets_status ON cx_tickets(status);
92
+ CREATE INDEX IF NOT EXISTS idx_cx_tickets_priority ON cx_tickets(priority);
93
+ CREATE INDEX IF NOT EXISTS idx_cx_tickets_assigned_to ON cx_tickets(assigned_to);
94
+ CREATE INDEX IF NOT EXISTS idx_cx_tickets_sla_due ON cx_tickets(sla_due_at);
95
+
96
+ -- =============================================================================
97
+ -- TICKET MESSAGES
98
+ -- =============================================================================
99
+ CREATE TABLE IF NOT EXISTS cx_ticket_messages (
100
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
101
+ ticket_id INTEGER NOT NULL,
102
+
103
+ -- Sender
104
+ sender_type TEXT NOT NULL, -- customer, agent, system, ai_bot
105
+ sender_id TEXT, -- customer_id, agent_id, or 'system'
106
+ sender_name TEXT,
107
+
108
+ -- Message
109
+ message TEXT NOT NULL,
110
+ message_html TEXT,
111
+ is_internal BOOLEAN DEFAULT 0, -- internal note vs customer-visible
112
+
113
+ -- AI fields
114
+ sentiment TEXT,
115
+ intent TEXT, -- question, complaint, praise, feedback
116
+
117
+ -- Metadata
118
+ meta_data TEXT, -- JSON
119
+
120
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
121
+
122
+ FOREIGN KEY (ticket_id) REFERENCES cx_tickets(id) ON DELETE CASCADE
123
+ );
124
+
125
+ CREATE INDEX IF NOT EXISTS idx_cx_ticket_messages_ticket ON cx_ticket_messages(ticket_id);
126
+ CREATE INDEX IF NOT EXISTS idx_cx_ticket_messages_created ON cx_ticket_messages(created_at);
127
+
128
+ -- =============================================================================
129
+ -- TICKET ATTACHMENTS
130
+ -- =============================================================================
131
+ CREATE TABLE IF NOT EXISTS cx_ticket_attachments (
132
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
133
+ ticket_id INTEGER NOT NULL,
134
+ message_id INTEGER,
135
+
136
+ filename TEXT NOT NULL,
137
+ file_path TEXT NOT NULL,
138
+ file_size INTEGER,
139
+ mime_type TEXT,
140
+
141
+ uploaded_by TEXT,
142
+ uploaded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
143
+
144
+ FOREIGN KEY (ticket_id) REFERENCES cx_tickets(id) ON DELETE CASCADE,
145
+ FOREIGN KEY (message_id) REFERENCES cx_ticket_messages(id) ON DELETE SET NULL
146
+ );
147
+
148
+ -- =============================================================================
149
+ -- KNOWLEDGE BASE
150
+ -- =============================================================================
151
+ CREATE TABLE IF NOT EXISTS cx_kb_categories (
152
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
153
+ name TEXT NOT NULL,
154
+ description TEXT,
155
+ parent_id INTEGER,
156
+ display_order INTEGER DEFAULT 0,
157
+ icon TEXT,
158
+
159
+ is_active BOOLEAN DEFAULT 1,
160
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
161
+
162
+ FOREIGN KEY (parent_id) REFERENCES cx_kb_categories(id) ON DELETE SET NULL
163
+ );
164
+
165
+ CREATE TABLE IF NOT EXISTS cx_kb_articles (
166
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
167
+ category_id INTEGER,
168
+
169
+ -- Content
170
+ title TEXT NOT NULL,
171
+ summary TEXT,
172
+ content TEXT NOT NULL,
173
+ content_html TEXT,
174
+
175
+ -- Status
176
+ status TEXT DEFAULT 'draft', -- draft, published, archived
177
+ visibility TEXT DEFAULT 'public', -- public, internal, private
178
+
179
+ -- SEO
180
+ slug TEXT UNIQUE,
181
+ meta_description TEXT,
182
+
183
+ -- Metrics
184
+ view_count INTEGER DEFAULT 0,
185
+ helpful_count INTEGER DEFAULT 0,
186
+ not_helpful_count INTEGER DEFAULT 0,
187
+ average_rating REAL DEFAULT 0.0,
188
+
189
+ -- AI fields
190
+ ai_generated BOOLEAN DEFAULT 0,
191
+ ai_confidence REAL,
192
+ keywords TEXT, -- JSON array for semantic search
193
+
194
+ -- Versioning
195
+ version INTEGER DEFAULT 1,
196
+
197
+ -- Metadata
198
+ tags TEXT, -- JSON array
199
+ related_articles TEXT, -- JSON array of article IDs
200
+
201
+ -- Authoring
202
+ author TEXT,
203
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
204
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
205
+ published_at TIMESTAMP,
206
+
207
+ FOREIGN KEY (category_id) REFERENCES cx_kb_categories(id) ON DELETE SET NULL
208
+ );
209
+
210
+ CREATE INDEX IF NOT EXISTS idx_cx_kb_articles_category ON cx_kb_articles(category_id);
211
+ CREATE INDEX IF NOT EXISTS idx_cx_kb_articles_status ON cx_kb_articles(status);
212
+ CREATE INDEX IF NOT EXISTS idx_cx_kb_articles_slug ON cx_kb_articles(slug);
213
+
214
+ -- =============================================================================
215
+ -- KB ARTICLE VERSIONS
216
+ -- =============================================================================
217
+ CREATE TABLE IF NOT EXISTS cx_kb_article_versions (
218
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
219
+ article_id INTEGER NOT NULL,
220
+
221
+ version INTEGER NOT NULL,
222
+ title TEXT NOT NULL,
223
+ content TEXT NOT NULL,
224
+
225
+ changed_by TEXT,
226
+ change_note TEXT,
227
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
228
+
229
+ FOREIGN KEY (article_id) REFERENCES cx_kb_articles(id) ON DELETE CASCADE,
230
+ UNIQUE(article_id, version)
231
+ );
232
+
233
+ -- =============================================================================
234
+ -- LIVE CHAT SESSIONS
235
+ -- =============================================================================
236
+ CREATE TABLE IF NOT EXISTS cx_chat_sessions (
237
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
238
+ customer_id INTEGER,
239
+
240
+ -- Session info
241
+ session_id TEXT UNIQUE NOT NULL,
242
+ status TEXT DEFAULT 'active', -- active, waiting, assigned, closed
243
+
244
+ -- Routing
245
+ assigned_to TEXT, -- agent name/id
246
+ assigned_at TIMESTAMP,
247
+
248
+ -- AI bot
249
+ bot_active BOOLEAN DEFAULT 1,
250
+ bot_handed_off BOOLEAN DEFAULT 0,
251
+ bot_handoff_reason TEXT,
252
+
253
+ -- Metrics
254
+ wait_time_seconds INTEGER DEFAULT 0,
255
+ response_time_seconds INTEGER DEFAULT 0,
256
+ message_count INTEGER DEFAULT 0,
257
+
258
+ -- Metadata
259
+ page_url TEXT,
260
+ referrer TEXT,
261
+ user_agent TEXT,
262
+ ip_address TEXT,
263
+
264
+ -- Satisfaction
265
+ rated BOOLEAN DEFAULT 0,
266
+ rating INTEGER,
267
+ feedback TEXT,
268
+
269
+ started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
270
+ ended_at TIMESTAMP,
271
+
272
+ FOREIGN KEY (customer_id) REFERENCES cx_customers(id) ON DELETE SET NULL
273
+ );
274
+
275
+ CREATE INDEX IF NOT EXISTS idx_cx_chat_sessions_customer ON cx_chat_sessions(customer_id);
276
+ CREATE INDEX IF NOT EXISTS idx_cx_chat_sessions_status ON cx_chat_sessions(status);
277
+ CREATE INDEX IF NOT EXISTS idx_cx_chat_sessions_assigned_to ON cx_chat_sessions(assigned_to);
278
+
279
+ -- =============================================================================
280
+ -- CHAT MESSAGES
281
+ -- =============================================================================
282
+ CREATE TABLE IF NOT EXISTS cx_chat_messages (
283
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
284
+ session_id INTEGER NOT NULL,
285
+
286
+ -- Sender
287
+ sender_type TEXT NOT NULL, -- customer, agent, bot, system
288
+ sender_id TEXT,
289
+ sender_name TEXT,
290
+
291
+ -- Message
292
+ message TEXT NOT NULL,
293
+ message_type TEXT DEFAULT 'text', -- text, image, file, system_message
294
+
295
+ -- AI fields
296
+ is_bot_response BOOLEAN DEFAULT 0,
297
+ bot_confidence REAL,
298
+ intent TEXT,
299
+
300
+ -- Status
301
+ is_read BOOLEAN DEFAULT 0,
302
+ read_at TIMESTAMP,
303
+
304
+ -- Metadata
305
+ meta_data TEXT, -- JSON
306
+
307
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
308
+
309
+ FOREIGN KEY (session_id) REFERENCES cx_chat_sessions(id) ON DELETE CASCADE
310
+ );
311
+
312
+ CREATE INDEX IF NOT EXISTS idx_cx_chat_messages_session ON cx_chat_messages(session_id);
313
+ CREATE INDEX IF NOT EXISTS idx_cx_chat_messages_created ON cx_chat_messages(created_at);
314
+
315
+ -- =============================================================================
316
+ -- AUTOMATION RULES
317
+ -- =============================================================================
318
+ CREATE TABLE IF NOT EXISTS cx_automation_rules (
319
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
320
+
321
+ name TEXT NOT NULL,
322
+ description TEXT,
323
+ is_active BOOLEAN DEFAULT 1,
324
+
325
+ -- Trigger
326
+ trigger_type TEXT NOT NULL, -- ticket_created, ticket_updated, time_based, etc.
327
+ trigger_conditions TEXT NOT NULL, -- JSON
328
+
329
+ -- Actions
330
+ actions TEXT NOT NULL, -- JSON array of actions
331
+
332
+ -- Execution
333
+ execution_count INTEGER DEFAULT 0,
334
+ last_executed_at TIMESTAMP,
335
+
336
+ -- Priority
337
+ priority INTEGER DEFAULT 0,
338
+
339
+ created_by TEXT,
340
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
341
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
342
+ );
343
+
344
+ -- =============================================================================
345
+ -- CUSTOMER INTERACTIONS
346
+ -- =============================================================================
347
+ CREATE TABLE IF NOT EXISTS cx_interactions (
348
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
349
+ customer_id INTEGER NOT NULL,
350
+
351
+ type TEXT NOT NULL, -- ticket, chat, email, call, meeting
352
+ channel TEXT, -- web, email, phone, chat, api
353
+
354
+ summary TEXT,
355
+ sentiment TEXT,
356
+ intent TEXT,
357
+
358
+ -- References
359
+ reference_type TEXT, -- ticket, chat_session, email, etc.
360
+ reference_id INTEGER,
361
+
362
+ -- Metrics
363
+ duration_seconds INTEGER,
364
+ satisfaction_rating INTEGER,
365
+
366
+ -- Agent
367
+ handled_by TEXT,
368
+
369
+ occurred_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
370
+
371
+ FOREIGN KEY (customer_id) REFERENCES cx_customers(id) ON DELETE CASCADE
372
+ );
373
+
374
+ CREATE INDEX IF NOT EXISTS idx_cx_interactions_customer ON cx_interactions(customer_id);
375
+ CREATE INDEX IF NOT EXISTS idx_cx_interactions_type ON cx_interactions(type);
376
+ CREATE INDEX IF NOT EXISTS idx_cx_interactions_occurred ON cx_interactions(occurred_at);
377
+
378
+ -- =============================================================================
379
+ -- ANALYTICS SNAPSHOTS (Enhanced)
380
+ -- =============================================================================
381
+ CREATE TABLE IF NOT EXISTS cx_analytics_daily (
382
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
383
+ date DATE NOT NULL UNIQUE,
384
+
385
+ -- Ticket metrics
386
+ tickets_created INTEGER DEFAULT 0,
387
+ tickets_resolved INTEGER DEFAULT 0,
388
+ tickets_reopened INTEGER DEFAULT 0,
389
+ avg_resolution_time_minutes REAL DEFAULT 0.0,
390
+ avg_first_response_minutes REAL DEFAULT 0.0,
391
+
392
+ -- Chat metrics
393
+ chats_started INTEGER DEFAULT 0,
394
+ chats_completed INTEGER DEFAULT 0,
395
+ avg_wait_time_seconds REAL DEFAULT 0.0,
396
+ bot_resolution_rate REAL DEFAULT 0.0,
397
+
398
+ -- Satisfaction
399
+ avg_csat REAL DEFAULT 0.0,
400
+ avg_nps INTEGER DEFAULT 0,
401
+
402
+ -- KB metrics
403
+ kb_views INTEGER DEFAULT 0,
404
+ kb_helpful_votes INTEGER DEFAULT 0,
405
+ kb_searches INTEGER DEFAULT 0,
406
+
407
+ -- Sentiment
408
+ positive_interactions INTEGER DEFAULT 0,
409
+ neutral_interactions INTEGER DEFAULT 0,
410
+ negative_interactions INTEGER DEFAULT 0,
411
+
412
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
413
+ );
414
+
415
+ CREATE INDEX IF NOT EXISTS idx_cx_analytics_daily_date ON cx_analytics_daily(date);
416
+
417
+ -- =============================================================================
418
+ -- CANNED RESPONSES (Templates)
419
+ -- =============================================================================
420
+ CREATE TABLE IF NOT EXISTS cx_canned_responses (
421
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
422
+
423
+ name TEXT NOT NULL,
424
+ shortcut TEXT UNIQUE, -- e.g., "/greeting"
425
+ category TEXT,
426
+
427
+ subject TEXT,
428
+ content TEXT NOT NULL,
429
+
430
+ -- Usage
431
+ use_count INTEGER DEFAULT 0,
432
+ last_used_at TIMESTAMP,
433
+
434
+ is_active BOOLEAN DEFAULT 1,
435
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
436
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
437
+ );
438
+
439
+ -- =============================================================================
440
+ -- AGENT PERFORMANCE
441
+ -- =============================================================================
442
+ CREATE TABLE IF NOT EXISTS cx_agent_stats (
443
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
444
+ agent_id TEXT NOT NULL,
445
+ agent_name TEXT NOT NULL,
446
+ date DATE NOT NULL,
447
+
448
+ -- Tickets
449
+ tickets_handled INTEGER DEFAULT 0,
450
+ tickets_resolved INTEGER DEFAULT 0,
451
+ avg_resolution_time_minutes REAL DEFAULT 0.0,
452
+
453
+ -- Chats
454
+ chats_handled INTEGER DEFAULT 0,
455
+ avg_chat_duration_minutes REAL DEFAULT 0.0,
456
+
457
+ -- Quality
458
+ avg_csat REAL DEFAULT 0.0,
459
+ positive_feedbacks INTEGER DEFAULT 0,
460
+ negative_feedbacks INTEGER DEFAULT 0,
461
+
462
+ -- Efficiency
463
+ avg_response_time_minutes REAL DEFAULT 0.0,
464
+ first_contact_resolutions INTEGER DEFAULT 0,
465
+
466
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
467
+
468
+ UNIQUE(agent_id, date)
469
+ );
470
+
471
+ CREATE INDEX IF NOT EXISTS idx_cx_agent_stats_agent ON cx_agent_stats(agent_id);
472
+ CREATE INDEX IF NOT EXISTS idx_cx_agent_stats_date ON cx_agent_stats(date);
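
As a sketch of how the ticketing tables fit together, the snippet below opens a ticket for a (possibly new) customer and records its first message; all values are illustrative:

import sqlite3

conn = sqlite3.connect("cx_agent.db")  # hypothetical database file
cur = conn.cursor()

# Upsert the customer by unique email, then resolve their id.
cur.execute("INSERT OR IGNORE INTO cx_customers (email, first_name) VALUES (?, ?)",
            ("jane@example.com", "Jane"))
customer_id = cur.execute("SELECT id FROM cx_customers WHERE email = ?",
                          ("jane@example.com",)).fetchone()[0]

# Open the ticket, then attach the customer's first message to it.
cur.execute("INSERT INTO cx_tickets (customer_id, subject, priority, source) VALUES (?, ?, ?, ?)",
            (customer_id, "Cannot log in", "high", "web_form"))
ticket_id = cur.lastrowid
cur.execute("INSERT INTO cx_ticket_messages (ticket_id, sender_type, sender_id, message) VALUES (?, ?, ?, ?)",
            (ticket_id, "customer", str(customer_id), "I get an error on sign-in."))
conn.commit()
conn.close()
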
mcp/__init__.py ADDED
@@ -0,0 +1,2 @@
1
+ # file: mcp/__init__.py
2
+ """Model Context Protocol implementation"""
mcp/agents/autonomous_agent.py ADDED
@@ -0,0 +1,413 @@
1
+ """
2
+ Autonomous AI Agent with MCP Tool Calling
3
+
4
+ This agent uses Claude 3.5 Sonnet (or compatible LLM) to autonomously
5
+ decide which MCP tools to call based on the user's task.
6
+
7
+ This is TRUE AI-driven MCP usage - no hardcoded workflow!
8
+ """
9
+
10
+ import os
11
+ import json
12
+ import uuid
13
+ import logging
14
+ from typing import List, Dict, Any, AsyncGenerator
15
+ from anthropic import AsyncAnthropic
16
+
17
+ from mcp.tools.definitions import MCP_TOOLS
18
+ from mcp.registry import MCPRegistry
19
+
20
+ logger = logging.getLogger(__name__)
21
+
22
+
23
+ class AutonomousMCPAgent:
24
+ """
25
+ AI Agent that autonomously uses MCP servers as tools.
26
+
27
+ Key Features:
28
+ - Uses Claude 3.5 Sonnet for tool calling
29
+ - Autonomously decides which MCP tools to use
30
+ - No hardcoded workflow - AI makes all decisions
31
+ - Proper MCP protocol implementation
32
+ """
33
+
34
+ def __init__(self, mcp_registry: MCPRegistry, api_key: str = None):
35
+ """
36
+ Initialize the autonomous agent
37
+
38
+ Args:
39
+ mcp_registry: MCP registry with all servers
40
+ api_key: Anthropic API key (or use ANTHROPIC_API_KEY env var)
41
+ """
42
+ self.mcp_registry = mcp_registry
43
+ self.api_key = api_key or os.getenv("ANTHROPIC_API_KEY")
44
+
45
+ if not self.api_key:
46
+ raise ValueError(
47
+ "Anthropic API key required for autonomous agent. "
48
+ "Set ANTHROPIC_API_KEY environment variable or pass api_key parameter."
49
+ )
50
+
51
+ self.client = AsyncAnthropic(api_key=self.api_key)
52
+ self.model = "claude-3-5-sonnet-20241022"
53
+
54
+ # System prompt for the agent
55
+ self.system_prompt = """You are an autonomous AI agent for B2B sales automation.
56
+
57
+ You have access to MCP (Model Context Protocol) servers that provide tools for:
58
+ - Web search (find company information, news, insights)
59
+ - Data storage (save prospects, companies, contacts, facts)
60
+ - Email management (send emails, track threads)
61
+ - Calendar (schedule meetings)
62
+
63
+ Your goal is to help with B2B sales tasks like:
64
+ - Finding and researching potential customers
65
+ - Enriching company data with facts and insights
66
+ - Finding decision-maker contacts
67
+ - Drafting personalized outreach emails
68
+ - Managing prospect pipeline
69
+
70
+ IMPORTANT:
71
+ 1. Think step-by-step about what information you need
72
+ 2. Use tools autonomously to gather information
73
+ 3. Save important data to the store for persistence
74
+ 4. Be thorough in research before making recommendations
75
+ 5. Always check suppression list before suggesting email sends
76
+
77
+ You should:
78
+ - Search for company information when needed
79
+ - Save prospects and companies to the database
80
+ - Find and save contacts
81
+ - Generate personalized outreach based on research
82
+ - Track your progress and findings
83
+
84
+ Work autonomously - decide which tools to use and when!"""
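+
+ # Note: rule 5 above corresponds to the check_suppression tool routed in
+ # _execute_mcp_tool() below; the model is expected to call it with, e.g.,
+ # {"suppression_type": "email", "value": "<recipient>"} before any send_email call.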
85
+
86
+ logger.info(f"Autonomous MCP Agent initialized with model: {self.model}")
87
+
88
+ async def run(
89
+ self,
90
+ task: str,
91
+ max_iterations: int = 15
92
+ ) -> AsyncGenerator[Dict[str, Any], None]:
93
+ """
94
+ Run the agent autonomously on a task.
95
+
96
+ The agent will:
97
+ 1. Understand the task
98
+ 2. Decide which MCP tools to call
99
+ 3. Execute tools autonomously
100
+ 4. Continue until task is complete or max iterations reached
101
+
102
+ Args:
103
+ task: The task to complete (e.g., "Research and create outreach for Shopify")
104
+ max_iterations: Maximum tool calls to prevent infinite loops
105
+
106
+ Yields:
107
+ Events showing agent's progress and tool calls
108
+ """
109
+
110
+ yield {
111
+ "type": "agent_start",
112
+ "message": f"🤖 Autonomous AI Agent starting task: {task}",
113
+ "model": self.model
114
+ }
115
+
116
+ # Initialize conversation
117
+ messages = [
118
+ {
119
+ "role": "user",
120
+ "content": task
121
+ }
122
+ ]
123
+
124
+ iteration = 0
125
+
126
+ while iteration < max_iterations:
127
+ iteration += 1
128
+
129
+ yield {
130
+ "type": "iteration_start",
131
+ "iteration": iteration,
132
+ "message": f"🔄 Iteration {iteration}: AI deciding next action..."
133
+ }
134
+
135
+ try:
136
+ # Call Claude with tools
137
+ response = await self.client.messages.create(
138
+ model=self.model,
139
+ max_tokens=4096,
140
+ system=self.system_prompt,
141
+ messages=messages,
142
+ tools=MCP_TOOLS
143
+ )
144
+
145
+ # Add assistant response to conversation
146
+ messages.append({
147
+ "role": "assistant",
148
+ "content": response.content
149
+ })
150
+
151
+ # Check if AI wants to use tools
152
+ tool_calls = [block for block in response.content if block.type == "tool_use"]
153
+
154
+ if not tool_calls:
155
+ # AI is done - no more tools to call
156
+ final_text = next(
157
+ (block.text for block in response.content if hasattr(block, "text")),
158
+ "Task completed!"
159
+ )
160
+
161
+ yield {
162
+ "type": "agent_complete",
163
+ "message": "✅ Task complete!",
164
+ "final_response": final_text,
165
+ "iterations": iteration
166
+ }
167
+ break
168
+
169
+ # Execute tool calls
170
+ tool_results = []
171
+
172
+ for tool_call in tool_calls:
173
+ tool_name = tool_call.name
174
+ tool_input = tool_call.input
175
+
176
+ yield {
177
+ "type": "tool_call",
178
+ "tool": tool_name,
179
+ "input": tool_input,
180
+ "message": f"🔧 AI calling tool: {tool_name}"
181
+ }
182
+
183
+ # Execute the MCP tool
184
+ try:
185
+ result = await self._execute_mcp_tool(tool_name, tool_input)
186
+
187
+ yield {
188
+ "type": "tool_result",
189
+ "tool": tool_name,
190
+ "result": result,
191
+ "message": f"✓ Tool {tool_name} completed"
192
+ }
193
+
194
+ # Add tool result to conversation
195
+ tool_results.append({
196
+ "type": "tool_result",
197
+ "tool_use_id": tool_call.id,
198
+ "content": json.dumps(result, default=str)
199
+ })
200
+
201
+ except Exception as e:
202
+ error_msg = str(e)
203
+ logger.error(f"Tool execution failed: {tool_name} - {error_msg}")
204
+
205
+ yield {
206
+ "type": "tool_error",
207
+ "tool": tool_name,
208
+ "error": error_msg,
209
+ "message": f"❌ Tool {tool_name} failed: {error_msg}"
210
+ }
211
+
212
+ tool_results.append({
213
+ "type": "tool_result",
214
+ "tool_use_id": tool_call.id,
215
+ "content": json.dumps({"error": error_msg}),
216
+ "is_error": True
217
+ })
218
+
219
+ # Add tool results to conversation
220
+ messages.append({
221
+ "role": "user",
222
+ "content": tool_results
223
+ })
224
+
225
+ except Exception as e:
226
+ logger.error(f"Agent iteration failed: {e}")
227
+ yield {
228
+ "type": "agent_error",
229
+ "error": str(e),
230
+ "message": f"❌ Agent error: {str(e)}"
231
+ }
232
+ break
233
+
234
+ if iteration >= max_iterations:
235
+ yield {
236
+ "type": "agent_max_iterations",
237
+ "message": f"⚠️ Reached maximum iterations ({max_iterations})",
238
+ "iterations": iteration
239
+ }
240
+
241
+ async def _execute_mcp_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
242
+ """
243
+ Execute an MCP tool by routing to the appropriate MCP server.
244
+
245
+ This is where we actually call the MCP servers!
246
+ """
247
+
248
+ # ============ SEARCH MCP SERVER ============
249
+ if tool_name == "search_web":
250
+ query = tool_input["query"]
251
+ max_results = tool_input.get("max_results", 5)
252
+
253
+ results = await self.mcp_registry.search.query(query, max_results=max_results)
254
+ return {
255
+ "results": results,
256
+ "count": len(results)
257
+ }
258
+
259
+ elif tool_name == "search_news":
260
+ query = tool_input["query"]
261
+ max_results = tool_input.get("max_results", 5)
262
+
263
+ results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
264
+ return {
265
+ "results": results,
266
+ "count": len(results)
267
+ }
268
+
269
+ # ============ STORE MCP SERVER ============
270
+ elif tool_name == "save_prospect":
271
+ prospect_data = {
272
+ "id": tool_input.get("prospect_id", str(uuid.uuid4())),
273
+ "company": {
274
+ "id": tool_input.get("company_id"),
275
+ "name": tool_input.get("company_name"),
276
+ "domain": tool_input.get("company_domain")
277
+ },
278
+ "fit_score": tool_input.get("fit_score", 0),
279
+ "status": tool_input.get("status", "new"),
280
+ "metadata": tool_input.get("metadata", {})
281
+ }
282
+
283
+ result = await self.mcp_registry.store.save_prospect(prospect_data)
284
+ return {"status": result, "prospect_id": prospect_data["id"]}
285
+
286
+ elif tool_name == "get_prospect":
287
+ prospect_id = tool_input["prospect_id"]
288
+ prospect = await self.mcp_registry.store.get_prospect(prospect_id)
289
+ return prospect or {"error": "Prospect not found"}
290
+
291
+ elif tool_name == "list_prospects":
292
+ prospects = await self.mcp_registry.store.list_prospects()
293
+ status_filter = tool_input.get("status")
294
+
295
+ if status_filter:
296
+ prospects = [p for p in prospects if p.get("status") == status_filter]
297
+
298
+ return {
299
+ "prospects": prospects,
300
+ "count": len(prospects)
301
+ }
302
+
303
+ elif tool_name == "save_company":
304
+ company_data = {
305
+ "id": tool_input.get("company_id", str(uuid.uuid4())),
306
+ "name": tool_input["name"],
307
+ "domain": tool_input["domain"],
308
+ "industry": tool_input.get("industry"),
309
+ "description": tool_input.get("description"),
310
+ "employee_count": tool_input.get("employee_count")
311
+ }
312
+
313
+ result = await self.mcp_registry.store.save_company(company_data)
314
+ return {"status": result, "company_id": company_data["id"]}
315
+
316
+ elif tool_name == "get_company":
317
+ company_id = tool_input["company_id"]
318
+ company = await self.mcp_registry.store.get_company(company_id)
319
+ return company or {"error": "Company not found"}
320
+
321
+ elif tool_name == "save_fact":
322
+ fact_data = {
323
+ "id": tool_input.get("fact_id", str(uuid.uuid4())),
324
+ "company_id": tool_input["company_id"],
325
+ "fact_type": tool_input["fact_type"],
326
+ "content": tool_input["content"],
327
+ "source_url": tool_input.get("source_url"),
328
+ "confidence_score": tool_input.get("confidence_score", 0.8)
329
+ }
330
+
331
+ result = await self.mcp_registry.store.save_fact(fact_data)
332
+ return {"status": result, "fact_id": fact_data["id"]}
333
+
334
+ elif tool_name == "save_contact":
335
+ contact_data = {
336
+ "id": tool_input.get("contact_id", str(uuid.uuid4())),
337
+ "company_id": tool_input["company_id"],
338
+ "email": tool_input["email"],
339
+ "first_name": tool_input.get("first_name"),
340
+ "last_name": tool_input.get("last_name"),
341
+ "title": tool_input.get("title"),
342
+ "seniority": tool_input.get("seniority")
343
+ }
344
+
345
+ result = await self.mcp_registry.store.save_contact(contact_data)
346
+ return {"status": result, "contact_id": contact_data["id"]}
347
+
348
+ elif tool_name == "list_contacts_by_domain":
349
+ domain = tool_input["domain"]
350
+ contacts = await self.mcp_registry.store.list_contacts_by_domain(domain)
351
+ return {
352
+ "contacts": contacts,
353
+ "count": len(contacts)
354
+ }
355
+
356
+ elif tool_name == "check_suppression":
357
+ supp_type = tool_input["suppression_type"]
358
+ value = tool_input["value"]
359
+
360
+ is_suppressed = await self.mcp_registry.store.check_suppression(supp_type, value)
361
+ return {
362
+ "suppressed": is_suppressed,
363
+ "value": value,
364
+ "type": supp_type
365
+ }
366
+
367
+ # ============ EMAIL MCP SERVER ============
368
+ elif tool_name == "send_email":
369
+ to = tool_input["to"]
370
+ subject = tool_input["subject"]
371
+ body = tool_input["body"]
372
+ prospect_id = tool_input["prospect_id"]
373
+
374
+ thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
375
+ return {
376
+ "status": "sent",
377
+ "thread_id": thread_id,
378
+ "to": to
379
+ }
380
+
381
+ elif tool_name == "get_email_thread":
382
+ prospect_id = tool_input["prospect_id"]
383
+ thread = await self.mcp_registry.email.get_thread(prospect_id)
384
+ return thread or {"error": "No email thread found"}
385
+
386
+ # ============ CALENDAR MCP SERVER ============
387
+ elif tool_name == "suggest_meeting_slots":
388
+ num_slots = tool_input.get("num_slots", 3)
389
+ slots = await self.mcp_registry.calendar.suggest_slots()
390
+ return {
391
+ "slots": slots[:num_slots],
392
+ "count": len(slots[:num_slots])
393
+ }
394
+
395
+ elif tool_name == "generate_calendar_invite":
396
+ start_time = tool_input["start_time"]
397
+ end_time = tool_input["end_time"]
398
+ title = tool_input["title"]
399
+
400
+ slot = {
401
+ "start_iso": start_time,
402
+ "end_iso": end_time,
403
+ "title": title
404
+ }
405
+
406
+ ics = await self.mcp_registry.calendar.generate_ics(slot)
407
+ return {
408
+ "ics_content": ics,
409
+ "meeting": slot
410
+ }
411
+
412
+ else:
413
+ raise ValueError(f"Unknown MCP tool: {tool_name}")
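
Since run() is an async generator, callers drive the agent with async for; a minimal sketch, assuming MCPRegistry can be constructed with its defaults and that ANTHROPIC_API_KEY is set in the environment:

import asyncio

from mcp.registry import MCPRegistry
from mcp.agents.autonomous_agent import AutonomousMCPAgent

async def main():
    registry = MCPRegistry()              # assumed default construction
    agent = AutonomousMCPAgent(registry)  # api_key falls back to ANTHROPIC_API_KEY
    async for event in agent.run("Research and create outreach for Shopify"):
        print(event["type"], "-", event.get("message", ""))

asyncio.run(main())
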
mcp/agents/autonomous_agent_granite.py ADDED
@@ -0,0 +1,686 @@
1
+ """
2
+ Autonomous AI Agent with MCP Tool Calling using Granite 4.0 H-1B (Open Source)
3
+
4
+ This agent uses IBM Granite 4.0 H-1B (1.5B params) loaded locally via transformers
5
+ to autonomously decide which MCP tools to call.
6
+
7
+ Granite 4.0 H-1B is optimized for tool calling and function calling tasks.
8
+ Uses ReAct (Reasoning + Acting) prompting pattern for reliable tool calling.
9
+ """
10
+
11
+ import os
12
+ import re
13
+ import json
14
+ import uuid
15
+ import logging
16
+ import asyncio
17
+ from typing import List, Dict, Any, AsyncGenerator
18
+ from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
19
+ import torch
20
+
21
+ from mcp.tools.definitions import MCP_TOOLS, list_all_tools
22
+ from mcp.registry import MCPRegistry
23
+
24
+ logger = logging.getLogger(__name__)
25
+
26
+
27
+ class AutonomousMCPAgentGranite:
28
+ """
29
+ AI Agent that autonomously uses MCP servers as tools using Granite 4.
30
+
31
+ Uses ReAct (Reasoning + Acting) pattern:
32
+ 1. Thought: AI reasons about what to do next
33
+ 2. Action: AI decides which tool to call
34
+ 3. Observation: AI sees the tool result
35
+ 4. Repeat until task complete
36
+ """
37
+
38
+ def __init__(self, mcp_registry: MCPRegistry, hf_token: str = None):
39
+ """
40
+ Initialize the autonomous agent with Granite 4.0 H-1B
41
+
42
+ Args:
43
+ mcp_registry: MCP registry with all servers
44
+ hf_token: HuggingFace token (optional, for accessing private models)
45
+ """
46
+ self.mcp_registry = mcp_registry
47
+ self.hf_token = hf_token or os.getenv("HF_API_TOKEN") or os.getenv("HF_TOKEN")
48
+
49
+ # Use Granite 4.0 H-1B (1.5B params, optimized for tool calling)
50
+ self.model_name = "ibm-granite/granite-4.0-h-1b"
51
+
52
+ logger.info(f"Loading Granite 4.0 H-1B model locally...")
53
+
54
+ # Load model with optimizations for CPU/limited memory
55
+ try:
56
+ logger.info(f"📥 Downloading tokenizer from {self.model_name}...")
57
+ # Use bfloat16 for better efficiency, float32 fallback for CPU
58
+ self.tokenizer = AutoTokenizer.from_pretrained(
59
+ self.model_name,
60
+ token=self.hf_token,
61
+ trust_remote_code=True
62
+ )
63
+ logger.info(f"✓ Tokenizer loaded successfully")
64
+
65
+ # Check device availability
66
+ device = "cuda" if torch.cuda.is_available() else "cpu"
67
+ dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
68
+ logger.info(f"💻 Device: {device}, dtype: {dtype}")
69
+
70
+ logger.info(f"📥 Downloading model weights (~1.5GB)...")
71
+
72
+ # For hybrid models like Granite H-1B, we need explicit device placement
73
+ if torch.cuda.is_available():
74
+ # GPU available - use device_map
75
+ self.model = AutoModelForCausalLM.from_pretrained(
76
+ self.model_name,
77
+ token=self.hf_token,
78
+ torch_dtype=dtype,
79
+ device_map="auto",
80
+ low_cpu_mem_usage=True,
81
+ trust_remote_code=True
82
+ )
83
+ else:
84
+ # CPU only - load with 8-bit quantization to reduce memory
85
+ logger.info(f"⚠️ Loading on CPU (no GPU available)")
86
+ logger.info(f"💾 Using 8-bit quantization to reduce memory usage")
87
+
88
+ try:
89
+ # Try loading with 8-bit quantization (requires bitsandbytes)
90
+ from transformers import BitsAndBytesConfig
91
+
92
+ quantization_config = BitsAndBytesConfig(
93
+ load_in_8bit=True,
94
+ llm_int8_threshold=6.0
95
+ )
96
+
97
+ self.model = AutoModelForCausalLM.from_pretrained(
98
+ self.model_name,
99
+ token=self.hf_token,
100
+ quantization_config=quantization_config,
101
+ low_cpu_mem_usage=False,
102
+ trust_remote_code=True
103
+ )
104
+ logger.info(f"✓ Loaded with 8-bit quantization (~50% memory reduction)")
105
+ except Exception as e:  # Exception already covers ImportError (e.g. missing bitsandbytes)
106
+ # Fallback to float32 if 8-bit fails
107
+ logger.warning(f"⚠️ 8-bit quantization failed: {e}")
108
+ logger.info(f"⚠️ Falling back to float32 (may use ~4-6GB RAM)")
109
+
110
+ self.model = AutoModelForCausalLM.from_pretrained(
111
+ self.model_name,
112
+ token=self.hf_token,
113
+ torch_dtype=torch.float32, # Use float32 for CPU
114
+ low_cpu_mem_usage=False, # Disable to avoid meta device
115
+ trust_remote_code=True
116
+ )
117
+
118
+ # Verify all parameters are on CPU, not meta
119
+ logger.info(f"🔍 Verifying model is materialized on CPU...")
120
+ param_devices = set()
121
+ for param in self.model.parameters():
122
+ param_devices.add(str(param.device))
123
+
124
+ if 'meta' in param_devices:
125
+ logger.error(f"❌ Model still has parameters on meta device!")
126
+ raise RuntimeError("Model not properly materialized. Try upgrading transformers: pip install --upgrade transformers")
127
+
128
+ logger.info(f"✓ All parameters on: {param_devices}")
129
+
130
+ logger.info(f"✓ Model weights loaded")
131
+
132
+ # Set model to eval mode
133
+ self.model.eval()
134
+ logger.info(f"✓ Model set to evaluation mode")
135
+
136
+ # Get model device and memory info
137
+ try:
138
+ model_device = next(self.model.parameters()).device
139
+ logger.info(f"✓ Model loaded successfully on device: {model_device}")
140
+ except StopIteration:
141
+ logger.warning(f"⚠️ Could not determine model device (no parameters)")
142
+
143
+ # Memory info if available
144
+ if torch.cuda.is_available():
145
+ memory_allocated = torch.cuda.memory_allocated() / 1024**3
146
+ logger.info(f"📊 GPU Memory allocated: {memory_allocated:.2f} GB")
147
+
148
+ except Exception as e:
149
+ logger.error(f"❌ Failed to load model: {e}", exc_info=True)
150
+ raise
151
+
152
+ # Create tool descriptions for the AI
153
+ self.tools_description = self._create_tools_description()
154
+
155
+ logger.info(f"Autonomous MCP Agent initialized with model: {self.model_name}")
156
+
157
+ def _generate_text(self, prompt: str) -> str:
158
+ """
159
+ Generate text using the local Granite model (synchronous, for use in executor)
160
+
161
+ Args:
162
+ prompt: The input prompt
163
+
164
+ Returns:
165
+ Generated text
166
+ """
167
+ import time
168
+ import gc
169
+ start_time = time.time()
170
+
171
+ # Force garbage collection before inference to free memory
172
+ gc.collect()
173
+ if torch.cuda.is_available():
174
+ torch.cuda.empty_cache()
175
+
176
+ # Tokenize input with aggressive truncation to save memory
177
+ logger.info(f"🔤 Tokenizing input (length: {len(prompt)} chars)...")
178
+ inputs = self.tokenizer(
179
+ prompt,
180
+ return_tensors="pt",
181
+ truncation=True,
182
+ max_length=2048 # Reduced from 4096 to save memory
183
+ )
184
+ num_input_tokens = inputs["input_ids"].shape[-1]
185
+ logger.info(f"✓ Tokenized to {num_input_tokens} tokens")
186
+
187
+ # Get target device - handle models split across devices
188
+ try:
189
+ target_device = next(self.model.parameters()).device
190
+ except StopIteration:
191
+ # Fallback if no parameters found
192
+ target_device = torch.device('cpu')
193
+
194
+ logger.info(f"📍 Moving inputs to device: {target_device}")
195
+
196
+ # Move to same device as model
197
+ inputs = {k: v.to(target_device) for k, v in inputs.items()}
198
+
199
+ # Generate with memory-efficient settings
200
+ logger.info(f"🤖 Generating response (max 400 tokens, temp=0.1)...")
201
+ with torch.no_grad():
202
+ outputs = self.model.generate(
203
+ **inputs,
204
+ max_new_tokens=400, # Reduced from 800 to save memory
205
+ temperature=0.1, # Low temperature for deterministic reasoning
206
+ top_p=0.9,
207
+ do_sample=True,
208
+ pad_token_id=self.tokenizer.eos_token_id,
209
+ eos_token_id=self.tokenizer.eos_token_id,
210
+ use_cache=True, # Use KV cache for efficiency
211
+ num_beams=1, # Greedy decoding to save memory
212
+ )
213
+
214
+ # Decode only the new tokens
215
+ response = self.tokenizer.decode(
216
+ outputs[0][inputs["input_ids"].shape[-1]:],
217
+ skip_special_tokens=True
218
+ )
219
+
220
+ elapsed = time.time() - start_time
221
+ num_output_tokens = outputs.shape[-1] - num_input_tokens
222
+ tokens_per_sec = num_output_tokens / elapsed if elapsed > 0 else 0
223
+
224
+ logger.info(f"✓ Generated {num_output_tokens} tokens in {elapsed:.1f}s ({tokens_per_sec:.1f} tokens/sec)")
225
+ logger.info(f"📝 Response preview: {response[:100]}...")
226
+
227
+ # Clean up to free memory
228
+ del inputs, outputs
229
+ gc.collect()
230
+ if torch.cuda.is_available():
231
+ torch.cuda.empty_cache()
232
+
233
+ return response
234
+
235
+ def _create_tools_description(self) -> str:
236
+ """Create a formatted description of all available tools for the AI"""
237
+ tools_text = "## Available MCP Tools:\n\n"
238
+
239
+ for tool in MCP_TOOLS:
240
+ tools_text += f"**{tool['name']}**\n"
241
+ tools_text += f" Description: {tool['description']}\n"
242
+ tools_text += f" Parameters:\n"
243
+
244
+ for prop_name, prop_data in tool['input_schema']['properties'].items():
245
+ required = prop_name in tool['input_schema'].get('required', [])
246
+ tools_text += f" - {prop_name} ({prop_data['type']}){'*' if required else ''}: {prop_data.get('description', '')}\n"
247
+
248
+ tools_text += "\n"
249
+
250
+ return tools_text
251
+
252
+ def _create_system_prompt(self) -> str:
253
+ """Create the system prompt for ReAct pattern"""
254
+ return f"""You are an autonomous AI agent for B2B sales automation using the ReAct (Reasoning + Acting) framework.
255
+
256
+ You have access to MCP (Model Context Protocol) tools that let you:
257
+ - Search the web for company information and news
258
+ - Save prospects, companies, contacts, and facts to a database
259
+ - Send emails and manage email threads
260
+ - Schedule meetings and generate calendar invites
261
+
262
+ {self.tools_description}
263
+
264
+ ## ReAct Format:
265
+
266
+ You must respond using this EXACT format:
267
+
268
+ Thought: [Your reasoning about what to do next]
269
+ Action: [tool_name]
270
+ Action Input: {{"param1": "value1", "param2": "value2"}}
271
+
272
+ After you see the Observation, you can continue with more Thought/Action/Observation cycles.
273
+
274
+ When you've completed the task, respond with:
275
+ Thought: [Your final reasoning]
276
+ Final Answer: [Your complete response to the user]
277
+
278
+ ## Important Rules:
279
+ 1. Always use "Thought:" to reason before acting
280
+ 2. Always use "Action:" followed by exact tool name
281
+ 3. Always use "Action Input:" with valid JSON
282
+ 4. Use tools multiple times if needed
283
+ 5. Save important data to the database
284
+ 6. When done, give a "Final Answer:"
285
+
286
+ ## Example:
287
+
288
+ Thought: I need to research Shopify first
289
+ Action: search_web
290
+ Action Input: {{"query": "Shopify company information"}}
291
+
292
+ [You'll see Observation with results]
293
+
294
+ Thought: Now I should save the company data
295
+ Action: save_company
296
+ Action Input: {{"company_id": "shopify", "name": "Shopify", "domain": "shopify.com"}}
297
+
298
+ [Continue until task complete...]
299
+
300
+ Thought: I've gathered all the information and saved it
301
+ Final Answer: I've successfully researched Shopify and created a prospect profile with company information and recent facts.
302
+
303
+ Now complete your assigned task!"""
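+
+ # Note: a well-formed model turn under this prompt looks like
+ #   Thought: I need company info
+ #   Action: search_web
+ #   Action Input: {"query": "Shopify company information"}
+ # run() below recovers these parts with regexes and appends the tool result
+ # as an "Observation:" line before prompting the next turn.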
304
+
305
+ async def run(
306
+ self,
307
+ task: str,
308
+ max_iterations: int = 15
309
+ ) -> AsyncGenerator[Dict[str, Any], None]:
310
+ """
311
+ Run the agent autonomously on a task using ReAct pattern.
312
+
313
+ Args:
314
+ task: The task to complete
315
+ max_iterations: Maximum tool calls to prevent infinite loops
316
+
317
+ Yields:
318
+ Events showing agent's progress and tool calls
319
+ """
320
+
321
+ yield {
322
+ "type": "agent_start",
323
+ "message": "🤖 Autonomous AI Agent (Granite 4) starting task",
324
+ "task": task,
325
+ "model": self.model_name
326
+ }
327
+
328
+ # Initialize conversation with system prompt and task
329
+ conversation_history = f"""{self._create_system_prompt()}
330
+
331
+ ## Task:
332
+ {task}
333
+
334
+ Begin!
335
+
336
+ """
337
+
338
+ iteration = 0
339
+
340
+ while iteration < max_iterations:
341
+ iteration += 1
342
+
343
+ yield {
344
+ "type": "iteration_start",
345
+ "iteration": iteration,
346
+ "message": f"🔄 Iteration {iteration}: AI reasoning..."
347
+ }
348
+
349
+ try:
350
+ # Get AI response using ReAct pattern
351
+ response_text = ""
352
+
353
+ try:
354
+ # Generate using local model
355
+ # Run in executor to avoid blocking the event loop
356
+ response_text = await asyncio.get_event_loop().run_in_executor(
357
+ None,
358
+ self._generate_text,
359
+ conversation_history
360
+ )
361
+
362
+ except Exception as gen_error:
363
+ logger.error(f"Text generation failed: {gen_error}", exc_info=True)
364
+ yield {
365
+ "type": "agent_error",
366
+ "error": str(gen_error),
367
+ "message": f"❌ Model error: {str(gen_error)}"
368
+ }
369
+ break
370
+
371
+ # Check if we got a response
372
+ if not response_text or not response_text.strip():
373
+ logger.warning("Empty response from model")
374
+ yield {
375
+ "type": "parse_error",
376
+ "message": "⚠️ Model returned empty response. Retrying...",
377
+ "response": ""
378
+ }
379
+ continue
380
+
381
+ # Log the raw response for debugging
382
+ logger.info(f"Model response (iteration {iteration}): {response_text[:200]}...")
383
+
384
+ # Parse the response for Thought, Action, Action Input
385
+ thought_match = re.search(r'Thought:\s*(.+?)(?=\n(?:Action:|Final Answer:)|$)', response_text, re.DOTALL)
386
+ action_match = re.search(r'Action:\s*(\w+)', response_text)
387
+ action_input_match = re.search(r'Action Input:\s*(\{.+?\})', response_text, re.DOTALL)
388
+ final_answer_match = re.search(r'Final Answer:\s*(.+?)$', response_text, re.DOTALL)
389
+
390
+ # Extract thought
391
+ if thought_match:
392
+ thought = thought_match.group(1).strip()
393
+ yield {
394
+ "type": "thought",
395
+ "thought": thought,
396
+ "message": f"💭 Thought: {thought}"
397
+ }
398
+
399
+ # Check if AI wants to finish
400
+ if final_answer_match:
401
+ final_answer = final_answer_match.group(1).strip()
402
+
403
+ yield {
404
+ "type": "agent_complete",
405
+ "message": "✅ Task complete!",
406
+ "final_answer": final_answer,
407
+ "iterations": iteration
408
+ }
409
+ break
410
+
411
+ # Execute action if present
412
+ if action_match and action_input_match:
413
+ tool_name = action_match.group(1).strip()
414
+ action_input_str = action_input_match.group(1).strip()
415
+
416
+ # Parse action input JSON
417
+ try:
418
+ tool_input = json.loads(action_input_str)
419
+ except json.JSONDecodeError as e:
420
+ error_msg = f"Invalid JSON in Action Input: {e}"
421
+ logger.error(error_msg)
422
+
423
+ # Give feedback to AI
424
+ conversation_history += response_text
425
+ conversation_history += f"\nObservation: Error - {error_msg}. Please provide valid JSON.\n\n"
426
+ continue
427
+
428
+ yield {
429
+ "type": "tool_call",
430
+ "tool": tool_name,
431
+ "input": tool_input,
432
+ "message": f"🔧 Action: {tool_name}"
433
+ }
434
+
435
+ # Execute the MCP tool
436
+ try:
437
+ result = await self._execute_mcp_tool(tool_name, tool_input)
438
+
439
+ yield {
440
+ "type": "tool_result",
441
+ "tool": tool_name,
442
+ "result": result,
443
+ "message": f"✓ Tool {tool_name} completed"
444
+ }
445
+
446
+ # Add to conversation history
447
+ conversation_history += response_text
448
+ conversation_history += f"\nObservation: {json.dumps(result, default=str)}\n\n"
449
+
450
+ except Exception as e:
451
+ error_msg = str(e)
452
+ logger.error(f"Tool execution failed: {tool_name} - {error_msg}")
453
+
454
+ yield {
455
+ "type": "tool_error",
456
+ "tool": tool_name,
457
+ "error": error_msg,
458
+ "message": f"❌ Tool {tool_name} failed: {error_msg}"
459
+ }
460
+
461
+ # Give error feedback to AI
462
+ conversation_history += response_text
463
+ conversation_history += f"\nObservation: Error - {error_msg}\n\n"
464
+
465
+ else:
466
+ # No action found - AI might be confused
467
+ yield {
468
+ "type": "parse_error",
469
+ "message": "⚠️ Could not parse Action from AI response",
470
+ "response": response_text
471
+ }
472
+
473
+ # Give feedback to AI
474
+ conversation_history += response_text
475
+ conversation_history += "\nObservation: Please follow the format: 'Action: tool_name' and 'Action Input: {...}'\n\n"
476
+
477
+ except (RuntimeError, StopIteration, StopAsyncIteration) as stop_err:
478
+ # Handle StopIteration errors that get wrapped in RuntimeError
479
+ error_msg = str(stop_err)
480
+ logger.error(f"Stop iteration in agent loop: {error_msg}", exc_info=True)
481
+
482
+ if "StopIteration" in error_msg or "StopAsyncIteration" in error_msg:
483
+ yield {
484
+ "type": "agent_error",
485
+ "error": "Model inference error - possibly model not available or API issue",
486
+ "message": f"❌ Model inference failed. Please check:\n"
487
+ f" 1. HF_API_TOKEN is valid\n"
488
+ f" 2. Model '{self.model}' is accessible\n"
489
+ f" 3. HuggingFace Inference API is operational"
490
+ }
491
+ else:
492
+ yield {
493
+ "type": "agent_error",
494
+ "error": error_msg,
495
+ "message": f"❌ Agent error: {error_msg}"
496
+ }
497
+ break
498
+ except Exception as e:
499
+ logger.error(f"Agent iteration failed: {e}", exc_info=True)
500
+ yield {
501
+ "type": "agent_error",
502
+ "error": str(e),
503
+ "message": f"❌ Agent error: {str(e)}"
504
+ }
505
+ break
506
+
507
+ if iteration >= max_iterations:
508
+ yield {
509
+ "type": "agent_max_iterations",
510
+ "message": f"⚠️ Reached maximum iterations ({max_iterations})",
511
+ "iterations": iteration
512
+ }
513
+
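Because `run()` is an async generator, callers drive it with `async for` and react to the event dicts it yields. A minimal consumption sketch; `agent` is assumed to be an already-constructed instance of this class, wired to an MCP registry:

```python
import asyncio

async def main():
    # `agent` is assumed to be an instance of the agent class above.
    async for event in agent.run("Research Shopify and save a prospect"):
        print(event["type"], "-", event.get("message", ""))
        if event["type"] in ("agent_complete", "agent_error"):
            break

asyncio.run(main())
```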
+     async def _execute_mcp_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
+         """
+         Execute an MCP tool by routing to the appropriate MCP server.
+ 
+         This is where we actually call the MCP servers!
+         """
+ 
+         # ============ SEARCH MCP SERVER ============
+         if tool_name == "search_web":
+             query = tool_input["query"]
+             max_results = tool_input.get("max_results", 5)
+ 
+             results = await self.mcp_registry.search.query(query, max_results=max_results)
+             return {
+                 "results": results[:max_results],
+                 "count": len(results[:max_results])
+             }
+ 
+         elif tool_name == "search_news":
+             query = tool_input["query"]
+             max_results = tool_input.get("max_results", 5)
+ 
+             results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
+             return {
+                 "results": results[:max_results],
+                 "count": len(results[:max_results])
+             }
+ 
+         # ============ STORE MCP SERVER ============
+         elif tool_name == "save_prospect":
+             prospect_data = {
+                 "id": tool_input.get("prospect_id", str(uuid.uuid4())),
+                 "company": {
+                     "id": tool_input.get("company_id"),
+                     "name": tool_input.get("company_name"),
+                     "domain": tool_input.get("company_domain")
+                 },
+                 "fit_score": tool_input.get("fit_score", 0),
+                 "status": tool_input.get("status", "new"),
+                 "metadata": tool_input.get("metadata", {})
+             }
+ 
+             result = await self.mcp_registry.store.save_prospect(prospect_data)
+             return {"status": result, "prospect_id": prospect_data["id"]}
+ 
+         elif tool_name == "get_prospect":
+             prospect_id = tool_input["prospect_id"]
+             prospect = await self.mcp_registry.store.get_prospect(prospect_id)
+             return prospect or {"error": "Prospect not found"}
+ 
+         elif tool_name == "list_prospects":
+             prospects = await self.mcp_registry.store.list_prospects()
+             status_filter = tool_input.get("status")
+ 
+             if status_filter:
+                 prospects = [p for p in prospects if p.get("status") == status_filter]
+ 
+             return {
+                 "prospects": prospects,
+                 "count": len(prospects)
+             }
+ 
+         elif tool_name == "save_company":
+             company_data = {
+                 "id": tool_input.get("company_id", str(uuid.uuid4())),
+                 "name": tool_input["name"],
+                 "domain": tool_input["domain"],
+                 "industry": tool_input.get("industry"),
+                 "description": tool_input.get("description"),
+                 "employee_count": tool_input.get("employee_count")
+             }
+ 
+             result = await self.mcp_registry.store.save_company(company_data)
+             return {"status": result, "company_id": company_data["id"]}
+ 
+         elif tool_name == "get_company":
+             company_id = tool_input["company_id"]
+             company = await self.mcp_registry.store.get_company(company_id)
+             return company or {"error": "Company not found"}
+ 
+         elif tool_name == "save_fact":
+             fact_data = {
+                 "id": tool_input.get("fact_id", str(uuid.uuid4())),
+                 "company_id": tool_input["company_id"],
+                 "fact_type": tool_input["fact_type"],
+                 "content": tool_input["content"],
+                 "source_url": tool_input.get("source_url"),
+                 "confidence_score": tool_input.get("confidence_score", 0.8)
+             }
+ 
+             result = await self.mcp_registry.store.save_fact(fact_data)
+             return {"status": result, "fact_id": fact_data["id"]}
+ 
+         elif tool_name == "save_contact":
+             contact_data = {
+                 "id": tool_input.get("contact_id", str(uuid.uuid4())),
+                 "company_id": tool_input["company_id"],
+                 "email": tool_input["email"],
+                 "first_name": tool_input.get("first_name"),
+                 "last_name": tool_input.get("last_name"),
+                 "title": tool_input.get("title"),
+                 "seniority": tool_input.get("seniority")
+             }
+ 
+             result = await self.mcp_registry.store.save_contact(contact_data)
+             return {"status": result, "contact_id": contact_data["id"]}
+ 
+         elif tool_name == "list_contacts_by_domain":
+             domain = tool_input["domain"]
+             contacts = await self.mcp_registry.store.list_contacts_by_domain(domain)
+             return {
+                 "contacts": contacts,
+                 "count": len(contacts)
+             }
+ 
+         elif tool_name == "check_suppression":
+             supp_type = tool_input["suppression_type"]
+             value = tool_input["value"]
+ 
+             is_suppressed = await self.mcp_registry.store.check_suppression(supp_type, value)
+             return {
+                 "suppressed": is_suppressed,
+                 "value": value,
+                 "type": supp_type
+             }
+ 
+         # ============ EMAIL MCP SERVER ============
+         elif tool_name == "send_email":
+             to = tool_input["to"]
+             subject = tool_input["subject"]
+             body = tool_input["body"]
+             prospect_id = tool_input["prospect_id"]
+ 
+             thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
+             return {
+                 "status": "sent",
+                 "thread_id": thread_id,
+                 "to": to
+             }
+ 
+         elif tool_name == "get_email_thread":
+             prospect_id = tool_input["prospect_id"]
+             thread = await self.mcp_registry.email.get_thread(prospect_id)
+             return thread or {"error": "No email thread found"}
+ 
+         # ============ CALENDAR MCP SERVER ============
+         elif tool_name == "suggest_meeting_slots":
+             num_slots = tool_input.get("num_slots", 3)
+             slots = await self.mcp_registry.calendar.suggest_slots()
+             return {
+                 "slots": slots[:num_slots],
+                 "count": len(slots[:num_slots])
+             }
+ 
+         elif tool_name == "generate_calendar_invite":
+             start_time = tool_input["start_time"]
+             end_time = tool_input["end_time"]
+             title = tool_input["title"]
+ 
+             slot = {
+                 "start_iso": start_time,
+                 "end_iso": end_time,
+                 "title": title
+             }
+ 
+             ics = await self.mcp_registry.calendar.generate_ics(slot)
+             return {
+                 "ics_content": ics,
+                 "meeting": slot
+             }
+ 
+         else:
+             raise ValueError(f"Unknown MCP tool: {tool_name}")
mcp/agents/autonomous_agent_groq.py ADDED
@@ -0,0 +1,334 @@
+ """
+ Autonomous AI Agent with MCP Tool Calling using the Groq API
+ 
+ Groq offers FREE API access with fast inference on Llama and Mixtral models.
+ No payment is required - you just need a free API key from console.groq.com
+ """
+ 
+ import os
+ import json
+ import uuid
+ import logging
+ import asyncio
+ from typing import List, Dict, Any, AsyncGenerator, Optional
+ 
+ from mcp.tools.definitions import MCP_TOOLS
+ from mcp.registry import MCPRegistry
+ 
+ logger = logging.getLogger(__name__)
+ 
+ # Groq FREE models
+ GROQ_MODELS = [
+     "llama-3.1-70b-versatile",  # Best quality, free
+     "llama-3.1-8b-instant",     # Fast, free
+     "mixtral-8x7b-32768",       # Good for complex tasks
+     "gemma2-9b-it",             # Google's model
+ ]
+ 
+ DEFAULT_MODEL = "llama-3.1-70b-versatile"
+ 
+ 
+ class AutonomousMCPAgentGroq:
+     """
+     AI Agent using the Groq API (FREE, fast inference)
+ 
+     Get your free API key at: https://console.groq.com
+     """
+ 
+     def __init__(
+         self,
+         mcp_registry: MCPRegistry,
+         api_key: Optional[str] = None,
+         model: Optional[str] = None
+     ):
+         self.mcp_registry = mcp_registry
+         self.api_key = api_key or os.getenv("GROQ_API_KEY")
+         self.model = model or os.getenv("GROQ_MODEL", DEFAULT_MODEL)
+ 
+         if not self.api_key:
+             raise ValueError("GROQ_API_KEY is required. Get a free key at https://console.groq.com")
+ 
+         # Build tool descriptions for the prompt
+         self.tools_description = self._build_tools_description()
+ 
+         logger.info(f"Groq Agent initialized with model: {self.model}")
+ 
+     def _build_tools_description(self) -> str:
+         """Build tool descriptions for the system prompt"""
+         tools_text = ""
+         for tool in MCP_TOOLS:
+             tools_text += f"\n- **{tool['name']}**: {tool['description']}"
+             props = tool.get('input_schema', {}).get('properties', {})
+             required = tool.get('input_schema', {}).get('required', [])
+             if props:
+                 tools_text += "\n  Parameters:"
+                 for param, details in props.items():
+                     req = "(required)" if param in required else "(optional)"
+                     tools_text += f"\n    - {param} {req}: {details.get('description', '')}"
+         return tools_text
+ 
+     def _build_system_prompt(self) -> str:
+         return f"""You are an AI sales agent with access to tools. Use tools to complete tasks.
+ 
+ AVAILABLE TOOLS:
+ {self.tools_description}
+ 
+ TO USE A TOOL, respond with JSON in this exact format:
+ ```json
+ {{"tool": "tool_name", "parameters": {{"param1": "value1"}}}}
+ ```
+ 
+ RULES:
+ 1. Use search_web to find information
+ 2. Use save_prospect and save_contact to store data
+ 3. Use send_email to draft emails
+ 4. After completing all tasks, provide a summary
+ 5. Say "DONE" when finished
+ 
+ Be concise and focused."""
+ 
+     async def run(self, task: str, max_iterations: int = 15) -> AsyncGenerator[Dict[str, Any], None]:
+         """Run the agent on a task"""
+         yield {
+             "type": "agent_start",
+             "message": f"Starting task with {self.model}",
+             "model": self.model
+         }
+ 
+         system_prompt = self._build_system_prompt()
+         messages = [
+             {"role": "system", "content": system_prompt},
+             {"role": "user", "content": task}
+         ]
+ 
+         for iteration in range(1, max_iterations + 1):
+             yield {
+                 "type": "iteration_start",
+                 "iteration": iteration,
+                 "message": f"Iteration {iteration}: AI reasoning..."
+             }
+ 
+             try:
+                 # Call the Groq API
+                 response = self._call_groq(messages)
+                 assistant_content = response.get("choices", [{}])[0].get("message", {}).get("content", "")
+ 
+                 if not assistant_content:
+                     continue
+ 
+                 # Check for completion
+                 if "DONE" in assistant_content.upper():
+                     yield {
+                         "type": "thought",
+                         "thought": assistant_content.replace("DONE", "").strip(),
+                         "message": "Task complete"
+                     }
+                     yield {
+                         "type": "agent_complete",
+                         "message": "Task complete!",
+                         "final_answer": assistant_content.replace("DONE", "").strip(),
+                         "iterations": iteration
+                     }
+                     return
+ 
+                 # Try to parse tool calls
+                 tool_calls = self._parse_tool_calls(assistant_content)
+ 
+                 if tool_calls:
+                     messages.append({"role": "assistant", "content": assistant_content})
+                     tool_results = []
+ 
+                     for tool_call in tool_calls:
+                         tool_name = tool_call.get("tool", "")
+                         tool_params = tool_call.get("parameters", {})
+ 
+                         yield {
+                             "type": "tool_call",
+                             "tool": tool_name,
+                             "input": tool_params,
+                             "message": f"Calling: {tool_name}"
+                         }
+ 
+                         try:
+                             result = await self._execute_tool(tool_name, tool_params)
+                             yield {
+                                 "type": "tool_result",
+                                 "tool": tool_name,
+                                 "result": result,
+                                 "message": f"Tool {tool_name} completed"
+                             }
+                             tool_results.append({"tool": tool_name, "result": result})
+                         except Exception as e:
+                             yield {
+                                 "type": "tool_error",
+                                 "tool": tool_name,
+                                 "error": str(e),
+                                 "message": f"Tool error: {e}"
+                             }
+                             tool_results.append({"tool": tool_name, "error": str(e)})
+ 
+                     # Add the tool results to the conversation
+                     results_text = "Tool results:\n" + json.dumps(tool_results, indent=2, default=str)[:2000]
+                     messages.append({"role": "user", "content": results_text})
+                 else:
+                     # No tool calls - just a response
+                     yield {
+                         "type": "thought",
+                         "thought": assistant_content,
+                         "message": f"AI: {assistant_content[:100]}..."
+                     }
+                     messages.append({"role": "assistant", "content": assistant_content})
+                     messages.append({"role": "user", "content": "Continue with the task. Use tools to gather data. Say DONE when finished."})
+ 
+             except Exception as e:
+                 logger.error(f"Error in iteration {iteration}: {e}")
+                 yield {
+                     "type": "agent_error",
+                     "error": str(e),
+                     "message": f"Error: {e}"
+                 }
+                 return
+ 
+         yield {
+             "type": "agent_max_iterations",
+             "message": f"Reached max iterations ({max_iterations})",
+             "iterations": max_iterations
+         }
+ 
+     def _call_groq(self, messages: List[Dict]) -> Dict:
+         """Call the Groq API"""
+         import requests
+ 
+         url = "https://api.groq.com/openai/v1/chat/completions"
+         headers = {
+             "Authorization": f"Bearer {self.api_key}",
+             "Content-Type": "application/json"
+         }
+         payload = {
+             "model": self.model,
+             "messages": messages,
+             "max_tokens": 2048,
+             "temperature": 0.7
+         }
+ 
+         response = requests.post(url, headers=headers, json=payload, timeout=60)
+         response.raise_for_status()
+         return response.json()
+ 
+     def _parse_tool_calls(self, text: str) -> List[Dict]:
+         """Parse tool calls from the response text"""
+         import re
+ 
+         tool_calls = []
+ 
+         # Match JSON blocks
+         patterns = [
+             r'```json\s*(\{[^`]+\})\s*```',
+             r'```\s*(\{[^`]+\})\s*```',
+             r'(\{"tool":\s*"[^"]+",\s*"parameters":\s*\{[^}]*\}\})',
+         ]
+ 
+         for pattern in patterns:
+             matches = re.findall(pattern, text, re.DOTALL)
+             for match in matches:
+                 try:
+                     data = json.loads(match.strip())
+                     if "tool" in data:
+                         tool_calls.append(data)
+                 except json.JSONDecodeError:
+                     continue
+ 
+         return tool_calls
+ 
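As a quick sanity check, here is what `_parse_tool_calls` extracts from a typical reply. The reply text is invented, and the pattern shown is the first of the three above:

```python
import json
import re

# Invented model reply: prose followed by a fenced JSON tool call.
sample_reply = (
    "Let me search first.\n"
    "```json\n"
    '{"tool": "search_web", "parameters": {"query": "Acme Corp funding"}}\n'
    "```"
)

match = re.search(r'```json\s*(\{[^`]+\})\s*```', sample_reply, re.DOTALL)
call = json.loads(match.group(1))
print(call["tool"])        # search_web
print(call["parameters"])  # {'query': 'Acme Corp funding'}
```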
+     async def _execute_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
+         """Execute an MCP tool"""
+ 
+         if tool_name == "search_web":
+             query = tool_input.get("query", "")
+             max_results = tool_input.get("max_results", 5)
+             results = await self.mcp_registry.search.query(query, max_results=max_results)
+             return {"results": results[:max_results], "count": len(results[:max_results])}
+ 
+         elif tool_name == "search_news":
+             query = tool_input.get("query", "")
+             max_results = tool_input.get("max_results", 5)
+             results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
+             return {"results": results[:max_results], "count": len(results[:max_results])}
+ 
+         elif tool_name == "save_prospect":
+             prospect_data = {
+                 "id": tool_input.get("prospect_id", str(uuid.uuid4())),
+                 "company": {
+                     "id": tool_input.get("company_id"),
+                     "name": tool_input.get("company_name"),
+                     "domain": tool_input.get("company_domain")
+                 },
+                 "fit_score": tool_input.get("fit_score", 0),
+                 "status": tool_input.get("status", "new"),
+                 "metadata": tool_input.get("metadata", {})
+             }
+             result = await self.mcp_registry.store.save_prospect(prospect_data)
+             return {"status": result, "prospect_id": prospect_data["id"]}
+ 
+         elif tool_name == "save_company":
+             company_data = {
+                 "id": tool_input.get("company_id", str(uuid.uuid4())),
+                 "name": tool_input.get("name", ""),
+                 "domain": tool_input.get("domain", ""),
+                 "industry": tool_input.get("industry"),
+                 "description": tool_input.get("description"),
+                 "employee_count": tool_input.get("employee_count")
+             }
+             result = await self.mcp_registry.store.save_company(company_data)
+             return {"status": result, "company_id": company_data["id"]}
+ 
+         elif tool_name == "save_contact":
+             contact_data = {
+                 "id": tool_input.get("contact_id", str(uuid.uuid4())),
+                 "company_id": tool_input.get("company_id", ""),
+                 "email": tool_input.get("email", ""),
+                 "first_name": tool_input.get("first_name"),
+                 "last_name": tool_input.get("last_name"),
+                 "title": tool_input.get("title"),
+                 "seniority": tool_input.get("seniority")
+             }
+             result = await self.mcp_registry.store.save_contact(contact_data)
+             return {"status": result, "contact_id": contact_data["id"]}
+ 
+         elif tool_name == "save_fact":
+             fact_data = {
+                 "id": tool_input.get("fact_id", str(uuid.uuid4())),
+                 "company_id": tool_input.get("company_id", ""),
+                 "fact_type": tool_input.get("fact_type", ""),
+                 "content": tool_input.get("content", ""),
+                 "source_url": tool_input.get("source_url"),
+                 "confidence_score": tool_input.get("confidence_score", 0.8)
+             }
+             result = await self.mcp_registry.store.save_fact(fact_data)
+             return {"status": result, "fact_id": fact_data["id"]}
+ 
+         elif tool_name == "send_email":
+             to = tool_input.get("to", "")
+             subject = tool_input.get("subject", "")
+             body = tool_input.get("body", "")
+             prospect_id = tool_input.get("prospect_id", "")
+             thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
+             return {"status": "sent", "thread_id": thread_id, "to": to}
+ 
+         elif tool_name == "list_prospects":
+             prospects = await self.mcp_registry.store.list_prospects()
+             return {"prospects": prospects, "count": len(prospects)}
+ 
+         elif tool_name == "get_prospect":
+             prospect_id = tool_input.get("prospect_id", "")
+             prospect = await self.mcp_registry.store.get_prospect(prospect_id)
+             return prospect or {"error": "Prospect not found"}
+ 
+         elif tool_name == "suggest_meeting_slots":
+             slots = await self.mcp_registry.calendar.suggest_slots()
+             return {"slots": slots[:3], "count": len(slots[:3])}
+ 
+         else:
+             raise ValueError(f"Unknown tool: {tool_name}")
mcp/agents/autonomous_agent_hf.py ADDED
@@ -0,0 +1,1215 @@
+ """
+ Autonomous AI Agent with MCP Tool Calling using HuggingFace Inference Providers
+ 
+ This agent uses HuggingFace's Inference Providers API with native tool calling
+ support to autonomously decide which MCP tools to call.
+ 
+ Benefits:
+ - Uses the HuggingFace unified API (a single HF token for all providers)
+ - Native tool calling support (OpenAI-compatible API)
+ - Multiple providers: Nebius, Together, Sambanova, etc.
+ - Models like Qwen2.5-72B-Instruct with strong tool calling
+ - Free tier available with a HuggingFace account
+ """
+ 
+ import os
+ import json
+ import uuid
+ import logging
+ import asyncio
+ from typing import List, Dict, Any, AsyncGenerator
+ 
+ from mcp.tools.definitions import MCP_TOOLS, list_all_tools
+ from mcp.registry import MCPRegistry
+ 
+ logger = logging.getLogger(__name__)
+ 
+ # Free models available via the HuggingFace Serverless Inference API.
+ # These don't require paid provider credits.
+ FREE_MODELS = [
+     "mistralai/Mistral-7B-Instruct-v0.3",  # Fast, good quality
+     "microsoft/Phi-3-mini-4k-instruct",    # Small, fast
+     "HuggingFaceH4/zephyr-7b-beta",        # Good for chat
+     "meta-llama/Llama-3.2-3B-Instruct",    # Meta's small model
+     "Qwen/Qwen2.5-3B-Instruct",            # Qwen small
+ ]
+ 
+ # Paid provider models (require credits)
+ QWEN3_MODELS = [
+     "Qwen/Qwen3-32B",
+     "Qwen/Qwen3-8B",
+     "Qwen/Qwen3-4B",
+ ]
+ 
+ # HuggingFace Inference Providers
+ HF_PROVIDERS = {
+     "nscale": {"models": QWEN3_MODELS, "default": "Qwen/Qwen3-32B"},  # nscale provider
+     "nebius": {"models": QWEN3_MODELS, "default": "Qwen/Qwen3-32B"},
+     "together": {"models": QWEN3_MODELS, "default": "Qwen/Qwen3-32B"},
+     "sambanova": {"models": QWEN3_MODELS, "default": "Qwen/Qwen3-8B"},
+     "fireworks-ai": {"models": QWEN3_MODELS, "default": "Qwen/Qwen3-8B"},
+     "cerebras": {"models": ["Qwen/Qwen3-32B"], "default": "Qwen/Qwen3-32B"},
+ }
+ 
+ # Default to the FREE serverless API (no provider = serverless)
+ DEFAULT_PROVIDER = "hf-inference"  # Special value for free serverless
+ DEFAULT_MODEL = "mistralai/Mistral-7B-Instruct-v0.3"
+ 
+ 
+ class AutonomousMCPAgentHF:
+     """
+     AI Agent that autonomously uses MCP servers as tools via HuggingFace Inference Providers.
+ 
+     Uses native tool calling (OpenAI-compatible) for reliable tool execution.
+     HuggingFace routes requests to inference providers like Nebius, Together, etc.
+     """
+ 
+     def __init__(
+         self,
+         mcp_registry: MCPRegistry,
+         hf_token: str = None,
+         provider: str = None,
+         model: str = None
+     ):
+         """
+         Initialize the autonomous agent with HuggingFace Inference Providers
+ 
+         Args:
+             mcp_registry: MCP registry with all servers
+             hf_token: HuggingFace token (create one at huggingface.co/settings/tokens)
+             provider: Inference provider (nebius, together, sambanova, etc.)
+             model: Model to use (default: mistralai/Mistral-7B-Instruct-v0.3)
+         """
+         self.mcp_registry = mcp_registry
+         self.hf_token = hf_token or os.getenv("HF_TOKEN") or os.getenv("HF_API_TOKEN")
+         self.model = model or os.getenv("HF_MODEL") or DEFAULT_MODEL
+ 
+         # Resolve the provider in this order: passed param > env var > auto-detect
+         if provider:
+             self.provider = provider
+         elif os.getenv("HF_PROVIDER"):
+             self.provider = os.getenv("HF_PROVIDER")
+         elif self.model in QWEN3_MODELS or self.model.startswith("Qwen/Qwen3"):
+             # Qwen3 models need a provider (use nscale by default)
+             self.provider = "nscale"
+         else:
+             self.provider = DEFAULT_PROVIDER
+ 
+         if not self.hf_token:
+             raise ValueError(
+                 "HF_TOKEN is required!\n"
+                 "Get a token at: https://huggingface.co/settings/tokens\n"
+                 "Then set: export HF_TOKEN=hf_your_token_here"
+             )
+ 
+         # Initialize the HuggingFace InferenceClient
+         try:
+             from huggingface_hub import InferenceClient
+             # For the serverless API (hf-inference), don't pass a provider
+             if self.provider == "hf-inference":
+                 self.client = InferenceClient(token=self.hf_token)
+             else:
+                 self.client = InferenceClient(
+                     provider=self.provider,
+                     token=self.hf_token
+                 )
+             logger.info("HuggingFace InferenceClient initialized")
+             logger.info(f"  Provider: {self.provider}")
+             logger.info(f"  Model: {self.model}")
+         except ImportError:
+             raise ImportError(
+                 "huggingface_hub package not installed or outdated!\n"
+                 "Install/upgrade with: pip install --upgrade huggingface_hub"
+             )
+ 
+         # Create tool definitions in OpenAI/HF format
+         self.tools = self._create_tool_definitions()
+ 
+         logger.info(f"Autonomous MCP Agent initialized with HuggingFace ({self.provider})")
+         logger.info(f"Available tools: {len(self.tools)}")
+ 
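Construction follows the provider resolution above (explicit argument, then the HF_PROVIDER variable, then auto-detection). A hedged usage sketch; `registry` and the token value are placeholders, not values from this commit:

```python
import os

os.environ.setdefault("HF_TOKEN", "hf_xxx")  # placeholder token

# `registry` is assumed to be an already-constructed MCPRegistry.
agent = AutonomousMCPAgentHF(
    mcp_registry=registry,
    provider="nebius",        # overrides HF_PROVIDER and auto-detection
    model="Qwen/Qwen3-32B",
)
```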
+     def _create_tool_definitions(self) -> List[Dict[str, Any]]:
+         """Convert MCP tool definitions to the OpenAI/HuggingFace function calling format"""
+         tools = []
+ 
+         for mcp_tool in MCP_TOOLS:
+             tool = {
+                 "type": "function",
+                 "function": {
+                     "name": mcp_tool["name"],
+                     "description": mcp_tool["description"],
+                     "parameters": mcp_tool["input_schema"]
+                 }
+             }
+             tools.append(tool)
+ 
+         return tools
+ 
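For a single tool, the conversion is just a wrapper around the existing schema. Assuming a `search_web` entry in MCP_TOOLS shaped like the descriptions used elsewhere in this repo, the output would look like:

```python
# Hypothetical MCP_TOOLS entry (shape assumed from its usage in this repo).
mcp_tool = {
    "name": "search_web",
    "description": "Search the web for company information",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Search query"}},
        "required": ["query"],
    },
}

# What _create_tool_definitions produces for it (OpenAI function-calling format).
openai_tool = {
    "type": "function",
    "function": {
        "name": mcp_tool["name"],
        "description": mcp_tool["description"],
        "parameters": mcp_tool["input_schema"],
    },
}
```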
+     async def run(
+         self,
+         task: str,
+         max_iterations: int = 15
+     ) -> AsyncGenerator[Dict[str, Any], None]:
+         """
+         Run the agent autonomously on a task using native tool calling.
+ 
+         Args:
+             task: The task to complete
+             max_iterations: Maximum tool calls to prevent infinite loops
+ 
+         Yields:
+             Events showing the agent's progress and tool calls
+         """
+ 
+         yield {
+             "type": "agent_start",
+             "message": "Autonomous AI Agent (HuggingFace) starting task",
+             "task": task,
+             "model": self.model,
+             "provider": self.provider
+         }
+ 
+         # System prompt for the agent
+         system_prompt = """You are an autonomous AI agent for B2B sales automation.
+ 
+ You have access to MCP tools including:
+ - search_web: Search the web for company information
+ - find_verified_contacts: Find REAL decision-makers (searches LinkedIn, company websites, directories)
+ - save_prospect: Save a prospect company to the database
+ - send_email: Draft outreach emails
+ 
+ CRITICAL RULE: Only save prospects that have verified contacts. No contacts = don't save.
+ 
+ REQUIRED WORKFLOW:
+ 1. search_web to find potential prospect companies
+ 2. find_verified_contacts FIRST to check if contacts exist
+ 3. IF contacts found (count > 0): save_prospect, then send_email
+ 4. IF no contacts found (count = 0): SKIP this company, try the next one
+ 
+ TOOL CALL FORMAT - output valid JSON:
+ 
+ Step 1 - Find contacts FIRST:
+ {"company_name": "Acme Corp", "company_domain": "acme.com", "target_titles": ["CEO", "Founder", "VP Sales", "CTO"], "max_contacts": 3}
+ 
+ Step 2 - ONLY if contacts were found, save the prospect:
+ {"prospect_id": "prospect_1", "company_id": "company_1", "company_name": "Acme Corp", "company_domain": "acme.com", "fit_score": 85}
+ 
+ The find_verified_contacts tool searches:
+ - Company website (team/about pages)
+ - LinkedIn profiles
+ - Crunchbase, ZoomInfo, directories
+ - Press releases and news
+ - Social media profiles
+ 
+ IMPORTANT:
+ - A prospect without contacts is USELESS - don't save it
+ - NEVER invent contact names or emails
+ - Keep searching until you find prospects WITH verified contacts
+ 
+ After completing, summarize:
+ - Prospects saved (with contacts)
+ - Companies skipped (no contacts)"""
+ 
+         # Initialize the conversation
+         messages = [
+             {"role": "system", "content": system_prompt},
+             {"role": "user", "content": task}
+         ]
+ 
+         iteration = 0
+ 
+         while iteration < max_iterations:
+             iteration += 1
+ 
+             yield {
+                 "type": "iteration_start",
+                 "iteration": iteration,
+                 "message": f"Iteration {iteration}: AI reasoning..."
+             }
+ 
+             try:
+                 # Call the HuggingFace Inference API with tools
+                 logger.info(f"Calling HuggingFace API (iteration {iteration})...")
+                 logger.info(f"  Provider: {self.provider}, Model: {self.model}")
+ 
+                 # Run the synchronous API call in an executor
+                 response = await asyncio.get_event_loop().run_in_executor(
+                     None,
+                     self._call_inference_api,
+                     messages
+                 )
+ 
+                 # Handle the response
+                 if response is None:
+                     yield {
+                         "type": "agent_error",
+                         "error": "Empty response from API",
+                         "message": "API returned an empty response"
+                     }
+                     break
+ 
+                 # Get the assistant message
+                 assistant_message = response.choices[0].message
+ 
+                 # Check if the AI wants to call tools
+                 if hasattr(assistant_message, 'tool_calls') and assistant_message.tool_calls:
+                     # Process each tool call
+                     tool_results = []
+ 
+                     for tool_call in assistant_message.tool_calls:
+                         tool_name = tool_call.function.name
+ 
+                         try:
+                             tool_input = json.loads(tool_call.function.arguments)
+                         except json.JSONDecodeError:
+                             tool_input = {}
+ 
+                         yield {
+                             "type": "tool_call",
+                             "tool": tool_name,
+                             "input": tool_input,
+                             "message": f"Action: {tool_name}"
+                         }
+ 
+                         # Execute the MCP tool
+                         try:
+                             result = await self._execute_mcp_tool(tool_name, tool_input)
+ 
+                             yield {
+                                 "type": "tool_result",
+                                 "tool": tool_name,
+                                 "result": result,
+                                 "message": f"Tool {tool_name} completed"
+                             }
+ 
+                             tool_results.append({
+                                 "tool_call_id": tool_call.id,
+                                 "role": "tool",
+                                 "content": json.dumps(result, default=str)
+                             })
+ 
+                         except Exception as e:
+                             error_msg = str(e)
+                             logger.error(f"Tool execution failed: {tool_name} - {error_msg}")
+ 
+                             yield {
+                                 "type": "tool_error",
+                                 "tool": tool_name,
+                                 "error": error_msg,
+                                 "message": f"Tool {tool_name} failed: {error_msg}"
+                             }
+ 
+                             tool_results.append({
+                                 "tool_call_id": tool_call.id,
+                                 "role": "tool",
+                                 "content": json.dumps({"error": error_msg})
+                             })
+ 
+                     # Add the assistant message and tool results to the conversation
+                     messages.append({
+                         "role": "assistant",
+                         "content": assistant_message.content or "",
+                         "tool_calls": [
+                             {
+                                 "id": tc.id,
+                                 "type": "function",
+                                 "function": {
+                                     "name": tc.function.name,
+                                     "arguments": tc.function.arguments
+                                 }
+                             }
+                             for tc in assistant_message.tool_calls
+                         ]
+                     })
+                     messages.extend(tool_results)
+ 
+                 else:
+                     # No tool calls - the AI is done or providing a response
+                     final_content = assistant_message.content or ""
+                     raw_content = getattr(assistant_message, 'raw_content', final_content)
+ 
+                     # Log for debugging
+                     logger.info(f"Iteration {iteration}: No tool calls")
+                     logger.info(f"  Raw content length: {len(raw_content)}")
+                     logger.info(f"  Stripped content length: {len(final_content)}")
+                     if raw_content and not final_content:
+                         logger.info(f"  Raw content preview: {raw_content[:200]}...")
+ 
+                     # Always yield a thought event if we have ANY content (for tracking)
+                     if final_content:
+                         yield {
+                             "type": "thought",
+                             "thought": final_content,
+                             "message": f"AI Response: {final_content[:100]}..." if len(final_content) > 100 else f"AI Response: {final_content}"
+                         }
+                     elif raw_content:
+                         # Content was stripped but raw exists - yield a minimal thought
+                         yield {
+                             "type": "thought",
+                             "thought": f"[Processing: {len(raw_content)} chars of reasoning]",
+                             "message": "AI is reasoning..."
+                         }
+ 
+                     # Check if this looks like a final answer (after at least one iteration)
+                     if iteration > 1:
+                         # Ensure we have some content for the final answer
+                         if not final_content and raw_content:
+                             # Try to extract something useful from the raw thinking
+                             import re
+                             think_match = re.search(r'<think>(.*?)</think>', raw_content, flags=re.DOTALL)
+                             if think_match:
+                                 think_text = think_match.group(1).strip()
+                                 # Get the last meaningful portion
+                                 sentences = [s.strip() for s in think_text.split('.') if len(s.strip()) > 20]
+                                 if sentences:
+                                     final_content = '. '.join(sentences[-5:]) + '.'
+                                     logger.info(f"Extracted final answer from thinking: {final_content[:100]}...")
+ 
+                         yield {
+                             "type": "agent_complete",
+                             "message": "Task complete!",
+                             "final_answer": final_content,
+                             "iterations": iteration
+                         }
+                         break
+ 
+                     # Add the response to messages and continue
+                     messages.append({
+                         "role": "assistant",
+                         "content": final_content or (raw_content[:500] if raw_content else "")
+                     })
+ 
+             except Exception as e:
+                 error_msg = str(e)
+                 logger.error(f"HuggingFace API error: {error_msg}", exc_info=True)
+ 
+                 # Check for common errors
+                 if "401" in error_msg or "unauthorized" in error_msg.lower():
+                     yield {
+                         "type": "agent_error",
+                         "error": "Invalid HF_TOKEN",
+                         "message": "Authentication failed. Please check your HF_TOKEN."
+                     }
+                 elif "rate" in error_msg.lower() or "limit" in error_msg.lower():
+                     yield {
+                         "type": "agent_error",
+                         "error": "Rate limit reached",
+                         "message": "Rate limit reached. Try again later or upgrade to HF PRO."
+                     }
+                 else:
+                     yield {
+                         "type": "agent_error",
+                         "error": error_msg,
+                         "message": f"API error: {error_msg}"
+                     }
+                 break
+ 
+         if iteration >= max_iterations:
+             yield {
+                 "type": "agent_max_iterations",
+                 "message": f"Reached maximum iterations ({max_iterations})",
+                 "iterations": iteration
+             }
+ 
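The conversation this loop builds is OpenAI-shaped: each round of tool use appends one assistant message carrying `tool_calls`, then one `role: "tool"` message per call. An illustrative (invented) tail of `messages` after a single `search_web` round:

```python
messages_tail = [
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "call_0",
            "type": "function",
            "function": {"name": "search_web",
                         "arguments": '{"query": "Acme Corp"}'},
        }],
    },
    {
        "role": "tool",
        "tool_call_id": "call_0",
        "content": '{"results": [], "count": 0}',
    },
]
```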
+     def _call_inference_api(self, messages: List[Dict]) -> Any:
+         """
+         Call the HuggingFace Inference API via the new router endpoint.
+         Uses the configured provider (e.g., nscale for Qwen3-32B).
+         """
+         import requests
+ 
+         headers = {
+             "Authorization": f"Bearer {self.hf_token}",
+             "Content-Type": "application/json"
+         }
+         last_error = None
+ 
+         # Add a provider header if using a specific provider
+         if self.provider and self.provider != "hf-inference":
+             headers["X-HF-Provider"] = self.provider
+ 
+         # Use the router endpoint for chat completions
+         api_url = "https://router.huggingface.co/v1/chat/completions"
+ 
+         # Try the configured model first
+         try:
+             logger.info(f"Trying primary model: {self.model} via {self.provider}")
+ 
+             payload = {
+                 "model": self.model,
+                 "messages": messages,
+                 "max_tokens": 2048,
+                 "temperature": 0.7,
+                 "stream": False,
+                 "tools": self.tools,   # Include the tool definitions!
+                 "tool_choice": "auto"  # Let the model decide when to use tools
+             }
+ 
+             response = requests.post(api_url, headers=headers, json=payload, timeout=120)
+ 
+             if response.status_code == 200:
+                 result = response.json()
+                 logger.info(f"Success with {self.model} via {self.provider}")
+                 return self._create_chat_response(result)
+             elif response.status_code == 402:
+                 logger.warning(f"Payment required for {self.model} via {self.provider}. Falling back...")
+                 last_error = "Payment required - exceeded monthly credits"
+             elif response.status_code == 404:
+                 logger.warning(f"Model {self.model} not found via {self.provider}. Falling back...")
+                 last_error = f"Model not found via {self.provider}"
+             else:
+                 logger.warning(f"Model {self.model} returned {response.status_code}: {response.text[:200]}")
+                 last_error = f"HTTP {response.status_code}"
+ 
+         except Exception as e:
+             last_error = str(e)
+             logger.warning(f"Primary model failed: {last_error}")
+ 
+         # Fallback models with their providers
+         fallback_models = [
+             ("Qwen/Qwen2.5-72B-Instruct", None),           # No provider = serverless
+             ("meta-llama/Llama-3.1-70B-Instruct", None),
+             ("mistralai/Mixtral-8x7B-Instruct-v0.1", None),
+             ("Qwen/Qwen3-32B", "nebius"),                  # Try nebius as a backup
+             ("Qwen/Qwen3-8B", "together"),                 # Try together as a backup
+         ]
+ 
+         for model, provider in fallback_models:
+             try:
+                 logger.info(f"Trying fallback model: {model}" + (f" via {provider}" if provider else ""))
+ 
+                 payload = {
+                     "model": model,
+                     "messages": messages,
+                     "max_tokens": 2048,
+                     "temperature": 0.7,
+                     "stream": False,
+                     "tools": self.tools,  # Include the tool definitions!
+                     "tool_choice": "auto"
+                 }
+ 
+                 # Set headers for this fallback
+                 fallback_headers = {
+                     "Authorization": f"Bearer {self.hf_token}",
+                     "Content-Type": "application/json"
+                 }
+                 if provider:
+                     fallback_headers["X-HF-Provider"] = provider
+ 
+                 response = requests.post(api_url, headers=fallback_headers, json=payload, timeout=120)
+ 
+                 if response.status_code == 200:
+                     result = response.json()
+                     logger.info(f"Success with fallback model: {model}")
+                     return self._create_chat_response(result)
+                 elif response.status_code in [402, 404]:
+                     logger.warning(f"Model {model} returned {response.status_code}, trying next...")
+                     continue
+                 elif response.status_code == 503:
+                     logger.info(f"Model {model} is loading, trying next...")
+                     continue
+                 else:
+                     logger.warning(f"Model {model} returned {response.status_code}")
+                     continue
+ 
+             except Exception as e:
+                 last_error = str(e)
+                 logger.warning(f"Model {model} failed: {str(e)[:100]}")
+                 continue
+ 
+         logger.error(f"All models failed. Last error: {last_error}")
+         raise Exception(f"All inference attempts failed: {last_error}")
+ 
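Stripped of the fallback chain, the request above is a plain OpenAI-style chat completion against the router URL used in this method. A minimal sketch without tools; HF_TOKEN is assumed to be set in the environment:

```python
import os
import requests

# Same router endpoint as _call_inference_api uses above.
resp = requests.post(
    "https://router.huggingface.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
    json={
        "model": "mistralai/Mistral-7B-Instruct-v0.3",
        "messages": [{"role": "user", "content": "Say hi"}],
        "max_tokens": 32,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```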
+     def _strip_thinking_tags(self, text: str) -> str:
+         """Remove Qwen3's <think>...</think> tags and return the actual response"""
+         import re
+         if not text:
+             return ""
+         # Remove <think>...</think> blocks (Qwen3 chain-of-thought)
+         cleaned = re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL)
+         result = cleaned.strip()
+ 
+         # If the stripped content is empty but the original had thinking, extract a summary
+         if not result and '<think>' in text:
+             # Try to extract the last meaningful sentences from the thinking as a fallback
+             think_match = re.search(r'<think>(.*?)</think>', text, flags=re.DOTALL)
+             if think_match:
+                 think_content = think_match.group(1).strip()
+                 # Use the last few sentences as a summary (the model's conclusion)
+                 sentences = [s.strip() for s in think_content.split('.') if s.strip()]
+                 if sentences:
+                     # Return the last 2-3 meaningful sentences as the response
+                     result = '. '.join(sentences[-3:]) + '.'
+                     logger.info(f"Extracted thinking summary: {result[:100]}...")
+ 
+         return result
+ 
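The tag stripping itself is a single regex call; on a sample Qwen3-style output:

```python
import re

text = "<think>Reason step by step...</think>Final reply here."
print(re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL).strip())
# -> Final reply here.
```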
+     def _create_chat_response(self, result: dict) -> Any:
+         """Create a response object from a chat completion result"""
+         strip_thinking = self._strip_thinking_tags
+ 
+         class MockChoice:
+             def __init__(self, message_data):
+                 self.message = MockMessage(message_data)
+ 
+         class MockMessage:
+             def __init__(self, data):
+                 # Handle None content properly (the API might return {"content": null})
+                 raw_content = data.get("content") or ""
+                 # Strip Qwen3 thinking tags to get the actual response
+                 self.content = strip_thinking(raw_content)
+                 # Store the raw content for debugging/fallback
+                 self.raw_content = raw_content
+                 self.tool_calls = self._parse_tool_calls_from_response(data, raw_content)
+ 
+             def _parse_tool_calls_from_response(self, data, raw_content):
+                 """Parse tool calls from the API response or from the content"""
+                 # Check if the API returned tool_calls directly
+                 if "tool_calls" in data and data["tool_calls"]:
+                     return [MockToolCall(tc) for tc in data["tool_calls"]]
+ 
+                 # Otherwise try to parse from content (use the raw content to find tool calls)
+                 return self._parse_tool_calls_from_text(raw_content)
+ 
+             def _infer_tool_from_params(self, params):
+                 """Infer the tool name from the parameter keys"""
+                 if not isinstance(params, dict):
+                     return None
+                 keys = set(params.keys())
+ 
+                 # Check for discover_prospects_with_contacts (HIGHEST PRIORITY - all-in-one tool)
+                 if "client_company" in keys and "client_industry" in keys:
+                     return "discover_prospects_with_contacts"
+                 if "client_company" in keys and "target_prospects" in keys:
+                     return "discover_prospects_with_contacts"
+                 # Check for find_verified_contacts patterns (single company)
+                 if "company_name" in keys and "company_domain" in keys and "target_titles" in keys:
+                     return "find_verified_contacts"
+                 if "company_name" in keys and "company_domain" in keys and "max_contacts" in keys:
+                     return "find_verified_contacts"
+                 # Check for save_prospect patterns
+                 if "prospect_id" in keys or ("company_name" in keys and "fit_score" in keys):
+                     return "save_prospect"
+                 # Check for save_company patterns
+                 if "company_id" in keys and ("name" in keys or "domain" in keys) and "prospect_id" not in keys:
+                     return "save_company"
+                 # Check for save_contact patterns (only for contacts returned by find_verified_contacts)
+                 if "contact_id" in keys or ("email" in keys and ("first_name" in keys or "last_name" in keys)):
+                     return "save_contact"
+                 # Check for send_email patterns
+                 if "to" in keys and "subject" in keys and "body" in keys:
+                     return "send_email"
+                 # Check for search patterns
+                 if "query" in keys and len(keys) <= 2:
+                     return "search_web"
+                 # Check for save_fact patterns
+                 if "fact_type" in keys or ("content" in keys and "company_id" in keys):
+                     return "save_fact"
+ 
+                 return None
+ 
+             def _parse_tool_calls_from_text(self, text):
+                 """Try to parse tool calls from a text response - handles Qwen3 text-based tool descriptions"""
+                 import re
+                 tool_calls = []
+ 
+                 def extract_json_objects(text):
+                     """Extract all JSON objects from text, handling nested braces"""
+                     objects = []
+                     i = 0
+                     while i < len(text):
+                         if text[i] == '{':
+                             start = i
+                             depth = 1
+                             i += 1
+                             while i < len(text) and depth > 0:
+                                 if text[i] == '{':
+                                     depth += 1
+                                 elif text[i] == '}':
+                                     depth -= 1
+                                 i += 1
+                             if depth == 0:
+                                 try:
+                                     obj = json.loads(text[start:i])
+                                     objects.append(obj)
+                                 except json.JSONDecodeError:
+                                     pass
+                         else:
+                             i += 1
+                     return objects
+ 
+                 # IMPORTANT: Search BOTH the raw text AND the stripped text for JSON objects.
+                 # Qwen3 may put tool calls inside <think> tags.
+                 all_json_objects = extract_json_objects(text)  # Search the raw text first
+ 
+                 # Also search the stripped version in case the JSON is outside think tags
+                 text_clean = strip_thinking(text)
+                 if text_clean != text:
+                     all_json_objects.extend(extract_json_objects(text_clean))
+                 logger.info(f"Found {len(all_json_objects)} JSON objects in response")
+ 
+                 # Process each JSON object and infer the tool
+                 seen_signatures = set()  # Avoid duplicates
+                 for obj in all_json_objects:
+                     tool_name = self._infer_tool_from_params(obj)
+                     if tool_name:
+                         # Create a signature to avoid duplicates
+                         sig = f"{tool_name}:{json.dumps(obj, sort_keys=True)}"
+                         if sig not in seen_signatures:
+                             seen_signatures.add(sig)
+                             tool_calls.append(MockToolCallFromText({"tool": tool_name, "parameters": obj}))
+                             logger.info(f"Parsed tool call: {tool_name} with params: {list(obj.keys())}")
+ 
+                 # Also check code fence blocks (sometimes the JSON is formatted there)
+                 code_blocks = re.findall(r'```(?:json)?\s*(.+?)\s*```', text_clean, re.DOTALL)
+                 for block in code_blocks:
+                     block_objects = extract_json_objects(block)
+                     for obj in block_objects:
+                         tool_name = self._infer_tool_from_params(obj)
+                         if tool_name:
+                             sig = f"{tool_name}:{json.dumps(obj, sort_keys=True)}"
+                             if sig not in seen_signatures:
+                                 seen_signatures.add(sig)
+                                 tool_calls.append(MockToolCallFromText({"tool": tool_name, "parameters": obj}))
+                                 logger.info(f"Parsed tool from code block: {tool_name}")
+ 
+                 if tool_calls:
+                     logger.info(f"Total tool calls parsed from text: {len(tool_calls)}")
+                 return tool_calls if tool_calls else None
+ 
+         class MockToolCall:
+             def __init__(self, data):
+                 self.function = MockFunction(data.get("function", {}))
+                 self.id = data.get("id", f"call_{id(self)}")
+ 
+         class MockToolCallFromText:
+             def __init__(self, data):
+                 self.function = MockFunctionFromText(data)
+                 self.id = f"call_{id(self)}"
+ 
+         class MockFunction:
+             def __init__(self, data):
+                 self.name = data.get("name", "")
+                 self.arguments = data.get("arguments", "{}")
+ 
+         class MockFunctionFromText:
+             def __init__(self, data):
+                 self.name = data.get("tool", data.get("name", ""))
+                 self.arguments = json.dumps(data.get("parameters", data.get("arguments", {})))
+ 
+         class MockResponse:
+             def __init__(self, result):
+                 choices_data = result.get("choices", [])
+                 if choices_data:
+                     self.choices = [MockChoice(c.get("message", {})) for c in choices_data]
+                 else:
+                     self.choices = []
+ 
+         return MockResponse(result)
+ 
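The nested `extract_json_objects` helper uses a brace-depth counter instead of a regex, so nested objects survive intact. A standalone restatement of the same logic, for illustration only:

```python
import json

def extract_json_objects(text):
    """Depth-count '{'/'}' to pull every parseable JSON object out of text."""
    objects, i = [], 0
    while i < len(text):
        if text[i] == '{':
            start, depth = i, 1
            i += 1
            while i < len(text) and depth > 0:
                if text[i] == '{':
                    depth += 1
                elif text[i] == '}':
                    depth -= 1
                i += 1
            if depth == 0:
                try:
                    objects.append(json.loads(text[start:i]))
                except json.JSONDecodeError:
                    pass
        else:
            i += 1
    return objects

print(extract_json_objects('noise {"a": {"b": 1}} more {"c": 2}'))
# -> [{'a': {'b': 1}}, {'c': 2}]
```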
710
+ async def _execute_mcp_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
711
+ """
712
+ Execute an MCP tool by routing to the appropriate MCP server.
713
+
714
+ This is where we actually call the MCP servers!
715
+ """
716
+
717
+ # ============ SEARCH MCP SERVER ============
718
+ if tool_name == "search_web":
719
+ query = tool_input["query"]
720
+ max_results = tool_input.get("max_results", 5)
721
+
722
+ results = await self.mcp_registry.search.query(query, max_results=max_results)
723
+ return {
724
+ "results": results[:max_results],
725
+ "count": len(results[:max_results])
726
+ }
727
+
728
+ elif tool_name == "search_news":
729
+ query = tool_input["query"]
730
+ max_results = tool_input.get("max_results", 5)
731
+
732
+ results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
733
+ return {
734
+ "results": results[:max_results],
735
+ "count": len(results[:max_results])
736
+ }
737
+
738
+ # ============ OPTIMIZED PROSPECT DISCOVERY WITH CONTACTS ============
739
+ elif tool_name == "discover_prospects_with_contacts":
740
+ from services.enhanced_contact_finder import EnhancedContactFinder
741
+ from urllib.parse import urlparse
742
+
743
+ client_company = tool_input["client_company"]
744
+ client_industry = tool_input["client_industry"]
745
+ target_prospects = tool_input.get("target_prospects", 3)
746
+ target_titles = tool_input.get("target_titles", ["CEO", "Founder", "VP Sales", "CTO", "Head of Sales"])
747
+
748
+ logger.info(f"Discovering {target_prospects} prospects with contacts for {client_company}")
749
+ print(f"\n[PROSPECT DISCOVERY] ========================================")
750
+ print(f"[PROSPECT DISCOVERY] Finding {target_prospects} prospects WITH verified contacts")
751
+ print(f"[PROSPECT DISCOVERY] Client: {client_company}")
752
+ print(f"[PROSPECT DISCOVERY] ========================================")
753
+
754
+ contact_finder = EnhancedContactFinder(mcp_registry=self.mcp_registry)
755
+
756
+ saved_prospects = []
757
+ all_contacts = []
758
+ skipped_companies = []
759
+ companies_checked = 0
760
+ max_companies_to_check = target_prospects * 8 # Check more companies to find enough with contacts
761
+
762
+ # Build smart search queries based on what the client company does
763
+ # The goal is to find CUSTOMERS for the client, not articles ABOUT the client
764
+ client_lower = client_company.lower()
765
+ industry_lower = client_industry.lower()
766
+
767
+ # Determine prospect type based on client business
768
+ # E-commerce platforms (Shopify, BigCommerce, etc.) -> retailers, DTC brands
769
+ # CRM software -> B2B companies, sales teams
770
+ # Marketing tools -> businesses needing marketing
771
+ # etc.
772
+
773
+ search_queries = []
774
+
775
+ # Check for e-commerce/retail platform clients
776
+ if any(kw in client_lower or kw in industry_lower for kw in ['ecommerce', 'e-commerce', 'shopify', 'online store', 'retail platform', 'shopping cart']):
777
+ search_queries = [
778
+ "DTC brands fashion apparel company",
779
+ "online boutique store founder CEO",
780
+ "independent retail brand ecommerce",
781
+ "emerging consumer brands direct to consumer",
782
+ "small business online store owner",
783
+ "handmade crafts seller business",
784
+ "subscription box company founder",
785
+ ]
786
+ # Check for CRM/Sales software clients
787
+ elif any(kw in client_lower or kw in industry_lower for kw in ['crm', 'salesforce', 'sales software', 'customer relationship']):
788
+ search_queries = [
789
+ "B2B SaaS company sales team",
790
+ "growing startup sales operations",
791
+ "enterprise software company VP Sales",
792
+ "technology company Head of Sales",
793
+ ]
794
+ # Check for marketing/advertising clients
795
+ elif any(kw in client_lower or kw in industry_lower for kw in ['marketing', 'advertising', 'ads', 'seo', 'content']):
796
+ search_queries = [
797
+ "growing startup marketing director",
798
+ "ecommerce brand marketing team",
799
+ "B2B company CMO marketing",
800
+ "technology startup growth marketing",
801
+ ]
802
+ # Default: find growing companies that might need the client's services
803
+ else:
804
+ search_queries = [
805
+ f"growing companies {industry_lower} customers list",
806
+ f"startups using {industry_lower} solutions",
807
+ f"businesses {industry_lower} case study customer",
808
+ f"companies similar to {client_company} customers",
809
+ "fast growing startups Series A B2B",
810
+ "emerging technology companies founder CEO",
811
+ "mid-market companies digital transformation",
812
+ ]
813
+
814
+ # Add generic business-finding queries
815
+ search_queries.extend([
816
+ "Inc 5000 fastest growing companies",
817
+ "emerging brands startup founders",
818
+ "venture backed startups series A",
819
+ ])
820
+
821
+ seen_domains = set()
822
+
823
+ # Skip domains that are NOT actual company websites
824
+ skip_domains = [
825
+ # Social media
826
+ 'linkedin.com', 'facebook.com', 'twitter.com', 'instagram.com', 'tiktok.com',
827
+ # Reference/directory sites
828
+ 'wikipedia.org', 'crunchbase.com', 'zoominfo.com', 'apollo.io', 'yelp.com',
829
+ 'glassdoor.com', 'g2.com', 'capterra.com', 'trustpilot.com', 'bbb.org',
830
+ # News/media sites
831
+ 'forbes.com', 'businessinsider.com', 'techcrunch.com', 'bloomberg.com',
832
+ 'cnbc.com', 'reuters.com', 'wsj.com', 'nytimes.com', 'theverge.com',
833
+ 'wired.com', 'mashable.com', 'venturebeat.com', 'inc.com', 'entrepreneur.com',
834
+ # Blog/article/review sites
835
+ 'medium.com', 'hubspot.com', 'blog.', 'wordpress.com', 'blogspot.com',
836
+ 'quora.com', 'reddit.com', 'youtube.com', 'vimeo.com',
837
+ # Generic/aggregator sites
838
+ 'amazon.com', 'ebay.com', 'alibaba.com', 'aliexpress.com',
839
+ 'google.com', 'bing.com', 'yahoo.com', 'duckduckgo.com',
840
+ # The client company itself (don't prospect yourself!)
841
+ client_company.lower().replace(' ', '') + '.com',
842
+ ]
843
+
844
+ # Also skip titles that look like articles, not company names
845
+ skip_title_patterns = [
846
+ 'what is', 'how to', 'guide', 'review', 'best ', 'top ', 'vs ',
847
+ ' vs ', 'comparison', 'tutorial', 'tips', 'ways to', 'complete',
848
+ 'everything you need', 'beginner', 'introduction', 'explained',
849
+ '2024', '2025', '2023', '[', ']', 'list of', 'examples'
850
+ ]
851
+
852
+ for query in search_queries:
853
+ if len(saved_prospects) >= target_prospects:
854
+ break
855
+
856
+ try:
857
+ print(f"\n[PROSPECT DISCOVERY] Searching: {query}")
858
+ results = await self.mcp_registry.search.query(query, max_results=10)
859
+
860
+ for result in results:
861
+ if len(saved_prospects) >= target_prospects:
862
+ break
863
+ if companies_checked >= max_companies_to_check:
864
+ break
865
+
866
+ url = result.get('url', '')
867
+ title = result.get('title', '')
868
+
869
+ # Extract domain from URL
870
+ try:
871
+ parsed = urlparse(url)
872
+ domain = parsed.netloc.replace('www.', '')
873
+ if not domain or domain in seen_domains:
874
+ continue
875
+ seen_domains.add(domain)
876
+ except Exception:
877
+ continue
878
+
879
+ # Skip non-company domains
880
+ if any(skip in domain.lower() for skip in skip_domains):
881
+ print(f"[PROSPECT DISCOVERY] ⏭️ Skipping non-company domain: {domain}")
882
+ continue
883
+
884
+ # Skip titles that look like articles, not companies
885
+ title_lower = title.lower()
886
+ if any(pattern in title_lower for pattern in skip_title_patterns):
887
+ print(f"[PROSPECT DISCOVERY] ⏭️ Skipping article title: {title[:50]}...")
888
+ continue
889
+
890
+ # Extract company name from title - be smarter about it
891
+ # Try to get actual company name, not article title
892
+ company_name = title.split(' - ')[0].split(' | ')[0].split(':')[0].strip()
893
+
894
+ # If company name is too long (probably article title), use domain
895
+ if len(company_name) > 40 or (' ' in company_name and len(company_name.split()) > 5):
896
+ # Extract company name from domain instead
897
+ company_name = domain.split('.')[0].replace('-', ' ').title()
898
+
899
+ if not company_name or len(company_name) < 2:
900
+ continue
901
+
902
+ companies_checked += 1
903
+ print(f"\n[PROSPECT DISCOVERY] Checking ({companies_checked}/{max_companies_to_check}): {company_name} ({domain})")
904
+
905
+ # Find contacts for this company
906
+ try:
907
+ contacts = await contact_finder.find_real_contacts(
908
+ company_name=company_name,
909
+ domain=domain,
910
+ target_titles=target_titles,
911
+ max_contacts=3
912
+ )
913
+
914
+ if contacts and len(contacts) > 0:
915
+ # Save prospect
916
+ prospect_id = f"prospect_{len(saved_prospects) + 1}"
917
+ company_id = domain.replace(".", "_")
918
+
919
+ prospect_data = {
920
+ "id": prospect_id,
921
+ "company": {
922
+ "id": company_id,
923
+ "name": company_name,
924
+ "domain": domain
925
+ },
926
+ "fit_score": 75,
927
+ "status": "new",
928
+ "metadata": {"source": "automated_discovery"}
929
+ }
930
+
931
+ await self.mcp_registry.store.save_prospect(prospect_data)
932
+
933
+ # Save contacts
934
+ contact_list = []
935
+ for contact in contacts:
936
+ contact_data = {
937
+ "id": contact.id,
938
+ "name": contact.name,
939
+ "email": contact.email,
940
+ "title": contact.title,
941
+ "company": company_name,
942
+ "domain": domain,
943
+ "verified": True,
944
+ "source": "web_search_and_scraping"
945
+ }
946
+ contact_list.append(contact_data)
947
+ all_contacts.append(contact_data)
948
+
949
+ await self.mcp_registry.store.save_contact({
950
+ "id": contact.id,
951
+ "company_id": company_id,
952
+ "email": contact.email,
953
+ "first_name": contact.name.split()[0] if contact.name else "",
954
+ "last_name": contact.name.split()[-1] if len(contact.name.split()) > 1 else "",
955
+ "title": contact.title
956
+ })
957
+
958
+ saved_prospects.append({
959
+ "prospect_id": prospect_id,
960
+ "company_name": company_name,
961
+ "domain": domain,
962
+ "contacts": contact_list,
963
+ "contact_count": len(contact_list)
964
+ })
965
+
966
+ print(f"[PROSPECT DISCOVERY] ✅ SAVED: {company_name} with {len(contacts)} contacts")
967
+ else:
968
+ skipped_companies.append({"name": company_name, "domain": domain, "reason": "no_contacts"})
969
+ print(f"[PROSPECT DISCOVERY] ⏭️ SKIPPED: {company_name} (no verified contacts)")
970
+
971
+ except Exception as e:
972
+ logger.debug(f"Error checking {company_name}: {str(e)}")
973
+ skipped_companies.append({"name": company_name, "domain": domain, "reason": str(e)})
974
+ continue
975
+
976
+ except Exception as e:
977
+ logger.debug(f"Search error: {str(e)}")
978
+ continue
979
+
980
+ print(f"\n[PROSPECT DISCOVERY] ========================================")
981
+ print(f"[PROSPECT DISCOVERY] DISCOVERY COMPLETE")
982
+ print(f"[PROSPECT DISCOVERY] ========================================")
983
+ print(f"[PROSPECT DISCOVERY] Prospects saved: {len(saved_prospects)}/{target_prospects}")
984
+ print(f"[PROSPECT DISCOVERY] Total contacts: {len(all_contacts)}")
985
+ print(f"[PROSPECT DISCOVERY] Companies checked: {companies_checked}")
986
+ print(f"[PROSPECT DISCOVERY] Companies skipped: {len(skipped_companies)}")
987
+ print(f"[PROSPECT DISCOVERY] ========================================\n")
988
+
989
+ return {
990
+ "status": "success" if len(saved_prospects) > 0 else "no_prospects_found",
991
+ "prospects": saved_prospects,
992
+ "prospects_count": len(saved_prospects),
993
+ "contacts_count": len(all_contacts),
994
+ "companies_checked": companies_checked,
995
+ "companies_skipped": len(skipped_companies),
996
+ "target_met": len(saved_prospects) >= target_prospects,
997
+ "message": f"Found {len(saved_prospects)} prospects with {len(all_contacts)} verified contacts. Checked {companies_checked} companies, skipped {len(skipped_companies)} (no contacts)."
998
+ }
999
+
1000
+ # ============ VERIFIED CONTACT FINDER (Single Company) ============
1001
+ elif tool_name == "find_verified_contacts":
1002
+ from services.enhanced_contact_finder import EnhancedContactFinder
1003
+
1004
+ company_name = tool_input["company_name"]
1005
+ company_domain = tool_input["company_domain"]
1006
+ target_titles = tool_input.get("target_titles", ["CEO", "Founder", "VP Sales", "CTO", "Head of Sales"])
1007
+ max_contacts = tool_input.get("max_contacts", 3)
1008
+
1009
+ logger.info(f"Finding verified contacts for {company_name} ({company_domain})")
1010
+
1011
+ contact_finder = EnhancedContactFinder(mcp_registry=self.mcp_registry)
1012
+
1013
+ try:
1014
+ contacts = await contact_finder.find_real_contacts(
1015
+ company_name=company_name,
1016
+ domain=company_domain,
1017
+ target_titles=target_titles,
1018
+ max_contacts=max_contacts
1019
+ )
1020
+
1021
+ contact_list = []
1022
+ for contact in contacts:
1023
+ contact_data = {
1024
+ "id": contact.id,
1025
+ "name": contact.name,
1026
+ "email": contact.email,
1027
+ "title": contact.title,
1028
+ "company": company_name,
1029
+ "domain": company_domain,
1030
+ "verified": True,
1031
+ "source": "web_search_and_scraping"
1032
+ }
1033
+ contact_list.append(contact_data)
1034
+
1035
+ await self.mcp_registry.store.save_contact({
1036
+ "id": contact.id,
1037
+ "company_id": company_domain.replace(".", "_"),
1038
+ "email": contact.email,
1039
+ "first_name": contact.name.split()[0] if contact.name else "",
1040
+ "last_name": contact.name.split()[-1] if contact.name and len(contact.name.split()) > 1 else "",
1041
+ "title": contact.title
1042
+ })
1043
+
1044
+ if contact_list:
1045
+ return {
1046
+ "status": "success",
1047
+ "contacts": contact_list,
1048
+ "count": len(contact_list),
1049
+ "message": f"Found {len(contact_list)} verified contacts at {company_name}",
1050
+ "should_save_prospect": True
1051
+ }
1052
+ else:
1053
+ return {
1054
+ "status": "no_contacts_found",
1055
+ "contacts": [],
1056
+ "count": 0,
1057
+ "message": f"No verified contacts found for {company_name}. Skip this prospect.",
1058
+ "should_save_prospect": False
1059
+ }
1060
+
1061
+ except Exception as e:
1062
+ logger.error(f"Error finding contacts for {company_name}: {str(e)}")
1063
+ return {
1064
+ "status": "error",
1065
+ "contacts": [],
1066
+ "count": 0,
1067
+ "message": f"Error searching for contacts: {str(e)}",
1068
+ "should_save_prospect": False
1069
+ }
1070
+
1071
+ # ============ STORE MCP SERVER ============
1072
+ elif tool_name == "save_prospect":
1073
+ prospect_data = {
1074
+ "id": tool_input.get("prospect_id", str(uuid.uuid4())),
1075
+ "company": {
1076
+ "id": tool_input.get("company_id"),
1077
+ "name": tool_input.get("company_name"),
1078
+ "domain": tool_input.get("company_domain")
1079
+ },
1080
+ "fit_score": tool_input.get("fit_score", 0),
1081
+ "status": tool_input.get("status", "new"),
1082
+ "metadata": tool_input.get("metadata", {})
1083
+ }
1084
+
1085
+ result = await self.mcp_registry.store.save_prospect(prospect_data)
1086
+ return {"status": result, "prospect_id": prospect_data["id"]}
1087
+
1088
+ elif tool_name == "get_prospect":
1089
+ prospect_id = tool_input["prospect_id"]
1090
+ prospect = await self.mcp_registry.store.get_prospect(prospect_id)
1091
+ return prospect or {"error": "Prospect not found"}
1092
+
1093
+ elif tool_name == "list_prospects":
1094
+ prospects = await self.mcp_registry.store.list_prospects()
1095
+ status_filter = tool_input.get("status")
1096
+
1097
+ if status_filter:
1098
+ prospects = [p for p in prospects if p.get("status") == status_filter]
1099
+
1100
+ return {
1101
+ "prospects": prospects,
1102
+ "count": len(prospects)
1103
+ }
1104
+
1105
+ elif tool_name == "save_company":
1106
+ company_data = {
1107
+ "id": tool_input.get("company_id", str(uuid.uuid4())),
1108
+ "name": tool_input["name"],
1109
+ "domain": tool_input["domain"],
1110
+ "industry": tool_input.get("industry"),
1111
+ "description": tool_input.get("description"),
1112
+ "employee_count": tool_input.get("employee_count")
1113
+ }
1114
+
1115
+ result = await self.mcp_registry.store.save_company(company_data)
1116
+ return {"status": result, "company_id": company_data["id"]}
1117
+
1118
+ elif tool_name == "get_company":
1119
+ company_id = tool_input["company_id"]
1120
+ company = await self.mcp_registry.store.get_company(company_id)
1121
+ return company or {"error": "Company not found"}
1122
+
1123
+ elif tool_name == "save_fact":
1124
+ fact_data = {
1125
+ "id": tool_input.get("fact_id", str(uuid.uuid4())),
1126
+ "company_id": tool_input["company_id"],
1127
+ "fact_type": tool_input["fact_type"],
1128
+ "content": tool_input["content"],
1129
+ "source_url": tool_input.get("source_url"),
1130
+ "confidence_score": tool_input.get("confidence_score", 0.8)
1131
+ }
1132
+
1133
+ result = await self.mcp_registry.store.save_fact(fact_data)
1134
+ return {"status": result, "fact_id": fact_data["id"]}
1135
+
1136
+ elif tool_name == "save_contact":
1137
+ contact_data = {
1138
+ "id": tool_input.get("contact_id", str(uuid.uuid4())),
1139
+ "company_id": tool_input["company_id"],
1140
+ "email": tool_input["email"],
1141
+ "first_name": tool_input.get("first_name"),
1142
+ "last_name": tool_input.get("last_name"),
1143
+ "title": tool_input.get("title"),
1144
+ "seniority": tool_input.get("seniority")
1145
+ }
1146
+
1147
+ result = await self.mcp_registry.store.save_contact(contact_data)
1148
+ return {"status": result, "contact_id": contact_data["id"]}
1149
+
1150
+ elif tool_name == "list_contacts_by_domain":
1151
+ domain = tool_input["domain"]
1152
+ contacts = await self.mcp_registry.store.list_contacts_by_domain(domain)
1153
+ return {
1154
+ "contacts": contacts,
1155
+ "count": len(contacts)
1156
+ }
1157
+
1158
+ elif tool_name == "check_suppression":
1159
+ supp_type = tool_input["suppression_type"]
1160
+ value = tool_input["value"]
1161
+
1162
+ is_suppressed = await self.mcp_registry.store.check_suppression(supp_type, value)
1163
+ return {
1164
+ "suppressed": is_suppressed,
1165
+ "value": value,
1166
+ "type": supp_type
1167
+ }
1168
+
1169
+ # ============ EMAIL MCP SERVER ============
1170
+ elif tool_name == "send_email":
1171
+ to = tool_input["to"]
1172
+ subject = tool_input["subject"]
1173
+ body = tool_input["body"]
1174
+ prospect_id = tool_input["prospect_id"]
1175
+
1176
+ thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
1177
+ return {
1178
+ "status": "sent",
1179
+ "thread_id": thread_id,
1180
+ "to": to
1181
+ }
1182
+
1183
+ elif tool_name == "get_email_thread":
1184
+ prospect_id = tool_input["prospect_id"]
1185
+ thread = await self.mcp_registry.email.get_thread(prospect_id)
1186
+ return thread or {"error": "No email thread found"}
1187
+
1188
+ # ============ CALENDAR MCP SERVER ============
1189
+ elif tool_name == "suggest_meeting_slots":
1190
+ num_slots = tool_input.get("num_slots", 3)
1191
+ slots = await self.mcp_registry.calendar.suggest_slots()
1192
+ return {
1193
+ "slots": slots[:num_slots],
1194
+ "count": len(slots[:num_slots])
1195
+ }
1196
+
1197
+ elif tool_name == "generate_calendar_invite":
1198
+ start_time = tool_input["start_time"]
1199
+ end_time = tool_input["end_time"]
1200
+ title = tool_input["title"]
1201
+
1202
+ slot = {
1203
+ "start_iso": start_time,
1204
+ "end_iso": end_time,
1205
+ "title": title
1206
+ }
1207
+
1208
+ ics = await self.mcp_registry.calendar.generate_ics(slot)
1209
+ return {
1210
+ "ics_content": ics,
1211
+ "meeting": slot
1212
+ }
1213
+
1214
+ else:
1215
+ raise ValueError(f"Unknown MCP tool: {tool_name}")
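To make the contract of these branches concrete, here is a small driver sketch. The method name `_execute_mcp_tool` and the `agent` instance are assumptions (this hunk shows only the branch bodies); the tool name and the result shape come from the `check_suppression` branch above.

```python
# Hypothetical driver for the dispatch branches above; `agent` and the
# method name _execute_mcp_tool are assumptions, not shown in this hunk.
import asyncio

async def demo(agent):
    result = await agent._execute_mcp_tool(
        "check_suppression",
        {"suppression_type": "email", "value": "jane@example.com"},
    )
    # Expected shape per the branch above:
    # {"suppressed": False, "value": "jane@example.com", "type": "email"}
    print(result)
```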
mcp/agents/autonomous_agent_ollama.py ADDED
@@ -0,0 +1,356 @@
1
+ """
2
+ Autonomous AI Agent with MCP Tool Calling using Ollama Python Client
3
+
4
+ Uses the ollama Python package for LLM inference.
5
+ Based on: https://github.com/ollama/ollama-python
6
+
7
+ Example usage (from the guide):
8
+ from ollama import chat
9
+ response = chat(
10
+ model='granite4:1b',
11
+ messages=[
12
+ {'role': 'system', 'content': 'You are a helpful assistant.'},
13
+ {'role': 'user', 'content': user_input}
14
+ ],
15
+ options={'temperature': 0.0, 'top_p': 1.0}
16
+ )
17
+ output = response.message.content
18
+ """
19
+
20
+ import os
21
+ import json
22
+ import uuid
23
+ import logging
24
+ import asyncio
25
+ from typing import List, Dict, Any, AsyncGenerator
26
+
27
+ from mcp.tools.definitions import MCP_TOOLS
28
+ from mcp.registry import MCPRegistry
29
+
30
+ logger = logging.getLogger(__name__)
31
+
32
+ # Default model - IBM Granite 4 1B
33
+ DEFAULT_MODEL = "granite4:1b"
34
+
35
+
36
+ class AutonomousMCPAgentOllama:
37
+ """
38
+ AI Agent using Ollama Python client (FREE local LLM)
39
+
40
+ Uses ollama.chat() directly as per the official documentation.
41
+ Temperature=0.0 and top_p=1.0 recommended for Granite family models.
42
+ """
43
+
44
+ def __init__(
45
+ self,
46
+ mcp_registry: MCPRegistry,
47
+ model: str = None
48
+ ):
49
+ self.mcp_registry = mcp_registry
50
+ self.model = model or os.getenv("OLLAMA_MODEL", DEFAULT_MODEL)
51
+ self.tools_description = self._build_tools_description()
52
+
53
+ logger.info(f"Ollama Agent initialized with model: {self.model}")
54
+
55
+ def _build_tools_description(self) -> str:
56
+ """Build tool descriptions for the system prompt"""
57
+ tools_text = ""
58
+ for tool in MCP_TOOLS:
59
+ tools_text += f"\n- **{tool['name']}**: {tool['description']}"
60
+ props = tool.get('input_schema', {}).get('properties', {})
61
+ required = tool.get('input_schema', {}).get('required', [])
62
+ if props:
63
+ tools_text += "\n Parameters:"
64
+ for param, details in props.items():
65
+ req = "(required)" if param in required else "(optional)"
66
+ tools_text += f"\n - {param} {req}: {details.get('description', '')}"
67
+ return tools_text
68
+
69
+ def _build_system_prompt(self) -> str:
70
+ return f"""You are an AI sales agent with access to tools.
71
+
72
+ AVAILABLE TOOLS:
73
+ {self.tools_description}
74
+
75
+ TO USE A TOOL, respond with JSON:
76
+ ```json
77
+ {{"tool": "tool_name", "parameters": {{"param1": "value1"}}}}
78
+ ```
79
+
80
+ RULES:
81
+ 1. Use search_web to find information
82
+ 2. Use save_prospect, save_contact to store data
83
+ 3. Use send_email to draft emails
84
+ 4. Say "DONE" when finished with a summary
85
+
86
+ Be concise."""
87
+
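To make the contract concrete, a minimal sketch of a model reply in the format this prompt requests, and what `_parse_tool_calls` (defined below) extracts from it; the query text is illustrative.

```python
# Illustrative reply in the shape the system prompt above requests.
sample = 'Searching now.\n```json\n{"tool": "search_web", "parameters": {"query": "Acme Corp"}}\n```'
# _parse_tool_calls(sample) returns:
# [{"tool": "search_web", "parameters": {"query": "Acme Corp"}}]
```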
88
+ async def run(self, task: str, max_iterations: int = 15) -> AsyncGenerator[Dict[str, Any], None]:
89
+ """Run the agent on a task"""
90
+
91
+ yield {
92
+ "type": "agent_start",
93
+ "message": f"Starting with Ollama ({self.model})",
94
+ "model": self.model
95
+ }
96
+
97
+ system_prompt = self._build_system_prompt()
98
+ messages = [
99
+ {"role": "system", "content": system_prompt},
100
+ {"role": "user", "content": task}
101
+ ]
102
+
103
+ for iteration in range(1, max_iterations + 1):
104
+ yield {
105
+ "type": "iteration_start",
106
+ "iteration": iteration,
107
+ "message": f"Iteration {iteration}: Thinking..."
108
+ }
109
+
110
+ try:
111
+ # Call Ollama using the Python client
112
+ response = await self._call_ollama(messages)
113
+ assistant_content = response.get("content", "")
114
+
115
+ if not assistant_content:
116
+ continue
117
+
118
+ # Check for completion
119
+ if "DONE" in assistant_content.upper():
120
+ final_text = assistant_content.replace("DONE", "").replace("Done", "").replace("done", "").strip()
121
+ yield {
122
+ "type": "thought",
123
+ "thought": final_text,
124
+ "message": "Task complete"
125
+ }
126
+ yield {
127
+ "type": "agent_complete",
128
+ "message": "Task complete!",
129
+ "final_answer": final_text,
130
+ "iterations": iteration
131
+ }
132
+ return
133
+
134
+ # Parse tool calls
135
+ tool_calls = self._parse_tool_calls(assistant_content)
136
+
137
+ if tool_calls:
138
+ messages.append({"role": "assistant", "content": assistant_content})
139
+ tool_results = []
140
+
141
+ for tool_call in tool_calls:
142
+ tool_name = tool_call.get("tool", "")
143
+ tool_params = tool_call.get("parameters", {})
144
+
145
+ yield {
146
+ "type": "tool_call",
147
+ "tool": tool_name,
148
+ "input": tool_params,
149
+ "message": f"Calling: {tool_name}"
150
+ }
151
+
152
+ try:
153
+ result = await self._execute_tool(tool_name, tool_params)
154
+ yield {
155
+ "type": "tool_result",
156
+ "tool": tool_name,
157
+ "result": result,
158
+ "message": f"{tool_name} completed"
159
+ }
160
+ tool_results.append({"tool": tool_name, "result": result})
161
+ except Exception as e:
162
+ yield {
163
+ "type": "tool_error",
164
+ "tool": tool_name,
165
+ "error": str(e)
166
+ }
167
+ tool_results.append({"tool": tool_name, "error": str(e)})
168
+
169
+ # Add results to conversation
170
+ results_text = "Tool results:\n" + json.dumps(tool_results, indent=2, default=str)[:2000]
171
+ messages.append({"role": "user", "content": results_text})
172
+ else:
173
+ # No tool calls
174
+ yield {
175
+ "type": "thought",
176
+ "thought": assistant_content,
177
+ "message": f"AI: {assistant_content[:100]}..."
178
+ }
179
+ messages.append({"role": "assistant", "content": assistant_content})
180
+ messages.append({"role": "user", "content": "Continue. Use tools to complete the task. Say DONE when finished."})
181
+
182
+ except Exception as e:
183
+ logger.error(f"Error: {e}")
184
+ yield {
185
+ "type": "agent_error",
186
+ "error": str(e),
187
+ "message": f"Error: {e}"
188
+ }
189
+ return
190
+
191
+ yield {
192
+ "type": "agent_max_iterations",
193
+ "message": f"Reached max iterations ({max_iterations})",
194
+ "iterations": max_iterations
195
+ }
196
+
197
+ async def _call_ollama(self, messages: List[Dict]) -> Dict:
198
+ """
199
+ Call Ollama using the official Python client.
200
+
201
+ Uses ollama.chat() directly as per the guide:
202
+ https://github.com/ollama/ollama-python
203
+
204
+ Temperature=0.0 and top_p=1.0 recommended for Granite models.
205
+ """
206
+ try:
207
+ from ollama import chat, ResponseError
208
+ except ImportError:
209
+ raise ImportError("ollama package not installed. Run: pip install ollama")
210
+
211
+ try:
212
+ # Use ollama.chat() directly as shown in the guide
213
+ # Run in executor to not block the async event loop
214
+ loop = asyncio.get_running_loop()  # preferred over get_event_loop() inside a coroutine
215
+ response = await loop.run_in_executor(
216
+ None,
217
+ lambda: chat(
218
+ model=self.model,
219
+ messages=messages,
220
+ options={
221
+ "temperature": 0.0, # Deterministic output for tool calling
222
+ "top_p": 1.0 # Full probability mass (Granite recommended)
223
+ }
224
+ )
225
+ )
226
+
227
+ # Extract response content: response.message.content
228
+ content = ""
229
+ if hasattr(response, 'message') and hasattr(response.message, 'content'):
230
+ content = response.message.content
231
+ elif isinstance(response, dict):
232
+ content = response.get("message", {}).get("content", "")
233
+
234
+ return {"content": content}
235
+
236
+ except ResponseError as e:
237
+ # Handle Ollama-specific errors (model not available, etc.)
238
+ logger.error(f"Ollama ResponseError: {e}")
239
+ raise Exception(f"Ollama error: {e}. Make sure Ollama is running and the model '{self.model}' is pulled.")
240
+ except Exception as e:
241
+ logger.error(f"Ollama call failed: {e}")
242
+ raise Exception(f"Ollama error: {e}")
243
+
244
+ def _parse_tool_calls(self, text: str) -> List[Dict]:
245
+ """Parse tool calls from response"""
246
+ import re
247
+
248
+ tool_calls = []
249
+ patterns = [
250
+ r'```json\s*(\{[^`]+\})\s*```',
251
+ r'```\s*(\{[^`]+\})\s*```',
252
+ r'(\{"tool":\s*"[^"]+",\s*"parameters":\s*\{[^}]*\}\})',
253
+ ]
254
+
255
+ for pattern in patterns:
256
+ matches = re.findall(pattern, text, re.DOTALL)
257
+ for match in matches:
258
+ try:
259
+ data = json.loads(match.strip())
260
+ if "tool" in data:
261
+ tool_calls.append(data)
262
+ except json.JSONDecodeError:
263
+ continue
264
+
265
+ return tool_calls
266
+
267
+ async def _execute_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
268
+ """Execute an MCP tool"""
269
+
270
+ if tool_name == "search_web":
271
+ query = tool_input.get("query", "")
272
+ max_results = tool_input.get("max_results", 5)
273
+ results = await self.mcp_registry.search.query(query, max_results=max_results)
274
+ return {"results": results[:max_results], "count": len(results[:max_results])}
275
+
276
+ elif tool_name == "search_news":
277
+ query = tool_input.get("query", "")
278
+ max_results = tool_input.get("max_results", 5)
279
+ results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
280
+ return {"results": results[:max_results], "count": len(results[:max_results])}
281
+
282
+ elif tool_name == "save_prospect":
283
+ prospect_data = {
284
+ "id": tool_input.get("prospect_id", str(uuid.uuid4())),
285
+ "company": {
286
+ "id": tool_input.get("company_id"),
287
+ "name": tool_input.get("company_name"),
288
+ "domain": tool_input.get("company_domain")
289
+ },
290
+ "fit_score": tool_input.get("fit_score", 0),
291
+ "status": tool_input.get("status", "new"),
292
+ "metadata": tool_input.get("metadata", {})
293
+ }
294
+ result = await self.mcp_registry.store.save_prospect(prospect_data)
295
+ return {"status": result, "prospect_id": prospect_data["id"]}
296
+
297
+ elif tool_name == "save_company":
298
+ company_data = {
299
+ "id": tool_input.get("company_id", str(uuid.uuid4())),
300
+ "name": tool_input.get("name", ""),
301
+ "domain": tool_input.get("domain", ""),
302
+ "industry": tool_input.get("industry"),
303
+ "description": tool_input.get("description"),
304
+ "employee_count": tool_input.get("employee_count")
305
+ }
306
+ result = await self.mcp_registry.store.save_company(company_data)
307
+ return {"status": result, "company_id": company_data["id"]}
308
+
309
+ elif tool_name == "save_contact":
310
+ contact_data = {
311
+ "id": tool_input.get("contact_id", str(uuid.uuid4())),
312
+ "company_id": tool_input.get("company_id", ""),
313
+ "email": tool_input.get("email", ""),
314
+ "first_name": tool_input.get("first_name"),
315
+ "last_name": tool_input.get("last_name"),
316
+ "title": tool_input.get("title"),
317
+ "seniority": tool_input.get("seniority")
318
+ }
319
+ result = await self.mcp_registry.store.save_contact(contact_data)
320
+ return {"status": result, "contact_id": contact_data["id"]}
321
+
322
+ elif tool_name == "save_fact":
323
+ fact_data = {
324
+ "id": tool_input.get("fact_id", str(uuid.uuid4())),
325
+ "company_id": tool_input.get("company_id", ""),
326
+ "fact_type": tool_input.get("fact_type", ""),
327
+ "content": tool_input.get("content", ""),
328
+ "source_url": tool_input.get("source_url"),
329
+ "confidence_score": tool_input.get("confidence_score", 0.8)
330
+ }
331
+ result = await self.mcp_registry.store.save_fact(fact_data)
332
+ return {"status": result, "fact_id": fact_data["id"]}
333
+
334
+ elif tool_name == "send_email":
335
+ to = tool_input.get("to", "")
336
+ subject = tool_input.get("subject", "")
337
+ body = tool_input.get("body", "")
338
+ prospect_id = tool_input.get("prospect_id", "")
339
+ thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
340
+ return {"status": "drafted", "thread_id": thread_id, "to": to}
341
+
342
+ elif tool_name == "list_prospects":
343
+ prospects = await self.mcp_registry.store.list_prospects()
344
+ return {"prospects": prospects, "count": len(prospects)}
345
+
346
+ elif tool_name == "get_prospect":
347
+ prospect_id = tool_input.get("prospect_id", "")
348
+ prospect = await self.mcp_registry.store.get_prospect(prospect_id)
349
+ return prospect or {"error": "Not found"}
350
+
351
+ elif tool_name == "suggest_meeting_slots":
352
+ slots = await self.mcp_registry.calendar.suggest_slots()
353
+ return {"slots": slots[:3], "count": len(slots[:3])}
354
+
355
+ else:
356
+ raise ValueError(f"Unknown tool: {tool_name}")
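A minimal driver sketch for this agent, assuming a configured `MCPRegistry` instance named `registry` and a local Ollama daemon with `granite4:1b` pulled; the task string is illustrative.

```python
import asyncio

from mcp.registry import MCPRegistry
from mcp.agents.autonomous_agent_ollama import AutonomousMCPAgentOllama

async def main(registry: MCPRegistry):
    agent = AutonomousMCPAgentOllama(mcp_registry=registry)
    async for event in agent.run("Find 3 prospects for an e-commerce platform"):
        print(event["type"], event.get("message", ""))

# asyncio.run(main(registry))  # registry construction is not shown in this hunk
```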
mcp/agents/autonomous_agent_transformers.py ADDED
@@ -0,0 +1,609 @@
1
+ """
2
+ Autonomous AI Agent with MCP Tool Calling using Local Transformers
3
+
4
+ This agent uses Hugging Face Transformers library to run models locally,
5
+ avoiding inference API delays and availability issues.
6
+
7
+ Uses Qwen3-0.6B for fast, local inference with tool calling support.
8
+ """
9
+
10
+ import os
11
+ import json
12
+ import uuid
13
+ import logging
14
+ import asyncio
15
+ import re
16
+ from typing import List, Dict, Any, AsyncGenerator, Optional
17
+
18
+ from mcp.tools.definitions import MCP_TOOLS, list_all_tools
19
+ from mcp.registry import MCPRegistry
20
+
21
+ logger = logging.getLogger(__name__)
22
+
23
+ # Default model - small but capable
24
+ DEFAULT_MODEL = "Qwen/Qwen3-0.6B"
25
+
26
+
27
+ class AutonomousMCPAgentTransformers:
28
+ """
29
+ AI Agent that autonomously uses MCP servers as tools using local Transformers.
30
+
31
+ Runs models locally for fast, reliable inference without API dependencies.
32
+ """
33
+
34
+ def __init__(
35
+ self,
36
+ mcp_registry: MCPRegistry,
37
+ model_name: str = None,
38
+ device: str = None
39
+ ):
40
+ """
41
+ Initialize the autonomous agent with local Transformers
42
+
43
+ Args:
44
+ mcp_registry: MCP registry with all servers
45
+ model_name: Model to use (default: Qwen/Qwen3-0.6B)
46
+ device: Device to run on ('cuda', 'cpu', or 'auto')
47
+ """
48
+ self.mcp_registry = mcp_registry
49
+ self.model_name = model_name or os.getenv("TRANSFORMERS_MODEL", DEFAULT_MODEL)
50
+ self.device = device or os.getenv("TRANSFORMERS_DEVICE", "auto")
51
+
52
+ # Lazy load model and tokenizer
53
+ self.pipeline = None
54
+ self.tokenizer = None
55
+ self.model = None
56
+ self._initialized = False
57
+
58
+ # Create tool definitions for the prompt
59
+ self.tools_description = self._create_tools_description()
60
+
61
+ logger.info(f"Autonomous MCP Agent (Transformers) initialized")
62
+ logger.info(f" Model: {self.model_name}")
63
+ logger.info(f" Device: {self.device}")
64
+ logger.info(f" Available tools: {len(MCP_TOOLS)}")
65
+
66
+ def _initialize_model(self):
67
+ """Lazy initialization of the model"""
68
+ if self._initialized:
69
+ return
70
+
71
+ logger.info(f"Loading model {self.model_name}...")
72
+
73
+ try:
74
+ from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
75
+ import torch
76
+
77
+ # Determine device
78
+ if self.device == "auto":
79
+ device = "cuda" if torch.cuda.is_available() else "cpu"
80
+ else:
81
+ device = self.device
82
+
83
+ logger.info(f"Using device: {device}")
84
+
85
+ # Load tokenizer
86
+ self.tokenizer = AutoTokenizer.from_pretrained(
87
+ self.model_name,
88
+ trust_remote_code=True
89
+ )
90
+
91
+ # Load model with appropriate settings
92
+ model_kwargs = {
93
+ "trust_remote_code": True,
94
+ }
95
+
96
+ if device == "cuda":
97
+ model_kwargs["torch_dtype"] = torch.float16
98
+ model_kwargs["device_map"] = "auto"
99
+ else:
100
+ model_kwargs["torch_dtype"] = torch.float32
101
+
102
+ self.model = AutoModelForCausalLM.from_pretrained(
103
+ self.model_name,
104
+ **model_kwargs
105
+ )
106
+
107
+ if device == "cpu":
108
+ self.model = self.model.to(device)
109
+
110
+ # Create pipeline for easier generation
111
+ self.pipeline = pipeline(
112
+ "text-generation",
113
+ model=self.model,
114
+ tokenizer=self.tokenizer,
115
+ device=None if device == "cuda" else device, # device_map handles cuda
116
+ )
117
+
118
+ self._initialized = True
119
+ logger.info(f"Model {self.model_name} loaded successfully")
120
+
121
+ except ImportError as e:
122
+ raise ImportError(
123
+ f"transformers package not installed or missing dependencies!\n"
124
+ f"Install with: pip install transformers torch\n"
125
+ f"Error: {e}"
126
+ )
127
+ except Exception as e:
128
+ logger.error(f"Failed to load model: {e}")
129
+ raise
130
+
131
+ def _create_tools_description(self) -> str:
132
+ """Create a description of available tools for the prompt"""
133
+ tools_text = "Available tools:\n\n"
134
+
135
+ for tool in MCP_TOOLS:
136
+ tools_text += f"- **{tool['name']}**: {tool['description']}\n"
137
+ if tool.get('input_schema', {}).get('properties'):
138
+ tools_text += " Parameters:\n"
139
+ for param, details in tool['input_schema']['properties'].items():
140
+ required = param in tool['input_schema'].get('required', [])
141
+ req_str = " (required)" if required else " (optional)"
142
+ tools_text += f" - {param}{req_str}: {details.get('description', '')}\n"
143
+ tools_text += "\n"
144
+
145
+ return tools_text
146
+
147
+ def _build_system_prompt(self) -> str:
148
+ """Build the system prompt with tool instructions"""
149
+ return f"""You are an autonomous AI agent for B2B sales automation.
150
+
151
+ You have access to MCP (Model Context Protocol) tools that let you:
152
+ - Search the web for company information and news
153
+ - Save prospects, companies, contacts, and facts to a database
154
+ - Send emails and manage email threads
155
+ - Schedule meetings and generate calendar invites
156
+
157
+ {self.tools_description}
158
+
159
+ To use a tool, respond with a JSON block in this exact format:
160
+ ```tool
161
+ {{"tool": "tool_name", "parameters": {{"param1": "value1", "param2": "value2"}}}}
162
+ ```
163
+
164
+ You can call multiple tools by including multiple tool blocks.
165
+
166
+ After using tools and gathering information, provide your final response.
167
+ When the task is complete, end with "TASK_COMPLETE" on a new line.
168
+
169
+ Be concise and efficient. Focus on completing the task."""
170
+
171
+ def _parse_tool_calls(self, response: str) -> List[Dict[str, Any]]:
172
+ """Parse tool calls from the model's response"""
173
+ tool_calls = []
174
+
175
+ # Pattern to match tool JSON blocks
176
+ # Match ```tool ... ``` or ```json ... ``` or just JSON objects with "tool" key
177
+ patterns = [
178
+ r'```tool\s*\n?(.*?)\n?```',
179
+ r'```json\s*\n?(.*?)\n?```',
180
+ r'\{"tool":\s*"[^"]+",\s*"parameters":\s*\{[^}]*\}\}',
181
+ ]
182
+
183
+ for pattern in patterns[:2]: # First two patterns use groups
184
+ matches = re.findall(pattern, response, re.DOTALL | re.IGNORECASE)
185
+ for match in matches:
186
+ try:
187
+ tool_data = json.loads(match.strip())
188
+ if "tool" in tool_data:
189
+ tool_calls.append(tool_data)
190
+ except json.JSONDecodeError:
191
+ continue
192
+
193
+ # Try direct JSON pattern
194
+ direct_matches = re.findall(patterns[2], response)
195
+ for match in direct_matches:
196
+ try:
197
+ tool_data = json.loads(match)
198
+ if tool_data not in tool_calls: # Avoid duplicates
199
+ tool_calls.append(tool_data)
200
+ except json.JSONDecodeError:
201
+ continue
202
+
203
+ return tool_calls
204
+
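As with the Ollama agent, a sketch of a reply using the `tool`-tagged fence this agent's prompt specifies, and the structure the parser above returns; company details are illustrative.

```python
# Illustrative reply using the ```tool fence requested by the system prompt.
sample = 'On it.\n```tool\n{"tool": "save_company", "parameters": {"name": "Acme", "domain": "acme.com"}}\n```'
# _parse_tool_calls(sample) returns (the duplicate-check above collapses
# the overlapping direct-JSON match):
# [{"tool": "save_company", "parameters": {"name": "Acme", "domain": "acme.com"}}]
```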
205
+ def _generate_response(self, messages: List[Dict[str, str]], max_new_tokens: int = 512) -> str:
206
+ """Generate a response from the model"""
207
+ self._initialize_model()
208
+
209
+ try:
210
+ # Apply chat template
211
+ inputs = self.tokenizer.apply_chat_template(
212
+ messages,
213
+ add_generation_prompt=True,
214
+ tokenize=True,
215
+ return_dict=True,
216
+ return_tensors="pt",
217
+ )
218
+
219
+ # Move to model device
220
+ if hasattr(self.model, 'device'):
221
+ inputs = {k: v.to(self.model.device) for k, v in inputs.items()}
222
+
223
+ # Generate
224
+ outputs = self.model.generate(
225
+ **inputs,
226
+ max_new_tokens=max_new_tokens,
227
+ do_sample=True,
228
+ temperature=0.7,
229
+ top_p=0.9,
230
+ pad_token_id=self.tokenizer.eos_token_id,
231
+ )
232
+
233
+ # Decode only the new tokens
234
+ input_length = inputs["input_ids"].shape[-1]
235
+ response = self.tokenizer.decode(
236
+ outputs[0][input_length:],
237
+ skip_special_tokens=True
238
+ )
239
+
240
+ return response.strip()
241
+
242
+ except Exception as e:
243
+ logger.error(f"Generation error: {e}")
244
+ raise
245
+
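The generation path used here (chat template, generate, decode only the new tokens) can be exercised standalone; a minimal sketch, assuming the default Qwen/Qwen3-0.6B checkpoint can be downloaded.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", trust_remote_code=True)

msgs = [{"role": "user", "content": "Reply with one word: ready?"}]
inputs = tok.apply_chat_template(
    msgs, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
)
out = model.generate(**inputs, max_new_tokens=16, pad_token_id=tok.eos_token_id)
# Decode only the newly generated tokens, as _generate_response does above.
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```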
246
+ async def run(
247
+ self,
248
+ task: str,
249
+ max_iterations: int = 10
250
+ ) -> AsyncGenerator[Dict[str, Any], None]:
251
+ """
252
+ Run the agent autonomously on a task.
253
+
254
+ Args:
255
+ task: The task to complete
256
+ max_iterations: Maximum tool calls to prevent infinite loops
257
+
258
+ Yields:
259
+ Events showing agent's progress and tool calls
260
+ """
261
+
262
+ yield {
263
+ "type": "agent_start",
264
+ "message": f"Autonomous AI Agent (Transformers) starting task",
265
+ "task": task,
266
+ "model": self.model_name
267
+ }
268
+
269
+ # Initialize model (lazy load)
270
+ try:
271
+ self._initialize_model()
272
+ yield {
273
+ "type": "model_loaded",
274
+ "message": f"Model {self.model_name} ready"
275
+ }
276
+ except Exception as e:
277
+ yield {
278
+ "type": "agent_error",
279
+ "error": str(e),
280
+ "message": f"Failed to load model: {e}"
281
+ }
282
+ return
283
+
284
+ # Build conversation
285
+ system_prompt = self._build_system_prompt()
286
+ messages = [
287
+ {"role": "system", "content": system_prompt},
288
+ {"role": "user", "content": task}
289
+ ]
290
+
291
+ iteration = 0
292
+ accumulated_results = []
293
+
294
+ while iteration < max_iterations:
295
+ iteration += 1
296
+
297
+ yield {
298
+ "type": "iteration_start",
299
+ "iteration": iteration,
300
+ "message": f"Iteration {iteration}: Thinking..."
301
+ }
302
+
303
+ try:
304
+ # Generate response
305
+ response = await asyncio.get_running_loop().run_in_executor(
306
+ None,
307
+ self._generate_response,
308
+ messages,
309
+ 512
310
+ )
311
+
312
+ logger.info(f"Model response (iteration {iteration}): {response[:200]}...")
313
+
314
+ # Check for task completion
315
+ if "TASK_COMPLETE" in response:
316
+ # Extract final answer (everything before TASK_COMPLETE)
317
+ final_answer = response.replace("TASK_COMPLETE", "").strip()
318
+
319
+ yield {
320
+ "type": "thought",
321
+ "thought": final_answer,
322
+ "message": f"AI Response: {final_answer[:100]}..."
323
+ }
324
+
325
+ yield {
326
+ "type": "agent_complete",
327
+ "message": "Task complete!",
328
+ "final_answer": final_answer,
329
+ "iterations": iteration
330
+ }
331
+ return
332
+
333
+ # Parse tool calls
334
+ tool_calls = self._parse_tool_calls(response)
335
+
336
+ if tool_calls:
337
+ # Execute each tool call
338
+ tool_results = []
339
+
340
+ for tool_call in tool_calls:
341
+ tool_name = tool_call.get("tool", "")
342
+ tool_params = tool_call.get("parameters", {})
343
+
344
+ yield {
345
+ "type": "tool_call",
346
+ "tool": tool_name,
347
+ "input": tool_params,
348
+ "message": f"Action: {tool_name}"
349
+ }
350
+
351
+ try:
352
+ result = await self._execute_mcp_tool(tool_name, tool_params)
353
+
354
+ yield {
355
+ "type": "tool_result",
356
+ "tool": tool_name,
357
+ "result": result,
358
+ "message": f"Tool {tool_name} completed"
359
+ }
360
+
361
+ tool_results.append({
362
+ "tool": tool_name,
363
+ "result": result
364
+ })
365
+ accumulated_results.append({
366
+ "tool": tool_name,
367
+ "params": tool_params,
368
+ "result": result
369
+ })
370
+
371
+ except Exception as e:
372
+ error_msg = str(e)
373
+ logger.error(f"Tool execution failed: {tool_name} - {error_msg}")
374
+
375
+ yield {
376
+ "type": "tool_error",
377
+ "tool": tool_name,
378
+ "error": error_msg,
379
+ "message": f"Tool {tool_name} failed: {error_msg}"
380
+ }
381
+
382
+ tool_results.append({
383
+ "tool": tool_name,
384
+ "error": error_msg
385
+ })
386
+
387
+ # Add assistant response and tool results to conversation
388
+ messages.append({"role": "assistant", "content": response})
389
+
390
+ # Format tool results for the model
391
+ results_text = "Tool results:\n"
392
+ for tr in tool_results:
393
+ if "error" in tr:
394
+ results_text += f"- {tr['tool']}: Error - {tr['error']}\n"
395
+ else:
396
+ result_str = json.dumps(tr['result'], default=str)[:500]
397
+ results_text += f"- {tr['tool']}: {result_str}\n"
398
+
399
+ messages.append({"role": "user", "content": results_text})
400
+
401
+ else:
402
+ # No tool calls - this might be a thought or partial response
403
+ yield {
404
+ "type": "thought",
405
+ "thought": response,
406
+ "message": f"AI Response: {response[:100]}..."
407
+ }
408
+
409
+ # Add to conversation and prompt for continuation
410
+ messages.append({"role": "assistant", "content": response})
411
+ messages.append({
412
+ "role": "user",
413
+ "content": "Continue with the task. Use the available tools to gather information and complete the task. When done, say TASK_COMPLETE."
414
+ })
415
+
416
+ except Exception as e:
417
+ error_msg = str(e)
418
+ logger.error(f"Error in iteration {iteration}: {error_msg}", exc_info=True)
419
+
420
+ yield {
421
+ "type": "agent_error",
422
+ "error": error_msg,
423
+ "message": f"Error: {error_msg}"
424
+ }
425
+
426
+ # Try to continue if we have results
427
+ if accumulated_results:
428
+ break
429
+ return
430
+
431
+ # Max iterations reached
432
+ yield {
433
+ "type": "agent_max_iterations",
434
+ "message": f"Reached maximum iterations ({max_iterations})",
435
+ "iterations": iteration,
436
+ "accumulated_results": accumulated_results
437
+ }
438
+
439
+ async def _execute_mcp_tool(self, tool_name: str, tool_input: Dict[str, Any]) -> Any:
440
+ """
441
+ Execute an MCP tool by routing to the appropriate MCP server.
442
+ """
443
+
444
+ # ============ SEARCH MCP SERVER ============
445
+ if tool_name == "search_web":
446
+ query = tool_input.get("query", "")
447
+ max_results = tool_input.get("max_results", 5)
448
+
449
+ results = await self.mcp_registry.search.query(query, max_results=max_results)
450
+ return {
451
+ "results": results[:max_results],
452
+ "count": len(results[:max_results])
453
+ }
454
+
455
+ elif tool_name == "search_news":
456
+ query = tool_input.get("query", "")
457
+ max_results = tool_input.get("max_results", 5)
458
+
459
+ results = await self.mcp_registry.search.query(f"{query} news", max_results=max_results)
460
+ return {
461
+ "results": results[:max_results],
462
+ "count": len(results[:max_results])
463
+ }
464
+
465
+ # ============ STORE MCP SERVER ============
466
+ elif tool_name == "save_prospect":
467
+ prospect_data = {
468
+ "id": tool_input.get("prospect_id", str(uuid.uuid4())),
469
+ "company": {
470
+ "id": tool_input.get("company_id"),
471
+ "name": tool_input.get("company_name"),
472
+ "domain": tool_input.get("company_domain")
473
+ },
474
+ "fit_score": tool_input.get("fit_score", 0),
475
+ "status": tool_input.get("status", "new"),
476
+ "metadata": tool_input.get("metadata", {})
477
+ }
478
+
479
+ result = await self.mcp_registry.store.save_prospect(prospect_data)
480
+ return {"status": result, "prospect_id": prospect_data["id"]}
481
+
482
+ elif tool_name == "get_prospect":
483
+ prospect_id = tool_input.get("prospect_id", "")
484
+ prospect = await self.mcp_registry.store.get_prospect(prospect_id)
485
+ return prospect or {"error": "Prospect not found"}
486
+
487
+ elif tool_name == "list_prospects":
488
+ prospects = await self.mcp_registry.store.list_prospects()
489
+ status_filter = tool_input.get("status")
490
+
491
+ if status_filter:
492
+ prospects = [p for p in prospects if p.get("status") == status_filter]
493
+
494
+ return {
495
+ "prospects": prospects,
496
+ "count": len(prospects)
497
+ }
498
+
499
+ elif tool_name == "save_company":
500
+ company_data = {
501
+ "id": tool_input.get("company_id", str(uuid.uuid4())),
502
+ "name": tool_input.get("name", ""),
503
+ "domain": tool_input.get("domain", ""),
504
+ "industry": tool_input.get("industry"),
505
+ "description": tool_input.get("description"),
506
+ "employee_count": tool_input.get("employee_count")
507
+ }
508
+
509
+ result = await self.mcp_registry.store.save_company(company_data)
510
+ return {"status": result, "company_id": company_data["id"]}
511
+
512
+ elif tool_name == "get_company":
513
+ company_id = tool_input.get("company_id", "")
514
+ company = await self.mcp_registry.store.get_company(company_id)
515
+ return company or {"error": "Company not found"}
516
+
517
+ elif tool_name == "save_fact":
518
+ fact_data = {
519
+ "id": tool_input.get("fact_id", str(uuid.uuid4())),
520
+ "company_id": tool_input.get("company_id", ""),
521
+ "fact_type": tool_input.get("fact_type", ""),
522
+ "content": tool_input.get("content", ""),
523
+ "source_url": tool_input.get("source_url"),
524
+ "confidence_score": tool_input.get("confidence_score", 0.8)
525
+ }
526
+
527
+ result = await self.mcp_registry.store.save_fact(fact_data)
528
+ return {"status": result, "fact_id": fact_data["id"]}
529
+
530
+ elif tool_name == "save_contact":
531
+ contact_data = {
532
+ "id": tool_input.get("contact_id", str(uuid.uuid4())),
533
+ "company_id": tool_input.get("company_id", ""),
534
+ "email": tool_input.get("email", ""),
535
+ "first_name": tool_input.get("first_name"),
536
+ "last_name": tool_input.get("last_name"),
537
+ "title": tool_input.get("title"),
538
+ "seniority": tool_input.get("seniority")
539
+ }
540
+
541
+ result = await self.mcp_registry.store.save_contact(contact_data)
542
+ return {"status": result, "contact_id": contact_data["id"]}
543
+
544
+ elif tool_name == "list_contacts_by_domain":
545
+ domain = tool_input.get("domain", "")
546
+ contacts = await self.mcp_registry.store.list_contacts_by_domain(domain)
547
+ return {
548
+ "contacts": contacts,
549
+ "count": len(contacts)
550
+ }
551
+
552
+ elif tool_name == "check_suppression":
553
+ supp_type = tool_input.get("suppression_type", "email")
554
+ value = tool_input.get("value", "")
555
+
556
+ is_suppressed = await self.mcp_registry.store.check_suppression(supp_type, value)
557
+ return {
558
+ "suppressed": is_suppressed,
559
+ "value": value,
560
+ "type": supp_type
561
+ }
562
+
563
+ # ============ EMAIL MCP SERVER ============
564
+ elif tool_name == "send_email":
565
+ to = tool_input.get("to", "")
566
+ subject = tool_input.get("subject", "")
567
+ body = tool_input.get("body", "")
568
+ prospect_id = tool_input.get("prospect_id", "")
569
+
570
+ thread_id = await self.mcp_registry.email.send(to, subject, body, prospect_id)
571
+ return {
572
+ "status": "sent",
573
+ "thread_id": thread_id,
574
+ "to": to
575
+ }
576
+
577
+ elif tool_name == "get_email_thread":
578
+ prospect_id = tool_input.get("prospect_id", "")
579
+ thread = await self.mcp_registry.email.get_thread(prospect_id)
580
+ return thread or {"error": "No email thread found"}
581
+
582
+ # ============ CALENDAR MCP SERVER ============
583
+ elif tool_name == "suggest_meeting_slots":
584
+ num_slots = tool_input.get("num_slots", 3)
585
+ slots = await self.mcp_registry.calendar.suggest_slots()
586
+ return {
587
+ "slots": slots[:num_slots],
588
+ "count": len(slots[:num_slots])
589
+ }
590
+
591
+ elif tool_name == "generate_calendar_invite":
592
+ start_time = tool_input.get("start_time", "")
593
+ end_time = tool_input.get("end_time", "")
594
+ title = tool_input.get("title", "Meeting")
595
+
596
+ slot = {
597
+ "start_iso": start_time,
598
+ "end_iso": end_time,
599
+ "title": title
600
+ }
601
+
602
+ ics = await self.mcp_registry.calendar.generate_ics(slot)
603
+ return {
604
+ "ics_content": ics,
605
+ "meeting": slot
606
+ }
607
+
608
+ else:
609
+ raise ValueError(f"Unknown MCP tool: {tool_name}")
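A minimal driver sketch, mirroring the Ollama variant; it assumes a configured `MCPRegistry` named `registry`, and the first call will be slow on CPU while the checkpoint downloads.

```python
import asyncio

from mcp.registry import MCPRegistry
from mcp.agents.autonomous_agent_transformers import AutonomousMCPAgentTransformers

async def demo(registry: MCPRegistry):
    agent = AutonomousMCPAgentTransformers(mcp_registry=registry, device="cpu")
    async for event in agent.run("Research Acme Corp and save it as a company", max_iterations=5):
        if event["type"] in ("tool_call", "tool_result", "agent_complete", "agent_error"):
            print(event)

# asyncio.run(demo(registry))  # registry wiring is not shown in this hunk
```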
mcp/auth/__init__.py ADDED
@@ -0,0 +1,40 @@
1
+ """
2
+ Enterprise Authentication and Authorization Module for MCP Servers
3
+
4
+ Provides:
5
+ - API key authentication
6
+ - Request signing
7
+ - Rate limiting
8
+ - RBAC (Role-Based Access Control)
9
+ """
10
+
11
+ from .api_key_auth import (
12
+ APIKey,
13
+ APIKeyManager,
14
+ APIKeyAuthMiddleware,
15
+ RequestSigningAuth,
16
+ get_key_manager
17
+ )
18
+
19
+ from .rate_limiter import (
20
+ TokenBucket,
21
+ RateLimiter,
22
+ RateLimitMiddleware,
23
+ RedisRateLimiter,
24
+ get_rate_limiter
25
+ )
26
+
27
+ __all__ = [
28
+ # API Key Auth
29
+ 'APIKey',
30
+ 'APIKeyManager',
31
+ 'APIKeyAuthMiddleware',
32
+ 'RequestSigningAuth',
33
+ 'get_key_manager',
34
+ # Rate Limiting
35
+ 'TokenBucket',
36
+ 'RateLimiter',
37
+ 'RateLimitMiddleware',
38
+ 'RedisRateLimiter',
39
+ 'get_rate_limiter',
40
+ ]
mcp/auth/api_key_auth.py ADDED
@@ -0,0 +1,377 @@
1
+ """
2
+ Enterprise API Key Authentication System for MCP Servers
3
+
4
+ Features:
5
+ - API key generation and validation
6
+ - Key rotation support
7
+ - Expiry and rate limiting per key
8
+ - Audit logging of authentication attempts
9
+ - Multiple authentication methods (header, query param)
10
+ """
11
+ import os
12
+ import secrets
13
+ import hashlib
14
+ import hmac
15
+ import logging
16
+ from typing import Optional, Dict, Set, Tuple
17
+ from datetime import datetime, timedelta
18
+ from dataclasses import dataclass
19
+ from aiohttp import web
20
+
21
+ logger = logging.getLogger(__name__)
22
+
23
+
24
+ @dataclass
25
+ class APIKey:
26
+ """API Key with metadata"""
27
+ key_id: str
28
+ key_hash: str # Hashed version of the key
29
+ name: str
30
+ tenant_id: Optional[str] = None
31
+ created_at: datetime = None
32
+ expires_at: Optional[datetime] = None
33
+ is_active: bool = True
34
+ permissions: Set[str] = None
35
+ rate_limit: int = 100 # requests per minute
36
+ metadata: Dict = None
37
+
38
+ def __post_init__(self):
39
+ if self.created_at is None:
40
+ self.created_at = datetime.utcnow()
41
+ if self.permissions is None:
42
+ self.permissions = set()
43
+ if self.metadata is None:
44
+ self.metadata = {}
45
+
46
+ def is_expired(self) -> bool:
47
+ """Check if key is expired"""
48
+ if self.expires_at is None:
49
+ return False
50
+ return datetime.utcnow() > self.expires_at
51
+
52
+ def is_valid(self) -> bool:
53
+ """Check if key is valid"""
54
+ return self.is_active and not self.is_expired()
55
+
56
+
57
+ class APIKeyManager:
58
+ """
59
+ API Key Manager with secure key storage and validation
60
+ """
61
+
62
+ def __init__(self):
63
+ self.keys: Dict[str, APIKey] = {}
64
+ self._load_keys_from_env()
65
+ logger.info(f"API Key Manager initialized with {len(self.keys)} keys")
66
+
67
+ def _load_keys_from_env(self):
68
+ """Load API keys from environment variables"""
69
+ # Primary API key
70
+ primary_key = os.getenv("MCP_API_KEY")
71
+ if primary_key:
72
+ key_id = "primary"
73
+ key_hash = self._hash_key(primary_key)
74
+ self.keys[key_hash] = APIKey(
75
+ key_id=key_id,
76
+ key_hash=key_hash,
77
+ name="Primary API Key",
78
+ is_active=True,
79
+ permissions={"*"}, # All permissions
80
+ rate_limit=1000
81
+ )
82
+ logger.info("Loaded primary API key from environment")
83
+
84
+ # Additional keys (comma-separated)
85
+ additional_keys = os.getenv("MCP_API_KEYS", "")
86
+ if additional_keys:
87
+ for idx, key in enumerate(additional_keys.split(",")):
88
+ key = key.strip()
89
+ if key:
90
+ key_id = f"key_{idx + 1}"
91
+ key_hash = self._hash_key(key)
92
+ self.keys[key_hash] = APIKey(
93
+ key_id=key_id,
94
+ key_hash=key_hash,
95
+ name=f"API Key {idx + 1}",
96
+ is_active=True,
97
+ permissions={"*"},
98
+ rate_limit=100
99
+ )
100
+ logger.info(f"Loaded {len(additional_keys.split(','))} additional API keys")
101
+
102
+ @staticmethod
103
+ def generate_api_key() -> str:
104
+ """
105
+ Generate a secure API key
106
+ Format: mcp_<64-char-hex> (32 random bytes, hex-encoded)
107
+ """
108
+ random_bytes = secrets.token_bytes(32)
109
+ key_hex = random_bytes.hex()
110
+ return f"mcp_{key_hex}"
111
+
112
+ @staticmethod
113
+ def _hash_key(key: str) -> str:
114
+ """Hash an API key using SHA-256"""
115
+ return hashlib.sha256(key.encode()).hexdigest()
116
+
117
+ def create_key(
118
+ self,
119
+ name: str,
120
+ tenant_id: Optional[str] = None,
121
+ expires_in_days: Optional[int] = None,
122
+ permissions: Set[str] = None,
123
+ rate_limit: int = 100
124
+ ) -> Tuple[str, APIKey]:
125
+ """
126
+ Create a new API key
127
+
128
+ Returns:
129
+ Tuple of (plain_key, api_key_object)
130
+ """
131
+ plain_key = self.generate_api_key()
132
+ key_hash = self._hash_key(plain_key)
133
+
134
+ expires_at = None
135
+ if expires_in_days:
136
+ expires_at = datetime.utcnow() + timedelta(days=expires_in_days)
137
+
138
+ api_key = APIKey(
139
+ key_id=f"key_{len(self.keys) + 1}",
140
+ key_hash=key_hash,
141
+ name=name,
142
+ tenant_id=tenant_id,
143
+ expires_at=expires_at,
144
+ permissions=permissions or {"*"},
145
+ rate_limit=rate_limit
146
+ )
147
+
148
+ self.keys[key_hash] = api_key
149
+ logger.info(f"Created new API key: {api_key.key_id} for {name}")
150
+
151
+ return plain_key, api_key
152
+
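A round-trip sketch of the key lifecycle defined above; note that only the SHA-256 hash is stored, so the plain key is available exactly once, at creation.

```python
mgr = APIKeyManager()
plain, key_obj = mgr.create_key(name="demo", expires_in_days=30, rate_limit=50)

assert plain.startswith("mcp_") and len(plain) == 4 + 64  # "mcp_" + 64 hex chars
assert mgr.validate_key(plain) is key_obj                 # lookup is by SHA-256 hash
assert mgr.validate_key("mcp_" + "0" * 64) is None        # unknown keys are rejected

mgr.revoke_key(key_obj.key_hash)
assert mgr.validate_key(plain) is None                    # revoked keys fail validation
```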
153
+ def validate_key(self, plain_key: str) -> Optional[APIKey]:
154
+ """
155
+ Validate an API key
156
+
157
+ Returns:
158
+ APIKey object if valid, None otherwise
159
+ """
160
+ if not plain_key:
161
+ return None
162
+
163
+ key_hash = self._hash_key(plain_key)
164
+ api_key = self.keys.get(key_hash)
165
+
166
+ if not api_key:
167
+ logger.warning("Invalid API key provided")
168
+ return None
169
+
170
+ if not api_key.is_valid():
171
+ logger.warning(f"Expired or inactive API key: {api_key.key_id}")
172
+ return None
173
+
174
+ return api_key
175
+
176
+ def revoke_key(self, key_hash: str):
177
+ """Revoke an API key"""
178
+ if key_hash in self.keys:
179
+ self.keys[key_hash].is_active = False
180
+ logger.info(f"Revoked API key: {self.keys[key_hash].key_id}")
181
+
182
+ def list_keys(self) -> list[APIKey]:
183
+ """List all API keys"""
184
+ return list(self.keys.values())
185
+
186
+
187
+ class APIKeyAuthMiddleware:
188
+ """
189
+ aiohttp middleware for API key authentication
190
+ """
191
+
192
+ def __init__(self, key_manager: APIKeyManager, exempt_paths: Set[str] = None):
193
+ self.key_manager = key_manager
194
+ self.exempt_paths = exempt_paths or {"/health", "/metrics"}
195
+ logger.info("API Key Auth Middleware initialized")
196
+
197
+ @web.middleware
198
+ async def middleware(self, request: web.Request, handler):
199
+ """Middleware handler"""
200
+
201
+ # Skip authentication for exempt paths
202
+ if request.path in self.exempt_paths:
203
+ return await handler(request)
204
+
205
+ # Extract API key from request
206
+ api_key = self._extract_api_key(request)
207
+
208
+ if not api_key:
209
+ logger.warning(f"No API key provided for {request.path}")
210
+ return web.json_response(
211
+ {"error": "Authentication required", "message": "API key missing"},
212
+ status=401
213
+ )
214
+
215
+ # Validate API key
216
+ key_obj = self.key_manager.validate_key(api_key)
217
+
218
+ if not key_obj:
219
+ logger.warning(f"Invalid API key for {request.path}")
220
+ return web.json_response(
221
+ {"error": "Authentication failed", "message": "Invalid or expired API key"},
222
+ status=401
223
+ )
224
+
225
+ # Check permissions (if needed)
226
+ # TODO: Implement permission checking based on request path
227
+
228
+ # Attach key info to request for downstream use
229
+ request["api_key"] = key_obj
230
+ request["tenant_id"] = key_obj.tenant_id
231
+
232
+ logger.debug(f"Authenticated request: {request.path} with key {key_obj.key_id}")
233
+
234
+ return await handler(request)
235
+
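A wiring sketch for this middleware on an aiohttp application; the route, handler, and port are illustrative.

```python
from aiohttp import web

auth = APIKeyAuthMiddleware(key_manager=get_key_manager())
app = web.Application(middlewares=[auth.middleware])

async def whoami(request: web.Request) -> web.Response:
    key = request["api_key"]  # attached by the middleware above
    return web.json_response({"key_id": key.key_id, "tenant_id": request["tenant_id"]})

app.router.add_get("/whoami", whoami)
# web.run_app(app, port=9004)  # port illustrative
```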
236
+ def _extract_api_key(self, request: web.Request) -> Optional[str]:
237
+ """
238
+ Extract API key from request
239
+
240
+ Supports:
241
+ - X-API-Key header
242
+ - Authorization: Bearer <key> header
243
+ - api_key query parameter
244
+ """
245
+ # Try X-API-Key header
246
+ api_key = request.headers.get("X-API-Key")
247
+ if api_key:
248
+ return api_key
249
+
250
+ # Try Authorization: Bearer header
251
+ auth_header = request.headers.get("Authorization")
252
+ if auth_header and auth_header.startswith("Bearer "):
253
+ return auth_header[7:] # Remove "Bearer " prefix
254
+
255
+ # Try query parameter (less secure, should be avoided in production)
256
+ api_key = request.query.get("api_key")
257
+ if api_key:
258
+ logger.warning("API key provided via query parameter (insecure)")
259
+ return api_key
260
+
261
+ return None
262
+
263
+
264
+ class RequestSigningAuth:
265
+ """
266
+ Request signing authentication using HMAC
267
+ More secure than API keys alone
268
+ """
269
+
270
+ def __init__(self, secret_key: Optional[str] = None):
271
+ self.secret_key = secret_key or os.getenv("MCP_SECRET_KEY", "")
272
+ if not self.secret_key:
273
+ logger.warning("No secret key provided for request signing")
274
+
275
+ def sign_request(self, method: str, path: str, body: str, timestamp: str) -> str:
276
+ """
277
+ Sign a request using HMAC-SHA256
278
+
279
+ Args:
280
+ method: HTTP method (GET, POST, etc.)
281
+ path: Request path
282
+ body: Request body (JSON string)
283
+ timestamp: ISO timestamp
284
+
285
+ Returns:
286
+ HMAC signature (hex string)
287
+ """
288
+ message = f"{method}|{path}|{body}|{timestamp}"
289
+ signature = hmac.new(
290
+ self.secret_key.encode(),
291
+ message.encode(),
292
+ hashlib.sha256
293
+ ).hexdigest()
294
+ return signature
295
+
296
+ def verify_signature(
297
+ self,
298
+ method: str,
299
+ path: str,
300
+ body: str,
301
+ timestamp: str,
302
+ signature: str
303
+ ) -> bool:
304
+ """
305
+ Verify request signature
306
+
307
+ Returns:
308
+ True if signature is valid, False otherwise
309
+ """
310
+ # Check timestamp (prevent replay attacks)
311
+ try:
312
+ request_time = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
313
+ time_diff = abs((datetime.utcnow() - request_time).total_seconds())
314
+
315
+ # Reject requests older than 5 minutes
316
+ if time_diff > 300:
317
+ logger.warning(f"Request timestamp too old: {time_diff}s")
318
+ return False
319
+ except Exception as e:
320
+ logger.error(f"Invalid timestamp format: {e}")
321
+ return False
322
+
323
+ # Verify signature
324
+ expected_signature = self.sign_request(method, path, body, timestamp)
325
+ return hmac.compare_digest(expected_signature, signature)
326
+
327
+ @web.middleware
328
+ async def middleware(self, request: web.Request, handler):
329
+ """Middleware for request signing verification"""
330
+
331
+ # Skip health check and metrics
332
+ if request.path in {"/health", "/metrics"}:
333
+ return await handler(request)
334
+
335
+ # Extract signature components
336
+ signature = request.headers.get("X-Signature")
337
+ timestamp = request.headers.get("X-Timestamp")
338
+
339
+ if not signature or not timestamp:
340
+ return web.json_response(
341
+ {"error": "Missing signature or timestamp"},
342
+ status=401
343
+ )
344
+
345
+ # Get request body
346
+ body = ""
347
+ if request.can_read_body:
348
+ body_bytes = await request.read()
349
+ body = body_bytes.decode()
350
+
351
+ # Verify signature
352
+ if not self.verify_signature(
353
+ request.method,
354
+ request.path,
355
+ body,
356
+ timestamp,
357
+ signature
358
+ ):
359
+ logger.warning(f"Invalid signature for {request.path}")
360
+ return web.json_response(
361
+ {"error": "Invalid signature"},
362
+ status=401
363
+ )
364
+
365
+ return await handler(request)
366
+
367
+
368
+ # Global key manager instance
369
+ _key_manager: Optional[APIKeyManager] = None
370
+
371
+
372
+ def get_key_manager() -> APIKeyManager:
373
+ """Get or create the global API key manager"""
374
+ global _key_manager
375
+ if _key_manager is None:
376
+ _key_manager = APIKeyManager()
377
+ return _key_manager
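For illustration, here is a minimal sketch of how the pieces above wire into a plain aiohttp app. It assumes create_key(...) accepts the name, tenant_id, and rate_limit arguments its body references (the full signature appears earlier in this file), and that the bound middleware method is passed directly to the application; the route and port are arbitrary.

    from aiohttp import web
    from mcp.auth.api_keys import get_key_manager, APIKeyAuthMiddleware

    key_manager = get_key_manager()
    # Only the hash is stored, so the plain key must be issued to the client now
    plain_key, key_obj = key_manager.create_key(name="demo-client", tenant_id="tenant-1")
    print(f"Issue this key once: {plain_key}")

    auth = APIKeyAuthMiddleware(key_manager)
    app = web.Application(middlewares=[auth.middleware])

    async def whoami(request: web.Request) -> web.Response:
        # Set by the middleware after successful validation
        return web.json_response({"tenant_id": request["tenant_id"]})

    app.router.add_get("/whoami", whoami)
    web.run_app(app, port=9002)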
mcp/auth/rate_limiter.py ADDED
@@ -0,0 +1,317 @@
+ """
+ Enterprise Rate Limiting for MCP Servers
+
+ Features:
+ - Token bucket algorithm for smooth rate limiting
+ - Per-client rate limiting
+ - Global rate limiting
+ - Different limits for different endpoints
+ - Distributed rate limiting with Redis (optional)
+ """
+ import time
+ import logging
+ import asyncio
+ from typing import Dict, Optional
+ from dataclasses import dataclass, field
+ from aiohttp import web
+
+ logger = logging.getLogger(__name__)
+
+
+ @dataclass
+ class TokenBucket:
+     """Token bucket for rate limiting"""
+     capacity: int  # Maximum tokens
+     refill_rate: float  # Tokens per second
+     tokens: float = field(default=0)
+     last_refill: float = field(default_factory=time.time)
+
+     def __post_init__(self):
+         self.tokens = self.capacity
+
+     def _refill(self):
+         """Refill tokens based on time elapsed"""
+         now = time.time()
+         elapsed = now - self.last_refill
+
+         # Add tokens based on refill rate, capped at capacity
+         self.tokens = min(
+             self.capacity,
+             self.tokens + (elapsed * self.refill_rate)
+         )
+         self.last_refill = now
+
+     def consume(self, tokens: int = 1) -> bool:
+         """
+         Try to consume tokens
+
+         Returns:
+             True if tokens were available, False otherwise
+         """
+         self._refill()
+
+         if self.tokens >= tokens:
+             self.tokens -= tokens
+             return True
+
+         return False
+
+     def get_wait_time(self, tokens: int = 1) -> float:
+         """
+         Get time to wait until tokens are available
+
+         Returns:
+             Seconds to wait
+         """
+         self._refill()
+
+         if self.tokens >= tokens:
+             return 0.0
+
+         tokens_needed = tokens - self.tokens
+         return tokens_needed / self.refill_rate
+
+
+ class RateLimiter:
+     """
+     In-memory rate limiter with token bucket algorithm
+     """
+
+     def __init__(self):
+         # Client-specific buckets
+         self.client_buckets: Dict[str, TokenBucket] = {}
+
+         # Global bucket for all requests (disabled by default)
+         self.global_bucket: Optional[TokenBucket] = None
+         # self.global_bucket = TokenBucket(capacity=1000, refill_rate=100.0)
+
+         # Endpoint-specific limits
+         self.endpoint_limits: Dict[str, Dict] = {
+             "/rpc": {"capacity": 100, "refill_rate": 10.0},  # 100-request burst, 10/sec refill
+             "default": {"capacity": 50, "refill_rate": 5.0}  # Default for other endpoints
+         }
+
+         # Cleanup task
+         self._cleanup_task = None
+         logger.info("Rate limiter initialized")
+
+     def _get_client_id(self, request: web.Request) -> str:
+         """
+         Get client identifier for rate limiting
+
+         Uses (in order):
+         1. API key
+         2. IP address
+         """
+         # Try API key first
+         if "api_key" in request and hasattr(request["api_key"], "key_id"):
+             return f"key:{request['api_key'].key_id}"
+
+         # Fall back to the client IP address (request.remote is None-safe,
+         # unlike reading peername off the transport directly)
+         if request.remote:
+             return f"ip:{request.remote}"
+
+         return "unknown"
+
+     def _get_endpoint_limits(self, path: str) -> Dict:
+         """Get rate limits for endpoint"""
+         return self.endpoint_limits.get(path, self.endpoint_limits["default"])
+
+     def _get_or_create_bucket(self, client_id: str, path: str) -> TokenBucket:
+         """Get or create token bucket for client"""
+         bucket_key = f"{client_id}:{path}"
+
+         if bucket_key not in self.client_buckets:
+             limits = self._get_endpoint_limits(path)
+             self.client_buckets[bucket_key] = TokenBucket(
+                 capacity=limits["capacity"],
+                 refill_rate=limits["refill_rate"]
+             )
+
+         return self.client_buckets[bucket_key]
+
+     async def check_rate_limit(
+         self,
+         request: web.Request,
+         tokens: int = 1
+     ) -> tuple[bool, Optional[float]]:
+         """
+         Check if request is within rate limit
+
+         Returns:
+             Tuple of (allowed, retry_after_seconds)
+         """
+         client_id = self._get_client_id(request)
+         path = request.path
+
+         # Check global rate limit first (if enabled)
+         if self.global_bucket:
+             if not self.global_bucket.consume(tokens):
+                 wait_time = self.global_bucket.get_wait_time(tokens)
+                 logger.warning(f"Global rate limit exceeded, retry after {wait_time:.2f}s")
+                 return False, wait_time
+
+         # Check client-specific rate limit
+         bucket = self._get_or_create_bucket(client_id, path)
+
+         if not bucket.consume(tokens):
+             wait_time = bucket.get_wait_time(tokens)
+             logger.warning(f"Rate limit exceeded for {client_id} on {path}, retry after {wait_time:.2f}s")
+             return False, wait_time
+
+         return True, None
+
+     async def start_cleanup_task(self):
+         """Start background cleanup task"""
+         if self._cleanup_task is None:
+             self._cleanup_task = asyncio.create_task(self._cleanup_loop())
+             logger.info("Rate limiter cleanup task started")
+
+     async def _cleanup_loop(self):
+         """Periodically clean up old buckets"""
+         while True:
+             await asyncio.sleep(300)  # Every 5 minutes
+
+             # Remove buckets that haven't been used recently
+             cutoff_time = time.time() - 600  # 10 minutes
+             removed = 0
+
+             for key in list(self.client_buckets.keys()):
+                 bucket = self.client_buckets[key]
+                 if bucket.last_refill < cutoff_time:
+                     del self.client_buckets[key]
+                     removed += 1
+
+             if removed > 0:
+                 logger.info(f"Cleaned up {removed} unused rate limit buckets")
+
+
+ class RateLimitMiddleware:
+     """aiohttp middleware for rate limiting"""
+
+     def __init__(self, rate_limiter: RateLimiter, exempt_paths: Optional[set[str]] = None):
+         self.rate_limiter = rate_limiter
+         self.exempt_paths = exempt_paths or {"/health", "/metrics"}
+         logger.info("Rate limit middleware initialized")
+
+     @web.middleware
+     async def middleware(self, request: web.Request, handler):
+         """Middleware handler"""
+
+         # Skip rate limiting for exempt paths
+         if request.path in self.exempt_paths:
+             return await handler(request)
+
+         # Check rate limit
+         allowed, retry_after = await self.rate_limiter.check_rate_limit(request)
+
+         if not allowed:
+             return web.json_response(
+                 {
+                     "error": "Rate limit exceeded",
+                     "message": f"Too many requests. Please retry after {retry_after:.2f} seconds.",
+                     "retry_after": retry_after
+                 },
+                 status=429,
+                 headers={"Retry-After": str(int(retry_after) + 1)}
+             )
+
+         response = await handler(request)
+
+         # TODO: Add X-RateLimit-* headers
+         # response.headers["X-RateLimit-Limit"] = "100"
+         # response.headers["X-RateLimit-Remaining"] = "95"
+
+         return response
+
+
+ class RedisRateLimiter:
+     """
+     Distributed rate limiter using Redis.
+     Suitable for multi-instance deployments.
+     """
+
+     def __init__(self, redis_client=None):
+         """
+         Initialize with Redis client
+
+         Args:
+             redis_client: redis.asyncio.Redis client
+         """
+         self.redis = redis_client
+         logger.info("Redis rate limiter initialized" if redis_client else "Redis rate limiter (disabled)")
+
+     async def check_rate_limit(
+         self,
+         key: str,
+         limit: int,
+         window_seconds: int
+     ) -> tuple[bool, Optional[int]]:
+         """
+         Check rate limit using Redis
+
+         Uses a sliding-window algorithm with Redis sorted sets
+
+         Returns:
+             Tuple of (allowed, retry_after_seconds)
+         """
+         if not self.redis:
+             # If Redis is not available, allow all requests
+             return True, None
+
+         now = time.time()
+         window_start = now - window_seconds
+
+         try:
+             # Redis pipeline for atomic operations
+             pipe = self.redis.pipeline()
+
+             # Remove entries that fell out of the window
+             pipe.zremrangebyscore(key, 0, window_start)
+
+             # Count current requests
+             pipe.zcard(key)
+
+             # Add current request
+             pipe.zadd(key, {str(now): now})
+
+             # Set expiry
+             pipe.expire(key, window_seconds)
+
+             results = await pipe.execute()
+
+             count = results[1]  # Result from ZCARD
+
+             if count < limit:
+                 return True, None
+             else:
+                 # Calculate retry time from the oldest entry still in the window
+                 oldest_entries = await self.redis.zrange(key, 0, 0, withscores=True)
+                 if oldest_entries:
+                     oldest_time = oldest_entries[0][1]
+                     retry_after = int(oldest_time + window_seconds - now) + 1
+                     return False, retry_after
+
+                 return False, window_seconds
+
+         except Exception as e:
+             logger.error(f"Redis rate limit error: {e}")
+             # On error, allow request (fail open)
+             return True, None
+
+
+ # Global rate limiter instance
+ _rate_limiter: Optional[RateLimiter] = None
+
+
+ def get_rate_limiter() -> RateLimiter:
+     """Get or create the global rate limiter"""
+     global _rate_limiter
+     if _rate_limiter is None:
+         _rate_limiter = RateLimiter()
+     return _rate_limiter
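To see the token-bucket arithmetic concretely: with capacity 5 and refill_rate 1.0, a client can burst five requests, after which each denied request reports a wait of roughly tokens_needed / refill_rate seconds. A quick sketch using only the TokenBucket class above:

    from mcp.auth.rate_limiter import TokenBucket

    bucket = TokenBucket(capacity=5, refill_rate=1.0)  # 5-token burst, 1 token/sec

    for i in range(7):
        if bucket.consume():
            print(f"request {i}: allowed")
        else:
            print(f"request {i}: denied, retry in {bucket.get_wait_time():.2f}s")

    # Requests 0-4 drain the burst; 5 and 6 are both denied with a ~1.00s
    # wait, since a failed consume() does not deduct tokens.

Wiring follows the same pattern as the auth middleware, e.g. web.Application(middlewares=[RateLimitMiddleware(get_rate_limiter()).middleware]).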
mcp/database/__init__.py ADDED
@@ -0,0 +1,72 @@
+ """
+ Enterprise-Grade Database Layer for CX AI Agent
+
+ Provides:
+ - SQLAlchemy ORM models with async support
+ - Repository pattern for clean data access
+ - Connection pooling and transaction management
+ - Multi-tenancy support
+ - Audit logging
+ - Database-backed MCP store service
+ """
+
+ from .models import (
+     Base,
+     Company,
+     Prospect,
+     Contact,
+     Fact,
+     Activity,
+     Suppression,
+     Handoff,
+     AuditLog
+ )
+
+ from .engine import (
+     DatabaseManager,
+     get_db_manager,
+     get_session,
+     init_database,
+     close_database
+ )
+
+ from .repositories import (
+     CompanyRepository,
+     ProspectRepository,
+     ContactRepository,
+     FactRepository,
+     ActivityRepository,
+     SuppressionRepository,
+     HandoffRepository
+ )
+
+ from .store_service import DatabaseStoreService
+
+ __all__ = [
+     # Models
+     'Base',
+     'Company',
+     'Prospect',
+     'Contact',
+     'Fact',
+     'Activity',
+     'Suppression',
+     'Handoff',
+     'AuditLog',
+     # Engine
+     'DatabaseManager',
+     'get_db_manager',
+     'get_session',
+     'init_database',
+     'close_database',
+     # Repositories
+     'CompanyRepository',
+     'ProspectRepository',
+     'ContactRepository',
+     'FactRepository',
+     'ActivityRepository',
+     'SuppressionRepository',
+     'HandoffRepository',
+     # Services
+     'DatabaseStoreService',
+ ]
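Because the package re-exports everything, callers can import from mcp.database directly rather than from the submodules. A brief sketch of intended startup usage; the CompanyRepository(session) constructor signature is an assumption, since repositories.py is not shown in this part of the commit:

    from mcp.database import init_database, get_db_manager, CompanyRepository

    async def startup():
        await init_database()
        async with get_db_manager().get_session() as session:
            repo = CompanyRepository(session)  # assumed: repositories wrap a session
            ...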
mcp/database/engine.py ADDED
@@ -0,0 +1,242 @@
+ """
+ Enterprise-Grade Database Engine with Connection Pooling and Async Support
+ """
+ import os
+ import logging
+ from typing import Optional, AsyncGenerator
+ from contextlib import asynccontextmanager
+ from sqlalchemy.ext.asyncio import (
+     create_async_engine,
+     AsyncSession,
+     AsyncEngine,
+     async_sessionmaker
+ )
+ from sqlalchemy.pool import NullPool, QueuePool
+ from sqlalchemy import event, text
+
+ from .models import Base
+
+ logger = logging.getLogger(__name__)
+
+
+ class DatabaseConfig:
+     """Database configuration with environment variable support"""
+
+     def __init__(self):
+         # Database URL (supports SQLite, PostgreSQL, MySQL)
+         self.database_url = os.getenv(
+             "DATABASE_URL",
+             "sqlite+aiosqlite:///./data/cx_agent.db"
+         )
+
+         # Rewrite Heroku-style postgres:// URLs to the async SQLAlchemy dialect
+         if self.database_url.startswith("postgres://"):
+             self.database_url = self.database_url.replace(
+                 "postgres://", "postgresql+asyncpg://", 1
+             )
+
+         # Connection pool settings
+         self.pool_size = int(os.getenv("DB_POOL_SIZE", "20"))
+         self.max_overflow = int(os.getenv("DB_MAX_OVERFLOW", "10"))
+         self.pool_timeout = int(os.getenv("DB_POOL_TIMEOUT", "30"))
+         self.pool_recycle = int(os.getenv("DB_POOL_RECYCLE", "3600"))
+         self.pool_pre_ping = os.getenv("DB_POOL_PRE_PING", "true").lower() == "true"
+
+         # Echo SQL for debugging
+         self.echo = os.getenv("DB_ECHO", "false").lower() == "true"
+
+         # Enable SQLite WAL mode for better concurrency
+         self.enable_wal = os.getenv("SQLITE_WAL", "true").lower() == "true"
+
+     def is_sqlite(self) -> bool:
+         """Check if using SQLite"""
+         return "sqlite" in self.database_url
+
+     def is_postgres(self) -> bool:
+         """Check if using PostgreSQL"""
+         return "postgresql" in self.database_url
+
+
+ class DatabaseManager:
+     """Singleton database manager with connection pooling"""
+
+     _instance: Optional["DatabaseManager"] = None
+     _engine: Optional[AsyncEngine] = None
+     _session_factory: Optional[async_sessionmaker[AsyncSession]] = None
+
+     def __new__(cls):
+         if cls._instance is None:
+             cls._instance = super().__new__(cls)
+         return cls._instance
+
+     def __init__(self):
+         if self._engine is None:
+             self._initialize()
+
+     def _initialize(self):
+         """Initialize database engine and session factory"""
+         config = DatabaseConfig()
+
+         # Engine kwargs
+         engine_kwargs = {
+             "echo": config.echo,
+             "future": True,
+         }
+
+         # Configure connection pool based on database type
+         if config.is_sqlite():
+             # SQLite specific settings
+             logger.info(f"Initializing SQLite database: {config.database_url}")
+             engine_kwargs.update({
+                 "poolclass": NullPool,  # SQLite doesn't need pooling in the same way
+                 "connect_args": {
+                     "check_same_thread": False,
+                     "timeout": 30,
+                 }
+             })
+             # WAL and related PRAGMAs are applied per-connection via an event
+             # listener below; sqlite3.connect() accepts no "pragmas" argument
+
+         else:
+             # PostgreSQL/MySQL settings
+             logger.info(f"Initializing database: {config.database_url}")
+             engine_kwargs.update({
+                 "poolclass": QueuePool,
+                 "pool_size": config.pool_size,
+                 "max_overflow": config.max_overflow,
+                 "pool_timeout": config.pool_timeout,
+                 "pool_recycle": config.pool_recycle,
+                 "pool_pre_ping": config.pool_pre_ping,
+             })
+
+         # Create async engine
+         self._engine = create_async_engine(
+             config.database_url,
+             **engine_kwargs
+         )
+
+         # Enable WAL mode for better concurrency on every new SQLite connection
+         if config.is_sqlite() and config.enable_wal:
+             @event.listens_for(self._engine.sync_engine, "connect")
+             def set_sqlite_pragmas(dbapi_conn, connection_record):
+                 cursor = dbapi_conn.cursor()
+                 cursor.execute("PRAGMA journal_mode=WAL")
+                 cursor.execute("PRAGMA synchronous=NORMAL")
+                 cursor.execute("PRAGMA cache_size=-64000")  # 64MB cache
+                 cursor.execute("PRAGMA foreign_keys=ON")
+                 cursor.execute("PRAGMA busy_timeout=5000")
+                 cursor.close()
+
+         # Create session factory
+         self._session_factory = async_sessionmaker(
+             self._engine,
+             class_=AsyncSession,
+             expire_on_commit=False,
+             autocommit=False,
+             autoflush=False
+         )
+
+         # Register event listeners
+         self._register_event_listeners()
+
+         logger.info("Database engine initialized successfully")
+
+     def _register_event_listeners(self):
+         """Register SQLAlchemy event listeners"""
+
+         @event.listens_for(self._engine.sync_engine, "connect")
+         def receive_connect(dbapi_conn, connection_record):
+             """Event listener for new connections"""
+             logger.debug("New database connection established")
+
+         @event.listens_for(self._engine.sync_engine, "close")
+         def receive_close(dbapi_conn, connection_record):
+             """Event listener for closed connections"""
+             logger.debug("Database connection closed")
+
+     @property
+     def engine(self) -> AsyncEngine:
+         """Get the database engine"""
+         if self._engine is None:
+             raise RuntimeError("Database engine not initialized")
+         return self._engine
+
+     @property
+     def session_factory(self) -> async_sessionmaker[AsyncSession]:
+         """Get the session factory"""
+         if self._session_factory is None:
+             raise RuntimeError("Session factory not initialized")
+         return self._session_factory
+
+     async def create_tables(self):
+         """Create all database tables"""
+         logger.info("Creating database tables...")
+         async with self._engine.begin() as conn:
+             await conn.run_sync(Base.metadata.create_all)
+         logger.info("Database tables created successfully")
+
+     async def drop_tables(self):
+         """Drop all database tables (use with caution!)"""
+         logger.warning("Dropping all database tables...")
+         async with self._engine.begin() as conn:
+             await conn.run_sync(Base.metadata.drop_all)
+         logger.info("Database tables dropped")
+
+     async def health_check(self) -> bool:
+         """Check database health"""
+         try:
+             async with self.get_session() as session:
+                 await session.execute(text("SELECT 1"))
+             return True
+         except Exception as e:
+             logger.error(f"Database health check failed: {e}")
+             return False
+
+     @asynccontextmanager
+     async def get_session(self) -> AsyncGenerator[AsyncSession, None]:
+         """Get a database session with automatic commit/rollback and cleanup"""
+         session = self.session_factory()
+         try:
+             yield session
+             await session.commit()
+         except Exception as e:
+             await session.rollback()
+             logger.error(f"Database session error: {e}")
+             raise
+         finally:
+             await session.close()
+
+     async def close(self):
+         """Close database engine and connections"""
+         if self._engine is not None:
+             await self._engine.dispose()
+             logger.info("Database engine closed")
+
+
+ # Global database manager instance
+ _db_manager: Optional[DatabaseManager] = None
+
+
+ def get_db_manager() -> DatabaseManager:
+     """Get or create the global database manager instance"""
+     global _db_manager
+     if _db_manager is None:
+         _db_manager = DatabaseManager()
+     return _db_manager
+
+
+ async def get_session() -> AsyncGenerator[AsyncSession, None]:
+     """Convenience function to get a database session"""
+     db_manager = get_db_manager()
+     async with db_manager.get_session() as session:
+         yield session
+
+
+ async def init_database():
+     """Initialize database (create tables if needed)"""
+     db_manager = get_db_manager()
+     await db_manager.create_tables()
+     logger.info("Database initialized")
+
+
+ async def close_database():
+     """Close database connections"""
+     db_manager = get_db_manager()
+     await db_manager.close()
+     logger.info("Database closed")
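End to end, the engine's lifecycle looks like this; the sketch uses only names defined in this file:

    import asyncio
    from sqlalchemy import text
    from mcp.database.engine import get_db_manager, init_database, close_database

    async def main():
        await init_database()              # create tables on first run
        db = get_db_manager()
        print(await db.health_check())     # -> True

        # get_session() commits on clean exit, rolls back on exception
        async with db.get_session() as session:
            result = await session.execute(text("SELECT 1"))
            print(result.scalar_one())     # -> 1

        await close_database()

    asyncio.run(main())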
mcp/database/migrate.py ADDED
@@ -0,0 +1,107 @@
+ """
+ Database Migration Management Script
+ Provides helper functions for managing database migrations with Alembic
+ """
+ import sys
+ import logging
+ from pathlib import Path
+
+ # Add the repo root to sys.path so "mcp" imports resolve when run directly
+ sys.path.insert(0, str(Path(__file__).parent.parent.parent))
+
+ from alembic.config import Config
+ from alembic import command
+
+ logger = logging.getLogger(__name__)
+
+
+ def get_alembic_config() -> Config:
+     """Get Alembic configuration"""
+     # Path to alembic.ini at the repo root
+     alembic_ini = Path(__file__).parent.parent.parent / "alembic.ini"
+
+     if not alembic_ini.exists():
+         raise FileNotFoundError(f"alembic.ini not found at {alembic_ini}")
+
+     config = Config(str(alembic_ini))
+     return config
+
+
+ def create_migration(message: str):
+     """Create a new migration"""
+     config = get_alembic_config()
+     command.revision(config, message=message, autogenerate=True)
+     logger.info(f"Created migration: {message}")
+
+
+ def upgrade_database(revision: str = "head"):
+     """Upgrade database to a revision"""
+     config = get_alembic_config()
+     command.upgrade(config, revision)
+     logger.info(f"Upgraded database to {revision}")
+
+
+ def downgrade_database(revision: str):
+     """Downgrade database to a revision"""
+     config = get_alembic_config()
+     command.downgrade(config, revision)
+     logger.info(f"Downgraded database to {revision}")
+
+
+ def show_current_revision():
+     """Show current database revision"""
+     config = get_alembic_config()
+     command.current(config)
+
+
+ def show_migration_history():
+     """Show migration history"""
+     config = get_alembic_config()
+     command.history(config)
+
+
+ if __name__ == "__main__":
+     import argparse
+
+     parser = argparse.ArgumentParser(description="Database Migration Management")
+     subparsers = parser.add_subparsers(dest="command", help="Command to run")
+
+     # Create migration
+     create_parser = subparsers.add_parser("create", help="Create a new migration")
+     create_parser.add_argument("message", help="Migration message")
+
+     # Upgrade database
+     upgrade_parser = subparsers.add_parser("upgrade", help="Upgrade database")
+     upgrade_parser.add_argument(
+         "--revision",
+         default="head",
+         help="Revision to upgrade to (default: head)"
+     )
+
+     # Downgrade database
+     downgrade_parser = subparsers.add_parser("downgrade", help="Downgrade database")
+     downgrade_parser.add_argument("revision", help="Revision to downgrade to")
+
+     # Show current revision
+     subparsers.add_parser("current", help="Show current database revision")
+
+     # Show history
+     subparsers.add_parser("history", help="Show migration history")
+
+     args = parser.parse_args()
+
+     logging.basicConfig(level=logging.INFO)
+
+     if args.command == "create":
+         create_migration(args.message)
+     elif args.command == "upgrade":
+         upgrade_database(args.revision)
+     elif args.command == "downgrade":
+         downgrade_database(args.revision)
+     elif args.command == "current":
+         show_current_revision()
+     elif args.command == "history":
+         show_migration_history()
+     else:
+         parser.print_help()
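These helpers can be driven from the command line or imported; both forms below assume the process starts from the repo root, where the script expects alembic.ini to live:

    # CLI:
    #   python -m mcp.database.migrate create "add prospect tables"
    #   python -m mcp.database.migrate upgrade            # to head
    #   python -m mcp.database.migrate downgrade -1       # back one revision
    #
    # Programmatic:
    from mcp.database.migrate import upgrade_database
    upgrade_database("head")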