pgvector Setup Guide¶
Part of Data & Database. This guide covers the setup and usage of pgvector for vector similarity search in the Redhound trading system.
Overview¶
pgvector is a PostgreSQL extension that enables efficient vector similarity search. Vector storage is consolidated with relational and time-series data in a single TimescaleDB database.
Key Benefits:

- Unified Database: Single database for all data types
- Reduced Complexity: No separate vector database to maintain
- Better Performance: HNSW indexes provide fast approximate nearest neighbor search
- Production Ready: Battle-tested PostgreSQL extension
- Cost Efficiency: Reduced infrastructure costs
Architecture¶
Components¶
- TimescaleDB with pgvector: PostgreSQL 18 with both TimescaleDB and pgvector extensions
- Python Integration: `pgvector` Python package for SQLAlchemy integration
- Database package (`backend.database`): Models and types from `backend.database.models` (base, signal, session, agent_analysis, agent_memory, debate, stock_profile, cost; enums in `models/enums.py`)
- Repository layer: `AgentMemoryRepository.find_similar_memories` is the supported API for similarity search. Use `RepositoryFactory.get(AgentMemoryRepository)` to obtain the repository.
- Custom types: `backend.database.types.Vector` for vector columns
- Vector operations: `backend.database.utils.vector_ops` for index creation and raw distance expressions when building custom queries
Vector Dimensions¶
We use OpenAI's text-embedding-3-small model, which produces 1536-dimensional vectors.
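Dimension mismatches between the model output and the `Vector(1536)` column surface as database errors at insert time; a small guard can catch them earlier. The helper below is a sketch (`check_embedding_dim` is not part of the codebase):

```python
EMBEDDING_DIM = 1536  # text-embedding-3-small output size


def check_embedding_dim(embedding: list[float], expected: int = EMBEDDING_DIM) -> list[float]:
    """Return the embedding unchanged, or raise if its dimension is wrong."""
    if len(embedding) != expected:
        raise ValueError(f"expected {expected}-dim vector, got {len(embedding)}")
    return embedding
```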
Installation¶
Docker Setup¶
The pgvector extension is automatically installed via Docker:
# docker-compose.yml
postgres:
  image: timescale/timescaledb-ha:pg18  # Includes pgvector support
  volumes:
    - ./docker/postgres/init-timescaledb.sql:/docker-entrypoint-initdb.d/01-init-timescaledb.sql:ro
The initialization script enables both extensions:
-- docker/postgres/init-timescaledb.sql
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE EXTENSION IF NOT EXISTS vector;
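To confirm both extensions are actually installed after the container starts, you can query the standard `pg_extension` catalog from psql:

```sql
-- Both rows should appear if initialization succeeded
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('timescaledb', 'vector');
```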
Python Dependencies¶
The pgvector package is included in pyproject.toml. Install dependencies with your project's package manager (for example, `pip install -e .`).
Usage¶
1. Define Vector Columns¶
Use the custom Vector type in SQLAlchemy models:
from sqlalchemy import Column, Integer, String, Text

from backend.database.models.base import Base  # declarative base (import path assumed from the models package layout above)
from backend.database.types import Vector


class AgentMemory(Base):
    __tablename__ = 'agent_memory'

    id = Column(Integer, primary_key=True)
    agent_type = Column(String(50))
    situation = Column(Text)
    recommendation = Column(Text)
    outcome = Column(Text, nullable=True)
    embedding = Column(Vector(1536))  # OpenAI text-embedding-3-small
2. Insert Vectors¶
Vectors can be inserted as lists or numpy arrays:
import numpy as np

# From list
memory = AgentMemory(
    agent_type="technical",
    situation="Strong uptrend with RSI overbought",
    recommendation="Consider taking profits",
    embedding=[0.1, 0.2, 0.3, ...]  # 1536 dimensions
)

# From numpy array
embedding_array = np.array([0.1, 0.2, 0.3, ...])
memory = AgentMemory(
    agent_type="fundamental",
    situation="P/E ratio below industry average",
    recommendation="Potential value opportunity",
    embedding=embedding_array
)

session.add(memory)
await session.commit()
3. Create HNSW Indexes¶
HNSW (Hierarchical Navigable Small World) indexes enable fast similarity search:
from backend.database.utils.vector_ops import create_hnsw_index

# Create index for agent memory embeddings
await create_hnsw_index(
    session,
    table_name='agent_memory',
    column_name='embedding',
    m=16,                      # Max connections per layer
    ef_construction=64,        # Build-time quality parameter
    distance_metric='cosine'   # cosine, l2, or inner_product
)
Index Parameters:

- m (default: 16): Maximum connections per layer
    - Higher = better recall, larger index
    - Typical range: 12-48
- ef_construction (default: 64): Build-time candidate list size
    - Higher = better recall, slower build
    - Typical range: 64-512
- distance_metric: Similarity metric
    - cosine: Cosine similarity (recommended for embeddings)
    - l2: Euclidean distance
    - inner_product: Dot product
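The three metrics are related: pgvector's cosine operator (`<=>`) returns `1 - cos(angle)`, L2 (`<->`) returns Euclidean distance, and inner product (`<#>`) returns the *negative* dot product so that ascending `ORDER BY` puts the most similar rows first. A small numpy sketch of what each index computes:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 1.0, 2.0])

# Cosine distance: 1 - cos(angle), what pgvector's <=> operator returns
cosine_dist = 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# L2 (Euclidean) distance: pgvector's <-> operator
l2_dist = np.linalg.norm(a - b)

# Negative inner product: pgvector's <#> operator
neg_inner = -np.dot(a, b)
```

Note that for unit-length vectors, cosine distance and inner product produce the same ranking, so either index works for normalized embeddings.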
4. Similarity Search¶
Application code: Prefer AgentMemoryRepository.find_similar_memories so all data access goes through the repository layer:
from backend.database.models.enums import AgentType
from backend.database.repositories import AgentMemoryRepository
from backend.database.repositories.factory import RepositoryFactory

repo = RepositoryFactory.get(AgentMemoryRepository)
pairs = repo.find_similar_memories(
    session, embedding=query_embedding, agent_type=AgentType.TECHNICAL, limit=10, threshold=0.7
)
# pairs: list of (AgentMemory, similarity_score)
The following examples use vector_ops and raw select() for custom queries or when tuning index parameters.
Cosine similarity (raw query)¶
from sqlalchemy import select

from backend.database.utils.vector_ops import cosine_distance

query_embedding = [0.1, 0.2, 0.3, ...]  # 1536 dimensions

stmt = (
    select(AgentMemory)
    .order_by(cosine_distance(AgentMemory.embedding, query_embedding))
    .limit(10)
)
result = await session.execute(stmt)
similar_memories = result.scalars().all()
L2 Distance¶
from backend.database.utils.vector_ops import l2_distance
stmt = (
    select(AgentMemory)
    .order_by(l2_distance(AgentMemory.embedding, query_embedding))
    .limit(10)
)
Inner Product¶
from backend.database.utils.vector_ops import inner_product
stmt = (
    select(AgentMemory)
    .order_by(inner_product(AgentMemory.embedding, query_embedding))
    .limit(10)
)
5. Tune Search Quality¶
Adjust the ef_search parameter for better recall:
from backend.database.utils.vector_ops import set_hnsw_ef_search
# Higher ef_search = better recall, slower search
await set_hnsw_ef_search(session, ef_search=100)
# Now perform similarity search
stmt = select(AgentMemory).order_by(
    cosine_distance(AgentMemory.embedding, query_embedding)
).limit(10)
ef_search Guidelines:

- Default: 40 (good balance)
- Fast search: 10-20
- High precision: 100-200
- Should be >= number of results you want
Performance Optimization¶
Index Creation¶
Create indexes during migrations, not at runtime:
# alembic/versions/xxx_add_agent_memory_vector_index.py
from alembic import op


def upgrade():
    op.execute("""
        CREATE INDEX agent_memory_embedding_hnsw_idx
        ON agent_memory
        USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)
    """)


def downgrade():
    op.execute("DROP INDEX IF EXISTS agent_memory_embedding_hnsw_idx")
Query Optimization¶
- Use LIMIT: Always limit results to avoid scanning entire table
- Filter First: Apply WHERE clauses before similarity search when possible
- Tune ef_search: Balance between speed and recall
- Monitor Performance: Track query times and adjust parameters
Example with filtering:
from backend.database.models.enums import AgentType
stmt = (
    select(AgentMemory)
    .where(AgentMemory.agent_type == AgentType.TECHNICAL)
    .order_by(cosine_distance(AgentMemory.embedding, query_embedding))
    .limit(10)
)
Migration from a legacy vector store¶
If you have existing vector data in another store, export it to JSON (e.g. using scripts/migrate_legacy_vector_to_pgvector.py for a one-time migration), then import into pgvector.
1. Export source data¶
Use the migration script (see script docstring) or export your vector store’s collections to JSON with keys such as ids, embeddings, metadatas, documents.
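The import step expects a JSON file with parallel arrays under those keys. A hypothetical example (embeddings shortened to three values for readability; real vectors are 1536-dimensional):

```json
{
  "ids": ["mem-001"],
  "embeddings": [[0.1, 0.2, 0.3]],
  "metadatas": [
    {"agent_type": "MARKET_CONTEXT", "recommendation": "Hold", "outcome": null}
  ],
  "documents": ["Sideways market with low volume"]
}
```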
2. Import to pgvector¶
import json

from backend.database.models import AgentMemory
from backend.database.models.enums import AgentType

with open("vector_export.json", "r") as f:
    data = json.load(f)

for i, embedding in enumerate(data["embeddings"]):
    meta = data["metadatas"][i]
    agent_type = AgentType(meta.get("agent_type", "MARKET_CONTEXT"))  # use enum
    memory = AgentMemory(
        agent_type=agent_type,
        situation=data["documents"][i],
        recommendation=meta.get("recommendation", ""),
        outcome=meta.get("outcome"),
        embedding=embedding,
    )
    session.add(memory)

session.commit()
3. Clean up¶
After successful migration, stop and remove any legacy vector store containers and volumes if they are still defined in your setup.
Testing¶
Run pgvector integration tests:
# Start database
docker-compose up -d postgres
# Run tests
pytest tests/integration/test_pgvector.py -v
Troubleshooting¶
Extension Not Found¶
Error: ERROR: extension "vector" is not available
Solution: Ensure you're using the correct TimescaleDB image (`timescale/timescaledb-ha:pg18`, which bundles pgvector).
Index Creation Fails¶
Error: ERROR: operator class "vector_cosine_ops" does not exist
Solution: Verify the pgvector extension is enabled by running `CREATE EXTENSION IF NOT EXISTS vector;` (or check with `\dx` in psql).
Slow Queries¶
Issue: Similarity searches are slow
Solutions:
1. Ensure HNSW index exists: \d+ agent_memory in psql
2. Increase ef_search: SET hnsw.ef_search = 100
3. Add filtering before similarity search
4. Consider increasing index parameters (m, ef_construction)
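To confirm the planner is actually using the HNSW index rather than a sequential scan, run `EXPLAIN ANALYZE` on the similarity query; the plan should show an index scan on the `hnsw` index. A sketch (the vector literal is shortened; a real query must pass a full 1536-dimensional vector):

```sql
EXPLAIN ANALYZE
SELECT id
FROM agent_memory
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector  -- shortened literal for illustration
LIMIT 10;
```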