Troubleshooting¶
Common issues, solutions, and debugging techniques for ArcadeDB Python bindings.
Installation Issues¶
Package Import Errors¶
Problem: Can't import arcadedb_embedded module
Solutions:

- Verify Installation: confirm the package is installed and importable (see the sketch after this list).
- Reinstall Package: wheels bundle the ArcadeDB JRE and JARs, so no external Java install is needed. If imports fail or the wheel looks corrupted, reinstall the wheel.
- Check Python Path: make sure the interpreter running your script is the one the package was installed into.
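A minimal check combining all three steps, assuming a standard pip/uv install (the package name `arcadedb-embedded` and module name `arcadedb_embedded` are taken from this guide):

```python
import importlib.util
import subprocess
import sys

# 1. Verify installation: is the module visible to *this* interpreter?
spec = importlib.util.find_spec("arcadedb_embedded")
if spec is None:
    print(f"arcadedb_embedded not found for {sys.executable}")
    # 2. Reinstall the wheel (it bundles the ArcadeDB JRE and JARs)
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--force-reinstall", "arcadedb-embedded"],
        check=True,
    )
else:
    print(f"Found at: {spec.origin}")

# 3. Check the Python path if imports still fail
print("sys.path entries:")
for entry in sys.path:
    print(" ", entry)
```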
Runtime Errors¶
Database Connection Issues¶
Problem: Can't connect to database
Solutions:

- Check Database Path: verify the path points at an existing database directory (see the sketch after this list).
- Verify Database Created: create the database first if it does not exist yet.
- Check Permissions: the process needs read and write access to the database directory.
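A minimal sketch combining the three checks, using the `open_database()`/`create_database()` calls shown elsewhere on this page; the path is illustrative:

```python
import os
import arcadedb_embedded as arcadedb

db_path = "./mydb"

# Check the path and permissions before opening
if os.path.exists(db_path):
    if not os.access(db_path, os.R_OK | os.W_OK):
        raise PermissionError(f"No read/write access to {db_path}")
    db = arcadedb.open_database(db_path)
else:
    # Database was never created: create it instead of opening
    db = arcadedb.create_database(db_path)
```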
Database Already Exists¶
Symptom: create_database() raises an error because a database already exists at that path.
Solution:
Use open_database() instead:
```python
import os
import arcadedb_embedded as arcadedb

if os.path.exists("./mydb"):
    db = arcadedb.open_database("./mydb")
else:
    db = arcadedb.create_database("./mydb")
```
Or delete existing database:
```python
import os
import shutil

# Remove existing database
if os.path.exists("./mydb"):
    shutil.rmtree("./mydb")

# Create fresh database
db = arcadedb.create_database("./mydb")
```
Database Locked¶
Symptom: ArcadeDBError: Database is locked by another process
Cause: Another process has the database open.
Solutions:

- Close other connections: call close() on every handle that still has the database open (see the sketch after this list).
- Check for orphaned processes: look for stale Python processes still holding the lock.
- Remove lock file (last resort): only when you are certain no process is using the database.
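A diagnostic sketch, given an open handle `db` in the current process. Assumptions to note: that the handle exposes a `close()` method, and that the lock is a `*.lck` file inside the database directory (the exact name may differ; inspect the directory first). `psutil` is a third-party package:

```python
import pathlib
import psutil

# 1. Close your own handle first
db.close()

# 2. Look for other Python processes that may still hold the database
for proc in psutil.process_iter(["pid", "name", "cmdline"]):
    cmdline = " ".join(proc.info["cmdline"] or [])
    if "python" in (proc.info["name"] or "") and "mydb" in cmdline:
        print(f"Possible holder: pid={proc.info['pid']} cmd={cmdline}")

# 3. Last resort: remove the lock file once nothing is using the DB
#    (file name pattern is an assumption; verify before deleting)
for lock in pathlib.Path("./mydb").glob("*.lck"):
    print(f"Removing stale lock: {lock}")
    lock.unlink()
```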
Memory Configuration¶
JVM Memory Configuration¶
Configure JVM memory via the ARCADEDB_JVM_ARGS environment variable before importing arcadedb_embedded:
Basic Configuration:
```bash
# Default: 4GB heap
python script.py

# Production: 8GB heap with matching initial size
export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g"
python script.py

# One-liner
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g" python script.py
```
Common JVM Options:
| Option | Description | Example |
|---|---|---|
| `-Xmx<size>` | Maximum heap memory | `-Xmx8g` (8 gigabytes) |
| `-Xms<size>` | Initial heap size (recommended: same as `-Xmx`) | `-Xms8g` |
| `-XX:MaxDirectMemorySize=<size>` | Limit off-heap direct buffers | `-XX:MaxDirectMemorySize=8g` |
| `-Darcadedb.vectorIndex.locationCacheSize=<count>` | Max vector locations to cache (default: -1 = unlimited) | `-Darcadedb.vectorIndex.locationCacheSize=100000` |
| `-Darcadedb.vectorIndex.graphBuildCacheSize=<count>` | Max vectors cached during HNSW build (default: 10000) | `-Darcadedb.vectorIndex.graphBuildCacheSize=3000` |
| `-Darcadedb.vectorIndex.mutationsBeforeRebuild=<count>` | Mutations before graph rebuild (default: 100) | `-Darcadedb.vectorIndex.mutationsBeforeRebuild=200` |
Vector Index Memory Tuning:
For applications using vector indexes, control memory usage:
```bash
# Conservative: bounded caches for large vector datasets
export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g \
  -Darcadedb.vectorIndex.locationCacheSize=100000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=3000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=200"
python vector_app.py
```
Cache Size Guidelines:
- locationCacheSize: number of vector locations to cache (each ~56 bytes)
  - 100000 entries ≈ 5.6 MB
  - -1 = unlimited (backward compatible, may consume unbounded memory)
  - Recommended: 100000 for datasets with 1M+ vectors
- graphBuildCacheSize: number of vectors cached during HNSW build
  - Memory ≈ cacheSize × (dimensions × 4 + 64) bytes
  - For 768-dim vectors: 10000 entries ≈ 30 MB
  - Lower values reduce build-time memory spikes
  - Recommended: 3000-5000 for high-dimensional vectors
Memory Planning:
Total Process Memory = JVM Heap + Off-Heap Components
Off-Heap Components:
- Direct buffers (MaxDirectMemorySize)
- Metaspace (class definitions)
- Page cache
- Thread stacks
- Vector index caches (if bounded)
Rule of thumb: Plan for 1.5-2× your heap size in actual RAM
Example Configurations:
```bash
# Small datasets (<1M records, <100K vectors)
ARCADEDB_JVM_ARGS="-Xmx2g -Xms2g"

# Medium datasets (1M-10M records, 100K-1M vectors)
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g"

# Large datasets (10M+ records, 1M+ vectors) with bounded caches
ARCADEDB_JVM_ARGS="-Xmx16g -Xms16g -XX:MaxDirectMemorySize=16g \
  -Darcadedb.vectorIndex.locationCacheSize=100000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=5000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=200"

# High-dimensional vectors (e.g., 1536-dim embeddings)
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g \
  -Darcadedb.vectorIndex.locationCacheSize=50000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=2000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=150"
```
Note (Configuration Timing): ARCADEDB_JVM_ARGS must be set before the first `import arcadedb_embedded`; the JVM can only be configured once per Python process.
Alternative: ARCADEDB_JVM_ERROR_FILE sets the JVM crash log location.
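For example (the path is illustrative, and the set-before-import assumption mirrors ARCADEDB_JVM_ARGS above):

```python
import os

# Assumption: like ARCADEDB_JVM_ARGS, this is read when the JVM starts,
# so set it before the first import of arcadedb_embedded
os.environ["ARCADEDB_JVM_ERROR_FILE"] = "/tmp/arcadedb_jvm_error.log"

import arcadedb_embedded as arcadedb
```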
Out of Memory Errors¶
Problem: OutOfMemoryError or heap space errors
Solutions:

- Increase Heap via Environment Variable (recommended): raise -Xmx in ARCADEDB_JVM_ARGS as shown above.
- Bound Vector Caches (for vector workloads): set the -Darcadedb.vectorIndex.* limits from the table above.
- Use Batch Processing: commit in bounded batches instead of one huge transaction (see the sketch after this list).
- Close ResultSets: don't hold large result sets longer than needed.
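A minimal sketch of the batching pattern, using the transaction and vertex APIs shown elsewhere on this page; the record payload and batch size are illustrative:

```python
# Bound the working set: commit in batches instead of one huge transaction
batch_size = 10_000
records = [{"value": i} for i in range(100_000)]  # illustrative payload

for start in range(0, len(records), batch_size):
    with db.transaction():
        for record in records[start:start + batch_size]:
            vertex = db.new_vertex("Data")
            vertex.set("value", record["value"])
            vertex.save()
```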
Data Type Issues¶
Problem: Type conversion errors
Solutions:

- Use Correct Types: pass native Python int, float, str, and bool values.
- Convert NumPy Arrays: convert scalars and arrays explicitly before storing them (see the sketch after this list).
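A short sketch using the `to_java_float_array` helper shown later on this page; the NumPy scalar casts are standard Python, and the type/property names are illustrative:

```python
import numpy as np
from arcadedb_embedded import to_java_float_array

with db.transaction():
    vertex = db.new_vertex("Document")
    # NumPy scalars: cast to native Python types
    count = np.int64(42)
    vertex.set("count", int(count))
    score = np.float32(0.95)
    vertex.set("score", float(score))
    # NumPy arrays: convert explicitly
    embedding = np.random.rand(384).astype(np.float32)
    vertex.set("embedding", to_java_float_array(embedding))
    vertex.save()
```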
Transaction Already Active¶
Symptom:

```python
with db.transaction():
    with db.transaction():  # Nested!
        pass
# ArcadeDBError: Transaction already active
```
Cause: Nested transactions are not supported.
Solution:

Don't nest transactions:

```python
# Bad
with db.transaction():
    some_operation()
    with db.transaction():  # ✗ Error
        another_operation()

# Good
with db.transaction():
    some_operation()
    another_operation()
```
Or use separate transaction blocks:

```python
with db.transaction():
    some_operation()
# First transaction committed

with db.transaction():
    another_operation()
```
Query Syntax Error¶
Symptom: The SQL parser rejects the query, typically at an unquoted string literal.
Cause: String not properly quoted.
Solution:

Use parameters (recommended) rather than string interpolation, or quote string literals explicitly in the SQL text. Both fixes are sketched below.
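A sketch of both, using the parameter-binding style shown elsewhere on this page; the type and values are illustrative:

```python
# Recommended: bind parameters; no quoting or escaping to get wrong
result = db.query(
    "sql",
    "SELECT FROM User WHERE email = :email",
    {"email": "alice@example.com"},
)

# Alternative: quote string literals explicitly in the SQL text
result = db.query("sql", "SELECT FROM User WHERE email = 'alice@example.com'")
```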
Function Name Errors¶
Problem: SQL function not recognized
Solutions:

- Check Function Name Case: verify the function's spelling and case against the SQL function reference.
- Use Built-in Functions: prefer documented built-ins over guessed names (see the sketch after this list).
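A short sketch; `sysdate()` appears elsewhere in this guide, and the other names are illustrative:

```python
# sysdate() is used elsewhere on this page; verify other function names
# against the ArcadeDB SQL function reference before relying on them
result = db.query("sql", "SELECT sysdate() AS now")
for row in result:
    print(row.to_dict())

# Aggregate built-ins follow standard SQL conventions
result = db.query("sql", "SELECT count(*) AS total FROM User")
```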
Multi-line Query Issues¶
Problem: SQL parser errors with complex queries
Solution: Use single-line queries or proper escaping:
```python
# ✅ Single line (wrap in a transaction when executing)
query = "INSERT INTO Product SET name = 'test', created_at = sysdate()"

# ✅ Multi-line with proper formatting
query = """
INSERT INTO Product SET
    name = 'test',
    created_at = sysdate()
""".strip()
```
Type Conversion Error¶
Symptom: A conversion error when a NumPy array is passed directly to vertex.set().
Cause: NumPy arrays need explicit conversion.
Solution:
Use conversion utilities:
```python
import numpy as np
from arcadedb_embedded import to_java_float_array

embedding = np.array([1.0, 2.0, 3.0], dtype=np.float32)
vertex.set("embedding", to_java_float_array(embedding))
```
Performance Issues¶
Slow Queries¶
Symptom: Queries take seconds or minutes.
Diagnosis:
Use EXPLAIN to analyze:
```python
result = db.query("sql", "EXPLAIN SELECT FROM User WHERE email = 'alice@example.com'")
for row in result:
    print(row.to_dict())
```
Solutions:

- Create indexes: index the properties your WHERE clauses filter on.
- Use LIMIT: cap the number of rows returned.
- Project only needed fields: select specific properties instead of whole records (all three are sketched after this list).
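All three sketched, using the schema API shown elsewhere on this page; type and property names are illustrative:

```python
# 1. Index the properties your WHERE clauses filter on
db.schema.create_index("User", ["email"], unique=True)

# 2. Cap the result size with LIMIT
result = db.query("sql", "SELECT FROM User LIMIT 100")

# 3. Project only the fields you need instead of whole records
result = db.query("sql", "SELECT name, email FROM User WHERE name = :name", {"name": "Alice"})
```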
Slow Imports¶
Symptom: Importing data is very slow.
Solutions:

- Increase batch size (commitEvery): commit less often so each transaction does more work (see the sketch after this list).
- Drop indexes during import, then recreate them afterwards:

```python
# Drop indexes (Schema API preferred for embedded)
db.schema.drop_index("User[email]", force=True)

# Import data (vertices)
stats = importer.import_file(
    file_path="users.csv",
    import_type="vertices",
    type_name="User",
    typeIdProperty="id",
)

# Recreate indexes
db.schema.create_index("User", ["email"], unique=True)
```

- Use transactions efficiently:

```python
# Bad: Many small transactions
for record in records:
    with db.transaction():
        vertex = db.new_vertex("Data")
        vertex.set("data", record)
        vertex.save()

# Good: Batch in larger transactions
batch_size = 10000
for i in range(0, len(records), batch_size):
    with db.transaction():
        for record in records[i:i + batch_size]:
            vertex = db.new_vertex("Data")
            vertex.set("data", record)
            vertex.save()
```
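For the first item, a hypothetical sketch: commitEvery is the knob named above, but passing it as a keyword to import_file is an assumption; verify the exact parameter name against the importer reference:

```python
# Hypothetical: larger commit batches during import.
# `commitEvery` is named above as the batch-size knob; whether import_file
# accepts it as a keyword argument is an assumption to verify.
stats = importer.import_file(
    file_path="users.csv",
    import_type="vertices",
    type_name="User",
    commitEvery=50_000,
)
```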
High Memory Usage¶
Symptom: Process memory grows continuously.
Diagnosis:
Monitor memory:
```python
import os
import psutil

process = psutil.Process(os.getpid())
print(f"Memory: {process.memory_info().rss / 1024 / 1024:.1f} MB")
```
Solutions:

- Stream large ResultSets: iterate rows as they arrive instead of materializing a list (see the sketch after this list).
- Close ResultSets: release results you no longer need.
- Force garbage collection: nudge Python's collector after large operations.
- Smaller transactions: commit in bounded batches so uncommitted state stays small.
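A minimal sketch of the first three items; whether the ResultSet exposes an explicit close() is an assumption, so the code guards the call and otherwise just drops the reference:

```python
import gc

# Stream: handle rows one at a time instead of list(result)
result = db.query("sql", "SELECT FROM User")
for row in result:
    print(row.to_dict())  # replace with your per-row logic

# Release the result when done (close() is an assumption; else just drop it)
if hasattr(result, "close"):
    result.close()
del result

# Nudge the collector after large operations
gc.collect()
```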
Server Mode Issues¶
Server Won't Start¶
Symptom:

```python
server = arcadedb.create_server("./databases")
server.start()
# ArcadeDBError: Unable to start server
```
Solutions:

- Check port availability: another process may already be bound to the server port (see the sketch after this list).
- Use different port: configure the server to listen elsewhere if the default is taken.
- Check permissions: the process needs write access to the databases directory.
- Check logs: the server log usually names the exact startup failure.
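A port-availability check with the standard library, assuming 2480 as the default ArcadeDB HTTP port (an assumption; substitute your configured port):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.connect_ex((host, port)) == 0

if port_in_use(2480):  # 2480 assumed as the default HTTP port
    print("Port 2480 is taken; stop the other process or use another port")
```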
Can't Connect to Server¶
Symptom: Server running but can't connect via HTTP.
Solutions:

- Verify server is running: confirm the process started and is still alive.
- Check firewall: make sure the HTTP port is reachable from the client machine.
- Test with curl, or the sketch after this list: any HTTP response proves the listener is up.
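A reachability test with the standard library, again assuming port 2480; even an HTTP error status (e.g. 401) proves the server answered:

```python
import urllib.error
import urllib.request

url = "http://localhost:2480/"  # port assumed; use your configured one

try:
    with urllib.request.urlopen(url, timeout=5) as response:
        print(f"Server reachable, HTTP {response.status}")
except urllib.error.HTTPError as exc:
    # An HTTP error (401, 404, ...) still means the server answered
    print(f"Server reachable, HTTP {exc.code}")
except OSError as exc:
    print(f"Cannot reach server: {exc}")
```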
Vector Search Issues¶
Vector Dimension Mismatch¶
Symptom: An error reporting that the query or stored vector's dimension does not match the index dimension.
Cause: Embedding dimension doesn't match index dimension.
Solution:
Verify dimensions match:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

# Check model dimension
test_embedding = model.encode("test")
print(f"Model dimension: {len(test_embedding)}")  # 384

# Create index with matching dimension
index = db.create_vector_index(
    vertex_type="Document",
    vector_property="embedding",
    dimensions=384  # Must match!
)
```
Slow First Query¶
Symptom: The first vector search query takes significantly longer than subsequent queries.
Cause: The vector index is built lazily. The first query triggers the actual construction of the index ("warm up").
Solution: This is expected behavior. You can perform a "warm up" query during application startup if consistent query latency is required.
```python
import numpy as np

# Warm up index on startup (float32 to match the stored embeddings)
print("Warming up vector index...")
index.find_nearest(np.zeros(384, dtype=np.float32), k=1)
print("Index ready")
```
Poor Search Results¶
Symptom: Vector search returns irrelevant results.
Solutions:

- Try different distance function:

```python
# Cosine (default, usually best for text)
index = db.create_vector_index(
    vertex_type="Doc",
    vector_property="embedding",
    dimensions=384,
    distance_function="cosine"
)

# Euclidean (sometimes better for images)
index = db.create_vector_index(
    vertex_type="Image",
    vector_property="features",
    dimensions=512,
    distance_function="euclidean"
)
```

- Tune vector parameters: adjust the index's HNSW build and search settings (see the API reference).
- Improve embeddings: model choice and normalization often matter more than index tuning (see the sketch after this list).
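On the last point, a plain NumPy sketch (no ArcadeDB-specific API): L2-normalizing embeddings keeps cosine similarity well behaved:

```python
import numpy as np

def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length; zero vectors are left unchanged."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.where(norms == 0, 1.0, norms)

embeddings = np.random.rand(10, 384).astype(np.float32)  # illustrative
normalized = l2_normalize(embeddings)
print(np.linalg.norm(normalized, axis=1))  # all ~1.0
```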
Debugging¶
Enable Logging¶
Python logging:

```python
import logging

# Basic console logging (call basicConfig ONCE; later calls are ignored)
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Alternative: log to a file instead
# logging.basicConfig(
#     level=logging.DEBUG,
#     filename='arcadedb.log',
#     filemode='w',
#     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
# )

import arcadedb_embedded as arcadedb
# Now all operations will be logged
```
Java logging:

```python
import jpype

# Enable Java logging before importing arcadedb
# (JVM options are positional and must precede keyword arguments)
jpype.startJVM(
    "-Djava.util.logging.config.file=logging.properties",
    classpath=[...],
)
```
logging.properties:

```properties
.level=INFO
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=ALL
# java.util.logging has no DEBUG level; FINE is the closest equivalent
com.arcadedb.level=FINE
```
Inspect Java Objects¶
```python
# Get Java class name
java_obj = vertex._java_vertex
print(java_obj.getClass().getName())

# List methods
for method in java_obj.getClass().getMethods():
    print(method.getName())

# Get property value (raw Java)
value = java_obj.get("property_name")
print(f"Type: {type(value)}, Value: {value}")
```
Transaction Debugging¶
```python
class DebugTransaction:
    """Debug wrapper for transactions."""

    def __init__(self, db):
        self.db = db
        self.transaction = None

    def __enter__(self):
        print("Starting transaction")
        self.transaction = self.db.transaction()
        return self.transaction.__enter__()

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            print(f"Transaction failed: {exc_type.__name__}: {exc_val}")
        else:
            print("Transaction committed")
        return self.transaction.__exit__(exc_type, exc_val, exc_tb)

# Usage
with DebugTransaction(db):
    vertex = db.new_vertex("User")
    vertex.set("name", "Alice")
    vertex.save()
```
Query Debugging¶
```python
def debug_query(db, language, query, *args):
    """Execute query with debugging."""
    print(f"Query: {query}")
    if args:
        print(f"Params: {args}")
    try:
        result = db.query(language, query, *args)
        rows = list(result)
        print(f"Results: {len(rows)} rows")
        return rows
    except Exception as e:
        print(f"Error: {e}")
        raise

# Usage
results = debug_query(db, "sql", "SELECT FROM User WHERE name = :name", {"name": "Alice"})
```
Common Error Messages¶
"Property not found"¶
Meaning: Trying to get property that doesn't exist.
Solution:
```python
# Check if property exists
if vertex.has_property("name"):
    name = vertex.get("name")
else:
    name = "Unknown"

# Or use default
name = vertex.get("name") or "Unknown"
```
"Type not found"¶
Meaning: Vertex/Edge type doesn't exist.
Solution:
```python
# Create type first (Schema API is auto-transactional)
db.schema.get_or_create_vertex_type("User")

# Then create vertex
with db.transaction():
    vertex = db.new_vertex("User")
```
"Index already exists"¶
Meaning: Trying to create duplicate index.
Solution:
```python
# Drop existing index
try:
    db.schema.drop_index("User[email]", force=True)
except Exception:
    pass  # Index doesn't exist

# Create new index
db.schema.create_index("User", ["email"], unique=True)
```
"Unique constraint violation"¶
Meaning: Trying to insert duplicate value for unique property.
Solution:
```python
# Check if exists first
result = db.query("sql", "SELECT FROM User WHERE email = :email", {"email": "alice@example.com"})
if result.has_next():
    # Update existing (writes also need a transaction)
    with db.transaction():
        vertex = result.next()
        vertex.set("name", "Alice")
        vertex.save()
else:
    # Create new
    with db.transaction():
        vertex = db.new_vertex("User")
        vertex.set("email", "alice@example.com")
        vertex.set("name", "Alice")
        vertex.save()
```
Getting Help¶
- Check Documentation: review this guide and the API reference first.
- Search Issues: look for existing reports of the same problem.
- Report Bug: include:
  - Python version (python --version)
  - Package version (uv pip show arcadedb-embedded)
  - Minimal reproducible example
  - Full error message with stack trace
  - Operating system
See Also¶
- Architecture - System architecture and design
- Database API - Core database operations
- Exceptions API - Error handling reference