Troubleshooting

Common issues, solutions, and debugging techniques for ArcadeDB Python bindings.

Installation Issues

Package Import Errors

Problem: Can't import arcadedb_embedded module

Solutions:

  1. Verify Installation:

    uv pip show arcadedb-embedded
    uv pip list | grep arcadedb
    

  2. Reinstall Package:

    uv pip uninstall arcadedb-embedded
    uv pip install arcadedb-embedded
    

  3. Force a clean reinstall if the wheel looks corrupted: Wheels bundle the ArcadeDB JRE and JARs, so no external Java install is needed. If imports still fail after step 2, reinstall while bypassing the wheel cache:

    uv pip uninstall -y arcadedb-embedded
    uv pip install --no-cache-dir arcadedb-embedded
    

  4. Check Python Path:

    import sys
    print(sys.path)
    

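If the package shows up in uv pip list but still won't import, confirm which copy of the module the interpreter actually resolves. A minimal check using only the standard library:

import importlib.util

# Locate the module on the current sys.path without importing it
spec = importlib.util.find_spec("arcadedb_embedded")
if spec is None:
    print("arcadedb_embedded is not visible to this interpreter")
else:
    print(f"Resolved to: {spec.origin}")

If nothing is found, the package is installed into a different environment than the one running your script.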

Runtime Errors

Database Connection Issues

Problem: Can't connect to database

Solutions:

  1. Check Database Path:

    import os
    db_path = "databases/mydb"
    print(f"Exists: {os.path.exists(db_path)}")
    

  2. Verify Database Created:

    import os
    import arcadedb_embedded as arcadedb
    
    db_path = "databases/mydb"
    
    # Create the database if it doesn't exist, otherwise open it
    if not os.path.exists(db_path):
        db = arcadedb.create_database(db_path)
    else:
        db = arcadedb.open_database(db_path)
    

  3. Check Permissions:

    ls -la databases/
    chmod -R 755 databases/
    


Database Already Exists

Symptom:

arcadedb.create_database("./mydb")
# ArcadeDBError: Database already exists

Solution:

Use open_database() instead:

import os
import arcadedb_embedded as arcadedb

if os.path.exists("./mydb"):
    db = arcadedb.open_database("./mydb")
else:
    db = arcadedb.create_database("./mydb")

Or delete existing database:

import shutil

# Remove existing database
if os.path.exists("./mydb"):
    shutil.rmtree("./mydb")

# Create fresh database
db = arcadedb.create_database("./mydb")

Database Locked

Symptom: ArcadeDBError: Database is locked by another process

Cause: Another process has the database open.

Solution:

  1. Close other connections:

    # Ensure previous database is closed
    db.close()
    

  2. Check for orphaned processes:

    # Linux/macOS
    ps aux | grep python
    kill <PID>
    
    # Windows
    tasklist | findstr python
    taskkill /PID <PID>
    

  3. Remove lock file (last resort):

    # Only if you're sure no process is using the database
    rm ./mydb/.lock
    

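Stale locks usually come from a process that exited without closing the database. A simple guard is to release the handle deterministically, even when an exception occurs (a sketch using the close() call shown above):

db = arcadedb.open_database("./mydb")
try:
    with db.transaction():
        vertex = db.new_vertex("User")
        vertex.set("name", "Alice")
        vertex.save()
finally:
    # Always release the lock, even if the work above raises
    db.close()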

Memory Configuration

JVM Memory Configuration

Configure JVM memory via the ARCADEDB_JVM_ARGS environment variable before importing arcadedb_embedded:

Basic Configuration:

# Default: 4GB heap
python script.py

# Production: 8GB heap with matching initial size
export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g"
python script.py

# One-liner
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g" python script.py

Common JVM Options:

  • -Xmx<size>: maximum heap memory (e.g., -Xmx8g for 8 gigabytes)
  • -Xms<size>: initial heap size, recommended to match -Xmx (e.g., -Xms8g)
  • -XX:MaxDirectMemorySize=<size>: limit off-heap direct buffers (e.g., -XX:MaxDirectMemorySize=8g)
  • -Darcadedb.vectorIndex.locationCacheSize=<count>: maximum vector locations to cache; default -1 = unlimited (e.g., -Darcadedb.vectorIndex.locationCacheSize=100000)
  • -Darcadedb.vectorIndex.graphBuildCacheSize=<count>: maximum vectors cached during HNSW build; default 10000 (e.g., -Darcadedb.vectorIndex.graphBuildCacheSize=3000)
  • -Darcadedb.vectorIndex.mutationsBeforeRebuild=<count>: mutations before graph rebuild; default 100 (e.g., -Darcadedb.vectorIndex.mutationsBeforeRebuild=200)

Vector Index Memory Tuning:

For applications using vector indexes, control memory usage:

# Conservative: bounded caches for large vector datasets
export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g \
  -Darcadedb.vectorIndex.locationCacheSize=100000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=3000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=200"
python vector_app.py

Cache Size Guidelines:

  • locationCacheSize: number of vector locations to cache (each entry is ~56 bytes)
      • 100000 entries ≈ 5.6 MB
      • -1 = unlimited (backward compatible, may consume unbounded memory)
      • Recommended: 100000 for datasets with 1M+ vectors

  • graphBuildCacheSize: number of vectors cached during HNSW build
      • Memory ≈ cacheSize × (dimensions × 4 + 64) bytes
      • For 768-dim vectors: 10000 entries ≈ 30 MB
      • Lower values reduce build-time memory spikes
      • Recommended: 3000-5000 for high-dimensional vectors

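The build-cache formula above makes it easy to size the cache before picking a flag value. A small worked example (plain Python, no ArcadeDB required):

def graph_build_cache_mb(cache_size: int, dimensions: int) -> float:
    # Memory ≈ cache_size × (dimensions × 4 + 64) bytes (formula above)
    return cache_size * (dimensions * 4 + 64) / (1024 * 1024)

# Default cache of 10000 entries with 768-dim vectors
print(f"{graph_build_cache_mb(10000, 768):.1f} MB")   # ~29.9 MB

# Bounded cache of 2000 entries with 1536-dim vectors
print(f"{graph_build_cache_mb(2000, 1536):.1f} MB")   # ~11.8 MB
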
Memory Planning:

Total Process Memory = JVM Heap + Off-Heap Components

Off-Heap Components:
- Direct buffers (MaxDirectMemorySize)
- Metaspace (class definitions)
- Page cache
- Thread stacks
- Vector index caches (if bounded)

Rule of thumb: plan for 1.5-2× your heap size in actual RAM (an 8 GB heap means budgeting roughly 12-16 GB for the process).

Example Configurations:

# Small datasets (<1M records, <100K vectors)
ARCADEDB_JVM_ARGS="-Xmx2g -Xms2g"

# Medium datasets (1M-10M records, 100K-1M vectors)
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g"

# Large datasets (10M+ records, 1M+ vectors) with bounded caches
ARCADEDB_JVM_ARGS="-Xmx16g -Xms16g -XX:MaxDirectMemorySize=16g \
  -Darcadedb.vectorIndex.locationCacheSize=100000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=5000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=200"

# High-dimensional vectors (e.g., 1536-dim embeddings)
ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g -XX:MaxDirectMemorySize=8g \
  -Darcadedb.vectorIndex.locationCacheSize=50000 \
  -Darcadedb.vectorIndex.graphBuildCacheSize=2000 \
  -Darcadedb.vectorIndex.mutationsBeforeRebuild=150"

Configuration Timing

ARCADEDB_JVM_ARGS must be set before the first import arcadedb_embedded. The JVM can only be configured once per Python process.
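
If you cannot control the shell environment, the variable can also be set from within Python, provided it happens before the first import (a sketch; only the ordering matters here):

import os

# Must run before the first `import arcadedb_embedded` in this process
os.environ["ARCADEDB_JVM_ARGS"] = "-Xmx8g -Xms8g"

import arcadedb_embedded as arcadedb  # JVM starts with the args above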

Alternative: ARCADEDB_JVM_ERROR_FILE

Set crash log location:

export ARCADEDB_JVM_ERROR_FILE="/var/log/arcade/errors.log"

Out of Memory Errors

Problem: OutOfMemoryError or heap space errors

Solutions:

  1. Increase Heap via Environment Variable (Recommended):

    export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g"
    python script.py
    

  2. Bound Vector Caches (for vector workloads):

    export ARCADEDB_JVM_ARGS="-Xmx8g -Xms8g \
      -Darcadedb.vectorIndex.locationCacheSize=100000 \
      -Darcadedb.vectorIndex.graphBuildCacheSize=3000"
    python script.py
    

  3. Use Batch Processing:

    batch_size = 1000
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        process_batch(batch)
    

  4. Close ResultSets:

    result = db.query("sql", "SELECT FROM LargeTable")
    try:
        for row in result:
            process(row)
    finally:
        result.close()
    


Data Type Issues

Problem: Type conversion errors

Solutions:

  1. Use Correct Types:

    # Integer
    vertex.set("age", 25)
    
    # String
    vertex.set("name", "Alice")
    
    # List
    vertex.set("tags", ["python", "database"])
    
    # DateTime
    from datetime import datetime
    vertex.set("created", datetime.now())
    

  2. Convert NumPy Arrays:

    from arcadedb_embedded import to_java_float_array
    import numpy as np
    
    arr = np.array([1.0, 2.0, 3.0], dtype=np.float32)
    vertex.set("embedding", to_java_float_array(arr))
    


Transaction Already Active

Symptom:

with db.transaction():
    with db.transaction():  # Nested!
        pass
# ArcadeDBError: Transaction already active

Cause: Nested transactions not supported.

Solution:

Don't nest transactions:

# Bad
with db.transaction():
    some_operation()
    with db.transaction():  # ✗ Error
        another_operation()

# Good
with db.transaction():
    some_operation()
    another_operation()

Or use separate transaction blocks:

with db.transaction():
    some_operation()

# First transaction committed

with db.transaction():
    another_operation()

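A reliable way to avoid accidental nesting is to keep transaction control at the call site and write helpers that never open a transaction of their own (a sketch built from the API calls used elsewhere on this page):

def add_user(db, name):
    # Assumes the caller already holds an open transaction
    vertex = db.new_vertex("User")
    vertex.set("name", name)
    vertex.save()

# One transaction at the top level; helpers stay nesting-free
with db.transaction():
    add_user(db, "Alice")
    add_user(db, "Bob")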

Query Syntax Error

Symptom:

db.query("sql", "SELECT * FROM User WHERE name = Alice")
# ArcadeDBError: Syntax error near 'Alice'

Cause: String not properly quoted.

Solution:

Use parameters (RECOMMENDED):

db.query("sql",
    "SELECT FROM User WHERE name = :name",
    {"name": "Alice"}
)

Or quote strings in SQL:

db.query("sql", "SELECT FROM User WHERE name = 'Alice'")
#                                              ↑    ↑ quotes


Function Name Errors

Problem: SQL function not recognized

Solutions:

  1. Check Function Name Case:

    # Wrong
    with db.transaction():
        db.command("sql", "INSERT INTO Product SET created = SYSDATE()")
    
    # Correct
    with db.transaction():
        db.command("sql", "INSERT INTO Product SET created = sysdate()")
    

  2. Use Built-in Functions:

    # Date/time
    with db.transaction():
        db.command("sql", "INSERT INTO Event SET timestamp = sysdate()")
    
    # UUID
    with db.transaction():
        db.command("sql", "INSERT INTO User SET id = uuid()")
    


Multi-line Query Issues

Problem: SQL parser errors with complex queries

Solution: Use single-line queries or proper escaping:

# ✅ Single line (wrap in a transaction when executing)
query = "INSERT INTO Product SET name = 'test', created_at = sysdate()"

# ✅ Multi-line with proper formatting
query = """
INSERT INTO Product SET
    name = 'test',
    created_at = sysdate()
""".strip()


Type Conversion Error

Symptom:

vertex.set("embedding", numpy_array)
# TypeError: Cannot convert numpy.ndarray to Java type

Cause: NumPy arrays need explicit conversion.

Solution:

Use conversion utilities:

from arcadedb_embedded import to_java_float_array
import numpy as np

embedding = np.array([1.0, 2.0, 3.0], dtype=np.float32)
vertex.set("embedding", to_java_float_array(embedding))

Performance Issues

Slow Queries

Symptom: Queries take seconds or minutes.

Diagnosis:

Use EXPLAIN to analyze:

result = db.query("sql", "EXPLAIN SELECT FROM User WHERE email = 'alice@example.com'")
for row in result:
    print(row.to_dict())

Solutions:

  1. Create indexes:

    # Schema API is auto-transactional (preferred for embedded use)
    db.schema.create_index("User", ["email"], unique=True)
    

  2. Use LIMIT:

    # Bad: Load everything
    result = db.query("sql", "SELECT FROM User")
    
    # Good: Limit results
    result = db.query("sql", "SELECT FROM User LIMIT 100")
    

  3. Project only needed fields:

    # Bad: Load all properties
    result = db.query("sql", "SELECT FROM User")
    
    # Good: Only needed fields
    result = db.query("sql", "SELECT name, email FROM User")
    


Slow Imports

Symptom: Importing data is very slow.

Solutions:

  1. Increase batch size (commitEvery):

    from arcadedb_embedded import Importer
    importer = Importer(db)
    stats = importer.import_file(
        file_path="users.csv",
        import_type="vertices",
        type_name="User",
        typeIdProperty="id",
        commitEvery=10000,  # Default is 5000
    )
    

  2. Drop indexes during import:

    # Drop indexes (Schema API preferred for embedded)
    db.schema.drop_index("User[email]", force=True)
    
    # Import data (vertices)
    stats = importer.import_file(
        file_path="users.csv",
        import_type="vertices",
        type_name="User",
        typeIdProperty="id",
    )
    
    # Recreate indexes
    db.schema.create_index("User", ["email"], unique=True)
    

  3. Use transactions efficiently:

    # Bad: Many small transactions
    for record in records:
        with db.transaction():
            vertex = db.new_vertex("Data")
            vertex.set("data", record)
            vertex.save()
    
    # Good: Batch in larger transactions
    batch_size = 10000
    for i in range(0, len(records), batch_size):
        with db.transaction():
            for record in records[i:i+batch_size]:
                vertex = db.new_vertex("Data")
                vertex.set("data", record)
                vertex.save()
    


High Memory Usage

Symptom: Process memory grows continuously.

Diagnosis:

Monitor memory:

import psutil
import os

process = psutil.Process(os.getpid())
print(f"Memory: {process.memory_info().rss / 1024 / 1024:.1f} MB")

Solutions:

  1. Stream large ResultSets:

    # Bad: Load all results
    result = db.query("sql", "SELECT FROM LargeTable")
    all_results = list(result)  # Loads everything!
    
    # Good: Process streaming
    result = db.query("sql", "SELECT FROM LargeTable")
    for row in result:
        process(row)
        # Only one row in memory
    

  2. Close ResultSets:

    result = db.query("sql", "SELECT FROM User")
    try:
        for row in result:
            if some_condition(row):
                break
    finally:
        # Iterating to exhaustion closes the ResultSet automatically;
        # close explicitly when you might break out early
        result.close()
    

  3. Force garbage collection:

    import gc
    
    for batch in large_dataset:
        process_batch(batch)
        gc.collect()  # Trigger GC
    

  4. Smaller transactions:

    # Bad: Huge transaction
    with db.transaction():
        for i in range(1000000):
            vertex = db.new_vertex("Data")
            vertex.save()
    
    # Good: Batch transactions
    batch_size = 10000
    for i in range(0, 1000000, batch_size):
        with db.transaction():
            for j in range(batch_size):
                vertex = db.new_vertex("Data")
                vertex.save()
    

Server Mode Issues

Server Won't Start

Symptom:

server = arcadedb.create_server("./databases")
server.start()
# ArcadeDBError: Unable to start server

Solutions:

  1. Check port availability:

    # Linux/macOS
    lsof -i :2480
    
    # Windows
    netstat -ano | findstr :2480
    

    Use a different port if 2480 is already taken:

    server = arcadedb.create_server(
        root_path="./databases",
        http_port=8080  # Different port
    )
    

  2. Check permissions:

    ls -la ./databases
    # Ensure write permissions
    chmod -R 755 ./databases
    

  3. Check logs:

    # Enable logging
    import logging
    logging.basicConfig(level=logging.DEBUG)
    
    server = arcadedb.create_server("./databases")
    server.start()
    # Check log output
    


Can't Connect to Server

Symptom: Server running but can't connect via HTTP.

Solutions:

  1. Verify server is running:

    if server.is_started():
        print("Server is running")
        print(f"URL: http://localhost:{server.http_port}")
    

  2. Check firewall:

    # Linux
    sudo ufw allow 2480
    
    # macOS
    # System Preferences > Security & Privacy > Firewall
    

  3. Test with curl:

    curl http://localhost:2480/api/v1/server
    

Vector Search Issues

Vector Dimension Mismatch

Symptom:

vertex.save()
# ArcadeDBError: Vector dimension mismatch

Cause: Embedding dimension doesn't match index dimension.

Solution:

Verify dimensions match:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

# Check model dimension
test_embedding = model.encode("test")
print(f"Model dimension: {len(test_embedding)}")  # 384

# Create index with matching dimension
index = db.create_vector_index(
    vertex_type="Document",
    vector_property="embedding",
    dimensions=384  # Must match!
)

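To catch mismatches before they reach the index, validate the embedding length at write time. A sketch, where EXPECTED_DIM stands for whatever dimensions= value your index was created with:

import numpy as np
from arcadedb_embedded import to_java_float_array

EXPECTED_DIM = 384  # Must match the dimensions= used at index creation

def set_embedding(vertex, embedding):
    arr = np.asarray(embedding, dtype=np.float32)
    if arr.shape != (EXPECTED_DIM,):
        raise ValueError(f"Expected {EXPECTED_DIM}-dim vector, got {arr.shape}")
    vertex.set("embedding", to_java_float_array(arr))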

Slow First Query

Symptom: The first vector search query takes significantly longer than subsequent queries.

Cause: The vector index is built lazily. The first query triggers the actual construction of the index ("warm up").

Solution: This is expected behavior. You can perform a "warm up" query during application startup if consistent query latency is required.

# Warm up index on startup
import numpy as np

print("Warming up vector index...")
index.find_nearest(np.zeros(384, dtype=np.float32), k=1)
print("Index ready")

Poor Search Results

Symptom: Vector search returns irrelevant results.

Solutions:

  1. Try different distance function:

    # Cosine (default, usually best for text)
    index = db.create_vector_index(
        vertex_type="Doc",
        vector_property="embedding",
        dimensions=384,
        distance_function="cosine"
    )
    
    # Euclidean (sometimes better for images)
    index = db.create_vector_index(
        vertex_type="Image",
        vector_property="features",
        dimensions=512,
        distance_function="euclidean"
    )
    

  2. Tune vector parameters:

    # Better recall, slower
    index = db.create_vector_index(
        vertex_type="Doc",
        vector_property="embedding",
        dimensions=384,
        max_connections=32,  # Default: 16
        beam_width=200       # Default: 100
    )
    

  3. Improve embeddings:

    # Combine title and content
    text = f"{doc['title']}. {doc['content']}"
    embedding = model.encode(text)
    
    # vs. just content
    embedding = model.encode(doc['content'])  # May be less effective
    

Debugging

Enable Logging

Python logging:

import logging

# Basic logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# File logging
logging.basicConfig(
    level=logging.DEBUG,
    filename='arcadedb.log',
    filemode='w',
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

import arcadedb_embedded as arcadedb
# Now all operations will be logged

Java logging:

import jpype

# Enable Java logging before importing arcadedb
jpype.startJVM(
    "-Djava.util.logging.config.file=logging.properties",
    classpath=[...],
)

logging.properties:

.level=INFO
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=ALL
com.arcadedb.level=DEBUG


Inspect Java Objects

# Get Java class name
java_obj = vertex._java_vertex
print(java_obj.getClass().getName())

# List methods
for method in java_obj.getClass().getMethods():
    print(method.getName())

# Get property value (raw Java)
value = java_obj.get("property_name")
print(f"Type: {type(value)}, Value: {value}")

Transaction Debugging

class DebugTransaction:
    """Debug wrapper for transactions."""

    def __init__(self, db):
        self.db = db
        self.transaction = None

    def __enter__(self):
        print("Starting transaction")
        self.transaction = self.db.transaction()
        return self.transaction.__enter__()

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type:
            print(f"Transaction failed: {exc_type.__name__}: {exc_val}")
        else:
            print("Transaction committed")
        return self.transaction.__exit__(exc_type, exc_val, exc_tb)

# Usage
with DebugTransaction(db):
    vertex = db.new_vertex("User")
    vertex.set("name", "Alice")
    vertex.save()

Query Debugging

def debug_query(db, language, query, *args):
    """Execute query with debugging."""
    print(f"Query: {query}")
    if args:
        print(f"Params: {args}")

    try:
        result = db.query(language, query, *args)
        rows = list(result)
        print(f"Results: {len(rows)} rows")
        return rows
    except Exception as e:
        print(f"Error: {e}")
        raise

# Usage
results = debug_query(db, "sql", "SELECT FROM User WHERE name = :name", {"name": "Alice"})

Common Error Messages

"Property not found"

Meaning: Trying to get property that doesn't exist.

Solution:

# Check if property exists
if vertex.has_property("name"):
    name = vertex.get("name")
else:
    name = "Unknown"

# Or use default
name = vertex.get("name") or "Unknown"


"Type not found"

Meaning: Vertex/Edge type doesn't exist.

Solution:

# Create type first (Schema API is auto-transactional)
db.schema.get_or_create_vertex_type("User")

# Then create vertex
with db.transaction():
    vertex = db.new_vertex("User")


"Index already exists"

Meaning: Trying to create duplicate index.

Solution:

# Drop existing index
try:
    db.schema.drop_index("User[email]", force=True)
except Exception:
    pass  # Index doesn't exist

# Create new index
db.schema.create_index("User", ["email"], unique=True)


"Unique constraint violation"

Meaning: Trying to insert duplicate value for unique property.

Solution:

# Check if exists first
result = db.query("sql", "SELECT FROM User WHERE email = :email", {"email": "alice@example.com"})

if result.has_next():
    vertex = result.next()
    # Update existing (writes need a transaction too)
    with db.transaction():
        vertex.set("name", "Alice")
        vertex.save()
else:
    # Create new
    with db.transaction():
        vertex = db.new_vertex("User")
        vertex.set("email", "alice@example.com")
        vertex.set("name", "Alice")
        vertex.save()

Getting Help

  1. Check the documentation.

  2. Search existing issues.

  3. Report a bug. Include the following (a helper for gathering most of it is sketched below):

    • Python version (python --version)
    • Package version (uv pip show arcadedb-embedded)
    • Minimal reproducible example
    • Full error message with stack trace
    • Operating system
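
A small helper that gathers most of this automatically (importlib.metadata is in the standard library since Python 3.8):

import sys
import platform
from importlib.metadata import version, PackageNotFoundError

print(f"Python:  {sys.version}")
print(f"OS:      {platform.platform()}")
try:
    print(f"Package: arcadedb-embedded {version('arcadedb-embedded')}")
except PackageNotFoundError:
    print("Package: arcadedb-embedded not installed")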
