Java API Coverage Analysis
This section provides a practical mapping between the ArcadeDB Java API and the
Python bindings surface in this repository. It reflects the current code in
arcadedb_embedded rather than a theoretical, full Java surface comparison.
Executive Summary
The Python bindings expose the core database, schema, graph, vector, async, import/export, and server workflows needed for typical application usage. Most omissions are low-level JVM internals (WAL details, bucket scanning, binary protocol, server plugins, clustering) that are not typically used from Python.
Coverage by Area (Qualitative)
| Area | Status | Notes |
|---|---|---|
| Core Database | ✅ Supported | DatabaseFactory, Database, transactions, lookups, batch helpers |
| Query Execution | ✅ Supported | SQL, OpenCypher, MongoDB, GraphQL passthrough |
| Schema & Indexes | ✅ Supported | Types, properties, LSM/FULL_TEXT/Vector indexes |
| Graph API | ✅ Supported | Document, Vertex, Edge wrappers + query traversal |
| Vector Search | ✅ Supported | JVector indexes + NumPy conversion helpers |
| Async & Batch | ✅ Supported | AsyncExecutor, BatchContext |
| Data Import | ✅ Supported | CSV/TSV, XML, and ArcadeDB JSONL import |
| Data Export | ✅ Supported | JSONL/GraphML/GraphSON + CSV for query results |
| Server Mode | ✅ Supported | Embedded server lifecycle + Studio access |
| Advanced/Low-level | ❌ Not exposed | WAL internals, binary protocol, HA/replication, plugins |
Detailed Coverage
1. Core Database Operations
DatabaseFactory:
- ✅ create(), open(), exists()
Database:
- ✅ query(language, query, *args) and command(language, command, *args)
- ✅ Transactions: begin(), commit(), rollback(), transaction()
- ✅ Records: new_document(), new_vertex(), lookup_by_rid(), lookup_by_key()
- ✅ Utilities: count_type(), drop(), get_name(), get_database_path(), is_open(), close()
- ✅ Configuration: set_auto_transaction(), set_read_your_writes()
- ✅ Async/batch: async_executor() and batch_context()
- ✅ Export helpers: export_database() and export_to_csv()
Not directly exposed: bucket scans, WAL internals, low-level binary protocol
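As a rough sketch of how the record and transaction calls listed above fit together, the helper below inserts a batch of documents and returns the resulting record count. It uses only method names from the list (transaction(), new_document(), set(), save(), count_type()). Two things are assumptions on our part: that transaction() works as a context manager which commits on normal exit and rolls back when the body raises, and the helper name insert_documents itself, which is hypothetical and not part of the bindings.

```python
from typing import Any, Iterable, Mapping


def insert_documents(db: Any, type_name: str, rows: Iterable[Mapping[str, Any]]) -> int:
    """Insert each mapping in `rows` as a document of `type_name`.

    `db` is expected to be an open arcadedb_embedded Database.
    Assumption: db.transaction() is a context manager that commits on
    success and rolls back if the body raises.
    """
    with db.transaction():
        for row in rows:
            doc = db.new_document(type_name)
            for name, value in row.items():
                doc.set(name, value)
            doc.save()
    # After the commit, count_type() reports how many records of the type exist.
    return db.count_type(type_name)
```

With an open database, a call such as insert_documents(db, "Person", [{"name": "Alice"}]) would be expected to return the number of Person records after the insert, provided the type already exists.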
2. Query Execution
All query languages supported by the underlying ArcadeDB engine can be used via
db.query() and db.command():
- ✅ SQL
- ✅ OpenCypher
- ✅ MongoDB query syntax
- ✅ GraphQL
ResultSet & Results:
- ✅ Pythonic iteration (__iter__, __next__)
- ✅ has_next(), next()
- ✅ get(), has_property(), get_property_names()
- ✅ to_json(), to_dict() (Python enhancement)
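The result methods above combine naturally with ordinary Python iteration. The following is a minimal sketch, assuming each row offers to_dict(), has_property(name), and get(name) as listed; rows_as_dicts and column are hypothetical helper names, not functions shipped with the bindings. We also assume a result set may only be iterated once, so each helper consumes it in a single pass.

```python
from typing import Any, Dict, Iterable, List


def rows_as_dicts(result_set: Iterable[Any]) -> List[Dict[str, Any]]:
    """Materialize a result set into plain dicts via each row's to_dict()."""
    return [row.to_dict() for row in result_set]


def column(result_set: Iterable[Any], name: str, default: Any = None) -> List[Any]:
    """Collect one property from every row, using `default` where it is absent."""
    return [row.get(name) if row.has_property(name) else default for row in result_set]
```

For example, rows_as_dicts(db.query("sql", "SELECT FROM User")) would produce a list that can be handed to other Python code without keeping the result set open.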
3. Graph API
Hybrid approach: Pythonic object manipulation + query languages
Vertex & Edge Manipulation (Pythonic):
- ✅ db.new_vertex(type) / db.new_document(type)
- ✅ record.set(name, value) / record.save() / record.delete() / record.modify()
- ✅ vertex.new_edge(label, target, **props) (bidirectionality controlled by EdgeType schema)
- ✅ vertex.get_out_edges(), get_in_edges(), get_both_edges()
- ✅ db.lookup_by_rid(rid) for direct record access
Graph Traversals & Queries:
- ✅ SQL traversal: SELECT * FROM User WHERE out('Follows').name = 'Alice'
- ✅ OpenCypher patterns: MATCH (a:User)-[:FOLLOWS]->(b) RETURN b
- ✅ Path finding, shortest paths, pattern matching
Not exposed: event listeners/callback hooks, low-level graph internals
Object-Oriented Approach (Recommended):
from datetime import date

# Create vertices with the fluent Python API (writes belong inside a transaction)
alice = db.new_vertex("Person").set("name", "Alice").save()
bob = db.new_vertex("Person").set("name", "Bob").save()
# Create an edge with properties (bidirectionality is determined by the EdgeType schema)
edge = alice.new_edge("Follows", bob, since=date.today())
edge.save()
Query-Based Approach (Also Supported):
# Create edges via SQL
db.command("sql", """
CREATE EDGE Follows
FROM (SELECT FROM User WHERE id = 1)
TO (SELECT FROM User WHERE id = 2)
""")
# Or via Cypher
db.command("cypher", """
MATCH (a:User {id: 1}), (b:User {id: 2})
CREATE (a)-[:FOLLOWS]->(b)
""")
# Traverse via Cypher
result = db.query("cypher", """
MATCH (user:User {name: 'Alice'})-[:FOLLOWS]->(friend)
RETURN friend.name
""")
4. Schema & Index API
Full Pythonic Schema API available via db.schema:
- ✅ create_document_type(), create_vertex_type(), create_edge_type()
- ✅ get_or_create_*() helpers
- ✅ create_property(), drop_property()
- ✅ drop_type(), exists_type(), get_type(), get_types()
- ✅ Indexes: create_index(), drop_index(), get_indexes(), exists_index()
- ✅ Vector indexes: create_vector_index() (on Database), list_vector_indexes()
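One common use of the schema methods above is idempotent setup at application start. The sketch below shows the idea under stated assumptions: the argument shapes exists_type(name), create_vertex_type(name), and create_property(type_name, property_name, property_type) are our guesses at the signatures and should be checked against the bindings, and ensure_vertex_type is a hypothetical helper name.

```python
from typing import Any, Mapping


def ensure_vertex_type(schema: Any, type_name: str, properties: Mapping[str, str]) -> bool:
    """Create `type_name` with `properties` unless the type already exists.

    Returns True when the type was created, False when it was already there.
    Assumed (unverified) signatures: exists_type(name),
    create_vertex_type(name), create_property(type, property, data_type).
    """
    if schema.exists_type(type_name):
        return False
    schema.create_vertex_type(type_name)
    for prop_name, prop_type in properties.items():
        schema.create_property(type_name, prop_name, prop_type)
    return True
```

A call might look like ensure_vertex_type(db.schema, "User", {"email": "STRING"}); running it twice should leave the schema unchanged the second time.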
5. Server Mode
- ✅ ArcadeDBServer(root_path, config) - Server initialization
- ✅ start(), stop(), context manager support
- ✅ get_database(), create_database() - Database management
- ✅ get_studio_url(), get_http_port() - Python enhancements
- ✅ Embedded and HTTP mode support
- ❌ Plugin management, HA/replication, advanced user/security management
6. Data Import
Supported:
- ✅ CSV/TSV - import_csv() (documents/vertices/edges, FK resolution)
- ✅ XML - import_xml() (documents/vertices)
- ✅ ArcadeDB JSONL exports - IMPORT DATABASE file://... via SQL
- ✅ Edge import with foreign key resolution
- ✅ Batch processing and parallel import
- ✅ Automatic type inference
Not Implemented:
- ❌ RDF/OrientDB/GloVe/Word2Vec importers
- ❌ Direct JSON array import (use JSONL instead)
Note: The supported formats (CSV, XML, ArcadeDB JSONL export/import) cover most real-world data migration scenarios.
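Since direct JSON array import is not implemented, one possible workaround for small and medium files is to load the array in Python and insert the objects through the record API. The sketch below is an illustration only: import_json_array is a hypothetical helper, it assumes db.transaction() is a context manager that commits on exit, and it handles flat objects with scalar values (nested structures may need conversion before they can be stored).

```python
import json
from typing import Any, TextIO


def import_json_array(db: Any, type_name: str, source: TextIO, batch_size: int = 1000) -> int:
    """Insert every object of a JSON array as a document; return how many were inserted.

    Assumption: db.transaction() is a context manager that commits on exit.
    One transaction is opened per batch so a large file is not held in a
    single commit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    records = json.load(source)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    inserted = 0
    for start in range(0, len(records), batch_size):
        with db.transaction():
            for record in records[start:start + batch_size]:
                doc = db.new_document(type_name)
                for name, value in record.items():
                    doc.set(name, value)
                doc.save()
                inserted += 1
    return inserted
```

The whole file is read into memory by json.load, so for very large datasets the CSV importer is likely the better fit.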
7. Data Export
- ✅ JSONL export - Full database backup format
- ✅ GraphML export - Graph visualization format
- ✅ GraphSON export - TinkerPop-compatible graph JSON
- ✅ CSV export of query results via export_to_csv()
- ✅ Type filtering via include_types / exclude_types
- ✅ Compression when exporting JSONL/GraphML/GraphSON (Java exporter)
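When a CSV needs a custom column order or only a subset of properties, query results can also be written by hand with the standard csv module. This is a sketch under the assumption that each row provides to_dict(); write_rows_csv is a hypothetical helper and does not reproduce the behavior of export_to_csv().

```python
import csv
from typing import Any, Iterable, Sequence, TextIO


def write_rows_csv(result_set: Iterable[Any], columns: Sequence[str], out: TextIO) -> int:
    """Write the chosen columns of each row to `out` as CSV; return the row count.

    Properties missing from a row become empty cells, and properties not
    named in `columns` are ignored.
    """
    writer = csv.DictWriter(out, fieldnames=list(columns), extrasaction="ignore", restval="")
    writer.writeheader()
    count = 0
    for row in result_set:
        writer.writerow(row.to_dict())
        count += 1
    return count
```

When writing to a file, open it with newline="" as the csv module documentation recommends, for example open("users.csv", "w", newline="").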
8. Vector Search
- ✅ Vector index creation - create_vector_index() (JVector)
- ✅ NumPy array support - to_java_float_array(), to_python_array()
- ✅ Similarity search - VectorIndex.find_nearest() and PQ approximate search
- ✅ Distance functions - cosine, euclidean, inner_product
- ✅ Index tuning parameters (connections, beam width, quantization)
- ✅ Automatic indexing of existing records
- ✅ List vector indexes - schema.list_vector_indexes()
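To make the three distance functions concrete, and to have an exact baseline for sanity-checking approximate results on a small sample, here is a plain-Python sketch. It is not part of the bindings and says nothing about how JVector computes or orders results internally; all function names are our own.

```python
import math
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of the two vectors after scaling each to unit length."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm


def brute_force_nearest(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (position, euclidean distance) for the k vectors closest to `query`."""
    scored = [(i, euclidean(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

Comparing the positions returned by brute_force_nearest with the records an index returns for the same query gives a quick recall estimate on test data.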
9. Advanced / Low-Level APIs Not Exposed
- ❌ WAL and storage internals
- ❌ Binary protocol and custom network stacks
- ❌ HA/replication, distributed clustering
- ❌ Server plugins and module management
- ❌ Custom query engines and DSLs
Design Philosophy: Query-First Approach
The Python bindings follow a "query-first, API-second" philosophy. Instead of exposing every Java object, they enable operations through:
- SQL DDL for schema management
- Cypher/SQL for graph operations
- High-level wrappers for common tasks (transactions, vector search)
This keeps the binding surface smaller and easier to maintain than a one-to-one mirror of the Java API:
# Python way (query-first):
db.command("sql", "CREATE INDEX ON User (email) UNIQUE")
db.query("cypher", "MATCH (a)-[:FOLLOWS]->(b) RETURN b")
# vs. a hypothetical direct mirror of the Java API (more verbose):
schema = db.getSchema()
user_type = schema.getType("User")
index_builder = schema.buildTypeIndex("User", ["email"])
index = index_builder.withUnique(True).create()
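The query-first style also makes schema setup easy to express as data. The sketch below builds a list of SQL DDL statements and sends them through db.command(). The helper names are hypothetical, and the IF NOT EXISTS clauses reflect ArcadeDB's SQL syntax as we understand it, so verify them against the engine version in use. Type and property names are interpolated directly into the statements and must therefore come from trusted code, never from user input.

```python
from typing import Any, Iterable, List, Sequence, Tuple


def vertex_type_ddl(
    type_name: str,
    properties: Sequence[Tuple[str, str]],
    unique: Sequence[str] = (),
) -> List[str]:
    """Build DDL for a vertex type, its properties, and unique indexes."""
    statements = [f"CREATE VERTEX TYPE {type_name} IF NOT EXISTS"]
    for prop_name, prop_type in properties:
        statements.append(
            f"CREATE PROPERTY {type_name}.{prop_name} IF NOT EXISTS {prop_type}"
        )
    for prop_name in unique:
        statements.append(
            f"CREATE INDEX IF NOT EXISTS ON {type_name} ({prop_name}) UNIQUE"
        )
    return statements


def apply_ddl(db: Any, statements: Iterable[str]) -> None:
    """Run each statement through the SQL engine, in order."""
    for statement in statements:
        db.command("sql", statement)
```

Keeping the statement builder separate from apply_ddl means the generated SQL can be printed and reviewed before anything touches the database.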
Use Case Suitability
| Use Case | Suitable? | Notes |
|---|---|---|
| Embedded database in Python app | ✅ Excellent | Core use case |
| Graph analytics with Cypher | ✅ Excellent | SQL and OpenCypher supported |
| Document store | ✅ Excellent | SQL and schema APIs |
| Vector similarity search | ✅ Excellent | JVector + NumPy integration |
| Development with Studio UI | ✅ Excellent | Server mode included |
| Data migration (CSV/XML/JSONL import) | ✅ Good | CSV/XML importers + JSONL via SQL |
| Async bulk ingestion | ✅ Good | AsyncExecutor and BatchContext |
| Multi-master replication | ❌ Not supported | Java server only |
| Custom query language | ❌ Not supported | Use built-in languages |
Conclusion
These bindings cover the primary workflows most Python developers need:
- Embedded multi-model database
- Graph, document, vector, and time-series data
- SQL and OpenCypher queries
- Server mode for Studio UI and HTTP access
They intentionally do not expose low-level JVM internals, clustering, and plugin management. For those scenarios, use the Java APIs directly.
🚧 Future Work
- SQL-level vector syntax in ArcadeDB (when available upstream)
- Expanded performance benchmarks and scale testing
- Continued alignment with upstream Java releases
📝 License
Apache License 2.0
🙏 Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests: python3 -m pytest tests/ -v
- Submit a pull request