GraphBatch API

The GraphBatch helper exposes ArcadeDB's high-throughput graph-ingest path from Python.

Overview

Use GraphBatch when you need to load many vertices and edges efficiently.

It is currently the recommended bulk graph-ingest path from Python in this repository.

You typically create it through db.graph_batch(...) rather than constructing the class directly.

Entry Point

`db.graph_batch(...) -> GraphBatch`

Create a configured batch helper tied to the current database.

Common options:

batch_size: number of edges buffered before a flush
expected_edge_count: sizing hint for large runs
light_edges: create property-less light edges when appropriate
commit_every: commit cadence during batch work
use_wal: enable the write-ahead log (WAL) for stronger durability
wal_flush: WAL flush policy, such as no, yes_nometadata, or yes_full
parallel_flush: flush deferred work in parallel

Example:

with db.graph_batch(batch_size=1000, expected_edge_count=50000) as batch:
    alice = batch.create_vertex("Person", name="Alice")
    bob = batch.create_vertex("Person", name="Bob")
    batch.new_edge(alice, "Knows", bob, since=2024)
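The options listed above can be collected into a keyword dictionary before the batch is opened. The sketch below is a hypothetical helper and not part of the library: `batch_options` and `DOCUMENTED_WAL_FLUSH_MODES` are names invented here, the mode tuple holds only the three values named on this page (the library may accept others), and pairing `use_wal=True` with `yes_full` for durable loads is an assumption about which policy is strictest.

```python
# Hypothetical helper: builds keyword arguments for db.graph_batch(...).
# Only the option names come from this page; the chosen values are assumptions.
DOCUMENTED_WAL_FLUSH_MODES = ("no", "yes_nometadata", "yes_full")


def batch_options(expected_edge_count, durable=False, wal_flush=None):
    """Return a kwargs dict intended for db.graph_batch(**kwargs)."""
    if wal_flush is None:
        # Assumption: "yes_full" is the strictest policy and "no" the loosest.
        wal_flush = "yes_full" if durable else "no"
    if wal_flush not in DOCUMENTED_WAL_FLUSH_MODES:
        # Fail early, in the same way the library rejects invalid modes.
        raise ValueError(f"unknown wal_flush mode: {wal_flush!r}")
    return {
        "expected_edge_count": expected_edge_count,
        "use_wal": durable,
        "wal_flush": wal_flush,
    }
```

With a real database handle this would be used as `db.graph_batch(**batch_options(50000, durable=True))`. Keeping the choice in one function makes it easier to switch between a fast initial load and a durable incremental load.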

Common Operations

`create_vertex(type_name, **properties)`

Create and persist a single vertex.

`create_vertices(type_name, count_or_properties)`

Create many vertices efficiently and return their RIDs.
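Edges are usually described in terms of application keys rather than RIDs, so it can help to keep a key-to-RID lookup after a bulk vertex insert. The following is a minimal sketch, assuming `create_vertices` returns RIDs in the same order as the property dicts it was given (this page does not state that guarantee). `index_rids` is a hypothetical helper, not part of GraphBatch.

```python
def index_rids(properties, rids, key="id"):
    """Map each record's `key` value to the RID at the same position.

    Assumes `rids` is ordered like `properties`; verify this against the
    library before relying on it.
    """
    rids = list(rids)  # accept any iterable of RIDs
    if len(properties) != len(rids):
        raise ValueError("properties and rids must have the same length")
    lookup = {}
    for props, rid in zip(properties, rids):
        value = props[key]
        if value in lookup:
            raise ValueError(f"duplicate key value: {value!r}")
        lookup[value] = rid
    return lookup
```

A typical call would pass the same list that was handed to `create_vertices("Person", people)` together with the returned RIDs, then look up `lookup[some_id]` when buffering edges.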

`new_edge(source, edge_type, destination, **properties)`

Buffer an edge for creation during flush/close.

`new_edges(source_rids, edge_type, destination_rids, properties=None)`

Buffer many edges with one JPype crossing per call. This is the bulk counterpart of new_edge, which pays one boundary crossing per edge. RIDs may be strings ("#1:0") or objects with a string representation. properties is an optional sequence of per-edge property dicts with the same length as the RID sequences; JSON-representable values take the bulk path, and anything else falls back to per-edge buffering. Returns the batch for chaining.

with db.graph_batch(use_wal=False) as batch:
    rids = batch.create_vertices("Person", [{"id": i} for i in range(100)])
    batch.new_edges(rids[:-1], "Knows", rids[1:])
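Edge data often arrives as one record per edge, while new_edges expects parallel sequences. The sketch below shows one way to split such records; `to_parallel_lists` is a hypothetical helper, and substituting an empty dict for an edge without properties is an assumption that this page does not confirm.

```python
def to_parallel_lists(rows):
    """Split (source_rid, destination_rid, properties_or_None) rows.

    Returns (sources, destinations, properties), where properties is None
    when no row carries any property, so the caller can omit the argument.
    """
    sources, destinations, properties = [], [], []
    for source, destination, props in rows:
        sources.append(source)
        destinations.append(destination)
        # Assumption: an empty dict is acceptable for a property-less edge.
        properties.append(dict(props) if props else {})
    if not any(properties):
        properties = None
    return sources, destinations, properties
```

The result maps directly onto the documented signature: `batch.new_edges(sources, "Knows", destinations, properties=properties)`.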

`flush()`

Force buffered edge work to disk early.
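An early flush is mostly useful at a logical checkpoint, or when the buffer has grown past a size the caller is comfortable holding in memory. The following is a minimal sketch of the second case. `flush_if_buffered` is a hypothetical helper, and it assumes get_buffered_edge_count() returns the number of edges not yet flushed as an integer.

```python
def flush_if_buffered(batch, threshold):
    """Call batch.flush() once at least `threshold` edges are buffered.

    Returns True if a flush was triggered, False otherwise.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Assumption: the counter reports edges that are buffered but not flushed.
    if batch.get_buffered_edge_count() >= threshold:
        batch.flush()
        return True
    return False
```

The helper only relies on the two method names documented on this page, so any object exposing them can stand in for a real batch in tests.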

`close()`

Flush remaining work and finalize the batch.

Counters

The helper also exposes counters such as:

get_total_edges_created()
get_buffered_edge_count()
get_deferred_incoming_edge_count()
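For progress logging during a long load, the three counters can be read together. This is a sketch under the assumption that each getter takes no arguments and returns a number; `snapshot_counters` and the dictionary keys are names chosen here for illustration.

```python
def snapshot_counters(batch):
    """Read the documented GraphBatch counters into a plain dict."""
    return {
        "total_edges_created": batch.get_total_edges_created(),
        "buffered_edges": batch.get_buffered_edge_count(),
        "deferred_incoming_edges": batch.get_deferred_incoming_edge_count(),
    }
```

Calling it every few thousand edges and logging the result gives a rough view of how much work is still buffered or deferred.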

Notes

Prefer GraphBatch over importer-based graph loading for Python-managed bulk ingest.
wal_flush validation is intentionally strict and raises ValueError for invalid modes.
See the graph-ingest examples and tests for realistic usage patterns.
