Data Import Examples¶
This page covers the current import examples for the Python bindings.
Before running any example, download the datasets using the Dataset Downloader.
CSV Import Examples¶
Import Tabular Data as Documents¶
Example 04 - CSV Import: Documents
Learn how to:
- create a target document type with SQL
- import CSV data through IMPORT DATABASE
- validate NULL handling and inferred types
- benchmark query performance before and after indexes
Import Graph Data¶
Example 05 - CSV Import: Graph Database
Learn how to:
- import vertices from CSV
- import edges from CSV using matching IDs
- create graph schema up front
- bulk-load graph data with SQL import commands
SQL Import Workflow¶
The current bindings expose SQL IMPORT DATABASE plus a narrow
db.import_documents(...) helper for document-file loads.
For very large Python-side bulk ingest workloads in this repository, do not treat importer-based paths as the default choice.
- For bulk table/document ingest, prefer async SQL insert with one async worker.
- Do not rely on multi-threaded async SQL insert for this path in the current Python examples.
- For bulk graph ingest, prefer GraphBatch.
More broadly, this repository does not currently encourage IMPORT DATABASE as the main
Python-side ingest recommendation. It remains available for the supported file-driven
workflows and may become a stronger recommendation later if behavior improves.
Basic CSV Import¶
from pathlib import Path

import arcadedb_embedded as arcadedb


def file_url(path: str) -> str:
    # IMPORT DATABASE takes a URL, so turn the local path into an absolute file:// URL.
    return Path(path).resolve().as_uri()


with arcadedb.create_database("./import_demo") as db:
    db.command("sql", "CREATE DOCUMENT TYPE MyType")
    db.command(
        "sql",
        f"IMPORT DATABASE {file_url('./data.csv')} WITH documentType = 'MyType', commitEvery = 5000",
    )
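When several imports share the same shape, the statement string can be assembled by a small helper. The sketch below is only an illustration: `build_import_sql` and `format_setting` are hypothetical names, not part of the bindings, and the helper only builds the same `IMPORT DATABASE <url> WITH key = value` text used above. It refuses values that contain a single quote rather than guessing at an escaping rule.

```python
from pathlib import Path


def file_url(path: str) -> str:
    """Turn a local path into an absolute file:// URL."""
    return Path(path).resolve().as_uri()


def format_setting(value) -> str:
    """Render a Python value as a literal for the WITH clause."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    if "'" in text:
        # Escaping rules are not covered on this page, so reject instead of guessing.
        raise ValueError(f"setting value must not contain a single quote: {text!r}")
    return f"'{text}'"


def build_import_sql(source: str, **settings) -> str:
    """Hypothetical helper: build 'IMPORT DATABASE <url> WITH k = v, ...'."""
    sql = f"IMPORT DATABASE {file_url(source)}"
    if settings:
        pairs = ", ".join(f"{key} = {format_setting(val)}" for key, val in settings.items())
        sql += f" WITH {pairs}"
    return sql


statement = build_import_sql("./data.csv", documentType="MyType", commitEvery=5000)
print(statement)
```

The resulting string can be passed to `db.command("sql", statement)` exactly as in the example above.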
Import with Predefined Schema¶
# Reuses arcadedb and file_url() from the Basic CSV Import example.
with arcadedb.create_database("./import_demo") as db:
    # Declare the type and its properties first so the import does not have to infer them.
    db.command("sql", "CREATE DOCUMENT TYPE Product")
    db.command("sql", "CREATE PROPERTY Product.id INTEGER")
    db.command("sql", "CREATE PROPERTY Product.name STRING")
    db.command("sql", "CREATE PROPERTY Product.price DOUBLE")
    db.command(
        "sql",
        f"IMPORT DATABASE {file_url('./products.csv')} WITH documentType = 'Product', commitEvery = 5000",
    )
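For types with many properties, the `CREATE PROPERTY` statements can be generated from a mapping instead of being written out one by one. This is a minimal sketch; `property_statements` and `PRODUCT_SCHEMA` are hypothetical names introduced for illustration, and the output is the same SQL text shown above.

```python
def property_statements(type_name: str, properties: dict) -> list:
    """Build one CREATE PROPERTY statement per (name, SQL type) pair."""
    return [
        f"CREATE PROPERTY {type_name}.{name} {sql_type}"
        for name, sql_type in properties.items()
    ]


# Property names and types mirror the Product example above.
PRODUCT_SCHEMA = {"id": "INTEGER", "name": "STRING", "price": "DOUBLE"}

statements = ["CREATE DOCUMENT TYPE Product", *property_statements("Product", PRODUCT_SCHEMA)]
for statement in statements:
    print(statement)
```

Each statement can then be executed with `db.command("sql", statement)` before the import runs.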
Import Graph Vertices and Edges¶
# Reuses arcadedb and file_url() from the Basic CSV Import example.
with arcadedb.create_database("./graph_import_demo") as db:
    db.command("sql", "CREATE VERTEX TYPE User")
    db.command("sql", "CREATE EDGE TYPE Follows")

    # Load vertices first so the edge import can resolve endpoints by userId.
    db.command(
        "sql",
        (
            "IMPORT DATABASE WITH "
            f"vertices = '{file_url('./users.csv')}', "
            "vertexType = 'User', "
            "typeIdProperty = 'userId', "
            "typeIdType = 'Long', "
            "typeIdUnique = true"
        ),
    )

    # Load edges, matching follower_id and following_id against userId.
    db.command(
        "sql",
        (
            "IMPORT DATABASE WITH "
            f"edges = '{file_url('./follows.csv')}', "
            "edgeType = 'Follows', "
            "typeIdProperty = 'userId', "
            "typeIdType = 'Long', "
            "edgeFromField = 'follower_id', "
            "edgeToField = 'following_id'"
        ),
    )
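Because edges are matched to vertices by ID, it can help to confirm before importing that every edge endpoint actually appears in the vertex file. The sketch below uses only the standard library and assumes that `users.csv` has a header row with a `userId` column and that `follows.csv` has `follower_id` and `following_id` columns, as the import settings above suggest. `read_ids` and `find_dangling_edges` are hypothetical helper names.

```python
import csv


def read_ids(path: str, column: str) -> set:
    """Collect the distinct values of one column from a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[column] for row in csv.DictReader(handle)}


def find_dangling_edges(users_csv: str, follows_csv: str) -> list:
    """Return (line, field, value) for every edge endpoint missing from the vertex file."""
    user_ids = read_ids(users_csv, "userId")
    dangling = []
    with open(follows_csv, newline="", encoding="utf-8") as handle:
        # Data starts on line 2 (line 1 is the header); assumes no multi-line or blank rows.
        for line_no, row in enumerate(csv.DictReader(handle), start=2):
            for field in ("follower_id", "following_id"):
                if row[field] not in user_ids:
                    dangling.append((line_no, field, row[field]))
    return dangling
```

An empty result from `find_dangling_edges("./users.csv", "./follows.csv")` means every edge refers to a known vertex ID. IDs are compared as strings, so values such as `1` and `01` are treated as different.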
Performance Tips¶
- Pre-create critical schema and unique indexes.
- Use a larger commitEvery value when you intentionally choose the SQL import path.
- Drop expensive secondary indexes before a one-shot import and recreate them afterward.
- Validate source files before starting long-running jobs.
- For bulk table/document ingest, prefer single-worker async SQL instead of importer-based paths.
- For bulk graph ingest, prefer GraphBatch instead of importer-based graph loading.
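As one way to validate a source file before a long-running job, the following sketch checks that a CSV has a header, contains the columns you expect, and has the same number of fields on every row. It relies only on the standard library; `validate_csv` is a hypothetical helper, not part of the bindings, and the checks are a starting point rather than a full validation.

```python
import csv


def validate_csv(path: str, required_columns=()) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return ["file is empty"]
        missing = [name for name in required_columns if name not in header]
        if missing:
            problems.append(f"missing columns: {missing}")
        for row in reader:
            # Blank lines come back as empty rows and are reported as having 0 fields.
            if len(row) != len(header):
                problems.append(
                    f"line {reader.line_num}: expected {len(header)} fields, found {len(row)}"
                )
    return problems
```

For example, `validate_csv("./products.csv", required_columns=("id", "name", "price"))` could be run before the Product import and the job skipped if the returned list is not empty.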
Additional Resources¶
- Import Workflow Reference - Supported SQL import surface
- Import Guide - Import strategies and patterns
- Performance Guide - JVM and bulk-load tuning
Source Code¶
View the complete example source code: