Data Import Examples¶

This page covers the current import examples for the Python bindings.

Before running any example, download the datasets using the Dataset Downloader.

CSV Import Examples¶

Import Tabular Data as Documents¶

Example 04 - CSV Import: Documents

Learn how to:

create a target document type with SQL
import CSV data through IMPORT DATABASE
validate NULL handling and inferred types
benchmark query performance before and after indexes
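
Before relying on the importer's inferred types, it can help to profile the source file yourself. The following is a minimal stdlib-only sketch and is not part of the bindings: `profile_csv` and its widening order (integer, float, string) are assumptions made for illustration and may not match the rules the importer applies.

```python
import csv
import io

# Assumed widening order: a column takes the widest kind seen in any cell.
WIDENING = ["empty", "integer", "float", "string"]


def _kind(value: str) -> str:
    """Return the narrowest kind that can represent a non-empty cell."""
    for caster, label in ((int, "integer"), (float, "float")):
        try:
            caster(value)
            return label
        except ValueError:
            continue
    return "string"


def profile_csv(text: str) -> dict:
    """Count empty cells per column and guess a simple kind for the rest."""
    reader = csv.DictReader(io.StringIO(text))
    names = reader.fieldnames or []
    profile = {name: {"empty": 0, "kind": "empty"} for name in names}
    for row in reader:
        for name in names:
            value = row.get(name)
            # Missing trailing fields arrive as None; treat them like blanks.
            if value is None or value.strip() == "":
                profile[name]["empty"] += 1
                continue
            seen = _kind(value.strip())
            if WIDENING.index(seen) > WIDENING.index(profile[name]["kind"]):
                profile[name]["kind"] = seen
    return profile
```

Comparing the per-column empty counts with the NULL counts returned by a query after the import is one way to confirm that blanks were handled as expected.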

Import Graph Data¶

Example 05 - CSV Import: Graph Database

Learn how to:

import vertices from CSV
import edges from CSV using matching IDs
create graph schema up front
bulk-load graph data with SQL import commands

SQL Import Workflow¶

The current bindings expose SQL IMPORT DATABASE plus a narrow db.import_documents(...) helper for document-file loads.

For very large Python-side bulk ingest workloads in this repository, do not treat importer-based paths as the default choice.

For bulk table/document ingest, prefer async SQL insert with one async worker.
Do not rely on multi-threaded async SQL insert for this path in the current Python examples.
For bulk graph ingest, prefer GraphBatch.

More broadly, this repository does not currently recommend IMPORT DATABASE as the primary Python-side ingest path. It remains available for the supported file-driven workflows and may become a stronger recommendation if its behavior improves.

Basic CSV Import¶

from pathlib import Path

import arcadedb_embedded as arcadedb


def file_url(path: str) -> str:
    return Path(path).resolve().as_uri()


with arcadedb.create_database("./import_demo") as db:
    db.command("sql", "CREATE DOCUMENT TYPE MyType")
    db.command(
        "sql",
        f"IMPORT DATABASE {file_url('./data.csv')} WITH documentType = 'MyType', commitEvery = 5000",
    )
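
When several imports share the same shape, the statement string can be assembled by a small helper instead of being written by hand each time. This is a sketch only: `build_import_sql` is a hypothetical helper, not part of the bindings, and it simply refuses values containing a single quote rather than assuming any escaping syntax.

```python
from pathlib import Path


def file_url(path: str) -> str:
    # Same helper as in the example above: absolute file:// URL for a path.
    return Path(path).resolve().as_uri()


def build_import_sql(path: str, **settings) -> str:
    """Render 'IMPORT DATABASE <url> WITH key = value, ...' (hypothetical helper)."""
    parts = []
    for key, value in settings.items():
        # bool must be checked before int because bool is a subclass of int.
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            text = str(value)
            if "'" in text:
                raise ValueError(f"unsupported quote in setting {key!r}")
            rendered = f"'{text}'"
        parts.append(f"{key} = {rendered}")
    sql = f"IMPORT DATABASE {file_url(path)}"
    return f"{sql} WITH {', '.join(parts)}" if parts else sql
```

The result can then be passed to `db.command("sql", ...)` exactly as in the example above, for instance `build_import_sql("./data.csv", documentType="MyType", commitEvery=5000)`.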

Import with Predefined Schema¶

# Reuses the imports and the file_url() helper from Basic CSV Import.
with arcadedb.create_database("./import_demo") as db:
    # Declare the type and its properties before importing.
    db.command("sql", "CREATE DOCUMENT TYPE Product")
    db.command("sql", "CREATE PROPERTY Product.id INTEGER")
    db.command("sql", "CREATE PROPERTY Product.name STRING")
    db.command("sql", "CREATE PROPERTY Product.price DOUBLE")

    db.command(
        "sql",
        f"IMPORT DATABASE {file_url('./products.csv')} WITH documentType = 'Product', commitEvery = 5000",
    )
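
For types with many properties, the schema statements can be generated from a mapping so the declared schema lives in one place. This is a minimal sketch: `schema_statements` is a hypothetical helper, and it only emits the same two statement forms used in the example above.

```python
def schema_statements(type_name: str, properties: dict[str, str]) -> list[str]:
    """Build CREATE DOCUMENT TYPE / CREATE PROPERTY statements (hypothetical helper)."""
    # Reject names that are not plain identifiers instead of guessing at quoting rules.
    for name in (type_name, *properties):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    statements = [f"CREATE DOCUMENT TYPE {type_name}"]
    for prop, sql_type in properties.items():
        statements.append(f"CREATE PROPERTY {type_name}.{prop} {sql_type}")
    return statements


# Usage, assuming an open database handle `db` as in the example above:
# for statement in schema_statements("Product", {"id": "INTEGER", "name": "STRING"}):
#     db.command("sql", statement)
```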

Import Graph Vertices and Edges¶

# Reuses the imports and the file_url() helper from Basic CSV Import.
with arcadedb.create_database("./graph_import_demo") as db:
    # Create the graph schema up front.
    db.command("sql", "CREATE VERTEX TYPE User")
    db.command("sql", "CREATE EDGE TYPE Follows")

    # Load vertices first so edge endpoints can be matched by userId.
    db.command(
        "sql",
        (
            "IMPORT DATABASE WITH "
            f"vertices = '{file_url('./users.csv')}', "
            "vertexType = 'User', "
            "typeIdProperty = 'userId', "
            "typeIdType = 'Long', "
            "typeIdUnique = true"
        ),
    )

    # Load edges, taking the endpoints from the follower_id and following_id columns.
    db.command(
        "sql",
        (
            "IMPORT DATABASE WITH "
            f"edges = '{file_url('./follows.csv')}', "
            "edgeType = 'Follows', "
            "typeIdProperty = 'userId', "
            "typeIdType = 'Long', "
            "edgeFromField = 'follower_id', "
            "edgeToField = 'following_id'"
        ),
    )
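
Because edges are matched to vertices by ID, it can be worth checking the two files against each other before importing. The sketch below is stdlib-only and independent of the bindings: `dangling_edges` is a hypothetical helper, and the default column names are taken from the example above.

```python
import csv


def dangling_edges(
    user_lines,
    follow_lines,
    id_field="userId",
    from_field="follower_id",
    to_field="following_id",
):
    """Return (line, field, value) for edge endpoints with no matching vertex ID."""
    # Both arguments are iterables of CSV lines, e.g. open file handles.
    known = {row[id_field] for row in csv.DictReader(user_lines)}
    reader = csv.DictReader(follow_lines)
    missing = []
    for row in reader:
        for field in (from_field, to_field):
            if row[field] not in known:
                # reader.line_num is the source line of the row just read.
                missing.append((reader.line_num, field, row[field]))
    return missing
```

With real files this could be called as `dangling_edges(open("users.csv", newline=""), open("follows.csv", newline=""))`; an empty list means every endpoint refers to a known user.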

Performance Tips¶

Pre-create critical schema and unique indexes.
Use a larger commitEvery value when you intentionally choose the SQL import path.
Drop expensive secondary indexes before a one-shot import and recreate them afterward.
Validate source files before starting long-running jobs.
For bulk table/document ingest, prefer single-worker async SQL instead of importer-based paths.
For bulk graph ingest, prefer GraphBatch instead of importer-based graph loading.
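
The advice to validate source files before a long-running job can be sketched as a quick structural check. This is an illustrative stdlib-only example, not part of the bindings: `csv_problems` and its messages are hypothetical, and it only checks the header and per-row field counts.

```python
import csv


def csv_problems(lines, required=()):
    """Return a list of structural problems found in an iterable of CSV lines."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if not header:
        return ["missing header row"]
    problems = []
    duplicates = sorted({name for name in header if header.count(name) > 1})
    if duplicates:
        problems.append(f"duplicate columns: {', '.join(duplicates)}")
    missing = [name for name in required if name not in header]
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")
    for row in reader:
        if not row:
            continue  # csv.reader yields [] for blank lines; ignore them
        if len(row) != len(header):
            problems.append(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
    return problems
```

Running this over each input file, for example `csv_problems(open("products.csv", newline=""), required=("id", "name", "price"))`, and stopping when the list is non-empty avoids discovering a malformed row partway through a large import.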

Additional Resources¶

Import Workflow Reference - Supported SQL import surface
Import Guide - Import strategies and patterns
Performance Guide - JVM and bulk-load tuning

Source Code¶

View the complete example source code: