How-to Guides

Step-by-step instructions for specific graph operation tasks. Each guide focuses on a single goal.

Data Loading & Combination

Join Files

Combine multiple KGX files into a unified database. Covers basic joins, mixed formats, glob patterns, and schema reporting.

Incremental Updates

Add new data to an existing database, with options for schema updates and deduplication.

Data Transformation

Split Graphs

Divide a graph into subsets based on field values such as source, category, or other attributes.

Normalize IDs

Apply SSSOM mappings to harmonize identifiers across different naming conventions and ontologies.

Clean Graphs

Remove duplicates, dangling edges, and optionally singleton nodes from a graph.
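As a rough illustration of the terms above, the following sketch shows what each cleaning step means on a tiny in-memory graph. It is plain Python, not koza's implementation: the `clean_graph` function and its `drop_singletons` parameter are hypothetical names, and deduplicating nodes by `id` and edges by `(subject, predicate, object)` is an assumption made for the example.

```python
# Illustrative only: plain Python, not koza's implementation.
def clean_graph(nodes, edges, drop_singletons=False):
    """Return (nodes, edges) with duplicates and dangling edges removed."""
    # Deduplicate nodes by id, keeping the first record seen (assumed key).
    unique_nodes = {}
    for node in nodes:
        unique_nodes.setdefault(node["id"], node)

    # Deduplicate edges by (subject, predicate, object) and skip dangling
    # edges, i.e. edges whose subject or object is not a known node.
    unique_edges = {}
    for edge in edges:
        key = (edge["subject"], edge["predicate"], edge["object"])
        if edge["subject"] in unique_nodes and edge["object"] in unique_nodes:
            unique_edges.setdefault(key, edge)

    if drop_singletons:
        # A singleton node is one that no remaining edge refers to.
        connected = set()
        for edge in unique_edges.values():
            connected.add(edge["subject"])
            connected.add(edge["object"])
        unique_nodes = {i: n for i, n in unique_nodes.items() if i in connected}

    return list(unique_nodes.values()), list(unique_edges.values())


nodes = [{"id": "A"}, {"id": "B"}, {"id": "B"}, {"id": "C"}]
edges = [
    {"subject": "A", "predicate": "biolink:related_to", "object": "B"},
    {"subject": "A", "predicate": "biolink:related_to", "object": "B"},  # duplicate
    {"subject": "A", "predicate": "biolink:related_to", "object": "Z"},  # dangling: Z is not a node
]
clean_nodes, clean_edges = clean_graph(nodes, edges, drop_singletons=True)
```

Here the second `B` record and the repeated edge are duplicates, the edge to `Z` is dangling, and `C` is a singleton that is only removed because `drop_singletons=True`.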

Reporting & Analysis

Generate Reports

Create QC reports, graph statistics, schema compliance reports, and tabular summaries.

Export Formats

Export your graph to TSV, JSONL, or Parquet format, with optional archiving and compression.

Guide Format

Each how-to guide follows a consistent structure:

  1. Goal: What you'll accomplish
  2. Prerequisites: What you need before starting
  3. Steps: Numbered instructions with examples
  4. Verification: How to confirm success
  5. Variations: Alternative approaches and options

Quick Reference

Task               Command                         Guide
Combine files      koza join                       Join Files
Add to existing    koza append                     Incremental Updates
Extract subset     koza split                      Split Graphs
Harmonize IDs      koza normalize                  Normalize IDs
Remove issues      koza prune / koza deduplicate   Clean Graphs
Quality reports    koza report                     Generate Reports
Format conversion  koza split --format             Export Formats