🛠️ Advanced Configuration Guide

June 2025 Shadow Shift Engineering Team

This guide covers Shadow Shift's configuration options for creating custom synthetic datasets, with precise control over data structure, relationships, and value distributions. Use these techniques to generate test data tailored to your specific needs.

📐 Schema Definition

Shadow Shift uses a JSON-based schema definition to configure data generation. The schema describes your data structure and generation rules:

Basic Schema
{
  "schema": {
    "users": {
      "fields": {
        "id": { "type": "uuid" },
        "name": { "type": "full_name" },
        "email": { "type": "email" },
        "signup_date": {
          "type": "date",
          "range": ["2020-01-01", "2025-12-31"]
        }
      },
      "rows": 1000
    }
  }
}

Advanced Schema
{
  "schema": {
    "users": {
      "fields": {
        "id": { "type": "uuid", "primary_key": true },
        "account_type": { 
          "type": "enum",
          "values": ["free", "pro", "enterprise"],
          "distribution": [0.7, 0.2, 0.1]
        },
        "login_count": {
          "type": "integer",
          "min": 0,
          "max": 500,
          "distribution": "exponential"
        }
      },
      "rows": 5000,
      "relationships": [
        {
          "target": "orders",
          "type": "one_to_many",
          "field": "user_id"
        }
      ]
    }
  },
  "options": {
    "concurrency": 4,
    "batch_size": 1000
  }
}
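
Before running a large generation job, it can help to check a config for mistakes that are easy to make by hand. The Python sketch below is not part of Shadow Shift: `validate_schema` is a hypothetical helper, and its rules (enum weights match the values and sum to 1, `min` does not exceed `max`, every relationship target is a table in the same config) are assumptions about what a valid schema looks like, not documented behavior. Under the last assumption, the Advanced Schema as printed would be flagged, since it references orders without defining it.

```python
import json


def validate_schema(config):
    """Return a list of human-readable problems found in a schema config.

    The rules here are assumptions for illustration, not Shadow Shift's own checks.
    """
    problems = []
    tables = config.get("schema", {})
    for table_name, table in tables.items():
        for field_name, field in table.get("fields", {}).items():
            label = f"{table_name}.{field_name}"
            weights = field.get("distribution")
            # Enum fields may carry a list of weights, one per value.
            if field.get("type") == "enum" and isinstance(weights, list):
                values = field.get("values", [])
                if len(weights) != len(values):
                    problems.append(
                        f"{label}: {len(values)} values but {len(weights)} weights"
                    )
                elif abs(sum(weights) - 1.0) > 1e-9:
                    problems.append(f"{label}: weights sum to {sum(weights)}, not 1")
            # Numeric bounds must be ordered.
            if "min" in field and "max" in field and field["min"] > field["max"]:
                problems.append(f"{label}: min is greater than max")
        # Assumed rule: a relationship may only point at a table defined here.
        for rel in table.get("relationships", []):
            if rel.get("target") not in tables:
                problems.append(
                    f"{table_name}: relationship target '{rel.get('target')}' is not defined"
                )
    return problems


example = json.loads("""
{
  "schema": {
    "users": {
      "fields": {
        "id": { "type": "uuid", "primary_key": true },
        "account_type": {
          "type": "enum",
          "values": ["free", "pro", "enterprise"],
          "distribution": [0.7, 0.2, 0.1]
        }
      },
      "rows": 5000,
      "relationships": [
        { "target": "orders", "type": "one_to_many", "field": "user_id" }
      ]
    }
  }
}
""")

for problem in validate_schema(example):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets you fix a config in a single pass.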

🔡 Field Types & Options

Shadow Shift supports numerous field types with customizable parameters. The types that appear in this guide are uuid, full_name, email, date, enum, and integer.

Example: Custom Distribution
"account_status": {
  "type": "enum",
  "values": ["active", "inactive", "suspended"],
  "distribution": [0.8, 0.15, 0.05] // 80% active, 15% inactive, 5% suspended
}
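
To see what a weighted enum like the one above produces, the following Python sketch draws values with the standard library's `random.choices`. It shows the general technique only, since the guide does not describe Shadow Shift's internal sampler. `sample_enum` and its `seed` parameter are hypothetical names introduced for this example.

```python
import random
from collections import Counter


def sample_enum(values, weights, n, seed=None):
    """Draw n values, picking values[i] with probability proportional to weights[i]."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    rng = random.Random(seed)  # a seeded generator makes runs reproducible
    return rng.choices(values, weights=weights, k=n)


field = {
    "type": "enum",
    "values": ["active", "inactive", "suspended"],
    "distribution": [0.8, 0.15, 0.05],
}

sample = sample_enum(field["values"], field["distribution"], 10_000, seed=42)
counts = Counter(sample)
for value in field["values"]:
    # Observed shares should land close to the configured weights.
    print(value, counts[value] / len(sample))
```

With a sample this large, the observed shares should sit within a percentage point or so of the configured weights; small row counts will drift further.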

🔗 Data Relationships

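Define relational data models by declaring relationships between tables. The one_to_many example below links a parent table to orders through the user_id field and bounds the number of orders per parent row with a cardinality block.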

Relationship Types
"relationships": [
  {
    "target": "orders",
    "type": "one_to_many",
    "field": "user_id",
    "cardinality": {
      "min": 0,
      "max": 20,
      "distribution": "normal"
    }
  }
]
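
The guide does not say how a "normal" cardinality maps onto the min and max bounds. The Python sketch below shows one plausible reading, stated as an assumption: the mean sits at the midpoint of the range, the standard deviation is one sixth of the range, and draws are rounded and clamped to the bounds. `sample_child_count`, `generate_orders`, and the `order_id` field are hypothetical names, not Shadow Shift APIs.

```python
import random


def sample_child_count(rng, minimum, maximum):
    """Draw a child-row count from a normal curve, clamped to [minimum, maximum].

    Assumption: mean at the midpoint, standard deviation of one sixth of the range.
    """
    mean = (minimum + maximum) / 2
    std_dev = (maximum - minimum) / 6
    count = round(rng.gauss(mean, std_dev))
    return max(minimum, min(maximum, count))


def generate_orders(user_ids, minimum=0, maximum=20, seed=None):
    """Create order rows whose user_id values point back at the given users."""
    rng = random.Random(seed)
    orders = []
    for user_id in user_ids:
        for _ in range(sample_child_count(rng, minimum, maximum)):
            orders.append({"order_id": len(orders) + 1, "user_id": user_id})
    return orders


orders = generate_orders(["u1", "u2", "u3"], minimum=0, maximum=20, seed=7)
print(len(orders), "orders generated for 3 users")
```

Clamping keeps every parent inside the configured bounds, at the cost of slightly inflating the counts at exactly min and max.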

Performance Optimization

Configure these options for large-scale data generation:

Generation Options
"options": {
  "concurrency": 4,       // Parallel threads
  "batch_size": 1000,     // Rows per batch
  "memory_limit": "2GB",  // Memory cap
  "format": "ndjson",     // Output format
  "compression": "gzip"   // On-the-fly compression
}
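
To make the batch_size, format, and compression options concrete, here is a minimal Python sketch of writing rows as gzip-compressed NDJSON in fixed-size batches, using only the standard library. It illustrates the general technique, not Shadow Shift's writer. `batched` and `write_ndjson_gzip` are hypothetical helpers, and concurrency and memory_limit are not modeled.

```python
import gzip
import json
import os
import tempfile
from itertools import islice


def batched(rows, batch_size):
    """Yield lists of up to batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def write_ndjson_gzip(rows, path, batch_size=1000):
    """Write rows as one JSON object per line, compressed as they are written."""
    written = 0
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for batch in batched(rows, batch_size):
            # One write call per batch keeps only batch_size rows in memory.
            out.write("".join(json.dumps(row) + "\n" for row in batch))
            written += len(batch)
    return written


if __name__ == "__main__":
    rows = ({"id": i, "login_count": i % 500} for i in range(2500))
    path = os.path.join(tempfile.mkdtemp(), "users.ndjson.gz")
    print(write_ndjson_gzip(rows, path, batch_size=1000), "rows written to", path)
```

Because the rows come from a generator and are flushed batch by batch, peak memory depends on the batch size, not on the total row count.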

🔧 Troubleshooting Tips

📚 Further Resources