Dependency Management

Building complex computational graphs with bead's input system

Understanding Dependencies

In bead, dependencies are explicit connections between computational units. When bead B depends on bead A, it means B uses A’s outputs as inputs.

The Input System

Basic Dependency Addition

# Add a dependency
$ bead input add processed-data

# What happens:
# 1. Searches all bead boxes for 'processed-data'
# 2. Finds latest version
# 3. Extracts outputs to input/processed-data/
# 4. Records dependency in .bead-meta/

Input Directory Structure

my-analysis/
├── input/
│   ├── processed-data/      # From one bead
│   │   ├── clean.csv
│   │   └── README.md
│   └── model-parameters/    # From another bead
│       └── config.json
├── output/
├── src/
└── temp/
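With this layout, each dependency is addressed by its directory name under input/. A minimal consumer sketch (the file names are the ones shown in the tree above; the helper is illustrative, not part of bead):

```python
import json
import os

import pandas as pd


def load_inputs(base='input'):
    """Read each dependency from its subdirectory under input/."""
    clean = pd.read_csv(os.path.join(base, 'processed-data', 'clean.csv'))
    with open(os.path.join(base, 'model-parameters', 'config.json')) as f:
        params = json.load(f)
    return clean, params
```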

Complete Input Commands

Adding Dependencies

# Basic add (finds latest version)
bead input add survey-responses

# Add with custom name
bead input add responses survey-2024

# Add specific version
bead input add survey-responses --time 20250730T120000+0200

# Add from specific file
bead input add model-output /path/to/model_20250730.zip

Loading and Unloading

Save disk space by unloading large inputs and reloading them on demand:

# Unload to free space (keeps dependency definition)
$ bead input unload large-dataset
$ ls input/
# large-dataset folder gone

# Load when needed again
$ bead input load large-dataset
# Data restored from bead box

Updating Dependencies

Keep inputs current as upstream beads evolve:

# Update single input to latest
$ bead input update processed-data

# Update all inputs
$ bead input update

# See what would update without changing
$ bead input update --dry-run

Advanced Operations

# Remap to different source
$ bead input map old-data new-cleaned-data

# Delete dependency entirely  
$ bead input delete test-data

# List all current inputs
$ bead input list

Version Management

Understanding Versions

Each `bead save` creates a new version, identified by a timestamp:

survey-data_20250729T100000+0200.zip  # Version 1
survey-data_20250730T100000+0200.zip  # Version 2  
survey-data_20250730T150000+0200.zip  # Version 3
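Because the timestamp suffix sorts chronologically, versions can be parsed and compared in code; a sketch using the filenames above (the parsing helper is an assumption, not a bead API):

```python
from datetime import datetime


def version_of(filename):
    """Parse the timestamp out of a frozen bead's filename."""
    stamp = filename.rsplit('_', 1)[1].rsplit('.', 1)[0]
    return datetime.strptime(stamp, '%Y%m%dT%H%M%S%z')


archives = [
    'survey-data_20250729T100000+0200.zip',
    'survey-data_20250730T100000+0200.zip',
    'survey-data_20250730T150000+0200.zip',
]
latest = max(archives, key=version_of)
```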

Pinning Versions

# Always use latest (default)
$ bead input add survey-data

# Pin to specific time
$ bead input add survey-data --time 20250730T100000+0200

# Update to previous version
$ bead input update --prev survey-data

Version Conflicts

When your input is outdated:

$ bead input update
Updating processed-data:
  Current: processed-data_20250729T100000+0200.zip
  Latest:  processed-data_20250730T150000+0200.zip
Update? [y/N]: y

Complex Dependency Patterns

Multiple Dependencies

$ bead new multi-source-analysis
$ cd multi-source-analysis

# Add multiple data sources
$ bead input add customer-data
$ bead input add transaction-logs  
$ bead input add product-catalog

Use in your analysis script (analyze.py):

import pandas as pd

customers = pd.read_csv('input/customer-data/customers.csv')
transactions = pd.read_csv('input/transaction-logs/logs.csv')
products = pd.read_csv('input/product-catalog/products.csv')

# Merge and analyze...
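The merge step could continue like this; the key columns `customer_id` and `product_id` are assumptions about the CSVs, not anything bead enforces (toy frames stand in for the three inputs):

```python
import pandas as pd

# Toy frames standing in for the three inputs above
customers = pd.DataFrame({'customer_id': [1, 2], 'region': ['N', 'S']})
transactions = pd.DataFrame({'customer_id': [1, 1, 2],
                             'product_id': [10, 11, 10]})
products = pd.DataFrame({'product_id': [10, 11], 'price': [5.0, 7.5]})

# Enrich each transaction with customer and product attributes
enriched = (transactions
            .merge(customers, on='customer_id', how='left')
            .merge(products, on='product_id', how='left'))
revenue_by_region = enriched.groupby('region')['price'].sum()
```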

Dependency Chains

Build pipelines where each step depends on the previous:

# Step 1: Raw data
bead new raw-sensor-readings
# ... download data ...
bead save data-lake

# Step 2: Cleaning  
bead new clean-sensor-data
bead input add raw-sensor-readings
# ... clean data ...
bead save processed

# Step 3: Analysis
bead new sensor-analysis
bead input add clean-sensor-data
# ... analyze ...
bead save results

# Step 4: Visualization
bead new dashboard
bead input add sensor-analysis
# ... create plots ...

Branching Dependencies

One bead can serve as input to many others:

        ┌→ regional-analysis
        │
base-data──→ temporal-analysis
        │
        └→ cohort-analysis

Implementation:

# Each analysis starts with same base
$ bead new regional-analysis
$ bead input add base-data

$ bead new temporal-analysis  
$ bead input add base-data

$ bead new cohort-analysis
$ bead input add base-data

Managing Large Dependencies

Selective Loading with -x Flag

For beads with large outputs:

# Development: don't load outputs
$ bead develop large-model-results

# When you need to inspect outputs
$ bead develop -x large-model-results

Partial Dependencies

When you only need some files:

# In your code, check what's available
import os

import pandas as pd

if os.path.exists('input/large-data/subset.csv'):
    # Use the subset for development
    data = pd.read_csv('input/large-data/subset.csv')
else:
    # Full data in production
    data = pd.read_csv('input/large-data/full.csv')

Troubleshooting Dependencies

Missing Dependencies

$ python analyze.py
FileNotFoundError: input/model-output/predictions.csv

# Solution 1: Load the input
$ bead input load model-output

# Solution 2: Check if input is defined
$ grep model-output .bead-meta/bead

Wrong Version Loaded

# Check current version
$ ls -la input/processed-data/
# Check timestamp in filename

# Update to latest
$ bead input update processed-data

# Or pin to specific version
$ bead input map processed-data processed-data_20250730T120000+0200.zip

Circular Dependencies

bead prevents circular dependencies:

$ bead input add analysis-b
ERROR: Circular dependency detected:
  analysis-a → analysis-b → analysis-a

Solution: refactor the graph into a proper DAG:

# Extract common elements
$ bead new shared-preprocessing
# Both analyses can depend on this
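For larger graphs it can help to sanity-check a planned layout on paper before creating beads. A minimal cycle check over a hand-written dependency map (the map itself is illustrative; bead performs its own check when you run `bead input add`):

```python
def find_cycle(deps):
    """Return a dependency cycle as a list of names, or None.

    deps maps each bead name to the beads it takes as inputs.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {name: WHITE for name in deps}

    def visit(name, path):
        color[name] = GRAY
        for dep in deps.get(name, []):
            if color.get(dep, WHITE) == GRAY:
                return path + [dep]       # back-edge: cycle found
            if color.get(dep, WHITE) == WHITE and dep in deps:
                found = visit(dep, path + [dep])
                if found:
                    return found
        color[name] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            found = visit(name, [name])
            if found:
                return found
    return None


# The broken layout from the error above:
bad = {'analysis-a': ['analysis-b'], 'analysis-b': ['analysis-a']}
# The refactored DAG:
good = {'analysis-a': ['shared-preprocessing'],
        'analysis-b': ['shared-preprocessing'],
        'shared-preprocessing': []}
```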

Best Practices

1. Descriptive Input Names

# Match input name to source bead
$ bead input add customer-demographics

# Clear names in code
demographics = pd.read_csv('input/customer-demographics/data.csv')

2. Document Dependencies

In your README:

## Dependencies

This bead requires:
- `survey-responses`: Raw survey data (v2024-07-30 or later)
- `census-data`: Population statistics for weighting

3. Test with Different Versions

# Test with latest
$ bead input update
$ make test

# Test with production version
$ bead input map survey-data survey-data_20250701T000000+0200.zip
$ make test

4. Handle Missing Inputs Gracefully

import os
import sys

required_inputs = ['survey-data', 'model-parameters']
for inp in required_inputs:
    if not os.path.exists(f'input/{inp}'):
        print(f"ERROR: Required input '{inp}' not loaded")
        print(f"Run: bead input load {inp}")
        sys.exit(1)

Advanced Patterns

Conditional Dependencies

# config.json specifies which inputs to use
import json
import os

import pandas as pd

with open('config.json') as f:
    config = json.load(f)

if config['use_external_data']:
    if not os.path.exists('input/external-source'):
        raise ValueError("External data not loaded")
    external = pd.read_csv('input/external-source/data.csv')

Dependency Versioning in Analysis

Track which versions produced results:

# Record input versions in output
import json
import os

versions = {}
for input_dir in os.listdir('input'):
    # get_input_version is a placeholder: implement it however you
    # track versions, e.g. by parsing the frozen archive's timestamp
    versions[input_dir] = get_input_version(input_dir)

with open('output/input-versions.json', 'w') as f:
    json.dump(versions, f, indent=2)
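If you would rather not depend on bead's internal metadata layout at all, a content fingerprint of each input directory serves as a version proxy (this helper is an assumption, not part of bead):

```python
import hashlib
import os


def directory_fingerprint(path):
    """Hash file names and contents so any change yields a new value."""
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(path):
        dirs.sort()  # make traversal order deterministic
        for name in sorted(files):
            full = os.path.join(root, name)
            digest.update(os.path.relpath(full, path).encode())
            with open(full, 'rb') as f:
                digest.update(f.read())
    return digest.hexdigest()
```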

Multi-Stage Processing

# Stage 1: Quick prototype with subset
$ bead input add data-subset
$ python prototype.py
$ bead save prototype

# Stage 2: Full analysis with complete data  
$ bead input delete data-subset
$ bead input add data-complete
$ python full_analysis.py
$ bead save final

Ready to collaborate? Continue to Team Collaboration to learn how teams work together with bead.