directed acyclic graphs (dags)

The Streams Engine in the Instinct API uses Directed Acyclic Graphs (DAGs) as the fundamental structure for organizing data processing pipelines.

what is a dag (stream) in the instinct api?

In the Instinct API, streams are implemented as directed acyclic graphs (DAGs), a type of graph in which:

  • Data flows in one direction (directed)
  • There are no loops or cycles (acyclic)

This structure makes DAGs ideal for representing data processing pipelines, where data moves from source nodes through processing nodes to sink nodes, with each node performing a specific operation.

key properties in stream processing

directed

The "directed" property means that data flows in a specific direction, from upstream nodes to downstream nodes. In practical terms:

  • Each connection (pipe) has a clear source and destination
  • Data only flows in one direction along each pipe
  • The direction of flow dictates the order of operations

acyclic

The "acyclic" property means that no sequence of pipes leads from a node back to itself, so data never returns to a node that has already processed it. This prevents infinite loops and ensures that processing eventually completes.

dag implementation in the instinct api

components

The Instinct API's implementation of DAGs is built from four components:

  • Nodes: The processing units that perform operations on data
  • Pipes: The connections between nodes that define how data flows
  • Ports: The interfaces on nodes where pipes connect (input or output)
  • Stream: The complete DAG containing all nodes and pipes
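
As a rough sketch of how these components relate, the following Python data classes model nodes, pipes, and a stream. The field names mirror the keys of the stream configuration example on this page; the class names and the `downstream_of` / `upstream_of` helpers are hypothetical and are not part of the Instinct API. Ports are left out because the configuration example does not spell them out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A processing unit (hypothetical model, not an Instinct API class)."""
    id: str
    executable: str
    config: dict = field(default_factory=dict)


@dataclass
class Pipe:
    """A directed connection between two nodes."""
    id: str
    source: str  # id of the upstream node
    target: str  # id of the downstream node


@dataclass
class Stream:
    """The complete DAG: all nodes plus the pipes that connect them."""
    nodes: list[Node]
    pipes: list[Pipe]

    def downstream_of(self, node_id: str) -> list[str]:
        # Nodes that receive data directly from node_id.
        return [p.target for p in self.pipes if p.source == node_id]

    def upstream_of(self, node_id: str) -> list[str]:
        # Nodes that send data directly to node_id.
        return [p.source for p in self.pipes if p.target == node_id]


stream = Stream(
    nodes=[Node("eeg-source", "headset-reader"), Node("filter", "signal-filter")],
    pipes=[Pipe("raw-data", "eeg-source", "filter")],
)
print(stream.downstream_of("eeg-source"))
```

Keeping pipes as separate objects, rather than storing neighbours on each node, matches the configuration format, where nodes and pipes are declared in two independent lists.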

representation in stream configuration

The Instinct API represents DAGs through node and pipe definitions in stream configurations:

{
  "nodes": [
    {
      "id": "eeg-source",
      "executable": "headset-reader",
      "config": {
        "channels": ["Fp1", "Fp2", "F3", "F4"],
        "sampleRate": 250
      }
    },
    {
      "id": "filter",
      "executable": "signal-filter",
      "config": {
        "highpass": 0.5,
        "lowpass": 50,
        "notch": 60
      }
    },
    {
      "id": "analyzer",
      "executable": "frequency-analyzer",
      "config": {
        "bands": ["alpha", "beta", "theta", "delta"],
        "windowSize": 256
      }
    },
    {
      "id": "visualizer",
      "executable": "data-visualizer",
      "config": {
        "displayMode": "spectral",
        "refreshRate": 10
      }
    }
  ],
  "pipes": [
    { "id": "raw-data", "source": "eeg-source", "target": "filter" },
    { "id": "filtered-data", "source": "filter", "target": "analyzer" },
    { "id": "analysis-results", "source": "analyzer", "target": "visualizer" }
  ]
}
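
To illustrate how a configuration like this maps onto a graph, the sketch below parses a trimmed copy of it (the `executable` and `config` keys are dropped for brevity) and derives upstream and downstream adjacency lists, from which source and sink nodes fall out. The `build_graph` function is a hypothetical helper, not an Instinct API call, and it assumes every pipe refers to a declared node.

```python
import json

# Trimmed copy of the stream configuration above (ids and pipes only).
CONFIG_JSON = """
{
  "nodes": [
    {"id": "eeg-source"}, {"id": "filter"},
    {"id": "analyzer"}, {"id": "visualizer"}
  ],
  "pipes": [
    {"id": "raw-data", "source": "eeg-source", "target": "filter"},
    {"id": "filtered-data", "source": "filter", "target": "analyzer"},
    {"id": "analysis-results", "source": "analyzer", "target": "visualizer"}
  ]
}
"""


def build_graph(config):
    """Return (downstream, upstream) adjacency dicts keyed by node id."""
    downstream = {node["id"]: [] for node in config["nodes"]}
    upstream = {node["id"]: [] for node in config["nodes"]}
    for pipe in config["pipes"]:
        downstream[pipe["source"]].append(pipe["target"])
        upstream[pipe["target"]].append(pipe["source"])
    return downstream, upstream


config = json.loads(CONFIG_JSON)
downstream, upstream = build_graph(config)

# A source has no incoming pipes; a sink has no outgoing pipes.
sources = [node for node, ups in upstream.items() if not ups]
sinks = [node for node, downs in downstream.items() if not downs]
print("sources:", sources)
print("sinks:", sinks)
```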

dag operations in the instinct api

topological sorting

The Instinct API uses topological sorting to determine the execution sequence for nodes in a DAG. This ensures:

  • Nodes process data only after all upstream dependencies have completed
  • The execution order respects the data dependencies defined by pipes
  • The system can efficiently schedule parallel execution where dependencies allow
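
The source does not say which sorting algorithm the Streams Engine uses. As an illustration only, the sketch below applies Kahn's algorithm and groups nodes into levels: every node in a level depends only on nodes in earlier levels, so the nodes within one level could run in parallel. The function name and the node ids are hypothetical.

```python
def topological_levels(nodes, pipes):
    """Group node ids into dependency levels using Kahn's algorithm.

    nodes: list of node ids
    pipes: list of (source, target) pairs
    Raises ValueError if the graph contains a cycle.
    """
    indegree = {node: 0 for node in nodes}
    downstream = {node: [] for node in nodes}
    for source, target in pipes:
        downstream[source].append(target)
        indegree[target] += 1

    # Start with the nodes that have no upstream dependencies.
    level = [node for node in nodes if indegree[node] == 0]
    levels = []
    visited = 0
    while level:
        levels.append(level)
        visited += len(level)
        next_level = []
        for node in level:
            for target in downstream[node]:
                indegree[target] -= 1
                if indegree[target] == 0:  # all dependencies satisfied
                    next_level.append(target)
        level = next_level

    if visited != len(nodes):
        raise ValueError("cycle detected: not every node could be ordered")
    return levels


# A branching stream: the two analyzers share a level.
nodes = ["eeg-source", "filter", "alpha", "beta", "merger", "visualizer"]
pipes = [
    ("eeg-source", "filter"),
    ("filter", "alpha"), ("filter", "beta"),
    ("alpha", "merger"), ("beta", "merger"),
    ("merger", "visualizer"),
]
levels = topological_levels(nodes, pipes)
print(levels)
```

Python's standard library also ships `graphlib.TopologicalSorter`, which offers the same ordering guarantees without hand-written bookkeeping.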

dag validation

Before executing a stream, the Instinct API validates the DAG structure to ensure:

  • All node references in pipes exist in the stream definition
  • No cycles are present that would create infinite processing loops
  • Every node is attached to at least one pipe (no isolated nodes)
  • Input and output port connections between nodes are compatible
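
The first three checks can be sketched in a few lines of Python. This is a minimal illustration over the configuration format shown on this page, not the API's actual validator: `validate_stream` and its error messages are hypothetical, and the port compatibility check is omitted because the port schema is not described here.

```python
def validate_stream(config):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    node_ids = {node["id"] for node in config["nodes"]}

    # 1. Every pipe endpoint must refer to a declared node.
    valid_pipes = []
    for pipe in config["pipes"]:
        missing = [end for end in (pipe["source"], pipe["target"])
                   if end not in node_ids]
        if missing:
            errors.append(f"pipe {pipe['id']!r} references unknown node(s): {missing}")
        else:
            valid_pipes.append((pipe["source"], pipe["target"]))

    # 2. Every node must be attached to at least one pipe.
    connected = {end for pair in valid_pipes for end in pair}
    for node_id in sorted(node_ids - connected):
        errors.append(f"node {node_id!r} is isolated")

    # 3. No cycles: repeatedly remove nodes that have no incoming pipes.
    #    If some nodes can never be removed, they sit on a cycle.
    indegree = {node_id: 0 for node_id in node_ids}
    for _, target in valid_pipes:
        indegree[target] += 1
    ready = [node_id for node_id in node_ids if indegree[node_id] == 0]
    removed = 0
    while ready:
        current = ready.pop()
        removed += 1
        for source, target in valid_pipes:
            if source == current:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    if removed != len(node_ids):
        errors.append("stream contains a cycle")

    return errors


cyclic = {
    "nodes": [{"id": "a"}, {"id": "b"}],
    "pipes": [
        {"id": "p1", "source": "a", "target": "b"},
        {"id": "p2", "source": "b", "target": "a"},
    ],
}
print(validate_stream(cyclic))
```

Collecting all problems instead of stopping at the first one lets a stream author fix a configuration in a single pass.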

stream processing patterns

linear pipelines

The simplest DAG structure in the Instinct API is a linear pipeline where each node processes data sequentially:

EEG Source → Signal Filter → Feature Extractor → Data Storage

branching workflows

The Instinct API supports branching to process data in parallel paths:

                    → Alpha Band Analyzer →
EEG Source → Filter                         Results Merger → Visualizer
                    → Beta Band Analyzer  →

merging paths

Multiple processing paths can merge at a node that combines results:

EEG Channel 1 →
                Signal Combiner → Feature Extraction → Classifier
EEG Channel 2 →

conditional routing

Data can be routed conditionally to different processing paths based on signal properties:

                → Artifact Removal → Reintegration →
EEG Source → QA                                      Analysis
                → Clean Signal --------------------→
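
One way to picture the routing decision at the QA node is a function that inspects a segment and names the path it should take. The sketch below is purely illustrative: the peak-amplitude rule, the 100 µV threshold, and the target names are assumptions, and the source does not describe how the Instinct API expresses routing conditions. Both paths still exist as ordinary pipes in the graph, so the stream remains acyclic; only the choice of path happens at run time.

```python
def route_segment(samples, artifact_threshold_uv=100.0):
    """Pick a downstream path for one segment of samples (values in µV).

    Hypothetical rule: a segment whose peak absolute amplitude exceeds
    the threshold is treated as contaminated and sent for artifact removal.
    """
    peak = max(abs(sample) for sample in samples)
    if peak > artifact_threshold_uv:
        return "artifact-removal"
    return "analysis"


print(route_segment([10.0, -20.0, 15.0]))
print(route_segment([10.0, 250.0, -5.0]))
```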

stream execution

The execution of a DAG in the Instinct API involves:

  1. Initialization: Setting up each node based on configuration parameters
  2. Data Propagation: Transferring data packets through pipes from sources to sinks
  3. Concurrent Processing: Running independent node operations in parallel when possible
  4. Termination: Proper shutdown of all nodes when processing completes
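
The following sketch shows how these phases could fit together in a simplified, one-shot form. It is not the Streams Engine: a real engine presumably moves data packets continuously, whereas this version pushes a single input through the graph. All names (`run_stream`, the node ids, the toy operations) are hypothetical. Nodes are grouped into dependency levels, and the nodes within a level are submitted to a thread pool so that independent work can overlap.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stream(levels, upstream, operations, source_input):
    """Run each node once, level by level.

    levels:      list of lists of node ids, in dependency order
    upstream:    node id -> list of upstream node ids
    operations:  node id -> callable(list_of_inputs) -> output
    """
    results = {}
    # Entering the pool is the setup step; leaving the `with` block
    # shuts the workers down, which stands in for termination.
    with ThreadPoolExecutor() as pool:
        for level in levels:
            futures = {}
            for node_id in level:
                # Source nodes have no upstream results, so they get the raw input.
                inputs = [results[u] for u in upstream[node_id]] or [source_input]
                futures[node_id] = pool.submit(operations[node_id], inputs)
            # Wait for the whole level before starting the next one.
            for node_id, future in futures.items():
                results[node_id] = future.result()
    return results


levels = [["source"], ["double", "negate"], ["merge"]]
upstream = {
    "source": [],
    "double": ["source"],
    "negate": ["source"],
    "merge": ["double", "negate"],
}
operations = {
    "source": lambda inputs: inputs[0],
    "double": lambda inputs: [2 * x for x in inputs[0]],
    "negate": lambda inputs: [-x for x in inputs[0]],
    "merge": lambda inputs: [a + b for a, b in zip(*inputs)],
}
results = run_stream(levels, upstream, operations, [1, 2, 3])
print(results["merge"])
```

Waiting for a full level before starting the next is the simplest correct schedule; a finer-grained scheduler could start a node as soon as its own upstream nodes finish.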

best practices for stream design

When designing DAGs for the Instinct API:

  1. Optimize for Data Flow: Design stream topologies that minimize unnecessary data transformations
  2. Node Granularity: Create focused nodes with single responsibilities rather than complex multi-function nodes
  3. Error Handling: Include error handling paths to manage processing failures gracefully
  4. Monitoring Points: Add monitoring nodes at key points to observe pipeline performance
  5. Resource Allocation: Consider CPU and memory requirements when designing parallel processing paths
  6. Pipeline Documentation: Document the purpose and requirements of your stream design

common stream processing patterns

signal processing pipeline

EEG Source → Filtering → Artifact Removal → Feature Extraction → Classification

real-time monitoring

Headset Data → Signal Processor → Analyzer → Alert System
→ Visualization Dashboard

data collection and analysis

EEG Data → Preprocessor → Feature Extractor → Machine Learning Model → Results
         → Raw Data Storage

next steps

  • Learn about Streams to understand the complete pipeline concept
  • Explore Nodes to see the different processing components available
  • Understand Pipes to master data flow connections between nodes
  • Follow our Basic Pipeline Tutorial to build your first stream