API Reference¶
This section provides detailed documentation for all public classes and functions in Edsger.
edsger.path Module¶
Dijkstra Class¶
The main class for performing Dijkstra’s shortest path algorithm on directed graphs.
Constructor Parameters¶
edges: pandas.DataFrame containing the directed graph edgestail: Column name for edge source nodes (default: ‘tail’)head: Column name for edge destination nodes (default: ‘head’)weight: Column name for edge weights (default: ‘weight’)orientation: Either ‘out’ (single-source) or ‘in’ (single-target) (default: ‘out’)check_edges: Whether to validate edge data (default: False)permute: Whether to optimize node indexing (default: False)verbose: Whether to print messages about parallel edge removal (default: False)
Methods¶
run¶
def run(self, vertex_idx, path_tracking=False, return_inf=True,
return_series=False, heap_length_ratio=1.0)
Runs the shortest path algorithm from/to the specified vertex.
Parameters:
vertex_idx: Source/target vertex indexpath_tracking: Whether to track paths for reconstruction (default: False)return_inf: Return infinity for unreachable vertices (default: True)return_series: Return results as pandas Series (default: False)heap_length_ratio: Heap size as fraction of vertices (default: 1.0)
Returns:
Array or Series of shortest path lengths
get_path¶
def get_path(self, vertex_idx)
Reconstructs the shortest path to/from a vertex (requires path_tracking=True).
Parameters:
vertex_idx: Destination/source vertex index
Returns:
Array of vertex indices forming the path
get_vertices¶
def get_vertices(self)
Returns all vertices in the graph.
Returns:
Array of vertex indices
Examples¶
Basic Usage¶
from edsger.path import Dijkstra
import pandas as pd
# Create a directed graph
edges = pd.DataFrame({
'tail': [0, 0, 1],
'head': [1, 2, 2],
'weight': [1, 4, 2]
})
# Initialize Dijkstra
dijkstra = Dijkstra(edges)
# Find shortest paths from vertex 0
paths = dijkstra.run(vertex_idx=0)
With Path Tracking¶
# Enable path tracking
paths = dijkstra.run(vertex_idx=0, path_tracking=True)
# Get the actual path to vertex 2
path = dijkstra.get_path(vertex_idx=2)
Custom Parameters¶
# Create with custom settings
dijkstra = Dijkstra(
edges,
orientation='in',
check_edges=True,
permute=True
)
# Run with custom parameters
paths = dijkstra.run(
vertex_idx=2,
return_series=True,
heap_length_ratio=0.5
)
BellmanFord Class¶
The class for performing Bellman-Ford shortest path algorithm on directed graphs, which handles directed graphs with negative edge weights and detects negative cycles.
Constructor Parameters¶
edges: pandas.DataFrame containing the directed graph edgestail: Column name for edge source nodes (default: ‘tail’)head: Column name for edge destination nodes (default: ‘head’)weight: Column name for edge weights (default: ‘weight’)orientation: Either ‘out’ (single-source) or ‘in’ (single-target) (default: ‘out’)check_edges: Whether to validate edge data (default: False)permute: Whether to optimize node indexing (default: False)verbose: Whether to print messages about parallel edge removal (default: False)
Methods¶
run¶
def run(self, vertex_idx, path_tracking=False, return_inf=True,
return_series=False, detect_negative_cycles=True)
Runs the Bellman-Ford algorithm from/to the specified vertex.
Parameters:
vertex_idx: Source/target vertex indexpath_tracking: Whether to track paths for reconstruction (default: False)return_inf: Return infinity for unreachable vertices (default: True)return_series: Return results as pandas Series (default: False)detect_negative_cycles: Whether to detect negative cycles (default: True)
Returns:
Array or Series of shortest path lengths
Raises:
ValueError: If a negative cycle is detected anddetect_negative_cycles=True
get_path¶
def get_path(self, vertex_idx)
Reconstructs the shortest path to/from a vertex (requires path_tracking=True).
Parameters:
vertex_idx: Destination/source vertex index
Returns:
Array of vertex indices forming the path
get_vertices¶
def get_vertices(self)
Returns all vertices in the graph.
Returns:
Array of vertex indices
Examples¶
Basic Usage with Negative Weights¶
from edsger.path import BellmanFord
import pandas as pd
# Create a directed graph with negative weights
edges = pd.DataFrame({
'tail': [0, 0, 1, 1, 2, 3],
'head': [1, 2, 2, 3, 3, 4],
'weight': [1, 4, -2, 5, 1, 3] # Note the negative weight
})
# Initialize Bellman-Ford
bf = BellmanFord(edges)
# Find shortest paths from vertex 0
paths = bf.run(vertex_idx=0)
print(paths) # [ 0. 1. -1. 0. 3.]
Negative Cycle Detection¶
# Create a directed graph with a negative cycle
edges_cycle = pd.DataFrame({
'tail': [0, 1, 2],
'head': [1, 2, 0],
'weight': [1, -2, -1] # Cycle 0→1→2→0 has total weight -2
})
bf_cycle = BellmanFord(edges_cycle)
try:
bf_cycle.run(vertex_idx=0)
except ValueError as e:
print(f"Error: {e}") # Error: Negative cycle detected in the graph
Path Tracking with Negative Weights¶
# Enable path tracking
paths = bf.run(vertex_idx=0, path_tracking=True)
# Get the actual path to vertex 2 (using negative weight path)
path = bf.get_path(vertex_idx=2)
print(path) # Path using the negative weight edge
Performance: Disabling Cycle Detection¶
# For performance when you know there are no negative cycles
paths = bf.run(vertex_idx=0, detect_negative_cycles=False)
BFS Class¶
The class for performing Breadth-First Search on directed graphs, finding shortest paths based on minimum number of edges rather than weighted distances.
Constructor Parameters¶
edges: pandas.DataFrame containing the directed graph edgestail: Column name for edge source nodes (default: ‘tail’)head: Column name for edge destination nodes (default: ‘head’)orientation: Either ‘out’ (single-source) or ‘in’ (single-target) (default: ‘out’)check_edges: Whether to validate edge data (default: False)permute: Whether to optimize node indexing (default: False)verbose: Whether to print messages about parallel edge removal (default: False)sentinel: Sentinel value for unreachable nodes (default: -9999). Must be a negative integer that fits in int32 range.
Note: Weight column is not used by BFS. If present in the DataFrame, it will be ignored.
Methods¶
run¶
def run(self, vertex_idx, path_tracking=False, return_series=False)
Runs BFS from/to the specified vertex.
Parameters:
vertex_idx: Source/target vertex indexpath_tracking: Whether to track paths for reconstruction (default: False)return_series: Return results as pandas Series (default: False)
Returns:
Array or Series of predecessor indices
The sentinel value (default
-9999) indicates unreachable vertex or start vertexOther values indicate the predecessor in the shortest path
get_path¶
def get_path(self, vertex_idx)
Reconstructs the shortest path to/from a vertex (requires path_tracking=True).
Parameters:
vertex_idx: Destination/source vertex index
Returns:
Array of vertex indices forming the path (from target to source)
get_vertices¶
def get_vertices(self)
Returns all vertices in the directed graph.
Returns:
Array of vertex indices
Examples¶
Basic BFS¶
from edsger.path import BFS
import pandas as pd
# Create a directed graph
edges = pd.DataFrame({
'tail': [0, 0, 1, 2, 2, 3],
'head': [1, 2, 3, 3, 4, 4]
})
# Initialize BFS
bfs = BFS(edges)
# Find shortest paths (by hop count) from vertex 0
predecessors = bfs.run(vertex_idx=0)
print(predecessors) # [-9999 0 0 1 2]
Custom Sentinel Value¶
# Create BFS with custom sentinel for unreachable nodes
bfs = BFS(edges, sentinel=-1)
predecessors = bfs.run(vertex_idx=0)
# Check the sentinel value
print(bfs.UNREACHABLE) # -1
# Unreachable nodes will have value -1 instead of -9999
With Path Tracking¶
# Enable path tracking
predecessors = bfs.run(vertex_idx=0, path_tracking=True)
# Get the actual path to vertex 4
path = bfs.get_path(vertex_idx=4)
print(path) # [4 2 0] - path from target to source
With Non-Contiguous Vertex IDs¶
# Directed graph with non-contiguous vertex IDs
edges_sparse = pd.DataFrame({
'tail': [0, 0, 10, 20],
'head': [10, 20, 20, 30]
})
# Use permute=True for efficiency
bfs = BFS(edges_sparse, permute=True)
predecessors = bfs.run(vertex_idx=0, path_tracking=True)
path = bfs.get_path(vertex_idx=30)
print(path) # [30 20 0]
Backward Search (orientation=’in’)¶
# Find vertices that can reach a target
bfs_in = BFS(edges, orientation='in')
predecessors = bfs_in.run(vertex_idx=4)
# Shows successors in backward traversal
Return as Pandas Series¶
# Get results as labeled Series
predecessors = bfs.run(vertex_idx=0, return_series=True)
print(predecessors)
# vertex_idx
# 0 -9999
# 1 0
# 2 0
# 3 1
# 4 2
# Name: predecessor, dtype: int32
Performance Characteristics¶
Time Complexity: O(V + E) - linear in vertices and edges
Space Complexity: O(V) - queue and predecessor arrays
HyperpathGenerating Class¶
The class for computing hyperpath-based routing using the Spiess-Florian algorithm, designed for transit network assignment and optimal strategy computation.
Reference: Spiess, H. and Florian, M. (1989). Optimal strategies: A new assignment model for transit networks. Transportation Research Part B 23(2), 83-102.
Overview¶
The Spiess-Florian algorithm computes optimal travel strategies in transit networks where passengers can choose between multiple routes with different frequencies and travel times. Unlike traditional shortest path algorithms, it models the probabilistic nature of transit where passengers may take the first arriving vehicle among a set of attractive lines.
The algorithm outputs:
Edge volumes: How passenger demand is distributed across transit lines
Travel times (u_i_vec): Expected travel time from each vertex to the destination, including waiting time
Constructor Parameters¶
edges: pandas.DataFrame containing transit network edgestail: Column name for edge source nodes (default: ‘tail’)head: Column name for edge destination nodes (default: ‘head’)trav_time: Column name for travel times (default: ‘trav_time’)freq: Column name for service frequencies (default: ‘freq’)check_edges: Whether to validate edge data (default: False)orientation: Currently only ‘in’ is supported (default: ‘in’)
Attributes¶
edge_count: Number of edges in the graphvertex_count: Number of vertices in the graphu_i_vec: Array of expected travel times from each vertex to the destination (available afterrun())_edges: DataFrame containing edges with computed ‘volume’ column (available afterrun())
Methods¶
run¶
def run(self, origin, destination, volume, return_inf=False)
Computes the hyperpath and distributes demand volume across edges.
Parameters:
origin: Origin vertex index (int) or list of origin indicesdestination: Destination vertex index (int)volume: Demand volume (float) or list of volumes corresponding to each originreturn_inf: Whether to keep infinity for unreachable vertices (default: False)
Returns:
None (results are stored in instance attributes
u_i_vecand_edges['volume'])
Examples¶
Basic Transit Network¶
from edsger.path import HyperpathGenerating
import pandas as pd
# Create a simple transit network
# Two parallel lines from stop 0 to stop 2 via different intermediate stops
edges = pd.DataFrame({
'tail': [0, 0, 1, 3],
'head': [1, 3, 2, 2],
'trav_time': [5.0, 4.0, 3.0, 4.0], # Travel times in minutes
'freq': [0.1, 0.15, 0.2, 0.2] # Frequencies (vehicles per minute)
})
# Initialize the algorithm
hp = HyperpathGenerating(edges)
# Compute hyperpath from origin 0 to destination 2 with 100 passengers
hp.run(origin=0, destination=2, volume=100.0)
# Access expected travel times to destination
print("Travel times to destination:", hp.u_i_vec)
# Access edge volumes (how passengers are distributed)
print(hp._edges[['tail', 'head', 'volume']])
Multiple Origins¶
# Compute hyperpath from multiple origins to a single destination
hp = HyperpathGenerating(edges)
hp.run(
origin=[0, 1], # Two origins
destination=2, # Single destination
volume=[80.0, 20.0] # Different demand volumes
)
# Total edge volumes reflect combined demand from all origins
print("Edge volumes:", hp._edges['volume'].values)
Custom Column Names¶
# Transit data with custom column names
transit_edges = pd.DataFrame({
'from_stop': [0, 0, 1, 2],
'to_stop': [1, 2, 2, 3],
'travel_time': [2.0, 5.0, 1.5, 3.0],
'frequency': [0.1, 0.05, 0.2, 0.15]
})
hp = HyperpathGenerating(
transit_edges,
tail='from_stop',
head='to_stop',
trav_time='travel_time',
freq='frequency'
)
hp.run(origin=0, destination=3, volume=1.0)
With Edge Validation¶
# Enable input validation
hp = HyperpathGenerating(edges, check_edges=True)
# This will raise an error if:
# - Required columns are missing
# - Data types are incorrect
# - Travel times or frequencies are negative
# - Missing values exist
Understanding the Output¶
After calling run(), two key outputs are available:
u_i_vec: Expected travel time from each vertex to the destinationIncludes both in-vehicle travel time and expected waiting time
Infinite values indicate unreachable vertices
_edges['volume']: Passenger volumes on each edgeShows how total demand is distributed across the transit network
Higher volumes indicate more heavily used transit lines
Use Cases¶
Transit assignment: Distribute passenger demand across a transit network
Line planning: Identify heavily used routes that may need more capacity
Service evaluation: Assess expected travel times for different origin-destination pairs
Network design: Evaluate the impact of adding or removing transit lines
Parallel Edges Handling¶
The Dijkstra, BellmanFord, and BFS classes automatically handle parallel edges (multiple directed edges between the same pair of vertices) during initialization. When parallel edges are detected:
Automatic Processing: The
_preprocess_edges()method is called internally during initializationMinimum Weight Selection: For each pair of vertices with multiple edges, only the edge with the minimum weight is kept
Verbose Output: If
verbose=Trueis set in the constructor, a message will be printed indicating how many parallel edges were removed
Example with Parallel Edges¶
import pandas as pd
from edsger.path import Dijkstra
# Directed graph with parallel edges between vertices 0->1
edges = pd.DataFrame({
'tail': [0, 0, 1], # Two directed edges from 0->1
'head': [1, 1, 2], # with weights 5.0 and 3.0
'weight': [5.0, 3.0, 2.0]
})
# Initialize with verbose=True to see parallel edge removal
dijkstra = Dijkstra(edges, verbose=True)
# Output: "Automatically removed 1 parallel edge(s). For each pair of vertices, kept the edge with minimum weight."
# The directed graph now has only the edge 0->1 with weight 3.0 (minimum)
print(dijkstra.n_edges) # Will show 2 instead of 3
This automatic handling ensures:
Consistent directed graph representation
Optimal shortest paths (using minimum weight edges)
No duplicate edges in the internal directed graph structure
edsger.graph_importer Module¶
GraphImporter Architecture¶
The GraphImporter system provides automatic DataFrame backend detection and optimization for all graph algorithms in Edsger.
Key Features¶
Automatic Detection: Identifies DataFrame type (pandas NumPy, pandas Arrow, or Polars)
Memory Optimization: Automatically uses uint32 for vertex indices when possible
Contiguous Memory: Ensures C-contiguous arrays for optimal Cython performance
Factory Pattern: Simple interface with automatic backend selection
standardize_graph_dataframe Function¶
from edsger.graph_importer import standardize_graph_dataframe
def standardize_graph_dataframe(edges, tail='tail', head='head', weight='weight')
Standardizes any supported DataFrame format to NumPy-backed pandas DataFrame.
Parameters:
edges: Input DataFrame (pandas, pandas with Arrow, or Polars)tail: Column name for source verticeshead: Column name for destination verticesweight: Column name for edge weights
Returns:
pandas DataFrame with NumPy arrays, optimized dtypes, and contiguous memory
Example Usage¶
import pandas as pd
import polars as pl
from edsger.graph_importer import standardize_graph_dataframe
# Works with any DataFrame type
edges_polars = pl.DataFrame({
'source': [0, 1, 2],
'target': [1, 2, 3],
'cost': [1.0, 2.0, 3.0]
})
# Automatic conversion and optimization
edges_numpy = standardize_graph_dataframe(
edges_polars,
tail='source',
head='target',
weight='cost'
)
print(edges_numpy.dtypes)
# source uint32 (optimized from int64)
# target uint32 (optimized from int64)
# cost float64
GraphImporter Classes¶
GraphImporter (Base Class)¶
Abstract base class providing the factory method for automatic DataFrame type detection.
Class Method: from_dataframe¶
@staticmethod
def from_dataframe(edges, tail='tail', head='head', weight='weight')
Factory method that automatically selects the appropriate importer based on DataFrame type.
Detection Priority:
Polars DataFrame
pandas with Arrow backend
pandas with NumPy backend (default)
PandasNumpyImporter¶
Handles standard pandas DataFrames with NumPy backend.
Minimal overhead for already NumPy-backed DataFrames
Ensures contiguous memory layout
Maintains existing dtypes
PandasArrowImporter¶
Handles pandas DataFrames with PyArrow backend.
Converts Arrow columns to NumPy arrays
Optimizes integer dtypes (uint32 when possible)
Handles nullable types gracefully
PolarsImporter¶
Handles Polars DataFrames.
Efficient conversion using Polars’ to_pandas() method
Automatic dtype optimization
Preserves performance benefits of columnar storage during conversion
Performance Considerations¶
The GraphImporter system optimizes for:
Memory Efficiency: Uses smallest possible dtype for vertex indices
Cache Locality: Ensures C-contiguous memory layout
Conversion Speed: Single conversion at initialization
Zero-Copy When Possible: Minimizes data copying during conversion
Integration with Algorithms¶
All path algorithms (Dijkstra, BellmanFord, BFS, HyperpathGenerating) automatically use the GraphImporter system:
from edsger.path import Dijkstra
import polars as pl
# GraphImporter is used internally
edges = pl.DataFrame({...})
dijkstra = Dijkstra(edges) # Automatic detection and conversion
No manual conversion is needed - the algorithms handle everything automatically!