Configuration Guide
Parameter Reference
Required Parameters
from base_attentive import BaseAttentive
model = BaseAttentive(
static_input_dim=4, # Number of static features
dynamic_input_dim=8, # Number of dynamic features
future_input_dim=6, # Number of future features
output_dim=2, # Number of output variables
forecast_horizon=24, # Forecast horizon (steps)
)
| Parameter | Type | Constraints | Description |
|---|---|---|---|
| `static_input_dim` | int | >= 0 | Static feature dimension (0 = no static input) |
| `dynamic_input_dim` | int | >= 1 | Dynamic (historical) feature dimension |
| `future_input_dim` | int | >= 0 | Future covariate dimension (0 = no future input) |
| `output_dim` | int | >= 1 | Number of output variables |
| `forecast_horizon` | int | >= 1 | Forecast length in time steps |
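The lower bounds in the table can be checked before a model is built. The sketch below is a hypothetical helper, not part of `base_attentive`; it assumes the table rows follow the order of the constructor arguments shown above.

```python
# Hypothetical helper: not part of base_attentive.
# Lower bounds copied from the Required Parameters table.
REQUIRED_MINIMUMS = {
    "static_input_dim": 0,   # 0 disables the static input
    "dynamic_input_dim": 1,
    "future_input_dim": 0,   # 0 disables the future input
    "output_dim": 1,
    "forecast_horizon": 1,
}


def check_required(params: dict) -> list:
    """Return a list of human-readable problems (empty if all checks pass)."""
    problems = []
    for name, minimum in REQUIRED_MINIMUMS.items():
        if name not in params:
            problems.append(f"{name} is missing")
            continue
        value = params[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an int, got {type(value).__name__}")
        elif value < minimum:
            problems.append(f"{name} must be >= {minimum}, got {value}")
    return problems


ok = check_required(dict(static_input_dim=4, dynamic_input_dim=8,
                         future_input_dim=6, output_dim=2, forecast_horizon=24))
bad = check_required(dict(static_input_dim=0, dynamic_input_dim=0,
                          future_input_dim=0, output_dim=2))
print(ok)
print(bad)
```

Returning a list of problems rather than raising on the first one makes it easier to report every invalid value at once.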
Architectural Parameters
model = BaseAttentive(
static_input_dim=4, dynamic_input_dim=8, future_input_dim=6,
output_dim=2, forecast_horizon=24,
embed_dim=32,
hidden_units=64,
lstm_units=64,
attention_units=32,
num_heads=4,
num_encoder_layers=2,
max_window_size=10,
memory_size=100,
)
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| `embed_dim` | int | 32 | [8, 512] | Shared embedding dimension |
| `hidden_units` | int | 64 | [16, 1024] | Dense hidden layer width |
| `lstm_units` | int | 64 | [16, 1024] | LSTM hidden size (hybrid mode) |
| `attention_units` | int | 32 | [16, 1024] | Attention projection dimension |
| `num_heads` | int | 4 | [1, 16] | Multi-head attention heads (must divide `embed_dim`) |
| `num_encoder_layers` | int | 2 | [1, 12] | Stacked encoder layer count |
| `max_window_size` | int | 10 | [1, ∞) | Maximum dynamic time window size |
| `memory_size` | int | 100 | [1, ∞) | Memory bank size for memory-augmented attention |
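The ranges in the table lend themselves to a simple pre-flight check. This is a minimal sketch with a hypothetical helper name (`out_of_range`); it assumes the table rows correspond, in order, to the keyword arguments in the example above, and it says nothing about how the library itself validates input.

```python
import math

# Hypothetical helper; ranges copied from the Architectural Parameters table.
ARCH_RANGES = {
    "embed_dim": (8, 512),
    "hidden_units": (16, 1024),
    "lstm_units": (16, 1024),
    "attention_units": (16, 1024),
    "num_heads": (1, 16),
    "num_encoder_layers": (1, 12),
    "max_window_size": (1, math.inf),
    "memory_size": (1, math.inf),
}


def out_of_range(params: dict) -> dict:
    """Map each out-of-range parameter name to its (value, low, high)."""
    bad = {}
    for name, (low, high) in ARCH_RANGES.items():
        if name in params and not (low <= params[name] <= high):
            bad[name] = (params[name], low, high)
    return bad


report = out_of_range(dict(embed_dim=4, num_heads=4, memory_size=100))
print(report)
```

Parameters that are not passed are skipped, so the check can be applied to a partial set of overrides as well as to a full configuration.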
Temporal Aggregation Parameters
model = BaseAttentive(
...,
scales=[1, 2, 4],
multi_scale_agg="last",
final_agg="last",
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `scales` | list[int] / 'auto' / None | None | LSTM sub-sampling strides |
| `multi_scale_agg` | str | 'last' | How multi-scale outputs are merged |
| `final_agg` | str | 'last' | Final temporal aggregation |
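To build intuition for sub-sampling strides, the pure-Python sketch below shows how many time steps survive at each scale for a 24-step window. It assumes a stride `s` keeps every `s`-th step starting from the first, and it reads "last" literally as "keep the final retained step"; both are assumptions for illustration, and the library's actual slicing may differ.

```python
def subsample(sequence, stride):
    """Keep every `stride`-th element, starting from the first."""
    return sequence[::stride]


steps = list(range(24))          # 24 time-step indices: 0..23
scales = [1, 2, 4]

# Sequence length seen at each scale.
lengths = {s: len(subsample(steps, s)) for s in scales}
print(lengths)                   # {1: 24, 2: 12, 4: 6}

# "last" aggregation, read literally: the final retained step per scale.
last_steps = {s: subsample(steps, s)[-1] for s in scales}
print(last_steps)                # {1: 23, 2: 22, 4: 20}
```

Under this reading, larger strides shorten the sequence each branch processes, which is why multi-scale settings are suggested for long inputs.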
Regularization Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `dropout_rate` | float | 0.1 | Dropout probability in [0, 1] |
| `activation` | str | 'relu' | Activation function |
| `use_batch_norm` | bool | False | Apply batch normalization |
| `use_residuals` | bool | True | Use residual connections |
Feature Processing Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_vsn` | bool | True | Enable Variable Selection Network |
| `vsn_units` | int or None | None | VSN projection size (a default is used when None) |
| `apply_dtw` | bool | True | Apply Dynamic Time Warping alignment |
Configuration / Routing Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `objective` | str | 'hybrid' | Encoder type, e.g. 'hybrid' or 'transformer' |
| `mode` | str or None | None | Mode shortcut |
| `attention_levels` | str / list / int / None | None | Decoder attention stack control (see below) |
| `quantiles` | list[float] or None | None | Enables probabilistic output |
| `architecture_config` | dict or None | None | Structural overrides (highest precedence) |
| `verbose` | int | 0 | Logging verbosity |
Architecture Configuration
Use architecture_config for structural choices:
config = {
"encoder_type": "hybrid",
"decoder_attention_stack": ["cross", "hierarchical", "memory"],
"feature_processing": "vsn",
}
model = BaseAttentive(
static_input_dim=4, dynamic_input_dim=8, future_input_dim=6,
output_dim=2, forecast_horizon=24,
architecture_config=config,
)
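Because `architecture_config` is described as having the highest precedence, a value set there should win over the corresponding constructor argument. The function below is a hypothetical illustration of that rule for the encoder type only; the library's real resolution logic is not shown in this guide and may consider more inputs.

```python
def resolve_encoder_type(objective="hybrid", architecture_config=None):
    """Hypothetical rule: architecture_config overrides the objective argument."""
    config = architecture_config or {}
    return config.get("encoder_type", objective)


from_argument = resolve_encoder_type(objective="hybrid")
from_config = resolve_encoder_type(
    objective="hybrid",
    architecture_config={"encoder_type": "transformer"},
)
print(from_argument)   # hybrid
print(from_config)     # transformer
```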
Attention Level Shortcuts
model = BaseAttentive(..., attention_levels=None) # all three
model = BaseAttentive(..., attention_levels="cross") # string
model = BaseAttentive(..., attention_levels=["cross", "memory"]) # list
model = BaseAttentive(..., attention_levels=1) # 1=cross, 2=hier, 3=memory
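The four accepted forms can be reduced to one canonical list. The sketch below is a hypothetical normalizer, not library code. It assumes an integer selects a single level by position (following the `1=cross, 2=hier, 3=memory` comment) and that "hier" corresponds to the `"hierarchical"` name used in `decoder_attention_stack`.

```python
# Canonical level names, taken from the decoder_attention_stack example.
ALL_LEVELS = ["cross", "hierarchical", "memory"]


def normalize_attention_levels(levels):
    """Turn None / str / list / int into a list of level names."""
    if levels is None:
        return list(ALL_LEVELS)              # None means all three
    if isinstance(levels, bool):
        raise TypeError("attention_levels must not be a bool")
    if isinstance(levels, int):
        if not 1 <= levels <= len(ALL_LEVELS):
            raise ValueError(f"integer level must be in 1..{len(ALL_LEVELS)}")
        return [ALL_LEVELS[levels - 1]]      # assumed: 1-based single selection
    if isinstance(levels, str):
        levels = [levels]
    unknown = [name for name in levels if name not in ALL_LEVELS]
    if unknown:
        raise ValueError(f"unknown attention levels: {unknown}")
    return list(levels)


print(normalize_attention_levels(None))
print(normalize_attention_levels("cross"))
print(normalize_attention_levels(["cross", "memory"]))
print(normalize_attention_levels(2))
```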
V2 Schema Configuration
For programmatic, backend-neutral construction use BaseAttentiveSpec:
from base_attentive.config import BaseAttentiveSpec, BaseAttentiveComponentSpec
spec = BaseAttentiveSpec(
static_input_dim=4,
dynamic_input_dim=8,
future_input_dim=6,
output_dim=1,
forecast_horizon=24,
embed_dim=32,
hidden_units=64,
attention_heads=4,
layer_norm_epsilon=1e-6,
dropout_rate=0.1,
activation="relu",
backend_name="tensorflow",
head_type="point",
quantiles=(),
components=BaseAttentiveComponentSpec(
sequence_pooling="pool.last",
),
)
Configuration Presets
Minimal Configuration
MINIMAL = dict(embed_dim=8, hidden_units=16, lstm_units=16,
attention_units=16, num_heads=1, dropout_rate=0.1)
model = BaseAttentive(
static_input_dim=4, dynamic_input_dim=8, future_input_dim=6,
output_dim=2, forecast_horizon=24, **MINIMAL,
)
Standard Configuration
STANDARD = dict(embed_dim=32, hidden_units=64, lstm_units=64,
attention_units=32, num_heads=4, dropout_rate=0.1,
use_residuals=True)
Large Configuration
LARGE = dict(embed_dim=128, hidden_units=256, lstm_units=256,
attention_units=128, num_heads=8, dropout_rate=0.3,
use_batch_norm=True, use_residuals=True)
Hybrid Preset
HYBRID = dict(
objective="hybrid",
scales=[1, 2, 4],
multi_scale_agg="last",
embed_dim=32,
num_heads=4,
dropout_rate=0.1,
)
Transformer Preset
TRANSFORMER = dict(
objective="transformer",
num_encoder_layers=4,
embed_dim=64,
num_heads=8,
dropout_rate=0.15,
)
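Since the presets are plain dictionaries, one way to adapt them is to merge overrides into a copy before unpacking the result into the constructor. The helper name `with_overrides` is hypothetical; only standard dictionary unpacking is used.

```python
STANDARD = dict(embed_dim=32, hidden_units=64, lstm_units=64,
                attention_units=32, num_heads=4, dropout_rate=0.1,
                use_residuals=True)


def with_overrides(preset: dict, **overrides) -> dict:
    """Return a new dict; the preset itself is left unchanged."""
    return {**preset, **overrides}


custom = with_overrides(STANDARD, dropout_rate=0.2, num_heads=8)
print(custom["dropout_rate"], custom["num_heads"])      # 0.2 8
print(STANDARD["dropout_rate"], STANDARD["num_heads"])  # 0.1 4
```

The merged dictionary can then be passed as `**custom` in the same way `**MINIMAL` is used above. Keeping presets immutable in practice avoids one experiment silently changing the baseline for another.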
Tuning Guidelines
For Longer Sequences (T > 500)
model = BaseAttentive(
..., objective="hybrid", scales=[1, 2, 4],
embed_dim=32, dropout_rate=0.15,
)
For Complex Patterns
model = BaseAttentive(
..., embed_dim=64, num_heads=8,
use_batch_norm=True, use_residuals=True, dropout_rate=0.2,
)
For Fast Inference
model = BaseAttentive(
..., objective="hybrid", embed_dim=16,
hidden_units=32, dropout_rate=0.1,
attention_levels="cross",
)
For Probabilistic Forecasts
model = BaseAttentive(
..., quantiles=[0.1, 0.5, 0.9],
dropout_rate=0.2, use_residuals=True,
)
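Quantile levels are probabilities, so it can be worth sanity-checking them before passing them in. The helper below is a hypothetical sketch; it assumes levels must lie strictly between 0 and 1 and be unique, and it treats `None` or an empty sequence as "point forecast". Whether the library requires sorted input is not stated in this guide.

```python
def check_quantiles(quantiles):
    """Return the quantile levels as a sorted tuple, or raise ValueError."""
    if not quantiles:
        return ()                            # assumed: no quantiles = point output
    values = tuple(sorted(float(q) for q in quantiles))
    if any(not 0.0 < q < 1.0 for q in values):
        raise ValueError(f"quantiles must lie strictly in (0, 1): {values}")
    if len(set(values)) != len(values):
        raise ValueError(f"quantiles must be unique: {values}")
    return values


levels = check_quantiles([0.9, 0.1, 0.5])
print(levels)   # (0.1, 0.5, 0.9)
```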
Configuration Management
Get Configuration
config = model.get_config()
print(config)
# {'static_input_dim': 4, ..., 'scales': None, 'mode': None, ...}
Create from Configuration
new_model = BaseAttentive.from_config(model.get_config())
Reconfigure Model
model2 = model.reconfigure({"encoder_type": "transformer"})
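When comparing a model before and after reconfiguration, diffing the two configuration dictionaries shows exactly what changed. The sketch below uses plain dictionaries as stand-ins for the output of `get_config()`; the helper `diff_configs` and the sample values are illustrative only.

```python
ABSENT = "<absent>"   # marker for a key that exists on one side only


def diff_configs(old: dict, new: dict) -> dict:
    """Map each key whose value differs to an (old, new) pair."""
    changed = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, ABSENT)
        after = new.get(key, ABSENT)
        if before != after:
            changed[key] = (before, after)
    return changed


# Stand-ins for model.get_config() and model2.get_config().
old_config = {"static_input_dim": 4, "scales": None, "mode": None}
new_config = {"static_input_dim": 4, "scales": [1, 2, 4], "mode": None}

changes = diff_configs(old_config, new_config)
print(changes)   # {'scales': (None, [1, 2, 4])}
```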
Common Mistakes
Mismatched input dimensions
import numpy as np

# Wrong: 5 static features, but the model expects 4
model = BaseAttentive(..., static_input_dim=4)
static = np.random.randn(32, 5)
# Correct: the last axis matches static_input_dim
static = np.random.randn(32, 4)
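A mismatch like this can be caught early by comparing the trailing axis of each input against the configured dimension. The helper below is a hypothetical sketch that works on shape tuples (for example `static.shape`); it assumes features sit on the last axis, as in the example above.

```python
def check_last_dim(name: str, shape: tuple, expected: int) -> None:
    """Raise ValueError if the trailing (feature) axis is not `expected`."""
    if not shape or shape[-1] != expected:
        raise ValueError(
            f"{name}: expected {expected} features on the last axis, "
            f"got shape {shape}"
        )


check_last_dim("static", (32, 4), 4)       # matches: passes silently

try:
    check_last_dim("static", (32, 5), 4)   # 5 features, 4 expected
except ValueError as err:
    message = str(err)
    print(message)
```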
num_heads must divide embed_dim
# Wrong: 32 is not divisible by 6
model = BaseAttentive(..., embed_dim=32, num_heads=6)
# Correct
model = BaseAttentive(..., embed_dim=32, num_heads=4)
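To pick a valid head count, it helps to list the divisors of `embed_dim` directly. The helper below is a hypothetical convenience, not a library function; the default cap of 16 is taken from the `num_heads` range in the parameter table.

```python
def valid_num_heads(embed_dim: int, max_heads: int = 16) -> list:
    """List head counts in 1..max_heads that divide embed_dim evenly."""
    return [h for h in range(1, max_heads + 1) if embed_dim % h == 0]


print(valid_num_heads(32))   # [1, 2, 4, 8, 16]
print(valid_num_heads(48))   # [1, 2, 3, 4, 6, 8, 12, 16]
```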
See Also
Quick Start Guide: getting started
Architecture Guide: architecture details
API Reference: full API reference