Documentation ¶
Index ¶
- Variables
- func AvailabilityError() error
- func ExtractNumericFields(data []byte) (map[string]float64, error)
- func HasOpenZL() bool
- func IsAvailable() bool
- func NewFactory() processor.Factory
- func ValidateConfig(cfg DetectorConfig) error
- func ValidateLibrary() error
- type Config
- type Detector
- type DetectorConfig
- type EnhancedMetrics
- type KafkaConfig
- type Metrics
- type NumericFieldExtractor
- type NumericStats
- type NumericStatsRegistry
- type NumericStatsSnapshot
- type RedisConfig
- type Tokenizer
- type TokenizerConfig
- type TokenizerStats
- type ValueOutlier
Constants ¶
This section is empty.
Variables ¶
var ErrCBADPanic = errors.New("CBAD internal panic - Rust panic caught at FFI boundary")
ErrCBADPanic indicates a Rust panic was caught in the CBAD library. This is a critical error indicating a bug in the Rust code.
Functions ¶
func AvailabilityError ¶
func AvailabilityError() error
AvailabilityError returns the last validation error, if any.
func ExtractNumericFields ¶
func ExtractNumericFields(data []byte) (map[string]float64, error)
ExtractNumericFields recursively extracts all numeric fields from JSON. Returns a map of field paths (dot-notation) to values. Example: {"user": {"age": 25}} -> {"user.age": 25}
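The dot-notation walk described above can be sketched as a small recursive function; this is an illustrative re-implementation, not the package's exact code (arrays are omitted for brevity):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractNumeric is an illustrative stand-in for ExtractNumericFields.
func extractNumeric(data []byte) (map[string]float64, error) {
	var root interface{}
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, err
	}
	out := make(map[string]float64)
	walk("", root, out)
	return out, nil
}

// walk descends nested objects, joining keys with "." to build field paths.
func walk(path string, v interface{}, out map[string]float64) {
	switch t := v.(type) {
	case float64: // encoding/json decodes every JSON number as float64
		out[path] = t
	case map[string]interface{}:
		for k, child := range t {
			p := k
			if path != "" {
				p = path + "." + k
			}
			walk(p, child, out)
		}
	}
}

func main() {
	fields, _ := extractNumeric([]byte(`{"user": {"age": 25}}`))
	fmt.Println(fields["user.age"]) // 25
}
```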
func HasOpenZL ¶
func HasOpenZL() bool
HasOpenZL returns true when the linked Rust library was built with the optional OpenZL feature.
func IsAvailable ¶
func IsAvailable() bool
IsAvailable reports whether the CBAD Rust core is reachable via CGO.
func NewFactory ¶
func NewFactory() processor.Factory
NewFactory creates a factory for the driftlock CBAD processor.
func ValidateConfig ¶
func ValidateConfig(cfg DetectorConfig) error
ValidateConfig reports whether the detector configuration is valid.
func ValidateLibrary ¶
func ValidateLibrary() error
ValidateLibrary reports whether the CBAD library is properly loaded and functional.
Types ¶
type Config ¶
type Config struct {
// Embed the component.Config interface to satisfy the component configuration contract
component.Config `mapstructure:",squash"`
WindowSize int `mapstructure:"window_size"`
HopSize int `mapstructure:"hop_size"`
Threshold float64 `mapstructure:"threshold"`
Determinism bool `mapstructure:"determinism"`
// Kafka configuration for publishing OTLP events
Kafka KafkaConfig `mapstructure:"kafka"`
// Redis configuration for distributed state management
Redis RedisConfig `mapstructure:"redis"`
}
Config defines the configuration for the driftlock CBAD processor
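The mapstructure tags above imply a collector-style YAML configuration. A hedged sketch follows; the field names come from the tags, but the processor's registered key (`driftlock_cbad` below) is an assumption, and the values are placeholders:

```yaml
processors:
  # Key name assumed; check the type returned by NewFactory for the real one.
  driftlock_cbad:
    window_size: 256
    hop_size: 64
    threshold: 0.5
    determinism: true
    kafka:
      enabled: true
      brokers: ["kafka:9092"]
      events_topic: "driftlock.events"
    redis:
      enabled: false
```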
type Detector ¶
type Detector struct {
// contains filtered or unexported fields
}
Detector represents a streaming anomaly detector
func NewDetector ¶
func NewDetector(config DetectorConfig) (*Detector, error)
NewDetector creates a new anomaly detector with the given configuration
func (*Detector) Close ¶
func (d *Detector) Close()
Close destroys the detector and frees its memory
func (*Detector) DetectAnomaly ¶
func (d *Detector) DetectAnomaly() (bool, *EnhancedMetrics, error)
DetectAnomaly performs anomaly detection and returns results
type DetectorConfig ¶
type DetectorConfig struct {
BaselineSize int
WindowSize int
HopSize int
MaxCapacity int
PValueThreshold float64
NCDThreshold float64
PermutationCount int
Seed uint64
CompressionRatioDropThreshold float64
EntropyChangeThreshold float64
CompositeThreshold float64
RequireStatisticalSignificance bool
CompressionAlgorithm string
}
DetectorConfig represents configuration for the anomaly detector.
func DefaultDemoConfig ¶
func DefaultDemoConfig() DetectorConfig
DefaultDemoConfig returns a configuration optimized for demo use.
func DefaultProductionConfig ¶
func DefaultProductionConfig() DetectorConfig
DefaultProductionConfig returns a production-ready configuration
type EnhancedMetrics ¶
type EnhancedMetrics struct {
NCD float64
PValue float64
BaselineCompressionRatio float64
WindowCompressionRatio float64
BaselineEntropy float64
WindowEntropy float64
IsAnomaly bool
ConfidenceLevel float64
IsStatisticallySignificant bool
CompressionRatioChange float64
EntropyChange float64
Explanation string
RecommendedNCDThreshold float64
RecommendedWindowSize int
DataStabilityScore float64
}
EnhancedMetrics represents comprehensive anomaly detection results.
func (*EnhancedMetrics) GetAnomalyExplanation ¶
func (m *EnhancedMetrics) GetAnomalyExplanation() string
GetAnomalyExplanation generates a human-readable explanation of the anomaly detection result
func (*EnhancedMetrics) GetDetailedExplanation ¶
func (m *EnhancedMetrics) GetDetailedExplanation() string
GetDetailedExplanation provides a comprehensive explanation with statistical significance
type KafkaConfig ¶
type KafkaConfig struct {
Enabled bool `mapstructure:"enabled"`
Brokers []string `mapstructure:"brokers"`
ClientID string `mapstructure:"client_id"`
EventsTopic string `mapstructure:"events_topic"`
TLSEnabled bool `mapstructure:"tls_enabled"`
BatchSize int `mapstructure:"batch_size"`
BatchTimeoutMs int `mapstructure:"batch_timeout_ms"`
}
KafkaConfig holds Apache Kafka settings for the processor
type Metrics ¶
type Metrics struct {
NCD float64
PValue float64
BaselineCompressionRatio float64
WindowCompressionRatio float64
BaselineEntropy float64
WindowEntropy float64
IsAnomaly bool
ConfidenceLevel float64
}
Metrics represents the anomaly detection results from CBAD.
func ComputeMetrics ¶
func ComputeMetrics(baseline []byte, window []byte, seed uint64, permutations int) (*Metrics, error)
ComputeMetrics calculates CBAD anomaly detection metrics using compression-based analysis
func ComputeMetricsQuick ¶
ComputeMetricsQuick calculates CBAD metrics with default configuration
func (*Metrics) GetAnomalyExplanation ¶
func (m *Metrics) GetAnomalyExplanation() string
GetAnomalyExplanation generates a human-readable explanation of the anomaly detection result.
func (*Metrics) GetConfidenceLevel ¶
func (m *Metrics) GetConfidenceLevel() float64
GetConfidenceLevel returns the confidence level as a percentage (0-100).
func (*Metrics) GetDetailedExplanation ¶
func (m *Metrics) GetDetailedExplanation() string
GetDetailedExplanation provides a comprehensive explanation with statistical significance.
func (*Metrics) IsStatisticallySignificant ¶
func (m *Metrics) IsStatisticallySignificant() bool
IsStatisticallySignificant returns true if the anomaly is statistically significant (p < 0.05).
type NumericFieldExtractor ¶
type NumericFieldExtractor struct {
// IncludePatterns: if non-empty, only fields matching these patterns are tracked
IncludePatterns []string
// ExcludePatterns: fields matching these patterns are ignored
ExcludePatterns []string
}
NumericFieldExtractor extracts numeric values from JSON with their paths
type NumericStats ¶
type NumericStats struct {
N int64 // Count of values seen
Mean float64 // Running mean
M2 float64 // Sum of squared differences from mean (for variance)
Min float64 // Minimum value seen
Max float64 // Maximum value seen
// contains filtered or unexported fields
}
NumericStats tracks running statistics using Welford's online algorithm. This allows single-pass computation of mean and variance without storing all values. Memory efficient: O(1) per field path regardless of event count.
func NewNumericStats ¶
func NewNumericStats() *NumericStats
NewNumericStats creates a new stats tracker
func (*NumericStats) IsOutlier ¶
func (s *NumericStats) IsOutlier(x float64, k float64) bool
IsOutlier reports whether a value is more than k standard deviations from the mean. Common values for k: 2.0 (95%), 3.0 (99.7%), 4.0 (99.99%). Returns false if there is insufficient data (N < 30) for statistical significance.
func (*NumericStats) Snapshot ¶
func (s *NumericStats) Snapshot() NumericStatsSnapshot
Snapshot returns a copy of the current statistics
func (*NumericStats) StdDev ¶
func (s *NumericStats) StdDev() float64
StdDev returns the sample standard deviation.
func (*NumericStats) Update ¶
func (s *NumericStats) Update(x float64)
Update incorporates a new value using Welford's algorithm. This is numerically stable for computing variance in a single pass.
func (*NumericStats) Variance ¶
func (s *NumericStats) Variance() float64
Variance returns the sample variance (N-1 denominator). Returns 0 if fewer than 2 samples.
func (*NumericStats) ZScore ¶
func (s *NumericStats) ZScore(x float64) float64
ZScore calculates how many standard deviations x is from the mean. Returns 0 if insufficient data.
type NumericStatsRegistry ¶
type NumericStatsRegistry struct {
// contains filtered or unexported fields
}
NumericStatsRegistry manages stats for multiple streams and field paths. Thread-safe for concurrent access.
func NewNumericStatsRegistry ¶
func NewNumericStatsRegistry() *NumericStatsRegistry
NewNumericStatsRegistry creates a new registry
func (*NumericStatsRegistry) Clear ¶
func (r *NumericStatsRegistry) Clear(streamID string)
Clear removes all stats for a stream
func (*NumericStatsRegistry) GetOrCreate ¶
func (r *NumericStatsRegistry) GetOrCreate(streamID, fieldPath string) *NumericStats
GetOrCreate returns stats for a stream/field, creating if needed
func (*NumericStatsRegistry) GetStreamStats ¶
func (r *NumericStatsRegistry) GetStreamStats(streamID string) map[string]NumericStatsSnapshot
GetStreamStats returns all stats for a stream
type NumericStatsSnapshot ¶
type NumericStatsSnapshot struct {
N int64 `json:"n"`
Mean float64 `json:"mean"`
StdDev float64 `json:"std_dev"`
Min float64 `json:"min"`
Max float64 `json:"max"`
}
NumericStatsSnapshot is a point-in-time copy of statistics
type RedisConfig ¶
type RedisConfig struct {
Enabled bool `mapstructure:"enabled"`
Addr string `mapstructure:"addr"`
Password string `mapstructure:"password"`
DB int `mapstructure:"db"`
Prefix string `mapstructure:"prefix"`
}
RedisConfig holds Redis settings for distributed state management
type Tokenizer ¶
type Tokenizer struct {
// contains filtered or unexported fields
}
Tokenizer preprocesses data to replace high-entropy fields with tokens
func GetTokenizer ¶
func GetTokenizer(cfg TokenizerConfig) *Tokenizer
GetTokenizer returns a shared tokenizer for the given configuration. This avoids repeated regex compilation overhead.
func NewTokenizer ¶
func NewTokenizer(cfg TokenizerConfig) *Tokenizer
NewTokenizer creates a tokenizer with the given configuration
func (*Tokenizer) Config ¶
func (t *Tokenizer) Config() TokenizerConfig
Config returns the current tokenizer configuration
func (*Tokenizer) ResetStats ¶
func (t *Tokenizer) ResetStats()
ResetStats resets all statistics to zero
func (*Tokenizer) Stats ¶
func (t *Tokenizer) Stats() TokenizerStats
Stats returns a copy of the current tokenization statistics
func (*Tokenizer) Tokenize ¶
func (t *Tokenizer) Tokenize(data []byte) []byte
Tokenize replaces high-entropy fields with tokens. Returns the tokenized data as a new allocation; the original is unchanged.
func (*Tokenizer) UpdateConfig ¶
func (t *Tokenizer) UpdateConfig(cfg TokenizerConfig)
UpdateConfig updates the tokenizer configuration
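The tokenizer's job is to collapse high-entropy identifiers into fixed tokens so that log lines differing only in IDs look identical to the compressor. A sketch for the UUID case only; the package likely uses similar precompiled patterns for hashes, Base64, and JWTs, but this regex is an illustrative assumption:

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRe matches the canonical 8-4-4-4-12 hex UUID form.
var uuidRe = regexp.MustCompile(`[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`)

// tokenizeUUIDs replaces every UUID with the fixed token <UUID>.
// regexp.ReplaceAll returns a new slice, leaving the input unchanged.
func tokenizeUUIDs(data []byte) []byte {
	return uuidRe.ReplaceAll(data, []byte("<UUID>"))
}

func main() {
	in := []byte(`{"request_id":"550e8400-e29b-41d4-a716-446655440000","status":200}`)
	fmt.Println(string(tokenizeUUIDs(in)))
	// → {"request_id":"<UUID>","status":200}
}
```

Precompiling the patterns once (as GetTokenizer's doc notes) matters because regexp.MustCompile is far more expensive than a match, and the tokenizer runs on every event.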
type TokenizerConfig ¶
type TokenizerConfig struct {
EnableUUID bool `json:"enable_uuid"` // Replace UUIDs with <UUID>
EnableHash bool `json:"enable_hash"` // Replace hex hashes with <HASH>
EnableBase64 bool `json:"enable_base64"` // Replace Base64 with <B64>
EnableJWT bool `json:"enable_jwt"` // Replace JWTs with <JWT>
}
TokenizerConfig controls which high-entropy patterns to replace
func DefaultTokenizerConfig ¶
func DefaultTokenizerConfig() TokenizerConfig
DefaultTokenizerConfig returns a configuration with all patterns enabled
type TokenizerStats ¶
type TokenizerStats struct {
JWTCount int64
UUIDCount int64
HashCount int64
Base64Count int64
BytesSaved int64
}
TokenizerStats tracks tokenization statistics for observability
type ValueOutlier ¶
type ValueOutlier struct {
FieldPath string `json:"field_path"`
Value float64 `json:"value"`
ExpectedMean float64 `json:"expected_mean"`
StdDev float64 `json:"std_dev"`
ZScore float64 `json:"z_score"`
SampleCount int64 `json:"sample_count"`
}
ValueOutlier represents a detected numeric value anomaly
func CheckForOutliers ¶
func CheckForOutliers(
	registry *NumericStatsRegistry,
	streamID string,
	event []byte,
	kSigma float64,
	updateStats bool,
) ([]ValueOutlier, error)
CheckForOutliers analyzes numeric fields in an event against historical stats. Returns any fields that exceed the k-sigma threshold.