How I Built a Signal Scoring System with 8 Quality Factors
Not All Signals Are Created Equal
When you run multiple AI models generating trade signals, you quickly discover that raw signals are not enough. A momentum model might fire a buy signal on a stock with terrible liquidity during an earnings blackout period. Without a quality filter, that signal goes straight to execution and loses money.
I built a scoring system that evaluates every signal across 8 quality factors before it reaches the risk engine. Signals that score below the threshold get discarded. Here is how it works.
The 8 Quality Factors
Each factor produces a score between 0 and 1. The composite score is a weighted average of all 8 factors.
Factor 1: Signal Strength
How confident is the model in this signal? I normalize the raw model output to a 0-1 scale based on historical signal distributions.
```python
import numpy as np

def score_signal_strength(raw_signal: float, history: list[float]) -> float:
    # Percentile rank of |raw_signal| within the historical signal distribution
    percentile = np.searchsorted(sorted(history), abs(raw_signal)) / len(history)
    return percentile
```
Factor 2: Model Agreement
When multiple models agree on direction, the signal is more reliable. I measure agreement as the fraction of active models pointing the same way.
```python
def score_model_agreement(signals: dict[str, float]) -> float:
    directions = [1 if s > 0 else -1 for s in signals.values() if s != 0]
    if not directions:
        return 0.0
    majority = max(directions.count(1), directions.count(-1))
    return majority / len(directions)
```
Factor 3: Liquidity
Signals on illiquid instruments are dangerous because you cannot enter or exit positions efficiently. I score liquidity based on average daily volume relative to the intended position size.
```python
def score_liquidity(avg_daily_volume: float, position_size: float) -> float:
    volume_ratio = avg_daily_volume / max(position_size, 1)
    return min(volume_ratio / 50, 1.0)  # Full score at 50x volume coverage
```
Factor 4: Volatility Regime
Signals generated during extreme volatility regimes are less reliable because model training data typically underrepresents these conditions.
```python
def score_volatility_regime(current_vol: float, historical_vol: list[float]) -> float:
    vol_percentile = np.searchsorted(sorted(historical_vol), current_vol) / len(historical_vol)
    if vol_percentile > 0.9:
        return 0.3  # High vol penalty
    elif vol_percentile > 0.75:
        return 0.7  # Moderate penalty
    return 1.0
```
Factor 5: Time Decay
Signals lose relevance over time. A signal generated 5 minutes ago is more actionable than one generated 2 hours ago. I apply exponential decay.
```python
def score_time_decay(signal_age_minutes: float, half_life: float = 30) -> float:
    return np.exp(-0.693 * signal_age_minutes / half_life)  # 0.693 ≈ ln(2)
```
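The decay curve is easy to sanity-check: a signal exactly one half-life old scores 0.5, and each additional half-life halves the score again. A minimal sketch, using `math.log(2)` in place of the hard-coded 0.693 constant:

```python
import math

half_life = 30  # minutes, matching the default above

# Decay scores at a few signal ages; one half-life -> 0.5, two -> 0.25
scores = {age: math.exp(-math.log(2) * age / half_life) for age in (5, 30, 60, 120)}
```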
Factor 6: Calendar Awareness
Certain periods are systematically problematic for trading signals: earnings weeks, options expiration days, major economic releases. I maintain a calendar and penalize signals during these windows.
```python
from datetime import datetime

def score_calendar(symbol: str, timestamp: datetime, event_calendar: dict) -> float:
    events = event_calendar.get(symbol, [])
    for event in events:
        days_until = (event['date'] - timestamp.date()).days
        if 0 <= days_until <= event.get('blackout_days', 2):
            return event.get('penalty', 0.2)
    return 1.0
```
Factor 7: Sector Concentration
If the portfolio already has heavy exposure to a sector, additional signals in that sector receive a penalty to encourage diversification.
```python
def score_sector_concentration(
    symbol_sector: str,
    portfolio_sectors: dict[str, float],
    max_sector_pct: float = 0.25
) -> float:
    current_pct = portfolio_sectors.get(symbol_sector, 0)
    if current_pct >= max_sector_pct:
        return 0.0
    return 1.0 - (current_pct / max_sector_pct)
```
Factor 8: Recent Performance
Models go through hot and cold streaks. I track the recent hit rate of each model and boost or penalize accordingly.
```python
def score_recent_performance(
    model_id: str,
    recent_trades: list[dict],
    lookback: int = 20
) -> float:
    model_trades = [t for t in recent_trades if t['model'] == model_id][-lookback:]
    if len(model_trades) < 5:
        return 0.5  # Neutral for insufficient data
    win_rate = sum(1 for t in model_trades if t['pnl'] > 0) / len(model_trades)
    return win_rate
```
The Composite Scoring Engine
```python
class SignalScorer:
    WEIGHTS = {
        'signal_strength': 0.20,
        'model_agreement': 0.20,
        'liquidity': 0.15,
        'volatility_regime': 0.10,
        'time_decay': 0.10,
        'calendar': 0.10,
        'sector_concentration': 0.08,
        'recent_performance': 0.07
    }

    def score(self, signal: TradeSignal, context: dict) -> tuple[float, dict[str, float]]:
        scores = {
            'signal_strength': score_signal_strength(signal.raw, context['history']),
            'model_agreement': score_model_agreement(context['all_signals']),
            'liquidity': score_liquidity(context['adv'], signal.size),
            'volatility_regime': score_volatility_regime(
                context['current_vol'], context['vol_history']
            ),
            'time_decay': score_time_decay(signal.age_minutes),
            'calendar': score_calendar(signal.symbol, signal.timestamp, context['calendar']),
            'sector_concentration': score_sector_concentration(
                signal.sector, context['sector_weights']
            ),
            'recent_performance': score_recent_performance(
                signal.model_id, context['recent_trades']
            )
        }
        composite = sum(
            score * self.WEIGHTS[factor]
            for factor, score in scores.items()
        )
        return composite, scores
```
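To make the weighting concrete, here is a small self-contained sketch that reproduces the composite calculation by hand for one hypothetical signal. The factor scores are invented for illustration; only the weights come from the engine above:

```python
WEIGHTS = {
    'signal_strength': 0.20, 'model_agreement': 0.20, 'liquidity': 0.15,
    'volatility_regime': 0.10, 'time_decay': 0.10, 'calendar': 0.10,
    'sector_concentration': 0.08, 'recent_performance': 0.07,
}

# Hypothetical factor scores for a single signal
factor_scores = {
    'signal_strength': 0.80, 'model_agreement': 1.00, 'liquidity': 1.00,
    'volatility_regime': 0.70, 'time_decay': 0.79, 'calendar': 1.00,
    'sector_concentration': 0.60, 'recent_performance': 0.55,
}

# Weighted average across all 8 factors
composite = sum(factor_scores[f] * w for f, w in WEIGHTS.items())
print(round(composite, 4))  # 0.8455 -- clears a 0.55-0.70 threshold
```

Note how the two highest-weight factors (strength and agreement) dominate: even with a mediocre recent-performance score, this signal still passes comfortably.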
Setting the Threshold
I calibrate the threshold using historical data. The goal is to find the score cutoff that maximizes the Sharpe ratio of accepted signals. In my systems, this typically lands between 0.55 and 0.70.
Results
After implementing this scoring system, the quality of executed trades improved dramatically. The win rate on accepted signals went from 52% to 61%, and the average profit per trade increased by 40%. More importantly, the worst single-day loss dropped by 55% because the system was filtering out the highest-risk signals before they could do damage.
Signal scoring is the layer between your AI models and your money. Build it carefully, calibrate it regularly, and it will be the most valuable component in your trading system.