Building a Real-Time Trading Scanner: Architecture Decisions

By Piyush Kumar | March 2026 | 10 min read

Two years ago, I started building Best Algo Trading's core system: a real-time trading scanner that monitors 200+ stocks, runs 8 different strategies, and sends signals to users within milliseconds of opportunity. We built it on a bootstrap budget. No venture capital. No enterprise infrastructure.

In this post, I'll share the architectural decisions we made, the trade-offs we accepted, and the lessons that matter for anyone building fintech or trading systems. This is the real story—not the glossy demo, but the decisions that keep the system alive in production.

The Core Architecture: Python + Flask + Real-Time Data

Most people assume high-frequency trading systems need Rust, C++, or Java. We chose Python + Flask. Here's why, and the caveats.

Why Python?

Where Python Fails (And How We Fixed It)

Problem 1: The GIL (Global Interpreter Lock)

Python's GIL lets only one thread execute Python bytecode at a time, so threads cannot run CPU-bound work in parallel. If you process multiple stock streams in threads, they contend for the same lock.

Solution: We use multiprocessing instead of threading. Each worker process gets its own Python interpreter and its own GIL, so the workers run in parallel. Trade-off: higher memory overhead, but acceptable at our scale (200 stocks across 4-6 worker processes).
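A minimal sketch of the process-per-chunk idea, using only the standard library. The names `partition`, `scan_chunk`, and `run_scan` are hypothetical, and the worker body is a placeholder. This is an illustration under those assumptions, not the production code.

```python
from concurrent.futures import ProcessPoolExecutor


def partition(symbols, n_workers):
    """Split symbols into n_workers round-robin chunks of near-equal size."""
    return [symbols[i::n_workers] for i in range(n_workers)]


def scan_chunk(chunk):
    """Hypothetical per-worker job: runs in its own process, with its own GIL.

    A real worker would subscribe to its symbols and evaluate strategies;
    this placeholder only reports which symbols it was handed.
    """
    return {symbol: "scanned" for symbol in chunk}


def run_scan(symbols, n_workers=4):
    """Fan the chunks out to worker processes and merge the results."""
    results = {}
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(scan_chunk, partition(symbols, n_workers)):
            results.update(partial)
    return results
```

Calling `run_scan(symbols, n_workers=4)` from a script's `if __name__ == "__main__":` block would spread 200 symbols over four processes, 50 each. The memory cost mentioned above comes from each process loading its own interpreter and its own copy of any shared data.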

Problem 2: Latency

Python is slower than compiled languages. For a retail trading system, we need sub-100ms latency from data receipt to signal send.

Solution: We don't do heavy computation in the main loop. Strategy calculations run once per minute (on pre-computed buffers), not on every tick. This works because most strategies don't need tick-level granularity; they work on 1-minute or 5-minute candles.
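To make "pre-computed buffers" concrete, here is one way the tick path could stay cheap: each tick only updates the candle being built, and strategies read finished candles. `Candle` and `CandleBuffer` are hypothetical names, and timestamps are assumed to be epoch seconds. Treat it as a sketch.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    minute: int   # epoch minute, i.e. timestamp // 60
    open: float
    high: float
    low: float
    close: float


class CandleBuffer:
    """Folds ticks into 1-minute candles so strategies never touch raw ticks."""

    def __init__(self):
        self.current = None   # candle still being built
        self.closed = []      # finished candles, oldest first

    def on_tick(self, timestamp, price):
        """O(1) work per tick: roll the candle over or update it in place."""
        minute = int(timestamp // 60)
        if self.current is not None and minute != self.current.minute:
            self.closed.append(self.current)
            self.current = None
        if self.current is None:
            self.current = Candle(minute, price, price, price, price)
        else:
            candle = self.current
            candle.high = max(candle.high, price)
            candle.low = min(candle.low, price)
            candle.close = price
```

A once-per-minute strategy job would then read `buffer.closed` and never run inside the tick handler. The sketch assumes ticks arrive in time order; handling late or out-of-order ticks is left out.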

"The choice isn't between fast languages and slow languages. It's between doing heavy work efficiently and doing light work frequently."

[Figure: Market Data Feed -> Scanner Engine -> Filter + Rank -> Alert / Log. Conceptual data flow of a market scanner (architecture overview, simplified).]

Data Pipeline: Real-Time Streaming vs. Polling

This is where most bootstrap systems fail. You need to decide: Do you stream data from the broker's API, or poll it periodically?

Streaming (WebSocket/gRPC)

Pros: Instant data, true real-time.

Cons: Brokers often have unstable WebSockets. Connection drops = signal delays. You need resilience, reconnection logic, and buffer management.
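The reconnection logic mentioned above is often built around exponential backoff. The sketch below shows that pattern with the standard library only. `connect` is a hypothetical zero-argument callable standing in for whatever opens the broker WebSocket, and the delay values are illustrative assumptions.

```python
import random
import time


def backoff_delays(base=1.0, cap=30.0, attempts=6):
    """Yield exponentially growing delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def connect_with_retry(connect, sleep=time.sleep, jitter=0.1, **backoff):
    """Call connect() until it succeeds or the delays run out.

    `connect` is hypothetical: it returns a live connection or raises
    ConnectionError. `sleep` is injectable so the loop can be tested.
    """
    last_error = None
    for delay in backoff_delays(**backoff):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # Random jitter keeps many clients from reconnecting in lockstep.
            sleep(delay + random.uniform(0, jitter * delay))
    raise ConnectionError("gave up reconnecting") from last_error
```

Capping the delay matters during market hours: without a cap, a long outage would push the next retry minutes into the future.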

Polling (REST API)

Pros: Simple, reliable, forgiving of failures.

Cons: Latency. If you poll every 1 second, you miss intra-second moves.

Our Hybrid Approach

We use a primary WebSocket connection for low-latency data, with an automatic fallback to REST polling if WebSocket drops. The key insight: Most trading signals don't require sub-second precision. A 2-3 second delay is acceptable if it means we never lose data.

# Pseudo-code for our hybrid data pipeline
primary_stream = websocket_connect(broker_api)
backup_stream = rest_poller(interval_seconds=1)

def get_latest_price(symbol):
    # Prefer the WebSocket feed; fall back to REST polling when it is down.
    if primary_stream.is_alive():
        return primary_stream.get(symbol)
    return backup_stream.get(symbol)

This single decision saved us 20+ hours of debugging during broker API outages.

Strategy Execution: Decoupling Signals from Orders

Here's a critical lesson: Your signal generation system and your order execution system must be completely decoupled.

Why? Because signals are fast and deterministic. Order execution is slow and error-prone. If a signal generation failure crashes your entire system, you miss trades. If an order execution failure crashes the system, you might place duplicate trades.

We have:

This architecture means we can deploy updates to our signal logic without touching the order execution. We can test new strategies in "dry run" mode (generating signals without placing orders) before turning them live. Most importantly, a broker API outage doesn't crash the entire system.
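One way to picture the decoupling is a queue between the two sides, with a dry-run switch and duplicate rejection on the consumer. Everything named here (`Signal`, `OrderExecutor`, `place_order`) is hypothetical, and the in-process `queue.Queue` merely stands in for whatever transport separates the two systems in a real deployment.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    signal_id: str   # unique per signal; used to reject duplicates
    symbol: str
    side: str        # "BUY" or "SELL"


class OrderExecutor:
    """Consumes signals from a queue; knows nothing about strategy code."""

    def __init__(self, place_order, dry_run=True):
        self.place_order = place_order   # hypothetical broker call
        self.dry_run = dry_run
        self.seen = set()
        self.log = []

    def handle(self, signal):
        if signal.signal_id in self.seen:
            self.log.append(("duplicate", signal.signal_id))
            return
        self.seen.add(signal.signal_id)
        if self.dry_run:
            self.log.append(("dry_run", signal.signal_id))
            return
        try:
            self.place_order(signal.symbol, signal.side)
            self.log.append(("placed", signal.signal_id))
        except Exception as exc:   # a broker failure must not kill the consumer
            self.log.append(("failed", signal.signal_id, str(exc)))

    def drain(self, signals):
        """Process everything currently queued, then return."""
        while True:
            try:
                signal = signals.get_nowait()
            except queue.Empty:
                return
            self.handle(signal)
```

The signal side only ever calls `signals.put(...)`, so a failing `place_order` shows up as a `failed` log entry instead of an exception in strategy code. Flipping `dry_run` to `False` is the only change needed to take a strategy live in this sketch.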

Handling 200+ Stocks in Real-Time

Here's the practical constraint: We can't afford to do deep computation on every stock every second. So we tiered our processing:

Tier 1: Fast Scan (Every 10 Seconds)

Price checks, simple volatility metrics, entry/exit conditions. Ultra-lightweight, runs across all 200 stocks.

Tier 2: Medium Computation (Every Minute)

Technical indicators (RSI, MACD, Bollinger Bands), IV percentile rankings, cross-asset correlations. Runs on stocks that passed Tier 1.

Tier 3: Heavy Computation (Every 5 Minutes)

Options Greeks, portfolio risk metrics, Monte Carlo simulations for complex strategies. Only runs on highest-conviction setups.

This tiering means we do 90% less computation while catching 95% of tradeable setups. The cost? We miss some ultra-fast reversals. Worth it.
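The tiers above behave like a funnel: each tier runs on its own interval and only sees symbols that survived the tier before it. Here is a minimal sketch of that scheduling logic. The intervals come from the tier descriptions; the `checks` predicates are hypothetical placeholders for the real computations.

```python
TIERS = [
    # (name, interval in seconds), from the three tiers described above
    ("fast", 10),
    ("medium", 60),
    ("heavy", 300),
]


def run_cycle(elapsed_seconds, symbols, checks):
    """Run every tier that is due and return the survivors of each.

    `checks` maps a tier name to a hypothetical predicate: symbol -> bool.
    Each interval is a multiple of the previous one, so once a tier is not
    due, no later tier is due either and the loop can stop.
    """
    candidates = list(symbols)
    passed = {}
    for name, interval in TIERS:
        if elapsed_seconds % interval != 0:
            break
        candidates = [s for s in candidates if checks[name](s)]
        passed[name] = candidates
    return passed
```

Called once per 10-second tick, this runs the cheap check on everything, while the expensive predicate only ever sees the short list that survived the first two tiers.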

Docker Deployment and Observability

We containerize everything with Docker. Each component is a separate container:

On failure, Kubernetes (or simple Docker restart logic) relaunches the failed container. This is critical because trading systems must never go down. In the past 18 months, we've had ~4 unplanned outages (all due to broker API changes, not our code).

For observability, we log everything: every signal generated, every order placed, every error. We use Prometheus for metrics and Grafana for dashboards. When something breaks at 4 AM, we can trace exactly what happened.
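"Log everything" is easier to query later if each event is one machine-readable line. The sketch below uses only Python's standard `logging` and `json` modules. It does not cover the Prometheus or Grafana side, and the field names (`symbol`, `strategy`) are assumptions chosen for illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"fields": {...}} are merged into the line.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name="scanner", stream=None):
    """Build a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

A call such as `log.info("signal_generated", extra={"fields": {"symbol": "XYZ", "strategy": "breakout"}})` then yields one JSON line per event, which can be filtered by symbol or strategy when tracing an incident.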

The Lessons

  1. Don't over-engineer early: We started with basic REST polling. WebSocket came only after we proved the concept worked.
  2. Embrace asynchronous processing: Don't wait for slow operations. Queue them, process them in background, check status later.
  3. Decouple everything: Signal generation, order execution, logging—treat them as separate systems. Failures in one shouldn't bring down others.
  4. Test in production (safely): Dry-run mode is your friend. New strategies should run in dry-run for days before touching real capital.
  5. Monitor ruthlessly: You can't fix what you don't measure. Invest in logging and dashboards early.
  6. Respect the broker API: Broker infrastructure will fail. Plan for it. Have fallbacks. Expect rate limits.

What We'd Change

If I were rebuilding from scratch, I'd consider:

The Bottom Line

Building a production trading system is hard. It's not hard because of the algorithms—most strategies are simple. It's hard because of infrastructure, reliability, and operations. An 80/20 system (simple signal, robust infrastructure) beats a complex algorithm on shaky infrastructure every single time.

We chose Python + Flask not because it's the fastest, but because it let us focus on what matters: reliable signal generation and rock-solid risk management.

Disclaimer: This article is for educational purposes only and does not constitute investment advice or a recommendation to trade any security. Algorithmic and options strategies involve significant risk, including loss of capital, and past performance does not guarantee future results. Trade only with capital you can afford to lose, and consult a SEBI-registered professional before making trading decisions.
