A specialized flight comparison engine for the Spanish market. Unlike generic aggregators, it focuses on domestic connectivity, integrating real-time data from official APIs and custom scrapers.
Problem
•Fragmented data sources: Airlines use different protocols and guard their data aggressively.
•Latency requirements: Users expect search results in under 2 seconds, but scrapers are inherently slow.
•Data consistency: Merging structured API responses with unstructured HTML scrapes is error-prone.
Approach
•Designed a hybrid persistence layer: High-speed caching (Redis) for hot routes and persistent storage (MySQL) for historical trends.
•Implemented a distributed scraping architecture using Python/Selenium with autonomous error recovery.
•Built a modern, responsive frontend in React that consumes a unified Spring Boot REST API.
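The hybrid persistence idea above can be sketched as a cache-aside lookup: check the fast cache first, fall back to a live provider call on a miss, then write the result to both the cache and the historical store. This is a minimal illustration, not the project's actual code. `TTLCache` is an in-memory stand-in for Redis so the sketch runs without a server, the `history` list stands in for a MySQL insert, and the key format, the 120-second TTL, and every function name are assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Fares = List[dict]  # hypothetical shape: one dict per fare offer


class TTLCache:
    """In-memory stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Fares]] = {}

    def get(self, key: str) -> Optional[Fares]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: Fares, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def route_key(origin: str, destination: str, date: str) -> str:
    """Build a cache key; the 'fares:' prefix is an assumed convention."""
    return f"fares:{origin.upper()}:{destination.upper()}:{date}"


def search_fares(
    origin: str,
    destination: str,
    date: str,
    cache: TTLCache,
    fetch_live: Callable[[str, str, str], Fares],
    history: List[Tuple[str, Fares]],
    ttl_seconds: float = 120.0,  # assumed TTL, not taken from the project
) -> Fares:
    key = route_key(origin, destination, date)
    cached = cache.get(key)
    if cached is not None:
        return cached  # hot route: served without touching providers

    fares = fetch_live(origin, destination, date)  # slow path: API or scraper
    cache.set(key, fares, ttl_seconds)
    history.append((key, fares))  # stand-in for a MySQL insert used for trends
    return fares
```

With this shape, only cache misses pay the scraper latency, and every live fetch also lands in the historical store. A short TTL keeps hot-route prices reasonably fresh; the right value depends on how quickly fares change.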
Outcome
•Handles concurrent queries across multiple providers.
•Normalized data schema that allows future expansion into other transport modes.
•Zero-downtime scraper deployment pipeline.
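As an illustration of what a normalized schema might involve, the sketch below maps one structured API payload and one scraped HTML row onto a single record type, then merges duplicates. The field names, the payload shapes, the Spanish-style price format ("1.234,56 €"), and the rule that an API record wins over a scraped one are all assumptions made for the example, not details from the project.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Fare:
    """Hypothetical unified record shared by all providers."""
    carrier: str
    flight_number: str
    origin: str
    destination: str
    departure: datetime
    price_cents: int
    source: str  # "api" or "scrape"


def _flight_number(raw: str) -> str:
    # "ib 1234" and "IB1234" should compare equal.
    return re.sub(r"\s+", "", raw).upper()


def parse_scraped_price(text: str) -> int:
    """Convert a Spanish-formatted price such as '1.234,56 €' to cents."""
    digits = re.sub(r"[^\d.,]", "", text)
    normalized = digits.replace(".", "").replace(",", ".")
    return int((Decimal(normalized) * 100).to_integral_value())


def from_api(payload: dict) -> Fare:
    # Assumed payload shape: ISO timestamps and a decimal amount as a string.
    return Fare(
        carrier=payload["carrier"],
        flight_number=_flight_number(payload["flight"]),
        origin=payload["from"],
        destination=payload["to"],
        departure=datetime.fromisoformat(payload["departure"]),
        price_cents=int((Decimal(payload["price"]["amount"]) * 100).to_integral_value()),
        source="api",
    )


def from_scrape(row: dict) -> Fare:
    # Assumed row shape: every value is display text taken from the page.
    origin, destination = row["route"].split("-")
    return Fare(
        carrier=row["airline"],
        flight_number=_flight_number(row["flight"]),
        origin=origin.strip(),
        destination=destination.strip(),
        departure=datetime.strptime(f"{row['date']} {row['time']}", "%d/%m/%Y %H:%M"),
        price_cents=parse_scraped_price(row["price"]),
        source="scrape",
    )


def merge(fares: Iterable[Fare]) -> List[Fare]:
    """Keep one record per flight and departure, preferring API data."""
    best: Dict[Tuple[str, datetime], Fare] = {}
    for fare in fares:
        key = (fare.flight_number, fare.departure)
        current = best.get(key)
        if current is None or (current.source == "scrape" and fare.source == "api"):
            best[key] = fare
    return sorted(best.values(), key=lambda f: (f.departure, f.price_cents))
```

Storing prices as integer cents avoids floating-point drift when comparing offers. The record keeps route and price fields generic, so a similar shape could plausibly carry other transport modes by relaxing the flight-specific fields.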
Highlights
•Distributed Scraping
•Hybrid Architecture
•Real-time Data