How I Built a Family Events Platform with Next.js and 11 Data Sources
What Is Swindo?
Swindo.co.uk is a local platform for Swindon that helps families and residents find things to do. It aggregates events, venue information, and activity listings from 11 different data sources into a single, well-organised directory. Think of it as a local "what's on" guide, but powered by real data rather than manual curation.
I built it because Swindon's events and venue information is scattered across dozens of different websites, Facebook groups, council pages, and business listings. No single source has the complete picture. Swindo brings it all together.
The Data Sources
Getting data from 11 sources means dealing with 11 different formats, APIs, and reliability levels:
- Google Places API: Venue details, ratings, opening hours, photos
- Eventbrite API: Ticketed events
- Facebook Events: Community events (scraped, since the API is restricted)
- Swindon Borough Council: Council-run events and facilities
- TripAdvisor: Reviews and visitor ratings
- Meetup.com: Group activities and regular meetups
- Local venue websites: Individual scrapers for major venues
- Yelp: Additional business listings and reviews
- OpenStreetMap: Geographic data and categorisation
- Instagram: Visual content from venue accounts
- Manual submissions: A form for venue owners to submit updates
The Normalisation Challenge
Each source represents data differently. Google gives you structured JSON with clear fields. Facebook gives you HTML that changes layout periodically. Council data comes as semi-structured HTML tables. The normalisation layer converts everything into a unified schema:
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class UnifiedEvent:
    title: str
    description: str
    venue_id: Optional[int]
    start_time: datetime
    end_time: Optional[datetime]
    source: str
    source_url: str
    categories: list[str]
    price_info: Optional[str]
    image_url: Optional[str]
    confidence_score: float  # how confident we are in the data quality
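Each source then gets a small adapter that maps its raw payload onto this schema. As a sketch, here is what an Eventbrite-style adapter might look like, returning a plain dict with the unified fields. The raw field names below are illustrative assumptions, not the actual Eventbrite API response:

```python
from datetime import datetime

def normalise_eventbrite(raw: dict) -> dict:
    """Map a raw Eventbrite-style payload onto the unified schema.

    The raw field names here are illustrative, not the real API shape.
    """
    return {
        "title": raw["name"]["text"].strip(),
        "description": (raw.get("description") or {}).get("text", ""),
        "venue_id": None,  # resolved later by the venue matcher
        "start_time": datetime.fromisoformat(raw["start"]["local"]),
        "end_time": datetime.fromisoformat(raw["end"]["local"]) if raw.get("end") else None,
        "source": "eventbrite",
        "source_url": raw["url"],
        "categories": [raw["category"]] if raw.get("category") else [],
        "price_info": raw.get("price"),
        "image_url": (raw.get("logo") or {}).get("url"),
        "confidence_score": 0.9,  # API sources get a high baseline score
    }
```

Missing optional fields become `None` or empty values rather than raising, so one malformed listing doesn't abort a whole ingestion run.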
Deduplication Across Sources
The same event often appears in multiple sources. A pub quiz might be listed on Facebook, on the venue's own website, and on Eventbrite. The deduplication system uses a combination of:
- Venue matching (if we know the venue, events at the same venue on the same date are likely duplicates)
- Title similarity using fuzzy matching with a threshold of 85%
- Time overlap detection for events at the same location
When duplicates are found, the system merges them, keeping the richest description and the most complete metadata from across all sources.
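A minimal sketch of the title-matching and merge steps, using Python's stdlib `difflib` as a stand-in for whichever fuzzy-matching library the real pipeline uses (the event dicts here follow the unified schema):

```python
from datetime import datetime
from difflib import SequenceMatcher

def titles_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy title comparison against the 85% similarity threshold."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold

def is_duplicate(e1: dict, e2: dict) -> bool:
    """Likely duplicates: same known venue, same date, similar titles."""
    same_venue = e1.get("venue_id") is not None and e1["venue_id"] == e2.get("venue_id")
    same_date = e1["start_time"].date() == e2["start_time"].date()
    return same_venue and same_date and titles_match(e1["title"], e2["title"])

def merge_events(dupes: list[dict]) -> dict:
    """Merge duplicates: keep the longest description and fill any
    missing fields from the other copies."""
    merged = dict(dupes[0])
    for e in dupes[1:]:
        if len(e.get("description", "")) > len(merged.get("description", "")):
            merged["description"] = e["description"]
        for key, value in e.items():
            if merged.get(key) is None and value is not None:
                merged[key] = value
    return merged
```

Requiring both a venue match and a date match before comparing titles keeps the fuzzy comparison from being run across every pair of events, which matters once the event count grows.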
The Next.js Frontend
I chose Next.js for the frontend because it gives me server-side rendering out of the box, which is essential for SEO. A local events platform lives or dies on organic search traffic.
The site is structured around three main views:
- Events feed: Chronological listing of upcoming events with filtering by category, date, and area
- Venue directory: Searchable directory of 300+ venues with detailed profiles
- Family activities: A curated section specifically for family-friendly events and venues
SEO Strategy
For a local platform, SEO is everything. Every venue gets its own page with a unique URL, structured data markup (Schema.org), and content optimised for local search terms. Event pages are generated dynamically and include JSON-LD structured data:
function EventJsonLd({ event }) {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'Event',
    name: event.title,
    startDate: event.startTime,
  };
  return <script type="application/ld+json"
    dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }} />;
}
Automated Content Pipeline
The data ingestion runs on a schedule:
- Hourly: API sources (Google, Eventbrite, Meetup) are polled for new data
- Every 6 hours: Scrapers run against web sources
- Daily: Full deduplication pass and data quality checks
- Weekly: Stale venue data is flagged for review
The entire pipeline is managed by a Python scheduler that logs every run and alerts me if a source fails consistently.
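The tier logic amounts to a simple dispatch decision made once per hour. A sketch of that decision, with hypothetical tier names standing in for the actual job definitions:

```python
def due_tiers(hours_elapsed: int) -> list[str]:
    """Return which ingestion tiers are due at a given hourly tick.

    Tier names are illustrative, not the real job identifiers.
    """
    tiers = ["api_poll"]                    # hourly: Google, Eventbrite, Meetup
    if hours_elapsed % 6 == 0:
        tiers.append("scrapers")            # every 6 hours: web sources
    if hours_elapsed % 24 == 0:
        tiers.append("dedup_and_quality")   # daily: dedup pass and quality checks
    if hours_elapsed % 168 == 0:
        tiers.append("stale_venue_review")  # weekly: flag stale venue data
    return tiers
```

Keeping the schedule as pure logic like this makes it trivial to test without waiting for real time to pass; the scheduler loop itself just calls it on a timer, logs the run, and records failures for alerting.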
Handling Source Failures
When you depend on 11 external sources, something is always broken. Websites change their layouts. APIs hit rate limits. Services go down for maintenance. I designed the system with graceful degradation in mind:
- Each source can fail independently without affecting others
- Failed fetches are retried with exponential backoff
- If a source fails for more than 48 hours, I get an alert
- Stale data from a failed source remains visible but is marked with a freshness indicator
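The retry behaviour can be sketched as a small wrapper around any source fetch. This is a minimal version; the real pipeline presumably also logs each attempt and feeds the alerting described above:

```python
import time

def fetch_with_backoff(fetch, max_attempts: int = 4, base_delay: float = 1.0):
    """Retry a flaky source fetch with exponential backoff.

    `fetch` is any zero-argument callable. Delays double each attempt
    (1s, 2s, 4s, ...); the last error is re-raised if all attempts fail.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Because each source runs through its own wrapper, one source exhausting its retries never blocks the others from completing their fetch cycle.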
Content Freshness Strategy
Events have a natural expiry date, but venue information also goes stale. I implemented a freshness scoring system that considers when each piece of data was last verified. Sources that update frequently (like Google Places) get a higher freshness weight than sources that are scraped less often. When a venue's overall freshness score drops below a threshold, its listing gets a subtle indicator showing that some information may be outdated. This honest approach to data quality has actually built more trust with users than pretending everything is always current.
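One way to implement such a score, assuming exponential decay with age and hypothetical per-source weights (the exact formula and weights here are assumptions, not the production values):

```python
from datetime import datetime, timedelta

# Assumed weights: frequently updated sources count more toward freshness.
SOURCE_WEIGHTS = {"google_places": 1.0, "eventbrite": 0.9, "council_scrape": 0.5}

def freshness_score(verifications: dict[str, datetime], now: datetime,
                    half_life_days: float = 30.0) -> float:
    """Weighted freshness in [0, 1]: each source's contribution halves
    every `half_life_days` since its last verification."""
    total_weight = 0.0
    score = 0.0
    for source, last_verified in verifications.items():
        weight = SOURCE_WEIGHTS.get(source, 0.5)
        age_days = (now - last_verified).total_seconds() / 86400
        score += weight * 0.5 ** (age_days / half_life_days)
        total_weight += weight
    return score / total_weight if total_weight else 0.0
```

A venue verified across several sources yesterday scores near 1.0; one whose only data point is a year-old scrape scores near zero, which is what trips the "may be outdated" indicator.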
Mobile-First Design
Over 70% of Swindo's traffic comes from mobile devices, which makes sense for a local discovery platform. The Next.js frontend is designed mobile-first with large touch targets, easy-to-scan cards, and a bottom navigation bar. The map integration uses Leaflet with custom markers for different venue categories. Performance optimisation was critical since users on mobile often have variable connection quality. I use Next.js image optimisation, aggressive caching, and lazy loading for below-the-fold content to keep the time to interactive under 2 seconds on 3G connections.
Results
Swindo now tracks over 300 venues and surfaces dozens of events weekly. The organic search traffic has been growing steadily, driven largely by venue pages ranking for searches like "best restaurants in Swindon" and "things to do in Swindon this weekend."
The biggest lesson from this project is that data aggregation at this scale is mostly an engineering problem, not an AI problem. The value comes from reliable pipelines, good deduplication, and presenting clean data in a useful format. AI plays a supporting role in categorisation and content enrichment, but the foundation is solid data engineering.