Building Google Reviews Scraper Pro
A resilient Python web scraper for multi-language Google Maps reviews
Overview
Google Reviews Scraper Pro is a Python tool that extracts reviews from Google Maps listings, handles multiple languages, downloads review images, and stores the results in MongoDB. It was built to solve a real operational problem: manually collecting reviews for thousands of listings is not just slow; it is error-prone and impossible to scale.
The problem it solves
Review data is locked behind a JavaScript-heavy interface that actively resists scraping. Off-the-shelf tools break within weeks because Google rotates DOM selectors, throttles requests, and serves different markup to different user agents. This project takes the long view: it assumes the DOM will change and designs around that assumption.
Key features
- Multi-language extraction. Reviews are captured regardless of their original language, with metadata preserved for later translation or classification.
- Incremental scraping. On subsequent runs it picks up where it left off, only fetching new reviews. This makes daily cron runs cheap.
- Image downloading. Reviews with photos get their images pulled into storage, with URLs rewritten to point at the local copies.
- MongoDB integration. Built-in persistence means no CSV juggling. Queries are fast and the schema supports filtering by rating, language, date, and author.
- Detection resilience. Rate limiting, user-agent rotation, and request shaping keep the scraper under the radar.
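The incremental behavior above boils down to a checkpoint of already-seen review IDs. A minimal sketch, assuming a JSON checkpoint file and a `review_id` field (the project's actual schema may differ):

```python
import json
from pathlib import Path

def load_seen_ids(checkpoint: Path) -> set[str]:
    """Load the set of review IDs captured on previous runs."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()

def filter_new_reviews(reviews: list[dict], seen: set[str]) -> list[dict]:
    """Keep only reviews whose ID has not been stored yet."""
    return [r for r in reviews if r["review_id"] not in seen]

def save_seen_ids(checkpoint: Path, seen: set[str]) -> None:
    """Persist the updated ID set so the next run starts from here."""
    checkpoint.write_text(json.dumps(sorted(seen)))
```

On a daily cron run, only the delta passes through parsing and storage, which is what keeps repeated runs cheap.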
Tech stack
Python, Playwright for headless browsing, BeautifulSoup for parsing, MongoDB for storage, and Pillow for image processing. Dockerized so it runs anywhere with one command.
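To make the parsing step concrete: the project runs BeautifulSoup over Playwright-rendered HTML, but the same idea can be shown with only the standard library. This stand-in (including the `review-text` class name) is illustrative, not Google's real markup, which changes frequently:

```python
from html.parser import HTMLParser

class ReviewTextExtractor(HTMLParser):
    """Collect the text inside elements carrying a target class."""

    def __init__(self, target_class: str = "review-text"):
        super().__init__()
        self.target_class = target_class
        self._depth = 0            # >0 while inside a matching element
        self.reviews: list[str] = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self._depth or self.target_class in classes:
            self._depth += 1
            if self._depth == 1:   # entering a new review element
                self.reviews.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:            # accumulate nested text nodes
            self.reviews[-1] += data
```

In the real pipeline, the class name would be one of the rotating selectors the project treats as disposable configuration rather than hard-coded logic.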
What I would do differently
If I were rebuilding this today, I'd move the image pipeline to a proper object store (Cloudflare R2 or S3) instead of local filesystem, and I'd split the scraping logic from the persistence layer so each can be tested independently. The current version couples them tightly, which makes unit tests awkward.
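That decoupling could look like the following: the scraper writes through a storage protocol instead of calling MongoDB directly, so tests can inject an in-memory fake. The names here are hypothetical, not the project's actual API:

```python
from typing import Protocol

class ReviewStore(Protocol):
    """What the scraping layer needs from persistence, and nothing more."""
    def upsert(self, review: dict) -> None: ...

class InMemoryStore:
    """Test double standing in for the MongoDB-backed implementation."""
    def __init__(self):
        self.rows: dict[str, dict] = {}

    def upsert(self, review: dict) -> None:
        self.rows[review["review_id"]] = review

def persist_reviews(reviews: list[dict], store: ReviewStore) -> int:
    """Scraper output flows through the protocol, not a concrete DB."""
    for r in reviews:
        store.upsert(r)
    return len(reviews)
```

With this seam in place, unit tests exercise extraction logic against `InMemoryStore` while the MongoDB adapter gets its own narrow integration tests.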
Takeaway
Scraping at scale is less about clever selectors and more about resilience. Every decision — rate limits, retries, checkpointing, logging — matters more than the HTML parsing itself. Build assuming things will break, and they break less.
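One of those resilience decisions, retries with capped exponential backoff, can be sketched like this (the attempt count, delays, and jitter range are illustrative defaults, not the project's tuned values):

```python
import random
import time

def with_retries(fn, attempts: int = 4, base_delay: float = 1.0,
                 cap: float = 30.0, jitter: float = 0.5, sleep=time.sleep):
    """Call fn, retrying on any exception with capped exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            delay = min(cap, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, jitter))  # jitter avoids thundering herds
```

Injecting `sleep` keeps even this piece testable, which is the same lesson in miniature: design for failure, and the failure paths get exercised instead of discovered in production.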
