Case Study

Tech Collection

A live, AI-curated blog aggregator that crawls technical blogs, summarizes and classifies posts with ChatGPT, and raised throughput (TPS) 15x through caching.


Project Overview

Tech Collection collects posts from technical blogs with a Puppeteer crawler, processes them through a Spring Boot backend, summarizes and classifies the content with the OpenAI API, and serves the curated results. In-memory caching removed database and LLM round trips from the hot request path, raising throughput from 20 to 300 requests per second (15x) and cutting response time from 6 seconds to 0.5 seconds (12x).
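The caching idea described above can be sketched as a small in-memory cache with a time-to-live, placed in front of an expensive lookup such as a database query or an LLM summary call. This is a minimal illustration only: the class name TtlCache, the loader function, and the TTL value are assumptions made for the example, and the project's actual cache implementation and eviction policy are not described here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical in-memory TTL cache; names are illustrative, not taken from the project.
final class TtlCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, or runs the loader on a miss or after the entry expires.
    // compute() is atomic per key, so concurrent requests for one key trigger one load.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && old.expiresAtMillis() > now)
                        ? old
                        : new Entry<>(loader.apply(k), now + ttlMillis));
        return entry.value();
    }
}

class CacheDemo {
    public static void main(String[] args) {
        // Assumed TTL of 10 minutes; the real value would depend on how often posts change.
        TtlCache<String, String> summaries = new TtlCache<>(600_000L, System::currentTimeMillis);

        // Stand-in for a slow call (database read or LLM request).
        Function<String, String> slowLoader = postId -> "summary for " + postId;

        System.out.println(summaries.get("post-1", slowLoader)); // loads and stores
        System.out.println(summaries.get("post-1", slowLoader)); // served from memory
    }
}
```

Only the first request for a key pays the cost of the slow call; later requests within the TTL are answered from memory, which is the kind of effect the throughput and latency figures above reflect.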

Key Challenges

Reducing the cost of finding relevant information across fragmented technology blogs
Summarizing and classifying long-form posts automatically
Controlling LLM cost and latency on hot request paths
Operating a hybrid Spring Boot and Node.js crawler pipeline

Key Outcomes

Built an automated crawl, summarize, classify, and serve pipeline
Improved throughput from 20 to 300 requests per second
Reduced response time from 6 seconds to 0.5 seconds
Deployed an event-driven GCP Cloud Run and Eventarc architecture
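As one possible shape for the event-driven part, the sketch below filters CloudEvents that Eventarc delivers to a Cloud Run service over HTTP. It assumes, purely for illustration, that the crawler writes its output to Cloud Storage and that an Eventarc trigger forwards "object finalized" events to the backend; the source does not state how the services are wired together. The class and method names are hypothetical. The event type string and the "objects/<name>" subject format follow Eventarc's Cloud Storage events in binary content mode, where CloudEvent attributes arrive as ce-* HTTP headers.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical filter for Eventarc-delivered CloudEvents (binary content mode).
// Assumes header names have already been lower-cased by the caller.
final class StorageEventFilter {
    static final String FINALIZED_TYPE = "google.cloud.storage.object.v1.finalized";
    private static final String SUBJECT_PREFIX = "objects/";

    private StorageEventFilter() {}

    // Returns the object name when the event reports a newly finalized
    // Cloud Storage object, otherwise an empty Optional.
    static Optional<String> objectToProcess(Map<String, String> headers) {
        String type = headers.get("ce-type");
        String subject = headers.get("ce-subject");
        if (!FINALIZED_TYPE.equals(type)
                || subject == null
                || !subject.startsWith(SUBJECT_PREFIX)) {
            return Optional.empty();
        }
        return Optional.of(subject.substring(SUBJECT_PREFIX.length()));
    }
}

class EventDemo {
    public static void main(String[] args) {
        // Example headers; the object path is made up for the demo.
        Map<String, String> headers = Map.of(
                "ce-type", StorageEventFilter.FINALIZED_TYPE,
                "ce-subject", "objects/crawl/example.json");
        System.out.println(StorageEventFilter.objectToProcess(headers));
    }
}
```

Keeping the filter as a pure function of the headers makes it easy to test without running an HTTP server; a controller would call it and hand the returned object name to the summarize and classify step.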

Technologies

Spring Boot 3
Puppeteer
OpenAI API
MySQL
GCP Cloud Run
Cloud Storage
Eventarc