Case Study

Tech Collection

A live, AI-curated blog aggregator that crawls technical blogs, summarizes and classifies posts with ChatGPT via the OpenAI API, and raised throughput 15x through in-memory caching.

Project Overview

Tech Collection collects posts from technical blogs with a Puppeteer crawler, processes them in a Spring Boot backend, summarizes and classifies them with the OpenAI API, and serves the curated results. In-memory caching removed database and LLM round trips from the hot request path. Throughput rose from 20 to 300 requests per second (15x), and response time fell from 6 seconds to 0.5 seconds.
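The caching idea can be sketched as a small read-through cache. This is a minimal illustration, not the project's actual code: the class name `SummaryCache`, the `loader` function (standing in for the database or OpenAI call), and the time-to-live parameter are all assumptions introduced here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical in-memory, read-through cache for post summaries.
 * The loader stands in for the slow path (database query or LLM call);
 * it is an assumption, not part of the original project description.
 */
final class SummaryCache {
    /** A cached value together with the time at which it expires. */
    private record Entry(String value, long expiresAtMillis) {}

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> loader;
    private final long ttlMillis;

    SummaryCache(Function<String, String> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached summary, calling the loader only on a miss or after expiry. */
    String get(String postId) {
        long now = System.currentTimeMillis();
        // compute() is atomic per key, so concurrent requests for the same
        // post trigger at most one slow load at a time.
        Entry entry = entries.compute(postId, (key, existing) -> {
            if (existing != null && existing.expiresAtMillis() > now) {
                return existing; // cache hit: no database or LLM round trip
            }
            return new Entry(loader.apply(key), now + ttlMillis);
        });
        return entry.value();
    }
}
```

With a cache like this, only the first request for a given post pays the slow path and later requests within the time-to-live are served from memory. A production setup would more likely use Spring's caching abstraction or a bounded cache library, since this sketch never evicts entries for posts that stop being requested.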

Key Challenges

  • Reducing the cost of finding relevant information across fragmented technology blogs
  • Summarizing and classifying long-form posts automatically
  • Controlling LLM cost and latency on hot request paths
  • Operating a hybrid Spring Boot and Node.js crawler pipeline

Key Outcomes

  • Built an automated crawl, summarize, classify, and serve pipeline
  • Improved throughput from 20 to 300 requests per second
  • Reduced response time from 6 seconds to 0.5 seconds
  • Deployed an event-driven architecture on GCP Cloud Run and Eventarc

Technologies

  • Spring Boot 3
  • Puppeteer
  • OpenAI API
  • MySQL
  • GCP Cloud Run
  • Cloud Storage
  • Eventarc