How to Build an AI-Powered Email Subscription Summarizer
GenAI 30 Project Challenge - 3
Published: March 7, 2025
URL: https://buildtolaunch.ai/p/genai-30-project-challenge-3
If you're like me, your email is probably overflowing with newsletters and notifications from all the subscriptions you've signed up for. It feels like a constant battle to catch up with all the "useful" insights that fill your inbox.
This got me thinking: What if I used GenAI to create a Subscription Summarizer?
A tool that could sift through my inbox, extract the key information, and give me concise, actionable insights.
Project Goals
For this project, I set out to build a program that could:
- Connect to my Gmail account to retrieve all unread subscription emails.
- Parse the emails to extract recommended article links.
- Use AI to read those articles and generate summaries for me.
Choice of Model and Tools
- Model: BART, loaded through Hugging Face, a sequence-to-sequence transformer model well suited to text summarization.
- Programming Language: Barebones Python, keeping it lightweight and straightforward.
- Frameworks: Hugging Face Transformers library for NLP, and Gmail API for email access.
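To make the model choice concrete, here is a minimal sketch of a BART summarization call through the Transformers `pipeline` API. The checkpoint name `facebook/bart-large-cnn`, the helper names, and the default lengths are assumptions for illustration; the post does not say which checkpoint or values were used.

```python
def build_summarizer():
    """Load a BART summarization pipeline (downloads the model on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    # Assumed checkpoint: the post only says "BART", not which variant.
    return pipeline("summarization", model="facebook/bart-large-cnn")


def summarize_article(text, summarizer, max_length=180, min_length=60):
    """Summarize one article. max_length and min_length are measured in tokens."""
    result = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        do_sample=False,  # deterministic decoding
        truncation=True,  # cut inputs that exceed the model's input limit
    )
    # The pipeline returns a list with one dict per input text.
    return result[0]["summary_text"].strip()
```

Typical use would be `summarize_article(article_text, build_summarizer())`. Passing the summarizer in as an argument keeps the model load (which is slow) out of the per-article call.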
Help from Cursor (As Always)
Cursor played a pivotal role in getting the basics up and running. It:
- Generated boilerplate code to connect with the Gmail API and fetch unread emails.
- Helped parse email content, extract links, and handle authentication securely.
- Integrated BART from Hugging Face, generating a simple pipeline for summarizing the articles.
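For readers who want to see roughly what that Gmail boilerplate involves, here is a hedged sketch. It assumes `service` is an already-authenticated Gmail client built with `googleapiclient.discovery.build("gmail", "v1", credentials=...)`; the function names and the default search query are illustrative, not the code Cursor actually generated.

```python
import base64


def list_unread_message_ids(service, query="is:unread from:noreply@medium.com", max_results=50):
    """Return IDs of messages matching a Gmail search query, following pagination."""
    ids = []
    page_token = None
    while len(ids) < max_results:
        params = {"userId": "me", "q": query, "maxResults": min(100, max_results - len(ids))}
        if page_token:
            params["pageToken"] = page_token
        response = service.users().messages().list(**params).execute()
        ids.extend(m["id"] for m in response.get("messages", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return ids[:max_results]


def fetch_message(service, message_id):
    """Fetch the full message resource (headers plus MIME parts)."""
    return service.users().messages().get(userId="me", id=message_id, format="full").execute()


def extract_html_body(message):
    """Walk the MIME tree of a message resource and return a text/html part, if any."""
    stack = [message.get("payload", {})]
    while stack:
        part = stack.pop()
        data = part.get("body", {}).get("data")
        if part.get("mimeType") == "text/html" and data:
            # Gmail uses URL-safe base64; restore padding in case it was stripped.
            padded = data + "=" * (-len(data) % 4)
            return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
        stack.extend(part.get("parts", []))
    return ""
```

The default query targets Medium digests only as an example; any Gmail search expression (labels, senders, categories) can be passed instead.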
Rounds of Adjustments
1. Extracting Correct Links: Medium emails are riddled with links — often more than one for the same article. It took 20+ iterations to get the filtering right, but eventually I was extracting clean article URLs.
2. Improving Summary Length: The initial summaries generated by BART were too short — just a couple of sentences. After tweaking the parameters (max_length and min_length), I got summaries that were concise yet informative.
3. From Summarizer to Recommender: Even with improved summaries, I found it overwhelming to read through all of them. That's when I decided to level up: turning the summarizer into a recommender system.
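The link cleanup in step 1 is the kind of thing that is easier to show than describe. Below is one possible approach using only the standard library: collect every `<a href>`, strip query strings and fragments so tracking variants collapse to one URL, and keep only links that look like articles. The "slug ends in a hex ID" rule is a heuristic assumed here for Medium links, not the filter the post arrived at after its 20+ iterations.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def canonical_url(url):
    """Drop query string and fragment so tracking variants of a link compare equal."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path.rstrip("/"), "", ""))


# Assumed heuristic: article paths end in "-<hex id>", e.g. ".../my-post-1a2b3c4d5e6f".
ARTICLE_PATH = re.compile(r"-[0-9a-f]{8,}$")


def extract_article_links(html, allowed_domain="medium.com"):
    """Return unique, cleaned article URLs in the order they first appear."""
    collector = _LinkCollector()
    collector.feed(html)
    seen, articles = set(), []
    for href in collector.links:
        url = canonical_url(href)
        parts = urlsplit(url)
        on_domain = parts.netloc == allowed_domain or parts.netloc.endswith("." + allowed_domain)
        if parts.scheme in ("http", "https") and on_domain and ARTICLE_PATH.search(parts.path):
            if url not in seen:
                seen.add(url)
                articles.append(url)
    return articles
```

Deduplicating on the canonical URL rather than the raw `href` is what collapses the title link, the image link, and the "read more" link for the same article into a single entry.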
From Summarizer to Recommender
To prioritize the best articles, I built a scoring system based on four metrics:
- Content Quality: Assesses how coherent and informative the content is, using BART to compare the summary with the original.
- Readability: Evaluates the article's ease of reading using metrics like Flesch Reading Ease.
- Relevance: Checks how well the content matches the title.
- Length Score: Scores articles based on their length to balance short, actionable reads with in-depth insights.
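As one concrete example of these metrics, the readability score can be sketched with the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). The syllable counter below is a rough vowel-group heuristic, and dividing by 100 and clamping to the range 0 to 1 is an assumption about how the value might be normalized; the post does not describe its exact implementation.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


def readability_score(text):
    """Map the Flesch score onto 0..1 so it can be combined with other metrics."""
    return min(max(flesch_reading_ease(text) / 100.0, 0.0), 1.0)
```

A dedicated library would count syllables more accurately; the heuristic is only meant to show the shape of the calculation.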
# Relative weight of each metric in the overall score (weights sum to 1.0)
weights = {
    'content_quality': 0.4,
    'readability': 0.3,
    'relevance': 0.2,
    'length_score': 0.1
}
Now, instead of skimming through every summary, I can sort articles by their overall score and focus on the top ones.
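The combination step is a plain weighted sum followed by a sort. Here is a small sketch that reuses the weights from the post; the article dictionary layout (`url`, `metrics`) and the function names are hypothetical, and each metric is assumed to already be scaled to the range 0 to 1.

```python
WEIGHTS = {
    "content_quality": 0.4,
    "readability": 0.3,
    "relevance": 0.2,
    "length_score": 0.1,
}


def overall_score(metrics, weights=WEIGHTS):
    """Weighted sum of the metric values; a missing metric counts as 0."""
    return sum(weight * metrics.get(name, 0.0) for name, weight in weights.items())


def rank_articles(articles, weights=WEIGHTS):
    """Return the articles sorted from highest to lowest overall score."""
    return sorted(articles, key=lambda a: overall_score(a["metrics"], weights), reverse=True)


articles = [
    {"url": "https://example.com/a",
     "metrics": {"content_quality": 0.5, "readability": 0.9, "relevance": 0.5, "length_score": 1.0}},
    {"url": "https://example.com/b",
     "metrics": {"content_quality": 0.9, "readability": 0.5, "relevance": 0.8, "length_score": 0.4}},
]
top_first = rank_articles(articles)
```

With these made-up numbers, article b ranks first: its stronger content quality and relevance outweigh article a's better readability and length score, because those two metrics carry 60% of the weight between them.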
Problems and Improvements
- Email Parsing Complexity: Subscription emails have inconsistent structures. There's room to automate filtering further.
- Summarization Quality: Could be improved by fine-tuning BART on a dataset of Medium articles or by experimenting with other models such as T5 or GPT-3.5.
- Relevance Scoring: The relevance metric is currently based on simple keyword matching. Incorporating a semantic similarity model could make this more robust.
- Scalability: As the volume of emails and articles grows, performance might degrade. Optimizing the pipeline and caching summaries that have already been generated could help.
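On the caching point, one simple option is to store each summary on disk keyed by a hash of the article URL, so an article that shows up in several emails is only summarized once. This is a sketch of that idea, not something the project currently does; the class name, the file name, and the `compute` callback are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


class SummaryCache:
    """Tiny JSON-file cache mapping article URLs to their summaries."""

    def __init__(self, path="summary_cache.json"):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    @staticmethod
    def _key(url):
        # Hash the URL so keys have a fixed length regardless of URL size.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get_or_compute(self, url, compute):
        """Return the cached summary for url, calling compute(url) only on a miss."""
        key = self._key(url)
        if key not in self.data:
            self.data[key] = compute(url)
            # Persist after every miss so a crash does not lose finished work.
            self.path.write_text(json.dumps(self.data), encoding="utf-8")
        return self.data[key]
```

Rewriting the whole file on every miss is fine for a personal inbox; at larger volumes a small database such as SQLite would be a more natural fit.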
Final Thoughts
What started as a simple summarizer evolved into a powerful recommender system that prioritizes high-quality content. Not only does it save me time, but it also ensures I don't miss out on the most valuable insights.
Working with BART and Hugging Face was a great learning experience, and Cursor made the process of writing and refining code much smoother.
On to the next GenAI challenge soon!