Srikar Portfolio

Overview

A creator-first tool that turns raw speech into accurate, audience-friendly YouTube chapters. It combines multi-language transcription, AI analysis, and human-in-the-loop editing so the chapter structure matches the creator's intent.

Built to solve real workflow pain points: it gives creators control over titles and boundaries, reliable timestamps, and export-ready chapters intended to improve discoverability and retention.

Client

Personal Project

Product Design, React, Express, AWS Transcribe, Gemini AI

Key Features

Multi-language Audio Input

Accepts audio in multiple languages, detects the spoken language, and converts the speech to text automatically.

AI-Powered Chapter Generation

Uses Gemini AI to identify natural break points, topic changes, and content flow, then generates chapter markers with descriptive titles.
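
As a rough illustration of this step, the sketch below builds a prompt from timestamped transcript segments and parses the model's reply into chapter markers. The prompt wording, the JSON reply shape, and the names buildChapterPrompt and parseChapterReply are assumptions made for this example, not the project's actual code, and the call to Gemini itself is left out.

```javascript
// Hypothetical helpers: prompt wording and reply shape are assumptions.

// Turn timestamped transcript segments into a prompt for the model.
function buildChapterPrompt(segments) {
  const lines = segments.map((s) => `[${Math.floor(s.start)}s] ${s.text}`);
  return [
    "Split this transcript into chapters at natural topic changes.",
    'Reply with JSON only: [{"start": <seconds>, "title": "<short title>"}]',
    "",
    ...lines,
  ].join("\n");
}

// Parse the model's reply, tolerating a Markdown code fence around the JSON.
function parseChapterReply(reply) {
  const json = reply
    .replace(/^\s*`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}\s*$/, "");
  const parsed = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of chapters");
  }
  // Drop malformed entries, trim titles, and order chapters by start time.
  return parsed
    .filter((c) => Number.isFinite(c.start) && typeof c.title === "string")
    .map((c) => ({ start: c.start, title: c.title.trim() }))
    .sort((a, b) => a.start - b.start);
}
```

Keeping the parser strict about the reply shape means a malformed model response fails loudly instead of producing silently broken chapters.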

Human-in-the-loop Transcript Editing

An interactive editor lets users review, correct, and refine the AI-generated transcript before chapters are generated, so transcription errors do not carry over into the chapters.

Technical Stack

Frontend

React + modern JS; responsive UI for clean workflows.

Backend

Express with AWS Transcribe; reliable speech-to-text.
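
One way the backend might prepare Transcribe output for later steps is sketched below. AWS Transcribe's result JSON lists word-level items under results.items, with start_time and end_time given as strings and punctuation items carrying no timestamps. The function name itemsToSegments and the 1.5-second pause threshold are assumptions for illustration only.

```javascript
// Group AWS Transcribe word items into timestamped segments, starting a new
// segment whenever the pause between words exceeds maxGapSeconds.
// (Function name and default threshold are hypothetical.)
function itemsToSegments(items, maxGapSeconds = 1.5) {
  const segments = [];
  let current = null;
  for (const item of items) {
    const content = item.alternatives[0].content;
    if (item.type === "punctuation") {
      // Punctuation items have no timestamps; attach them to the previous word.
      if (current) current.text += content;
      continue;
    }
    const start = parseFloat(item.start_time);
    const end = parseFloat(item.end_time);
    if (current && start - current.end <= maxGapSeconds) {
      current.text += " " + content;
      current.end = end;
    } else {
      current = { start, end, text: content };
      segments.push(current);
    }
  }
  return segments;
}
```

It would be called as itemsToSegments(transcribeJson.results.items), where transcribeJson is the parsed output file of a finished transcription job.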

AI & Cloud

Gemini AI for analysis; exportable chapter output.
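
A minimal sketch of what an export step could look like, assuming chapters are held as objects with a start time in seconds and a title (a hypothetical shape). The checks reflect YouTube's published chapter requirements as commonly documented: the first timestamp at 0:00, at least three chapters in ascending order, and each chapter at least 10 seconds long.

```javascript
// Format seconds as M:SS, or H:MM:SS once the video passes one hour.
function formatTimestamp(totalSeconds) {
  const s = Math.floor(totalSeconds);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const sec = String(s % 60).padStart(2, "0");
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${sec}` : `${m}:${sec}`;
}

// Validate chapters and render them as lines for a video description.
// (exportChapters and the chapter shape are hypothetical.)
function exportChapters(chapters) {
  const sorted = [...chapters].sort((a, b) => a.start - b.start);
  if (sorted.length < 3) {
    throw new Error("At least three chapters are required");
  }
  if (Math.floor(sorted[0].start) !== 0) {
    throw new Error("The first chapter must start at 0:00");
  }
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i].start - sorted[i - 1].start < 10) {
      throw new Error(`Chapter "${sorted[i - 1].title}" is shorter than 10 seconds`);
    }
  }
  return sorted.map((c) => `${formatTimestamp(c.start)} ${c.title}`).join("\n");
}

// Example:
// exportChapters([
//   { start: 0, title: "Intro" },
//   { start: 75, title: "Setup" },
//   { start: 3725, title: "Wrap-up" },
// ])
// returns "0:00 Intro\n1:15 Setup\n1:02:05 Wrap-up"
```

Validating before export means a creator sees a clear error in the editor rather than discovering after upload that the chapters were not recognized.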

Next Steps

Looking ahead, there are two promising directions for expanding the YouTube Chapter Generator:

Open-Source Version
- Allow users to choose their own ASR system for transcription (AWS Transcribe, Whisper, etc.).
- Let them pair that with their preferred LLM (Gemini, OpenAI, local models, etc.) for chapter generation.
- Distribute with simple CLI or Docker setup so it's easy to run locally.

Native macOS App
- Build a desktop app that runs the two-step process (ASR → LLM) directly on the machine.
- Integrate chapter generation into the YouTube upload workflow, so chapters attach automatically.
- Simplify the publishing process for creators who upload frequently from their Macs.
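
Both directions rest on the same two-step process (ASR → LLM) with swappable parts. A minimal sketch of that idea follows, assuming each backend is wrapped as a plain async function. The names createChapterPipeline, transcribe, and generateChapters are hypothetical interfaces, and the stub backends stand in for real services.

```javascript
// Compose any ASR backend with any LLM backend. Each is an async function,
// so AWS Transcribe, Whisper, Gemini, or a local model could be wrapped
// behind the same two signatures (hypothetical interfaces).
function createChapterPipeline({ transcribe, generateChapters }) {
  return async function run(audioPath) {
    // Step 1: ASR turns audio into timestamped segments.
    const segments = await transcribe(audioPath);
    if (segments.length === 0) {
      throw new Error("Transcript is empty");
    }
    // Step 2: the LLM turns segments into chapter markers.
    const chapters = await generateChapters(segments);
    return { segments, chapters };
  };
}

// Stub backends for a dry run without any cloud services.
const fakeAsr = async () => [
  { start: 0, end: 4, text: "Welcome to the video." },
];
const fakeLlm = async (segments) => [
  { start: segments[0].start, title: "Intro" },
];

const runPipeline = createChapterPipeline({
  transcribe: fakeAsr,
  generateChapters: fakeLlm,
});
```

Calling runPipeline("talk.mp3") resolves to an object holding both the segments and the chapters, so a CLI, a Docker entry point, or a desktop app could share the same core.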
