
Build RAG Chatbot with OpenAI & Groq

What you're building: an AI chatbot that answers questions using your PDF documents

Tech stack: React + NestJS + PostgreSQL + OpenAI Embeddings + Groq LLM


What is RAG?

RAG (Retrieval-Augmented Generation) = Document search + AI answers

  • Upload PDFs
  • Ask questions
  • Get answers based on your documents
  • Grounded answers: responses draw only on facts from your files, which greatly reduces hallucinations

Perfect for: Knowledge bases, customer support, research, legal docs, education


Step 1: Tell Cocoding.ai What to Build

Copy/paste this prompt:

Build a document-based chatbot application with these features:
Frontend (React):
- User authentication (register/login)
- PDF upload with drag-and-drop
- Document management page
- Chat interface with message history
- Modern UI with Tailwind CSS
Backend (NestJS):
- JWT authentication
- PDF text extraction
- Vector embeddings generation
- Semantic search using embeddings
- Chat API with Groq LLM integration
- PostgreSQL database with pgvector
Technical Requirements:
- Use Groq API for fast LLM responses
- Use OpenAI API for text embeddings
- Store documents and chat history in PostgreSQL
- Implement retrieval-augmented generation (RAG)
- Secure file upload validation
- CORS configuration for API access
RAG Workflow:
1. Extract text from uploaded PDFs
2. Split text into chunks
3. Generate embeddings for each chunk
4. Store embeddings in database
5. On query: find relevant chunks via semantic search
6. Send chunks + question to Groq LLM
7. Return AI-generated answer based only on document content

Cocoding.ai creates everything automatically — backend, frontend, database, RAG pipeline.
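To make the RAG workflow above concrete, here is a minimal sketch of step 2 (chunk splitting) in TypeScript. The function name, chunk size, and overlap are illustrative; the generated app may use different values:

```typescript
// Split extracted PDF text into overlapping chunks (workflow step 2).
// chunkSize/overlap are illustrative defaults, not the app's exact settings.
function splitIntoChunks(text: string, chunkSize = 1000, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
    start += chunkSize - overlap; // overlap preserves context across boundaries
  }
  return chunks;
}
```

Overlapping chunks help retrieval when an answer spans a chunk boundary, at the cost of storing slightly more text.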


Step 2: Get API Keys

2.1 OpenAI API Key (for Embeddings)

📸 Screenshot placeholder: OpenAI API keys page

  1. Visit [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
  2. Sign up/login
  3. Click "Create new secret key"
  4. Name it: RAG Application
  5. Copy key immediately (starts with sk-proj- or sk-)

Cost: ~$0.02 per 1M tokens = extremely cheap!

  • 100-page PDF ≈ $0.001 (0.1 cents)
  • Most users spend < $1/month
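Rough arithmetic behind these numbers, assuming about 500 words per page and about 1.3 tokens per word (our estimates, not OpenAI's):

```typescript
// Back-of-envelope embedding cost estimate. The words-per-page and
// tokens-per-word figures are rough assumptions, not measured values.
function embeddingCostUSD(pages: number): number {
  const tokens = pages * 500 * 1.3; // ~500 words/page, ~1.3 tokens/word
  return (tokens / 1_000_000) * 0.02; // ~$0.02 per 1M tokens
}
// A 100-page PDF works out to roughly 65,000 tokens, about a tenth of a cent.
```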

Payment required: Add card at Settings → Billing (set limit to $5/month)


2.2 Groq API Key (for Chat)

📸 Screenshot placeholder: Groq console API keys page

  1. Visit [console.groq.com/keys](https://console.groq.com/keys)
  2. Sign up/login
  3. Click "Create API Key"
  4. Copy key (starts with gsk_)

Cost: FREE with generous limits!


Step 3: Configure App with API Keys

📸 Screenshot placeholder: .env file showing API key configuration

Open backend/.env and update:

GROQ_API_KEY=gsk_YOUR_ACTUAL_GROQ_KEY_HERE
OPENAI_API_KEY=sk-proj-YOUR_ACTUAL_OPENAI_KEY_HERE

Save the file — backend restarts automatically.
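A quick sanity check on key formats can catch copy/paste mistakes early. This helper is our own illustration, not part of the generated app:

```typescript
// Returns true if both keys match the documented prefixes
// (gsk_ for Groq; sk- or sk-proj- for OpenAI).
function keysLookValid(groqKey: string, openaiKey: string): boolean {
  return groqKey.startsWith("gsk_") && openaiKey.startsWith("sk-");
}
```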


Step 4: Test Your RAG App

📸 Screenshot placeholder: App homepage/login

4.1 Create Account

  1. Open app (Cocoding.ai provides URL)
  2. Click "Sign up"
  3. Enter name, email, password
  4. Click "Create Account"

📸 Screenshot placeholder: Dashboard/documents page

4.2 Upload PDF

  1. Click "Choose File" or drag & drop PDF
  2. Click "Upload PDF"
  3. Wait for processing (10-30 seconds):
     - Text extraction
     - Chunk splitting
     - Embedding generation
     - Database storage
📸 Screenshot placeholder: Document list with uploaded PDF

4.3 Chat with Document

  1. Click "Chat" on your document
  2. Ask a question:
     - "What is this document about?"
     - "Summarize the main points"
     - "What does it say about [topic]?"
  3. Get an AI answer based on your document's content!

📸 Screenshot placeholder: Chat interface showing Q&A


How RAG Works

  1. Upload PDF → Extract text
  2. Split into chunks → 1000 characters each
  3. Generate embeddings (OpenAI) → Convert text to vectors
  4. Store in database with pgvector
  5. Ask question → Generate query embedding
  6. Semantic search → Find top 5 similar chunks
  7. Generate answer (Groq Llama 3.3) → Based on chunks only
  8. Display answer → Show sources used
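Steps 5-6 come down to vector similarity. In the app, pgvector does this in SQL, but the core idea can be sketched in plain TypeScript (the types and names here are illustrative, not the generated app's code):

```typescript
interface Chunk { text: string; embedding: number[]; }

// Cosine similarity between two equal-length vectors: 1 = identical direction.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks most similar to the query embedding (step 6).
function topKChunks(query: number[], chunks: Chunk[], k = 5): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

pgvector replaces this in-memory scan with an indexed SQL query, which is what keeps retrieval fast once you have thousands of chunks.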


Troubleshooting

OpenAI API errors?

  • Verify key starts with sk-proj- or sk-
  • Check for extra spaces in .env
  • Ensure payment method added
  • Restart backend

Groq API errors?

  • Verify key starts with gsk_
  • Check key is active
  • Restart backend

PDF upload fails?

  • Use text-based PDFs (not scanned images)
  • Remove password protection
  • Check file size (max 10MB)

Backend won't start?

cd backend
rm -rf node_modules
pnpm install
pnpm run dev

Customize Your App

Change AI Models

Groq models (edit backend/src/chat/chat.service.ts):

model: 'llama-3.3-70b-versatile' // Default
// 'llama-3.1-70b-versatile' // Alternative
// 'mixtral-8x7b-32768' // Long context
// 'gemma2-9b-it' // Faster, smaller

OpenAI embeddings (edit backend/src/embeddings/embeddings.service.ts):

model: 'text-embedding-3-small' // Default - cheap
// 'text-embedding-3-large' // More accurate

Adjust Chunk Size

In backend/src/embeddings/embeddings.service.ts:

chunkSize: 1000 // Default
// Increase for longer context (2000)
// Decrease for precision (500)

Change Retrieved Chunks

In backend/src/chat/chat.service.ts:

findRelevantChunks(query, documentId, 5) // Default: 5 chunks
// Use 3-10 chunks

Pro Tips

Best PDFs:

  • ✅ Text-based (digitally created)
  • ✅ Well-formatted with headings
  • ❌ Scanned documents (images)
  • ❌ Password-protected

Best questions:

  • ✅ "What are the main conclusions?"
  • ✅ "Summarize the methodology"
  • ✅ "What does it say about [topic]?"
  • ❌ "What?" (too vague)
  • ❌ Content NOT in document

Optimize for your use case:

  • Academic papers: Chunk size 2000, 7-10 chunks, use text-embedding-3-large
  • Quick facts: Chunk size 500, 3-5 chunks, use text-embedding-3-small
  • Long docs: Default settings, split into sections

Cost Breakdown

| Usage          | OpenAI Cost | Groq Cost |
|----------------|-------------|-----------|
| 10-page PDF    | $0.0001     | FREE      |
| 100-page PDF   | $0.001      | FREE      |
| 1,000-page PDF | $0.01       | FREE      |

Typical user: < $1/month

Power user: $5-10/month


Security Best Practices

  • Never commit `.env` to git
  • Rotate keys every 90 days
  • Set usage limits on OpenAI dashboard
  • Use separate keys for dev/production
  • Regenerate keys if exposed


Customize Further

Tell Cocoding.ai what you want:

  • "Add document summarization"
  • "Enable multi-document search"
  • "Export chat history to PDF"
  • "Add shareable conversations"
  • "Integrate with Claude instead of Groq"
  • "Add voice input for questions"
  • "Support multiple file formats (Word, TXT)"
  • "Add citation highlighting"

Going to Production

Before deploying:

  1. Change JWT secret in .env
  2. Set CORS origin to your domain
  3. Add rate limiting to prevent abuse
  4. Use hosting platform secrets (never expose .env)
  5. Enable monitoring for API usage
  6. Set up backups for database
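For item 3, a maintained package (e.g. @nestjs/throttler) is the usual choice; the mechanism it implements can be sketched as a sliding-window counter (this class is illustrative, not production code):

```typescript
// Minimal in-memory rate limiter: at most `limit` requests per key
// within a sliding window of `windowMs` milliseconds.
class RateLimiter {
  private hits = new Map<string, number[]>();
  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    // Drop timestamps that have fallen outside the window.
    const recent = (this.hits.get(key) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

In a real deployment, prefer a maintained guard or rate limiting at the reverse proxy: in-memory state doesn't survive restarts or scale across instances.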

Deploy options:

  • Vercel (frontend) + Railway/Render (backend)
  • AWS, Google Cloud, or Azure
  • Docker containers

FAQ

Q: Can I use this without OpenAI?

A: Yes, but embedding quality will be lower. OpenAI is recommended for best results.

Q: Can I use other LLMs?

A: Yes! Ask Cocoding.ai: "Integrate with Claude/Anthropic" or "Use OpenAI GPT-4 instead of Groq"

Q: How many documents can I upload?

A: Unlimited, but consider database storage and API costs.

Q: Can I use commercially?

A: Yes! Comply with OpenAI and Groq terms of service.