AI/ML
Document Processing AI
An AI-powered document processing system with web scraping capabilities that extracts tables from multiple sources, enabling semantic search and RAG-based querying over structured data.
System design
Key features
- Ingest documents by upload (Excel, CSV, PDF, Word) or by automated web scraping
- Neural network-powered table extraction with advanced OCR
- Vector storage of tables using LangChain + embeddings
- RAG pipeline to answer natural language queries on tables
- Web UI built with Next.js (TypeScript)
- Containerized microservices deployed with Kubernetes
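The vector storage and RAG features above can be illustrated with a minimal sketch. It is not the project's code: the function names (`table_to_chunks`, `embed`, `retrieve`, `build_prompt`) and the sample `sales` table are hypothetical, and a toy token-count vector stands in for the LangChain embeddings so the example runs with the standard library alone.

```python
import math
import re
from collections import Counter


def table_to_chunks(table_name, header, rows):
    """Serialize each table row as one text chunk, keeping the column names."""
    chunks = []
    for row in rows:
        cells = ", ".join(f"{col}: {val}" for col, val in zip(header, row))
        chunks.append(f"{table_name} | {cells}")
    return chunks


def embed(text):
    """Toy stand-in for an embedding model (assumption): lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context):
    """Assemble the prompt that would be sent to a language model."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only these table rows:\n{joined}\n\nQuestion: {query}"


# Hypothetical sample table, invented for illustration.
header = ["region", "quarter", "revenue"]
rows = [["north", "Q1", 120], ["south", "Q1", 95], ["north", "Q2", 140]]

chunks = table_to_chunks("sales", header, rows)
question = "revenue for north in Q2"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Serializing each row together with its column names keeps a retrieved chunk readable on its own, which matters once the rows are detached from the original table. In a real deployment the `embed` function would be replaced by an embedding model and the sorted list by a vector store lookup.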
Technologies
- LangChain
- Neural Networks
- Next.js (TypeScript)
- FastAPI (Python)
- PostgreSQL
- Web Scraping
- Docker
- Kubernetes