AI & Machine Learning | Joshua Curry

Large Language Models & RAG Pipelines

chatjc

RAG-Powered Conversational Chatbot

MCP server and REST API that answers questions about the developer's professional background using retrieval-augmented generation. The pipeline loads markdown context documents, splits them into chunks, embeds them with Mistral AI, and stores them in a custom in-memory vector store. At query time, cosine similarity retrieval selects the most relevant chunks, which are injected into a Mistral AI chat prompt via an LCEL chain built with LangChain.

LLM: Mistral AI (chat completions + text embeddings)
Frameworks: LangChain (@langchain/core, @langchain/mistralai, @langchain/textsplitters)
Patterns: Full RAG pipeline (load → split → embed → store → retrieve → generate), LCEL chain composition, RecursiveCharacterTextSplitter, custom in-memory vector store with cosine similarity, session-based conversation history, prompt injection detection, response length truncation
Language: TypeScript (ESM)

Demonstrates end-to-end RAG pipeline construction, LangChain LCEL orchestration, embedding-based semantic retrieval, and production-ready LLM API integration with security guardrails and comprehensive test coverage.

generatejsonld

LLM-Powered Structured Data Generator

CLI tool that crawls a website and uses Mistral AI to generate Schema.org JSON-LD structured data for each page. Playwright handles headless browser crawling; Cheerio parses HTML into clean Markdown with metadata; Mistral AI determines the appropriate Schema.org type and generates valid JSON-LD for each page context.

LLM: Mistral AI (chat completions)
Tools: Playwright (headless browser), Cheerio (HTML parsing), dotenv
Patterns: Web scraping → HTML-to-Markdown conversion → LLM enrichment pipeline, per-page structured data generation, dry-run mode, session tracking, configurable crawl depth
Language: JavaScript (Node.js)

Demonstrates applied LLM use for SEO automation, prompt design for structured output (JSON), and multi-step pipeline architecture combining web automation with AI generation.

Vision & Multimodal AI

alt-tagger

AI Alt Text Generator for Image Libraries

CLI tool that generates descriptive alt text for all images hosted on a Cloudinary account using OpenAI's Vision API. The pipeline lists resources via the Cloudinary Admin API, downloads each image, sends it to GPT-4o-mini (or GPT-4o) for description, and pushes the generated alt text back to Cloudinary as metadata.

LLM: OpenAI Vision API (GPT-4o-mini / GPT-4o)
Integrations: Cloudinary SDK (Admin API + upload API)
Patterns: Multi-step task pipeline (list → download → generate → push), rate-limited API calls, batch processing, error handling with placeholder fallbacks, environment-based model configuration
Language: JavaScript (Node.js)

Demonstrates OpenAI Vision API integration, multimodal LLM prompting for image description, and end-to-end cloud media platform automation for accessibility and SEO use cases.

LLM Fine-Tuning & Text Generation

Critic (2019)

AI Art Criticism Generator (GPT-2 Fine-Tuning)

System that generates original art exhibition reviews by fine-tuning GPT-2 on over 10,000 Artforum magazine reviews. Deployed on a Raspberry Pi (accessed via SSH tunnel) with a message queue architecture for async generation. Users request a review via a Flask web interface; a RabbitMQ worker picks up the job, runs inference, and emails the result.

Models: GPT-2 (fine-tuned via gpt_2_simple), TensorFlow
Frameworks: Flask, TensorFlow, gpt_2_simple
Infrastructure: RabbitMQ (Pika) message queue, MySQL + SQLite, systemd services, Apache, Raspberry Pi
Patterns: Domain-specific LLM fine-tuning, queue-based async inference, database abstraction layer, email notification on completion, content moderation/flagging, iterative versioned deployment
Language: Python

Demonstrates LLM fine-tuning on a specialized corpus, production ML model deployment, asynchronous inference architecture with message queues, and iterative multi-environment deployment management.

Edge AI & Computer Vision

Oracle

Edge AI Computer Vision on Microcontroller

Firmware and toolchain for deploying computer vision models to the Seeed Grove Vision AI Module V2 (Himax WiseEye2 microcontroller). Implements multiple vision tasks: YOLOv8 and YOLO11 object detection, face detection and mesh, pose estimation, and gender classification. Models run as TensorFlow Lite Micro (TFLite) binaries optimized with CMSIS-NN for ARM Cortex-M cores. Also supports ExecuTorch model format.

Models: YOLOv8, YOLO11, face detection/mesh, pose estimation, gender classification
Frameworks: TensorFlow Lite Micro (TFLite), CMSIS-NN (ARM neural network kernels), ExecuTorch
Hardware: Seeed Grove Vision AI Module V2, Himax WiseEye2 (ARM Cortex-M), on-device camera
Patterns: Edge inference on microcontroller (no cloud), quantized TFLite model deployment, multi-task computer vision, firmware build and flash toolchain, model format comparison (TFLite vs. ExecuTorch)
Language: C / firmware toolchain

Demonstrates embedded AI deployment, TFLite Micro for constrained hardware, real-time on-device inference without network connectivity, and multi-task computer vision model integration.

AI Concepts & Patterns

Retrieval-Augmented Generation (RAG)

Built a complete RAG pipeline from scratch: document loading, text splitting, embedding generation, in-memory vector store with cosine similarity, top-k retrieval, and LangChain LCEL chain composition. Understands chunking strategy trade-offs (chunk size, overlap), embedding model selection, and context injection into chat prompts.

LLM API Integration

Worked directly with Mistral AI and OpenAI APIs for both chat completions and embeddings. Experience with prompt engineering for structured output (JSON-LD), conversational RAG, image description, and domain-specific text generation.

Vision AI & Multimodal Models

Used OpenAI Vision (GPT-4o-mini/GPT-4o) for image-to-text tasks, integrating vision inference into automated pipelines for accessibility metadata generation.

Edge AI & On-Device Inference

Deployed quantized neural network models (TFLite Micro) to a low-power ARM microcontroller. Understands the constraints of edge inference: model size, quantization, CMSIS-NN kernel optimization, and the firmware deployment pipeline.

AI Infrastructure & Architecture

Experience with async ML inference via message queues (RabbitMQ), production deployment of AI services (systemd, Flask), prompt injection detection and input/output guardrails, rate limiting for LLM API calls, and test-driven development with mocked LLM responses.

Frameworks & Tools

LangChain: LCEL chains, document loaders, text splitters, vector stores, chat models, embeddings
TensorFlow / TFLite: Model training, TFLite Micro for edge deployment, CMSIS-NN optimization
OpenAI API: Chat completions, Vision API (GPT-4o-mini, GPT-4o)
Mistral AI: Chat completions, text embeddings
gpt_2_simple: GPT-2 fine-tuning and inference
Roboflow: Dataset management and computer vision model tooling
YOLOv8 / YOLO11: Object detection model architecture and deployment