AI & ML Projects | David Cheng Portfolio

AI Image Recognition System

Duration: 3 months | Status: Completed | Role: Lead Developer

Overview

Developed an image recognition system that uses a convolutional neural network (CNN) to classify images into 10 categories. The model was trained on a custom dataset of 50,000 images and achieved 94% accuracy on the test set.

Key Features

Custom CNN architecture with 5 convolutional layers optimized for multi-class classification
Advanced data augmentation pipeline (rotation, flipping, scaling) to improve model generalization
Real-time image classification with sub-second inference times (avg. 0.3s per image)
Interactive web interface built with Flask for easy testing and demonstration
Model versioning and experiment tracking with TensorBoard
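As a rough illustration of the augmentation step listed above, the sketch below applies flipping, rotation, and scaling to a tiny grayscale image stored as nested lists. It is not the project's pipeline: the function names are hypothetical, rotation is simplified to 90-degree steps, and scaling is an integer nearest-neighbour upscale. A real pipeline would more likely use TensorFlow or OpenCV transforms.

```python
import random

Image = list[list[int]]  # grayscale image as rows of pixel values (assumption)


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def scale_nearest(img: Image, factor: int) -> Image:
    """Upscale by an integer factor using nearest-neighbour sampling."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in img
        for _ in range(factor)
    ]


def augment(img: Image, rng: random.Random) -> Image:
    """Apply each transform independently with probability 0.5."""
    if rng.random() < 0.5:
        img = flip_horizontal(img)
    if rng.random() < 0.5:
        img = rotate_90(img)
    if rng.random() < 0.5:
        img = scale_nearest(img, 2)
    return img
```

Passing a seeded `random.Random` keeps augmented batches reproducible between training runs.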

Technical Stack

Python 3.9, TensorFlow 2.x, Keras, NumPy, OpenCV, Flask, Docker

Challenges & Solutions

The main challenge was class imbalance in the dataset: some categories had 3x more samples than others. I addressed this with weighted loss functions and SMOTE (Synthetic Minority Over-sampling Technique) to balance the training data, which improved minority-class accuracy by 18%.
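The two balancing ideas can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: the weights use the common inverse-frequency heuristic `n_samples / (n_classes * class_count)`, the class names are made up, and the SMOTE-style function shows only the interpolation step (full SMOTE also selects the second point from the k nearest minority neighbours, which is omitted here).

```python
import random
from collections import Counter


def inverse_frequency_weights(labels: list[str]) -> dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def smote_like_sample(a: list[float], b: list[float], rng: random.Random) -> list[float]:
    """Create a synthetic point on the segment between two minority samples."""
    u = rng.random()  # interpolation factor in [0, 1)
    return [x + u * (y - x) for x, y in zip(a, b)]


# Toy labels with a 3:1 imbalance (hypothetical class names).
labels = ["cat"] * 300 + ["dog"] * 100
weights = inverse_frequency_weights(labels)

# Synthetic minority sample between two hypothetical feature vectors.
synthetic = smote_like_sample([0.0, 1.0], [2.0, 3.0], random.Random(42))
```

With these toy labels the minority class ends up weighted three times as heavily as the majority class, so each class contributes equally to the total loss.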

Results & Impact

The final model achieved 94% overall accuracy with F1-scores above 0.90 for all classes. The system processes images 5x faster than the baseline approach and has been deployed as a demonstration project showcasing practical computer vision applications.
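For reference, per-class F1 can be computed directly from true and predicted labels as `2*TP / (2*TP + FP + FN)`. The helper below is a minimal sketch with a hypothetical name and toy labels; it does not reproduce the project's evaluation data.

```python
def per_class_f1(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """Return the F1-score of each class, treating it as the positive class."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples gets 0.0 by convention here.
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


# Toy example: one "a" sample is misclassified as "b".
scores = per_class_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Reporting F1 per class alongside overall accuracy matters on an imbalanced dataset, because a high overall accuracy can hide weak minority classes.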

View on GitHub →

Natural Language Chatbot

Duration: 4 months | Status: Ongoing | Role: Full-Stack Developer

Overview

Built a conversational agent on transformer-based models that tracks context, maintains conversation history, and returns relevant responses. A sentiment analysis component lets the chatbot adapt its tone to the user's emotions, making interactions more natural and empathetic.

Key Features

Context-aware responses using attention mechanisms and BERT embeddings
Real-time sentiment analysis of user inputs with 89% accuracy
Multi-turn conversation handling with sliding window memory (up to 10 previous exchanges)
Integration with popular messaging platforms (Slack, Discord)
Customizable personality traits and response styles
RESTful API for easy integration into existing applications

Technical Stack

Python, PyTorch, Hugging Face Transformers, BERT, FastAPI, Redis, Docker, PostgreSQL

Challenges & Solutions

Managing conversation context over long dialogues was challenging, as storing full conversation history quickly consumed memory. I implemented a sliding window approach combined with Redis caching to efficiently store and retrieve conversation history without sacrificing response time. This reduced memory usage by 60% while maintaining context quality.
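The sliding window idea can be sketched with a bounded deque per session. This is an in-memory stand-in, not the project's Redis code: the class and method names are hypothetical, and the window size of 10 comes from the feature list above. With Redis, a similar effect is commonly achieved by pushing to a list and trimming it to a fixed length.

```python
from collections import deque

MAX_EXCHANGES = 10  # window size taken from the feature list


class ConversationStore:
    """In-memory stand-in for a Redis-backed store, keyed by session id."""

    def __init__(self, max_exchanges: int = MAX_EXCHANGES) -> None:
        self._max = max_exchanges
        self._sessions: dict[str, deque] = {}

    def add_exchange(self, session_id: str, user_msg: str, bot_msg: str) -> None:
        """Append one exchange; the deque silently drops the oldest when full."""
        window = self._sessions.setdefault(session_id, deque(maxlen=self._max))
        window.append((user_msg, bot_msg))

    def context(self, session_id: str) -> list[tuple[str, str]]:
        """Return the retained exchanges, oldest first."""
        return list(self._sessions.get(session_id, ()))


store = ConversationStore()
for i in range(12):
    store.add_exchange("session-1", f"user message {i}", f"bot reply {i}")
recent = store.context("session-1")  # only the last 10 exchanges remain
```

Because the window is bounded, memory per session stays constant no matter how long the dialogue runs.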

Results & Impact

The chatbot handles complex multi-turn conversations with an average response time of 1.2 seconds. User testing showed an 85% satisfaction rate for response relevance and naturalness. The system is currently being enhanced with retrieval-augmented generation (RAG) for domain-specific knowledge.

View on GitHub →

Predictive Analytics Dashboard

Duration: 2 months | Status: Completed | Role: Full-Stack Developer

Overview

Created an interactive web application that allows users to upload datasets, visualize trends, and generate predictions using various machine learning algorithms. The dashboard supports regression, classification, and time-series forecasting with an intuitive drag-and-drop interface suitable for both technical and non-technical users.

Key Features

Drag-and-drop CSV/Excel file upload with automatic schema detection
Automated data preprocessing and feature engineering (handling missing values, encoding, scaling)
Multiple ML algorithm selection (Random Forest, XGBoost, LSTM, Linear Regression)
Interactive visualizations with Chart.js and D3.js (scatter plots, time series, correlation matrices)
Model performance metrics and comparison tools (RMSE, MAE, R², confusion matrices)
Export predictions and trained models for future use
Responsive design optimized for desktop and tablet devices
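The regression metrics named in the feature list follow standard definitions, sketched below in plain Python. The function name and the toy values are assumptions for illustration; the dashboard itself presumably relies on scikit-learn's implementations.

```python
import math


def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """Compute RMSE, MAE, and R² for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined when the true values are constant.
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Toy example: three exact predictions and one that is off by 2.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

RMSE penalizes the single large error more heavily than MAE does, which is why comparison tools usually show both.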

Technical Stack

React, Python, scikit-learn, pandas, AWS Lambda, AWS S3, PostgreSQL, Chart.js, D3.js

Challenges & Solutions

Handling large datasets in the browser was initially problematic, causing slowdowns and memory issues for files over 50MB. I solved this by implementing serverless processing with AWS Lambda for heavy computations, while using React with virtualized lists for a responsive and smooth user interface. This architectural decision reduced client-side memory usage by 80% and improved processing speed for large datasets.
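One way to keep server-side memory flat is to stream rows and retain only running totals. The sketch below uses the standard `handler(event, context)` shape of a Python Lambda function, but everything else is an assumption: the event fields `csv_text` and `column` are hypothetical, and the real deployment would presumably read the file from S3 rather than from the event payload.

```python
import csv
import io


def summarize_csv(lines, column: str) -> dict:
    """Stream rows and keep only running totals, so memory stays constant."""
    count, total = 0, 0.0
    minimum, maximum = float("inf"), float("-inf")
    for row in csv.DictReader(lines):
        raw = row.get(column)
        if raw is None or raw.strip() == "":
            continue  # skip missing values
        value = float(raw)
        count += 1
        total += value
        minimum = min(minimum, value)
        maximum = max(maximum, value)
    if count == 0:
        return {"count": 0}
    return {"count": count, "mean": total / count, "min": minimum, "max": maximum}


def handler(event: dict, context) -> dict:
    """Lambda-style entry point; the event fields are hypothetical."""
    body = io.StringIO(event["csv_text"])
    return summarize_csv(body, event["column"])


# Local usage with a tiny inline CSV; the second row has a missing "y" value.
result = handler({"csv_text": "x,y\n1,2\n3,\n5,6\n", "column": "y"}, None)
```

The browser then receives only the small summary dictionary instead of the full dataset.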

Results & Impact

The dashboard successfully processes datasets up to 1GB in size and generates predictions within 30 seconds for most common use cases. The application has been used for exploratory data analysis and quick prototyping of ML models, reducing the time from data to insights by approximately 70% compared to traditional coding approaches.

View on GitHub →