Near real-time conversion of sketches into flower photos using a pix2pix GAN
Transform rough sketches of flowers into photorealistic images through an interactive web interface. Built on the pix2pix architecture and trained on a custom dataset of 9.5k+ curated flower images.
- Interactive Drawing Canvas - Draw sketches directly in your browser
- Near Real-Time Inference - See results instantly (GPU: ~100ms, CPU: ~1-2s)
- Multiple Variations - Generate up to 4 different outputs from one sketch, with diversity and noise-strength settings
- Model Switching - Load and switch between multiple trained models at runtime
- Auto-Updates - Automatically downloads pretrained models from GitHub releases
- Auto-Detection - Automatically reads model parameters from the model binaries
- Python 3.10.x

1. Clone the repository
   ```bash
   git clone https://github.com/Cosmic-Infinity/draw2pix.git
   cd draw2pix
   ```
2. Install dependencies
   ```bash
   pip install -r requirements.txt
   ```
3. Run the web application
   ```bash
   # Windows
   start.bat
   # Linux/Mac
   python app/web_app.py --model_dir pretrained_models
   ```
4. Open your browser at http://127.0.0.1:5000
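The repo ships a setup-verification script (`app/test_setup.py`). A minimal, illustrative sketch of what such a check can do before launching the server (this is an assumption about its behavior, not the actual script):

```python
import sys
from pathlib import Path

def check_setup(model_dir="pretrained_models", min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required")
    # The web app expects at least one trained generator checkpoint on disk
    if not list(Path(model_dir).glob("*.pth")):
        problems.append(f"no .pth model files found in {model_dir}/")
    return problems
```

`start.bat` downloads the pretrained models automatically; if you run `web_app.py` directly, a check like this tells you what is missing.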
| Input Sketch | Generated Output | Ground Truth |
|---|---|---|
| ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() |
Training validation samples showing the model's ability to generate realistic flower images from sketches
- Draw your sketch on the sketch canvas using your mouse or stylus
- Click "Generate" (or leave Auto-Generate ON) to convert your sketch to a photo
- Adjust settings (optional):
- Enable/Disable Dropout
- Perturbation strength (low/medium/high)
- Clear the canvas to start over or save your results
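Before a drawing reaches the generator, canvas pixels have to be mapped into the value range the network was trained on. A minimal sketch of the standard pix2pix normalization (an assumption about this app's exact preprocessing, which lives in `app/web_app.py`; the framework's default transform is `Normalize((0.5,), (0.5,))`, i.e. `x / 127.5 - 1`):

```python
def canvas_to_model_input(pixels):
    """Map 8-bit grayscale canvas values (0..255) into the [-1, 1] range
    that pix2pix generators are typically trained on."""
    return [p / 127.5 - 1.0 for p in pixels]
```

The inverse mapping (`(x + 1) * 127.5`) turns the generator's output back into displayable pixel values.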
```
python app/web_app.py [options]

Options:
  --model_dir   Directory containing .pth model files (default: pretrained_models)
  --input_nc    Input channels: 1 for grayscale, 3 for RGB (default: 1)
  --output_nc   Output channels: 3 for RGB (default: 3)
  --port        Port to run server on (default: 5000)
  --host        Host to run server on (default: 127.0.0.1)
```

```
draw2pix/
├── app/
│   ├── index.html               # Frontend interface
│   ├── web_app.py               # Flask backend application
│   └── test_setup.py            # Setup verification script
├── pretrained_models/           # Trained model weights (.pth files)
│   ├── *.pth                    # Model binaries (15 models)
│   ├── version.txt              # Version tracking for updates
│   └── MODELS_RELEASE.md        # Model release notes
├── pix2pix/                     # Pix2pix framework (PyTorch-CycleGAN-pix2pix)
│   ├── models/                  # Model architectures (networks, base model)
│   ├── options/                 # Configuration options (base, test)
│   ├── data/                    # Data loading utilities
│   ├── util/                    # Helper functions
│   ├── train.py                 # Training script
│   ├── test.py                  # Inference script
│   └── THIRD_PARTY_LICENSES.txt
├── docs/                        # Comprehensive documentation
│   ├── WEB_APP_README.md        # Web app usage guide
│   ├── ARCHITECTURE.md          # System architecture details
│   ├── MODEL_REFERENCE.md       # All 15 trained models reference
│   └── RELEASE_GUIDE.md         # Release creation guide
├── Progress Tracker/            # Training logs and progress
│   ├── Shubham.md               # Model training tracker
│   ├── Sourajit.md              # Frontend development tracker
│   └── train_graphs/            # Training visualization graphs
├── flowers Dataset/             # Custom curated dataset (9.5k images)
├── screenshots/                 # Application screenshots & samples
├── UI Design/                   # UI mockups and designs
├── requirements.txt             # Python dependencies
├── start.bat                    # Quick start script (Windows)
└── LICENSE                      # MIT License
```
- Base: pix2pix (Conditional GAN)
- Generator: U-Net-256 or ResNet-9blocks
- Discriminator: PatchGAN (Basic)
- Input: 256×256 grayscale sketches (1 or 3 channels)
- Output: 256×256 RGB images (3 channels)
- Training Dataset: 9.5k curated flower image pairs
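For reference, the λ_L1 hyperparameter that appears in the training table below is the weight λ in the standard pix2pix objective, which pairs the conditional-GAN loss with an L1 reconstruction term:

```math
G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G),
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]
```

A larger λ weights pixel-level fidelity to the paired photo more heavily relative to the adversarial term.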
All models are automatically downloaded via start.bat or available from GitHub Releases.
| Model | Architecture | Epochs | Time | LR | λ_L1 | Batch | Notes |
|---|---|---|---|---|---|---|---|
| U256_Flower_1 | UNet-256 | 300 | 14h 4m | 0.0002 | 100 | 85 | Discriminator flatline |
| U256_Flower_2 | UNet-256 | 150 | 2h 31m | 0.00015 | 100 | 72 | Good discriminator stability |
| U256_Flower_3 | UNet-256 | 200 | 3h 13m | 0.0002 | 50 | 70 | Low ฮป_L1 |
| U256_Flower_4 | UNet-256 | 250 | 9h 18m | 0.0001 | 50 | 70 | Low learning rate |
| U256_Flower_5 | UNet-256 | 100 | 3h 44m | 0.00025 | 65 | 70 | Quick training |
| U256_Flower_6 | UNet-256 | 250 | 9h 19m | 0.00022 | 85 | 70 | Severe collapse |
| U256_Flower_7 | UNet-256 | 50 | 1h 53m | 0.0008 | 85 | 70 | High LR experiment |
| U256_Flower_8 | UNet-256 | 75 | 2h 48m | 0.0015 | 85 | 70 | Extreme LR |
| U256_Flower_9 | UNet-256 | 120* | 13h 8m | 0.0002 | 100 | 1 | Batch size = 1 |
| U256_Flower_10 | UNet-256 | 210* | 22h 18m | 0.0002 | 80 | 1 | Batch size = 1 |
| U256_Flower_11 | UNet-256 | 200 | 20h 56m | 0.0004 | 100 | 1 | Label smoothing + noise |
| U256_Flower_12 | UNet-256 | 200 | 7h 39m | 0.0003 | 100 | 32 | Regularization techniques |
| R9_Flower_13 | ResNet-9 | 50* | 7h 15m | 0.0032 | 100 | 16 | High LR instability |
| R9_Flower_14 | ResNet-9 | 107* | 15h 39m | 0.0002 | 75 | 8 | LSGAN + no dropout |
| U256_Flower_15 | UNet-256 | 200 | 21h 16m | 0.0002 | 100 | 1 | RGB input (experimental) |
* = Training stopped early
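A quick way to compare runs in the table above is epochs per hour, e.g. U256_Flower_2 (150 epochs in 2h 31m) versus the batch-size-1 run U256_Flower_9 (120 epochs in 13h 8m):

```python
def epochs_per_hour(epochs, hours, minutes):
    """Training throughput in epochs/hour from the table's Time column."""
    return epochs / (hours + minutes / 60)

print(round(epochs_per_hour(150, 2, 31), 1))   # U256_Flower_2, batch 72
print(round(epochs_per_hour(120, 13, 8), 1))   # U256_Flower_9, batch 1
```

The roughly 6× throughput gap illustrates why the batch-size-1 experiments dominate total GPU time.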
For detailed training commands and loss curves, see docs/MODEL_REFERENCE.md
- Flask - Web framework
- PyTorch - Deep learning framework
- torchvision - Image transformations
- Pillow - Image processing
- NumPy - Numerical operations
- HTML5 Canvas - Drawing interface
- Vanilla JavaScript - No framework dependencies
- CSS3 - Modern styling with gradients and animations
- Input: 256×256 grayscale sketch
- Output: 256×256 RGB photorealistic image
- Inference Time: ~100ms (GPU) / ~1-2s (CPU)
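These latencies translate directly into interactive throughput; a tiny helper (the 1.5s CPU figure is the midpoint of the ~1-2s range quoted above):

```python
def throughput(latency_seconds):
    """Images per second achievable at a given per-image inference latency."""
    return 1.0 / latency_seconds

print(throughput(0.100))  # GPU: ~10 images/s, enough for auto-generate while drawing
print(throughput(1.5))    # CPU: well under 1 image/s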
- docs/WEB_APP_README.md - Detailed web application usage guide
- docs/ARCHITECTURE.md - System architecture and data flow
- docs/MODEL_REFERENCE.md - Complete reference for all 15 trained models
- docs/RELEASE_GUIDE.md - Guide for creating and publishing releases
- Progress Tracker/ - Training logs and development progress
The models were trained on a custom dataset of flower images with the following characteristics:
- Dataset Size: 9.5k manually curated image pairs (cleaned from original 12.5k)
- Resolution: 256×256 pixels
- Format: Aligned paired images (sketch | photo)
- Hardware: NVIDIA A2000 12GB GPU (courtesy of IoT Lab, KIIT)
- Framework: PyTorch with pix2pix implementation
- Training Time: 50-300 epochs (1h 53m to 22h 18m per model)
- Total Experiments: 15 model variants with different hyperparameters
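The "aligned paired images (sketch | photo)" format used by the pix2pix framework stores each pair as a single side-by-side image that the loader splits down the middle at load time. A dependency-free sketch on a row-major nested-list "image" (a toy stand-in for the real 512×256 files):

```python
def split_aligned_pair(combined):
    """Split a side-by-side (A|B) aligned image into its two halves.

    `combined` is a list of pixel rows; A (the sketch) is the left half
    of each row, B (the target photo) is the right half.
    """
    half = len(combined[0]) // 2
    sketch = [row[:half] for row in combined]
    photo = [row[half:] for row in combined]
    return sketch, photo
```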
Training GANs proved inherently complex due to adversarial dynamics. Key observations:
- Discriminator Collapse: Persistent flatlining of discriminator losses across most models
- GAN Instability: Multi-objective adversarial optimization creates a non-convex loss landscape
- Data Quality: Edge thickness inconsistencies and natural lighting variations
- Color Bias: Dataset overrepresentation of yellow/white flowers
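One mitigation tried against discriminator flatlining (see U256_Flower_11's "label smoothing + noise" in the table) is one-sided label smoothing: the discriminator's "real" target is softened from 1.0 so it cannot become overconfident. A minimal sketch (the 0.1 smoothing amount is a common choice, not necessarily what was used here):

```python
def smooth_real_labels(labels, smoothing=0.1):
    """One-sided label smoothing: real targets 1.0 -> 1.0 - smoothing;
    fake targets (0.0) are deliberately left untouched."""
    return [l - smoothing if l == 1.0 else l for l in labels]
```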
For complete training details and analysis, see docs/MODEL_REFERENCE.md and Progress Tracker/Shubham.md
This is an academic project. For questions or suggestions:
- Open an issue on GitHub
- Check existing documentation
- Review the Progress Tracker for known limitations
- Custom Code (app/, docs/, etc.): MIT License
- pix2pix Framework: BSD License - See pix2pix/THIRD_PARTY_LICENSES.txt
- Trained Models: Created by this project (MIT License)
- Output quality varies with sketch complexity
- Some models show a bias toward yellow/white flowers, likely overfitting to the dataset's color imbalance
- Texture details may appear stylized rather than photorealistic
- Some models give their best results on clear, simple flower sketches
- Dropout can drastically improve output in some instances while ruining colors/texture in others
- Expand dataset with more diverse flower colors and species
- Experiment with higher resolution models (512ร512) and upscaling techniques
- Test alternative loss functions (Wasserstein GAN, Hinge loss)
- Add progressive rendering for better UX on slower hardware
- Implement model quantization for faster CPU inference
- Mobile-responsive UI for tablet/phone drawing
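The quantization item above can be sketched as affine int8 quantization: map float weights to 8-bit integers with a per-tensor scale, then dequantize at compute time. A toy illustration of the idea (a real deployment would use something like `torch.quantization` or ONNX Runtime rather than hand-rolled code):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ≈ q * scale, q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero tensors
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]
```

The quantization error per weight is bounded by the scale, which is what makes int8 inference a reasonable CPU speed/accuracy trade-off.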
A project made in collaboration:

| Contributor | Roles |
|---|---|
| Shubham | Model Training, System Architecture, Dataset Curation, Report & Presentation |
| Sourajit | Frontend and Design, Dataset Curation, Report & Presentation |
| Tanmay | Web Scraping, Data Preprocessing, Dataset Curation, Report & Presentation |
| Shivam | Dataset Curation, Report & Presentation |
| Alok | Dataset Curation, Report & Presentation |
| Snigdha | Dataset Curation, Report & Presentation |
| Upasana | Dataset Curation, Report & Presentation |
| Urshita | Dataset Curation, Report & Presentation |
| Ankit | |
| Abhijeet | |











