Training Data Generation Workflow
This workflow generates training data. Its output consists of 400x400 px tif files together with annotations.txt, train_labels.csv and test_labels.csv. Each CSV file lists the filename (with a jpg extension), the bounding box (xmin, ymin, xmax, ymax) and the class name. For the workflow to complete successfully, the tif file name and the shape file name must be exactly the same. Each output filename is the tif file name with '_counter' appended. If the input tif is already 400x400 px, only the filename changes: _000 is appended, because the counter is zero. The config file (training_data_generation_config.json) is essential to this workflow, as it contains information about the files to process.
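The naming rules above can be illustrated with a minimal Python sketch. It is not the code in generate_training_data.py. The helper names (stems_match, tile_name, tile_names), the three-digit zero padding (inferred from the _000 example), the row-major tile order and the handling of partial edge tiles are all assumptions made for illustration.

```python
from pathlib import Path

TILE_SIZE = 400  # tile edge in pixels, as stated in the description


def stems_match(tif_path, shp_path):
    """Return True when the tif and the shape file share the same base name."""
    return Path(tif_path).stem == Path(shp_path).stem


def tile_name(tif_path, counter, ext=".tif"):
    """Build <tif_file_name>_<counter>; three-digit zero padding is assumed."""
    return f"{Path(tif_path).stem}_{counter:03d}{ext}"


def tile_names(tif_path, width, height):
    """List tile names for a width x height image.

    Assumes one counter per 400x400 cell and that partial cells at the
    right and bottom edges still produce a tile.
    """
    cols = -(-width // TILE_SIZE)   # ceiling division
    rows = -(-height // TILE_SIZE)
    return [tile_name(tif_path, i) for i in range(rows * cols)]
```

With these assumptions, a 400x400 input named site1.tif yields a single output, site1_000.tif, which matches the case described above where only the filename changes.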
Note: The filename in s3 should not contain special characters such as ':', because the s3 event delivers such a filename in a changed format and the process will fail.
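A filename can be checked before it is uploaded. The sketch below is illustrative only and is not part of this repository: the function name is_safe_s3_filename and the whitelist of allowed characters (letters, digits, dot, underscore, hyphen) are assumptions. The last helper shows one likely reason for the failure, namely that a ':' is percent-encoded when a key is URL-encoded.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import quote_plus

# Assumed whitelist: letters, digits, dot, underscore and hyphen.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_s3_filename(key):
    """Return True when the last path component of an S3 key uses only whitelisted characters."""
    name = PurePosixPath(key).name
    return bool(SAFE_NAME.fullmatch(name))


def url_encoded(name):
    """Show how a name looks after URL encoding, e.g. ':' becomes '%3A'."""
    return quote_plus(name)
```

For example, is_safe_s3_filename("input/site:01.tif") returns False, so such a file could be renamed before upload rather than failing later in the workflow.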

ms_training_data_generation/
- Dockerfile
- .env
- lambda_function.py
- requirements.txt
- src
  - training_data_generation_wrapper.py
  - training_data_generation_config.json
  - upload.py
  - utils
    - generate_training_data.py
    - generate_train_test_split_from_csv.py
Python: 3.6.9
OS: Ubuntu
Required packages: AWS CLI, requirements.txt
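The prerequisites listed above can be verified with a small script before setting anything up. This is a hedged sketch, not something shipped with the workflow: the function name missing_prerequisites is hypothetical, and treating any 3.6.x interpreter as acceptable is an assumption.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 6)  # major.minor of the 3.6.9 listed above (3.6.x assumed acceptable)


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; the list is empty when everything is in place."""
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append("Python %d.%d found, expected 3.6.x" % major_minor)
    # shutil.which returns None when the executable is not on PATH.
    if which("aws") is None:
        problems.append("AWS CLI ('aws') not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The version and lookup function are passed in as parameters so the check can be exercised without changing the machine it runs on.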
Open a terminal and move to the directory in which you want to set everything up. Make sure the Python version listed above and the AWS CLI are set up on your machine.
- Install the virtualenv package using pip3: "pip3 install virtualenv"
- Create a virtualenv inside the directory so that the packages required for this workflow are installed separately: "python3 -m virtualenv venv" (venv is the name of the virtualenv).
- Activate the venv: "source venv/bin/activate". "(venv)" now appears at the start of every line in your terminal, which means the environment is active and any further installation goes into this virtual environment.
- Clone the code from GitHub to the local directory: "git clone https://github.com/krakchris/TreeTect.git"
- Move to this workflow's directory, i.e. TreeTect/ms_training_data_generation
- Install the packages required for this workflow: "pip3 install -r requirements.txt"
These installation steps may vary depending on your machine. For example, python may already refer to Python 3.6.9, in which case use python instead of python3; the same goes for pip.