Information retrieval on source code through natural language queries
Updated Jan 23, 2022 - Python
CloneHunter finds duplicate code across mixed-language repositories. It runs a semantic retrieval pipeline with model inference and indexing to catch harder, non-trivial duplicate patterns.
Semantic code search using the GPT-2 small model as both the code encoder and the query encoder
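The projects listed here share one basic idea: encode code snippets and natural-language queries into the same vector space, then rank snippets by similarity to the query. The following is a minimal sketch of that idea, not the implementation of any listed repository. The `embed` function is a toy bag-of-tokens stand-in for a learned encoder such as GPT-2 small, and all names (`embed`, `cosine`, `search`, `SNIPPETS`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy encoder (assumption): token counts instead of a learned model.

    Splits camelCase and snake_case so identifiers share tokens with queries.
    """
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, snippets, top_k=1):
    """Return the top_k snippets ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:top_k]


# Hypothetical corpus, for illustration only.
SNIPPETS = [
    "def read_json_file(path): return json.load(open(path))",
    "def sort_list_descending(items): return sorted(items, reverse=True)",
    "def send_http_request(url): return requests.get(url)",
]

if __name__ == "__main__":
    print(search("read a json file", SNIPPETS)[0])
```

Swapping `embed` for a neural encoder that returns dense vectors would leave the ranking step unchanged, which is why a single model can serve as both the code and the query encoder.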