Skip to content

Brandon82/astnn

 
 

Repository files navigation

ASTNN - An Equivalent Mutant Identifier

An Abstract Syntax Tree Neural Network aiming to identify equivalent mutants within a dataset of mutants.

  • Please note that the /parser folder isn't necessary and was used for additional preprocessing on our dataset before integrating it into the pipeline.
  • Our initial dataset consisted solely of unified difference strings, so we created a unified-diff parser to insert the mutations into their original programs.

Requirements

  python 3.10
  pandas 2.0.0
  gensim 4.3.1
  scikit-learn 1.2.2
  torch 2.0.0
  pycparser 2.21
  javalang 0.13

Note that the original ASTNN implementation, from which this was forked, utilized Python 3.6 and older library versions. We have updated the project to Python 3.10 and more recent library versions.

Added Features

Here's a list of changes compared to the original repository:

  • Implemented a mutant dataset from Mutantbench
    • Created a parser that inserts each mutation into its original program and returns the mutated method/file.
  • Updated to Python 3.10
  • Updated to more recent library versions
  • Added open_data.py to convert .pkl files to .csv for easier viewing
  • Updated the model to support equivalent mutant identification
  • Saved the trained model to allow for inference
  • Added test_from_trained.py to get predictions on additional data from the trained model
  • General refactoring of pipeline.py and train.py to improve code quality
  • Provided a new approach to equivalent mutant identification: using Code2Vec's code vectors with a feedforward neural network found in model_c2v.py

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 77.1%
  • Java 19.2%
  • Python 3.7%