Right now our extraction uses CyberNeko to parse HTML and provide a DOM.
However, since it does not have a JS engine, content loaded through JS is not extractable.
One way to overcome this is to interface with PhantomJS (which provides both a DOM and a JS engine) for extraction.
This will require designing an extraction engine API in BSJava, and implementing a wrapper for PhantomJS behind that API.
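
As a starting point for discussion, here is a minimal sketch of what such an engine API and a PhantomJS wrapper might look like. All names here (`ExtractionEngine`, `PhantomJsEngine`, `renderDom`, the `render.js` script) are hypothetical and not part of BSJava today. The wrapper assumes a `phantomjs` binary is available and that we supply a small script that loads the URL passed as its first argument and prints the rendered page content to stdout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical engine-neutral API: callers ask for a DOM and do not care which engine produced it. */
interface ExtractionEngine {
    /** Returns the serialized DOM (as HTML text) for the page at the given URL. */
    String renderDom(String url) throws IOException, InterruptedException;

    /** Whether the engine runs the page's scripts before returning the DOM. */
    boolean executesJavaScript();
}

/** Sketch of a wrapper that shells out to a PhantomJS binary (paths are assumptions). */
final class PhantomJsEngine implements ExtractionEngine {
    private final String binary; // assumed path to the phantomjs executable
    private final String script; // assumed path to our own script that prints the rendered page

    PhantomJsEngine(String binary, String script) {
        this.binary = binary;
        this.script = script;
    }

    /** Builds the command line: phantomjs <script> <url>. */
    List<String> command(String url) {
        return List.of(binary, script, url);
    }

    @Override
    public String renderDom(String url) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(url))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr out of the DOM text
                .start();
        // Read stdout fully before waiting, so a large page cannot block the child process.
        String dom = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("phantomjs exited with code " + exitCode);
        }
        return dom;
    }

    @Override
    public boolean executesJavaScript() {
        return true;
    }
}
```

The existing CyberNeko path could sit behind the same interface (returning `false` from `executesJavaScript()`), so callers can pick an engine per site without changing extraction code.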