A small but useful HTML parser in Python that lets you select a specific tag by its class.

himmelarthur/html-parser

html-parser

Purpose

html-parser is a tiny Python module that can parse any HTML webpage (fetched thanks to Python's urllib2) in order to find a particular tag by its name, or by its name and its class.

The parser uses a stack to take advantage of HTML's nested structure: each opening tag is pushed onto the stack together with its position in the string, and popped when its closing counterpart is encountered. The opening and closing positions delimit the tag's content, which is stored in a list of hashtables. Single-block tags (such as img, input, etc.) are stored with their attributes as their content.
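The stack mechanism described above can be sketched roughly as follows. This is not the module's actual source: the function name `find_tags`, the regular expressions, the dictionary keys, and the list of void tags are all assumptions made for illustration, and the sketch ignores edge cases such as tags inside comments or `>` characters inside attribute values.

```python
import re

# Tags that never have a closing counterpart (assumed list of HTML void elements).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Captures: optional "/" (closing tag), tag name, raw attribute text.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>")
CLASS_RE = re.compile(r"""(?:^|\s)class\s*=\s*["']([^"']*)["']""")


def _matches(name, attrs, tag, class_name):
    """True if the tag has the wanted name and, if requested, the wanted class."""
    if name != tag.lower():
        return False
    if class_name is None:
        return True
    found = CLASS_RE.search(attrs)
    return bool(found) and class_name in found.group(1).split()


def find_tags(html, tag, class_name=None):
    """Hypothetical stand-in for the parser: returns a list of dicts."""
    stack = []    # entries: (tag name, attribute text, index just after the opening tag)
    results = []
    for m in TAG_RE.finditer(html):
        closing, name, attrs = m.group(1), m.group(2).lower(), m.group(3)
        if closing:
            # Find the most recent opening tag with the same name; ignore stray closers.
            idx = next((i for i in range(len(stack) - 1, -1, -1)
                        if stack[i][0] == name), None)
            if idx is None:
                continue
            _, open_attrs, start = stack[idx]
            del stack[idx:]  # pop it, along with any unclosed tags above it
            if _matches(name, open_attrs, tag, class_name):
                results.append({"tag": name,
                                "attributes": open_attrs.strip(),
                                "content": html[start:m.start()]})
        elif name in VOID_TAGS or attrs.rstrip().endswith("/"):
            # Single-block tag: its attributes serve as its content.
            attrs = attrs.rstrip().rstrip("/").strip()
            if _matches(name, attrs, tag, class_name):
                results.append({"tag": name, "attributes": attrs, "content": attrs})
        else:
            stack.append((name, attrs, m.end()))
    return results


sample = ('<div class="box main"><p>Hello</p>'
          '<img src="a.png" class="pic"><div>inner</div></div>')
boxes = find_tags(sample, "div", "box")
```

Because a tag is only recorded when its closing counterpart is reached, nested matches appear in the result list before the tags that enclose them.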

The module consists of a single function, getInsideHTMLTag, which takes one to three arguments:

  • An HTML string: the content of an HTML file, obtained from a string, a remote website, or a local file
  • A tag name (such as a, div, span, iframe, etc.)
  • A class name (optional): allows you to filter the tags by a specific class
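As a rough illustration of the first argument, the sketch below shows three ways an HTML string might be obtained before being handed to the parser. The helper names `html_from_file` and `html_from_url` are hypothetical, and the example uses Python 3's `urllib.request` (the successor to `urllib2`), which is an assumption about the runtime rather than something this README specifies.

```python
from pathlib import Path
from urllib.request import urlopen


def html_from_file(path, encoding="utf-8"):
    """Read a local HTML file into a string (hypothetical helper)."""
    return Path(path).read_text(encoding=encoding)


def html_from_url(url, encoding="utf-8", timeout=10):
    """Download a remote page and decode it to a string (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode(encoding, errors="replace")


# 1. Directly from a string literal.
html_from_string = "<div class='note'>inline markup</div>"

# 2. From a local file:      html = html_from_file("page.html")
# 3. From a remote website:  html = html_from_url("https://example.com/")
#
# The resulting string would then be passed as the first argument, followed by
# the tag name and the optional class name, in the order listed above, e.g.
# getInsideHTMLTag(html, "div", "note"). The exact import path is not given in
# this README, so that call is shown only as a comment.
```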
