Text extraction
In this example we will learn how to extract text using BS4. We will use the following HTML file:

Extract head tag, heading and list items
We use the BS4 module to get three HTML tags (head, h6, and li). The HTML file above is given as the standard input. Lines 5-6 extract the data from stdin and save it into the list variable data.
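A minimal sketch of this kind of extraction, assuming the `bs4` package is installed and Python's built-in `html.parser` backend is used. The function name `extract_text` and the inline `SAMPLE_HTML` snippet are hypothetical stand-ins for illustration, so the line numbers mentioned above do not apply to this sketch.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML file; not the tutorial's index.html.
SAMPLE_HTML = (
    "<html><head><title>Sample page</title></head>"
    "<body><h6>Small heading</h6>"
    "<ul><li>first</li><li>second</li></ul></body></html>"
)


def extract_text(html):
    """Return the text of every <head>, <h6>, and <li> element, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all accepts a list of tag names and matches any of them.
    return [tag.get_text(strip=True) for tag in soup.find_all(["head", "h6", "li"])]


# To read the page from standard input instead, as the text describes,
# pass sys.stdin.read() (after `import sys`) in place of SAMPLE_HTML.
for text in extract_text(SAMPLE_HTML):
    print(text)
```

Passing a list of names to `find_all` keeps the matches in the order they appear in the page, which is usually what is wanted when the output should mirror the document.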
Python (3.7.3)
Data download
You can download the zipped index.html file below: