Learn how to extract text from the soup!

Text extraction

In this example we will learn how to extract text using BS4 (Beautiful Soup 4). We will use the following HTML file:


Extract header tag, heading and lists

We use the BS4 module to get the text of three HTML tags (head, h6 and li). The HTML file above is given as the standard input. Lines 5-6 extract the data from stdin and save it in the list variable data.
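The approach described above can be sketched roughly as follows. This is a minimal illustration, not the lesson's own listing: the SAMPLE_HTML string is a hypothetical stand-in for index.html, and the helper names read_lines and extract_text are assumptions made for this example. It also assumes the bs4 package is installed and uses Python's built-in "html.parser".

```python
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson's index.html (not the original file).
SAMPLE_HTML = """<html>
<head><title>Sample page</title></head>
<body>
<h6>A small heading</h6>
<ul>
<li>First item</li>
<li>Second item</li>
</ul>
</body>
</html>
"""


def read_lines(stream):
    """Read every line of a file-like object into a list.

    Passing sys.stdin here would mirror the stdin-based setup in the text.
    """
    return stream.readlines()


def extract_text(data):
    """Return the text of the head tag, all h6 tags, and all li tags."""
    # Join the list of lines back into one document before parsing.
    soup = BeautifulSoup("".join(data), "html.parser")

    # soup.head is None when the document has no <head> element.
    head_text = soup.head.get_text(strip=True) if soup.head else ""

    # find_all returns every matching tag; get_text drops the markup.
    headings = [tag.get_text(strip=True) for tag in soup.find_all("h6")]
    items = [tag.get_text(strip=True) for tag in soup.find_all("li")]
    return head_text, headings, items


# An in-memory stream stands in for standard input in this sketch.
data = read_lines(io.StringIO(SAMPLE_HTML))
head_text, headings, items = extract_text(data)

print(head_text)
print(headings)
print(items)
```

To run it against a real file through standard input, replace io.StringIO(SAMPLE_HTML) with sys.stdin (after importing sys) and pipe the file in, for example: python extract.py < index.html, where extract.py is a placeholder script name.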

Python (3.7.3)
You can download the zipped index.html file below:


