Hello world! BS4

Your first BS4 program.

Hello world!

Let's begin by testing the Beautiful Soup package on a real HTML page: we will simply print the title of the scientificprogramming.io webpage.
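A minimal sketch of such a program, assuming the requests and beautifulsoup4 packages are installed. The helper names title_of and fetch_title, the timeout value, and the raise_for_status() check are illustrative additions, not part of the lesson's original listing.

```python
import requests
from bs4 import BeautifulSoup


def title_of(html):
    """Return the text of the page's <title> tag, or None if it has none."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string if soup.title is not None else None


def fetch_title(url):
    """Download the page at `url` and return its title text."""
    req = requests.get(url, timeout=10)  # timeout is an added safeguard (assumption)
    req.raise_for_status()               # stop early on HTTP errors (4xx/5xx)
    return title_of(req.text)
```

Calling print(fetch_title("https://scientificprogramming.io")) would print the page title; the exact text depends on the live site at the time of the request.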


How did it work?

First, we set up the URL of our webpage.

url = "https://scientificprogramming.io"

Next, we fetch the whole page using the Python requests module.

req = requests.get(url)

Now the work of BS4 begins. The BeautifulSoup constructor here takes two string arguments:

  • The HTML string to be parsed (req.text).
  • Optionally, the name of a parser (here html.parser, the parser from Python's standard library).

Parsing yields the final soup object:

soup = BeautifulSoup(req.text, "html.parser")
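To see what the parser argument does, here is a small sketch that parses a hypothetical fragment with unclosed tags (the broken_html string is made up for illustration). It assumes only that beautifulsoup4 is installed, since html.parser needs no extra package. Other parser names such as "lxml" or "html5lib" can be passed instead if those packages are installed, and they may repair broken markup differently.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with unclosed <p> and <b> tags.
broken_html = "<p>Hello <b>world"

# "html.parser" selects the parser from Python's standard library.
soup = BeautifulSoup(broken_html, "html.parser")

print(soup.b.string)      # text inside the <b> tag
print(soup.p.get_text())  # all text inside the <p> tag, including nested tags
```

Even though the input never closes its tags, the parser still builds a usable tree, which is why BS4 copes with the imperfect HTML found on many real pages.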

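We can navigate and search the parsed page through this soup object, for example (a lookup such as soup.p or soup.find(...) returns None when the page has no matching element):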

soup.title
soup.title.name
soup.title.string
soup.title.parent.name
soup.p
soup.p['class']
soup.a
soup.find_all('a')
soup.find(id="link3")
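The expressions above can be tried without network access on a small inline document. The HTML below is a hypothetical sample written for this sketch (it is not the scientificprogramming.io page), chosen so that every lookup, including id="link3", finds something.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used so the example runs offline.
html = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">Three links:</p>
<a href="/one" id="link1">One</a>
<a href="/two" id="link2">Two</a>
<a href="/three" id="link3">Three</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

print(soup.title)              # the whole <title> tag
print(soup.title.name)         # the tag's name: title
print(soup.title.string)       # the text inside the tag: Sample page
print(soup.title.parent.name)  # name of the enclosing tag: head
print(soup.p)                  # the first <p> tag
print(soup.p["class"])         # class values as a list: ['intro']
print(soup.a)                  # the first <a> tag
print(soup.find_all("a"))      # a list of all three <a> tags
print(soup.find(id="link3"))   # the tag whose id is "link3"
```

Attribute-style access such as soup.p or soup.a returns only the first matching tag, while find_all returns every match. On a page without the requested element these lookups return None, so chained access like soup.p["class"] should be guarded on real pages.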

Great! You have written your first BS4 scraper!