The BBC Scraper extracts and organizes news content from the British Broadcasting Corporation (BBC) website. It gathers current news articles, headlines, and related information such as topics, images, and links to related articles.
Data Point | Description |
---|---|
Article Title | The headline of the news article |
Publication Date | When the article was published or last updated |
Author | The writer or contributor of the article |
Content | The main body of the news article |
Category | The section or topic of the news (e.g., Politics, Technology, Sports) |
URL | The web address of the article |
The BBC Scraper is useful across a range of sectors.
The BBC Scraper API extracts news articles and related information from BBC.com. Developers can use it to integrate BBC news content into their applications, conduct news analysis, or monitor current events and trends.
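To use the BBC Scraper API, authenticate each request with your API key. The API provides two endpoints for retrieving news information: news, which fetches a single article by its URL, and news-by-keyword, which returns articles matching a search keyword.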
Include your API key in the request headers:
Authorization: Bearer YOUR_API_KEY
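As a minimal sketch of the header above, the helper below builds the request headers without hard-coding the key in source. The function name `auth_headers` and the environment variable `BBC_SCRAPER_API_KEY` are assumptions made for illustration; the API itself only requires the `Authorization: Bearer` header.

```python
import os


def auth_headers(api_key=None):
    """Build the request headers for the BBC Scraper API.

    Falls back to the BBC_SCRAPER_API_KEY environment variable, which is
    a hypothetical name chosen for this example.
    """
    key = api_key or os.environ.get("BBC_SCRAPER_API_KEY")
    if not key:
        raise RuntimeError("No API key supplied and BBC_SCRAPER_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling `auth_headers("YOUR_API_KEY")` returns a dictionary that can be passed as the `headers` argument of an HTTP client.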
The request body should be a JSON object with the following structure:
// For news endpoint
{
  "url": "https://www.bbc.com/news/articles/article-id"
}

// For news-by-keyword endpoint
{
  "keyword": "search term"
}
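The two request bodies above differ only in their single key, so a small helper can build either one and catch obvious mistakes before a request is sent. This is a sketch: the name `build_payload` and the client-side checks are assumptions, not rules documented by the API.

```python
from urllib.parse import urlparse


def build_payload(url=None, keyword=None):
    """Return the JSON body for the news (url) or news-by-keyword (keyword) endpoint."""
    # Exactly one of the two arguments must be given.
    if (url is None) == (keyword is None):
        raise ValueError("Provide exactly one of url or keyword")

    if url is not None:
        parsed = urlparse(url)
        # Client-side sanity check only; the API may apply stricter rules.
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"Not an absolute HTTP(S) URL: {url!r}")
        return {"url": url}

    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword must not be empty")
    return {"keyword": keyword}
```

The returned dictionary can be passed directly as the JSON body of the POST request.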
Please note that usage is subject to rate limiting. Refer to your plan details for specific limits.
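The source does not say how the API signals that a rate limit has been reached. The sketch below assumes the conventional HTTP 429 status code and retries with exponential backoff; the function name `post_with_backoff`, the attempt count, and the delays are illustrative choices rather than documented behavior.

```python
import time


def post_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is a zero-argument callable returning an object with a
    status_code attribute (for example, a requests.Response).
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        # Assumption: the API reports rate limiting with HTTP 429.
        if response.status_code != 429:
            return response
        # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return response


# Example wiring with requests (not executed here):
# response = post_with_backoff(
#     lambda: requests.post(url, headers=headers, json=data, timeout=30)
# )
```

Taking the request as a callable keeps the retry logic independent of any particular HTTP client, and the injectable `sleep` makes it easy to exercise without real delays.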
Here's an example of the data you can expect to receive:
[
  {
    "input": {
      "url": "https://www.bbc.com/news/articles/cx29rq6qwezo",
      "keyword": ""
    },
    "id": "cx29rq6qwezo",
    "url": "https://www.bbc.com/news/articles/cx29rq6qwezo",
    "author": "BBC",
    "headline": "Missouri's top court allows vote on abortion rights",
    "topics": [
      "Missouri",
      "US Supreme Court",
      "US abortion debate",
      "United States"
    ],
    "publication_date": "2024-09-10T19:50:56.795Z",
    "content": "Missouri's top court on Tuesday ruled that a proposed abortion rights amendment to the state constitution will appear on the ballot in November...",
    "videos": [],
    "images": [
      {
        "image_url": "https://ichef.bbci.co.uk/images/ic/1920xn06c7/live/aae3eed0-6fab-11ef-862e-2bfc5e255dff.jpg",
        "image_description": "Main image"
      }
    ],
    "related_articles": [
      {
        "article_title": "US confirms first human bird flu case with no known animal exposure",
        "article_url": "https://www.bbc.com/news/articles/cy0rzqwxp7jo"
      },
      {
        "article_title": "Missouri death row inmate in plea deal to avoid execution",
        "article_url": "https://www.bbc.com/news/articles/cy9e8ypz4xxo"
      },
      {
        "article_title": "'Squad' member Cori Bush loses congressional primary",
        "article_url": "https://www.bbc.com/news/articles/cewlle7jrgdo"
      }
    ],
    "keyword": null
  }
]
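The response is a JSON array of article objects, so client code typically wants to turn each element into a typed record. The sketch below does this for the fields shown in the sample above. The `Article` class and `parse_article` function are hypothetical names, and the sketch assumes that `id`, `url`, `headline`, and `publication_date` are always present, which the source does not guarantee.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Article:
    id: str
    url: str
    headline: str
    author: str
    published: datetime
    topics: List[str]
    image_urls: List[str]
    related_urls: List[str]


def parse_article(item: dict) -> Article:
    """Convert one element of the response array into an Article."""
    # Replace the trailing "Z" so datetime.fromisoformat accepts the
    # timestamp on Python versions older than 3.11.
    published = datetime.fromisoformat(item["publication_date"].replace("Z", "+00:00"))
    return Article(
        id=item["id"],
        url=item["url"],
        headline=item["headline"],
        author=item.get("author") or "",
        published=published,
        topics=list(item.get("topics") or []),
        image_urls=[img["image_url"] for img in item.get("images") or []],
        related_urls=[rel["article_url"] for rel in item.get("related_articles") or []],
    )


# Trimmed copy of the sample response shown above.
sample_response = [
    {
        "id": "cx29rq6qwezo",
        "url": "https://www.bbc.com/news/articles/cx29rq6qwezo",
        "author": "BBC",
        "headline": "Missouri's top court allows vote on abortion rights",
        "topics": ["Missouri", "US Supreme Court"],
        "publication_date": "2024-09-10T19:50:56.795Z",
        "images": [],
        "related_articles": [
            {
                "article_title": "Missouri death row inmate in plea deal to avoid execution",
                "article_url": "https://www.bbc.com/news/articles/cy9e8ypz4xxo",
            }
        ],
    }
]

articles = [parse_article(item) for item in sample_response]
for article in articles:
    print(article.published.date(), article.headline)
```

Optional fields fall back to empty values, so an article without images or related links still parses.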
import requests
import json

# Your API Key
api_key = 'YOUR_API_KEY'

# API Endpoint for news by URL
url = 'https://taskagi.net/api/news/bbc-scraper/news'

# Headers
headers = {
    'Authorization': f'Bearer {api_key}',
    'Content-Type': 'application/json'
}

# Request Body
data = {
    'url': 'https://www.bbc.com/news/articles/cx29rq6qwezo'
}

# Send POST request
response = requests.post(url, headers=headers, json=data)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    news_info = response.json()
    # Print the news information
    print(json.dumps(news_info, indent=2))
else:
    print(f"Error: {response.status_code}")
    print(response.text)

# API Endpoint for news by keyword
url = 'https://taskagi.net/api/news/bbc-scraper/news-by-keyword'

# Request Body for keyword search
data = {
    'keyword': 'climate change'
}

# Send POST request
response = requests.post(url, headers=headers, json=data)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    news_by_keyword = response.json()
    # Print the news information
    print(json.dumps(news_by_keyword, indent=2))
else:
    print(f"Error: {response.status_code}")
    print(response.text)