Facebook — Data Extraction

Exhibit 25.16 provides the code for retrieving Facebook posts and their comments using the Facebook Graph API and storing them in a MongoDB database. Here is a breakdown of the coding steps:

  1. Import necessary libraries: requests for making HTTP requests to the Facebook Graph API, and pymongo for interacting with the MongoDB database.
  2. Define API parameters:
    • version: Specifies the version of the Facebook Graph API to use.
    • access_token: Your Facebook access token for authentication.
    • brand_id: The ID of the Facebook page you want to retrieve posts from.
  3. Define endpoints:
    • page_url: The URL for retrieving posts from the specified page.
    • comments_url: The URL for retrieving comments for a specific post.
  4. Connect to MongoDB: Create a MongoClient instance to connect to the MongoDB server, and select the database and collections to store posts and comments.
  5. Retrieve initial posts: Make an HTTP GET request to the page_url using the requests library. Parse the JSON response to get the initial page of posts.
  6. Iterate through posts:
    • Loop through each post in the retrieved data.
    • Insert the post into the collection_posts collection.
    • Retrieve comments for the post using the comments_url and the post’s ID.
  7. Iterate through comments:
    • Loop through each comment in the retrieved comments data.
    • Add the post’s ID to the comment.
    • Insert the comment into the collection_comments collection.
    • Check if there are more pages of comments to retrieve. If so, request the next page and repeat the loop until no pages remain, then move on to the next post.
  8. Check for next page of posts: Check if there are more pages of posts to retrieve. If so, make another request to get the next page.
  9. Handle errors: Wrap the loop in a try-except block so that any exception (for example, a failed request or a response without a data field) is printed and the loop stops.
  10. Close the MongoDB connection: Close the MongoDB connection using client.close().
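
The pagination pattern in steps 7 and 8 (process the data list, then follow paging.next until it is absent) can be sketched on its own. This is a minimal illustration, not part of the exhibit: the helper name iter_items, the fetch_json callable, and the example URL are assumptions introduced here, and a dictionary of canned pages stands in for requests.get(url).json() so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterator


def iter_items(first_page: Dict, fetch_json: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every item in 'data', following 'paging.next' until no link remains."""
    page = first_page
    while True:
        for item in page.get('data', []):
            yield item
        next_url = page.get('paging', {}).get('next')
        if not next_url:
            break
        page = fetch_json(next_url)


# Canned second page (hypothetical URL), standing in for a real HTTP response
fake_pages = {
    'https://example.invalid/page2': {'data': [{'id': '3'}]},
}
first = {
    'data': [{'id': '1'}, {'id': '2'}],
    'paging': {'next': 'https://example.invalid/page2'},
}

ids = [item['id'] for item in iter_items(first, fake_pages.__getitem__)]
print(ids)  # ['1', '2', '3']
```

With live data, fetch_json could be something like lambda url: requests.get(url).json(), and the same helper would serve for both posts and comments.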

Facebook: Extracting Posts from Brand Pages via Graph API
import requests
from pymongo import MongoClient

# Define Facebook Graph API version (replace with a version your app currently supports)
version = 'v2.13'

# Access Token (Replace with your actual token)
access_token = 'YOUR_ACCESS_TOKEN'

# Define Page ID (replace the placeholder with the actual page ID)
brand_id = '{brand_id}'

# Define Endpoints
page_url = f'https://graph.facebook.com/{version}/{brand_id}/feed?fields=id,message,reactions,shares,from,caption,created_time,likes.summary(true)'
comments_url = f'https://graph.facebook.com/{version}/{{post_id}}/comments?filter=stream&limit=100'

# Connect to MongoDB
client = MongoClient('localhost:27017')
db = client.facebook
collection_posts = db.posts
collection_comments = db.comments

params = {'access_token': access_token}

# Get the first page of posts
posts = requests.get(page_url, params=params).json()

# Loop through posts and their comments
while True:
    try:
        for element in posts['data']:
            collection_posts.insert_one(element)

            this_comment_url = comments_url.format(post_id=element['id'])
            comments = requests.get(this_comment_url, params=params).json()

            # Loop through every page of comments for this post
            while True:
                for comment in comments['data']:
                    comment['post_id'] = element['id']
                    collection_comments.insert_one(comment)

                # Stop when there is no next page of comments
                if 'paging' not in comments or 'next' not in comments['paging']:
                    break

                # Fetch the next page of comments and process it
                comments = requests.get(comments['paging']['next'], params=params).json()

        # Check for next page of posts
        if 'paging' not in posts or 'next' not in posts['paging']:
            break

        posts = requests.get(posts['paging']['next'], params=params).json()

    except Exception as e:
        print(f"Error: {e}")
        break

# Close connection
client.close()

Exhibit 25.16   Facebook: Extracting Posts from Brand Pages via Graph API (Python implementation). Jupyter notebook.
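
The try-except block in the exhibit reports a failed call only indirectly, as a KeyError on the missing data field. The Graph API returns failures as a JSON object with an error key, so a small check before processing each response can surface the actual message. The sketch below is one possible way to do that; the helper name ensure_ok and the sample payloads are made up for illustration and are not part of the exhibit.

```python
def ensure_ok(payload: dict) -> dict:
    """Return the payload unchanged, or raise if it carries a Graph API error object."""
    error = payload.get('error')
    if error:
        raise RuntimeError(
            f"Graph API error {error.get('code')}: {error.get('message')}"
        )
    return payload


# Sample payloads, fabricated for illustration only
good = {'data': [{'id': '1'}]}
bad = {
    'error': {
        'message': 'Invalid OAuth access token.',
        'type': 'OAuthException',
        'code': 190,
    }
}

print(ensure_ok(good)['data'])
try:
    ensure_ok(bad)
except RuntimeError as exc:
    print(exc)
```

In the exhibit's code, each requests.get(...).json() result could be passed through such a check, so that the existing except clause prints the API's own error message.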

