Word Cloud (FB data) in Python

A broader perspective on visualizing social media data with Python is covered in Section Appendix — Python Visualization. This section is devoted to the application of word clouds for text analytics, using the Facebook data in fb_data.txt.

A word cloud is a visual representation of text data where word size indicates frequency or importance, highlighting key themes within the text. It simplifies complex information, making it easier to understand and analyze. For details on the creation of word clouds and their benefits, refer to Section Word Clouds in the Appendix — Python Visualization.
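To make the link between frequency and word size concrete, the following minimal sketch counts words by hand using only the standard library. The sample sentence and the short stop-word set are hypothetical and chosen for illustration; they are not taken from fb_data.txt, and the tokenization is a simplification rather than the wordcloud library's own logic.

```python
import re
from collections import Counter

# Hypothetical sample text and stop-word set, for illustration only
sample_text = "Great phone, great camera and a great price. The camera is the best."
stop_words = {"a", "and", "is", "the"}

# Lowercase the text and split it into alphabetic tokens
tokens = re.findall(r"[a-z']+", sample_text.lower())

# Count every token that is not a stop word
counts = Counter(t for t in tokens if t not in stop_words)

# The most frequent words are the ones a word cloud would draw largest
print(counts.most_common(2))
```

In a word cloud built from this text, "great" would be expected to appear larger than "camera", which in turn would appear larger than words that occur only once.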

Exhibit 25.21 provides the Python code for generating a word cloud from the Facebook data stored in the file fb_data.txt. Here is a breakdown of what it does:

  1. Importing Libraries:
    • wordcloud: This library provides tools for creating word clouds.
    • matplotlib.pyplot: This library helps visualize the word cloud.
  2. Facebook Data File: The code opens the fb_data.txt file and reads its contents into a variable called text.
  3. Setting Up Stop Words: The code imports a set of common words considered unimportant for analysis, like “the”, “and”, “a”, etc. These are stored in stopwords.
  4. Creating the Word Cloud: WordCloud is called to create an image object. We specify:
    • Size: width and height are set to 800 pixels, making it an 800 × 800 image.
    • Background: background_color is set to “white” for a clean background.
    • Stop Words: We tell the word cloud to exclude the stop words defined earlier using stopwords.
    • Minimum Font Size: min_font_size is set to 10 so that every word drawn remains legible.
    • generate(text): This call takes the string in text (the contents of fb_data.txt read in step 2) and uses it to create the word cloud.
  5. Visualizing the Word Cloud:
    • plt.figure: Creates a new figure for displaying the image.
    • plt.imshow(wordcloud): This displays the generated word cloud on the figure.
    • plt.axis("off"): Hides the x and y axes since they are not relevant for the word cloud.
    • plt.tight_layout(pad = 0): Adjusts spacing to ensure the word cloud fills the entire area without extra padding.
    • plt.show(): Finally, this line displays the generated word cloud image on your screen.

Word Cloud
from wordcloud import WordCloud, STOPWORDS  # for generating the word cloud
import matplotlib.pyplot as plt  # for displaying the image

# Open the fb_data.txt file and assign the file object to the variable f
with open('data/fb_data.txt') as f:
    # Read the whole file and assign the resulting string to the variable text
    text = f.read()

stopwords = set(STOPWORDS)  # Copy the library's built-in stop words into a new set

# Generate an 800 x 800 word cloud image from the words in text
wordcloud = WordCloud(width=800, height=800, background_color='white',
                      stopwords=stopwords, min_font_size=10).generate(text)

# Plot the word cloud image
plt.figure(figsize=(8, 8), facecolor=None)
plt.imshow(wordcloud)
plt.axis("off")
plt.tight_layout(pad=0)

plt.show()
[Output: word cloud image rendered with Matplotlib]

Exhibit 25.21   This code demonstrates how to generate a word cloud from a textual data file containing Facebook posts. Jupyter notebook.

Overall, this code reads the Facebook data, removes unimportant words (stop words), and then creates a visual representation in which the size of each word reflects how often it appears in the text.
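One practical point about the file-reading step: open() without an encoding argument falls back to the platform default, which can fail on social media text containing emoji or accented characters. The sketch below shows one way to read the file more defensively. The helper name read_text_file is hypothetical, and the assumption that the file is UTF-8 encoded is not confirmed by the exhibit.

```python
from pathlib import Path


def read_text_file(path, encoding="utf-8"):
    """Return the file's contents as one string.

    Assumes the file is UTF-8 unless told otherwise; bytes that cannot
    be decoded are replaced with U+FFFD instead of raising an error.
    """
    return Path(path).read_text(encoding=encoding, errors="replace")
```

Under these assumptions, text = read_text_file('data/fb_data.txt') could stand in for the with open(...) block in the exhibit, leaving the rest of the code unchanged.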

Another coding example for generating a word cloud is provided in Section Word Cloud in Python.

