Facebook-Thailand-Engagement-Analysis

Facebook Thailand Engagement Analysis

Project Overview

This project analyzes Facebook post data from Thailand to understand how different types of content (photos, videos, text) influence user engagement metrics such as likes, shares, comments, and reactions. The analysis incorporates techniques like Principal Component Analysis (PCA), clustering, and logistic regression modeling to provide insights that could inform effective social media marketing strategies.

Dataset

The dataset consists of Facebook posts, each characterized by a post type (status_type) and engagement counts such as num_comments, num_shares, and num_likes.

Objectives

Technologies Used

Welcome to our analysis of Facebook interactions in Thailand. We examine how different types of content (photos, videos, and text) affect user engagement, using Principal Component Analysis (PCA), clustering, and logistic regression to uncover trends that can shape effective social media strategies. The dataset consists of individual Facebook posts, each with several engagement metrics.

Data Preparation

Data preparation is crucial for accurate analysis. We handle missing values and standardize features to ensure consistency across the dataset:

# Standardizing the dataset
# (unsupervised_scaler is not defined on this page; it is assumed to be a project helper)
df_scaled = unsupervised_scaler(df[['num_comments', 'num_shares', 'num_likes', ...]])
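
Because the helper above is not shown, here is a minimal sketch of what the preparation step could look like with scikit-learn's StandardScaler. The toy values, the median fill for missing counts, and the name df_filled are assumptions made for illustration, not the project's actual choices.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the real posts table; the values are made up for illustration.
df = pd.DataFrame({
    "num_comments": [4, 0, 236, 6, np.nan],
    "num_shares": [0, 0, 57, 0, 1],
    "num_likes": [432, 150, 204, 111, 213],
})

# Fill missing counts with the column median (one possible policy, assumed here).
df_filled = df.fillna(df.median())

# Standardize each column to zero mean and unit variance.
scaler = StandardScaler()
df_scaled = pd.DataFrame(
    scaler.fit_transform(df_filled),
    columns=df_filled.columns,
    index=df_filled.index,
)
print(df_scaled.round(2))
```

Standardizing matters here because PCA and K-Means are both sensitive to feature scale; without it, the metric with the largest raw counts would dominate the later steps.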

Exploratory Data Analysis (EDA)

Our initial exploration focuses on understanding the distribution of post types and their correlation with engagement metrics:

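import seaborn as sns

# Analyzing post types distribution (share of each status_type)
print(df['status_type'].value_counts(normalize=True))

# Visualizing correlations between engagement metrics
# (engagement_metrics is the list of engagement column names)
sns.heatmap(df[engagement_metrics].corr(), annot=True, cmap='coolwarm')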

We discover that photos dominate the dataset, making up 61% of the posts. Videos, though fewer, generate significantly higher engagement across all metrics.
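
A comparison like this one can be checked by grouping the engagement metrics by post type. The sketch below uses a tiny made-up table, so its numbers are illustrative only and do not reproduce the 61% figure; the names share_by_type and mean_by_type are hypothetical.

```python
import pandas as pd

# Made-up posts for illustration; the real dataset is much larger.
df = pd.DataFrame({
    "status_type": ["photo", "photo", "video", "photo", "status", "video"],
    "num_likes": [100, 150, 400, 120, 80, 350],
    "num_comments": [5, 10, 200, 8, 3, 150],
    "num_shares": [1, 2, 50, 0, 1, 40],
})
engagement_metrics = ["num_likes", "num_comments", "num_shares"]

# Share of each post type in the dataset.
share_by_type = df["status_type"].value_counts(normalize=True)

# Average engagement per post type, one row per type.
mean_by_type = df.groupby("status_type")[engagement_metrics].mean()

print(share_by_type)
print(mean_by_type)
```

On skewed count data, swapping mean() for median() in the same groupby gives a view that is less affected by a few viral posts.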

Principal Component Analysis (PCA)

To reduce dimensionality and focus on the most impactful features, we apply PCA:

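from sklearn.decomposition import PCA

# Conducting PCA on the standardized features
pca = PCA(n_components=3)
pca_results = pca.fit_transform(df_scaled)
scree_plot(pca)  # scree_plot is a plotting helper not shown on this page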

The scree plot suggests that three components explain most of the variance, providing a simplified yet powerful representation of our data.
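
The numbers behind a scree plot come from the explained variance ratio of each component. The sketch below shows how they can be read from a fitted PCA; it uses synthetic data (six made-up metrics driven by two latent factors), so the printed values say nothing about the real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration: 200 posts, 6 metrics, 2 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.1 * rng.normal(size=(200, 6))
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA with all components kept so the full variance profile is visible.
pca = PCA()
pca.fit(X_scaled)

explained = pca.explained_variance_ratio_   # variance share per component
cumulative = np.cumsum(explained)           # running total across components

for i, (e, c) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{i}: {e:.3f} (cumulative {c:.3f})")
```

A common way to choose the component count is to pick the smallest number whose cumulative share passes a chosen threshold, or to look for the point where the per-component share levels off.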

Clustering with K-Means

Identifying clusters within our data helps us understand distinct patterns of engagement:

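from sklearn.cluster import KMeans

# Finding optimal cluster count
# (inertias holds the K-Means inertia for each k in range(1, 11), computed beforehand;
# elbow_plot is a plotting helper not shown on this page)
elbow_plot(range(1, 11), inertias)

# Applying K-Means Clustering
kmeans = KMeans(n_clusters=4)
df['cluster'] = kmeans.fit_predict(pca_results)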

Analysis reveals four unique clusters, each representing a different engagement pattern, from “Low Engagement” to “High General Engagement.”
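
The inertia values fed to an elbow plot are not shown in the snippet above. One way to compute them is sketched below, using synthetic three-dimensional points in place of the real PCA scores; the blob centers, n_init, and random_state values are assumptions chosen to make the example reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the PCA scores: four well-separated blobs in 3-D.
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0], [6, 6, 0], [0, 6, 6], [6, 0, 6]])
pca_results = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 3)) for c in centers]
)

# Fit K-Means for k = 1..10 and record the within-cluster sum of squares.
ks = range(1, 11)
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pca_results)
    inertias.append(km.inertia_)

# Final clustering with the k suggested by the elbow.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(pca_results)

for k, inertia in zip(ks, inertias):
    print(k, round(inertia, 1))
```

The elbow is the k after which inertia stops dropping sharply. On real engagement data the bend is usually less clear than in this toy case, so a second criterion such as the silhouette score can help confirm the choice.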

Model Development: Logistic Regression

We build logistic regression models to predict the likelihood of a post being a photo based on engagement metrics:

from sklearn.linear_model import LogisticRegression

# Logistic regression using original, PCA, and cluster features
model1 = LogisticRegression().fit(X_train, y_train)
model2 = LogisticRegression().fit(X_train_pca, y_train)
model3 = LogisticRegression().fit(X_train_clusters, y_train)

Each model’s performance is assessed, with the original features model showing the best ability to differentiate photo from non-photo posts based on engagement.
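
The page does not say which metrics were used for the assessment. As a hedged sketch, the comparison could be run with a held-out test set, accuracy, and ROC AUC as below. The data is synthetic (non-photo posts are given higher counts by construction), only the original-feature and PCA variants are shown, and names such as is_photo and scores are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration: 400 posts, 4 count metrics.
rng = np.random.default_rng(0)
n = 400
is_photo = rng.integers(0, 2, size=n)
lam = np.where(is_photo == 1, 20.0, 60.0)[:, None] * np.ones((1, 4))
X = rng.poisson(lam).astype(float)

X_train, X_test, y_train, y_test = train_test_split(
    X, is_photo, test_size=0.25, random_state=0, stratify=is_photo
)

# Two candidate models: original features vs. PCA-reduced features.
models = {
    "original": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "pca": make_pipeline(
        StandardScaler(), PCA(n_components=2), LogisticRegression(max_iter=1000)
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # predicted probability of "photo"
    scores[name] = {
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
        "auc": roc_auc_score(y_test, proba),
    }

for name, s in scores.items():
    print(name, s)
```

Putting the scaler and PCA inside a pipeline means they are fitted on the training split only, which keeps information from the test posts out of the model.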

Engagement Insights

Insights and Actionable Strategies

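Our analysis leads to actionable insights. Photos make up 61% of posts, yet videos generate higher engagement across all metrics, which suggests prioritizing video when engagement is the goal. The four engagement clusters, from “Low Engagement” to “High General Engagement,” offer a way to segment posts by the kind of response they draw.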

Conclusion

Leveraging these insights, businesses can tailor their content strategies to maximize engagement on Facebook, optimizing for the types of interactions that best suit their goals. Our analysis not only aids strategic decision-making but also highlights the power of data-driven approaches in digital marketing. Detailed code, comments, and visualizations are available in the project repository (Facebook-Thailand-Engagement-Analysis).

Acknowledgments
