Photogram is a realistic, synthetic dataset that mirrors how modern image-sharing platforms work. It captures users, posts, likes, comments, hashtags, and timelines, which makes it well suited to building, testing, and analyzing social media features. Whether you're training recommendation systems, experimenting with content feeds, or simulating user behavior, Photogram gives you a practical, data-rich playground.
Ideal for developers, analysts, and students alike, Photogram helps validate backend features, model user engagement, and even test moderation systems. Its detailed, time-stamped records and relational structure make it great for honing SQL and data modeling skills, building dashboards, or studying how content goes viral.
Highlights:
- Simulates an end-to-end social media experience, including posts, comments, likes, shares, and user interactions.
- Ideal for machine learning applications like recommendation engines, bot detection, and content ranking systems.
- Enables testing and development of social media backend features such as feeds, follower logic, and notification triggers.
- Supports time-series and behavioral analysis for engagement modeling, trend tracking, and sentiment insights.
- Great for database design practice, advanced SQL exercises, and building ETL pipelines with dynamic user activity.
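As a rough illustration of the time-series analysis mentioned above, the sketch below counts like events per calendar day. The tuple layout `(user_id, post_id, created_at)` and the sample rows are assumptions made for this example; they are not taken from the actual Photogram schema or data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical like events as (user_id, post_id, ISO-8601 timestamp).
# These rows are made-up placeholders, not actual Photogram records.
SAMPLE_LIKES = [
    (1, 10, "2024-01-01T09:15:00"),
    (2, 10, "2024-01-01T18:40:00"),
    (3, 11, "2024-01-02T07:05:00"),
]


def likes_per_day(likes):
    """Return a dict mapping each calendar day (YYYY-MM-DD) to its like count."""
    counts = Counter()
    for _user_id, _post_id, created_at in likes:
        day = datetime.fromisoformat(created_at).date()
        counts[day.isoformat()] += 1
    # Sort by day so the result reads as a simple time series.
    return dict(sorted(counts.items()))


daily = likes_per_day(SAMPLE_LIKES)
print(daily)
```

The same bucketing idea extends to hourly or weekly windows by changing how the timestamp is truncated.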
The Photogram dataset replicates the structure of a social media photo-sharing platform. It simulates user engagement, content creation, messaging, interactions, and account personalization, making it ideal for database design, API testing, backend development, and social media behavior analysis.
Key tables in the dataset include:
- Users: Stores user credentials and profile details (bio, gender, profile picture).
- Account Center: Manages user settings like social media links, login preferences, visibility, and subscriptions.
- Posts: Stores user-generated image posts, including URLs, captions, locations, and the user who posted.
- Saved Posts: Tracks posts saved by users for later viewing.
- Stories: Stores time-limited content (stories) with images and user metadata.
- Story Views: Logs who viewed which stories and when.
- Likes: Records which users liked which posts and when.
- Followers: Captures user-following relationships.
- Messages: Handles private direct messages between users, including sender, receiver, and content.
- Message Attachments: Stores media or file attachments linked to messages.
- Notifications: Logs notifications sent to users (e.g., new followers, likes, comments).
- Hashtags: Maintains a catalog of hashtags used across the platform for content discovery.
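To show how a few of these tables might relate, here is a minimal sketch that builds a follower-based feed with like counts in an in-memory SQLite database. All column names (`follower_id`, `followee_id`, `created_at`, and so on), the helper `feed_for`, and the sample rows are assumptions for illustration; the real Photogram schema may differ.

```python
import sqlite3

# Hypothetical, simplified versions of four tables from the list above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    caption TEXT,
    created_at TEXT NOT NULL
);
CREATE TABLE followers (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE likes (
    user_id INTEGER NOT NULL REFERENCES users(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    created_at TEXT NOT NULL,
    PRIMARY KEY (user_id, post_id)
);
""")

# Made-up placeholder rows, not actual Photogram records.
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "ana"), (2, "ben"), (3, "cy")])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    (10, 2, "sunrise", "2024-01-01T08:00:00"),
    (11, 3, "coffee", "2024-01-02T09:00:00"),
    (12, 2, "sunset", "2024-01-03T19:00:00"),
])
conn.executemany("INSERT INTO followers VALUES (?, ?)", [(1, 2), (3, 2)])
conn.executemany("INSERT INTO likes VALUES (?, ?, ?)", [
    (1, 10, "2024-01-01T09:00:00"),
    (3, 10, "2024-01-01T10:00:00"),
    (1, 12, "2024-01-03T20:00:00"),
])


def feed_for(conn, user_id):
    """Posts from accounts that user_id follows, newest first, with like counts."""
    return conn.execute("""
        SELECT p.id, u.username, p.caption, COUNT(l.user_id) AS like_count
        FROM followers AS f
        JOIN posts AS p ON p.user_id = f.followee_id
        JOIN users AS u ON u.id = p.user_id
        LEFT JOIN likes AS l ON l.post_id = p.id
        WHERE f.follower_id = ?
        GROUP BY p.id, u.username, p.caption, p.created_at
        ORDER BY p.created_at DESC
    """, (user_id,)).fetchall()


feed = feed_for(conn, 1)
for row in feed:
    print(row)
```

The `LEFT JOIN` on likes keeps posts with zero likes in the feed, which an inner join would silently drop.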
The Photogram dataset is synthetically generated using generative AI models and is designed to simulate a real-world social media photo-sharing environment. It replicates user behavior such as photo uploads, likes, comments, follower interactions, and tagging, without containing any real or identifiable user data. Our simulation agents follow behavioral patterns inspired by publicly documented social media trends and app usage statistics. This lets developers, analysts, and researchers work with realistic data that mirrors common social media dynamics while adhering to ethical data creation practices. Because it contains no real user data, Photogram’s synthetic dataset can be used to train AI models, test engagement algorithms, or develop features for social platforms without privacy risk.
Photogram is built for data exploration and integration across a range of technical environments. Designed for both data scientists and app developers, the dataset can be used for anything from modeling social engagement to testing media-based applications. Whether you're working locally or in the cloud, it is available in the following formats:
- Available file formats: CSV, JSON, Excel
- Available databases: MySQL, PostgreSQL, SQL Server
- Cloud database access: Snowflake
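As one possible starting point for the CSV format, the sketch below loads post records into a local SQLite table using only the Python standard library. The header (`id,user_id,caption,location`), the helper `load_posts`, and the inline sample rows are assumptions for illustration; check the actual export for its real column names.

```python
import csv
import io
import sqlite3

# Stand-in for a CSV export. The header and rows are illustrative
# placeholders, not actual Photogram data.
SAMPLE_POSTS_CSV = """id,user_id,caption,location
10,2,sunrise,Lisbon
11,3,coffee,
"""


def load_posts(csv_file, conn):
    """Load post rows from an open CSV file object; return the row count."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS posts (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            caption TEXT,
            location TEXT
        )
    """)
    reader = csv.DictReader(csv_file)
    rows = [
        # Empty location strings are stored as NULL.
        (int(r["id"]), int(r["user_id"]), r["caption"], r["location"] or None)
        for r in reader
    ]
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_posts(io.StringIO(SAMPLE_POSTS_CSV), conn)
missing_location = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE location IS NULL"
).fetchone()[0]
print(loaded, missing_location)
```

With a real export, the same function would take a file opened with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` stand-in.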