Premium Synthetic Dataset

MyTube: Your Real-World Video Platform Data Playground

A ready-to-use synthetic dataset for content recommendations, viewer retention analysis, and feed personalization.

Overview

MyTube is a ready-to-use synthetic dataset that mimics the experience of a real video-sharing platform. It includes everything you'd expect—users, video uploads, likes, dislikes, comments, subscriptions, watch history, and even ad interactions. Whether you're working on content recommendations, viewer retention, or feed personalization, MyTube gives you the data to simulate real-world scenarios and test your ideas confidently.


Perfect for students, developers, and data enthusiasts, this dataset is a great way to sharpen your skills in SQL, machine learning, and backend development. Use it to build dashboards, run A/B tests, or train models for video suggestions, sentiment analysis, and creator analytics—all in a safe and privacy-friendly environment.



Full Platform Lifecycle

Simulates a full video platform lifecycle, including uploads, views, likes, dislikes, comments, and subscriptions.

Built for Development & Testing

Supports building and evaluating video recommendation systems, audience clustering, and creator performance analysis.

Backend Feature Testing

Covers backend scenarios such as watch history tracking, channel feed generation, and moderation workflows.

Rich Engagement Data

Includes data for user activity patterns, search queries, session duration, ad impressions, and engagement metrics.

Analytics & Research

Well suited to behavioral analytics, trend detection, A/B testing, and digital media research.
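
As a rough illustration of the recommendation use case mentioned above, here is a minimal sketch of a "viewers also watched" recommender. It assumes watch history can be reduced to (user, video) pairs. The sample rows, the function names, and the co-watch counting approach are all assumptions chosen for illustration; they are not taken from the MyTube dataset or its documentation.

```python
from collections import Counter, defaultdict

# Made-up watch-history rows: (user_id, video_id). Illustrative only,
# not taken from the MyTube dataset.
WATCH_ROWS = [
    (1, "v1"), (1, "v2"), (1, "v3"),
    (2, "v1"), (2, "v2"),
    (3, "v2"), (3, "v3"),
    (4, "v1"), (4, "v4"),
]


def build_co_watch_counts(rows):
    """Count, for each video, how many viewers it shares with every other video."""
    videos_by_user = defaultdict(set)
    for user_id, video_id in rows:
        videos_by_user[user_id].add(video_id)

    co_counts = defaultdict(Counter)
    for videos in videos_by_user.values():
        for a in videos:
            for b in videos:
                if a != b:
                    co_counts[a][b] += 1
    return co_counts


def recommend(video_id, co_counts, k=2):
    """Return up to k videos most often co-watched with video_id.

    Ties are broken by video id so the result is deterministic.
    """
    ranked = sorted(co_counts[video_id].items(), key=lambda item: (-item[1], item[0]))
    return [vid for vid, _ in ranked[:k]]


co_counts = build_co_watch_counts(WATCH_ROWS)
print(recommend("v1", co_counts))
```

Co-watch counting is only a simple baseline. With a real export you would likely also weight by watch duration or recency, which this sketch ignores.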

How it Works

01

AI-Generated & Fully Synthetic

The MyTube dataset is generated by AI agents and is entirely synthetic: it represents video platform interactions realistically but contains no real-world or personally identifiable data.

02

Realistic Simulation with Privacy

It simulates user accounts, video uploads, view histories, comments, likes, and content engagement behaviors informed by industry-standard trends and public consumption patterns.

03

High-Quality & Safe for Use

Built using insights from public trends and industry data, the dataset delivers structured, high-quality data suitable for recommendation systems, UI testing, and data-driven performance experiments.

Dataset Schema

A comprehensive relational model of a modern video-sharing platform, engineered for deep analysis and complex querying.

Users

Stores account information like login details, profile, and contact info.

Channels

Represents user-managed content hubs with channel name and ownership.

Channel Details

Contains extended info like banner images and social media links.

Videos

Stores video data including title, description, URLs, and view count.

Video Views

Tracks views, timestamps, and user behavior to analyze video engagement.

Likes / Dislikes

Records user feedback (likes and dislikes) for videos.

Comments

Manages video comments including text, authorship, and timestamps.

Categories

Organizes videos by content type (e.g., Education, Entertainment).

Playlists

Allows users to group videos into collections for easier viewing or curation.

Payments

Records user payments, such as subscriptions or channel features.

Subscriptions

Tracks user subscriptions to channels for personalized feeds and notifications.

Shorts

Manages short-form video content, ideal for quick consumption.

Stories

Holds temporary content (stories): short-lived videos that drive quick engagement.

Video Tags

Maintains tags for content labeling, aiding search and discovery.

Playlist Videos

Manages the relationship between playlists and the videos they contain.
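
To make the relationships between these tables more concrete, the sketch below models a small slice of the schema (Users, Channels, Videos, Playlists, Playlist Videos) in an in-memory SQLite database. Every column name, key, and sample row is hypothetical and chosen for illustration; the actual MyTube column names and types may differ.

```python
import sqlite3

# Hypothetical column names throughout; the real MyTube schema may differ.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL
);
CREATE TABLE channels (
    channel_id INTEGER PRIMARY KEY,
    owner_id   INTEGER NOT NULL REFERENCES users(user_id),
    name       TEXT NOT NULL
);
CREATE TABLE videos (
    video_id   INTEGER PRIMARY KEY,
    channel_id INTEGER NOT NULL REFERENCES channels(channel_id),
    title      TEXT NOT NULL,
    view_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE playlists (
    playlist_id INTEGER PRIMARY KEY,
    owner_id    INTEGER NOT NULL REFERENCES users(user_id),
    name        TEXT NOT NULL
);
-- Junction table: one row per (playlist, video) pair.
CREATE TABLE playlist_videos (
    playlist_id INTEGER NOT NULL REFERENCES playlists(playlist_id),
    video_id    INTEGER NOT NULL REFERENCES videos(video_id),
    position    INTEGER NOT NULL,
    PRIMARY KEY (playlist_id, video_id)
);
"""


def build_demo_db():
    """Create the demo tables and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ana"), (2, "ben")])
    conn.executemany(
        "INSERT INTO channels VALUES (?, ?, ?)",
        [(10, 1, "Ana Cooks"), (11, 2, "Ben Builds")],
    )
    conn.executemany(
        "INSERT INTO videos VALUES (?, ?, ?, ?)",
        [(100, 10, "Pasta", 120), (101, 10, "Soup", 80), (102, 11, "Shelf", 50)],
    )
    conn.execute("INSERT INTO playlists VALUES (?, ?, ?)", (500, 2, "Weeknight ideas"))
    conn.executemany(
        "INSERT INTO playlist_videos VALUES (?, ?, ?)",
        [(500, 101, 1), (500, 100, 2), (500, 102, 3)],
    )
    conn.commit()
    return conn


def playlist_contents(conn, playlist_id):
    """Return (video title, channel name) pairs for a playlist, in playlist order."""
    return conn.execute(
        """
        SELECT v.title, c.name
        FROM playlist_videos AS pv
        JOIN videos   AS v ON v.video_id = pv.video_id
        JOIN channels AS c ON c.channel_id = v.channel_id
        WHERE pv.playlist_id = ?
        ORDER BY pv.position
        """,
        (playlist_id,),
    ).fetchall()


conn = build_demo_db()
for title, channel in playlist_contents(conn, 500):
    print(f"{title} ({channel})")
```

The same pattern, a junction table joined back to its two parent tables, would apply to any other many-to-many relationship in the dataset, for example users and the channels they subscribe to.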

Available formats
  • CSV
  • JSON
  • Excel
Supported databases
  • MySQL
  • PostgreSQL
  • SQL Server
Cloud access
  • Snowflake