uSkill: Your Real-World Online Learning & Course Data Playground
uSkill is a ready-to-use synthetic dataset that mimics the experience of a digital learning platform. It is well suited to recommendation systems, student segmentation, dropout prediction, and course performance forecasting.
Overview
uSkill is a synthetic dataset that brings the experience of a digital learning platform to life. From discovering a course to enrolling, consuming lessons, and leaving reviews, it mimics the full journey of a learner. It's a great resource for anyone looking to explore how people interact with online education, uncover learning trends, or test systems behind platforms like Udemy or Coursera.
Whether you're a developer, analyst, or data science enthusiast, uSkill gives you realistic mock data to work with: users, courses, instructors, pricing, reviews, enrollments, and more. It's ideal for building personalized course recommendations, predicting student dropouts, analyzing completion rates, or testing enrollment logic and payment flows. With this dataset, you can experiment, learn, and build confidently in the world of digital education.
Full Learning Lifecycle
Simulates a complete e-learning platform experience including course listings, lesson enrollments, student profiles, instructor management, and reviews.
Built for Development & Testing
Excellent for building and testing recommendation engines, student segmentation models, dropout prediction systems, and course completion forecasting pipelines.
Backend Feature Testing
Useful for developers working on enrollment systems, quiz engines, lesson progress tracking, and payment processing systems.
Rich Transactional Data
Includes transactional data for enrollments, cancellations, payments, coupons, and certification systems. Supports category-based analytics and engagement trend tracking.
Analytics & Research
Structured for SQL practice, data cleaning exercises, dashboard creation, performance benchmarking by category or course type, and real-world data modeling.
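As a rough illustration of the category benchmarking mentioned above, the sketch below computes a completion rate per category from a few in-memory rows. The field names (`category`, `completed`) and the sample rows are hypothetical, invented for this example; the actual uSkill columns may be named differently.

```python
from collections import defaultdict

# Hypothetical enrollment rows. Field names and values are assumptions
# for illustration, not the actual uSkill columns or data.
enrollments = [
    {"user_id": 1, "category": "Programming", "completed": True},
    {"user_id": 2, "category": "Programming", "completed": False},
    {"user_id": 3, "category": "Design", "completed": True},
    {"user_id": 4, "category": "Design", "completed": True},
]


def completion_rate_by_category(rows):
    """Return {category: fraction of enrollments marked completed}."""
    totals = defaultdict(int)
    completed = defaultdict(int)
    for row in rows:
        totals[row["category"]] += 1
        if row["completed"]:
            completed[row["category"]] += 1
    # Every category in totals has at least one row, so no division by zero.
    return {category: completed[category] / totals[category] for category in totals}


rates = completion_rate_by_category(enrollments)
print(rates)
```

The same aggregation could be written as a SQL GROUP BY once the data is loaded into a database; plain Python is used here only to keep the example dependency-free.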
How it Works
AI-Generated & Fully Synthetic
The uSkill dataset was synthetically generated to replicate the user experience of an online course and learning platform. AI agents were used to model realistic behaviors of both learners and instructors, and the dataset contains no real student or course data.
Realistic Simulation with Privacy
The dataset simulates course listings, lesson availability, enrollments, user reviews, pricing, and category-based search without any real transactions or student data, so it can be used ethically across educational applications.
High-Quality & Safe for Use
Built for testing e-learning enrollment engines, analytics dashboards, and student-facing education apps. Because it contains no real personal data, it is ready to use out of the box.
Dataset Schema
A comprehensive relational model of a modern online learning platform, designed for deep analysis and complex querying.
Users
Contains user data like login details, contact info, and role (student, instructor, admin), linked to enrollments, payments, and reviews.
Courses
Represents courses listed by instructors, including title, description, price, duration, and level, linked to lessons and promotions.
Course Lessons
Stores lesson info within courses, including type, content, duration, and lesson order.
Enrollments
Tracks course enrollments, including progress, completion status, and enrollment date, associated with users and courses.
Payments
Records payments for course purchases, including method, status, and transaction details.
Reviews
Stores the reviews students leave for courses, including ratings and written feedback.
Instructors
Stores instructor profiles including bio and profile picture, linked to their published courses.
Categories
Organizes courses by category (e.g., Programming, Design) with a name and description.
Coupons
Stores discount codes for course purchases, defining discount percentage and valid dates.
Course Categories
Defines the relationship between courses and categories, allowing courses to belong to multiple categories.
Notifications
Tracks notifications sent to users, such as enrollment updates or promotional alerts, with message content and read status.
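To make the many-to-many link between Courses and Categories more concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names, column names (`id`, `title`, `name`, `course_id`, `category_id`), and sample rows are assumptions made for illustration; they are not taken from the actual uSkill schema files.

```python
import sqlite3

# In-memory database with a hypothetical three-table slice of the schema.
# All table names, column names, and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Junction table: one row per (course, category) pair.
    CREATE TABLE course_categories (
        course_id INTEGER NOT NULL REFERENCES courses(id),
        category_id INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (course_id, category_id)
    );
    INSERT INTO courses VALUES (1, 'Intro to Python'), (2, 'Web Design with CSS');
    INSERT INTO categories VALUES (1, 'Programming'), (2, 'Design');
    -- Course 2 belongs to both categories.
    INSERT INTO course_categories VALUES (1, 1), (2, 1), (2, 2);
    """
)


def courses_in_category(connection, category_name):
    """Return the titles of courses linked to the named category, sorted by title."""
    rows = connection.execute(
        """
        SELECT c.title
        FROM courses AS c
        JOIN course_categories AS cc ON cc.course_id = c.id
        JOIN categories AS cat ON cat.id = cc.category_id
        WHERE cat.name = ?
        ORDER BY c.title
        """,
        (category_name,),
    ).fetchall()
    return [title for (title,) in rows]


programming = courses_in_category(conn, "Programming")
design = courses_in_category(conn, "Design")
print(programming, design)
```

The composite primary key on the junction table prevents the same course from being linked to the same category twice, which is one common way to model this kind of relationship.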
- CSV
- JSON
- Excel
- MySQL
- PostgreSQL
- SQL Server
- Snowflake