uSkill is a synthetic dataset that simulates the operations of an online learning platform, designed to provide data for analyzing user behavior, course effectiveness, and platform performance. Whether you're developing recommendation systems, performing student retention analysis, or simulating backend workflows, uSkill offers a robust dataset for testing and research.
The dataset includes detailed records for users, courses, instructors, enrollments, quizzes, video consumption, ratings, and feedback. With these records, you can simulate and analyze the full learning lifecycle, from course creation to student progress, interaction patterns, and completion rates. Machine learning engineers can use uSkill for projects like personalized course recommendations, churn prediction, and course completion prediction, while developers can test course enrollment systems, user dashboards, and certificate generation logic.
For data analysts, uSkill is ideal for exploring trends in course popularity, learner engagement, and pricing strategies. You can conduct cohort analyses to assess user retention, track progress over time, and study the impact of course features on student satisfaction and outcomes. It’s also an excellent resource for SQL practice, database management, and building ETL pipelines for educational platforms.
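As a rough illustration of the cohort analysis mentioned above, the sketch below groups enrollments by enrollment month and computes the share that reached a completed status. The field names `user_id`, `enrolled_at`, and `status`, and the sample rows, are assumptions made for this example; check them against the actual column names in the enrollment table before reusing it.

```python
from collections import defaultdict
from datetime import date


def completion_rate_by_cohort(enrollments):
    """Return {"YYYY-MM": share of enrollments with status 'completed'}."""
    totals = defaultdict(int)
    completed = defaultdict(int)
    for row in enrollments:
        # The cohort key is the calendar month of enrollment.
        cohort = row["enrolled_at"].strftime("%Y-%m")
        totals[cohort] += 1
        if row["status"] == "completed":
            completed[cohort] += 1
    return {cohort: completed[cohort] / totals[cohort] for cohort in sorted(totals)}


# Made-up sample rows (hypothetical field names, not taken from the dataset).
sample = [
    {"user_id": 1, "enrolled_at": date(2024, 1, 5), "status": "completed"},
    {"user_id": 2, "enrolled_at": date(2024, 1, 20), "status": "enrolled"},
    {"user_id": 3, "enrolled_at": date(2024, 2, 2), "status": "completed"},
]

print(completion_rate_by_cohort(sample))  # {'2024-01': 0.5, '2024-02': 1.0}
```

The same grouping can be swapped for week-of-signup or first-purchase cohorts by changing how the cohort key is derived.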
Highlights:
- Simulates an entire online learning environment with course listings, user profiles, quiz scores, and ratings.
- Supports machine learning tasks like recommendation algorithms, dropout prediction, and content personalization.
- Great for backend testing, including user registration, course management, and certification workflows.
- Provides insights into learner behavior, content engagement, and course effectiveness.
- Useful for SQL training, data cleaning, and advanced analytics like A/B testing and market trend analysis.
uSkill models an all-in-one online learning platform that connects learners, educators, and admins. The platform provides a simple, interactive environment for course creation, enrollment, progress tracking, and payment handling. It supports various skill levels across multiple categories, offering a personalized learning experience.
Key tables in the dataset include:
- Users: Stores user info (username, email, phone, password, role) and tracks user activity (creation, updates).
- Courses: Stores course details like title, description, instructor, price, duration, and level. Tracks course creation and updates.
- Categories: Organizes courses by category (e.g., Programming, Design) with a name and description.
- Course Enrollments: Tracks student enrollments in courses, including course progress, status (enrolled, completed, etc.), and enrollment date.
- Course Lessons: Stores lesson info within a course, including title, content (video/audio/docs), duration, and lesson order.
- Reviews: Allows students to leave course ratings and feedback.
- Payments: Tracks course payment details, including payment method, amount, and transaction ID.
- Coupons: Defines discount codes for course purchases, including code, discount, and validity.
- Course Payments: Stores payment records related to courses, including course ID, user ID, payment amount, and method.
- Notifications: Tracks notifications sent to users with message content and read status.
- Instructors: Stores instructor details (user ID, bio, profile picture) to manage instructor profiles.
- Course Categories: Defines the relationship between courses and categories, allowing courses to belong to multiple categories.
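The many-to-many link between courses and categories described above can be sketched with an in-memory SQLite database. The table and column names here (`courses`, `categories`, `course_categories`, `course_id`, `category_id`) and the sample rows are assumptions for illustration only and may not match the dataset's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Junction table: one row per (course, category) pair.
CREATE TABLE course_categories (
    course_id INTEGER NOT NULL REFERENCES courses(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (course_id, category_id)
);
-- Made-up sample rows.
INSERT INTO courses VALUES (1, 'Intro to Web Design');
INSERT INTO categories VALUES (1, 'Programming'), (2, 'Design');
INSERT INTO course_categories VALUES (1, 1), (1, 2);
""")


def categories_for_course(conn, course_id):
    """Return the category names linked to a course, sorted alphabetically."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM categories AS c
        JOIN course_categories AS cc ON cc.category_id = c.id
        WHERE cc.course_id = ?
        ORDER BY c.name
        """,
        (course_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(categories_for_course(conn, 1))  # ['Design', 'Programming']
```

The composite primary key on the junction table prevents the same course from being linked to the same category twice.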
The uSkill dataset is a synthetically generated representation of a modern online learning and course marketplace. It simulates a multi-role ecosystem with users acting as students, instructors, and administrators. AI agents were used to mirror behaviors such as course creation, enrollment, lesson consumption, reviews, and course completion tracking.
The data includes structured interactions between users and educational content, capturing realistic timelines for learning progress, payment flows, coupon applications, and user feedback. Importantly, no real instructors, students, or course materials were involved in the generation process. This dataset serves as a safe, privacy-first environment for building and testing educational platforms, recommendation systems, student analytics tools, and payment integrations in e-learning contexts.
The uSkill dataset emulates the structure of a modern online learning platform, complete with courses, enrollments, and payment flows. It is well suited to testing e-learning algorithms, dashboard design, and data analytics. It can be accessed as files, through relational databases, or through a cloud data platform:
- Available file formats: CSV, JSON, Excel
- Available databases: MySQL, PostgreSQL, SQL Server
- Cloud database access: Snowflake
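For the CSV format, a minimal loading sketch using only Python's standard library might look like the following. The column names (`id`, `title`, `price`, `level`) and the inline sample data are assumptions standing in for a real export; with an actual file, pass an open file handle instead of the `io.StringIO` object.

```python
import csv
import io


def load_rows(handle):
    """Read a CSV file-like object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


# Stand-in for an exported courses file (hypothetical columns and values).
sample_csv = io.StringIO(
    "id,title,price,level\n"
    "1,Intro to SQL,20.00,beginner\n"
    "2,Advanced Python,50.00,advanced\n"
)

courses = load_rows(sample_csv)
# csv.DictReader returns every value as a string, so convert before doing math.
average_price = sum(float(row["price"]) for row in courses) / len(courses)

print(len(courses), average_price)  # 2 35.0
```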