Amazing is a synthetic dataset that simulates the full journey of a large e-commerce platform, from browsing and cart activity to purchases, payments, and returns. Built to mirror real-world digital retail systems, it is well suited to experimenting with recommendation engines, checkout flows, dynamic pricing, and customer behavior analysis.
Whether you're a data scientist building predictive models or a developer testing backend workflows like inventory updates and payment handling, Amazing provides structured data across customers, products, vendors, and transactions. It's also great for analysts exploring sales trends and user cohorts or building dashboards. If you're prepping for a role in e-commerce tech, this dataset offers hands-on practice with SQL, ETL, and data modeling in a realistic retail context.
Highlights:
- Simulates an end-to-end e-commerce journey: user onboarding, product discovery, cart management, orders, and fulfillment.
- Ideal for developing ML models in recommendation systems, purchase prediction, and demand forecasting.
- Supports backend testing for catalog management, order processing, returns, refunds, and customer support flows.
- Includes detailed data on user preferences, transaction timelines, product reviews, and promotional impact.
- Great for retail analytics, customer segmentation, inventory planning, and trend visualization.
The Amazing schema is designed for an online marketplace platform that supports product listings, categories, user accounts, orders, payments, and user engagement features such as reviews and wishlists. It can be used to manage the entire lifecycle of user transactions, product management, and promotional offers.
Key tables in the dataset include:
- app.categories: Organizes products into categories for better navigation.
- app.coupons: Stores discount codes, expiration dates, and discount percentages for promotions.
- app.users: Contains user account details, including login credentials, personal info, and user type.
- app.products: Stores product details like names, prices, descriptions, and stock quantities.
- app.orders: Tracks customer orders, including status, total amounts, and shipping addresses.
- app.payments: Records payment transactions for orders, including method, status, and amount.
- app.addresses: Stores user shipping addresses, linking them to orders for delivery.
- app.cart: Manages a user’s shopping cart with selected items before purchase.
- app.reviews: Stores product reviews, including ratings and descriptions, for feedback.
- app.wishlists: Manages user-created wishlists for future purchase items.
- app.subcategories: Further categorizes products within main categories for better organization.
- app.cart_items: Stores individual products added to a user's cart with quantity and details.
- app.order_items: Stores the individual products in each order, including quantity and price.
- app.wishlist_items: Manages items in a user’s wishlist, linking products to wishlists.
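As a rough illustration of how these tables relate, the sketch below joins app.order_items to app.orders and app.products to total revenue per product. It is only a sketch: the column names (product_id, order_id, quantity, price, status), the 'delivered' status value, and the sample rows are assumptions made for the example, not the dataset's documented columns. It uses Python's built-in sqlite3 module with an in-memory database attached under the name app so the app.<table> naming carries over.

```python
import sqlite3

# Hypothetical, heavily simplified columns; the real Amazing tables may differ.
conn = sqlite3.connect(":memory:")
# SQLite has no schemas, so attach a second in-memory database named "app"
# to mimic the dataset's app.<table> naming.
conn.execute("ATTACH DATABASE ':memory:' AS app")
conn.executescript("""
CREATE TABLE app.products (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE app.orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
CREATE TABLE app.order_items (
    order_item_id INTEGER PRIMARY KEY,
    order_id INTEGER,
    product_id INTEGER,
    quantity INTEGER,
    price REAL
);
-- Made-up sample rows, for illustration only.
INSERT INTO app.products VALUES (1, 'Mug', 10.0), (2, 'Lamp', 25.0);
INSERT INTO app.orders VALUES (1, 1, 'delivered'), (2, 2, 'cancelled'), (3, 1, 'delivered');
INSERT INTO app.order_items VALUES
    (1, 1, 1, 2, 10.0),
    (2, 1, 2, 1, 25.0),
    (3, 2, 2, 3, 25.0),
    (4, 3, 1, 1, 10.0);
""")

# Revenue per product, counting only orders with the assumed 'delivered' status.
query = """
SELECT p.name, SUM(oi.quantity * oi.price) AS revenue
FROM app.order_items AS oi
JOIN app.orders AS o ON o.order_id = oi.order_id
JOIN app.products AS p ON p.product_id = oi.product_id
WHERE o.status = 'delivered'
GROUP BY p.product_id, p.name
ORDER BY revenue DESC
"""
rows = conn.execute(query).fetchall()
for name, revenue in rows:
    print(name, revenue)
```

The same join pattern should carry over to MySQL, PostgreSQL, or SQL Server once the column names are adjusted to match the actual schema.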
The Amazing dataset was created using AI-based synthetic data generation, built to reflect the workflows of a modern e-commerce marketplace. It covers customer browsing habits, product listings, shopping cart activity, order placement, and payment flows, all without referencing any real-world users or transactions. Using machine learning models trained on publicly available data patterns from global e-commerce platforms, we've crafted realistic interactions, purchase behaviors, and inventory management processes. This approach keeps the dataset close to real-life scenarios while remaining entirely fictional, so it can be used to test e-commerce systems, analytics pipelines, or recommendation engines without compromising on privacy or compliance.
The Amazing dataset supports diverse use cases such as e-commerce modeling, recommendation systems, and backend testing. Built with compatibility and portability in mind, it is available in several file formats and can be deployed on industry-standard databases and cloud platforms for scalable processing and analytics.
- Available file formats: CSV, JSON, Excel
- Available databases: MySQL, PostgreSQL, SQL Server
- Cloud database access: Snowflake
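For the CSV format, a minimal loading sketch might look like the following. The header names (product_id, name, price, stock_quantity) and the sample rows are assumptions standing in for a hypothetical products export; check the actual file headers before relying on them. It uses only Python's standard csv module and converts the numeric columns from strings.

```python
import csv
import io

# Stand-in for a hypothetical products CSV export; header names are assumed.
SAMPLE_CSV = """product_id,name,price,stock_quantity
1,Mug,10.00,40
2,Lamp,25.50,0
"""

def load_products(fileobj):
    """Parse product rows, converting numeric columns from strings."""
    rows = []
    for record in csv.DictReader(fileobj):
        rows.append({
            "product_id": int(record["product_id"]),
            "name": record["name"],
            "price": float(record["price"]),
            "stock_quantity": int(record["stock_quantity"]),
        })
    return rows

# In practice, pass an opened file instead, e.g. open(path, newline="").
products = load_products(io.StringIO(SAMPLE_CSV))
in_stock = [p["name"] for p in products if p["stock_quantity"] > 0]
print(in_stock)
```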