Carlelo is a synthetic dataset that mirrors how a modern car listing and sales platform works. It includes vehicle listings, dealer information, customer interactions, and pricing trends, which makes it a useful resource for anyone exploring the automotive sales space. Whether you're experimenting with price prediction models, analyzing what makes a listing stand out, or testing how users interact with listings, Carlelo gives you a realistic data environment to work in.
The dataset covers users, vehicles, transactions, specs, reviews, and more, so it suits building recommendation engines, simulating search and filter functions, and tracking market trends. It's useful for developers testing backend systems like listing management or checkout workflows, and for analysts who want to explore how factors like mileage or location affect sales. It also works well for SQL training, data modeling, and analyzing user preferences in the car-buying journey.
Highlights:
- Simulates a full car sales platform with vehicle listings, buyer profiles, transactions, reviews, and dealership details.
- Perfect for building models for price prediction, market demand forecasting, and recommendation engines.
- Supports backend testing for vehicle listing management, search functionality, and transaction processing.
- Provides insights into vehicle pricing trends, customer preferences, and regional sales performance.
- Great for SQL training, data cleaning, cohort analysis, and A/B testing of pricing and sales strategies.
The Carlelo schema models a platform for buying and selling cars. It includes tables for users, car listings, payments, reviews, and inquiries. Buyers can browse listings, inquire about cars, make payments, and leave reviews, while sellers can list cars and manage transactions. Admins oversee the entire process.
Key tables in the dataset include:
- Users: Stores user data such as login details, contact info, and role (buyer, seller, admin). Linked to platform actions (listings, payments, inquiries, reviews).
- Cars: Represents cars for sale with details like brand, model, price, and fuel type. Key for managing inventory.
- Car Images: Holds images of cars (exterior, interior, engine), linked to car listings for display.
- Car Listings: Tracks car listings with pricing, status (active, sold), and dates. Allows updates and visibility of cars for sale.
- Reviews: Stores reviews for cars, including ratings and comments, offering feedback for buyers and sellers.
- Payments: Records payment details for car listings, including method, status, and transaction information.
- Inquiries: Tracks buyer inquiries about cars, including messages and dates, facilitating communication before purchase.
- Transactions: Logs final transactions for car purchases, with buyer info, amount, and status (pending/completed).
- Notifications: Stores notifications for users about platform updates, payments, and car listing changes.
- Featured Cars: Lists cars featured for promotion for a specified time, helping highlight select inventory.
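To make the relationships between these tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The column names, keys, and sample rows are all assumptions made up for illustration; the actual Carlelo tables may be named and structured differently. The sketch builds four of the tables in memory and counts inquiries per active listing.

```python
import sqlite3

# Hypothetical, simplified columns; the real Carlelo tables may differ.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    role    TEXT NOT NULL CHECK (role IN ('buyer', 'seller', 'admin'))
);
CREATE TABLE cars (
    car_id    INTEGER PRIMARY KEY,
    brand     TEXT NOT NULL,
    model     TEXT NOT NULL,
    fuel_type TEXT,
    price     REAL
);
CREATE TABLE car_listings (
    listing_id INTEGER PRIMARY KEY,
    car_id     INTEGER NOT NULL REFERENCES cars(car_id),
    seller_id  INTEGER NOT NULL REFERENCES users(user_id),
    price      REAL NOT NULL,
    status     TEXT NOT NULL CHECK (status IN ('active', 'sold'))
);
CREATE TABLE inquiries (
    inquiry_id INTEGER PRIMARY KEY,
    car_id     INTEGER NOT NULL REFERENCES cars(car_id),
    buyer_id   INTEGER NOT NULL REFERENCES users(user_id),
    message    TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
        (1, "Asha", "seller"), (2, "Ben", "buyer"), (3, "Chen", "buyer")])
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?, ?)", [
        (10, "Toyota", "Corolla", "petrol", 15000.0),
        (11, "Tesla", "Model 3", "electric", 32000.0)])
    conn.executemany("INSERT INTO car_listings VALUES (?, ?, ?, ?, ?)", [
        (100, 10, 1, 14500.0, "active"),
        (101, 11, 1, 31000.0, "sold")])
    conn.executemany("INSERT INTO inquiries VALUES (?, ?, ?, ?)", [
        (1000, 10, 2, "Is it still available?"),
        (1001, 10, 3, "Any service history?")])
    conn.commit()
    return conn


def active_listings_with_inquiries(conn):
    """Return (brand, model, listing price, inquiry count) per active listing."""
    query = """
        SELECT c.brand, c.model, l.price, COUNT(i.inquiry_id) AS inquiry_count
        FROM car_listings AS l
        JOIN cars AS c ON c.car_id = l.car_id
        LEFT JOIN inquiries AS i ON i.car_id = c.car_id
        WHERE l.status = 'active'
        GROUP BY l.listing_id, c.brand, c.model, l.price
        ORDER BY l.listing_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in active_listings_with_inquiries(build_demo_db()):
        print(row)
```

The `LEFT JOIN` keeps active listings that have no inquiries yet, which would otherwise drop out of the result.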
The Carlelo dataset was designed to emulate the operations of a modern vehicle marketplace platform. It includes realistic, synthetic data representing car listings, user inquiries, pricing, reviews, transaction records, and vehicle details. AI agents simulate the behavior of both individual sellers and potential buyers, covering scenarios like browsing, negotiation, and payment. The dataset reflects industry-standard workflows in car marketplaces without using any real dealer or user data, which makes it suitable for building vehicle listing portals, pricing engines, and analytics tools for auto platforms.
Carlelo provides structured automotive marketplace data, including listings, inquiries, and transactions. The dataset is available in formats and environments that suit analytics teams, software engineers, and ML practitioners, on both local infrastructure and cloud platforms.
- Available file formats: CSV, JSON, Excel
- Available databases: MySQL, PostgreSQL, SQL Server
- Cloud database access: Snowflake
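As a rough illustration of working with the CSV format, the sketch below reads a small cars table with Python's standard `csv` module and averages price by fuel type. The inline sample, its column names (`car_id`, `brand`, `model`, `fuel_type`, `price`), and its values are assumptions made up for the example, not taken from the actual Carlelo files.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a CSV export; real column names may differ.
SAMPLE_CSV = """car_id,brand,model,fuel_type,price
10,Toyota,Corolla,petrol,15000
11,Tesla,Model 3,electric,32000
12,Honda,Civic,petrol,17000
"""


def average_price_by_fuel_type(csv_file):
    """Average the price column per fuel_type from an open CSV file object."""
    totals = defaultdict(lambda: [0.0, 0])  # fuel_type -> [price sum, row count]
    for row in csv.DictReader(csv_file):
        bucket = totals[row["fuel_type"]]
        bucket[0] += float(row["price"])
        bucket[1] += 1
    return {fuel: total / count for fuel, (total, count) in totals.items()}


averages = average_price_by_fuel_type(io.StringIO(SAMPLE_CSV))
print(averages)
```

To run the same function on a file on disk, pass `open(path, newline="")` in place of the `io.StringIO` object.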