Boyo is a synthetic dataset that models a digital hospitality platform. It follows the full guest journey, from discovering a property through booking and check-in to leaving a review. It is a useful resource for exploring how people interact with stays, uncovering booking trends, or testing the systems behind platforms like Airbnb or Booking.com.
Whether you're a developer, analyst, or data science enthusiast, Boyo provides realistic mock data: users, properties, pricing, reviews, bookings, and more. It's suited to building personalized stay recommendations, predicting cancellations, analyzing occupancy rates, and testing booking logic and payment flows.
Highlights:
- Simulates a full hospitality lifecycle: property listings, room bookings, customer profiles, host management, and reviews.
- Ideal for modeling price optimization, guest segmentation, churn prediction, and availability forecasting.
- Includes transactional data for bookings, cancellations, payments, discounts, and loyalty systems.
- Rich support for location-based analytics, seasonal trends, and performance benchmarking by region or property type.
- Great for backend feature testing (availability checks, pricing engines, booking pipelines, etc.).
- Structured for SQL practice, data cleaning exercises, dashboard creation, and real-world data modeling.
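As an illustration of the availability checks mentioned above, here is a minimal sketch of a date-overlap test against existing bookings. The field names (`room_id`, `check_in`, `check_out`, `status`) and the `"cancelled"` status value are assumptions made for this example, not confirmed column names from the dataset.

```python
from datetime import date


def overlaps(check_in_a, check_out_a, check_in_b, check_out_b):
    """Return True if two stays share at least one night.

    Stays are treated as half-open intervals [check_in, check_out), so a guest
    checking out on the day another guest checks in is not a conflict.
    """
    return check_in_a < check_out_b and check_in_b < check_out_a


def is_room_available(bookings, room_id, check_in, check_out):
    """Check a requested stay against existing, non-cancelled bookings."""
    for booking in bookings:
        # Skip other rooms and cancelled bookings (assumed status value).
        if booking["room_id"] != room_id or booking["status"] == "cancelled":
            continue
        if overlaps(check_in, check_out, booking["check_in"], booking["check_out"]):
            return False
    return True


# Hypothetical sample rows, shaped like what a bookings table might hold.
bookings = [
    {"room_id": 1, "check_in": date(2024, 7, 1), "check_out": date(2024, 7, 5), "status": "confirmed"},
    {"room_id": 1, "check_in": date(2024, 7, 10), "check_out": date(2024, 7, 12), "status": "cancelled"},
]

print(is_room_available(bookings, 1, date(2024, 7, 5), date(2024, 7, 8)))    # True
print(is_room_available(bookings, 1, date(2024, 7, 4), date(2024, 7, 6)))    # False
print(is_room_available(bookings, 1, date(2024, 7, 10), date(2024, 7, 11)))  # True
```

Treating stays as half-open intervals is a design choice for this sketch; a platform with same-day turnover restrictions would need a stricter comparison.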
The Boyo schema supports a property booking platform where users can book rooms in properties listed by hosts. It includes tables for managing users, properties, rooms, bookings, payments, reviews, and notifications. The schema allows hosts to manage their properties and bookings, while providing customers with an easy way to book rooms, make payments, and leave feedback.
Key tables in the dataset include:
- Users: Contains user data like login details, contact info, and role (customer, host, admin), linked to bookings, payments, and reviews.
- Properties: Represents properties listed by hosts, including address, description, and host association, linked to rooms and promotions.
- Rooms: Stores room info within properties, including type, price, capacity, and amenities.
- Bookings: Tracks room bookings, including check-in/check-out dates, total amount, and booking status, associated with users and rooms.
- Payments: Records payments for bookings, including method, status, and transaction details.
- Reviews: Stores user reviews of properties, including ratings and comments.
- Facilities: Describes amenities available at properties, such as gyms or pools, linked to specific properties.
- Property Images: Stores images related to properties, including interior, exterior, and room photos.
- Promotions: Stores promotional codes and discounts for properties, defining discount percentage and valid dates.
- Property Promotions: Links properties with promotions, allowing hosts to apply discounts for a limited time.
- Notifications: Tracks notifications sent to users, such as booking updates or promotional alerts, with message content and read status.
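To make the relationships between these tables concrete, the sketch below builds a reduced version of four of them in an in-memory SQLite database and runs a join. All column names, types, constraints, and sample rows are assumptions chosen for illustration; the actual Boyo schema may name and structure them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified columns for four of the tables described above.
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL,
    role    TEXT NOT NULL CHECK (role IN ('customer', 'host', 'admin'))
);
CREATE TABLE properties (
    property_id INTEGER PRIMARY KEY,
    host_id     INTEGER NOT NULL REFERENCES users(user_id),
    name        TEXT NOT NULL
);
CREATE TABLE rooms (
    room_id     INTEGER PRIMARY KEY,
    property_id INTEGER NOT NULL REFERENCES properties(property_id),
    room_type   TEXT,
    price       REAL
);
CREATE TABLE bookings (
    booking_id   INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    room_id      INTEGER NOT NULL REFERENCES rooms(room_id),
    check_in     TEXT,
    check_out    TEXT,
    total_amount REAL,
    status       TEXT
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (1, "host@example.com", "host"),
    (2, "guest@example.com", "customer"),
])
conn.execute("INSERT INTO properties VALUES (1, 1, 'Harbor View')")
conn.executemany("INSERT INTO rooms VALUES (?, ?, ?, ?)", [
    (1, 1, "double", 120.0),
    (2, 1, "suite", 200.0),
])
conn.executemany("INSERT INTO bookings VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 2, 1, "2024-07-01", "2024-07-03", 240.0, "confirmed"),
    (2, 2, 2, "2024-07-10", "2024-07-11", 200.0, "confirmed"),
    (3, 2, 1, "2024-08-01", "2024-08-02", 120.0, "cancelled"),
])

# Confirmed bookings and revenue per property: bookings -> rooms -> properties.
revenue = conn.execute("""
    SELECT p.name, COUNT(*) AS bookings, SUM(b.total_amount) AS revenue
    FROM bookings AS b
    JOIN rooms AS r ON r.room_id = b.room_id
    JOIN properties AS p ON p.property_id = r.property_id
    WHERE b.status = 'confirmed'
    GROUP BY p.property_id
""").fetchall()

print(revenue)  # [('Harbor View', 2, 440.0)]
```

The same query pattern should carry over to the MySQL, PostgreSQL, or SQL Server versions with minor dialect changes, once the real column names are substituted.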
The Boyo dataset was synthetically generated to replicate the user experience of an online accommodation and room-booking platform. It simulates key functionalities such as hotel listings, room availability, bookings, user reviews, pricing, and location-based search. AI agents were used to model realistic behaviors of both travelers and accommodation providers, so the dataset reflects industry patterns without using any real hotel or guest data. Because it contains no real personal data, it can be used to test hospitality booking engines, analytics dashboards, and customer-facing travel apps without privacy concerns.
Boyo is designed to replicate hotel booking behaviors, search filters, and pricing patterns. Whether you're running analytics or testing recommendation engines, its structured data is available in the following file formats and database platforms:
- Available file formats: CSV, JSON, Excel
- Available databases: MySQL, PostgreSQL, SQL Server
- Cloud database access: Snowflake
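As a rough sketch of working with the CSV format, the example below reads booking rows with Python's standard `csv` module and computes a cancellation rate. The in-memory string stands in for a hypothetical `bookings.csv` export; the column names and status values are assumptions, not the dataset's confirmed layout.

```python
import csv
import io
from collections import Counter

# Stand-in for a hypothetical bookings.csv export; real column names may differ.
sample_csv = io.StringIO(
    "booking_id,room_id,status,total_amount\n"
    "1,1,confirmed,240.00\n"
    "2,2,cancelled,200.00\n"
    "3,1,confirmed,120.00\n"
    "4,3,cancelled,90.00\n"
)

# DictReader maps each row to a dict keyed by the header line.
rows = list(csv.DictReader(sample_csv))

# Count bookings per status and derive the share that were cancelled.
status_counts = Counter(row["status"] for row in rows)
cancellation_rate = status_counts["cancelled"] / len(rows)

print(status_counts["confirmed"], status_counts["cancelled"])  # 2 2
print(f"{cancellation_rate:.0%}")  # 50%
```

To read an actual file, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to the downloaded CSV.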