2025-09-01 –, B07-B08
In the fast-paced realm of travel experiences, GetYourGuide encountered the challenge of maintaining consistent, high-quality content across its global marketplace. Manual content creation by suppliers often resulted in inconsistencies and errors, negatively impacting conversion rates. To address this, we leveraged large language models (LLMs) to automate content generation, ensuring uniformity and accuracy. This talk will explore our innovative approach, including the development of fine-tuned models for generating key text sections and the use of Function Calling GPT API for structured data. A pivotal aspect of our solution was the creation of an LLM evaluator to detect and correct hallucinations, thereby improving factual accuracy. Through A/B testing, we demonstrated that AI-driven content led to fewer defects and increased bookings. Attendees will gain insights into training data refinement, prompt engineering, and deploying AI at scale, offering valuable lessons for automating content creation across industries.
GetYourGuide, a global marketplace for travel experiences, needs to provide structured and inspiring content for every activity in its marketplace.
Before the release of our AI models, suppliers would create their content fully manually. The manual approach led to several issues in production, such as content inconsistencies, incorrect grammar, non-English language, and poor adherence to our content guidelines.
These content defects negatively impact the conversion rate of activities.
At the same time, with the large scale of new activity generation, our internal teams could only review a very small fraction of the submitted content.
With our LLM solution, suppliers can now automatically generate optimal content for their activities. Our feature allows users to simply copy-paste any existing raw text of their activity, and our models would then prefill most of the content sections. Suppliers then have the opportunity to review and edit the content.
We chose two different methods to generate free text content and structured information.
For free text, we used the OpenAI fine-tune API to create two different models generating the relevant sections of our travel activities, i.e. the title, the highlights, the short and full descriptions.
For structured information, we used the Function Calling gpt API to prefill the different activities tags and categories that have fixed values constraints in our database, such as the transport used or the type of the guide.
In order to validate our models, as well as for production monitoring, we developed a dedicated LLM evaluator that identifies hallucinations for our specific case, that is our models generating information that is not factually correct as compared to the input supplier text. With this hallucination evaluator, we were able to score the performance of different models and unlock key learnings and iterations. The evaluator also enables our internal team to detect and correct the hallucinations in production.
After several AB experiments, the new automated content creation feature is fully released to all our suppliers. The activities with content generated via AI showed significantly fewer content defects and a significant increase in bookings, with only a small fraction of hallucinations that can be reviewed and corrected manually.
In this talk, we will share our long journey consisting of several training data iterations to build our fine-tuned models, the prompt engineering challenges in building our evaluator and our function call model. We will also cover the different experiments and the operational challenges in training the models and deploying the service in production.
The talk will provide some concrete ideas and tools to automate the generation of optimal content with LLMs, which is a common use case in many industries.
Advanced
Prerequisites:Openai fine-tuning: https://platform.openai.com/docs/guides/fine-tuning
Openai function calling: https://platform.openai.com/docs/guides/function-calling?api-mode=chat
Evaluating model performance with LLM: https://platform.openai.com/docs/guides/evals
"Discover how GetYourGuide uses AI to revolutionize content creation! By automating with LLMs, we've boosted accuracy & bookings. Join us at #PyDataBerlin2025 to learn about training data refinement, prompt engineering, & scaling AI. #AI #TravelTech"
With over a decade of experience in data science and analytics, I am a Senior Data Scientist at GetYourGuide, where I lead initiatives in leveraging large language models (LLMs) to enhance content quality and conversion rates. My expertise includes fine-tuning LLMs for custom text generation and classification, developing NLP models for discovering new travel interests, and automating predictive models for global travel demand. I have a robust background in machine learning, natural language processing, and AI-driven content automation, which has significantly improved operational efficiencies and business outcomes.
Prior to moving to Data Science, I was a Senior Data Analyst at GetYourGuide, where I developed key metrics for availability and loyalty, built automated forecasting for our travel activities, performed impact analyses for sales and marketing, and automated data analyses with custom libraries.
Before joining GetYourGuide, I worked as Data Analyst in Foodpanda, an online food delivery platform, where I optimized restaurant ranking algorithms and developed recommendation systems.
My analytical journey began at Wealth-X in Budapest, where I worked as a Business Analyst, and later as Research Consultant in Millward Brown Vermeer, where I applied statistical techniques to report insights to external customers.
I hold a Master's degree in Marketing from Rotterdam School of Management, Erasmus University, graduated cum laude, and a Bachelor's degree in Business/Managerial Economics from Università di Pisa.
Driven by a passion for data-driven decision-making, I am committed to advancing AI technologies to solve complex business challenges. At PyData 2025 Berlin, I aim to share insights into deploying AI at scale, refining training data, and mastering prompt engineering to automate content creation across industries.