LLM-based Agents for Tabular Data Processing

Language models are pretty great at understanding and generating text that sounds like a human wrote it. But what about when we throw tables full of data at them? For example, data in ERP and CRM systems, databases and Excel sheets is often structured in such a way that a reasonable analysis must consider every element of the data to draw conclusions. The actual values in the tables may be unstructured (think free-form text), which makes the problem intractable for typical spreadsheet analysis.

This blog post dives into how language models handle structured, tabular data, exploring the unique challenges they face and the cool benefits they offer. We’ll look into practical ways these AI tools can help us make sense of numbers and tables, using non-trivial tabular data with free-text columns as an example. A component for understanding tabular data is an integral part of a solid AI engine after all - and you'll want one if you plan to fully utilize generative AI.

Understanding and acting on tabular data, such as spreadsheets and comma-separated values (CSV) files, remains challenging for large language models (LLMs). Two technical hurdles stand out:

  • LLMs are bad at working with numbers

  • LLMs have limited capability to work with a large amount of data in one call

Both of the above issues are prevalent with tabular data, which often comprises large numbers of rows containing both numerical and textual data. Hence, just pasting your Excel sheet into a chatbot will likely have limited utility: the results will not properly consider all the data, and any numerical operations will be unreliable.

Note that even if an LLM with a long context window does well in the needle-in-a-haystack test, that performance does not typically transfer well to working with very information-dense data like tables.

To tackle these challenges, we've come up with an agent-based method that can handle arbitrarily large datasets and even databases. This approach uses multiple language models, or 'agents', each specialized in a different aspect of data processing. By working together, these agents can process, analyze, and interpret large volumes of tabular data more effectively than a single LLM could on its own.

In this agent-based system, each agent is equipped with its own set of tools, allowing it to query data and execute code independently. An agent can even run machine learning methods on the data! This setup enables the agents to handle specific tasks, from querying and manipulating data to performing complex analyses. These tools give each agent the ability to operate autonomously yet collaboratively, ensuring that even the most extensive datasets are processed efficiently and effectively.
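To make the idea concrete, here is a minimal sketch of what a single such agent could look like, assuming an OpenAI-style chat API and a pandas DataFrame; the `run_pandas` tool, the model choice and the prompts are illustrative placeholders, not our actual implementation.

```python
import json

import pandas as pd
from openai import OpenAI  # assumes an OpenAI-style chat completions API

client = OpenAI()

# One illustrative tool: let the model evaluate a pandas expression on the table.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_pandas",
        "description": "Evaluate a pandas expression against the DataFrame `df` and return the result as text.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

def run_pandas(df: pd.DataFrame, expression: str) -> str:
    # NOTE: evaluating model-generated code needs proper sandboxing in production.
    return str(eval(expression, {"df": df, "pd": pd}))

def ask_agent(df: pd.DataFrame, question: str) -> str:
    messages = [
        {"role": "system", "content": "You analyze a pandas DataFrame named `df`. "
                                      "Use the run_pandas tool instead of guessing numbers."},
        {"role": "user", "content": question},
    ]
    while True:
        reply = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
        msg = reply.choices[0].message
        if not msg.tool_calls:   # no more tool use -> the model has its final answer
            return msg.content
        messages.append(msg)     # keep the assistant's tool request in the history
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            result = run_pandas(df, args["expression"])
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```

A real deployment would add sandboxing, retries and limits on result size, but the loop above captures the core pattern: the model decides which tool to call, the tool runs on the data, and the observations are fed back until an answer is ready.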

Our agents can:

  • Clean, filter and join tabular data

  • Run code on tables to process numerical columns

  • Summarize and cluster content

  • Create visualizations

  • Serve the results as downloadable files

The diagram below illustrates the overall functionality of data processing agents:

Schematic description of agents for tabular data analysis.

Let’s see the tabular data agents in action by analyzing a subset of the Disneyland reviews dataset from Kaggle. The dataset contains 42,000 rows of review texts and metadata posted to TripAdvisor.

LLM's overview of the dataset.

The following visualization was created for us:

A simple data analysis task that can be answered with the existing columns is to ask which visitors, by their home country, are most satisfied with the Paris Disneyland.

The LLM points out a typical data analysis oversight.

One agent makes a valid point: there are countries with very few reviews in the dataset, which could bias the average ratings. Let’s have the agents only consider countries with 50+ reviews.
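For reference, roughly the computation the agents need to run can be sketched in plain pandas; the file, column and branch names below follow the Kaggle dataset (Rating, Reviewer_Location, Branch), so adjust them if your copy differs.

```python
import pandas as pd

# Column and branch names follow the Kaggle Disneyland reviews dataset.
df = pd.read_csv("DisneylandReviews.csv")
paris = df[df["Branch"] == "Disneyland_Paris"]

# Keep only countries with at least 50 reviews to avoid small-sample bias.
counts = paris["Reviewer_Location"].value_counts()
eligible = counts[counts >= 50].index

ratings_by_country = (
    paris[paris["Reviewer_Location"].isin(eligible)]
    .groupby("Reviewer_Location")["Rating"]
    .mean()
    .sort_values(ascending=False)
)
print(ratings_by_country.round(2))
```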

Disneyland rating by visitor country

Seems like the French are not too happy with their amusement park 🤷

It’s worth emphasizing that just passing the large dataset to an LLM as text would not give good results for the following tasks. Even if a long-context LLM passes the needle-in-a-haystack test (retrieving from sparse data), the type of analysis we are doing here requires understanding very dense data. The attention mechanism of transformer LLMs is not great for such a task. Worse yet, the LLM wouldn’t refuse the assignment, making the result appear valid on the surface. The agentic approach goes through each review to synthesize an answer, making it more reliable.
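To illustrate the difference, a chunked map-reduce pass over the reviews could look roughly like the sketch below; `summarize` stands in for any LLM call and the batch size is arbitrary, so treat this as an outline rather than our exact pipeline.

```python
from typing import Callable, Sequence

def map_reduce_reviews(
    reviews: Sequence[str],
    summarize: Callable[[str], str],  # any LLM call mapping a prompt to text
    batch_size: int = 50,
) -> str:
    # Map step: summarize each manageable batch of reviews separately.
    partials = []
    for start in range(0, len(reviews), batch_size):
        batch = "\n\n".join(reviews[start:start + batch_size])
        partials.append(summarize(f"Summarize these reviews:\n\n{batch}"))

    # Reduce step: combine the partial summaries into a single answer.
    combined = "\n".join(partials)
    return summarize(f"Combine these partial summaries into one overall answer:\n\n{combined}")
```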

Let’s ask the agents to assign a topic to each review, so that we can better see what it is about, and let’s also ask which specific attractions were mentioned. For this task, we’ll ask for 1000 random reviews from California Disneyland and proceed with those.

Topics

Notice that extracting the topics is not a trivial step: it requires a topic modeling pass over the free-form review text that considers the whole dataset, and our agents can handle it.
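As a rough sketch of what such a step could look like (one possible approach, not necessarily what our agents do internally), each review can be embedded, the embeddings clustered, and an LLM asked to name each cluster; the embedding model and cluster count below are illustrative.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def extract_topic_clusters(reviews: list[str], n_topics: int = 8) -> list[int]:
    # Embed the free-form review texts.
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(reviews)

    # Cluster the embeddings; each cluster becomes a candidate topic,
    # which an LLM can then name from a few example reviews per cluster.
    return KMeans(n_clusters=n_topics).fit_predict(embeddings).tolist()
```

Let’s see the review ratings for each review topic that we found: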

Ratings per agent-extracted topic.

Interesting! So reviews that were about traveling (only 9 entries, explaining the perfect 5.0) and holding or attending events score rather high, while those that mention service and cleanliness consistently score lower. Not a huge surprise, and further analysis is warranted. The food appears to be tasty!

Experiences and Other are a bit of a catch-all. Let’s ask the agents to summarize the contents of the Experiences topic for each rating from 1 to 5. Again, passing the large dataset to an LLM as text would not give good results; the agentic approach goes through each review to synthesize an answer. Let’s see what happens - below is the LLM analysis:

For the 1-star ratings, reviewers were dissatisfied with the long wait times, crowded conditions, and the lack of Disney vibe at Disneyland. They mentioned that many attractions were closed, and there were no characters out for the kids to take pictures with. Specific attractions like California Screamin' were criticized for poor planning and low visibility. Reviewers also expressed frustration with the dining package not providing any advantage for getting a good view of the shows. Suggestions for improvement included better organization for show seating and reserved areas for dining options. Overall, the 1-star reviews highlighted disappointment with the overall experience and a preference to spend money and time on better experiences.


For the 2-star ratings, visitors commonly dislike long wait times, crowded areas, unavailable attractions, poor quality food, expensive merchandise, and unhelpful staff. Specific attractions like Haunted Mansion, Pirates of the Caribbean, and Splash Mountain are mentioned negatively. Visitors also express disappointment in the lack of magic and organization compared to other Disney parks, particularly Disney World in Florida. Suggestions for improvement include better management of crowds, more diverse and accessible food options, and overall park renovations to enhance the magical experience. On a positive note, some visitors commend the excellent character staff and the day and night parades. However, overall, visitors recommend against the experience unless significant improvements are made.


For the 3 star ratings, reviewers liked the attractions such as the parade, fireworks, and some of the rides. They appreciated the convenience of using the mobile ticket barcode and the availability of water fountains. Some also mentioned the magic and fun experienced by the kids. However, common dislikes included the long queues for both rides and dining, overpriced food and drinks, the crowded and chaotic environment, and the lack of proper maintenance in some areas. Some reviewers mentioned dissatisfaction with the staff's attitude or service. Visitors also express dissatisfaction with the high prices, long waiting times for rides, and the limited choice of places to eat. Suggestions for improvement include better staff attitude, more available rides, and a review of pricing and ticket pass policies to enhance the overall experience.


For the batch of reviews with a rating of 4, visitors express satisfaction with attractions at Disneyland California such as Space Mountain, Haunted Mansion, Pirates of the Caribbean, and Captain Eo. They appreciate the cleanliness of the park, helpful staff, and the experience for children. Common points of dissatisfaction include long waiting lines, expensive food, and souvenirs, as well as the high cost of admission. Visitors suggest improvements in managing waiting times, lowering prices, and extending park hours. Some visitors also express appreciation for the magical experience, memorable attractions, and the attention to detail in the park. However, they also mention high prices for food and souvenirs, and long lineups for popular attractions. Suggestions for improvement include reducing ticket pricing and providing outdoor heaters during cold weather. Specific attractions mentioned include Disneyland, the Indiana Jones ride, and Disney World.


Visitors who give 5-star ratings consistently praise the magical and memorable experience of Disneyland, mentioning specific attractions like Space Mountain, Haunted Mansion, Fantasmic, California Adventure, night shows/fireworks, Pirates of the Caribbean, and Star Tours as particularly enjoyable. They also appreciate the cleanliness of the park, the kindness of the staff, and the fun for all ages. However, some mention long lines, crowded parks, high expenses for food, and high cost of admission as downsides. Suggestions for improvement include better crowd management, managing wait times, reducing food prices, and adding healthier food options. Overall, the general sentiment for the 5-star ratings is overwhelmingly positive, with praise for the attractions and the magical experience.

In every rating category people complain about wait times. It appears that those who arrive with a plan to visit specific attractions are generally satisfied with their experience.

Attractions

If we look at ratings where a specific attraction or adventure was mentioned, we can see that the highest scores come from reviewers who took the time to mention a specific attraction, while the lowest ratings come from those who don’t mention any attractions. This is consistent with our findings above for the “Experiences” review category:

Ratings by Disneyland attraction.
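The comparison behind that chart can be sketched as below, assuming the agents have added an `attraction` column to the 1000-review sample (the column name is illustrative).

```python
import pandas as pd

def ratings_by_attraction(sample: pd.DataFrame) -> None:
    # `sample` is the 1000-review California sample; the `attraction` column is
    # assumed to have been filled in by the agents (NaN when none was mentioned).
    mentioned = sample[sample["attraction"].notna()]
    missing = sample[sample["attraction"].isna()]
    print("Mentioned an attraction:", round(mentioned["Rating"].mean(), 2))
    print("No attraction mentioned:", round(missing["Rating"].mean(), 2))
    print(sample.groupby("attraction")["Rating"].mean().sort_values(ascending=False))
```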

We are offered the following interpretation about engagement and managing expectations, which could help in designing advertising campaigns and planning events:

Analysis of reviews after data processing.

Seems that visitors are generally happy when they have a specific attraction they want to visit, leading to a more fulfilling experience! Perhaps a fact-based advertising campaign about the attractions and best times to visit could be very successful.

So there you have it, a glimpse of what is possible with agents that can handle tabular data like Excel sheets and databases! Based on our experience, not every tabular data agent deployment needs all of these abilities. In many cases it is possible to simplify the available agent skills to a subset of those discussed here, improving reliability and time-to-answer. So as always, putting the use case first is the way to a reliable GenAI deployment.