By Maya Boeye, AI Researcher/Analyst
At DiligentIQ, we are continually pushing the boundaries of data visualization and chart analysis through the integration of Generative AI. Spearheaded by Software Engineer Michael Scoleri, our recent efforts have focused on addressing the challenges of accurately reading and interpreting charts using Large Language Models (LLMs). Michael's work has advanced our understanding and capability in this area, and I'm pleased to share the insights and progress we've made.
A persistent challenge with LLMs is their difficulty in accurately analyzing charts and extracting information when the numerical data isn't directly visible in the image. As you can see in Michael’s Custom Chart Analysis report, this limitation was evident in our testing of the Anthropic model and OpenAI's vision model. These models frequently misinterpret values in bar and pie charts, which undermines their reliability for precise data extraction and analysis.
To address this, we conducted a series of tests using the DiligentIQ staging environment. Testing in the staging environment allowed us to evaluate each model's capabilities in a raw, unassisted state, providing a clear picture of their strengths and weaknesses. The chart images used during testing were sourced from this Hugging Face Dataset.
Our tests revealed significant errors in the models' ability to perform manual calculations. Both models often assigned the same value to different sections of pie charts and consistently under- or overestimated values in bar charts, as you can see in the example below.
Chart Provided:
Model Responses:
Despite these challenges, the models performed adequately when the data was printed directly in the image and no calculations were required. This indicates that the limitation lies in the models' ability to process and interpret visual data accurately, rather than in their reasoning over data they can read.
To overcome these limitations, Michael implemented a combination of manual analysis and advanced image processing techniques. Here's a breakdown of the approach:
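To give a flavor of what such image processing can look like, here is a minimal sketch of one common ingredient: estimating each chart segment's share by counting pixels of each fill color. This is an illustrative assumption on our part, not Michael's actual implementation (see his full report for that); the helper name `estimate_segment_shares` is hypothetical, and real charts would need color tolerance for anti-aliasing.

```python
from collections import Counter

def estimate_segment_shares(pixels, background=(255, 255, 255)):
    """Hypothetical helper: approximate each segment's share of a chart
    by counting pixels of each solid fill color.

    `pixels` is a 2D grid of RGB tuples. Assumes exact color matches,
    which is a simplification; production code would bucket nearby colors.
    """
    counts = Counter(p for row in pixels for p in row if p != background)
    total = sum(counts.values())
    return {color: count / total for color, count in counts.items()}

# Tiny synthetic "image": red fills twice as many pixels as blue,
# so we expect shares of roughly 2/3 and 1/3.
RED, BLUE, WHITE = (255, 0, 0), (0, 0, 255), (255, 255, 255)
image = [
    [RED, RED, RED, WHITE],
    [RED, RED, RED, WHITE],
    [BLUE, BLUE, WHITE, WHITE],
    [BLUE, WHITE, WHITE, WHITE],
]
shares = estimate_segment_shares(image)
```

The appeal of this kind of deterministic pixel counting is precisely that it sidesteps the estimation errors we saw above: the proportions come from arithmetic, not from a model's visual guess.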
Compared to the initial performance of the LLMs, our approach showed a significant increase in accuracy. As you can see in the image below, by combining advanced image processing techniques with manual analysis, we were able to reliably interpret and analyze the same charts that the models had misread in prior tests.
This journey wasn't without its hurdles. Initial approaches, such as measuring circumferences and matching against a range of RGB values, proved ineffective. For more detail on these early failed approaches, see Michael's full report linked at the top of this post.
This project highlights DiligentIQ’s commitment to innovation and precision. By integrating Generative AI with advanced image processing and machine learning techniques, we have significantly enhanced our model’s ability to accurately interpret and analyze chart data. Michael's work has been instrumental in this progress, and we are excited to continue building on these advancements. A special thank you to Junyu Luo for the invaluable dataset that made this test possible. We look forward to sharing more updates as we continue to innovate and refine our capabilities.