Generative AI Meets Chart Analysis: Innovations and Insights from DiligentIQ

August 8, 2024

By Maya Boeye, AI Researcher/Analyst

At DiligentIQ, we are continually pushing the boundaries of data visualization and chart analysis through the integration of Generative AI. Led by Software Engineer Michael Scoleri, our recent efforts have focused on the challenge of accurately reading and interpreting charts with Large Language Models (LLMs). Michael's work has advanced our understanding and capability in this area, and I'm pleased to share the insights and progress we've made.

The Challenge ➡️ LLM Chart Reading Performance

A persistent challenge with LLMs is their difficulty in accurately analyzing charts and extracting information when the numerical data isn't directly visible in the image. As you can see in Michael’s Custom Chart Analysis report, this limitation was evident in our testing of the Anthropic model and OpenAI's vision model. These models frequently misinterpret values in bar and pie charts, which undermines their reliability for precise data extraction and analysis.

To address this, we conducted a series of tests using the DiligentIQ staging environment. Testing in the staging environment allowed us to evaluate each model's capabilities in a raw, unassisted state, providing a clear picture of their strengths and weaknesses. The chart images used during testing were sourced from this Hugging Face Dataset.

Preliminary Findings ➡️ LLMs Struggle with Manual Calculations

Our tests revealed significant errors when the models had to perform manual calculations. Both models often assigned the same value to different sections of pie charts and consistently under- or overestimated values in bar charts, as shown in the example below.

Chart Provided:

Prompt: Report back the pie chart names and corresponding values

Model Responses:

As you can see, the model responses to this prompt were not accurate.

Despite these challenges, the models performed adequately when the data was present directly in the image and no calculations were required. This indicates that the weakness lies in estimating values from the visual features of a chart, not in reading numbers that are printed on it.

Innovative Solutions ➡️ Combining AI with Manual Analysis

To overcome these limitations, Michael implemented a combination of manual analysis and advanced image processing techniques. Here’s a breakdown of the approach:

  1. Pie Chart Analysis:
    • Identification: Using OpenCV’s Hough Circle Transform to accurately locate the pie chart within the image.
    • Color Quantization: Applying the k-means clustering algorithm to simplify the color palette, making it easier to analyze.
    • Label Matching: Utilizing cosine similarity to match RGB values provided by the LLM with actual chart colors, ensuring precise label matching.
    • Pixel Calculation: Employing the k-means clustering algorithm to isolate and count the number of pixels for each color and calculating their percentage representation in the pie chart.
  2. Bar Chart Analysis:
    • Y-Axis Extraction: Using a custom adapter on AWS Textract to extract y-axis information and determine the scaling factor.
    • Noise Reduction: Applying preparation filters to reduce image noise.
    • Color Quantization: Applying the k-means clustering algorithm to simplify the color palette, making it easier to analyze.
    • Edge Detection: Using the Canny Edge detection algorithm to outline edges, followed by the findContours function to identify potential bars.
    • Bar Confirmation: Using functions from OpenCV’s Structural Analysis and Shape Descriptors module to estimate the number of edges in each detected contour, confirming which contours are bars.
    • Label Identification: Expanding the search regions horizontally and applying AWS Textract to find the x-axis labels.
    • Height Calculation: Calculating the height of each confirmed bar in pixels, then using the scaling factor and label identification to report findings.
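
To make the pixel-counting and scaling steps above more concrete, here is a minimal sketch in Python with NumPy. It is not Michael's implementation: the production pipeline uses OpenCV and AWS Textract, while this sketch swaps in a small hand-written k-means so it runs on its own. It assumes the pie chart pixels have already been isolated (for example with the Hough Circle Transform), that the LLM has returned one approximate RGB value per label, and that two y-axis ticks have been read off the image. All function names, labels, and sample values are hypothetical.

```python
import numpy as np


def assign(pixels, centers):
    """Index of the nearest center (Euclidean distance in RGB) for each pixel."""
    return np.linalg.norm(pixels[:, None] - centers[None], axis=2).argmin(axis=1)


def quantize_colors(pixels, k, iters=10):
    """Cluster an (N, 3) RGB array into k colors with a basic k-means."""
    pixels = np.asarray(pixels, dtype=float)
    # Farthest-point initialization: deterministic, and it spreads the
    # starting centers across well-separated slice colors.
    centers = [pixels[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(pixels - c, axis=1) for c in centers], axis=0)
        centers.append(pixels[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = assign(pixels, centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, assign(pixels, centers)


def cosine_similarity(a, b):
    """Cosine of the angle between two RGB vectors (0.0 if either is black)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def pie_shares(pixels, label_colors):
    """Map each label to the percentage of pixels in its best-matching cluster."""
    centers, labels = quantize_colors(pixels, k=len(label_colors))
    percents = 100.0 * np.bincount(labels, minlength=len(centers)) / len(labels)
    shares = {}
    for name, rgb in label_colors.items():
        best = max(range(len(centers)),
                   key=lambda j: cosine_similarity(centers[j], rgb))
        shares[name] = float(percents[best])
    return shares


def bar_value(bar_top_y, tick_low, tick_high):
    """Convert a bar's top pixel row to a data value using two y-axis ticks.

    Each tick is (pixel_y, value). Image rows grow downward, so the lower
    tick has the larger pixel_y.
    """
    (y_low, v_low), (y_high, v_high) = tick_low, tick_high
    return v_low + (y_low - bar_top_y) * (v_high - v_low) / (y_low - y_high)


# Hypothetical pie: 100 pixels in three slices, one slice drawn in two shades.
pie_pixels = np.array([(250, 10, 10)] * 25 + [(240, 20, 20)] * 25
                      + [(10, 200, 10)] * 30 + [(10, 10, 240)] * 20)
# Hypothetical label colors, standing in for what the LLM would report.
llm_colors = {"Rent": (255, 0, 0), "Food": (0, 255, 0), "Travel": (0, 0, 255)}
print(pie_shares(pie_pixels, llm_colors))

# Hypothetical axis: value 0 at pixel row 400, value 60 at pixel row 100.
print(bar_value(250, (400, 0), (100, 60)))
```

Two caveats apply to this sketch. Cosine similarity compares only the direction of two RGB vectors, so a dark and a bright shade of the same hue look identical to it, and two labels can end up matched to the same cluster. A real pipeline would also need to exclude background and text pixels before counting.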

Results ➡️ Significant Improvement in Accuracy

Compared to the initial performance of the LLMs, our approach showed a significant increase in accuracy. As the image below shows, combining advanced image processing techniques with manual analysis allowed us to reliably interpret the same charts that the models misread in earlier tests.

This DiligentIQ response was generated from the same chart and prompt as the initial model test pictured in preliminary findings.

This journey wasn't without its hurdles. Initial approaches, such as measuring circumferences and using a range of RGB values, proved ineffective. For more detailed information about our initial failed approaches, see Michael’s full report linked at the top of this post. 

This project highlights DiligentIQ’s commitment to innovation and precision. By integrating Generative AI with advanced image processing and machine learning techniques, we have significantly enhanced our model’s ability to accurately interpret and analyze chart data. Michael's work has been instrumental in this progress, and we are excited to continue building on these advancements. A special thank you to Junyu Luo for the invaluable dataset that made this test possible. We look forward to sharing more updates as we continue to innovate and refine our capabilities.



© 2024, DiligentIQ