How Accurate Are Multimodal AI Models for Football/Soccer Video Analysis? A Benchmark Study of Gemini and Qwen
By: Smruthi Karthik
Introduction
Modern multimodal AI models are becoming increasingly powerful at understanding video. In theory, these systems should be able to watch a football match clip and automatically identify key events such as shots on goal, missed attempts or goalkeeper saves.
But how accurate are these models when applied to real match footage?
There is still limited independent testing of how accurately commercially available models perform basic football performance analysis tasks.
To explore this, we conducted an experiment using three commercially available multimodal AI systems: Gemini (Google AI Pro), Gemini Basic (Free Version) and Qwen 3.6 Plus.
The purpose of this experiment was to evaluate the accuracy of modern multimodal AI models in performing basic football (soccer) performance analysis using a broadcast quality video clip captured from a single camera angle, without any specialized training or domain-specific customization.
The results showed that while the models can appear convincing at first glance, none of them reached the accuracy levels required for professional performance analysis. Among the models tested, Gemini (Google AI Pro) produced the most reliable results, though notable inaccuracies remained.
Testing Conditions
- All experiments were conducted on April 7, 2026.
- The models used were Gemini 3 Flash with Google AI Pro, Gemini 3 Flash (free tier), and Qwen3.6-Plus.
- The same prompt was used for all models to ensure fairness.
- The video was broadcast quality, filmed from a single side-view camera angle. No close-up or goal-view cameras were involved.
- The video clip contained all attacking attempts from one team.
- The video clip was 118 seconds long. (A sample video is given below.)
Prompt for AI Video Analysis
You are analysing a football match video clip…
Please follow these instructions carefully:
-
Identify Every Shot Attempt
Scan the video from start to end frame-by-frame and identify every shot attempt taken by the white team (MDSC). For each shot attempt, record:
- Timestamp of the moment the shot is taken (format: mm:ss).
- Type of shot:
- Right foot
- Left foot
- Header
- Shot outcome:
- On target
- Off target
-
Definitions to Follow
Shot on target
- The ball is heading into the goal…
- The shot forces a clear save by the goalkeeper.
Shot off target
- The ball goes wide of the goal,
- Over the crossbar, or
- Is blocked before reaching the goal frame.
Header
- The player makes contact with the ball using their head.
Right foot / Left foot
- Identify the foot used at the moment of ball contact.
If unclear, mark it as “uncertain”.
-
Important Conditions
- Only count shots taken by the white team (MDSC).
- Ignore attempts from the opposing team.
- Do not count passes, crosses, or defensive clearances as shots unless intended to score.
- If the same shot appears in slow-motion replay, count it only once.
-
Output Format
Provide the results in a structured table like this:
| Shot No. | Timestamp | On Target / Off Target | Body Part Used | Short Description |
| 1 | 00:14 | On Target | Right Foot | White player shoots from inside the box, saved by goalkeeper |
-
Summary Statistics
- Total shots by MDSC
- Shots on target
- Shots off target
- Shots with right foot
- Shots with left foot
- Headers
Example summary:
Total shots: X
On target: X
Off target: X
Right foot: X
Left foot: X
Headers: X
-
Accuracy Instructions
- Watch the clip multiple times if necessary.
- Carefully verify timestamps and classification before reporting.
- If uncertain about any event, clearly mark it as uncertain instead of guessing.
Your goal is to prioritize accuracy over speed and provide precise timestamps and correct event classification.
Ground Truth Data from the Football/Soccer Video Clip
| Shot No. | Timestamp | On Target / Off Target | Body Part Used | Short Description |
| 1 | 00:05 | On Target | Left Foot | Shot by No. 13 from just outside the left edge of the penalty area, saved by the goalkeeper. |
| 2 | 00:10 | On Target | Right Foot | Shot by No. 11 from inside the penalty area, blocked by a defender. |
| 3 | 00:16 | On Target | Right Foot | Long-range free kick from beyond the halfway line, comfortably caught by the goalkeeper. |
| 4 | 00:24 | On Target | Right Foot | Long-range shot blocked by a defender before it reached the penalty area. |
| 5 | 00:30 | Off Target | Header | Header by No. 17 goes wide to the right. |
| 6 | 00:38 | On Target | Right Foot | Long-range shot by No. 11 saved by the goalkeeper; rebound of the save converted into a goal. |
| 7 | 00:42 | On Target | Right Foot | Long-range shot by No. 11 saved by the goalkeeper; rebound of the save converted into a goal. |
| 8 | 00:54 | Off Target | Header | Header from a corner kick misses the target and goes high over the crossbar. |
| 9 | 01:03 | Off Target | Right Foot | Long-range shot attempt goes over the crossbar. |
| 10 | 01:08 | On Target | Right Foot | Long-range shot by No. 13 saved by the goalkeeper. |
| 11 | 01:18 | On Target | Right Foot | Powerful long-range strike saved by the goalkeeper. |
| 12 | 01:26 | Off Target | Left Foot | Shot by No. 13 goes wide to the right. |
| 13 | 01:33 | Off Target | Right Foot | Long-range shot from a central position flies high over the crossbar. |
| 14 | 01:41 | On Target | Left Foot | Long-range shot saved by the goalkeeper. |
| 15 | 01:49 | Off Target | Left Foot | Long-range shot goes wide to the left of the goal. |
| 16 | 01:55 | On Target | Right Foot | Long-range shot blocked by a defender outside the penalty area. |
Summary Statistics of Ground Truth
- Total shots by MDSC: 16
- Shots on target: 10
- Shots off target: 6
- Shots with right foot: 10
- Shots with left foot: 4
- Headers: 2
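The summary figures above can be re-derived from the ground-truth rows with a simple tally. The following is a minimal sketch, not the tooling used in the experiment: the tuple layout and the `summary` dictionary are assumptions made for illustration, and the labels are normalized to consistent capitalization.

```python
from collections import Counter

# (timestamp, outcome, body part) for each ground-truth shot, copied from the table above.
shots = [
    ("00:05", "On Target", "Left Foot"),
    ("00:10", "On Target", "Right Foot"),
    ("00:16", "On Target", "Right Foot"),
    ("00:24", "On Target", "Right Foot"),
    ("00:30", "Off Target", "Header"),
    ("00:38", "On Target", "Right Foot"),
    ("00:42", "On Target", "Right Foot"),
    ("00:54", "Off Target", "Header"),
    ("01:03", "Off Target", "Right Foot"),
    ("01:08", "On Target", "Right Foot"),
    ("01:18", "On Target", "Right Foot"),
    ("01:26", "Off Target", "Left Foot"),
    ("01:33", "Off Target", "Right Foot"),
    ("01:41", "On Target", "Left Foot"),
    ("01:49", "Off Target", "Left Foot"),
    ("01:55", "On Target", "Right Foot"),
]

# Tally outcomes and body parts separately.
outcomes = Counter(outcome for _, outcome, _ in shots)
body_parts = Counter(part for _, _, part in shots)

summary = {
    "Total shots": len(shots),
    "On target": outcomes["On Target"],
    "Off target": outcomes["Off Target"],
    "Right foot": body_parts["Right Foot"],
    "Left foot": body_parts["Left Foot"],
    "Headers": body_parts["Header"],
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

The same tally can be applied to any model's output table to check that its reported summary agrees with its own rows.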
Comparison of Shot Analysis Statistics Across Models
| Metric | Ground Truth | Gemini (Google AI Pro) | Gemini Basic (Free Version) | Qwen 3.6 Plus |
| Total Shots by MDSC | 16 | 10 | 12 | 9 |
| Shots on Target | 10 | 5 | 7 | 6 |
| Shots Off Target | 6 | 5 | 5 | 3 |
| Shots with Right Foot | 10 | 8 | 9 | 6 |
| Shots with Left Foot | 4 | 2 | 1 | 1 |
| Headers | 2 | 0 | 2 | 2 |
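One way to read this table is as a per-metric gap between each model's counts and the ground truth, plus the share of the 16 shots each model reported. The sketch below is illustrative only; the dictionary keys and the helper names `count_gaps` and `count_ratio` are assumptions, not part of the original experiment.

```python
# Counts copied from the comparison table above.
ground_truth = {"total": 16, "on_target": 10, "off_target": 6,
                "right_foot": 10, "left_foot": 4, "headers": 2}

models = {
    "Gemini (Google AI Pro)": {"total": 10, "on_target": 5, "off_target": 5,
                               "right_foot": 8, "left_foot": 2, "headers": 0},
    "Gemini Basic (Free Version)": {"total": 12, "on_target": 7, "off_target": 5,
                                    "right_foot": 9, "left_foot": 1, "headers": 2},
    "Qwen 3.6 Plus": {"total": 9, "on_target": 6, "off_target": 3,
                      "right_foot": 6, "left_foot": 1, "headers": 2},
}


def count_gaps(reported, truth):
    """Signed difference (reported minus truth) for every metric."""
    return {metric: reported[metric] - truth[metric] for metric in truth}


def count_ratio(reported, truth):
    """Reported shot count as a fraction of the ground-truth shot count."""
    return reported["total"] / truth["total"]


gaps = {name: count_gaps(counts, ground_truth) for name, counts in models.items()}
ratios = {name: count_ratio(counts, ground_truth) for name, counts in models.items()}

for name in models:
    print(f"{name}: reported {ratios[name]:.2%} of ground-truth shots, gaps {gaps[name]}")
```

A matching count does not guarantee matching events. Qwen 3.6 Plus reports two headers, the same as the ground truth, yet it describes one of them as a goal, so count-based figures like these are best treated as an upper bound on how many shots were correctly identified.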
Results Generated by Multimodal AI Models
Gemini Basic (Free Version)
Output
The video clip captures several attacking moments for Mohammedan SC (MDSC), the team playing in the white kits. Based on the analysis of the footage, here are the details for every shot attempt made by the white team.
MDSC Shot Analysis
| Shot No. | Timestamp | On Target / Off Target | Body Part Used | Short Description |
| 1 | 00:12 | On Target | Right Foot | Shot from the left side of the penalty area, saved by the goalkeeper. |
| 2 | 00:25 | On Target | Right Foot | A low long-range shot from outside the box is comfortably caught by the keeper. |
| 3 | 00:31 | On Target | Header | Header from a corner kick delivery is saved by the goalkeeper near the post. |
| 4 | 00:38 | On Target | Right Foot | Shot from inside the box is pared away by the goalkeeper. |
| 5 | 00:39 | On Target | Left Foot | Rebound from the previous save is turned into the net for a goal. |
| 6 | 00:56 | Off Target | Header | A header from a corner kick misses the target and goes wide of the post. |
| 7 | 01:10 | Off Target | Right Foot | A low, long-range effort from outside the box skids wide of the goal. |
| 8 | 01:20 | On Target | Right Foot | A powerful strike from distance is parried over the bar by the goalkeeper. |
| 9 | 01:34 | Off Target | Right Foot | A long-range shot from a central position flies high over the crossbar. |
| 10 | 01:43 | On Target | Right Foot | A low shot from outside the box is gathered cleanly by the goalkeeper. |
| 11 | 01:51 | Off Target | Right Foot | A speculative long-range attempt from distance goes wide of the goal. |
| 12 | 01:56 | Off Target | Right Foot | Final long-range shot from outside the area misses the target. |
Summary Statistics for Gemini Free Version
- Total shots by MDSC: 12
- Shots on target: 7
- Shots off target: 5
- Shots with right foot: 9
- Shots with left foot: 1
- Headers: 2
Evaluation Parameters for Gemini Free Version
- Speed / Time Taken
The thinking mode of Gemini Free Version took approximately 1 minute and 40 seconds to analyze the video and generate a response.
- Timestamp Accuracy
The timestamps were typically one to two seconds later than the actual moment of the shot. Instead of marking the exact shot attempt, they often corresponded to when the shot was saved, when the ball went out, or when the commentator reacted to the attempt. The model also missed shots, identifying only 12 against the actual 16.
- Event Accuracy
The model correctly identified the goal-scoring moment, which shows some ability to detect key events in the match. It also successfully identified header attempts. However, there were multiple inaccuracies in the descriptions of the shots. In some cases, shots that were clearly off target were described as saves by the goalkeeper, and vice versa. While headers were detected, the description of one of the header attempts was incorrect.
- Summary
Overall, Gemini Basic (free version) was able to detect some key events like the goal and headers, but timestamp delays, missed shots, and inaccurate descriptions limited the reliability of its analysis.
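The timestamp comparison described in this evaluation can be approximated programmatically. The sketch below pairs each timestamp reported by Gemini Basic with the nearest unused ground-truth timestamp within a tolerance window. The greedy matching rule, the 2-second tolerance, and the function names are assumptions chosen for illustration; they are not the procedure used in the original evaluation, which was done by manual review.

```python
from statistics import mean


def to_seconds(timestamp: str) -> int:
    """Convert an mm:ss timestamp into whole seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def match_shots(truth, predicted, tolerance=2):
    """Greedily pair each predicted time with the nearest unused truth time.

    Returns (pairs, missed, unmatched): matched (truth, predicted) pairs,
    truth times with no prediction, and predictions with no truth time.
    """
    unused = list(truth)
    pairs, unmatched = [], []
    for p in predicted:
        candidates = [t for t in unused if abs(p - t) <= tolerance]
        if candidates:
            nearest = min(candidates, key=lambda t: abs(p - t))
            unused.remove(nearest)
            pairs.append((nearest, p))
        else:
            unmatched.append(p)
    return pairs, unused, unmatched


# Timestamps copied from the ground-truth and Gemini Basic tables.
GROUND_TRUTH = ["00:05", "00:10", "00:16", "00:24", "00:30", "00:38", "00:42", "00:54",
                "01:03", "01:08", "01:18", "01:26", "01:33", "01:41", "01:49", "01:55"]
GEMINI_BASIC = ["00:12", "00:25", "00:31", "00:38", "00:39", "00:56",
                "01:10", "01:20", "01:34", "01:43", "01:51", "01:56"]

truth = [to_seconds(t) for t in GROUND_TRUTH]
predicted = [to_seconds(t) for t in GEMINI_BASIC]

pairs, missed, unmatched = match_shots(truth, predicted, tolerance=2)
offsets = [p - t for t, p in pairs]  # positive means the model was late

print(f"Matched shots: {len(pairs)}")
print(f"Mean offset (s): {mean(offsets):.2f}")
print(f"Ground-truth shots with no match (s): {missed}")
print(f"Predictions with no match (s): {unmatched}")
```

The tolerance matters. With a 2-second window, the rebound that Gemini Basic places at 00:39 does not pair with the ground-truth shot at 00:42, so it counts as both an unmatched prediction and a missed shot. A wider window would pair them. Timestamp matching alone also says nothing about whether the outcome or body part was classified correctly.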
Gemini (Google AI Pro)
Output of Gemini (Google AI Pro)
Sure thing! I’ve carefully reviewed the footage of the MDSC (White) vs. MBSG (Maroon) match to track every attacking attempt made by the white team.
Here is the breakdown of the shots taken by MDSC:
MDSC Shooting Analysis
| Shot No. | Timestamp | On Target / Off Target | Body Part Used | Short Description |
| 1 | 00:11 | Off Target | Right Foot | #11 takes a shot from the edge of the box; it’s blocked by a defender. |
| 2 | 00:38 | On Target | Left Foot | #11 strikes a powerful low shot from the left side; the keeper makes a diving save. |
| 3 | 00:44 | On Target | Right Foot | GOAL. #19 follows up on the rebound from the previous save and slots it in. |
| 4 | 00:55 | Off Target | Right Foot | #22 attempts a long-range effort from the right wing; it sails wide of the post. |
| 5 | 01:10 | On Target | Right Foot | GOAL. #11 shoots from just inside the box; the ball skids under the keeper into the net. |
| 6 | 01:20 | On Target | Left Foot | #11 hits a low, curling shot toward the bottom corner; the keeper tips it away. |
| 7 | 01:34 | Off Target | Right Foot | #11 tries a speculative shot from a central position; it goes high and wide. |
| 8 | 01:42 | On Target | Right Foot | #11 fires a low shot from distance; the goalkeeper catches it comfortably. |
| 9 | 01:50 | Off Target | Right Foot | #11 takes a snapshot from outside the area; it misses the target to the left. |
| 10 | 01:56 | Off Target | Right Foot | #11 attempts a final long-range strike; the ball goes well over the crossbar. |
Summary Statistics for Gemini (Google AI Pro)
- Total shots by MDSC: 10
- Shots on target: 5
- Shots off target: 5
- Shots with right foot: 8
- Shots with left foot: 2
- Headers: 0
Evaluation Parameters for Gemini (Google AI Pro)
- Speed / Time Taken
Gemini (Google AI Pro) processed the video and generated the analysis in approximately 15 seconds.
- Timestamp Accuracy
The timestamps generated by Gemini corresponded to actual moments in the video; it did not hallucinate or create timestamps for events that did not exist. However, the Google AI Pro model missed 6 shots, identifying only 10 shots instead of the correct total of 16 shots.
- Event Accuracy
In five instances (00:38, 00:55, 01:20, 01:42, 01:50), Gemini incorrectly classified the body part used for the shot, often struggling to distinguish right-foot from left-foot shots. The textual descriptions were mostly correct in intent but lacked clarity and precision in several cases, and in a few instances the model used inaccurate terminology to describe the type of shot. There were also some clear misidentifications. For example, at 01:56, the model described the action differently from what actually happened in the video.
- Summary
Overall, Gemini (Google AI Pro) analyzed the video quickly and avoided hallucinated timestamps, but it missed several shot events and showed inconsistencies in shot identification and description clarity.
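The body-part errors noted in this evaluation can be checked by lining up each shot reported by Gemini (Google AI Pro) with a ground-truth shot and comparing labels. In the sketch below, the pairing of each reported shot with the ground-truth shot closest in time is an assumption made for illustration, as are the tuple layout and variable names. Labels are normalized to consistent capitalization.

```python
# (Gemini timestamp, Gemini body part, assumed ground-truth timestamp, ground-truth body part)
# Pairing by nearest timestamp is an assumption, not part of the original evaluation.
paired = [
    ("00:11", "Right Foot", "00:10", "Right Foot"),
    ("00:38", "Left Foot", "00:38", "Right Foot"),
    ("00:44", "Right Foot", "00:42", "Right Foot"),
    ("00:55", "Right Foot", "00:54", "Header"),
    ("01:10", "Right Foot", "01:08", "Right Foot"),
    ("01:20", "Left Foot", "01:18", "Right Foot"),
    ("01:34", "Right Foot", "01:33", "Right Foot"),
    ("01:42", "Right Foot", "01:41", "Left Foot"),
    ("01:50", "Right Foot", "01:49", "Left Foot"),
    ("01:56", "Right Foot", "01:55", "Right Foot"),
]

# Keep only the rows where the model's label disagrees with the ground truth.
mismatches = [row for row in paired if row[1] != row[3]]
accuracy = (len(paired) - len(mismatches)) / len(paired)

for model_ts, model_part, _, true_part in mismatches:
    print(f"{model_ts}: model said {model_part}, ground truth is {true_part}")
print(f"Body-part accuracy on detected shots: {accuracy:.0%}")
```

Under this pairing, the mismatched rows are the same five timestamps listed in the evaluation. The accuracy figure covers only the shots the model detected, so it does not reflect the six shots that were missed entirely.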
Qwen 3.6 Plus
Output for Qwen 3.6 Plus
| Shot No. | Timestamp | On Target / Off Target | Body Part Used | Short Description |
| 1 | 00:06 | On Target | Right Foot | Shot from inside the penalty area, forced a diving save from the goalkeeper |
| 2 | 00:26 | Off Target | Left Foot | Shot from the left side of the box goes wide of the near post |
| 3 | 00:31 | On Target | Header | Header from a corner kick is saved by the goalkeeper |
| 4 | 00:40 | On Target | Right Foot | Low driven shot from inside the box, parried away by the keeper |
| 5 | 00:56 | On Target | Header | Header from a corner kick finds the back of the net (Goal) |
| 6 | 01:12 | On Target | Right Foot | Close-range right-footed shot is blocked/saved by the goalkeeper |
| 7 | 01:35 | Off Target | Right Foot | Long-range effort from outside the box sails over the crossbar |
| 8 | 01:44 | On Target | Right Foot | Shot from just outside the penalty arc is comfortably caught by the keeper |
| 9 | 01:51 | Off Target | Right Foot | Shot from the right side of the box rolls wide of the far post |
Summary Statistics for Qwen 3.6 Plus
- Total shots by MDSC: 9
- Shots on target: 6
- Shots off target: 3
- Shots with right foot: 6
- Shots with left foot: 1
- Headers: 2
Evaluation Parameters for Qwen 3.6 Plus
- Speed / Time Taken
Qwen 3.6 Plus processed the video and generated the analysis in approximately 1 minute 20 seconds.
- Timestamp Accuracy
The timestamps generated by Qwen 3.6 Plus were not very accurate. In several cases, the timestamp did not correspond to the exact moment the shot was taken, but instead reflected when the shot ended or when the commentary referred to the attempt. The Qwen 3.6 Plus model also missed multiple shots, identifying only 9 shots instead of the correct total of 16 in the video.
- Event Accuracy
Qwen 3.6 Plus correctly identified both header attempts, which indicates some ability to detect which body part was used in the shot. However, there were errors in classifying right-foot and left-foot shots. The textual descriptions also contained multiple inaccuracies, often misrepresenting what was happening in the play. For example, Qwen 3.6 Plus failed to correctly identify the actual goal moment and instead labeled an off-target header as the goal. It also wrongly described some long-range shots as attempts taken from inside the penalty box, showing inaccuracy in description and understanding of the shots.
Best Performing Model
Among the models we tested, Gemini (Google AI Pro) delivered the strongest overall performance and produced the fastest responses. It showed comparatively better event coverage, timestamp alignment, and event clarity, but it missed six of the 16 shots and occasionally misclassified the type of shot. As a result, its performance still falls short of the reliability standards required for professional football performance analysis.
Conclusion
The results of this experiment suggest that current commercially available multimodal AI models are not yet capable of delivering the level of accuracy required for reliable football/soccer match analysis, even when provided with fairly detailed prompts. While further prompt engineering may improve results, the gap between the AI model outputs and the ground truth in this experiment suggests that prompt optimization alone may not be sufficient. Higher accuracy is more likely to come from models fine-tuned on domain-specific sports data.
SPAN by Banyan Board is our video labeling software. SPAN enables large teams of annotators to label events within video footage, making it easier to generate training data for fine-tuning multimodal AI models used in sports performance analysis.
To learn more, visit our website and explore our knowledge base.
