Intent extraction is a core capability that lets AI systems understand what users intend to do based on their interactions with digital devices and applications. Google’s latest research focuses on enhancing this capability with small multimodal AI models that run efficiently on device while maintaining high accuracy.
The Challenge of Intent Extraction in AI
Understanding user intent from behavioral data, such as taps, clicks, scrolling, and screen transitions, plays a pivotal role in delivering relevant actions and answers before a search query is explicitly entered. Traditionally, large AI models process this data in the cloud, which adds latency, increases operational costs, and raises privacy concerns because sensitive user data is transmitted off the device.
Google’s research sought to overcome these challenges by enabling intent extraction directly on devices, using small AI models that match the performance of much larger cloud-based systems like Gemini 1.5 Pro but with faster response times and lower costs.
A Novel Two-Step Decomposition Approach
The key innovation lies in decomposing the intent understanding task into two simpler steps:
Step One: Per-Screen Interaction Summarization
Each interaction the user has with the screen is individually summarized. This summary captures what appeared on the screen, the user’s specific action, and a tentative hypothesis about the purpose behind that action. This granularity avoids overwhelming the AI with the entire session’s complexity at once.
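To make this concrete, a per-screen summary can be thought of as a small record of observed screen facts, the action taken, and a tentative hypothesis. The sketch below is illustrative only; the ScreenSummary schema, the summarize_interaction helper, and the model.generate interface are assumptions, not the paper’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ScreenSummary:
    """Hypothetical record for one summarized screen interaction."""
    screen_facts: str   # what was visible on the screen
    user_action: str    # the concrete action the user took
    hypothesis: str     # tentative guess at why the user took that action

def summarize_interaction(model, screenshot: bytes, action: str) -> ScreenSummary:
    """Ask a small on-device model to summarize a single interaction (sketch)."""
    prompt = (
        "Describe what is shown on this screen, state the user's action, "
        "and offer a brief hypothesis about the purpose of that action."
    )
    # The generate() call and its structured response are assumed interfaces.
    response = model.generate(prompt=prompt, image=screenshot, action=action)
    return ScreenSummary(
        screen_facts=response["screen_facts"],
        user_action=response["user_action"],
        hypothesis=response["hypothesis"],
    )
```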
Step Two: Consolidation to Overall Intent
A second small AI model reviews all factual summaries from the first step, deliberately ignoring speculative guesses. It then generates a concise statement representing the user’s overarching goal throughout the session.
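A minimal sketch of this consolidation step, reusing the hypothetical ScreenSummary and model interface from above, might look like the following; note how the speculative hypothesis field is dropped before the second model sees the session.

```python
from typing import List

def consolidate_intent(model, summaries: List[ScreenSummary]) -> str:
    """Combine per-screen factual summaries into one overall intent statement (sketch)."""
    factual_lines = [
        # Hypotheses from step one are deliberately omitted here.
        f"Screen {i + 1}: {s.screen_facts} | Action: {s.user_action}"
        for i, s in enumerate(summaries)
    ]
    prompt = (
        "Given the following sequence of screen observations and user actions, "
        "state the user's overall goal in one concise sentence:\n"
        + "\n".join(factual_lines)
    )
    return model.generate(prompt=prompt)
```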
By partitioning the task and focusing attention, the system reduces common failure modes experienced by small models, such as confusion over long and messy interaction histories.
“Breaking down intent extraction into smaller, manageable pieces enables compact AI models to deliver surprisingly robust understanding while maintaining privacy and operational efficiency,” said Dr. Lisa Kim, AI research scientist.
Measuring Success with Bi-Fact Evaluation
Performance is evaluated using the Bi-Fact metric, which assesses whether the AI successfully captures relevant factual elements of intent without adding incorrect inferences. This granular evaluation surpasses traditional similarity-based metrics by revealing where the model omits or invents details.
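The paper’s exact matching procedure is not reproduced here, but the bookkeeping behind a fact-level metric like Bi-Fact can be illustrated with a simplified sketch: precision drops when the model invents facts, and recall drops when it omits them. The exact-string matching below is an assumption; a real evaluator would match facts semantically.

```python
def fact_level_scores(gold_facts: set, predicted_facts: set) -> dict:
    """Simplified fact-level precision/recall in the spirit of Bi-Fact."""
    matched = gold_facts & predicted_facts      # facts correctly captured
    missing = gold_facts - predicted_facts      # omitted facts (hurt recall)
    invented = predicted_facts - gold_facts     # hallucinated facts (hurt precision)

    precision = len(matched) / len(predicted_facts) if predicted_facts else 0.0
    recall = len(matched) / len(gold_facts) if gold_facts else 0.0
    return {"precision": precision, "recall": recall,
            "missing": missing, "invented": invented}
```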
Results show that an 8-billion-parameter model, Gemini 1.5 Flash, operating with this stepwise approach matches the effectiveness of the much larger Gemini 1.5 Pro on mobile user behavior datasets.
Importantly, hallucinations—false or speculative content generated by AI—are significantly reduced because the model filters out initial guesses before final intent formulation, resulting in more reliable outputs.
Advantages Over Large, Cloud-Based Models
The approach confers multiple benefits:
1. Privacy Preservation: Processing user data on-device avoids transmitting sensitive information to cloud servers, mitigating privacy risks.
2. Lower Latency: Local computation eliminates delays inherent in network communication.
3. Cost Efficiency: Smaller models consume fewer resources, reducing operational expenses.
4. Robustness to Noisy Data: Stepwise decomposition maintains performance despite imperfect or inconsistent training labels common in real-world behavior data.
According to Pavel Novik, a developer specializing in AI deployments, “This decomposition method is a game changer for developing privacy-focused AI applications that must run smoothly on edge devices.”
Implications for Future AI-Driven User Experiences
As AI assistants and agents evolve to anticipate user needs proactively, understanding intent from user interaction patterns becomes increasingly vital. Instead of relying solely on explicit keywords typed by users, models will integrate behavioral signals across apps and websites to predict goals and offer timely assistance.
This trend encourages a shift in digital strategy, emphasizing clear and logical user journeys that AI can interpret easily, rather than optimizing only for search query terms.
Examples and Applications
Consider a user navigating a travel booking app by browsing flights, selecting dates, and examining hotel options. Through the two-step model, the AI infers that the user’s intent is to complete a travel reservation, enabling it to offer relevant suggestions or autofill details proactively.
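Using the hypothetical ScreenSummary sketch from earlier, the per-screen summaries for such a session and the consolidated result might look roughly like this (all values illustrative):

```python
session = [
    ScreenSummary("Flight search results, LAX to Tokyo, June 12",
                  "Tapped an outbound flight", "Comparing flight options"),
    ScreenSummary("Return date picker", "Selected June 20", "Choosing trip length"),
    ScreenSummary("Hotel listings near Shinjuku", "Opened a hotel detail page",
                  "Looking for lodging"),
]
# consolidate_intent(model, session) could plausibly return something like:
# "The user is planning and booking a trip to Tokyo from June 12 to June 20."
```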
Similarly, in complex workflows such as form completion or e-commerce checkout, segmenting intent understanding improves the accuracy of timely recommendations and assistance.
Technical Insights into Model Design
The models focus on multimodal inputs—interpreting visual elements on screen together with user actions over time. This capability ensures context-aware analysis, integral to accurately grasping intent.
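As an illustration of what such a multimodal input could contain, one plausible record for a single step of a session pairs the raw screen with a structured description of the UI and the action taken on it; this shape is assumed, not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InteractionRecord:
    """Hypothetical multimodal record for one step of a user session."""
    screenshot_png: bytes                                  # raw pixels of the screen
    ui_elements: List[str] = field(default_factory=list)   # visible buttons, labels, fields
    action: str = ""                                       # e.g. "tap 'Book now'"
    timestamp_ms: int = 0                                  # ordering within the session
```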
The research paper highlights that decomposing intent into smaller factual units allows tracking which facts were correctly identified, missed, or erroneously invented, enabling targeted improvements in model training and validation.
Additionally, this modular strategy makes it easier to update or fine-tune specific components without retraining massive end-to-end models.
Conclusion
Google’s decomposition-based approach to intent extraction with small AI models represents significant progress toward efficient, private, and scalable AI understanding of user behavior. This advancement not only enhances user experience by anticipating needs more accurately but also sets a precedent for on-device intelligent systems that balance performance with privacy and cost.
For developers and businesses, embracing such AI architectures means preparing for a future where intent-driven automation and personalized assistance become standard, driving innovation across digital interactions.