Polly is generally available everywhere you work in LangSmith
Debugging agents is different from debugging anything else you've built. Traces run hundreds of steps deep, prompts span thousands of lines, and when something goes wrong, the context that caused it is buried somewhere in the middle. We built Polly to be the AI assistant that can read
The PhD students who became the judges of the AI industry
Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best — and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR...
7 Readability Features for Your Next Machine Learning Model
Unlike fully structured tabular data, preparing text data for machine learning models typically entails tasks like tokenization, embeddings, or sentiment analysis.
One Model to Rule Them All? SAP-RPT-1 and the Future of Tabular Foundation Models
A hands-on case study and practical guidance
The post One Model to Rule Them All? SAP-RPT-1 and the Future of Tabular Foundation Models appeared first on Towards Data Science.
Hitachi Vantara Expands Hitachi iQ Capabilities for Responsible Agentic AI
Expanded AI blueprints, infrastructure capabilities and intelligent data integration strengthen the Hitachi iQ portfolio for secure, on-prem production AI. Hitachi Vantara, the data storage, infrastructure and hybrid cloud management subsidiary of Hitachi Ltd. (TSE: 6501), today announced new capabil...
Mondoo Announced the Launch of Agentic Managed Vulnerability Service
World-class security experts, powered by Mondoo’s proven AI platform, now deliver a 60% reduction in vulnerabilities and sub-16-day MTTR, so overwhelmed security teams don’t have to do it alone. Mondoo, the pioneer in agentic vulnerability management, today announced the Mondoo Agentic Managed Vulner...
ActiveState Announced the Launch of Curated Catalogs
New private repository secures the AI-driven development boom by grounding LLMs in a library of 79 million vetted, rebuilt-from-source components. ActiveState, a global leader in trusted, managed open source software, today announced the launch of the ActiveState Curated Catalog. This new offering pr...
Cayosoft Debuts Agentic AI Identity Change Controls, IR Offering at RSA 2026
Cayosoft Guardian 7.2 expands Identity Threat Detection and Response and Automated Rollback for AI identities; Launches Expert-Led, Identity-First Incident Response Service. Cayosoft, the undisputed leader in Microsoft hybrid Active Directory (AD), Entra ID, and Microsoft 365 management, monitoring,...
From engagement to fulfillment: How Agentic AI is rewriting product metrics
As AI agents begin executing tasks on users’ behalf, traditional engagement metrics are becoming less meaningful. In the age of agentic AI, product teams may need a new north star: measuring whether user intent was successfully fulfilled.
When AI judges AI: The hidden dangers of reasoning models in alignment
The race to build more capable AI systems has created an unexpected problem:
As we push toward more sophisticated models, we need equally sophisticated ways to evaluate and align them.
Acalvio Launches 360 Deception to Break AI Attack Automation
Next-Generation Cyber Deception Disrupts and Denies Agentic and AI-Assisted Attacks by Controlling the Attacker’s Reality. Acalvio, an AI-powered preemptive cybersecurity company, today announced 360 Deception, the next generation of cyber deception designed to break AI-driven attack automation. As a...
NetLib Security Launches Winter 2026 Release with AI Enhancements
Winter 2026 Introduces Expanded Platform Support, Azure Key Vault Integration, and Advanced Centralized Key Management Capabilities. NetLib Security, a leader in transparent data encryption, today announced the general availability of Encryptionizer Winter 2026, alongside significant upgrades to the ...
Physicl Launches the Data Infrastructure Layer for Physical AI at NVIDIA GTC
Emerging from stealth to scale world-ready data for robotics, world models, and embodied AI. Physicl today emerged from stealth at NVIDIA GTC, introducing a new data infrastructure platform purpose-built for Physical AI and robotics. Launched by members of the team behind Nfinite — the company known ...
Company’s Latest Solution Integrates Protocol Design, Document Generation, and Statistical Programming to Accelerate Trial Execution and Deliver Submission-ready Data. PhaseV, a leader in AI/ML for clinical development, today announced the launch of its AI Conductor, a centralized platform that autom...
Cato Networks Launches GPU-Powered SASE with Native AI Security
Cato Neural Edge embeds NVIDIA GPUs across Cato’s global private backbone, enabling real-time AI inspection; Cato AI Security delivers unified governance and protection for enterprise AI adoption. Cato Networks, the SASE leader, today unveiled two major innovations for the Cato SASE Platform to secur...
NVIDIA AI Open-Sources ‘OpenShell’: A Secure Runtime Environment for Autonomous AI Agents
The deployment of autonomous AI agents—systems capable of using tools and executing code—presents a unique security challenge. While standard LLM applications are restricted to text-based interactions, autonomous agents require access to shell environments, file systems, and network endpoints to per...
ServiceNow Research Introduces EnterpriseOps-Gym: A High-Fidelity Benchmark Designed to Evaluate Agentic Planning in Realistic Enterprise Settings
Large language models (LLMs) are transitioning from conversational to autonomous agents capable of executing complex professional workflows. However, their deployment in enterprise environments remains limited by the lack of benchmarks that capture the specific challenges of professional settings: l...
Tokenization Tradeoffs in Structured EHR Foundation Models
arXiv:2603.15644v1 Announce Type: new
Abstract: Foundation models for structured electronic health records (EHRs) are pretrained on longitudinal sequences of timestamped clinical events to learn adaptable patient representations. Tokenization -- how these timelines are converted into discrete model...
Alternating Reinforcement Learning with Contextual Rubric Rewards
arXiv:2603.15646v1 Announce Type: new
Abstract: Reinforcement Learning with Rubric Rewards (RLRR) is a framework that extends conventional reinforcement learning from human feedback (RLHF) and verifiable rewards (RLVR) by replacing scalar preference signals with structured, multi-dimensional, conte...
Steering Frozen LLMs: Adaptive Social Alignment via Online Prompt Routing
arXiv:2603.15647v1 Announce Type: new
Abstract: Large language models (LLMs) are typically governed by post-training alignment (e.g., RLHF or DPO), which yields a largely static policy during deployment and inference. However, real-world safety is a full-lifecycle problem: static defenses degrade a...
Sustaining diplomacy amid competition in US-China relations
At MIT, former U.S. ambassador to China Nicholas Burns highlights climate change as an area for diplomatic engagement, while exploring areas including China's emphasis on STEM education.