RADAR: Learning to Route with Asymmetry-aware DistAnce Representations
arXiv:2603.03388v1 Announce Type: new
Abstract: Recent neural solvers have achieved strong performance on vehicle routing problems (VRPs), yet they mainly assume symmetric Euclidean distances, restricting applicability to real-world scenarios. A core challenge is encoding the relational features in...
Towards Improved Sentence Representations using Token Graphs
arXiv:2603.03389v1 Announce Type: new
Abstract: Obtaining a single-vector representation from a Large Language Model's (LLM) token-level outputs is a critical step for nearly all sentence-level tasks. However, standard pooling methods like mean or max aggregation treat tokens as an independent set,...
Heterogeneous Time Constants Improve Stability in Equilibrium Propagation
arXiv:2603.03402v1 Announce Type: new
Abstract: Equilibrium propagation (EP) is a biologically plausible alternative to backpropagation for training neural networks. However, existing EP models use a uniform scalar time step dt, which corresponds biologically to a membrane time constant that is het...
Asymmetric Goal Drift in Coding Agents Under Value Conflict
arXiv:2603.03456v1 Announce Type: new
Abstract: Agentic coding agents are increasingly deployed autonomously, at scale, and over long-context horizons. Throughout an agent's lifetime, it must navigate tensions between explicit instructions, learned values, and environmental pressures, often in cont...
Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants
arXiv:2603.03565v1 Announce Type: new
Abstract: Conversational shopping assistants (CSAs) represent a compelling application of agentic AI, but moving from prototype to production reveals two underexplored challenges: how to evaluate multi-turn interactions and how to optimize tightly coupled multi...
Mozi: Governed Autonomy for Drug Discovery LLM Agents
arXiv:2603.03655v1 Announce Type: new
Abstract: Tool-augmented large language model (LLM) agents promise to unify scientific reasoning with computation, yet their deployment in high-stakes domains like drug discovery is bottlenecked by two critical barriers: unconstrained tool-use governance and po...
MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation
arXiv:2603.03680v1 Announce Type: new
Abstract: Large Language Model (LLM) agents have demonstrated remarkable proficiency in learned tasks, yet they often struggle to adapt to non-stationary environments with feedback. While In-Context Learning and external memory offer some flexibility, they fail...
AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment
arXiv:2603.03686v1 Announce Type: new
Abstract: Automated design of chemical formulations is a cornerstone of materials science, yet it requires navigating a high-dimensional combinatorial space involving discrete compositional choices and continuous geometric constraints. Existing Large Language M...
Introducing ChatGPT for Excel and new financial data integrations
OpenAI introduces ChatGPT for Excel and new financial app integrations, powered by GPT-5.4 to accelerate modeling, research, and analysis in regulated environments.
Stop Tuning Hyperparameters. Start Tuning Your Problem.
80% of ML projects fail from bad problem framing, not bad models. A 5-step protocol to define the right problem before you write training code.
The post Stop Tuning Hyperparameters. Start Tuning Your Problem. appeared first on Towards Data Science.
LangWatch Open Sources the Missing Evaluation Layer for AI Agents to Enable End-to-End Tracing, Simulation, and Systematic Testing
As AI development shifts from simple chat interfaces to complex, multi-step autonomous agents, the industry has encountered a significant bottleneck: non-determinism. Unlike traditional software where code follows a predictable path, agents built on LLMs introduce a high degree of variance. LangWatc...
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model
We are pleased to announce Phi-4-reasoning-vision-15B, a 15 billion parameter open-weight multimodal reasoning model, available through Microsoft Foundry, HuggingFace and GitHub. Phi-4-reasoning-vision-15B is a broadly capable model that can b...
We’re releasing a CLI along with our first set of skills to give AI coding agents expertise in the LangSmith ecosystem. This includes adding tracing to agents, understanding their execution, building test sets, and evaluating performance. On our eval set, this bumps Claude Code’s performance on
We’re releasing our first set of skills to give AI coding agents expertise in the open source LangChain ecosystem. This includes building agents with LangChain, LangGraph, and Deep Agents. On our eval set, this bumps Claude Code’s performance on these tasks from 29% to 95%. What
Guardrails, Not Roadblocks: The Adaptive AI Integrity Framework for Generative and Agentic Governance
Implementing the Adaptive AI Integrity Framework (AAIF): A pragmatic, scalable governance model to balance speed and safety when deploying generative and agentic AI.
Knack, the no-code platform for building custom business applications, today announced the launch of Knack Health, a dedicated healthcare product designed to help organizations build secure, HIPAA-compliant applications and databases without writing code. Knack Health provides healthcare teams with ...
Bugcrowd’s authorized cybersecurity platform enables federal agencies and other highly regulated industries to rapidly deploy offensive testing across the globe. Bugcrowd today announced it has achieved FedRAMP Moderate Authorization, sponsored by the U.S. Cybersecurity and Infrastructure Security A...
LILT Launches Industry-First MCP Server and Agent-to-Agent Integration
LILT MCP bridges the gap between GenAI speed and enterprise-grade quality. LILT, the leading provider of enterprise AI translation, today announced the launch of its Model Context Protocol (MCP) Server and Agent-to-Agent (A2A) card. This integration enables professional, brand-aligned translations di...