Researchers from the Shanghai Artificial Intelligence Laboratory and Tsinghua University have developed a speech-based AI ...
Google launches Gemini Robotics 1.5 and Robotics-ER 1.5 to bring planning, reasoning, and action to real-world robots, ...
Microsoft adds Agent Mode and Office Agent to Microsoft 365 Copilot; OpenAI and Anthropic models generate, evaluate and ...
Microsoft claims that Agent Mode will make M365 Copilot more reliable in Excel. In its tests, Agent Mode received a score of ...
MMLU-Pro holds steady at 85.0, AIME 2025 slightly improves to 89.3, while GPQA-Diamond dips from 80.7 to 79.9. Coding and agent benchmarks tell a similar story, with Codeforces ratings rising from ...
According to OpenAI, the tasks were created by professionals with an average of 14 years of experience in relevant fields to reflect "real work products, such as a legal brief, an engineering ...
Oracle’s shift to AI infrastructure, major contracts, and rapid cloud growth position it for strong future growth.
Discover how DeepSeek V3.1 Terminus balances stability, cost efficiency, and reasoning for practical AI applications in this ...
Artificial intelligence has taken many forms over the years and is still evolving. Will machines soon surpass human knowledge ...
Naomi Saphra thinks that most research into language models focuses too much on the finished product. She’s mining the ...
One of the first randomized controlled trials assessing the effectiveness of a large language model (LLM) chatbot ‘Amanda’ for relationship support shows that a single session of chatbot therapy can ...