Reinforcement Learning Course

Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

Ant Group, an affiliate of Alibaba, released Ring-1T, which it says is the first trillion-parameter open-source model.

The End of Tabula Rasa: How Pre-Trained World Models are Redefining Reinforcement Learning

For a long time, the core idea in reinforcement learning (RL) was that AI agents should learn every new task from scratch, like a blank slate. This "tabula rasa" approach led to amazing achievements, ...

Cognizant's AI Lab Announces Breakthrough Research for Fine-Tuning LLMs and Records its 61st U.S. Patent Issuance

Cognizant (Nasdaq: CTSH) today announced a breakthrough from its AI Lab that introduces a novel, efficiency-focused method ...

Nvidia researchers boost LLMs reasoning skills by getting them to 'think' during pre-training

By teaching models to reason during foundational training, the verifier-free method aims to reduce logical errors and boost ...

Healthcare IT News

NTU leads app-based psychological first aid training in Singapore

Featuring AI-powered role-play simulations, the app allows learners to practise recognising distress and offering empathetic ...

This Startup Wants to Spark a US DeepSeek Moment

With the US falling behind on open source models, one startup has a bold idea for democratizing AI: let anyone run ...

The reinforcement gap — or why some AI skills improve faster than others

AI tasks that work well with reinforcement learning are getting better fast — and threatening to leave the rest of the ...

Communications of the ACM

Shields for Safe Reinforcement Learning

Evaluating the advantages and potential drawbacks of shielding as a method for safe RL. Bettina Könighofer is an assistant ...

NextBigFuture

Looking at Current AI Learning Frameworks to Create Learning Pipelines to Achieve Superintelligence

Andrej Karpathy says that reinforcement learning is still terrible but better than all other AI learning approaches. Elon ...

Communications of the ACM

The Reasons AI May Act Secretive

When responding to a prompt, an AI model may conceal information from the user entering the prompt. This practice, known as ...
