Machine Learning

machinelearning@lemmy.ml

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 9 months ago

The Bitter Lesson is coming for Tokenization

lucalp.dev

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

The Attention Mechanism Born for Cost Optimization

oilbeater.com

thickertoofan@lemm.ee · English · 1 year ago

dcdaML - devanagari character detection dataset training framework

github.com

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Neural Graffiti is an experiment in adding a "Spray Layer" to a transformer model, which injects a memory trace into the final stages of inference without finetuning or retraining

github.com

fubarx@lemmy.world · 1 year ago

Breaking GPT-5 News!

4Robato@lemmy.world · English · 1 year ago

I want to open source a dataset but I'm not sure what license to use

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Why do LLMs make stuff up? New research peers under the hood.

arstechnica.com

oba@lemmy.world · English · 1 year ago

MLOps tips I gathered recently

www.readyforagents.com

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

DeepSeek open source DeepEP – library for MoE training and Inference

github.com

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

transformer-circuits.pub

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

transformer-circuits.pub

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

arxiv.org

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Neurosymbolic AI -- Why, What, and How

arxiv.org

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Classical Sorting Algorithms as a Model of Morphogenesis: self-sorting arrays reveal unexpected competencies in a minimal model of basal intelligence

arxiv.org

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

Genie 2: A large-scale foundation world model

deepmind.google

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago

A good primer on what to expect running local LLMs

nullprogram.com

Shamar@feddit.it · English · 1 year ago

A community statement supporting the Open Source Definition (OSD)

osd.fyi

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 years ago

How ‘Embeddings’ Encode What Words Mean

www.quantamagazine.org

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 years ago

New AI model “learns” how to simulate Super Mario Bros. from video footage

arstechnica.com

☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 years ago

Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o)

huggingface.co
