Kyutai is a French AI research lab with a $330 million budget that will make everything open source

Tibert@jlai.lu · edit-2 1 year ago

Kyutai is a French AI research lab with a $330 million budget that will make everything open source

q47tx@lemmy.world · 1 year ago

deleted by creator

LUHG@lemmy.world · 1 year ago

330m is not much.

weew@lemmy.ca · 1 year ago

Maybe they can buy… three nVidia GPUs and electricity to run them!

RGB3x3@lemmy.world · 1 year ago

In that case, I’ll let them get a discount on the one I’m selling… Now only $100m.

oce 🐆@jlai.lu · 1 year ago

That’s thousands of salaries for a year, that’s not too bad for an unknown company. More than enough to produce something that can attract more funding. Many startups became successful with less funding.

voluble@lemmy.world · edit-2 1 year ago

$330m is not nothing. But, with funding split between a telecom CEO and a shipping & logistics CEO, a person has to wonder what sort of direction & tuning the team might be encouraged to explore. How will they stack up against existing & proven open source non-profits with impressive releases like EleutherAI?

These open source projects are neat, in that they give the average person the opportunity to peek under the hood of an LLM that they’d never be able to run on consumer level hardware. There are some interesting things to find, especially in the dataset snapshots that Eleuther made available.

In general, kind of cool to see France being on the cutting edge of these things. And I think it’s worth saluting any project that moves to decentralize power from states and megacorps, who seal wonderful, powerful things in black boxes.

erwan@lemmy.ml · 1 year ago

France is on the cutting edge of AI indeed: FAIR (Facebook AI lab) has a big office in Paris and its boss is Yann LeCun. So there are plenty of researchers getting trained on the state of the art.

Goldmage263@sh.itjust.works · 1 year ago

Makes sense it’d be the French again. They pioneered the internet after all.

Sandbag@lemmy.world · 1 year ago

I’m sorry, are you crazy? Do you know any part of the internet’s history? It was American universities, government and defense contractors that created the internet.

undetermined@lemmy.world · 1 year ago

deleted by creator

Goldmage263@sh.itjust.works · 1 year ago

Thanks for the assist. I’m not an expert on the deep lore of the internet, but remember a few things from History class.

HaggierRapscallier@feddit.nl · edit-2 1 year ago

What did the Teletubbies have to do with it then? I could have sworn early development was tied with their government research?

dustyData@lemmy.world · 1 year ago

You could read the article. It was actually DARPA, from the Department of Defense and not the CIA, that initially created the first working network. But it was some time later that CYCLADES in France demonstrated the first inter-network, with a lot of the working concepts that would later make the internet as we know it today. It wouldn’t go global until we invented the TCP/IP protocols, which was a joint effort of a lot of universities in Europe and the USA.

yamanii@lemmy.world · 1 year ago

I hope they actually do, unlike "Open"AI

cyd@lemmy.world · edit-2 1 year ago

Ideally, they’d just blow the entire $330M training an LLM, and release the weights. In reality, much of that money will probably go into paying salaries, various smaller research projects, etc.

Mahlzeit@feddit.de · 1 year ago

Ideally, they wouldn’t be paying salaries? What?

AeroLemming@lemm.ee · edit-2 2 months ago

deleted by creator

cyd@lemmy.world · edit-2 1 year ago

The context is that LLMs need a big up front capital expenditure to get started, because of the processor time to train these giant neural networks. This is a huge barrier to the development of a fully open source LLM. Once such a foundation model is available, building on top of it is relatively cheaper; one can then envision an explosion of open source models targeting specific applications, which would be amazing.

So if the bulk of this €300M could go into training, it would go a long way to plugging the gap. But in reality, a lot of that sum is going to be dissipated into other expenses, so there’s going to be a lot less than €300M for actual training.

interceder270@lemmy.world · 1 year ago

Is there any way we can decentralize the training of neural networks?

I recall something being released a while ago that let people use their computers for scientific computations. Couldn’t something similar be done for training AI?

Mahlzeit@feddit.de · 1 year ago

There is a project (AI Horde) that allows you to donate compute for inference. I’m not sure why the same doesn’t exist for training. I think the RAM/VRAM requirements just can’t be lowered/split.

Another way to contribute is by helping with training data. LAION, which created the dataset behind Stable Diffusion, is a volunteer effort. Stable Diffusion itself was developed at a tax-funded public university in Germany. However, the cost of the processing for training, etc. was covered by a single rich guy.

Sanyanov@lemmy.world · 1 year ago

Btw yes! Why not include such a project in something like BOINC and let people help train free AI?

Dojan@lemmy.world · 1 year ago

Folding at home.

I dunno. I wouldn’t lend my spare power to put people out of a job.

5BC2E7@lemmy.world · 1 year ago

?? Do you know people with enough time, qualifications and money who are willing to work for free? I don’t.

Thekingoflorda@lemmy.world · 1 year ago

I know a few people on a certain site that host a kind of reddit alternative in their free time.

5BC2E7@lemmy.world · 1 year ago

I still don’t know any.

puppy@lemmy.world · 1 year ago

I think you just replied to one.

Thekingoflorda@lemmy.world · 1 year ago

No not me lol, I’m a glorified moderator. The guys I’m talking about are the actual hosts of this website.

5BC2E7@lemmy.world · 1 year ago

I don’t know him but I do know that people have been banned for criticizing Lemmy administration wrt potential monetization.

I don’t want to be banned so I will not comment further on the admin.

I don’t deny that there are software engineers that would work for free but they are not common.

iByteABit [he/him]@lemm.ee · 1 year ago

The potential monetization being the donations to give them a living wage? What exactly is the criticism about, that they shouldn’t get a living wage through donations and should rather make the platform paid or ruin it with ads?

Why do so many people get off on attacking the idea of FOSS? Much of the software running our everyday lives is built on FOSS (and lots of those projects also receive donations so that the devs can afford to put food on their tables).

There are also many devs (on Lemmy as well) that contribute a lot without being paid, simply because they like the project, want to make it better, and want to learn by doing.

interceder270@lemmy.world · 1 year ago

I don’t think many people were paid to work on the Fediverse.

Or emulators.

Or most free software.

Honytawk@lemmy.zip · 1 year ago

That isn’t a job to them though, it is more like a hobby.

If you want people’s undivided attention, you will have to pay them, no matter how utopian your vision.

Which you can easily afford with $330 million in funding.

Viking_Hippie@lemmy.world · 1 year ago

Methinks cyd might be a libertarian 😄

Honytawk@lemmy.zip · 1 year ago

Good luck training an LLM without any developer.

AutoTL;DR@lemmings.world · 1 year ago

This is the best summary I could come up with:

This morning at Scaleway’s ai-PULSE conference, French billionaire and Iliad CEO Xavier Niel gave some extra details about his plans for an AI research lab based in Paris.

Six men took the stage this morning to talk about their previous work and what they have in mind for the research lab — Patrick Perez, Edouard Grave, Hervé Jegou, Laurent Mazaré, Neil Zeghidour and Alexandre Defossez.

Kyutai has also put together a team of scientific advisors who are well-known AI researchers — Yejin Choi, Yann LeCun and Bernhard Schölkopf.

“When it comes to the timeline, I don’t think our aim is necessarily to go as fast as Mistral, because our ambition is to provide a scientific purpose, an understanding and a code base to explain the results,” Defossez said at the press conference.

Macron also used this opportunity to define and defend France’s position on Europe’s AI Act, saying that use cases should be regulated, not model makers.

“It’s not a question of defining good models, but we need to ensure that the services made available to our citizens are safe for them, for other economic players and for our democracy,” Macron said.

The original article contains 905 words, the summary contains 192 words. Saved 79%. I’m a bot and I’m open source!

whyNotSquirrel@sh.itjust.works · 1 year ago

with translation as well! good bot!

Tibert@jlai.lu · edit-2 1 year ago

The article is in English. Only the one in the post text for additional info is in French.

jack@monero.town · 1 year ago

Please put a space between the link and parenthesis so the link doesn’t break

Echo Dot@feddit.uk · 1 year ago

https://www.clubic.com/actualite-509350-intelligence-artificielle-xavier-niel-free-et-l-ancien-pdg-de-google-lancent-kyutai-un-concurrent-europeen-a-openai.html

There I fixed the link. Sadly still in some weird arcane language but never mind

Tibert@jlai.lu · 1 year ago

Sry did it. The apps I use seem to be smart enough to stop at html.

SchizoDenji@lemm.ee · 1 year ago

Smart. Even Google knows that they can’t compete with open source models since open source development of AI models is much more optimized and a compliance serving model can’t catch up with it.

So an open source model is their best way to leapfrog these giants.

RAM@discuss.tchncs.de · 1 year ago

nice :)

vrighter@discuss.tchncs.de · 1 year ago

so they have enough money to train one model

dustyData@lemmy.world · edit-2 1 year ago

It seems like their goal is not to train new LLMs, but to actually do scientific research. Large language models are such a tiny part of the whole machine learning and AI field that it’s ridiculous the amount of attention they get from mass media. But people do like their stupid chatbots.

Treczoks@lemmy.world · 1 year ago

What use would an AI be if it was made by French developers? The source would likely be in French (i.e. Variables, functions, objects names as well as comments). Yes, they are that in love with their own language. Check out their names for about everything related to computers…

filister@lemmy.world · 1 year ago

Tell me you know nothing about coding without telling me

REdOG@lemmy.world · 1 year ago

Wait until he finds out about obfuscated code he’s going to be real frenchy then

kautau@lemmy.world · 1 year ago

h shpxva jbg z8

ChaoticNeutralCzech@feddit.de · edit-2 1 year ago

příkaz trojúhelník "a
  pd
  opakuj 3 
  [
    dopředu :a
    vpravo 120
  ]
konec

porcariasagrada@kbin.social · 1 year ago

wut!? did some baguette french kiss your mom or something?

whyNotSquirrel@sh.itjust.works · 1 year ago

Isn’t it Quebec (Canada) you’re thinking about?

I’ve never seen French code in my jobs. It’s in English. Most frameworks are in English anyway, so why would they code in French?

PHP Symfony is from a French company, and it’s in English, with docs also available in English.

And yes, there might be French translations of English words. How is that crazy? That’s the definition of a language, otherwise we would all have the same words for everything and therefore the same language.

Schneemensch@programming.dev · 1 year ago

Speaking as a SW engineer from Germany: you would be surprised how much code exists in other languages. But I would expect companies at the cutting edge of technology, which are working closely either with universities or with open source, to usually choose English.

Touching_Grass@lemmy.world · 1 year ago

In Quebec all code must be in both official languages, Maple and C

Infiltrated_ad8271@kbin.social · edit-2 1 year ago

Let me guess, you are a ~~murcian~~ murican who only understands english.

Skua@kbin.social · 1 year ago

Must be really difficult being from Spain and only knowing English

Infiltrated_ad8271@kbin.social · edit-2 1 year ago

If you had put as much effort into searching on the internet as into my comment history to speculate where I’m from, you would know that in Spain (depending on the region) it’s mandatory to learn up to 4 languages.

Skua@kbin.social · 1 year ago

I’ve not looked at your comment history at all, I was just making a silly joke about how you wrote “Murcian” instead of “Murican”. No harm intended

Infiltrated_ad8271@kbin.social · 1 year ago

Oh, hehe.

Treczoks@lemmy.world · 1 year ago

No, I’m not, but I had to fight with a source where the original author was no longer available - and everything was in French. It sucks.

Echo Dot@feddit.uk · 1 year ago

The source code was in a programming language, which will have been in English. It’ll be the code comments and variable names that were in French, and you can translate those.

Anyway, the accepted convention is to program in English if it’s going to be available to an international audience.

Treczoks@lemmy.world · 1 year ago

> and you can translate those

You know that there was a time before google translate, yes?

OsrsNeedsF2P@lemmy.ml · 1 year ago

You ever heard of VLC?

msgomez06@literature.cafe · 1 year ago

I’ll take this opportunity to highlight that Scikit-Learn (Open Source ML library) is developed in large part by INRIA (based in Paris) and people have been relying on their code for preprocessing, baselines, and the rest for a long time. And all of the documentation is in English.

evranch@lemmy.ca · 1 year ago

On the other side, MicMac, which is by far the best free photogrammetry package, is developed by France’s IGN and it’s loaded with French comments, function and variable names etc…

However the English wiki has come a LONG way since I first had to try to figure it out, and while it’s still much more of a box of tools and parts than a single click app, it’s likely gone from “set of blueprints and sack of unsorted bolts” to “kit car with rolling chassis”

BastingChemina@slrpnk.net · 1 year ago

The difference here is that MicMac was probably developed as an internal tool, with no intention to distribute it at first.

Treczoks@lemmy.world · 1 year ago

Good for them. I had different experiences.

idefix@sh.itjust.works · 1 year ago

Seriously, all code produced by French devs is in English, minus a few personal projects

GBU_28@lemm.ee · 1 year ago

Meh most software is largely in English. It’s not quite like commercial piloting but it’s pretty prevalent

Damage@feddit.it · 1 year ago

Eh get the AI to translate the code to your language of preference.

Personally as an Italian, I think it would be good for Europeans to learn other languages aside from English… And the most widespread are French and Spanish.

xePBMg9@lemmynsfw.com · edit-2 1 year ago

I went through school having Spanish classes for 4 years and Italian for two. It was obligatory to choose a third language. The second I was done with school, I had already forgotten everything I learnt of those languages. If I am not gonna use those languages daily, I will forget them. It was a monumental waste of the students’ time. Now I have a spouse that speaks another language, one that was never an option to learn in school. We both speak English though. There is little way to predict what languages will be useful for a student. English as a second language is a good bet. Every other European, except Russians, has been able to communicate with me in English. 99% will never benefit from a third language. They should have taught computer science or something instead. Before LLMs I thought they should at least have focused on teaching languages that were hard to machine translate, like Russian, Japanese or Korean. That is my opinion.

Merwyn@sh.itjust.works · 1 year ago

Probably not, this is Quebec. I’m in a French lab and everything is written in English. You don’t really have a choice as you are collaborating internationally. Even if the lab is based/funded in France, not all of the people inside will be French. They plan to have scientific advisors that are not French according to the link.

Flying Squid@lemmy.world · 1 year ago

Let’s say you’re right and it will only be in French. It’s a language with hundreds of millions of speakers. Why not have it in French?

Treczoks@lemmy.world · 1 year ago

While it may have hundreds of millions of speakers, there are still billions of people who don’t speak french.

kadu@lemmy.world · 1 year ago

deleted by creator

5BC2E7@lemmy.world · 1 year ago

I was assuming they would not self-sabotage like that, but if they do then you can use an LLM to fix that issue with translations

Kyutai is a French AI research lab with a $330 million budget that will make everything open source | TechCrunch