Jensen Huang says kids shouldn't learn to code — they should leave it up to AI.

L4sBot@lemmy.world · 9 months ago

Jensen Huang says kids shouldn't learn to code — they should leave it up to AI.

Dojan@lemmy.world · 9 months ago

There are definitely ways in which LLMs and imaging models are useful. Hell, I’ve been playing around with vocal synthesis for years, and SynthV’s AI models are amazing, so even for music there are use cases. The problem is big corporations just fucking it up. Rampant theft, no compensation for the original creators, and then they’re sitting on the models like dragons. OpenAI should have renamed themselves years ago, because there’s nothing open about them.

The way I see it, the way SynthV (and VOCALOID prior to that) works is great; you hire a vocalist with the express purpose of making a model out of their voice. They know what they’re getting into, and are getting compensated for it. Then there are licenses and such on these models. In some cases, like those produced by Eclipsed Sounds, anyone that uses a model to create a song gets decently free rein. In others, like the Bushiroad models, you are fairly restricted in what you can do with them.

Meaning the original artist has a say. It’s why some models, like Cangqiong, will never get AI updates; the voice provider’s wishes matter.

Using computer generated stuff as a crutch in the creation process is perfectly fine I feel, but outright trying to replace humans with “AI” is a ridiculous notion.

Skvlp@lemm.ee · 9 months ago

There have been synths that have been used to trigger vocal samples, among other things, for like 40(?) years, and this almost sounds like an evolution of that?

There are a lot of technological innovations in music (wax roll recording, tape recording, DAW recording, tube amps, transistor amps, amp modellers, Mellotron, analog synths, modular synths, digital synths, soft-synths, etc, etc, etc), and I think there’s surely more to come, and awesome new music to be made possible from the technological advances.

I agree that the technology is not the problem, but how it’s used. If, let’s say, giant corporations feed all of human art into their closed, proprietary models only to churn out endless amounts of disposable entertainment, it would be detrimental to the creation of original art and I’d look upon that as a bad thing. But I guess we as a society have decided that we want to empower our corporate overlords at the expense of ourselves, to go far off topic of the original thread :/

Dojan@lemmy.world · 9 months ago

> There have been synths that have been used to trigger vocal samples, among other things, for like 40(?) years, and this almost sounds like an evolution of that?

Kind of? But I think, particularly with SynthV offering such realistic vocals, it might be useful for producers that can’t easily get vocalists, or don’t want to/can’t sing themselves. You can also obviously use it to create backing vocals and fill things out if you realise that you need more vocals and your vocalist isn’t available.

Or, maybe you’re like me and just enjoy tinkering with the voice. Here’s an example song by someone that’s pretty talented at tuning these.

> let’s say, giant corporations feed all of human art into their closed, proprietary models only to churn out endless amounts of disposable entertainment, it would be detrimental to the creation of original art and I’d look upon that as a bad thing. But I guess we as a society have decided that we want to empower our corporate overlords at the expense of ourselves, to go far off topic of the original thread :/

This is the road I fear we’re heading down and it’s so dystopic. 😭

Skvlp@lemm.ee · 9 months ago

Those vocals are pretty good for being computer generated. It’s no replacement for greats like Bowie, Simone, Jagger, Winehouse, Yorke, etc, etc, but it’s not supposed to be. Sometimes it’ll do the trick, sometimes it’ll be a necessity, it’ll work for some backing vocals, demos, sketches, songwriting experimentation, guide vocals, and so on. I hope we’ll see awesome AI tools being used to make awesome music.

I definitely have that fear myself, but I hope human resilience hangs in there. Besides, I don’t think I’d care if the masses listen to bland shit by 17 songwriters or bland shit by AI ;)

Dojan@lemmy.world · edited 9 months ago

The quality of the vocals is now honestly less dependent on the synthesis engine than on the skill of the original singer, and the intent of the production team. Hayden is a first-party library produced by Dreamtonics, and they tend to be very focused on having their voices do a specific thing. Ninezero for example is all-in on that gravelly rock type voice and won’t do soft ballads easily or with any particular quality.

This was true even for VOCALOID; most of the VOCALOID libraries are absolute bunk. YAMAHA’s (developer of VOCALOID) first signature English library, CYBER DIVA, sounds so bad. The (in my opinion) best library for VOCALOID happens to be a Hello Kitty collaboration. For some reason they chose a traditional Japanese singer with an incredible vocal range to be the voice provider rather than a voice actor, and the quality of that voice is reflected in the voice library.

Eclipsed Sounds has three libraries now, and they’ve focused more on capturing the qualities of the original singer. Their first library, SOLARIA, is a soprano whose voice is provided by Emma Rowley. Their second library, ASTERIAN, is a bass, voiced by Eric Hollaway (known as ‘thatbassvoice’). Their third, SAROS, is a tenor whose provider I don’t think has come forth yet. They are much more expressive than most libraries produced by Dreamtonics. SAROS’ second vocal demo is a great example.

One of the neat things about them being synthesized is that these libraries can sing in English, Japanese, Mandarin, Cantonese, and Spanish (and with some fiddling, likely in other languages too - I managed to get SAROS to perform in Norwegian thanks to the Spanish update). Where SynthV really falls short is the occasional glitches when you push the vocals, as well as the lack of vocal ornamentation; there’s no good way of performing, say, growls at the moment.

I think ultimately human creativity will persevere. We’ll likely see a lot of AI generated garbage as people are getting used to the tools and finding ways of working with them in the next couple of years. After that, I don’t know. Even then there’ll be people that prefer to just do everything by themselves.

We manage to make garbage even without AI. Disney’s “Wish” was so bad people think AI was used, but I think it’s more a matter of “direction by corporate.” Corporate decided to seagull the entire project and the original creative vision was basically destroyed by corporate interests. You see it all the time in the games industry as well; creativity is set aside for proven established ideas, and market appeal. Risks are not allowed.
