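[AI 뉴스] GPT-4o Voice Mode, 휴머노이드 근황, 구글 1등 탈환, 오픈소스 이미지 1등 FLUX, 로봇 치과의사 성공, 수노 맞고소 등

[AI News] GPT-4o Voice Mode, Humanoid Updates, Google Retakes First Place, FLUX Tops Open-Source Image Models, Robot Dentist Succeeds, Suno Countersues, and More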

조코딩

조코딩의 팟캐스트

GPT-4o 보이스모드가 일부 사용자 대상으로 풀리기 시작했습니다.

GPT-4o voice mode has started rolling out to some users.

가을쯤에는 다 풀린다고 합니다.

They say it will be available to everyone around fall.

그래서 이제 데모들이 올라오고 있는데 너무 신기합니다.

So now the demos are coming out, and it's very fascinating.

동물 소리도 낼 수 있다고 합니다.

It is said that it can also make animal sounds.

Can you bark like a dog?

Can you oink like a pig?

Can you cluck like a chicken?

와 미쳤죠?

Wow, that's insane!

마지막이 진짜 미친 것 같아요.

The last one seems really crazy.

마치 진짜 사람처럼 동물 소리 내가지고

It makes the animal sounds just the way a real person would,

내가 좀 멋져 근데? 라는 그 느낌까지도 재현합니다.

It even captures the feeling of "Aren't I a bit cool?"

다시 한번 볼까요?

Shall we take a look again?

와 미쳤죠?

Wow, isn't it crazy?

그리고 이런 것도 됩니다.

And this is also possible.

노래도 부릅니다.

It sings, too.

Twinkle twinkle little star

How I wonder what you are

How I wonder what you are.

와 노래도 불러줍니다.

Wow, it even sings for you.

그냥 노래 불러줘라고 하면 진짜 그 노래를 들을 수가 있어요.

If you just ask it to sing, you really get to hear the song.

랩도 됩니다.

It raps, too.

랩이랑 비트박스도 합니다.

It does rap and beatboxing.

미쳤죠?

Insane, right?

이런 것도 가능합니다.

This is also possible.

와 미쳤죠?

Wow, that's crazy!

울어요.

It cries.

울먹울먹하면서 말하는 것까지 가능합니다.

It's possible to speak while choking up with emotion.

거의 사람 아니에요 이 정도면?

Isn't this almost a person?

연기하는 것도 됩니다.

It can act, too.

벅스 버니, 요다, 호머 심슨

Bugs Bunny, Yoda, Homer Simpson

똑같이 따라해요.

It imitates them perfectly.

이제 성대모사도 되고 연기도 되고 노래도 되고 랩도 되고 비트박스도 됩니다.

Now it can do impressions, acting, singing, rapping, and beatboxing.

다 됩니다.

It can do it all.

Her 현실판이죠?

It's a real-life version of Her, right?

기존에도 막 AI와 사랑에 빠지는 막 그런 분들도 있는데

There were already people falling in love with AI,

이제는 더 하지 않을까?

and won't that happen even more now?

같이 울고 같이 웃고 이런 게 다 되잖아요.

It can cry with you and laugh with you; it can do all of that.

인간이 돼버렸습니다 이제.

It has become human now.

네 이것뿐만 아닙니다.

Yes, it's not just this.

하나 더 보여드리면

If I show you one more thing...

1부터 10까지 빨리 세줘.

Count quickly from 1 to 10.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Now even faster.

이제 더 빠르게.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Louder and faster and count up to 50.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10

11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34

와 이거 들으셨나요?

Wow, did you hear that?

숨 한번 고르고 얘기하는 거.

It catches its breath before going on.

이것 또한 인간을 따라해요.

It imitates humans in this, too.

사실상 인간이지 않을까 이 정도면.

Isn't it essentially human at this point?

그래서 혹시 풀리신 분들이 있다면

So if any of you have already gotten access,

한 번씩 써보시기 바랍니다.

please give it a try.

그 다음에 오픈AI에서 새로운 모델을 또 냈습니다.

Next, OpenAI released yet another new model.

출력 16배 늘린 롱 아웃풋이라는 새로운 모델을 출시를 했습니다.

It launched a new model called Long Output, with 16 times the output length.

보통 인풋을 많이 늘리거든요.

Usually it's the input that gets extended.

이번에는 아웃풋을 늘린 모델이 나왔습니다.

This time, a model with increased output has been released.

GPT-4o 64K 아웃풋 알파

GPT-4o 64K Output Alpha

이거는 쿼리당 최대 6만 4천 개의 출력 토큰을 제공한다고 합니다.

It is said to provide up to 64,000 output tokens per query.

끝까지 생성할 수 있게 하면

If it can generate all the way to the end,

좀 더 뭔가 코드 생성이나 이런 데서 유리하지 않을까.

wouldn't that be an advantage for things like code generation?

공식 홈페이지 보면 프라이싱이 나와 있는데 좀 비쌉니다.

If you check the official website, the pricing is listed, and it's a bit expensive.

6달러.

6 dollars.

1밀리언 토큰에.

Per 1 million tokens.

아웃풋이 기니까 그만큼 또 비싸요.

Since the output is long, it is that much more expensive:

1밀리언에 18달러.

$18 per 1 million.
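
As a rough worked example of these prices, the sketch below estimates what a single request could cost. It assumes the $6 figure is the price per 1 million input tokens and the $18 figure is the price per 1 million output tokens, which the speaker implies but does not spell out. The function and constant names are made up for illustration and are not part of any OpenAI API.

```python
# Prices as quoted in the episode (assumed split: $6 input, $18 output).
INPUT_USD_PER_MILLION = 6.0
OUTPUT_USD_PER_MILLION = 18.0
MAX_OUTPUT_TOKENS = 64_000  # "up to 64,000 output tokens per query"


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 64K limit quoted in the episode")
    cost = input_tokens * INPUT_USD_PER_MILLION
    cost += output_tokens * OUTPUT_USD_PER_MILLION
    return cost / 1_000_000


# A request that uses the full 64K output and no input at all.
print(f"{request_cost_usd(0, 64_000):.3f}")  # 1.152
```

Under these assumptions, filling the whole 64K output window costs a little over one dollar per request, before counting any input tokens.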

그리고 이제 이렇게 GPT 어마어마한데

And now, GPT is just amazing like this.

이거를 적용한 로봇이 또 새롭게 나왔습니다.

A robot that has applied this has been newly released.

피규어 AI.

Figure AI.

휴머노이드 로봇을 만드는 기업이죠.

It's a company that makes humanoid robots.

여기서 새로운 로봇 피규어 2가 나온다고 합니다.

They say its new robot, Figure 02, is coming out.

세계에서 가장 진보한 로봇이라고 해요.

They say it's the most advanced robot in the world.

이야 깔끔해진 것 같아요.

Wow, it looks sleeker

이전 버전보다.

than the previous version.

이야 이 관절도 이렇게 싹.

Wow, this joint is like this too.

부드럽게 완전 깔끔해졌죠.

It has become completely clean and smooth.

오 야 이거 되게 부드럽게 싹 이렇게 움직일 수 있게 돼 있죠.

Oh wow, this is really designed to move so smoothly.

와 이게 티저만 나온 상황인데

Wow, this is just the teaser that has come out.

8월 6일 날 공개된다고 합니다.

It is said that it will be released on August 6th.

야 그러면 여기에 이제 노래 부르고 막 울고 감정 있는 GPT-4가

Then if GPT-4, the one that sings, cries, and has emotions,

여기에 들어가면 얼마나 진짜 사람 같을까.

goes into this, how much like a real person will it be?

그래서 이제 브렛 애드콕 피규어 AI를 창립한 파운더인데

Brett Adcock, the founder of Figure AI, said:

수십억 대의 지능형 휴머노이드 로봇을 확장할 수 있는 기회 창이 열렸다.

"A window of opportunity has opened to scale to billions of intelligent humanoid robots."

그래서 우리는 2024년에 살고 있으며

"We are living in 2024,

역사상 처음으로.

and for the first time in history,

이것이 가능한 해라고 하면서

this is the year it becomes possible," he said.

인생은 곧 SF영화로 바뀔 것입니다.

"Life will soon turn into a science fiction movie."

휴머노이드 로봇이 이제 진짜 인간적으로 소통하고

Humanoid robots communicating in a truly human way,

인간적으로 움직이고 잡을 수 있는 게

and moving and grasping like humans,

이제 실제 현실이 돼버리고 있습니다.

is now becoming reality.

굉장히 기대가 됩니다.

I'm really looking forward to it.

그리고 로봇 관련해서

And regarding robots,

시그래프라는 컨퍼런스가 최근에 열렸는데

A conference called SIGGRAPH was held recently,

시그래프에서 엔비디아가 이제 휴머노이드 로봇 개발을 위한

and at SIGGRAPH, NVIDIA, for humanoid robot development,

주요 기술 이런 것도 공개를 했거든요.

unveiled some of its key technologies.

비전 프로를 이용해서 이 로봇이랑 똑같이 움직입니다.

Using a Vision Pro, the operator and the robot move in exactly the same way.

비전 프로를 끼고 이렇게 움직이니까 거의 실시간으로 따라하죠 이렇게.

When the operator wears the Vision Pro and moves like this, the robot follows almost in real time.

이렇게 해서 로봇이.

In this way,

로봇에게 훈련을 시킬 수가 있다고 합니다.

they say you can train the robot.

이걸 잘 훈련하면은 알아서 수행할 수가 있겠죠.

If it is trained well, it will be able to perform on its own,

훈련 받은 걸로.

based on what it was trained on.

그렇게 해서 이제 실제 현실에서 로봇이 이제 인간의 행동들을 배우고

That way, robots can learn human behaviors in the real world

특정 태스크들을 수행할 수 있지 않을까.

and perhaps perform specific tasks.

이건 또 다른 업체인데

This is another company.

뉴라라는 업체에서

A company called NEURA

For Anyone이라는 새로운 로봇의 영상을 공개했는데

released a video of a new robot called 4NE-1 (pronounced "for anyone"),

우주에서 가장 많은 일을 처리한다라는 멘트가 있었습니다.

with the line that it handles the most tasks in the universe.

이렇게 다림질하고 뭐 썰고

Ironing like this, chopping things,

뭐 이런 거 하는 데모를 보여주고 있습니다.

it shows demos of tasks like these.

옷걸이에 걸고 박스 옮기고 이런 데모를 보여주고 있어요.

It hangs clothes on hangers and moves boxes in the demos.

그래서 점점 이런 휴머노이드 로봇들이

So humanoid robots like these

엄청 발전이 되고 있는데

are advancing enormously,

그리고 심지어 인간형 로봇만 있는 게 아닙니다.

And it's not just humanoid robots.

말 로봇이 있습니다.

There's a horse robot.

너무 자연스러워요.

It's so natural.

말 로봇.

A horse robot.

오 야 눈 깜빡.

Whoa, it blinks.

스윽.

Swoosh.

아무튼 이런 로봇도 있고 점점 발전을 하고 있습니다.

Anyway, there are robots like this, and they are gradually evolving.

그다음에 또 자동 드론인데

Next is an autonomous drone.

장거리 비행하려면 어딘가에서 충전하고 가야 되잖아요.

To fly long distances, it has to recharge somewhere along the way, right?

전선을 이용해서 충전을 하는 드론이 나왔습니다.

A drone that charges from power lines has come out.

네, 날아가다가 결합을 한다고 합니다.

Yes, they say it latches on mid-flight.

싹 결합했죠, 지금.

It just latched on.

그래가지고 충전을 합니다.

And that's how it charges.

전기 도둑질인 것 같긴 한데

It does look like stealing electricity, but

중간에 충전을 합니다.

it charges midway.

아, 충전 끝났다 하면은

Then, once charging is done,

여기서 싹

it slips right off here

빠져나갑니다.

and flies away.

그러면은

That way,

플라이 리차지 플라이 리차지 하면서

going fly, recharge, fly, recharge,

장거리 비행을 할 수 있다.

it can cover long distances.

미국 땅 워낙 넓으니까

The United States is so vast.

미국 끝에서 끝까지 바로 투입하고 갈 수도 있겠죠.

It could be deployed straight from one end of the United States to the other.

그다음에 완전 자동 로봇 치과 의사.

Then, a fully automated robot dentist.

세계 최초로 인간 시술을 성공했다고 합니다.

It's said to have performed the world's first successful procedure on a human.

이런 로봇이 있다고 해요.

They say there's a robot like this.

치과 의사 선생님의 미래 모습입니다.

This is what the future of dentistry looks like.

네, 이거 지금 하고 있죠?

Yes, it's working right now, see?

치잉 하고 치료하고 있어요.

It's whirring away, doing the treatment.

그래서 인간 치과 의사보다

So, compared with a human dentist,

약 8배 빠르게 작업 완료.

it finished the job about 8 times faster.

2번 방문에 걸쳐 2시간이 걸리는 작업을

A task that takes 2 hours over 2 visits.

로봇 치과 의사는 약 15분 만에 완료했다.

The robot dentist completed it in about 15 minutes.
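
The "about 8 times faster" figure follows directly from the two durations quoted here. A minimal check of that arithmetic, with a hypothetical helper name:

```python
def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """How many times faster the new duration is than the baseline."""
    if new_minutes <= 0:
        raise ValueError("new_minutes must be positive")
    return baseline_minutes / new_minutes


human_minutes = 2 * 60   # "2 hours over 2 visits"
robot_minutes = 15       # "about 15 minutes"
print(speedup(human_minutes, robot_minutes))  # 8.0
```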

효율성 미쳤죠.

The efficiency is insane.

이런 고급 치과 의사 같은

Even precise work like that of a highly skilled dentist

정교한 일들도 아예 치과 로봇 이런 게 다 수행을 하는

is now being carried out entirely by dental robots;

그런 시대가 되고 있는 것 같습니다.

that's the era we seem to be entering.

그다음에 런웨이 ML에서

Next, Runway ML

GEN3 알파의 이미지 투 비디오 기능을 새롭게 공개했습니다.

newly released the image-to-video feature of Gen-3 Alpha.

예를 들어 이 이미지를 넣고 물이 떨어지는 영상이라고 하면

For example, if you give it this image and ask for a video of water dripping,

이렇게 물 떨어지게 영상 만드는 거.

it makes a video with water dripping like this.

퀄리티가 엄청납니다.

The quality is amazing.

일관성이나 이런 게 잘 안 깨지고요.

Consistency and the like hold up well.

나뭇잎이 떨어짐 하면 이렇게 싹

Prompt "leaves falling" and, just like that,

부드럽게 영상이 잘 나오는 것을 볼 수가 있습니다.

you can see the video comes out smoothly.

이런 것도 퀄리티 좋죠.

This kind of thing has good quality, right?

이런 것도

Even this.

와

Wow

10렙스 돌린 것 같죠 마치

It feels like I've rolled a 10-level spin, doesn't it?

이게 또 놀라운 게 감정 표현도 잘 된다고 해요.

What's even more surprising is that it expresses emotions well too.

이거 이미지를 만들어 가지고

You generate an image like this,

해피 앤 터킹이라고 하면은

and if you prompt "happy and talking,"

와 웃으면서 이렇게 말하는 거 되고요.

wow, it talks while smiling like this.

새드, 슬픈 표정

"Sad": a sad expression.

이렇게 프롬트만 쓰면 된다고 합니다.

They say all you need to do is write a prompt like this.

리스닝, 듣고 있는 거

"Listening": a listening look.

샤이, 부끄러워하는 거

"Shy": looking bashful.

쇽트, 충격

"Shocked": shock.

아 이런 게 된다고 합니다.

They say things like this work.

여기에 만약에 앞서 나왔던 GPT-4 보이스

If the GPT-4 voice from earlier,

그 감정이 더해지면 샤이하게 웃는 거 할 때

with its emotions, is added here, then for a shy smile

약간 이렇게 수줍게 할 거고

it would act a little bashful like this,

말할 때 이렇게 경청하는 표정 나오고 할 수가 있겠죠.

and it could show an attentive listening face while you talk.

라이브 포트레이트를 쓰면은 입모양도 싹 맞출 수 있을 거고요.

With LivePortrait, you could sync the mouth shapes too.

기술들이 각각 올라오니까 이걸 융합하면은

As each technology emerges, if we fuse them together...

아 진짜 엄청날 것 같아요.

Oh, I think it’s going to be amazing.

젠3 알파 터보라는 것도 또 발표를 했대요.

They also announced something called Gen-3 Alpha Turbo.

그래서 11초 만에 10초 영상을 만들어 준다고 합니다.

They say it makes a 10-second video in 11 seconds.

거의 실시간이에요.

It's almost real-time.
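
One way to put a number on "almost real-time" is a real-time factor: generation time divided by the length of the video produced. The sketch below uses only the two figures quoted in the episode; the function name is made up for illustration.

```python
def real_time_factor(generation_seconds: float, video_seconds: float) -> float:
    """Seconds of compute needed per second of generated video."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    return generation_seconds / video_seconds


rtf = real_time_factor(11, 10)  # 11 seconds to make a 10-second clip
print(f"{rtf:.2f}")             # 1.10
print(rtf <= 1.0)               # False: still slightly slower than real time
```

A factor at or below 1.0 would mean the clip is generated at least as fast as it plays back, which is what live streaming of generated video would need.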

보면은 젠3 알파, 젠3 알파 터보의 속도입니다.

Here you can see the speed of Gen-3 Alpha versus Gen-3 Alpha Turbo.

엄청 빨라요.

It's super fast.

와 그러면은 이제 진짜 실시간 스트리밍으로 생성도 가능하겠죠.

Wow, then it would be possible to generate it in real-time streaming!

엄청납니다.

It's amazing.

그다음에 이미지 생성 쪽도 확 발전을 했습니다.

Then, the field of image generation has also significantly advanced.

블랙 포레스트 랩스라는 곳에서

A company called Black Forest Labs

플럭스라는 이미지 생성 모델을 공개를 했습니다.

released an image generation model called FLUX.

이게 기존의 이미지 생성 잘 됐다 하는 것들 있잖아요.

You know the image generators that were considered good?

미드저니 달리 스테이블 디퓨전 3 다 뛰어넘었습니다.

It has surpassed Midjourney, DALL-E, and Stable Diffusion 3, all of them.

플럭스 프로가 ELO 스코어 1060점 엄청 뛰어넘었죠.

FLUX Pro scored an Elo of 1060, far ahead of the rest.

기존의 미드저니 엄청나다 했는데

Midjourney was already considered amazing,

훨씬 더 성능이 좋은 모델이 오픈소스로 나왔습니다.

but a much better-performing model has come out as open source.

내 컴퓨터에서도 쓸 수가 있습니다.

You can even run it on your own computer.

퀄리티가 어떠냐.

How is the quality?

2015년경에 지루한 스냅챗 사진 이렇게 나온다고 합니다.

"A boring Snapchat photo from around 2015" comes out like this.

2015년의 감성이 좀 담겨 있죠.

It has that 2015 feel to it.

이렇게 나온다고 합니다.

This is how it comes out.

글씨 표현도 어마어마하게 잘 돼요.

Text rendering is incredibly good, too.

미드저니가 플럭스보다 낫다고 생각하지 않는다라는 이걸 들고

Holding a sign that says "I don't think Midjourney is better than FLUX,"

사람 표현도 되게 정확하죠.

the person is rendered very accurately, too.

그리고 손발 표현이 너무 잘 나와요.

And hands and feet come out really well.

손에 이상한 게 하나도 없어요.

There's nothing strange about the hands at all.

다 깔끔하게 잘 나옵니다.

Everything comes out neatly.

반지 낀 것도 깔끔.

The ring looks neat too.

만화 이런 것도 깔끔하죠.

Comic styles come out clean, too.

이런 것도 가능하다고 해요.

They say this is also possible.

A page of graphic novel 그려달라 라고 하면

If you ask it to draw "A page of graphic novel,"

이게 그냥 한 방에 땅 생성된 거예요.

this was generated in a single shot.

글씨가 조금 깨지긴 하는데

The lettering breaks down a little,

컷이랑 이런 거 나누는 게 굉장히 잘 나와요.

but the panel layout comes out really well.

예를 들어 미드저니에서 만든다.

For example, if you make it in Midjourney,

이렇게 나오거든요.

it comes out like this.

이렇게 말풍선도 제대로 안 나오고

The speech bubbles don't come out properly,

약간 애매하게 나오죠.

and the result is a bit vague.

아무튼 이미지 생성이 한 번에

Anyway, image generation happens all at once.

굉장히 잘 나옵니다.

It's coming out really well.

그래서 이게 완전히 공개가 됐습니다.

So this has been fully released.

레플리케이트 API로 쓸 수 있는 것도 이렇게 공개가 됐고요.

It has also been made available through the Replicate API.

그래서 이거 한 번 써보면

So if we try it out:

Extreme close-up of a single tiger eye.

The word "FLUX" is painted over it in big white brush strokes with visible texture.

The word "FLUX" is painted over it in big white brush strokes with visible texture.

와 너무 멋있어요.

Wow, that's really awesome.

호랑이의 눈 클로즈업에 글씨까지 완벽하게 나옵니다.

The text appears perfectly in the close-up of the tiger's eye.

이게 끝이 아닙니다.

This is not the end.

미드저니도 또 새로운 게 나왔습니다.

Midjourney has also released something new.

미드저니 6.1이 나왔습니다.

Midjourney 6.1 is out.

여기 퀄리티도 만만치 않습니다.

The quality here is also quite impressive.

이제는 뭐 사진만 보면 이게 AI인지 아닌지를

Now, just by looking at a photo, whether it's AI or not

구분할 수가 없을 것 같긴 합니다.

seems impossible to tell.

런웨이 젠3랑 결합하면 이런 것들을 만들 수가 있습니다.

Combined with Runway Gen-3, you can create things like this.

이렇게 미드저니로 만든 거를 영상화할 수도 있고요.

You can also turn something made in Midjourney into a video like this.

미드저니로 만들어서 업스케일하면

If you make an image in Midjourney and upscale it,

와 이런 환상적인 사진도 만들 수 있고요.

wow, you can create fantastic photos like this, too.

와 인테리어 이런 거 돌아가게 하는 것도 이런 게 되고요.

Wow, you can even make an interior rotate like this.

뮤직비디오처럼 이렇게 뭐 만드셨더라고요.

Someone made something like a music video this way.

와 이렇게 페스티벌의 한 장면.

Wow, a scene from a festival like this.

와 DJ.

Wow DJ.

이야.

Wow.

이제 뭐 뮤직비디오 이런 거

Now, things like music videos:

그냥 이걸로 그냥 딸깍으로 다 뽑아낼 수 있지 않을까.

couldn't you just crank them out with a single click?

그다음에 스테빌리티 AI.

Then there is Stability AI.

Stable Fast 3D라는 걸 출시했습니다.

It released something called Stable Fast 3D.

소개 영상을 보면

If you watch the introduction video,

이미지가 있으면 이걸 3D 오브젝트로 만들어주는 거예요.

If there is an image, it will create a 3D object from it.

근데 이게 놀랍게도 0.5초가 걸린대요.

But surprisingly, it takes 0.5 seconds.

그래서 실제로 돌려보면

So if you actually try it out...

얘를 런 돌리면

If you run this.

오 와 끝났어요.

Oh wow, it's done.

바로 3D가 싹 나옵니다.

The 3D model pops right out.

와 이제 3D 게임 바로 그냥 딸깍으로 만들 수가 있을 것 같습니다.

Wow, it seems like we can now make 3D games just by clicking!

커뮤니티 라이센스니까

It's under a community license,

1밀리언 달러 벌 때까지는 무료고

so it's free until you make 1 million dollars,

그 이상일 때 돈 내는 거.

and you pay above that.

즉 오픈소스입니다.

In other words, it is open source.

와 이제 그냥 뭐 이미지, 영상, 3D 엄청 발전하고 있습니다.

Wow, now images, videos, and 3D are really advancing a lot.

무브 AI라는 곳에서

A company called Move AI

3D 애니메이션 파워드 바이 AI라고 해서

has what it calls 3D animation powered by AI.

쫄쫄이 입잖아요.

You know those skintight suits?

이런 거 입고 막 동작 이런 거 수행해야 되고 하는데

Normally you have to wear one and perform the movements,

이런 거를 이제 그냥 찍으면 된다고 합니다.

but they say now you can just film it.

싱글 카메라로 찍으면

If you shoot with a single camera,

와 똑같이 가죠?

wow, it moves exactly the same, right?

게임 캐릭터랑

As the game character.

네 게임 캐릭터의 이런 움직임 이런 거를

Yes, movements like these for a game character:

그냥 영상만 녹화하면

if you just record a video,

바로 모션 캡처로 만들 수가 있다고 합니다.

they say you can create them right away with motion capture.

그다음에 조카소에서 댄스 AI가 드디어 나왔습니다.

Next, Dance AI has finally launched on 조카소.

그냥 사진 한 장만 올리면

Just upload a single photo,

춤 동작을 따라해서 나오게 됩니다.

and it comes out following the dance moves.

3000원 상당의 포인트로

With points worth 3,000 won,

무료로 영상을 뽑아보실 수가 있습니다.

you can generate a video for free.

기본적으로 이 어금지송에 대한 템플릿을 만들어 놨고요.

By default, we've set up a template for this 어금지송.

아니면은 내가 따라하고 싶은 춤이 있다.

Or, if there's a dance you want to copy,

더보기에 들어가서 플러스로 추가할 수가 있습니다.

you can go into More and add it with the plus button.

여기에 URL을 넣습니다.

You enter the URL here.

세로형 이렇게 혼자 나온 거 좋아요.

Vertical videos with one person alone like this work well.

여러 명 나오면 이게 안 돼요.

It doesn't work if several people appear.

복사해가지고 등록을 하면은

If you copy it and register it, then...

이렇게 템플릿 등록이 됩니다.

The template will be registered like this.

템플릿을 클릭하고 사진 한 장만 있으면 돼요.

Click the template, and all you need is one photo.

그냥 전신이 나온 사진 딱 한 장만 있으면 됩니다.

Just a single full-body photo is enough.

그러면 이렇게 등록이 되고 진행을 하면 됩니다.

Then it's registered like this and you can proceed.

그래서 이것도 좀 다양하게 해놨어요.

We've also set up a few different options:

가성비 있게 이용할 수 있는 거

a good-value one,

싼 거 그리고 퀄리티 좋은 거

a cheap one, and a high-quality one.

베이직은 기본적으로 공짜로 한번 만들어 볼 수 있으니까요.

Basic lets you make one for free,

재미로 만들어 보시기 바라겠습니다.

so please try making one for fun.

결과를 빨리 받고 싶다.

If you want your result quickly,

SNS

드리겠다라고 약속을 해서 빨리 받을 수 있게 해드립니다.

you can make that promise and we'll get it to you sooner.

이렇게 하면 결과물을 이메일로 알림을 드립니다.

Then you'll be notified of the result by email.

많은 이용 부탁드리겠습니다.

We hope you'll use it a lot.

혹시 이 영상을 보시는 엔터 업계 관계자분이 계시다 하면은

If there are any people in the entertainment industry watching this video,

저희가 템플릿을 등록을 해드립니다.

We will register the template for you.

여기 협업 문의로 연락을 주시면은

If you contact us here for collaboration inquiries,

바로 템플릿 쓸 수 있게도 등록을 해드리니까요.

we'll register it so the template can be used right away.

연락을 주시면 감사드리겠습니다.

I would appreciate it if you could contact me.

그다음에 구글이 라지 랭기지 모델 선두를 탈환했습니다.

Next, Google has retaken the lead in large language models.

제미나이 1.5 프로가 최신 버전이 나왔고요.

The latest version of Gemini 1.5 Pro has come out,

이게 1등을 찍었습니다.

and it took first place.

그래서 어떻게 1등이냐

So how is it number one?

챗봇 아레나 있죠.

You know Chatbot Arena, right?

투표를 받았을 때 대중들의 픽이

Based on votes, the public's pick,

제미나이 1.5 프로 8월 1일 버전이 1등을 찍었습니다.

the August 1 version of Gemini 1.5 Pro, took first place.

근데 물론 뭐 벤치마크 따라 다르고

But of course, it depends on the benchmark.

태스크마다 다르겠지만 일단 성능이 확 올라간 것 같습니다.

It seems that the performance has significantly improved, though it may vary by task.
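
Rankings built from head-to-head votes, like the Elo scores mentioned in this episode, can be illustrated with the classic Elo formula. The sketch below is the textbook version with an assumed K-factor of 32; the transcript does not describe how Chatbot Arena actually computes its ratings, so treat this only as an illustration of the idea.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k is an assumed step size, not a value taken from the episode.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# Two models start level; model A wins one vote.
print(update(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

Each vote moves the winner up and the loser down by the same amount, and an upset against a higher-rated model moves the ratings more than an expected win does.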

그리고 또 구글이 엄청난 게

And also, what's amazing about Google is...

초소형 오픈소스 모델인 젬마 2

Gemma 2, an ultra-small open-source model:

2 빌리언짜리를 냈거든요.

it released the 2-billion-parameter version.

굉장히 작은 거예요.

It's extremely small.

이 모델을 출시했는데

It released this model,

GPT 3.5 미스트랄 MOE 구조 7 빌리언짜리 8개 있는 거

and GPT-3.5 and Mistral's MoE model with eight 7-billion-parameter experts,

이거를 다 앞섰다고 합니다.

it's said to have beaten them all.

여기도 엘로 스코어 기준으로 1등입니다.

Here too, it ranks first by Elo score.

미스트랄이나 GPT 3.5 훨씬 앞섰어요.

It's far ahead of Mistral and GPT-3.5.

2 빌리언짜리 모델인데

And it's a 2-billion-parameter model.

그러니까 GPT 3만 하더라도

I mean, GPT-3 alone

175 빌리언이거든요.

is 175 billion parameters,

아마 3.5는 더 높지 않을까 싶은데

and 3.5 is probably even bigger,

2 빌리언으로 앞섰습니다.

yet a 2-billion model beat them.

엄청나죠.

It's amazing.

심지어 오픈소스입니다.

It's even open source.

구글도 엄청나게 좀 앞서갈 수도 있지 않을까라는 생각이 들고요.

It makes me think Google might pull well ahead.

심지어 이런 뉴스도 있습니다.

There is even news like this.

캐릭터닷 AI가 AI 서비스 측면에서

Character.AI, in terms of AI services,

가장 성공한 케이스가 아닐까 싶은데

is arguably the most successful case,

이 캐릭터닷 AI의 공동 창립자

and the co-founder of Character.AI,

이분이 트랜스포머라는 논문을 쓰신

the person who wrote the Transformer paper,

이 사람이 다시 구글로 돌아갔다고 합니다.

has reportedly gone back to Google.

구글이 캐릭터닷 AI와 라이센스 계약을 맺고

Google has entered into a licensing agreement with Character.AI

또 주요 인원을 흡수했다고 합니다.

and is said to have absorbed key personnel as well.

사실상 인수가 아닐까 싶은데

I wonder if it's essentially an acquisition,

반독점법 이런 게 있으니까

but because of things like antitrust law,

어떻게 돌려가지고 처리를 한 것 같아요.

it seems they structured it in a roundabout way.

이 캐릭터닷 AI를 노리는 기업들이 되게 많았거든요.

A lot of companies had their eye on Character.AI.

메타도 직접 서비스도 하겠다 이런 얘기도 있었고

There was talk of Meta launching such a service itself,

일론 머스크도 여기에 되게 관심이 많았는데

and Elon Musk was very interested in it as well,

구글이 결국에 흡수를 해서

but in the end Google absorbed it

엄청난 서비스를 품게 됐습니다.

and now holds a tremendous service.

AI 업계에서 꽤 앞서갈 수 있지 않을까

It may well be able to pull ahead in the AI industry.

그다음에 이제 메타도 캐릭터닷 AI

Next, Meta too, much like Character.AI,

비슷한 걸 냈습니다.

has released something similar.

맞춤형 챗봇 플랫폼을 새롭게 출시했다고 합니다.

It is said to have launched a new custom chatbot platform,

사람 대신에 인스타 활동을 하는

one that handles Instagram activity in place of a person;

그런 맞춤형 챗봇을 만드는 플랫폼을 출시했습니다.

it launched a platform for making custom chatbots like that.

그래서 이렇게 AI봇을 만들어가지고

So you create an AI bot like this,

직접 페르소나도 설정하고 채팅을 나눌 수 있게

set its persona yourself, and chat with it;

만드는 도구를 냈습니다.

that's the tool it released.

마치 그 GPTs 비슷하기도 하고

It's a bit like GPTs.

그래서 어떤 특정한 인격이나

So, with a specific personality or

어떤 특정한 성격을 가진 AI들을 만들어서

a specific character, you create AIs,

DM이나 뭐 왓셉이나 이런 걸로 대화를 나눌 수가 있겠죠.

and people can talk to them via DM, WhatsApp, and so on.

근데 안타깝게도 한국에선 막혀있습니다.

But unfortunately, it is blocked in Korea.

조코딩 AI 이런 거 만들면은

If I made, say, a JoCoding AI,

저 대신에 저의 성격이나 했던 말이나 이런 거를 다 담고 있다가

it would hold my personality and the things I've said in my place,

이걸 기반으로 뭐 대화하거나 하는

and converse based on that;

그런 것도 만들 수가 있겠죠.

you could make something like that, right?

게다가 유명 배우들과 음성 사용 협상을 했다고 합니다.

Furthermore, it is said that they negotiated voice usage with famous actors.

이 유명 배우들의 음성을 활용한

Using the voices of these famous actors.

이제 AI를 만들어서

to build AI,

AI 페르소나를 만들려고 하는 그런 준비 과정이겠죠.

this is presumably preparation for creating AI personas.

그렇게 되면은 아까 GPT-4o가 보여준 것처럼

If that happens, then like GPT-4o showed earlier,

유명 배우들과 자연스럽게 웃고 떠들고

you could laugh and chat naturally with famous actors

막 감정적인 소통도 할 수 있는

and even connect emotionally;

그런 시대가 될 수도 있지 않을까.

maybe that's the era we're heading into.

그리고 메타에서 비디오에서 객체를 따는 모델을 새롭게 공개를 했습니다.

And Meta has released a new model for segmenting objects in video.

Segment Anything이라고 해서

It's called Segment Anything.

이전에도 1이 나왔을 때

When version 1 came out earlier,

오 굉장히 잘 된다 했었는데

people said it worked really well,

이제 2가 나왔습니다.

and now version 2 is out.

훨씬 더 정확하게 됩니다.

It's much more accurate.

여기 지금 따라가는 거 보면 엄청나죠?

It's amazing to see how it's following right now, isn't it?

지금 손이랑 이렇게 누르고 있는데도

Even with a hand pressing down like this,

반죽의 경계를 싹 잘 구분하고 있어요.

it cleanly separates the boundary of the dough.

직접 테스트도 해볼 수 있어요.

You can also test it out yourself.

공

Ball

이렇게 트래핑하는 거

Trapping like this.

공이 되게 빠르게 움직이고 막 이러잖아요.

The ball moves really quickly and stuff like that.

이렇게 공을 선택하면은 지정이 되죠.

By selecting the ball this way, it is designated.

얘를 계속 트래킹할 수가 있습니다.

You can continue to track this.

Track Object 하면은

If you do Track Object, then...

오 이렇게 영상에 막 공이 움직이는데도

Oh, the ball is moving around in the video like this.

다 트래킹이 돼요.

Everything can be tracked.

한 번 가리는 것도 있는데도 트래킹이 됩니다.

There are times when it gets covered, but tracking still works.

엄청나죠?

Isn't it amazing?

탁구공

Table tennis ball

엄청 빠르게 움직이는 거

Something that moves incredibly fast.

싹 지정해가지고 트랙하면은

If you designate it completely and track it...

오 이렇게

Oh, like this.

야 이렇게 빨리 움직이는 게 트래킹이 돼요.

Wow, it can track something moving this fast.

그리고 이걸 활용한 예시들이 많습니다.

And there are many examples that utilize this.

Florence라고 이거랑 결합을 했는데

It's been combined with something called Florence.

Florence가 뭐냐면

What is Florence?

이미지 태그들을 붙일 수 있는 거

It's a model that can attach tags to images.

이거랑 결합하면은

If combined with this,

그래서 이렇게 영역을 구분 지어가지고

it segments the regions like this,

어떤 건지

and what each one is,

라벨링도 해주고

it labels them too;

이렇게 할 수가 있겠죠.

you could do it like this.

텍스트로 어 짚레스트라고 딱 지정을 하면은

If you specify "짚레스트" in text,

이거를 계속 트래킹할 수 있습니다.

You can continue to track this.

영상에서

In the video

와 이렇게 자연스럽게

Wow, so naturally.

와

Wow

와 이런 것도

Wow, even this!

마이크 타이슨

Mike Tyson

마이크 타이슨을 누군지 인식을 해가지고

It recognizes who Mike Tyson is

그걸 잡을 수 있는 거예요.

and can pick him out.

편집을 하거나 AI 처리를 한다고 했을 때

When you're editing or doing AI processing,

여기에 뭐 이렇게 이펙트를 넣거나 할 수도 있겠죠.

you could add effects here like this.

그리고 풍선 날아가는 거에

And with balloons flying away,

특정 풍선을 트래킹한다

tracking one specific balloon

하면은 쉽지 않을 텐데

wouldn't normally be easy,

이 풍선 딱 지정을 하면

but if you pick out this one balloon,

얘를 쭉 트래킹합니다.

it tracks it all the way.

와 엄청나죠?

Wow, isn't it amazing?

와 이렇게 들어가는 것까지

Wow, even going in like this.

아무튼 이렇게

Anyway, like this.

SAM2라는 것도 오픈소스로 공개가 됐다.

SAM2 has also been released as open source.

그 다음에

After that

라마4에 대한 정보가 나왔습니다.

Information about Llama 4 has come out.

라마4 훈련에

For training Llama 4,

라마3.1보다

compared with Llama 3.1,

GPU 10배를 더 투입한다고 합니다.

they say 10 times more GPUs will be used.

라마4도 오픈소스로 풀리면

If Llama 4 is also released as open source,

진짜 어마어마할 것 같습니다.

It seems like it will be really amazing.

그리고 오픈소스 관련해서

And regarding open source...

의료 및 금융 산업용 LLM 오픈소스를 출시했는데

an open-source LLM for the medical and financial industries has been released,

이게 뭐 GPT-4보다 우수하다

and it's said to outperform GPT-4

이런 사례가 나왔다고 합니다.

in some cases.

오픈소스 발전하면 좋은 점이

The good thing about the development of open source is

특정 분야에 맞게 파인튜닝을 하게 되면

If you fine-tune it for a specific area,

그 분야에 대한 성능이 확 올라가잖아요.

The performance in that field significantly increases.

특정 분야에서 그걸 받아다가

In a specific field, you take the model,

파인튜닝 해가지고

fine-tune it,

가공해서

adapt it,

좋아지는

and it gets better;

그런 영역들이 점점 많아지지 않을까 싶어요.

I think there will be more and more areas like that.

구글 메드팜 2

Google's Med-PaLM 2:

그거를 오픈소스가 뛰어넘는

open source has now surpassed it;

이런 사례도 나오게 됐고요.

cases like this have emerged.

파이낸스 측면에서도

On the finance side too,

다른 클로즈드 소스 모델들보다

compared with other closed-source models,

훨씬 좋은 성능을 보여주고 있습니다.

it shows much better performance.

굉장히 활용도가 높겠죠.

It should be highly useful.

각각 분야마다 좋으면 이걸 융합해가지고

If there's a good one for each field, you can combine them

멀티, 에이전트 이런 걸 쓸 수도 있으니까.

in a multi-agent setup, for instance.

그 다음에 애플은 AI 훈련에

Next, Apple, for AI training,

엔비디아 GPU를 안 쓰고

did not use NVIDIA GPUs

구글 TPU를 채택을 했다고 합니다.

and reportedly chose Google TPUs.

엔비디아 GPU가 그냥 표준이라고 생각이 들었는데

I thought NVIDIA GPUs were just the standard.

TPU로도 훈련을 하고 있는 애플을 볼 수가 있습니다.

You can see Apple training with TPU as well.

그 다음에

After that

MS는 오픈 AI의 경쟁자

MS is a competitor of OpenAI.

이게 뭐냐면

What this is, is...

사실 MS가 오픈 AI에 엄청나게 많이 투자를 했잖아요.

In fact, Microsoft has invested a huge amount in OpenAI.

그래서 사실 동맹인가 싶었는데

So I was wondering if it was actually an alliance.

경쟁자라는 문건이 나와가지고 화제가 됐습니다.

but a document naming it a competitor came out and made headlines.

오픈 AI가 서치 GPT라는 검색엔진을 냈잖아요.

OpenAI released a search engine called SearchGPT, right?

그래서 빙을 운영하는 MS와 완전 경쟁 관계가 되잖아요.

So it's now in direct competition with Microsoft, which runs Bing.

그래서 최근에 MS에서 공개한 문건에서

So, in a document Microsoft recently released,

경쟁사로 이렇게 추가를 했다고 합니다.

it reportedly added OpenAI as a competitor.

투자는 투자고 또 비즈니스는 비즈니스인가

Investment is investment, and business is business, right?

이런 생각이 듭니다.

I have this thought.

그 다음에 위기론도 좀 있었잖아요.

Next, there has been some talk of a crisis, right?

AI 버블이다.

That AI is a bubble.

지난주에도 좀 소개해드렸는데

I touched on it last week as well,

빅테크들은

but among Big Tech companies,

경쟁이 가속되고 있다고 합니다.

competition is said to be accelerating.

우려는 있지만 계속 투자를 늘리고 있는 상황입니다.

There are concerns, but they keep increasing investment.

너무 늦기보다 필요하기 전에

Rather than being too late, before the need arises,

관련 영역을 구축하는 게 낫다라고 하면서

it's better to build out capacity, they say,

투자를 늘리고 있습니다.

and they're increasing investment.

그래서 이런 뉴스도 있습니다.

So there is also news like this.

AI 지출 축소 없다.

No reduction in AI spending.

계속 투자를 확대하고

They keep expanding investment

아직 달려가고 있는 그런 상황인 것 같습니다.

and seem to be still racing ahead.

그 다음에 aiOla라는 곳에서

Next, a company called aiOla

오픈 AI 위스퍼보다 50% 빠른 음성 인식 모델을 출시했다고 합니다.

has reportedly released a speech recognition model 50% faster than OpenAI's Whisper.

이렇게 데모를 보여주고 있는데

They're showing a demo like this.

이름이 위스퍼 메두사

The name is Whisper-Medusa.

아마 이 위스퍼를 좀 가공해서 만든 것 같아요.

It looks like they built it by modifying Whisper.

속도를 보자면

Looking at the speed

굉장히

very

빠르게 나오고 있습니다.

It's coming out quickly.

이것도 오픈소스로 출시했다고 하니까요.

I heard that this has been released as open source too.

연구 및 상업적 용도로 사용이 가능합니다.

It can be used for research and commercial purposes.

한번 써봐도 좋지 않을까

It might be worth trying out.

그 다음에

After that.

깃허브에서 AI 기능 테스트까지 제공하는

GitHub, offering even AI feature testing,

깃허브 모델이라는 것을 출시했습니다.

has launched something called GitHub Models.

약간 허깅페이스 비슷한 것 같아요.

I think it's somewhat similar to Hugging Face.

이렇게 생겼다고 합니다.

This is what it looks like.

여러 모델들이 있어서

There are many models,

정보도 보고

so you can read up on them

직접 이 모델을 바로 여기서 테스트해 볼 수도 있고

and test a model directly right here.

허깅페이스 스페이스 이런 거랑 비슷하죠.

It's similar to things like Hugging Face Spaces.

어떻게 쓰면 되는지 이런 것도 보고

You can see how to use it,

코드 스페이스 통해서

and through Codespaces

클라우드에서 코딩도 되고

you can code in the cloud.

지금 웨이트리스트를 받고 있습니다.

They're taking waitlist sign-ups now.

이게 되면은 바로 쓸 수가 있다고 합니다.

Once you're approved, you can use it right away.

그 다음에

Then

수노 AI가

Suno AI

지금 음반사 대상으로 소송을 받았습니다.

has been sued by record labels.

멋대로 지금 학습에 사용한 거 아니냐

"Didn't you use our music for training without permission?"

라고 해서 소송을 받았는데

is the claim in the suit,

여기에 맞고소를 했다고 합니다.

and Suno has reportedly countersued.

되게 궁금해요.

I'm very curious.

이거 어떻게 누가 이길지

Who's going to win this, and how?

놀랍게도

Surprisingly

수노가 인정을 했습니다.

Suno admitted it.

AI 학습을 위해

For AI training,

수천만 건의 녹음을 활용했으며

it used tens of millions of recordings,

아마도 원고가 권리를 소유한

and recordings owned by the plaintiffs

녹음이 포함됐을 것

were presumably included;

이라고

that is what

밝혔다고 합니다.

it reportedly stated.

저작권 침해한 거야

"We did infringe copyright":

라고 인정하고 들어갔어요.

it went in conceding that.

근데 주장하는 게

But what it argues is:

우리는 음악가나 교사

"We support musicians, teachers,

일반인들이

and everyday people

새로운 도구를 활용해

in using new tools

음악을 만드는 것을 지원한다.

to make music."

음반사들이 시장 점유율에 대한 위협으로

Record labels see this as a threat to their market share,

이걸 바라보고 있는데

it says;

너네 지금 밥그릇을 지키려고

"Are you, just to protect your own turf,

다른 사람이 쉽게 음악 만드는 거

other people making music easily,

방해하는 거냐

getting in the way of that?"

라고

is how

이제 반박을 했다고 합니다.

it has reportedly pushed back.

어떻게 될지 궁금해요.

I wonder what will happen.

일반인 음악 생성이

That music generation by ordinary people

공정사용이다.

is fair use:

이게 받아들여진다면

if that argument is accepted,

음악 AI들이

music AIs

어마어마하게 발전할 수도 있지 않을까

could advance enormously.

재밌는 시대인 것 같아요.

I think it's a fascinating era.

과연 어떻게 될지

I wonder how it will turn out.

기대가 됩니다.

I am looking forward to it.

아무튼 이렇게 해서

Anyway, that's how

이제 이번 주

this week's

AI 소식들

AI news

한번 쭉 모아서

has been gathered up

전달해 드려봤습니다.

and delivered to you.

좋아요 한 번씩만

Just one like each,

꼭 부탁드리겠습니다.

I'd really appreciate it.

감사합니다.

Thank you.

디지털 러브

Digital Love

너와 나의 사랑은

Our love is

디지털 러브

Digital Love
