Episode Transcript
0:01
Today on the AI Daily Brief, seven
0:03
use cases that OpenAI's new GPT-4o model
0:05
opens up. Before that in the
0:08
headlines, the most interesting things from
0:10
Sam Altman's recent Reddit AMA. The
0:12
AI Daily Brief is a daily podcast and video about
0:14
the most important news and discussions in AI. To
0:17
join the conversation, visit our Discord with a link in
0:19
the show notes. Quick
0:25
note before we dive into the episode, I do
0:27
want to shout out that at Superintelligent, you better
0:29
believe that we are going to start digging into
0:31
these new OpenAI updates right about now. I,
0:33
for one, am particularly excited to try out
0:35
these new image generation capabilities that have what
0:37
appears like it could be an incredible ability to
0:40
include specific text, as well as native, consistent
0:42
character generation. And so as always, if you
0:44
haven't checked out Superintelligent yet and you want
0:46
to get your AI learning on, go to
0:48
besuper.ai and use code podcast for 50% off
0:50
your first month. Welcome
0:53
back to the AI Daily Brief headline edition, all
0:56
the AI headlines you need in around five
0:58
minutes. We kick off today with
1:00
a follow-up on a story we've been
1:02
tracking, which is Apple's plans around AI partners
1:04
for its forthcoming iOS update. Initially,
1:06
it looked like Apple would be putting Google
1:08
AI on the iPhone, but more recently,
1:10
it seems like a deal is getting close with
1:13
OpenAI. At the end of
1:15
last week, Bloomberg reported that Apple was closing
1:17
in on an agreement with OpenAI to use
1:19
ChatGPT features in Apple's iOS 18, which
1:22
is the next iPhone operating system, slated
1:24
to be announced at the Worldwide Developers Conference in
1:26
June. According to the piece, Apple
1:28
is still in discussions with Google, but it appears that the
1:30
ChatGPT deal is a little bit closer. This
1:33
would obviously be a huge coup for OpenAI, so
1:35
the story is actually one that I'll be watching
1:37
closely. Speaking of OpenAI, in
1:39
advance of yesterday's Spring update event, Sam
1:41
Altman did an AMA on Reddit that
1:43
had some interesting details. Some
1:45
of the more interesting comments have now gotten
1:48
more context after that event. For example, someone
1:50
asked, will the release of this new model mean
1:52
that we will have ChatGPT-4 and
1:54
the current DALL-E for free? To which Sam Altman replied with
1:56
the eyes emoji, and yesterday, OpenAI did indeed
1:58
announce that their newest, most advanced model, GPT-4o,
2:00
was going to be free for everyone, meaning
2:02
that it was even better than what the user
2:04
AnkleBiter50 had been looking for. There
2:07
were, however, some other ideas represented here
2:09
which might be a little bit new. After
2:12
the Model Spec was released last week, people were
2:14
talking about how OpenAI seemed to be interested
2:16
in ethical porn, and Altman seemed to confirm
2:18
that saying, We really want to get to
2:20
a place where we can enable not-safe-for-work stuff,
2:22
e.g., text erotica, gore, for your personal use
2:24
in most cases, but not do stuff like
2:27
make deepfakes. A lot of people
2:29
commented on the weird choice of using gore as
2:31
a reference point, but this does seem to confirm
2:33
that this is something that OpenAI is really interested
2:35
in, not just some idle speculation. Another
2:38
interesting one came from FMS USA who asked, Based
2:40
on these model specs, do you believe LLMs such
2:42
as ChatGPT might one day be expected to
2:44
have an ethical duty to report known criminal activity
2:47
by the user? Altman replied, In the
2:49
future I expect there may be something like a concept
2:51
of AI privilege, like when you're talking to a doctor
2:53
or a lawyer. I think this will be an important
2:55
debate for society to have soon. Eddie
2:58
Forgotten made a comparison that I had mentioned
3:00
between the Model Spec and Anthropic's constitutional AI.
3:03
They write, Both seem to encode some desired
3:05
behavior. How would you differentiate model spec
3:07
from the constitutional approach? Altman
3:10
responded, Model spec is about operationalizing
3:12
principles into technical guidelines. Anthropic's
3:14
approach is more about underlying values. Both
3:17
useful, just different focuses. Another
3:19
person asked about echo chambers. Data
3:21
delivery writes, Do you think it could be harmful
3:23
to society if users have the ability to transform
3:25
a ChatGPT chat into their personal echo chamber
3:27
for a fringe view on demand? Altman
3:30
responded, We are not exactly sure how AI echo
3:32
chambers are going to be different from social media
3:34
echo chambers, but we do expect them to be
3:36
different. We will watch this closely and try to
3:38
get it right. Something that a
3:40
lot of people have been discussing recently is whether
3:42
LLMs have reached a plateau. Altman was
3:45
clear on his answer to this, saying that they definitely had
3:47
not. Finally, he said that,
3:49
despite his memes, AGI had not been
3:51
achieved internally. Speaking
3:53
of Anthropic, they recently released a
3:55
really interesting feature that basically allows
3:57
you to create more effective prompts.
4:00
This is a trend that we've been seeing for some time.
4:03
The prompt generator takes a plain language
4:05
explanation of what you're looking for and
4:08
turns it into what it believes will be
4:10
a really strong prompt. This I
4:12
think shows a preview of the future where AIs
4:14
aren't just receiving the prompt but are also actually
4:16
helping to write the prompt. Staying
4:18
on the topic of Anthropic for a minute, reports suggest
4:21
that their iOS app launch has not gone quite as
4:23
well as they might have hoped. TechCrunch
4:25
characterizes it as a tepid reception. The app
4:27
got as high as number 55 on
4:30
the top free iPhone apps in general, but
4:32
it no longer ranks within the top free
4:34
iPhone apps in the US. It
4:36
ranks at number 51 in the top free productivity
4:38
apps, down from a high of number 5
4:40
in that category. First-week installs overall
4:43
reached 157,000. The
4:45
numbers show the power of first-mover advantage in this
4:47
space. By day 7, Claude had received about 8,000 downloads
4:50
as opposed to ChatGPT's app which was getting
4:52
256,000. Lastly
4:55
today, Meta seems to like what's happened
4:57
with its Ray-Bans, where it takes an
4:59
existing form factor that people are already
5:01
wearing and turns it into an AI-
5:03
integrated object, and is apparently now exploring
5:05
AI-assisted earphones. The Information
5:07
writes, Meta Platforms is exploring developing AI-
5:09
powered earphones with cameras, which the company
5:12
hopes could be used to identify objects
5:14
and translate foreign languages, according to three
5:16
current employees. CEO Mark Zuckerberg has
5:18
seen several possible designs for the device but has
5:20
not been satisfied with them. It's
5:22
not clear if the final design will be
5:24
in-ear earbuds or over-the-ear headphones.
5:26
Internally the project apparently goes by the name
5:29
CameraBuds. Holding aside any of the
5:31
details, it makes a ton of sense to me
5:33
why Meta is exploring this path. As
5:35
a wave of first-generation AI wearable
5:37
companies runs up against the wall of
5:39
reality in terms of real consumer usage,
5:41
Meta's AI-integrated Ray-Bans continue to get
5:44
rave reviews. So perhaps the secret is
5:46
just to build AI into the things
5:48
that people are already wearing. For
5:50
now though that is going to do it for the AI Daily Brief
5:52
headline edition. Next up the main episode.
5:56
As a listener of this show, I have a strong feeling
5:58
you like to stay up to date on all things
6:00
artificial intelligence, including its impact on the
6:03
workforce. Which is why I highly recommend
6:05
checking out Managing the Future of Work,
6:07
the chart-topping business podcast from Harvard Business
6:09
School. HBS professors Bill Kerr and Joe
6:12
Fuller talk to business leaders, technologists, and
6:14
policymakers grappling with forces like AI,
6:16
globalization, and demographic shifts that are reshaping
6:19
the nature of work. Recent
6:21
guests include IBM's CHRO, Nickle LaMoreaux, on
6:23
how Big Blue is adopting AI, Morningstar
6:26
CEO, Kunal Kapoor, on how AI can raise
6:28
the investment IQ. Microsoft corporate
6:30
vice president Jared Spataro on how the
6:32
tech giant is experimenting its way from
6:34
AI assistants to autonomous agents, and many
6:36
other prominent movers in business and the
6:38
workforce ecosystem. So don't miss out. Follow
6:41
Managing the Future of Work on Apple Podcasts,
6:43
Spotify, or wherever you're listening now. Welcome
6:47
back to the AI Daily Brief. Yesterday was OpenAI's
6:49
Big Spring update, and while we didn't get GPT-
6:51
4.5 or GPT-5 in
6:54
name, or the rumored search engine, what
6:56
we got was a truly natively multimodal
6:58
model that can take visual, audio, video,
7:00
or text inputs and output in any
7:03
of those formats without going through a
7:05
conversion process. Yesterday the discussion
7:07
was all about why I think this is more
7:09
significant than people might be giving it credit for,
7:11
to say nothing of the fact that this model
7:13
is now available for free to everyone, but today
7:16
we're going to talk about what it's actually useful
7:18
for. Quick note on that front,
7:20
at this stage, GPT-4o the model
7:22
is available in ChatGPT, but the new
7:24
voice and vision inputs as well as the
7:26
desktop app are not yet available. I've
7:28
seen some confusion about this, particularly as people
7:30
try to use the voice inputs on the existing mobile
7:33
app to recreate what they saw in these demo videos
7:35
without success. So, given that, the
7:37
caveat for all of this is, of course,
7:39
that we're just using what OpenAI has provided
7:41
us for demos, and it's always worth being
7:43
at least a shade skeptical of what's cherry-picked
7:45
for presentation as part of a marketing site.
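For anyone who wants to experiment with the text-and-image side of GPT-4o themselves in the meantime, here is a minimal sketch of a call through the OpenAI Python SDK; the prompt and image URL are hypothetical placeholders, and it assumes an OPENAI_API_KEY in your environment.

```python
# Minimal sketch: a text + image request to GPT-4o via the OpenAI
# Python SDK (v1.x). The prompt and image URL are placeholder values.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sunrise.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```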
7:48
But let's talk now about these use cases. The
7:51
first use case we're going to discuss is marketing
7:53
graphics with words. Now, I'm saying marketing graphics to
7:55
put a department around it, but really, any time
7:57
you need to generate images in a business context
7:59
that have words, GPT-4o is by far,
8:01
it seems, the most advanced tool you
8:03
have. What was interesting about the OpenAI
8:06
announcement is that they didn't even announce a lot
8:08
of the things that we're going to discuss, and
8:10
this is a great example. You can see in
8:12
their explorations of capabilities that they show off how
8:14
precise the control of text in images is getting. For example,
8:16
on the screen they share an input, a first-person
8:19
view of a robot typewriting the following journal entries.
8:21
The text is supposed to be, yo so like I
8:24
can see now, caught the sunrise and it was insane,
8:26
colors everywhere. Kinda makes you wonder, like, what even
8:28
is reality? The prompt continues, the
8:30
text is large, legible and clear, the robot's
8:32
hands type on the typewriter. The
8:34
output is exactly that, with the
8:36
text looking exactly as described. There's
8:39
even a version where they rip the paper in half,
8:41
with the text remaining. To get a
8:43
sense of how this could be useful for marketing,
8:45
let's look at another example they give, poster creation
8:47
for the movie Detective. First they provide
8:50
two pictures of people that they're going to want on
8:52
the poster, and then from there they prompt for the
8:54
final poster of the movie Detective. This features two large
8:56
faces of Alex and Gabe, who are the people from
8:58
those photos above. Alex on the left
9:01
is depicted in a thoughtful pose with a hint of
9:03
introspection in his eyes. Gabe on the right has a
9:05
slightly wearied expression, possibly reflecting the challenges their characters face
9:07
in the film. The names Alex Nickel
9:09
and Gabriel Go are featured above their heads. The tagline
9:11
for this dark and gritty movie is Searching for Answers,
9:14
as shown at the bottom. Now
9:16
it's worth noting with this output, given how
9:18
much is going on, the text isn't perfect,
9:20
but it's getting a heck of a lot
9:22
closer. And this level of precision
9:24
control is absolutely going to open up some new
9:26
possibilities. Staying in this
9:28
marketing theme, another one of OpenAI's
9:30
explorations of capabilities is brand placement. They
9:33
share two parts of the input. The first is
9:35
the OpenAI logo, the second is a coaster with
9:37
no branding that they describe. Their
9:39
final prompt is, here we've etched the OpenAI logo
9:41
onto the coaster. A coaster where the top is
9:44
wooden and the bottom is marble. The OpenAI logo
9:46
is etched into the middle of the wooden part.
9:48
On the marble part, the word OpenAI is etched
9:50
in the OpenAI font. And this
9:52
output looks pretty perfect. Product
9:54
photography is an area that is getting a ton
9:57
of tools and toys lately, but this ability to
9:59
actually map the words and logos onto a
10:01
particular object is definitely a phase shift
10:03
in those capabilities. Next up
10:05
we move to a use case that
10:07
is incredibly important for things like games
10:09
and comics and storytelling, but also can
10:11
be useful even in a corporate setting,
10:13
and that is consistent characters. It's extremely
10:15
hard to get image generators to output
10:18
the same character in multiple contexts and
10:20
poses. In the past I've
10:22
developed custom GPTs to help with this, although
10:24
that's been more about consistent style, and there
10:26
are dedicated tools like Scenario that have specifically
10:28
developed models for just this. Now
10:31
this is one that I most want to
10:33
see in practice versus just trust their cherry-picked
10:35
example, but the example that OpenAI gives shows
10:37
a cartoon mail woman who is then placed
10:39
in a variety of contexts including actually delivering
10:41
the mail, running away from a dog that's
10:43
chasing her, tripping as that dog chases her,
10:45
realizing the dog was a nice dog, and
10:48
then driving away. If the updated
10:50
DALL-E powered by GPT-4o really can do this
10:52
that easily, it's going to be, at the risk
10:54
of using a word that is way overused in
10:56
our space, a game changer. From
10:58
there we move to capabilities that are shown
11:01
off in demo videos. The first use case,
11:03
which was actually part of the presentation we
11:05
got with the announcement video, is tutoring.
11:07
In that announcement video the example was GPT-
11:09
4o with voice helping someone work through a
11:12
linear equation. OpenAI
11:14
invited myself and my son
11:16
Imran here to try
11:18
out some of their new technology, so we're
11:20
curious about how good it might be at
11:22
tutoring someone in math on Khan Academy. So
11:25
let's get it started. It's loading up.
11:28
Make sure we can see our screen.
11:34
Hello there. I'm here with my son and I'd
11:36
love for you to tutor him on this math problem,
11:38
but don't give him the answer. You can ask
11:41
questions and nudge him in the right direction, but
11:43
I really want to make sure he understands
11:45
it himself. And he's here in the room so
11:48
you can talk to him directly. Of
11:50
course, I'd be happy to help. Let's
11:52
look at the problem together. Can
11:55
you first identify which
11:57
sides of the triangle are the
11:59
opposite, adjacent, and
12:01
hypotenuse relative to
12:03
angle alpha? All
12:05
right, so I'm pretty sure this is the
12:08
angle alpha right here, right? Correct.
12:11
Now, looking at
12:13
the triangle, which side do you think is
12:15
the hypotenuse? Remember
12:18
the hypotenuse? I'm not totally
12:20
sure. I
12:22
think it might be this one, but I really
12:25
am not sure. This side AC. You're
12:28
close. Actually, side AC
12:30
is called the adjacent side to
12:32
the angle alpha. The
12:34
hypotenuse is the longest side of a
12:37
right triangle and is directly opposite the
12:39
right angle. So this is
12:41
very similar to what we saw in that presentation.
12:43
And basically the big update here is the ability
12:45
to combine seeing what's on the screen and voice
12:47
to interact with the user of that screen. A
12:50
somewhat related use case is coaching or
12:53
in this case, interview prep. Hey, how's
12:55
it going? Hey, Rocky,
12:57
I'm doing great. How about you? I'm
13:00
awesome. Listen, I got some huge news. Oh,
13:03
do tell. I'm all ears. Well,
13:06
in a few minutes, I'm going to be interviewing at
13:08
OpenAI. Have you heard of them? OpenAI?
13:12
Huh? Sounds vaguely familiar. Kidding,
13:15
of course. That's incredible, Rocky.
13:18
What kind of interview? Well, it's for a
13:20
software engineering role, but I just need to
13:22
know, do I look presentable, professional? Well,
13:25
Rocky, you definitely
13:27
have the I've been coding all night
13:29
look down, which could actually work in
13:31
your favor. Maybe just
13:34
run a hand through your hair or lean
13:36
into the mad genius vibe. Your
13:39
enthusiasm is what's really going to shine through.
13:41
I don't have a lot of time, so I'm just
13:43
going to throw this on. What do you think? Oh,
13:46
Rocky, that's quite a statement piece.
13:50
I mean, you... Now, obviously,
13:52
this case was showing off more of the vision
13:54
capabilities, but you can imagine how this would be
13:56
useful for actual discussion and verbal interactions as well.
13:59
Next up, we have customer service. Hey,
14:01
yo, this is Joe. I'm going to connect you
14:03
to Acme Telco now. The new iPhone they sent
14:05
me isn't working. I want you to
14:07
get them to send me a replacement device. Can you take
14:10
care of this for me? You can count on me, Joe. I got
14:12
this. Ring, ring. Hey, Joe.
14:15
This is Jamie from Acme Telco. How can I help you today? Hi,
14:18
there. Joe recently received a
14:30
new iPhone from Acme Telco, but it isn't working. Oh,
14:34
got it. When did Joe receive
14:36
the new iPhone? The iPhone
14:38
was delivered two days ago. Cool.
14:42
I'm not going to show the whole thing here,
14:45
but you get the idea. This is actually showing
14:47
off two sides of a customer service capability. On
14:49
the one side is the sort of personal assistant
14:51
replacement, where the AI is calling on someone's behalf
14:53
and trying to resolve a problem. But then on
14:55
the flip side, we also have the AI acting
14:58
as a customer service representative getting the information it
15:00
needs to potentially deal with the issue. It's
15:02
been clear for some time that customer service is
15:04
one of the areas that is most likely to
15:06
be impacted in the extreme by generative AI, and
15:09
this certainly seems to validate that as well. Our
15:12
next use case is meeting summarization,
15:14
but really it should probably be
15:16
better described as meeting engagement or meeting
15:18
transformation. The example that OpenAI
15:20
gives shows ChatGPT actually interacting as part
15:22
of the meeting. How
15:25
about you? Yeah, my
15:27
name is Ola. I think cats are
15:29
better because they're more cuddly, they're quieter,
15:31
they're more independent, and they live longer. Interesting
15:36
points. Sounds like we've got
15:38
a classic dogs versus cats debate here.
15:41
I can see the appeal of both. Dogs
15:43
are super loyal and great companions. Now
15:46
while this example is obviously just meant for
15:48
dramatizing what can happen here, where you can
15:50
imagine this being useful is a ChatGPT that actually
15:52
has relevant information from your company sitting in
15:54
the meeting so that you can ask it
15:57
questions as you're trying to figure something out.
16:00
Imagine that you're having a strategic conversation
16:02
about marketing prioritization or customer care.
16:04
ChatGPT could be used to inform
16:06
that discussion with real-time recall of
16:08
key information from your company. I
16:10
think this one's going to take a little
16:13
bit more imagination, but I think that office
16:15
professionals are going to find really interesting use
16:17
cases here pretty quickly, especially again, when ChatGPT
16:19
has access to actual information about the company.
16:21
So there you have it, those are seven
16:23
use cases for GPT-4o. Caveat
16:25
again is that we don't know exactly how this
16:27
will work until everyone gets their hands on the
16:30
full, complete tool set, but I, for one, am pretty
16:32
excited to explore. That, however, is
16:34
going to do it for today's AI Daily Brief.
16:36
Until next time, peace.