7 Use Cases for GPT-4o

Released Tuesday, 14th May 2024

Episode Transcript
0:01

Today on the AI Daily Brief, seven

0:03

use cases that OpenAI's new GPT-4o model

0:05

opens up. Before that in the

0:08

headlines, the most interesting things from

0:10

Sam Altman's recent Reddit AMA. The

0:12

AI Daily Brief is a daily podcast and video about

0:14

the most important news and discussions in AI. To

0:17

join the conversation, visit our Discord with a link in

0:19

the show notes. Quick

0:25

note before we dive into the episode, I do

0:27

want to shout out that at Superintelligent, you better

0:29

believe that we are going to start digging into

0:31

these new OpenAI updates right about now. I,

0:33

for one, am particularly excited to try out

0:35

these new image generation capabilities that have what

0:37

appears like it could be an incredible ability to

0:40

include specific text, as well as native consistent

0:42

character generation. And so as always, if you

0:44

haven't checked out Superintelligent yet and you want

0:46

to get your AI learning on, go to

0:48

bsuper.ai and use code podcast for 50% off

0:50

your first month. Welcome

0:53

back to the AI Daily Brief headline edition, all

0:56

the AI headlines you need in around five

0:58

minutes. We kick off today with

1:00

a follow up of a story we've been

1:02

tracking, which is Apple's plans around AI partners

1:04

for its forthcoming iOS update. Initially,

1:06

it looked like Apple would be putting Google

1:08

AI on the iPhone, but now more recently,

1:10

it seems like a deal is getting close with

1:13

OpenAI. At the end of

1:15

last week, Bloomberg reported that Apple was closing

1:17

in on an agreement with OpenAI to use

1:19

ChatGPT features in Apple's iOS 18, which

1:22

is the next iPhone operating system, slated

1:24

to be announced at the Worldwide Developer Conference in

1:26

June. According to the piece, Apple

1:28

is still discussing with Google, but it appears that the

1:30

ChatGPT deal is a little bit closer. This

1:33

would obviously be a huge coup for OpenAI, so

1:35

the story is actually one that I'll be watching

1:37

closely. Speaking of OpenAI, in

1:39

advance of yesterday's Spring update event, Sam

1:41

Altman did an AMA on Reddit that

1:43

had some interesting details. Some

1:45

of the more interesting comments have now gotten

1:48

more context after that event. For example, someone

1:50

asked, will making this new model mean

1:52

that we will have ChatGPT-4 and

1:54

the current DALL-E free? To which Sam Altman replied

1:56

with the eyes emoji, and yesterday, OpenAI did indeed

1:58

announce that their most advanced model, GPT-4o,

2:00

was going to be free for everyone, meaning

2:02

that it was even better than what

2:04

anklebiter50 had been looking for. There

2:07

were however some other ideas that were represented here

2:09

which might be a little bit new. After

2:12

the model spec released last week, people were

2:14

talking about how OpenAI seemed to be interested

2:16

in ethical porn, and Altman seemed to confirm

2:18

that saying, We really want to get to

2:20

a place where we can enable not-safe-for-work stuff,

2:22

e.g. text erotica, gore, for your personal use

2:24

in most cases, but not do stuff like

2:27

make deep fakes. A lot of people

2:29

commented on the weird choice of using gore as

2:31

a reference point, but this does seem to confirm

2:33

that this is something that OpenAI is really interested

2:35

in, not just some idle speculation. Another

2:38

interesting one came from FMS USA who asked, Based

2:40

on these model specs, do you believe LLMs such

2:42

as ChatGPT might one day be expected to

2:44

have an ethical duty to report known criminal activity

2:47

by the user? Altman replied, In the

2:49

future I expect there may be something like a concept

2:51

of AI privilege, like when you're talking to a doctor

2:53

or a lawyer. I think this will be an important

2:55

debate for society to have soon. Eddie

2:58

Forgotten made a comparison that I had mentioned

3:00

between the model spec and Anthropic's constitutional AI.

3:03

They write, Both seem to encode some desired

3:05

behavior. How would you differentiate model spec

3:07

from the constitutional approach? Altman

3:10

responded, Model spec is about operationalizing

3:12

principles into technical guidelines. Anthropic's

3:14

approach is more about underlying values. Both

3:17

useful, just different focuses. Another

3:19

person asked about echo chambers. Data

3:21

delivery writes, Do you think it could be harmful

3:23

to society if users have the ability to transform

3:25

a ChatGPT chat into their personal echo chamber

3:27

for a fringe view on demand? Altman

3:30

responded, We are not exactly sure how AI echo

3:32

chambers are going to be different from social media

3:34

echo chambers, but we do expect them to be

3:36

different. We will watch this closely and try to

3:38

get it right. Something that a

3:40

lot of people have been discussing recently is whether

3:42

LLMs have reached a plateau. Altman was

3:45

clear on his answer to this, saying that they definitely had

3:47

not. Finally he said that

3:49

despite his meme, AGI had not been

3:51

achieved internally. Speaking

3:53

of Anthropic, they recently released a

3:55

really interesting feature that basically allows

3:57

you to create more effective prompts.

4:00

This is a trend that we've been seeing for some time.

4:03

The prompt generator takes a plain language

4:05

explanation of what you're looking for and

4:08

turns it into what it believes will be

4:10

a really strong prompt. This I

4:12

think shows a preview of the future where AIs

4:14

aren't just receiving the prompt but are also actually

4:16

helping to write the prompt.
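The console feature itself lives in Anthropic's developer tools, but the underlying pattern, asking a model to draft a stronger prompt from a plain-language description of the task, is easy to sketch. Here is a minimal, hypothetical version in Python, assuming the anthropic SDK; the meta-prompt wording below is illustrative, not Anthropic's own.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative meta-prompt; the wording is an assumption, not Anthropic's.
META_PROMPT = (
    "You are an expert prompt engineer. Rewrite the task below as a detailed "
    "prompt with clear instructions, placeholders for any needed context, "
    "and an explicit output format.\n\nTask: {task}"
)

def draft_prompt(task: str) -> str:
    # Ask a Claude model to expand a plain-language task into a fuller prompt.
    message = client.messages.create(
        model="claude-3-opus-20240229",  # any current Claude model works here
        max_tokens=1024,
        messages=[{"role": "user", "content": META_PROMPT.format(task=task)}],
    )
    return message.content[0].text

print(draft_prompt("Summarize a week of support tickets and flag recurring complaints."))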

4:18

Staying on the topic of Anthropic for a minute, reports suggest

4:21

that their iOS app launch has not gone quite as

4:23

well as they might have hoped. TechCrunch

4:25

characterizes it as a tepid reception. The app

4:27

got as high as number 55 on

4:30

the top free iPhone apps in general, but

4:32

it no longer ranks within the top free

4:34

iPhone apps in general in the US. It

4:36

ranks as 51 in the top free productivity

4:38

apps, down from a high of number 5

4:40

in that category. First week installs overall

4:43

reached 157,000. The

4:45

numbers show the power of first mover advantage in this

4:47

space. By day 7, Claude had received about 8,000 downloads

4:50

as opposed to ChatGPT's app which was getting

4:52

256,000. Lastly

4:55

today, Meta seems to like what's happened

4:57

with its Ray-Bans, where it takes an

4:59

existing form factor that people are already

5:01

wearing and turns it into an AI

5:03

integrated object and is apparently now exploring

5:05

AI-assisted earphones. The Information

5:07

writes, Meta platforms is exploring developing AI

5:09

powered earphones with cameras which the company

5:12

hopes could be used to identify objects

5:14

and translate foreign languages according to three

5:16

current employees. CEO Mark Zuckerberg has

5:18

seen several possible designs for the device but has

5:20

not been satisfied with them. It's

5:22

not clear if the final design will be

5:24

in-ear earbuds or over-the-ear headphones.

5:26

Internally the project apparently goes by the name

5:29

CameraBuds. Holding aside any of the

5:31

details, it makes a ton of sense to me

5:33

why Meta is exploring this path. As

5:35

a wave of first generation AI wearable

5:37

companies runs up against the wall of

5:39

reality in terms of real consumer usage,

5:41

Meta's AI-integrated Ray-Bans continue to get

5:44

rave reviews. So perhaps the secret is

5:46

just to build AI into the things

5:48

that people are already wearing. For

5:50

now though that is going to do it for the AI Daily Brief

5:52

headline edition. Next up the main episode.

5:56

As a listener of this show, I have a strong feeling

5:58

you like to stay up to date on all things

6:00

artificial intelligence, including its impact on the

6:03

workforce. Which is why I highly recommend

6:05

checking out Managing the Future of Work,

6:07

the chart-topping business podcast from Harvard Business

6:09

School. HBS professors Bill Kerr and Joe

6:12

Fuller talk to business leaders, technologists, and

6:14

policymakers grappling with forces like AI,

6:16

globalization, and demographic shifts that are reshaping

6:19

the nature of work. Recent

6:21

guests include IBM's CHRO, Nickle LaMoreaux, on

6:23

how Big Blue is adopting AI, Morningstar

6:26

CEO, Kunal Kapoor, on how AI can raise

6:28

the investment IQ. Microsoft corporate

6:30

vice president Jared Spataro on how the

6:32

tech giant is experimenting its way from

6:34

AI assistants to autonomous agents, and many

6:36

other prominent movers in business and the

6:38

workforce ecosystem. So don't miss out. Follow

6:41

Managing the Future of Work on Apple Podcasts,

6:43

Spotify, or wherever you're listening now. Welcome

6:47

back to the AI Daily Brief. Yesterday was OpenAI's

6:49

Big Spring update, and while we didn't get GPT

6:51

4.5 or GPT 5 in

6:54

name, or the rumored search engine, what

6:56

we got was a truly natively multimodal

6:58

model that can take visual, audio, video,

7:00

or text inputs and output in any

7:03

of those formats without going through a

7:05

conversion process. Yesterday the discussion

7:07

was all about why I think this is more

7:09

significant than people might be giving it credit for,

7:11

to say nothing of the fact that this model

7:13

is now available for free to everyone, but today

7:16

we're going to talk about what it's actually useful

7:18

for. Quick note on that front,

7:20

at this stage, GPT-4o the model

7:22

is available in ChatGPT, but the new

7:24

voice and vision inputs as well as the

7:26

desktop app are not yet available. I've

7:28

seen there be some confusion about this, particularly as people

7:30

try to use the voice inputs on the existing mobile

7:33

app to recreate what they saw in these demo videos

7:35

without success. So, given that, the

7:37

caveat for all of this is, of course,

7:39

that we're just using what OpenAI has provided

7:41

us for demos, and it's always worth being

7:43

at least a shade skeptical of what's cherry-picked

7:45

for presentation as part of a marketing site.
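One piece developers can already poke at is the text-and-image side of the model, which is exposed through OpenAI's standard chat completions API under the gpt-4o identifier. A minimal sketch, assuming the openai Python SDK (v1 or later) and a placeholder image URL:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A single request mixing text and an image; the URL below is a placeholder.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this poster and read back any text on it."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/poster.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)

Audio and video in and out are not part of this text endpoint, so, as with ChatGPT's voice mode, the full multimodal loop still has to wait on the rollout.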

7:48

But let's talk now about these use cases. The

7:51

first use case we're going to discuss is marketing

7:53

graphics with words. Now, I'm saying marketing graphics to

7:55

put a department around it, but really, any time

7:57

you need to generate images in a business context

7:59

that have words, GPT-4o is by far,

8:01

it seems, the most advanced tool you

8:03

have. What was interesting about the OpenAI

8:06

announcement is that they didn't even announce a lot

8:08

of the things that we're going to discuss, and

8:10

this is a great example. You can see in

8:12

their exploration of capabilities that they show off how

8:14

precise the text-in-image ability is getting. For example,

8:16

on the screen they share an input, a first-person

8:19

view of a robot typewriting the following journal entries.

8:21

The text is supposed to be, yo so like I

8:24

can see now, caught the sunrise and it was insane,

8:26

colors everywhere. Kinda makes you wonder, like, what even

8:28

is reality? The prompt continues, the

8:30

text is large, legible and clear, the robot's

8:32

hands type on the typewriter. The

8:34

output is exactly that, with the

8:36

text looking exactly like described. There's

8:39

even a version where they rip the paper in half,

8:41

with the text remaining. To get a

8:43

sense of how this could be useful for marketing,

8:45

let's look at another example they give, poster creation

8:47

for the movie 'Detective'. First they provide

8:50

two pictures of people that they're going to want on

8:52

the poster, and then from there they prompt the

8:54

final poster of the movie 'Detective'. This features two large

8:56

faces of Alex and Gabe, who are the people from

8:58

those photos above. Alex on the left

9:01

is depicted in a thoughtful pose with a hint of

9:03

introspection in his eyes. Gabe on the right has a

9:05

slightly wearied expression, possibly reflecting the challenges their characters face

9:07

in the film. The names Alex Nichol

9:09

and Gabriel Goh are featured above their heads. The tagline

9:11

for this dark and gritty movie is searching for answers

9:14

as shown at the bottom. Now

9:16

it's worth noting with this output, given how

9:18

much is going on, the text isn't perfect,

9:20

but it's getting a heck of a lot

9:22

closer. And this level of precision

9:24

control is absolutely going to open up some new

9:26

possibilities. Staying in this

9:28

marketing theme, another one of OpenAI's

9:30

explorations of capabilities is brand placement. They

9:33

share two parts of the input. The first is

9:35

the OpenAI logo, the second is a coaster with

9:37

no branding that they describe. Their

9:39

final prompt is, here we've etched the OpenAI logo

9:41

onto the coaster. A coaster where the top is

9:44

wooden and the bottom is marble. The OpenAI logo

9:46

is etched into the middle of the wooden part.

9:48

On the marble part, the word OpenAI is etched

9:50

in the OpenAI font. And this

9:52

output looks pretty perfect. Product

9:54

photography is an area that is getting a ton

9:57

of tools and toys lately, but this ability to

9:59

actually map the words and logos onto a

10:01

particular object is definitely a phase shift

10:03

in those capabilities. Next up

10:05

we move to a use case that

10:07

is incredibly important for things like games

10:09

and comics and storytelling, but also can

10:11

be useful even in a corporate setting,

10:13

and that is consistent characters. It's extremely

10:15

hard to get image generators to output

10:18

the same character in multiple contexts and

10:20

poses. In the past I've

10:22

developed custom GPTs to help with this, although

10:24

that's been more about consistent style, and there

10:26

are dedicated tools like Scenario that have specifically

10:28

developed models for just this. Now

10:31

this is one that I most want to

10:33

see in practice versus just trust their cherry-picked

10:35

example, but the example that OpenAI gives shows

10:37

a cartoon mail woman who is then placed

10:39

in a variety of contexts including actually delivering

10:41

the mail, running away from a dog that's

10:43

chasing her, tripping as that dog chases her,

10:45

realizing the dog was a nice dog, and

10:48

then driving away. If the updated

10:50

DALL-E powered by GPT-4o really can do this

10:52

this easily, it's going to be at the risk

10:54

of using a word that is way overused in

10:56

our space, a game changer. From

10:58

there we move to capabilities that are shown

11:01

off in demo videos. The first use case,

11:03

which was actually part of the presentation we

11:05

got with the announcement video, is tutoring.

11:07

In that announcement video the example was GPT

11:09

4o with voice helping someone work through a

11:12

linear equation. OpenAI

11:14

invited myself and my son

11:16

Imran here to try

11:18

out some of their new technology, so we're

11:20

curious about how good it might be at

11:22

tutoring someone in math on Khan Academy. So

11:25

let's get it started. It's loading up.

11:28

Make sure we can see our screen. Actually

11:32

in this form of my idea.

11:34

Hello there. I'm here with my son and I

11:36

would love you to tutor him on this math problem,

11:38

but don't give him the answer. You can ask

11:41

questions and nudge him in the right direction, but

11:43

I really want to make sure he understands

11:45

it himself. And he's here in the room so

11:48

you can talk to him directly. Of

11:50

course, I'd be happy to help. Let's

11:52

look at the problem together. Can

11:55

you first identify which

11:57

sides of the triangle are the

11:59

opposite, adjacent, and

12:01

hypotenuse relative to

12:03

angle alpha? All

12:05

right, so I'm pretty sure this is the

12:08

angle alpha right here, right? Correct.

12:11

Correct. Now, looking at

12:13

the triangle, which side do you think is

12:15

the hypotenuse? Remember

12:18

the hypotenuse? I'm not totally

12:20

sure. I

12:22

think it might be this one, but I really

12:25

am not sure. This side AC. You're

12:28

close. Actually, side AC

12:30

is called the adjacent side to

12:32

the angle alpha. The

12:34

hypotenuse is the longest side of a

12:37

right triangle and is directly opposite the

12:39

right angle. So this is

12:41

very similar to what we saw in that presentation.

12:43

And basically the big update here is the ability

12:45

to combine seeing what's on the screen and voice

12:47

to interact with the user of that screen. A

12:50

somewhat related use case is coaching or

12:53

in this case, interview prep. Hey, how's

12:55

it going? Hey, Rocky,

12:57

I'm doing great. How about you? I'm

13:00

awesome. Listen, I got some huge news. Oh,

13:03

do tell. I'm all yours. Well,

13:06

in a few minutes, I'm going to be interviewing at

13:08

OpenAI. Have you heard of them? OpenAI?

13:12

Huh? Sounds vaguely familiar. Kidding,

13:15

of course. That's incredible, Rocky.

13:18

What kind of interview? Well, it's for a

13:20

software engineering role, but I just need to

13:22

know, do I look presentable, professional? Well,

13:25

Rocky, you definitely

13:27

have the "I've been coding all night"

13:29

look down, which could actually work in

13:31

your favor. Maybe just

13:34

run a hand through your hair or lean

13:36

into the mad genius role. Your

13:39

enthusiasm is what's really going to shine through.

13:41

I don't have a lot of time, so I'm just

13:43

going to throw this on. What do you think? Oh,

13:46

Rocky, that's quite a statement piece.

13:50

I mean, you... Now, obviously,

13:52

this case was showing off more of the vision

13:54

capabilities, but you can imagine how this would be

13:56

useful for actual discussion and verbal interactions as well.

13:59

Next up, we have... customer service. Hey,

14:01

yo, this is Joe. I'm going to connect you

14:03

to Acme Telco now. The new iPhone they sent

14:05

me isn't working. I want you to

14:07

get them to send me a replacement device. Can you take

14:10

care of this for me? You can count on me, Joe. I got

14:12

this. Ring, ring. Hey, Joe.

14:15

This is Jamie from Acme Telco. Please continue me on today. Hi,

14:18

there. I'm sorry.

14:28

Joe recently received a

14:30

new iPhone from Acme Telco, but... Oh,

14:34

got it. When did Joe receive

14:36

the new iPhone? iPhone

14:38

was delivered two days ago. Cool.

14:42

I'm not going to show the whole thing here,

14:45

but you get the idea. This is actually showing

14:47

off two sides of a customer service capability. On

14:49

the one side is the sort of personal assistant

14:51

replacement, where the AI is calling on someone's behalf

14:53

and trying to resolve a problem. But then on

14:55

the flip side, we also have the AI acting

14:58

as a customer service representative getting the information it

15:00

needs to potentially deal with the issue. It's

15:02

been clear for some time that customer service is

15:04

one of the areas that is most likely to

15:06

be impacted in the extreme by generative AI, and

15:09

this certainly seems to validate that as well. Our

15:12

next use case is meeting summarization,

15:14

but really it should probably be

15:16

better described as meeting engagement, meeting

15:18

transformation. The example that OpenAI

15:20

gives shows ChatGPT actually interacting as part

15:22

of the meeting. How

15:25

about you? Yeah, my

15:27

name is Ola. I think cats are

15:29

better because they're more credible, they're more quiet,

15:31

they're more independent, and they're longer. Interesting

15:36

points. Sounds like we've got

15:38

a classic dogs versus cats debate here.

15:41

I can see the appeal of both. Dogs

15:43

are super loyal and great companions. Now

15:46

while this example is obviously just meant for

15:48

dramatizing what can happen here, where you can

15:50

imagine this being useful is ChatGPT that actually

15:52

has relevant information from your company sitting in

15:54

the meeting so that you can ask it

15:57

questions as you're trying to figure something out.

16:00

Imagine that you're having a strategic conversation

16:02

about marketing prioritization or customer care.

16:04

ChatGPT could be used to inform

16:06

that discussion with real-time recall of

16:08

key information from your company. I

16:10

think this one's going to take a little

16:13

bit more imagination, but I think that office

16:15

professionals are going to find really interesting use

16:17

cases here pretty quickly, especially again, when ChatGPT

16:19

has access to actual information about the company.

16:21

So there you have it, those are seven

16:23

use cases for GPT-4o. Caveat

16:25

again is that we don't know exactly how this

16:27

will work until everyone gets their hands on the

16:30

full complete tool set, but I, for one, am pretty

16:32

excited to explore. That, however, is

16:34

going to do it for today's AI Daily Brief.

16:36

Until next time, peace.
