It probably won’t surprise you that I’m not just working on robotics, I’m working on a few different kinds of automation.  I was always a bit bearish on AI, but sometime around GPT3 I started to realize that these things solve a lot of problems that have traditionally required a human.

A few years ago, I spent a lot of staff time and money putting together ‘The Bug Patrol’, this little software testing service that I was hopeful could turn into a new line of revenue for my company.  Unfortunately, when we actually put it in front of customers, it was just too expensive.  We needed human beings to run tests, because they had to actually recognize what was on screen and take appropriate action.

But GPT4 is doing a bang up job.  Here’s a screenshot of my todo list app:

 

IMG_6218E226AB2C 1.

 

When I ask ChatGPT 4o the following query:

List the elements you see on this mobile app screenshot. Then list all the elements that you believe should be tappable.

Here’s the result I get back:

Elements on the Mobile App Screenshot:

 

Daily Todos Section:

 

Today

History

 

Projects Section:

 

Personal

Comedy Writing

BCG

Empathable

Christina’s Campaign

Construction Co

Tapestry

Writing

Wish List

Campaign Software

Farm

Garden

Speak No Evil

 

Elements that Should be Tappable:

 

Daily Todos Section:

 

Today

History

 

Projects Section:

 

Personal

Comedy Writing

BCG

Empathable

Christina’s Campaign

Construction Co

Tapestry

Writing

Wish List

Campaign Software

Farm

Garden

Speak No Evil

Other Interface Elements:

 

The plus (+) button at the top right corner of the screen (likely to add a new item).

 

 

That’s pretty darn good, even though it did miss the ‘sync’ button at the bottom of the screen.

 

Let’s try again with another screenshot.  This is an app called “PictureThis” that I use to recognize plants in my garden.

IMG_F80D3AAABF4B 1.

 

Again, I ask ChatGPT 4o the following query:

List the elements you see on this mobile app screenshot. Then list all the elements that you believe should be tappable.

 

And it returns:

Elements on the Mobile App Screenshot:

 

Location:

 

Canada (with a dropdown arrow)

Search Bar:

 

Search plants

Navigation Icons:

 

Identify (camera icon)

Diagnose (first aid icon)

Reminders (clock icon)

My Garden (plant icon)

Section: Seasonal Tips

 

May (tag)

Tips:

Now’s the best time for weed prevention: Tips inside.

Grow avocado now, hydroponically or in soil!

Save money and give your plants a treat by collecting rainwater.

View All (link)

Section: Trending in Canada

 

Superhot (label on a blue flower image)

Hot (label on a sunflower image)

Hot (label on a red leaves image)

Bottom Navigation Bar:

 

Home (home icon)

Diagnose (first aid icon)

My Plants (plant icon)

More (menu icon)

Elements that Should be Tappable:

Location:

 

Canada (with a dropdown arrow)

Search Bar:

 

Search plants

Navigation Icons:

 

Identify

Diagnose

Reminders

My Garden

Section: Seasonal Tips

 

Now’s the best time for weed prevention: Tips inside.

Grow avocado now, hydroponically or in soil!

Save money and give your plants a treat by collecting rainwater.

View All

Section: Trending in Canada

 

Superhot (image)

Hot (sunflower image)

Hot (red leaves image)

Bottom Navigation Bar:

 

Home

Diagnose

My Plants

More

 

 

 

It missed the camera icon at the bottom, but given an incredibly unoptimized query, that’s a great set of results.

The long and the short of this is that we might be able to use AI image recognition to help map out apps and create better automated testing.  I’m working on implementing some Bluetooth code to mimic keyboard and mouse events, just so I can see how far it can go.

 

Related Post

Leave a Reply

Your email address will not be published. Required fields are marked *