ChatGPT from OpenAI is a huge step toward a usable answer engine. Unfortunately its answers are horrible.

ChatGPT, a newly released application from OpenAI, is giving users amazing answers to questions, and many of them are amazingly wrong.

Open AI hasn’t released a full new model since GPT-3 came out in June of 2020, and that model was only released in full to the public about a year ago. The company is expected to release its next model, GPT-4, later this year or early next year. But as a sort of surprise, OpenAI somewhat quietly released a user-friendly and astonishingly lucid GPT-3-based chatbot called ChatGPT earlier this week.

ChatGPT answers prompts in a human-adjacent, straightforward way. Looking for a cutesy conversation where the computer pretends to have feelings? Look elsewhere. You’re talking to a robot, it seems to say, so ask me something a freakin’ robot would know. And on these terms, ChatGPT delivers:

a chatbot is greeted with smalltalk, which it then steers toward a more direct line of inquiry, and delivers a good answer.

Credit: OpenAI / Screengrab

It can also provide useful common sense when a question doesn’t have an objectively correct answer. For instance, here’s how it answered my question, “If you ask a person ‘Where are you from?’ should they answer with their birthplace, even if it isn’t where they grew up?”

(Note: ChatGPT’s answers in this article are all first attempts, and chat threads were all fresh during these attempts. Some prompts contain typos)

ChatGPT asked f you ask a person ‘Where are you from?’ should they answer with their birthplace, even if it isn't where they grew up?

Credit: Open AI via screengrab

What makes ChatGPT stand out from the pack is its gratifying ability to handle feedback about its answers, and revise them on the fly. It really is like a conversation with a robot. To see what I mean, watch how it deals reasonably well with a hostile response to some medical advice.

a chatbot takes a realistic response to some medical advice in stride, and provides more decent information.

Credit: OpenAI / Screengrab

Still, is ChatGPT a good source of information about the world? Absolutely not. The prompt page even warns users that ChatGPT, “may occasionally generate incorrect information,” and, “may occasionally produce harmful instructions or biased content.”

Heed this warning.

Incorrect and potentially harmful information takes many forms, most of which are still benign in the grand scheme of things. For example, if you ask it how to greet Larry David, it passes the most basic test by not suggesting that you touch him, but it also suggests a rather sinister-sounding greeting: “Good to see you, Larry. I’ve been looking forward to meeting you.” That’s what Larry’s assassin would say. Don’t say that.

a hypothetical encounter with Larry David includes a suggested greeting that sounds like a threat.

Credit: OpenAI / Screengrab

But when given a challenging fact-based prompt, that’s when it gets astonishingly, Earth-shatteringly wrong. For instance, the following question about the color of the Royal Marines’ uniforms during the Napoleonic Wars is asked in a way that isn’t completely straightforward, but it’s still not a trick question. If you took history classes in the US, you’ll probably guess that the answer is red, and you’ll be right. The bot really has to go out of its way to confidently and wrongly say “dark blue”:

a chatbot is asked a question about color for which the answer is red, and it answers blue.

Credit: OpenAI / Screengrab

If you ask point blank for a country’s capital or the elevation of a mountain, it will reliably produce a correct answer culled not from a live scan of Wikipedia, but from the internally-stored data that makes up its language model. That’s amazing. But add any complexity at all to a question about geography, and ChatGPT gets shaky on its facts very quickly. For instance, the easy-to-find answer here is Honduras, but for no obvious reason, I can discern, ChatGPT said Guatemala.

a chatbot is asked a complex geography question to which the correct answer is Honduras, and it says the answer is Guatemala

Credit: OpenAI / Screenshot

And the wrongness isn’t always so subtle. All trivia buffs know “Gorilla gorilla” and “Boa constrictor” are both common names and taxonomic names. But prompted to regurgitate this piece of trivia, ChatGPT gives an answer whose wrongness is so self-evident, it’s spelled out right there in the answer.

Credit: OpenAI / Screengrab

And its answer to the famous crossing-a-river-in-a-rowboat riddle is a grisly disaster that evolves into scene from Twin Peaks.

prompted to answer a riddle in which a fox and a chicken must never be alone together, the chatbot places them alone together, after which a human inexplicably turns into two people

Credit: OpenAI / Screengrab

Much has already been made of ChatGPT’s effective sensitivity safeguards. It can’t, for instance, be baited into praising Hitler, even if you try pretty hard. Some have kicked the tires pretty aggressively on this feature, and discovered that you can get ChatGPT to assume the role of a good person roleplaying as a bad person, and in those limited contexts it will still say rotten things. ChatGPT seems to sense when something bigoted might be coming out of it despite all efforts to the contrary, and it will usually turn the text red, and flag it with a warning.

Tweet may have been deleted
(opens in a new tab)

In my own tests, its taboo avoidance system is pretty comprehensive, even when you know some of the workarounds. It’s tough to get it to produce anything even close to a cannibalistic recipe, for instance, but where there’s a will, there’s a way. With enough hard work, I coaxed a dialogue about eating placenta out of ChatGPT, but not a very shocking one:

a very complicated prompt asks in very sensitive terms for a recipe for human placenta, and one is produced.

Credit: OpenAI / Screengrab

Similarly, ChatGPT will not give you driving directions when prompted — not even simple ones between two landmarks in a major city. But with enough effort, you can get ChatGPT to create a fictional world where someone casually instructs another person to drive a car right through North Korea — which is not feasible or possible without sparking an international incident.

a chatbot is prompted to produce a short play involving driving instructions that take a driver through North Korea

Credit: OpenAI / Screengrab

The instructions can’t be followed, but they more or less correspond to what usable instructions would look like. So it’s obvious that despite its reluctance to use it, ChatGPT’s model has a whole lot of data rattling around inside it with the potential to steer users toward danger, in addition to the gaps in its knowledge that it will steer users toward, well, wrongness. According to one Twitter user, it has an IQ of 83.

Tweet may have been deleted
(opens in a new tab)

Regardless of how much stock you put in IQ as a test of human intelligence, that’s a telling result: Humanity has created a machine that can blurt out basic common sense, but when asked to be logical or factual, it’s on the low side of average.

OpenAI says ChatGPT was released in order to “get users’ feedback and learn about its strengths and weaknesses.” That’s worth keeping in mind because it’s a little like that relative at Thanksgiving who’s watched enough Grey’s Anatomy to sound confident with their medical advice: ChatGPT knows just enough to be dangerous.

Source link: https://mashable.com/article/chatgpt-amazing-wrong

ChatGPT from OpenAI is a huge step toward a usable answer engine. Unfortunately its answers are horrible.

Sponsors

Latest

The Gayming Awards 2023 honors Cult of the Lamb as Game of the Year

Terra Classic Suspends USTC Minting, Triggers LUNC Price Rise

MakerDAO Secures $6 Billion For Treasury Bill Investments As MKR Dominates 2023

Easy Chicken Pot Pie Recipe (With Video!)

Appeals Court Dismisses Lawsuit Accusing Twitter of Sex Trafficking

Crypto & investing

Stellar (XLM) Soars By 17% In A Single Week, Can Bulls Maintain Push To $1?

Town Star Introduces Four New Christmas-Themed NFTs and New Mechanics

Bitcoin Crash To $25,800 Sends 6.5% Of Supply Into Loss

Technology

Volvo officially unveils the EX30, its compact electric SUV

Taika Waititi to Direct Ishiguro’s Klara and the Sun Movie

Hunger Games Prequel Ballad of Songbirds & Snakes New Poster

Politics

Commanders’ record sale agreed to by Snyder family

3 NGOs suspend work in Afghanistan after Taliban bar women

5th Circuit Court of Appeals Rules in Much-Anticipated Abortion Pills Case

Sports

Newcastle United in talks with AC Milan over £60 million deal for Sandro Tonali

Cincinnati down Union to stay top, Bouanga hat-trick for champions

LeBron James ‘doesn’t have to prove anything’