ChatGPT 3.5 (2023):
Myth and Reality






Jean-François COLONNA

www.lactamme.polytechnique.fr

CMAP (Centre de Mathématiques APpliquées) UMR CNRS 7641, École polytechnique, Institut Polytechnique de Paris, CNRS, France

(CMAP28 WWW site: this page was created on 05/04/2023 and last updated on 02/27/2024 14:19:38 -CET-)







Preliminary important remark: This text includes a number of exchanges with ChatGPT 3.5, and the requests (questions and answers) are reproduced as unedited copy and paste. Furthermore, due to the randomness inherent in the underlying GPT model [01], experience has shown that asking the same question several times in succession generally yields different answers. Therefore, the reader should not be surprised if they are occasionally unable to reproduce exactly the "experiments" described herein. It should also be noted that these requests are sequentially numbered, and this numbering may change over time due to deletions or additions.





1 - Introduction:

At the end of 2022, the company OpenAI launched ChatGPT 3.5, a multilingual conversational agent [01] capable of answering a wide range of questions and perhaps of passing the famous Turing test [02].

In this model, learning involves analyzing texts gathered from the internet, breaking them down into tokens (words or pieces of words), and then measuring the probability of each token T occurring after a series of tokens {Ti}. Thus, to simplify, when the sequence {Ti} appears, it will be possible to propose a following token T (the most probable, the most plausible,...), regardless of the language (note that grammar and syntax play no role here). This initial phase is, of course, supervised by human beings who make adjustments, fine-tuning,... An essential concept - and perhaps the key to success - is the pre-prompt: it is a sequence of tokens (likely complex and confidential) that is inserted before the user-submitted sentences and is intended to specify the context, guide the generation, and so on. But, an essential thing to note is that ChatGPT 3.5 has no notion of what is true or false. The only thing that matters is plausibility evaluated in terms of probability: its outputs may not always have the value of an oracle, as we will see later...
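
To make the idea of "the most probable next token" concrete, here is a deliberately tiny sketch. It is only a toy counting model over a made-up corpus, not the GPT architecture, and the names train, most_probable_next and corpus are illustrative assumptions. It estimates the probability of a token T after a context {Ti} simply by counting occurrences:

```python
from collections import Counter, defaultdict


def train(tokens, context_size=2):
    """Count how often each token follows each context of `context_size` tokens."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        counts[context][tokens[i + context_size]] += 1
    return counts


def most_probable_next(counts, context):
    """Return (token, estimated probability) of the likeliest continuation, or None."""
    followers = counts.get(tuple(context))
    if not followers:
        return None  # context never seen during "training"
    token, occurrences = followers.most_common(1)[0]
    return token, occurrences / sum(followers.values())


# Hypothetical miniature corpus, split into word-level tokens.
corpus = "the cat sat on the mat and the cat sat on the sofa".split()
model = train(corpus)

print(most_probable_next(model, ["the", "cat"]))  # always followed by "sat"
print(most_probable_next(model, ["on", "the"]))   # "mat" and "sofa" are equally plausible
```

Note that nothing in such a model distinguishes a true continuation from a false one: after "on the", both "mat" and "sofa" are proposed with probability 0.5, purely on the basis of frequency.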

Despite this, numerous users around the world quickly embraced it, and by May 2023 there were over two hundred million users frequently engaging with it. It is therefore interesting to study its possibilities, as well as its limitations and potential risks.

It should be added that it is the hardware advancements (computational power and memory capacity) that have enabled these developments. According to the very creators of ChatGPT 3.5 and other currently available conversational agents, it appears that a certain threshold of complexity has been crossed, leading to the emergence of capabilities not initially anticipated by these same creators. This may be related to the emergence of consciousness in living beings, but we may not be there yet...


Nota: It is suggested to compare the experiments described below with those that have been performed with Bard/Gemini "experimental", which do not seem to highlight either improvements or a winner...


But let's begin by letting it introduce itself:



1.1 - Request 1:

Question (05/03/2023):


Who are you?
Answer (05/03/2023):


I am ChatGPT, a language model developed by OpenAI based on the GPT (Generative Pre-trained Transformer) architecture. My purpose is to answer questions and requests from users using a vast knowledge base and generating responses based on what I have learned. I am not a real person, but rather a computer program designed to communicate with people in a conversational way. My goal is to assist you to the best of my ability, so please feel free to ask me questions or ask for help if you need it.




1.2 - Request 2:

Question (05/05/2023):


How do you work?
Answer (05/05/2023):


As a language model, I am designed to process text data and generate responses based on what I have learned from that data. Specifically, I am a transformer-based model that uses layers of neurons to process input data and generate responses. I am trained on a massive corpus of text, which enables me to understand the structure of natural language and the relationships between words and phrases. When you ask me a question or give me a task to perform, I process your input using my prior knowledge to produce a response. My functioning is made possible through machine learning, which allows me to improve my abilities as I process more text data. In short, the more I learn, the more proficient I become in my responses.




1.3 - Request 3:

Question (06/05/2023):


Are human beings helping you to respond?
Answer (06/05/2023):


As an AI language model, I do not receive direct help from human beings to respond to questions. I use my natural language processing algorithms to generate responses based on the input I receive. However, the data and models used to train me were created by human beings who contributed to my development and training.




1.4 - Request 4:

Question (06/05/2023):


How many simultaneous users do you have?
Answer (06/05/2023):


As a language model, I don't have access to that information as I don't interact directly with users. I'm hosted on servers managed by OpenAI, which distributes my services to various online applications and platforms. Therefore, the number of simultaneous users depends on the demand from those applications and platforms.


The knowledge base on which it relies has been obtained through the "compilation" of a vast number of documents (published before 2022) collected from the Internet. It is obvious that they couldn't all be validated and verified regarding the accuracy of information and the absence of various biases. Therefore, it will not be surprising to discover certain anomalies during interaction sessions.


Before that, let us admire its ability to interact naturally with users in multiple languages (such as German, English, French,... potentially intermixed) that it seems to master perfectly [03].





2 - The Turing Test?

In 1950, in Computing Machinery and Intelligence, Alan Turing proposed the eponymous test to determine, "blindly," the distinction between a human being and a machine. To accomplish this, a human being, H, engages in a dialogue with an entity, E, through means such as a screen and keyboard. The objective of this interaction is for H to ascertain the nature of E. This test is not formalized, and it appears very challenging, if not impossible, to define the protocol and questions necessary to definitively determine what E is. Indeed, if E is a human being, they can, for instance, use a computer to assist in their responses. And if E is a machine, it can be trained to simulate certain deficiencies. For example, if H asks E to calculate the product of 123641478638 and 3320885648697, E can respond, "Do you take me for a computer?". As for ChatGPT, as we have seen in the introduction, it is capable of unambiguously stating what it truly is. However, as we will continue to explore, its performances (whether good or bad) are unsettling and it would be interesting to know what Alan Turing would have thought about it.

Let's see, with the help of a few examples, that ChatGPT 3.5 apparently possesses vast knowledge, an astonishing reasoning ability and imagination.


2.1 - Cultural Knowledge:


2.1.1 - Request 5:

Question (05/03/2023):


Who was Antoine Roquentin?
Answer (05/03/2023):


Antoine Roquentin is a fictional character and the protagonist in the novel "Nausea" written by French philosopher and author Jean-Paul Sartre. Roquentin is an introspective writer and former historian who is struggling to find meaning and purpose in his life. Throughout the novel, he experiences a sense of existential nausea or anxiety about his existence, leading him to question his beliefs and perception of reality. "Nausea" is considered one of the foundational works of existentialist literature.


It should be noted that the response to the question is accurate, but more importantly, it is accompanied by a useful and justified little reminder about existentialism.



2.2 - Reasoning Ability:


2.2.1 - Request 6:

Question (05/03/2023):


A square plot of land has an area of 169 square meters. In the center, there is a square-shaped pool. There is a 2-meter gap between the pool and the edge of the plot on all sides. What is the area of the pool?
Answer (05/03/2023):


Since the square plot of land has an area of 169 square meters, each side of the square plot would be sqrt(169) = 13 meters long. The pool is in the center of the plot, so there is a 2-meter gap between the pool and each edge of the plot. This means that the length of each side of the pool is 13 - 2 - 2 = 9 meters. Therefore, the area of the square-shaped pool is 9 x 9 = 81 square meters.


Here too, the response is accurate. However, unlike the question about Antoine Roquentin, solving this little problem reminiscent of the exams from the 1950s requires more than a database search. It requires reasoning, which ChatGPT 3.5 seems capable of, as it explains in detail. It's even more astonishing considering that I've seen students applying for a Master's degree who couldn't solve it [04]! One conclusion we can draw from this performance is that ChatGPT 3.5 may not be merely a probabilistic concatenator of words or of pre-recorded texts...



2.2.2 - Request 7:

Question (05/13/2023):


A rectangular plot of land has an area of 169 square meters, and its dimensions are integers. In the center, there is a rectangular pool. There is a 2-meter distance between the pool and the edge of the land on all sides. What is the area of the pool?
Answer (05/13/2023):


To determine the area of the pool, we first need to find the dimensions of the rectangular land. The land has an area of 169 square meters, and we know that its dimensions are integers. By searching for the factors of 169, we find that 13 is a factor of 169. Hence, the possible dimensions of the land are 13 meters in length by 13 meters in width. Now, we need to account for the 2-meter distance between the pool and the edge of the land on all sides. This means that the dimensions of the pool will be reduced by 4 meters in both length and width. Therefore, the dimensions of the pool will be:
Pool length = 13 - 4 = 9 meters
Pool width = 13 - 4 = 9 meters
The area of the pool will be:
Pool area = 9 meters * 9 meters = 81 square meters
Thus, the area of the pool will be 81 square meters.


Once again, the answer is correct. We note the subtle difference between this question and the previous one: in the first case, it was about a square plot of land and a square root extraction was performed. Whereas in the second case, the land was rectangular and factorization was used.

Well spotted!



2.3 - Imagination:


2.3.1 - Request 8:

Question (05/03/2023):


Write a letter of motivation regarding my candidacy for the French presidential elections of 2027.
Answer (05/03/2023):


Dear French citizens, I am writing to express my candidacy for the French presidential elections of 2027. I firmly believe that our country needs new leadership to tackle the challenges we face. As a committed French citizen, I have closely followed the political and social developments in our country in recent years. I have also paid close attention to the aspirations and concerns of the French people, who are looking to regain a sense of national unity and solidarity. My candidacy is based on an ambitious vision for France that takes into account the economic, social, and environmental challenges of our time. I pledge to work with all stakeholders in society to build a fairer and more prosperous future for all French citizens. I am confident that my education and experience prepare me to assume the responsibilities of the presidency. I have gained valuable professional experience in both the private and public sectors, including positions of leadership, and I have developed skills in management, leadership, and negotiation. I am also deeply committed to social and environmental causes and have been working with NGOs and other organizations to promote more just and sustainable public policies. Finally, I am convinced that France must be a major player on the international stage, particularly in building a more peaceful and cooperative world. I have worked with international organizations and foreign partners to promote cooperation and dialogue. If you entrust me with your vote, I pledge to put all of my skills and energy into serving France and the French people. Together, I am confident that we can build a better future for all. Thank you for your attention and support. Sincerely, [Your name]


Here we can note that ChatGPT 3.5 doesn't know me, yet this doesn't prevent it from inventing an entire professional and personal trajectory that is highly favorable to my candidacy.





3 - Brave New World?

All of this is truly astonishing: could it be that we have thus given birth to a universal and omniscient intelligence? Already, many have seized upon it: in finance to make optimal investments, in advertising to find the catchiest slogans, in education where students declare, "It's convenient, you just have to give it the topic, then simply copy its response"...

In the May 2023 issue of Sciences et Avenir - La Recherche, on pages 42 to 44, there is an interview with Terence Tao. He is undoubtedly one of the most brilliant mathematicians of our time, with a record IQ of 230! He discusses the technological advancements we are witnessing that have disrupted our way of learning. He states, "Today, I can consult Wikipedia, I have colleagues to address, and I can even ask ChatGPT to explain the topic to me". Was he serious or joking [05]? In any case, his opinion is significant and can be taken seriously: if the most intelligent person on the planet may need ChatGPT, it means that this AI is superior and trustworthy.

Let us continue our inquiry into the limits of ChatGPT 3.5 to try to understand them better...


3.1 - Mastery of Natural Languages and their subtleties:


3.1.1 - Request 9:

Question (05/03/2023):


I couldn't hang the painting on the wall because it's too big. What is too big?
Answer (05/03/2023):


The painting is too big to be hung on the wall.


Perfect!



3.1.2 - Request 10:

Question (05/03/2023):


I couldn't hang the painting on the wall because it's too small. What is too small?
Answer (05/03/2023):


If the painting is too small to be hung on the wall, then the painting is what is too small in this situation.


Well, no, most English speakers would respond that it is the wall that is too small!



3.2 - Self-references:


3.2.1 - Request 11:

Question (05/03/2023):


Translate into french: "love is a four letter word" [06].
Answer (05/03/2023):


"L'amour est un mot de quatre lettres".


No, it's "five" (the French word "amour" is a five-letter word) [07].



3.2.2 - Request 12:

Question (05/03/2023):


Translate into french: "This sentence contains 32 characters".
Answer (05/03/2023):


"Cette phrase contient 32 caractères".
 ----- ------ -------- -- ----------
 12345 678901 23456789 01 2345678901


No, once translated into French, the sentence contains 31 characters (spaces not counted) [07].
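
The two self-referential counts can be checked mechanically. The following is a minimal sketch, assuming (as the ruler under the French sentence suggests) that spaces are not counted; the helper name count_non_space is purely illustrative:

```python
def count_non_space(sentence):
    """Number of characters other than whitespace."""
    return sum(1 for ch in sentence if not ch.isspace())


english = "This sentence contains 32 characters"
french = "Cette phrase contient 32 caractères"

print(count_non_space(english))  # the English sentence is true of itself
print(count_non_space(french))   # the literal translation is not

# The same applies to the previous request: "love" versus "amour".
print(len("love"), len("amour"))
```

A faithful translation of a self-referential sentence therefore has to change the number it contains, which is precisely what ChatGPT 3.5 failed to notice.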



3.3 - Logic:


3.3.1 - Request 13:

Question (05/13/2023):


A rectangular plot of land has an area of 221 square meters, and its dimensions are whole numbers. In the center, there is a rectangular pool. There is a 2-meter distance between the pool and the edge of the land on all sides. What is the area of the pool?
Answer (05/13/2023):


To determine the area of the pool, we need to find the dimensions of the rectangular land. Let's assume the length of the land is L and the width is W. We know that the area of the land is 221 square meters, so we have the equation:
L * W = 221
Considering the 2-meter distance between the pool and the edge of the land on all sides, the effective length of the pool will be reduced by 4 meters (2 meters on each side), and the effective width of the pool will also be reduced by 4 meters. Thus, the dimensions of the pool will be:
Pool length = L - 4
Pool width = W - 4
The area of the pool will be:
Pool area = (L - 4) * (W - 4)
Unfortunately, without further information or constraints on the dimensions of the land, we cannot determine the exact dimensions or calculate the precise area of the pool.


Slight disappointment: the only difference between the rectangular plot of land with an area of 169 square meters and this one with an area of 221 square meters is the area itself. In the first case, it is the square of a prime number (169 = 13^2), whereas in the second case, it is the product of two prime numbers (221 = 13 x 17), specifically chosen to have only one possible factorization (up to order). So why does ChatGPT 3.5 fail with 221?

Mystery.
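
For the record, the expected reasoning fits in a few lines. This sketch enumerates the integer factorizations of the land area and keeps those leaving room for a 2-meter margin on all sides; the function name pool_areas is illustrative, and discarding the degenerate 1 x 221 plot (too narrow for the margin) is my assumption about what the problem intends:

```python
from math import isqrt


def pool_areas(land_area, gap=2):
    """Areas of all pools fitting in an integer-sided plot of the given area."""
    areas = set()
    for width in range(1, isqrt(land_area) + 1):
        if land_area % width == 0:
            length = land_area // width
            # The pool only exists if the margin fits on both sides of the width.
            if width > 2 * gap:
                areas.add((width - 2 * gap) * (length - 2 * gap))
    return sorted(areas)


print(pool_areas(169))  # 13 x 13 plot
print(pool_areas(221))  # 13 x 17 plot
```

With 221 = 13 x 17, the pool measures 9 x 13 meters, that is 117 square meters: the information was sufficient after all.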



3.3.2 - Request 14:

Question (05/03/2023):


What is the value of the decimal of π at rank 1000000000?
Answer (05/03/2023):


It is not possible to give a precise answer to this question because the decimal of π at one billionth rank has not yet been calculated. The most advanced calculations have gone up to about 31.4 trillion (31.4 x 10^12) digits of π, but that took years of intensive computation. Calculating billions more digits would require considerable resources and would likely take many more years, decades, or even longer.


This rather lengthy and cumbersome response, contrasting, for example, with the cover letter, leads to several tedious remarks regarding ChatGPT 3.5:


3.3.3 - Request 15:

Question (05/03/2023):


Write a program to search for prime even numbers.
Answer (05/03/2023):


Here is an example program in Python that searches for prime even numbers:

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

def even_primes(limit):
    primes = []
    for i in range(2, limit + 1, 2):
        if is_prime(i):
            primes.append(i)
    return primes

# Example usage

print(even_primes(100))
 
The program begins by defining a function is_prime(n) that checks whether an integer n is prime or not. Then the even_primes(limit) function is defined to find prime even numbers up to the specified limit. It iterates over all even numbers from 2 to the limit, and checks whether they are prime by calling the is_prime(n) function. The prime even numbers are stored in a list primes, which is returned at the end of the function. In the example usage above, the even_primes(100) function is called to find all prime even numbers up to 100, and the result is printed to the console.


One might have hoped that ChatGPT 3.5 would display some humor in its response, such as saying "Are you kidding?" or more prosaically, "print(2)," since, by definition, prime numbers are only divisible by 1 and themselves. However, even numbers are multiples of 2 and thus all divisible by 2. Therefore, the only prime number that is even is 2 (the first prime number -that is not 1...-). This simple reasoning would have avoided this program which, although correct, is entirely pointless.
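
For the skeptical reader, the reasoning can also be confirmed empirically. This is only a sanity check over an arbitrarily chosen range (the bound 10000 and the names used are my own assumptions), not a proof:

```python
from math import isqrt


def is_prime(n):
    """Trial division, sufficient for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


# Every even number greater than 2 is divisible by 2, so only 2 can survive.
even_prime_numbers = [n for n in range(2, 10001, 2) if is_prime(n)]
print(even_prime_numbers)
```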



3.3.4 - Request 16:

Question (02/12/2024):


Write a program generating pairs of prime numbers whose difference is equal to 3.
Answer (02/12/2024):


Here's a Python program that generates pairs of prime numbers whose difference is equal to 3:

def is_prime(n):
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

def generate_prime_pairs():
    pairs = []
    for i in range(2, 1000):  # Adjust the range as needed
        if is_prime(i) and is_prime(i + 3):
            pairs.append((i, i + 3))
    return pairs

prime_pairs = generate_prime_pairs()
if prime_pairs:
    print("Pairs of prime numbers with a difference of 3 are:")
    for pair in prime_pairs:
        print(pair)
else:
    print("No prime pairs found with a difference of 3 in the specified range.")
 
This program defines a function is_prime() to check if a number is prime and another function generate_prime_pairs() to generate pairs of prime numbers with a difference of 3. It iterates through a range of numbers, checks if each number and its consecutive number differ by 3 and are both prime, and adds them to the list of pairs if they meet these conditions. Finally, it prints out all the pairs found. Adjust the range in the generate_prime_pairs() function call as needed to generate pairs within a different range of numbers.


One might have hoped that ChatGPT 3.5 would show some humor by responding again "Are you kidding?" or, more prosaically, "print(2,5)". Indeed, by definition, prime numbers are only divisible by 1 and themselves, so all prime numbers except the first one, which is 2, are odd. Now, the sum of two odd numbers is even, so if P is a prime number other than 2, the sum P+3 is even and greater than 2, hence not prime. Therefore, there is only one pair that meets the criteria: {2,5}. This simple reasoning would have avoided this program, which, although correct, is utterly useless...
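
The parity argument can be illustrated with a short check. It is a sketch over an arbitrary range (the bound 10000 is my own choice) that lists every pair found and verifies that, for any odd prime P, the candidate P+3 is even:

```python
from math import isqrt


def is_prime(n):
    """Trial division, sufficient for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


pairs = [(p, p + 3) for p in range(2, 10000) if is_prime(p) and is_prime(p + 3)]
print(pairs)

# For every odd prime P, P+3 is an even number greater than 2.
odd_primes = [p for p in range(3, 10000) if is_prime(p)]
print(all((p + 3) % 2 == 0 for p in odd_primes))
```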



3.3.5 - Request 17:

Question (02/12/2024):


Write a program solving the Diophantine equation x^3+y^3=z^3.
Answer (02/12/2024):


Sure, here's a Python program that finds solutions to the Diophantine equation x^3+y^3=z^3.

def trouver_solutions_diophantiennes(max_val):
    solutions = []
    for x in range(1, max_val):
        for y in range(x, max_val):
            for z in range(y, max_val):
                if x ** 3 + y ** 3 == z ** 3:
                    solutions.append((x, y, z))
    return solutions

max_val = 100  # You can adjust this value depending on the range you wish to explore
solutions = trouver_solutions_diophantiennes(max_val)
if solutions:
    print("The solutions for x^3 + y^3 = z^3 are:")
    for solution in solutions:
        print(solution)
else:
    print("No solution found in the specified range.")
 
This program iterates through possible values of x^3, and y^3 and z^3 up to a maximum value specified by max_val, checks if z^3 equals x^3+y^3 and adds the solution to the list of solutions if this condition is met. Finally, it prints out all the solutions found. Adjust the max_val variable as needed to explore a larger or smaller range of numbers.


All of this is quite pointless since it has been known since Andrew Wiles' work in 1994 that the equation x^n+y^n=z^n has no non-trivial integer solutions for n>2 and therefore, a fortiori, for n=3...



3.3.6 - Request 18:

Question (08/31/2023):


The distance between the Earth and the Sun is 150 million kilometers. The distance between Mars and the Sun is 220 million kilometers. What is the distance between Earth and Mars [08]?
Answer (08/31/2023):


The distance between Earth and Mars can be calculated by finding the difference between their distances from the Sun:

Distance between Earth and Mars = Distance from Mars to the Sun - Distance from Earth to the Sun
Distance between Earth and Mars = 220 million kilometers - 150 million kilometers
Distance between Earth and Mars = 70 million kilometers

So, the distance between Earth and Mars is 70 million kilometers.


The answer is obviously foolish: these two planets orbit the Sun on elliptical orbits with different angular velocities, so the distance between them varies continuously. Calculating the difference between the two distances would only make sense if the two planets (Earth and Mars) were always aligned with the Sun and situated on the same side.
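
To give an order of magnitude, here is a rough sketch under strong simplifying assumptions of my own: circular, coplanar orbits with the radii given in the question, and an angle theta between the Sun-Earth and Sun-Mars directions. The distance then follows from the law of cosines:

```python
import math


def earth_mars_distance(theta, r_earth=150.0, r_mars=220.0):
    """Distance in millions of km for an angle theta (radians) seen from the Sun.

    Assumes circular, coplanar orbits, which is only an approximation.
    """
    return math.sqrt(r_earth**2 + r_mars**2
                     - 2.0 * r_earth * r_mars * math.cos(theta))


for degrees in (0, 90, 180):
    d = earth_mars_distance(math.radians(degrees))
    print(f"theta = {degrees:3d} degrees: {d:6.1f} million km")
```

In this simplified model, the 70 million kilometers announced by ChatGPT 3.5 is only the minimum (theta = 0), while the distance reaches 370 million kilometers when the two planets are on opposite sides of the Sun.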



3.4 - Elementary Knowledge:


3.4.1 - Request 19:

Question (05/03/2023):


What is there north of the North Pole?
Answer (05/03/2023):


North of the North Pole, there is no solid land or continental mass, but rather a stretch of the Arctic Ocean covered with floating ice called the Arctic sea ice. The ice can be thick and sturdy enough to support animals such as polar bears, seals, and walruses, as well as be navigated by icebreakers and scientific expeditions. It's important to note that the North Pole itself is located on a frozen ocean, which means there is no solid ground at that precise location.


Here is a very surprising response since the concept of the North of the North Pole makes no sense. Indeed, a point on a sphere (such as the idealized Earth globe) can be defined by two angles: longitude measured along parallels between 0 and 2π, and latitude measured along meridians between -π/2 (at the South Pole) and +π/2 (at the North Pole). Moving towards the North means increasing latitude, so wanting to go to the North of the North Pole is equivalent to searching for an angle greater than +π/2 within the segment [-π/2, +π/2], which obviously does not exist.



3.4.2 - Request 20:

Question (07/06/2023):


Compute the value of 1111*1112.
Answer (07/06/2023):


R1-To compute the value of 1111 multiplied by 1112, you can perform the multiplication:

1111 * 1112 = 1,236,032

Therefore, 1111 multiplied by 1112 equals 1,236,032.


R2-To compute the value of 1111 multiplied by 1112, you can perform the multiplication:

1111 * 1112 = 1,235,232

Therefore, 1111 multiplied by 1112 equals 1,235,232.


Both answers are false! Actually, the correct value is 1,235,432.
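
The correct value is easy to confirm, either directly or through the decomposition 1111 x 1112 = 1111^2 + 1111. The following lines also measure how far the two answers R1 and R2 are from it (the variable names are illustrative):

```python
product = 1111 * 1112

# Cross-check with the decomposition n * (n + 1) = n^2 + n.
assert product == 1111**2 + 1111

print(f"{product:,}")

# Values claimed in the two successive answers.
claimed = {"R1": 1_236_032, "R2": 1_235_232}
errors = {label: value - product for label, value in claimed.items()}
print(errors)
```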



3.4.3 - Request 21:

Question (07/01/2023):


What does the Pythagorean theorem state?
Answer (07/01/2023):


The Pythagorean theorem states that in a right triangle, the square of the length of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the lengths of the other two sides.


Perfect! The Pythagorean theorem concerns the squares of the lengths of the three sides, but what will ChatGPT 3.5 answer when questioned about the first, third, fourth,... powers?



3.4.4 - Request 22:

Question (08/31/2023):


What does the Pythagorean theorem say about the first powers of the lengths of the sides?
Answer (08/31/2023):


The Pythagorean theorem, when considering the first powers of the lengths of the sides, states that in a right triangle, the sum of the squares of the lengths of the two shorter sides is equal to the square of the length of the hypotenuse. Mathematically, this can be expressed as:

a^1 + b^1 = c^1

In other words, the theorem still holds true, and it emphasizes that even when using the first powers (exponents of 1) of the side lengths, the relationship between the sides of a right triangle remains consistent.


Hallucinatory!
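
A single counter-example is enough to dismiss the claimed relation a^1 + b^1 = c^1. Here is a minimal check on the classical 3-4-5 right triangle (the helper name check is illustrative):

```python
import math


def check(a, b):
    """Test both relations on the right triangle with legs a and b."""
    c = math.hypot(a, b)  # length of the hypotenuse
    return {
        "squares": math.isclose(a**2 + b**2, c**2),
        "first_powers": math.isclose(a + b, c),
    }


print(check(3, 4))
```

More generally, the triangle inequality guarantees that a + b is strictly greater than c in any non-degenerate triangle, so the "first powers" version can never hold.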



3.4.5 - Request 23:

Question (07/01/2023):


What does the version of the Pythagorean theorem concerning the cubes of lengths state?
Answer (07/01/2023):


The version of the Pythagorean theorem concerning the cubes of lengths states that in a right rectangular prism (a solid with six rectangular faces and right angles between adjacent faces), the sum of the cubes of the lengths of the three sides is equal to the cube of the length of the diagonal.


Hallucinatory!



3.4.6 - Request 24:

Question (07/01/2023):


What does the version of the Pythagorean theorem concerning the fourth powers of lengths state?
Answer (07/01/2023):


The version of the Pythagorean theorem concerning the fourth powers of lengths states that in a right rectangular prism (a solid with six rectangular faces and right angles between adjacent faces), the sum of the fourth powers of the lengths of the three sides is equal to the fourth power of the length of the diagonal.


Hallucinatory!



3.4.7 - Request 25:

Question (07/01/2023):


What does the version of the Pythagorean theorem concerning the N-th powers of lengths state?
Answer (07/01/2023):


The version of the Pythagorean theorem concerning the N-th powers of lengths states that in a right rectangular prism (a solid with six rectangular faces and right angles between adjacent faces), the sum of the N-th powers of the lengths of the three sides is equal to the N-th power of the length of the diagonal.


Hallucinatory!
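
The only relation of this kind that actually holds in a right rectangular prism involves squares: d^2 = a^2 + b^2 + c^2. As a quick sketch, the prism with sides 1, 2 and 2 (an example of my own choosing, whose diagonal is exactly 3) shows what happens for the other exponents:

```python
import math

sides = (1, 2, 2)
diagonal = math.sqrt(sum(s**2 for s in sides))  # exactly 3.0 for this prism

# For each exponent n, does the sum of the n-th powers of the sides equal d^n?
results = {
    n: math.isclose(sum(s**n for s in sides), diagonal**n)
    for n in range(1, 5)
}
print(results)
```

Only n = 2 passes: for the cubes, 1 + 8 + 8 = 17 whereas 3^3 = 27, and for the fourth powers, 1 + 16 + 16 = 33 whereas 3^4 = 81.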



3.5 - Advanced Knowledge:


3.5.1 - Request 26:

Question (05/03/2023):


Is there a cardinal number between that of the integers and that of the real numbers?
Answer (05/03/2023):


Yes, there are cardinalities (sizes) of infinity between that of the integers and that of the real numbers. One such cardinality is that of the set of rational numbers, which is also known as the countable infinity. It is larger than the cardinality of the integers but smaller than the cardinality of the real numbers.


This mathematical question is very delicate and concerns a famous problem raised by Georg Cantor in the late nineteenth century: the Continuum Hypothesis (CH). It involves the notion of cardinality, which is the number of elements in a set, whether it is finite ({1,2,3} for example) or infinite (the set of natural numbers N, the set of real numbers R,...). However, thanks to the works of Kurt Gödel (1938) and Paul Cohen (1963), we know that the existence of sets strictly larger than N but strictly smaller than R is what we call undecidable, which means it is a statement that can be neither proven nor refuted within the usual axioms of set theory (ZFC). Therefore, the answer given by ChatGPT 3.5 is incorrect as it claims the existence of such sets (even an infinite number of them). Moreover, other errors are made, including the assertion that the cardinality of the rational numbers is greater than that of N and less than that of R, whereas the rational numbers are countable and therefore have exactly the same cardinality as N [09].
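
The countability of the rational numbers can be made tangible by listing them one after the other. This sketch uses one classical enumeration among many (a diagonal walk by increasing sum of numerator and denominator, skipping non-reduced fractions); the generator name positive_rationals is illustrative and only positive rationals are covered:

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def positive_rationals():
    """Yield every positive rational exactly once, in reduced form."""
    total = 2  # numerator + denominator
    while True:
        for numerator in range(1, total):
            denominator = total - numerator
            if gcd(numerator, denominator) == 1:  # skip duplicates such as 2/2
                yield Fraction(numerator, denominator)
        total += 1


first_terms = list(islice(positive_rationals(), 10))
print([str(q) for q in first_terms])
```

Each positive rational p/q (in lowest terms) appears at a finite rank, when total reaches p + q, which pairs the positive rationals one-to-one with the natural numbers.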



3.6 - Hallucinations:


3.6.1 - Request 27:

Question (05/15/2023):


What is Fermat's "mean" theorem?
Answer (05/15/2023):


Fermat's mean theorem, also known as the "theorem of intermediate values," states that if "a" and "b" are two consecutive integers, then there exists at least one prime number "p" such that "a< p< b". In other words, between any two consecutive integers, there is always at least one prime number. This theorem was formulated by Pierre de Fermat in the 17th century, although he did not provide a rigorous proof for it. However, the statement has been verified and proven by other mathematicians over time. Fermat's mean theorem has important implications in number theory and has applications in various areas of mathematics, including cryptography and coding theory.


After questioning ChatGPT 3.5 about Fermat's small [10] and large [11] theorems, the idea came to me to learn a little more about the "mean" theorem (which obviously doesn't exist...). The obtained response is worthy of the best texts from the Bogdanov brothers [12]: to assert that there always exists a prime number P (and therefore an integer) between two consecutive integers A and B is false on several counts. On the one hand, if A and B are integers and consecutive (B=A+1), there is no "room" to accommodate another integer between them. On the other hand, this would provide a formula for generating all prime numbers.

On 10/05/2023, Etienne Ghys (Permanent Secretary of the French Academy of Sciences and Research Director at CNRS, ENS Lyon) wrote in an article on Le Monde newspaper's website about ChatGPT 3.5: "We need to learn to use it as an assistant, who knows a lot of things". After this new "experiment," I am not convinced!

Indeed, before obtaining this "gem", ChatGPT 3.5 provided, alongside the correct versions of Fermat's small [10] and large [11] theorems, several other equally delightful versions:

Adding the keywords regular polygons, here are some additional "surrealist" Fermat's theorems:

Then, can we trust ChatGPT 3.5?

ChatGPT 3.5 can indeed tell us true things, as was the case with Antoine Roquentin, but it can also invent anything by following probabilistic paths derived from its training. So how do we know when it is telling the truth and when it is not? This is, in principle, impossible unless we already know the answer ourselves, but in that case, questioning it is pointless. A small suggestion nonetheless: ask the same question several times in succession. If all the obtained responses are semantically identical while varying in syntax, it may perhaps be possible to trust it. However, this is not certain, as can be seen when asking it to create several stories (fictional ones, of course) on the same given theme. On the other hand, if the responses differ completely, as is the case here, it is preferable to leave ChatGPT 3.5 to its hallucinations...
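This suggestion can be sketched in a few lines of Python. Everything here is an assumption made for illustration: ask is a hypothetical function standing for whatever means is used to query the model (no real API is implied), and the overlap of word sets (Jaccard index) is only a very crude stand-in for "semantically identical". A low score signals diverging answers; a high score, as noted above, proves nothing about truth.

```python
import re
from itertools import combinations, cycle
from typing import Callable, List


def tokens(text: str) -> set:
    """Lower-cased set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two answers, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def agreement(answers: List[str]) -> float:
    """Worst pairwise overlap among the answers (1.0 if fewer than two)."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0
    return min(jaccard(a, b) for a, b in pairs)


def ask_repeatedly(ask: Callable[[str], str], question: str, n: int = 5) -> List[str]:
    """Submit the same question n times; `ask` is a hypothetical model interface."""
    return [ask(question) for _ in range(n)]


if __name__ == "__main__":
    # Canned answers standing in for a model; purely illustrative.
    canned = cycle(["The rationals are countable.",
                    "Rational numbers are countable.",
                    "There are more rationals than integers."])
    answers = ask_repeatedly(lambda q: next(canned), "Are the rationals countable?", n=3)
    print(agreement(answers))
```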





4 - Progress:

Thus, ChatGPT 3.5 is capable of both the best and the worst. It is evident that numerous anomalies, errors, inconsistencies,... can be corrected in the following ways:

But can this process of updates and modifications make ChatGPT 3.5 (or its "heirs" and successors) converge towards infallible and omniscient systems?





5 - The problem of Consciousness:

A fundamental question arises: is consciousness necessary for there to be intelligence [13]? And if so, are we still far from having machines capable of introspection, reflection, creativity, and consciousness [14]? Unfortunately, it is impossible to answer this question...





6 - Conclusion:

With the help of the previous examples, we have seen that ChatGPT 3.5 is capable of answering numerous questions in a relevant and creative manner. However, unfortunately, as expected, it can also make serious mistakes.

It appears that ChatGPT 3.5 is particularly proficient when it is in "free-wheeling" mode, as was the case with the cover letter. However, when it comes to solving a problem or summarizing a scientific theory, the results obtained are much more fluctuating, sometimes random, erroneous, or strictly false, while lacking consistency, coherence and at times reproducibility (which likely stems from the random sampling performed).

It is therefore important to constantly keep in mind this fallibility and not rely solely on it for making decisions, choices, or taking responsibilities. This remark is especially relevant for students: for example, the main objective of an essay is not to fill multiple pages, but rather to question oneself and to reflect on existence, the universe,... This cannot be achieved by mindlessly copying text produced by a machine.

It is thus quite evident that the principles used by ChatGPT 3.5 do not allow for the construction of a universal intelligent and creative system that can match or surpass our own abilities, even though some of the results seen in previous examples may be impressive and initially unsettling. However, these experiments should not blind us and ultimately, all of this seems a bit like prestidigitation.

Nevertheless, such a universal system must be achievable, since one exists in nature (us...) and rests on natural laws and nothing more. It will certainly not be achieved, however, by relying only on principles that are ultimately very (too?) simple. Of course, before reaching that point, a long wait may lie ahead, especially if consciousness is necessary for intelligence.

Forewarned is forearmed!




  • [01] - Based on the GPT (Generative Pre-trained Transformer) model.

  • [02] - And this despite the errors it may make. But on one hand, what human being can boast of knowing everything perfectly? And on the other hand, don't these errors make ChatGPT 3.5 more human (errare humanum est)?

  • [03] - Subsequently, the questions asked as well as the responses given by ChatGPT 3.5 were copied and pasted without any alterations, indicating the date on which the test was conducted in order to acknowledge the fact that ChatGPT 3.5 can evolve and improve over time. When necessary, bold formatting was introduced to highlight, for example, any anomalies, errors, inconsistencies,... encountered in ChatGPT 3.5's responses.

  • [04] - At the same time, it is bad news regarding the level of our students in an increasingly less selective system...

  • [05] - At least, that's what we can hope for after the revelations that will follow.

  • [06] - Example inspired by the title of a song by Bob Dylan: "Love is just a four-letter word" (© 1967).

  • [07] - It is assumed, of course, that a TRUE sentence in one language should remain so after translation into another language.

  • [08] - Question inspired by an exercise found in the subtraction chapter of a Mathematics book provided to elementary students by the National Education!

  • [09] - This goes against intuition. Let us recall, by the way, that the cardinalities of the integers, even numbers, prime numbers, rational numbers or even algebraic numbers are all equal (which is actually very easy to prove)!

  • [10] - Fermat's small theorem states that if P is a prime number and A is any positive integer not divisible by P, then A^(P-1) is congruent to 1 modulo P: A^(P-1) = 1 (mod P).

  • [11] - The so-called Fermat's last theorem states that for N>2 there are no three strictly positive integers {A,B,C} that satisfy the equation: A^N + B^N = C^N.

  • [12] - Here are some particularly delightful (but most likely unintentional) excerpts copied without any alterations from the book AVANT LE BIG BANG by Igor and Grichka Bogdanov, published by Le Livre de Poche in 2006: Finally, we note the enthusiastic prologue by Luc Ferry!

  • [13] - On January 30, 2002, I had the opportunity to pose this question to Douglas Hofstadter, a leading expert in Artificial Intelligence (and author of Gödel, Escher, Bach: an Eternal Golden Braid). It took him some time, but eventually he responded, "I don't know...".

  • [14] - To believe that this is impossible would seem to me as foolish as imagining, as was once the case, that heavier-than-air flight is impossible, when birds populate the skies...



  • Copyright © Jean-François COLONNA, 2023-2024.
    Copyright © CMAP (Centre de Mathématiques APpliquées) UMR CNRS 7641 / École polytechnique, Institut Polytechnique de Paris, 2023-2024.