Is Alan Turing’s statement correct that machines need to imitate humans in order to be called intelligent?
In 1950, Alan Turing published his paper on the “Imitation Game”, in which he proposed a precise way of defining intelligence for machines. He put forth a test, now famously known as the “Turing Test” (from here on referred to as TT), to determine whether a machine is intelligent: a computer is considered intelligent if its conversation cannot be distinguished from a human’s. Critics have debated the validity of the TT for decades, and their criticisms make profound and interesting points that force the reader to ask whether Turing’s hypothesis is flawed. This essay discusses three major criticisms that challenge Turing’s statement, and gives my opinion as to why most of them are impractical.
Firstly, one of the best-known critiques of the TT is the “Chinese Room” thought experiment by John R. Searle. Searle explains that the TT could easily be passed by a brute-force machine in the Chinese room which is “not” intelligent. This machine has a large look-up dictionary which, for a given input of Chinese characters, returns a set of Chinese characters as output. The output makes sense to a native Chinese speaker, but the machine has no understanding of either the input or the output. The machine fools the native Chinese speaker into believing that he is in fact talking to something that understands Chinese, and therefore the machine passes the TT. Searle argues that such machines could easily fool judges and that the TT is flawed because it examines only a machine’s output, and not the machine’s understanding.
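The brute-force machine described above can be pictured as nothing more than a dictionary look-up. The following is a minimal sketch, not Searle’s own formulation: the names `RULE_BOOK`, `FALLBACK` and `chinese_room` and every entry in the table are hypothetical and invented purely for illustration.

```python
# Hypothetical rule book: each exact input string maps to a canned reply.
# The entries are invented for illustration only.
RULE_BOOK = {
    "你好": "你好！",
    "你会说中文吗？": "会，我会说中文。",
}

# Reply used when the input is not in the table (also an assumption).
FALLBACK = "请再说一遍。"


def chinese_room(message: str) -> str:
    """Return the stored reply for an exact input string.

    Nothing here parses or interprets the characters: the function only
    matches the input against the keys of the table.
    """
    return RULE_BOOK.get(message, FALLBACK)


if __name__ == "__main__":
    print(chinese_room("你好"))
```

The replies may look sensible to a Chinese speaker, yet the program contains no representation of what any character means, which is exactly the gap between output and understanding that the thought experiment points at.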
Nonetheless, the Chinese Room, or more generally any brute-force machine, is impractical to build. This has been argued most comprehensively by Levesque, who showed that the look-up table in Searle’s Chinese room could not exist in the universe. He then points out that there is a book that could potentially pass the TT: the Chinese-English learning manual, a book which teaches you Chinese. Therefore, in order to pass the TT, a machine will have to somehow learn the language instead of using brute force.
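A rough count gives a feel for why such a table is impractical. The sketch below is not Levesque’s own calculation: the figures of 3000 distinct characters and inputs of up to 50 characters are assumptions chosen only for illustration, and the value of about 10^80 atoms in the observable universe is a commonly cited rough estimate.

```python
# Commonly cited rough estimate of atoms in the observable universe
# (an assumption used only as a yardstick).
ATOMS_IN_UNIVERSE = 10**80


def table_entries(symbols: int, max_length: int) -> int:
    """Count the distinct input strings of length 1..max_length
    that can be written with `symbols` different characters."""
    return sum(symbols**n for n in range(1, max_length + 1))


if __name__ == "__main__":
    # Hypothetical figures: 3000 characters, inputs of up to 50 characters.
    entries = table_entries(symbols=3000, max_length=50)
    print(entries > ATOMS_IN_UNIVERSE)
```

Under these assumptions the table would need one entry per possible input, and the number of inputs grows exponentially with their length, so even a single short message already outruns any physical storage.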
Secondly, in a TT the human judge interacts only through text messages and has no visual or physical contact with the subjects. On one hand, this may seem fair, because we should judge intelligence not by how a subject looks but by its conversation skills. On the other hand, a few philosophers hold that language may not capture all the types of intelligence that humans have, and hence that the TT is not comprehensive. Gunderson compares the TT with “the Toe-Stepping Game” (referred to as TSG), in which the judge’s toes are stepped on either by a human or by a rock-and-lever mechanism (which drops the rock on the judge’s toe and then removes it quickly, just as a human would remove a foot). The judge then has to tell whether his toe was stepped on by the human or hit by the rock. Gunderson claims that it is very easy for the lever mechanism to pass the test: the judge might think that the rock is actually a human. He concludes that the TSG is flawed because it reflects only one facet of the rock’s abilities. He then draws an analogy between the TSG and the TT, stating that, just like the TSG, the TT is flawed because it takes into consideration only the linguistic capabilities of the machine.
However, Gunderson’s argument rests on one mistaken assumption: that the act of toe-stepping in the TSG is analogous to language imitation in the TT. The TSG does not involve any capability of a machine other than stepping on a toe, whereas linguistic capability covers a broad range of abilities. As a matter of fact, humans use language to learn almost everything in their lives, and language is a general tool for learning. In other words, how do we know whether a human is intelligent or not? Through language. So, in essence, the TSG and the TT are not comparable.
Finally, many philosophers claim that the TT cannot assess forms of intelligence unknown to humans. Since the judge in the TT is a human, and the computer is competing against another human’s messages, the TT could test only for “human intelligence” and not for general intelligence. To explain this, French gives the example of the “Seagull Test”, in which the isolated native inhabitants of a Nordic island want to capture the notion of flight. The only flying object they have seen is the Nordic seagull, which is native to their island. So the inhabitants develop a test in which they are shown two 3D radar screens, one showing a seagull and the other a potentially flying object. Only if the inhabitants cannot tell which of the two is the seagull is the other object deemed able to fly. So if the inhabitants were shown a seagull and an airplane on the two radars, they would conclude that the airplane cannot fly, which is wrong. French uses the Seagull Test to refute the TT: just as the Seagull Test does not test for flight in general but only for flying like a Nordic seagull, the TT does not test for general intelligence but only for human intelligence.
However, French makes one unjustified assumption: that if the Nordic inhabitants saw an airplane on the radar, they would immediately classify it as a non-flying object. This is not likely. The inhabitants might instead classify the plane as an unknown object that is doing something they are not aware of, and might go out, study the plane, and then refine their test accordingly. The same could happen with the TT when it is presented with a totally new intelligence unknown to humans: after experiencing this new way of conversing, the judge, instead of dismissing the conversation outright as unintelligent, could recognise a totally new intelligence.
To conclude, the main argument of this essay has been that, even though there are many arguments against the TT, they do not offer solid practical examples that refute it. One can argue that the TT can test only human intelligence, but we currently do not know whether other types of intelligence exist. We have no way of knowing whether these criticisms are correct until we have access to more intelligent machines than we have today, or until we research intelligence in more detail. The TT may not always remain an ideal test, but it is so far one of the best tests for measuring the intelligence of a machine.
 J. R. Searle. Minds, Brains, and Programs. In Behavioural and Brain Sciences, 1980
 H. J. Levesque. Is it Enough to Get the Behaviour Right? 2009
 K. Gunderson. The Imitation Game. In Mind, 1964
 Katrina LaCurts. Criticisms of the Turing Test and Why You Should Ignore (Most of) Them, pages 4-8, 2011
 R. M. French. Subcognition and the Limits of the Turing Test. In Mind, 1990