One of the difficulties faced by those enforcing Britain’s immigration controls (and the controls of any advanced country) is the problem posed by immigrants whose origin is unknown. This may be because the immigrant does not speak English or any other widely used language, or because they wish to disguise where they come from, most commonly because they are illegal immigrants, and so refuse to give any information about their origins. (The deliberate destruction by immigrants of documents which could identify them and their origins is commonplace.) There are also frequent claims by immigrants to come from somewhere other than their actual place of origin. The upshot of all this is that many immigrants who would in principle be deported cannot be, because in most cases there is no way of being certain where they come from.
What is needed is a cheap, efficient and fast way of identifying where immigrants come from. How do we find that? By considering what language, dialect and accent say about an individual. The way a person speaks tells you a great deal about their origins. It will tell you not only which country they come from but, because of differences in accent and dialect, which part of the country. (The Received Pronunciation (RP) form of English commonly used by the better off in England has no regional identifier, but it is probably unique in being the accent of a class rather than of a place or region.) It will also give pointers to their social status through such things as the vocabulary and syntax they use. An educated person may speak with a more standardised as well as a more extensive vocabulary than a person with no more than an elementary education or no formal education at all.
For the vast majority of people the way they speak is set sometime in childhood. Children are flexible in their accents until the age of eleven or twelve. After that the accent normally becomes fixed. This means that if a person as an adult speaks with a certain accent it is very probable that they will have spent all or much of their formative years in the place where they acquired the accent.
Someone who was raised in the same community, be it village, town or city, as someone whose place of origin is unknown should be able to identify that person’s origins readily. Those coming from the same general region as someone of unknown origin would probably be able to place the person as coming from the region, although they might not know from exactly where. For example, a person living in Birmingham might not be able to distinguish between a Lancastrian and a Yorkshireman, but they would know they were English and from the north. Move to the arena of a nation state of some size and the ability to pin down the exact origin will weaken further, but people will still, in the overwhelming majority of cases, be able to tell whether a person is of the same nationality as they are. There will always be points of confusion: where national boundaries have changed, for example, those of Alsace-Lorraine; where nation states have swallowed ethnically different states, for example, China and Tibet; or where there are enclaves of long-established ethnic difference within a state, for example, the Basques. Obviously, the size of both the territory and the population of a nation state will limit how far identification of fellow nationals by speech can be carried. However, that is not a great hurdle for our purposes. Taking my own experience as a British citizen raised in England as an example, I would have no difficulty recognising those raised in this country and could place almost all within specific regions (the only people I might possibly confuse would be some with Scottish accents and some with Northern Irish ones, but even there I would still have them placed within British territory).
The problem with identifying people whose origins are unknown from their speech is the lack of information about their speech. No country is going to have a sufficient supply of natives from all parts of the world to even begin to identify the variety of people of unknown origin with which immigration authorities in the developed world have to deal. What is needed is a substitute for such human witnesses. This could be achieved by developing a computer program which would do the job instead.
Such a program would require a database of voices from as many places in the world as possible, with each example of speech tied to a particular place. The samples for each place would need to be drawn from across the social spectrum of its population. The speech of someone of unknown origin, or of someone whose claimed origin is suspected to be false, could then be run against the database to search for a match.
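To make the idea of running a voice against the database concrete, here is a minimal sketch in Python. It rests on assumptions that are mine and not part of the proposal: that each recording has already been reduced to a short list of numbers describing how it sounds (the hard part, which is not shown), and that the closest match is found by a standard measure called cosine similarity. The place names, the figures and the function names are all invented for illustration.

```python
import math

# Hypothetical database: each entry pairs a known place of origin with a
# numeric summary of one speech sample. The numbers are invented; a real
# system would have to derive them from recordings.
SAMPLES = [
    ("Lancashire", [0.9, 0.1, 0.2]),
    ("Lancashire", [0.8, 0.2, 0.1]),
    ("Glasgow", [0.1, 0.9, 0.3]),
    ("Belfast", [0.2, 0.4, 0.9]),
]


def cosine_similarity(a, b):
    """Return how closely two vectors point the same way (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # An all-zero vector carries no information, so treat it as no match.
    return dot / norm if norm else 0.0


def rank_origins(unknown, samples, top_k=3):
    """Rank places by the score of their best-matching sample, highest first."""
    best = {}
    for place, vector in samples:
        score = cosine_similarity(unknown, vector)
        if score > best.get(place, -1.0):
            best[place] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # A speaker of unknown origin, summarised in the same (assumed) way.
    for place, score in rank_origins([0.85, 0.15, 0.15], SAMPLES):
        print(f"{place}: {score:.3f}")
```

The sketch returns a ranked list of candidate places and not a single answer, since a score of this kind measures resemblance and would need to be weighed alongside other evidence.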
The proposal would not seem to be beyond current technical skills in IT. All that would be required is a program to compare sounds and words against the database. The general capacity and speed of modern computers should make storing the database, and running the program to compare speech against it, simple enough. Millions of examples of speech from different places and classes could be held on the database. Databases of such a size with efficient search functions already exist: for example, the UK police computer already holds several million records including DNA and fingerprints. There is, consequently, a realistic prospect of creating a system, using existing technology, which would allow examples of speech from almost every place on Earth to be stored and compared with other examples of speech. The database examples could be culled from existing recordings where the origins of the speakers are known or obtained by interviewing people whose origins are known.
Apart from identifying those who cannot or will not reveal any identity or origin to the immigration authorities, such a system could also be used to test the stories of those who give a false identity, or of those who give their correct name and country but claim a false exact place of origin, for example, a town or village. An asylum seeker might correctly claim to come from Nigeria but give the wrong state or tribal group because they have done something criminal in their real place of origin. The program could reveal the lie.
The utility of the program would go beyond the identification of people of unknown national origin for immigration purposes. For example, those who claim they cannot speak English and who use a language which is not readily identifiable make criminal investigations relating to them effectively impossible. Such a program would identify the language of some, probably most, of them; once it was identified, interpreters could be brought in and an investigation of the alleged crime begun.
Let me stress that what I am suggesting is not a machine for translating what someone says, merely one for identifying the language they are speaking and the place where they were raised. However, machine translation of speech is progressing and that is also worth investigating, although at present the translations are nowhere near good enough to substitute for a human interpreter. When machine recognition and translation of speech becomes trustworthy in both directions, to and from the person requiring a translator, it should in principle be possible to do away with much of the need for human interpreters.