A solution to identifying the origin of immigrants who cannot or will not say who they are

Robert Henderson

One of the difficulties experienced by those enforcing Britain’s immigration controls (and the controls of any advanced country) is the problem posed by immigrants whose origin is unknown. This may be because the immigrant does not speak English or any other widely used language, or because they wish to disguise where they come from, most commonly because they are illegal immigrants, by refusing to give any information about their origins. (The deliberate destruction by immigrants of documents which could identify them and their origins is commonplace.) There are also frequent claims by immigrants to come from somewhere other than their actual place of origin. The upshot of all this is that many immigrants who would in principle be deported cannot be deported because, in most cases, there is no way of being certain where they come from.

What is needed is a cheap, efficient and fast way of identifying where immigrants come from. How do we find that? By considering what language, dialect and accent say about an individual. The way a person speaks tells you a great deal about their origins. It will tell you not only which country they come from, but also which part of the country, because of the differences in accent and dialect. (The Received Pronunciation (RP) form of English commonly used by the better off in England has no regional identifier, but is probably unique in being an accent of a class rather than of a place or region.) It will also give pointers to their social status through such things as the vocabulary and syntax they use. An educated person may speak with a more standardised as well as a more extensive vocabulary than a person with no more than an elementary education or no formal education at all.

For the vast majority of people the way they speak is set sometime in childhood. Children are flexible in their accents until the age of eleven or twelve. After that the accent normally becomes fixed. This means that if a person as an adult speaks with a certain accent it is very probable that they will have spent all or much of their formative years in the place where they acquired the accent.

Someone who was raised in the same community, be it village, town or city, as someone whose place of origin is unknown should be able to identify the person’s origins readily. Those coming from the same general region as someone of unknown origin would probably be able to place the person as coming from the region, although they might not know exactly where. For example, a person living in Birmingham might not be able to distinguish between a Lancastrian and a Yorkshireman, but they would know they were English and from the north. Move to the arena of a nation state of some size and the ability to pin down the exact origin will weaken further, but people will still, in the overwhelming majority of cases, be able to know whether a person is of the same nationality as they are. There will always be points of confusion: where national boundaries have changed, for example, those of Alsace-Lorraine; where nation states have swallowed ethnically different states, for example, China and Tibet; or where there are enclaves of long established ethnic difference within a state, for example, the Basques. Obviously, the size of both the territory and the population within a nation state will put a limit on how far identification of fellow nationals by speech can be carried. However, that is not a great hurdle for our purposes. Taking my own experience as a British citizen raised in England as an example, I would have no difficulty recognising those raised in this country and could place almost all within specific regions (the only accents I might possibly confuse would be some Scottish ones with some Northern Irish ones, but even there I would still have the speakers placed within British territory).

The problem with identifying people whose origins are unknown from their speech  is the lack of information about their speech.  No country is going to have a sufficient supply of natives from all parts of the world to even begin to identify  the variety of people of unknown origin with which  immigration authorities in the developed world have to deal.   What is needed is a substitute for such human witnesses. This could be achieved by developing a computer program which would do the job instead.

Such a program would require a database of voices from as many places in the world as possible. The database would contain examples of speech from particular places, drawn from across the social spectrum of the population of each place. The speech of someone of unknown origin, or of someone whose claimed origin is suspected to be false, could then be run against the database to search for a match.

The proposal would not seem to be beyond current technical skills in IT. All that would be required is a program to compare sounds and words against the database. The general capacity and speed of modern computers should make storing the database and running the program to compare speech against it simple enough. Millions of examples of speech from different places and classes could be held on the database; databases of such a size with efficient search functions already exist, for example, the UK police computer already holds several million records including DNA and fingerprints. There is, consequently, a realistic prospect of creating, with existing technology, a system which would allow examples of speech from almost every place on Earth to be stored and compared with other examples of speech. The database examples could be culled from existing recordings where the origins of the speakers are known or through the interviewing of people whose origins are known.
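
To make the matching step concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any existing system. It assumes that each recording has already been reduced to a short list of numbers (a "feature vector") by some separate speech-analysis step, which is the hard part and is not shown. The names Sample, cosine_similarity and likely_origins, the place labels and all the numbers are invented for the example.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Sample:
    """One database entry: a speaker of known origin (hypothetical layout)."""
    place: str        # place where the speaker is known to have been raised
    features: tuple   # assumed numeric summary of the recording


def cosine_similarity(a, b):
    """Similarity of two feature vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no information
    return dot / (norm_a * norm_b)


def likely_origins(database, unknown, k=3):
    """Take the k database samples most similar to the unknown speaker
    and count how many of them come from each place."""
    ranked = sorted(
        database,
        key=lambda s: cosine_similarity(s.features, unknown),
        reverse=True,
    )
    votes = Counter(s.place for s in ranked[:k])
    return votes.most_common()


# Invented toy data: real entries would come from recordings of
# speakers whose origins are known.
DATABASE = [
    Sample("Leeds", (0.9, 0.1, 0.2)),
    Sample("Leeds", (0.8, 0.2, 0.1)),
    Sample("Glasgow", (0.1, 0.9, 0.3)),
    Sample("Glasgow", (0.2, 0.8, 0.4)),
    Sample("Belfast", (0.2, 0.7, 0.9)),
]

if __name__ == "__main__":
    # Features of a speaker of unknown origin (also invented).
    print(likely_origins(DATABASE, (0.85, 0.15, 0.15)))
```

With this toy data the unknown speaker's three closest matches are the two Leeds samples and one Glasgow sample, so Leeds is ranked first. A real system would need far more samples for each place and a tested way of turning recordings into features, but the principle of comparing an unknown voice with labelled examples is the same.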

Apart from identifying those who cannot or will not reveal any identity or origin to the immigration authorities, such a system could also be used to test the stories of those who give a false identity, or of those who give their correct name and country but claim a false exact place of origin, for example, a town or village. An asylum seeker might correctly claim they came from Nigeria but give the wrong state or tribal group because they have done something criminal in their real place of origin. The program could reveal the lie.

The utility of the program would go beyond the identification of people of unknown national origin for immigration purposes. For example, those who claim they cannot speak English and use a language which is not readily identifiable make criminal investigations related to them effectively impossible. Such a program would identify the language of some, probably most, of them, and once it was identified interpreters could be brought in and an investigation of the alleged crime begun.

Let me stress that what I am suggesting is not a machine for translating what someone says, merely one for identifying the language they are speaking and the place where they were raised. However, the machine translation of speech is progressing and that is also worth investigating, although at present the translations are nowhere near good enough to substitute for a human interpreter. When machine recognition and translation of speech becomes trustworthy both to and from the person requiring a translator, it should in principle be possible to do away with much of the need for human interpreters.

