If you deal with last names for any largish set of people, you ought to consider the following cases. For another take (similar and overlapping, but more culture-oriented than programming-oriented), see Personal Names Around the World.

Family Names

These are some kinds of last names programmers I know have mangled or rejected:

  • MacDonald (included capital)
  • O'Brien (included punctuation)
  • Armstrong Zwicky (internal space, too long)
  • d'Aulaire (no initial cap, included punctuation)
  • van den Berg (two missing capitals, internal spaces)
  • Smith y Ibarra (internal word not capitalized, internal spaces)
  • O (one character)
  • Satyanarayanan (too long)
  • St. Pierre (different included punctuation, internal spaces)
  • ffollett (no capitals at all)

If you want to know a family name, don't ask for a "last name", because some people come from cultures where the family name comes first. You will get the names backwards, and disentangling them will be painful.
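
One way to avoid mangling or rejecting the family names listed above is to validate as little as possible and to store the name exactly as typed. The sketch below is only an illustration of that idea: the function name `is_plausible_family_name` and the rule it applies (non-empty, no control characters) are assumptions of mine, not a recommended standard.

```python
import unicodedata

def is_plausible_family_name(name: str) -> bool:
    """Reject only input that is clearly not a name.

    Assumed rule for this sketch: the name must be non-empty after trimming
    and must not contain control characters. Capitals, spaces, punctuation,
    and length are deliberately not checked.
    """
    stripped = name.strip()
    if not stripped:
        return False
    # Unicode category "Cc" is the control-character category (NUL, TAB, ...).
    return not any(unicodedata.category(ch) == "Cc" for ch in stripped)

# The family names from the list above, all of which should be accepted as typed.
SAMPLES = [
    "MacDonald", "O'Brien", "Armstrong Zwicky", "d'Aulaire", "van den Berg",
    "Smith y Ibarra", "O", "Satyanarayanan", "St. Pierre", "ffollett",
]

for sample in SAMPLES:
    print(sample, is_plausible_family_name(sample))
```

Note that the function never changes capitalization or strips punctuation; it only answers yes or no, and the stored value stays whatever the person entered.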

In general, initial words that aren't capitalized aren't considered when alphabetizing, either. This only matters if your alphabetized list is going to be used primarily by people who know this; if your company has one "van Beethoven", alphabetizing correctly is probably not useful, since the person with the name is more likely to understand the difference in conventions than anybody else. Alphabetize in a way that works for the searchers. (Note that this may mean alphabetizing in different ways in different countries. Internationalization is a tough problem. Sorry.)
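
As a rough sketch of the two alphabetizing choices, the function below builds a sort key that can either skip leading uncapitalized words or keep them. The name `sort_key` and the `skip_particles` flag are hypothetical, and the rule is deliberately naive: it only looks at whole words, so it does nothing special for a name like "d'Aulaire", and it ignores locale-specific collation entirely.

```python
def sort_key(family_name: str, skip_particles: bool = True) -> str:
    """Return a case-insensitive key for alphabetizing a family name.

    With skip_particles=True, leading words that start with a lowercase
    letter are ignored, but at least one word is always kept.
    """
    words = family_name.split()
    start = 0
    if skip_particles:
        while start < len(words) - 1 and words[start][0].islower():
            start += 1
    return " ".join(words[start:]).casefold()

names = ["van den Berg", "Armstrong Zwicky", "MacDonald", "van Beethoven"]

# Convention described above: "van Beethoven" files under B.
by_convention = sorted(names, key=sort_key)

# Searcher-friendly alternative: file every name under its first letter as written.
as_written = sorted(names, key=lambda n: sort_key(n, skip_particles=False))

print(by_convention)
print(as_written)
```

Which of the two orderings to use is a decision about who will be searching the list, not something the code can settle.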

First and Middle Names

First names can also have spaces in them. Some people with the name "Marie Claire Johnson" have the first name "Marie" and the middle name "Claire". Some of them have the first name "Marie Claire" and of those, many come from places where you would just know that, so they may not find it important to tell you. Oh, and if poor Marie Claire is tired of being called "Marie", she may write her name "Marie-Claire" so be prepared for hyphens. In fact, be prepared for all the same punctuation and length issues you saw with family names.

Not everybody has a middle name. Some people have more than one. Some people have a middle initial but no name to go with it. Some people use their middle name and not their first name, so taking the official name in human resources may not work.

Not everybody has a first name. Some people use only one name, which usually gets assigned to the "family name" slot. Some people use two initials.

Some people are normally called something which has no relationship to their legal name at all.
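
To make the cases above concrete, here is a minimal sketch of a record in which only the family-name slot is required and everything else is optional. The class `PersonName` and its field names are hypothetical, and the fallback ordering in `display` (given, middle, family) is itself an assumption that will be wrong for cultures where the family name comes first, which is one reason the sketch prefers whatever the person says they are called.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PersonName:
    family_name: str                       # the only required slot; also holds a single name
    given_name: Optional[str] = None       # may contain spaces or hyphens ("Marie Claire")
    middle_names: Optional[str] = None     # none, one, several, or a bare initial
    preferred_name: Optional[str] = None   # what the person is actually called

    def display(self) -> str:
        """Prefer the name the person supplied for everyday use."""
        if self.preferred_name:
            return self.preferred_name
        # Assumed fallback order: given, middle, family. Skip anything missing.
        parts = [self.given_name, self.middle_names, self.family_name]
        return " ".join(p for p in parts if p)

print(PersonName("O").display())
print(PersonName("Johnson", given_name="Marie Claire").display())
print(PersonName("Johnson", given_name="Marie", middle_names="Claire",
                 preferred_name="Claire Johnson").display())
```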

Changing Names

People, of both genders, change names (and occasionally genders!) for a variety of reasons. If you have a process where somebody shows up, you ask them for a name once, and that gets pushed to individual entries in some large number of databases which have no further co-ordination, plan what you are going to do for changes. Particularly changes caused by somebody (possibly you) getting things wrong the first time. Not that I spent four years fixing my name in various databases in college after a data-entry clerk typoed it, or anything.

Additions to Names

Being able to tell Mr. John Smith Sr. from Mr. John Smith Jr. from Mr. John Smith III could save you from 10 years of shipping shareholder reports to a customer. Or it could have done so for one company my father bought from and his father held stock in. But apparently they thought those were decorative.

Accented Characters

If you cannot deal with accents, be sure you don't get them as input, because Mr. Le Carré may be okay with becoming Le Carre, but will not be happy with becoming Le Carr or "invalid character".

Also be aware that if your accent handling is inconsistent, you have no way of knowing whether Ms. Snör, Ms. Snoer, and Ms. Snor are three people, two people, or one person. (In German they should be two people, because "Snör" and "Snoer" are valid ways of writing the same name and "Snor" is not, but that is not true for every language that uses ö.)
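
One source of inconsistent accent handling that is easy to demonstrate is that the same accented letter can arrive either as one code point or as a base letter plus a combining mark. The sketch below uses Python's standard `unicodedata` module to show that, and to show why blindly stripping accents is lossy. The helper names `canonical` and `strip_accents` are mine, and choosing NFC as the stored form is an assumption, not the only reasonable choice.

```python
import unicodedata

def canonical(name: str) -> str:
    """Normalize to NFC so equal-looking names compare equal."""
    return unicodedata.normalize("NFC", name)

def strip_accents(name: str) -> str:
    """Lossy fallback: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

composed = "Le Carr\u00e9"      # e-acute as a single code point
decomposed = "Le Carre\u0301"   # plain e followed by a combining acute accent

print(composed == decomposed)                        # the raw strings differ
print(canonical(composed) == canonical(decomposed))  # the normalized strings match

# Stripping accents keeps Le Carre readable, but it also turns "Snör" into
# "Snor", which the text above says is a different name in German.
print(strip_accents(composed))
print(strip_accents("Sn\u00f6r"))
```

Normalizing on input makes comparisons consistent. It does not decide whether "Snör" and "Snoer" are the same person; that is a language-specific rule the code above does not attempt.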

Conflicts and Special Conflicts

Names of people that are problematic:

  • Root
  • Roger Oot
  • Core
  • True
  • Test
  • Cron (and Mr. Cron's first name began with 'A', too).

For all I know, there are people named "Administrator". Or "Oracle". There are people with the initials "DBA" who may not want to administer your databases. Make sure you know all the cases you need to exclude if you are auto-generating usernames.
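
A minimal sketch of such an exclusion check follows. The username scheme (first initial plus family name) is assumed purely for illustration, and the `RESERVED` set contains only the examples from this section; a real list would have to come from the systems you actually run.

```python
# Illustrative only: built from the examples in this section, not a complete list.
RESERVED = frozenset({
    "root", "core", "true", "test", "cron",
    "administrator", "oracle", "dba",
})

def candidate_username(first_name: str, family_name: str) -> str:
    """Assumed scheme: first initial + family name, letters and digits only, lowercase."""
    raw = first_name[:1] + family_name
    return "".join(ch for ch in raw if ch.isalnum()).lower()

def needs_review(username: str) -> bool:
    """True if the generated username collides with a reserved account name."""
    return username.lower() in RESERVED

for first, family in [("Roger", "Oot"), ("Mary", "O'Brien"), ("", "Root")]:
    name = candidate_username(first, family)
    print(name, "needs review" if needs_review(name) else "ok")
```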

The world is full of people who have the same first and last names, and it's not that uncommon to have a full-on collision with the names identical including the middle one. This only gets worse the more you truncate the name. No matter what you do, you need a policy for collisions. Murphy's law dictates that the collision will always involve somebody famous, infamous, or powerful. For instance, I once worked at a startup where the CTO's first name was the Marketing VP's last name. The Marketing VP wanted his last name as his email address until he'd tried it for about a week. Then he got tired of getting the CTO's email. (The CTO didn't use his first name as his email address, but it made no difference.) There were two "Dan Farmer"s at SGI when one of them was famous for a program usually known as SATAN. The other one rapidly got tired of both the abuse and the fan mail.
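
The text does not say which collision policy to choose, only that you need one. As one possible policy, sketched under my own assumptions, the function below hands out the plain username to whoever arrives first and appends a number for everyone after that. The function name `assign_username` and the in-memory `taken` set are hypothetical stand-ins for whatever account store you really have.

```python
def assign_username(base: str, taken: set) -> str:
    """Return a username not already in `taken`, and record it there.

    Assumed policy: first come, first served; later arrivals get base2, base3, ...
    """
    if base not in taken:
        taken.add(base)
        return base
    suffix = 2
    while f"{base}{suffix}" in taken:
        suffix += 1
    unique = f"{base}{suffix}"
    taken.add(unique)
    return unique

taken = set()
first = assign_username("dfarmer", taken)
second = assign_username("dfarmer", taken)
third = assign_username("dfarmer", taken)
print(first, second, third)
```

A numeric suffix makes the accounts unique, but it does nothing about misdirected mail of the kind described above, so it is a starting point for a policy rather than a complete one.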

There are people for whom truncating their name wrong or tacking an initial on produces laughable or insulting effects. Or just confusing ones. Dilbert's Brenda Utthead (who objects to having her first initial and last name as a username) is the best apocryphal example, but real ones occur. Mary Ellen Cummings was not happy with "first 6 letters of last name, first initial, middle initial".

Honorifics

You are best off letting people write in what they want, with a few options (Mr., Miss, Ms., Mrs., Dr.) presented as easy choices. If you allow people to leave it blank, do not automatically fill in a choice. I get mail for "Mr. Zwicky" and my father gets mail for "Ms. Zwicky" because of people's assumptions about what a blank means. If you do not allow people to leave it blank, provide a gender-neutral option.
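
The "do not fill in a blank" rule can be sketched as a formatting function that treats the honorific as free text and simply omits it when it is missing. The function name `salutation` and its parameters are hypothetical, and falling back to the person's display name is my assumption about what a sensible blank-honorific greeting looks like.

```python
from typing import Optional

def salutation(display_name: str, family_name: str,
               honorific: Optional[str] = None) -> str:
    """Build a salutation without ever guessing an honorific.

    The honorific is whatever the person wrote in. If it is blank or
    missing, use the display name alone rather than defaulting to "Mr."
    or "Ms.".
    """
    cleaned = (honorific or "").strip()
    if cleaned:
        return f"{cleaned} {family_name}"
    return display_name

print(salutation("Marie Claire Johnson", "Johnson", "Dr."))
print(salutation("Marie Claire Johnson", "Johnson"))
print(salutation("Marie Claire Johnson", "Johnson", "   "))
```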

Do not assume honorifics translate automatically between languages. For a long time, I was Mme. Zwicky in French but Ms. Zwicky in English (because in English it's all about whether or not you're married, but in French it has to do with how grown up you are). If you are capable of handling honorifics in multiple languages, do not assume that you can pick a language based on the country people live in; I asked one of my French-speaking Swiss colleagues once about a notice I got in German, and he shrugged and said "I wouldn't know; I just throw out anything that addresses me as 'Herr', it can't really be for me."