If you are going to use the usernames to generate URLs and the like, I would stick with the standard ASCII character set and avoid special characters. A simple approach is to check that every character is in 0-9, a-z or A-Z. A few other characters, such as the underscore, hyphen, period and tilde, are also allowed in URLs without encoding.
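As a rough sketch of that check in Python (the function name `is_valid_username` and the choice to allow only letters, digits and the underscore are my own assumptions, so adjust the character class to your own rules):

```python
import re

# Assumed rule for illustration: ASCII letters, digits and underscore only.
# The explicit ranges keep the match ASCII-only, unlike \w, which also
# matches non-ASCII letters in Python 3.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_username(name: str) -> bool:
    """Return True if every character of name is in the allowed ASCII set."""
    # fullmatch requires the whole string to match, so an empty string
    # or any stray character makes the check fail.
    return USERNAME_RE.fullmatch(name) is not None


print(is_valid_username("john_doe42"))  # True
print(is_valid_username("john doe"))    # False (contains a space)
print(is_valid_username("jöhn"))        # False (non-ASCII letter)
```

A username that passes this check can be put into a URL path as-is, with no percent-encoding needed.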
Here is a quick link that describes URLs and the characters allowed in them:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm