Removing accents in both Unicode and ISO formats

Status
Not open for further replies.

Sleidia

Technical User
May 4, 2001
1,284
FR

Hello,

I need to change my function so that it also
works when special characters are submitted
in Unicode (UTF-8) format.

Any idea how I should do that?

The changes I've made below don't work at all :(

Code:
function hd_dir_safe($string) {

$string = strtr($string, 
"!#$%&'()*+,_./:;<=>?@[\]^`{|}~", 
"                              "
);

    if ($_SESSION["site_all"]["charmode"] == "UTF-8") {
    
    // this doesn't work when the site content is in UTF-8 format
    $string = strtr($string, 
    utf8_encode("ÀÁÂÃÄÅàáâãäåÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖòóôõöÙÚÛÜùúûüÑñ"), 
    "aaaaaaaaaaaaeeeeeeeeiiiiiiiioooooooooouuuuuuuunn"
    );
    
    } else {
    
    // This works when the site content is in ISO format
    $string = strtr($string, 
    "ÀÁÂÃÄÅàáâãäåÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖòóôõöÙÚÛÜùúûüÑñ", 
    "aaaaaaaaaaaaeeeeeeeeiiiiiiiioooooooooouuuuuuuunn"
    );
    
    }


//$string = strtolower($string);
$string = str_replace(" ", "", $string);
$string = str_replace("\"", "", $string);
$string = trim($string);

return $string;

}

Note: the PHP file is saved in ANSI encoding (not UTF-8) and
I would like to keep it that way.

Many thanks!

 

It might be important to add that, with my function above, the UTF-8 string "aéro" is converted into "anoro" instead of "aero".

How can I solve that?
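
One likely explanation, offered as a sketch rather than a confirmed diagnosis: the three-argument form of strtr() translates byte by byte, and each of these accented letters takes two bytes in UTF-8, so the lead byte 0xC3 and the trailing byte are mapped separately. The array form of strtr() replaces whole substrings instead. The constant names and the build_accent_map() helper below are made up for illustration, mb_convert_encoding() needs the mbstring extension, and the accented letters are written as byte escapes so the file can stay in ANSI.

```php
<?php
// Accented letters as ISO-8859-1 bytes, in the same order as in the question.
// Byte escapes are used so the encoding of this file does not matter.
const ACCENTS_LATIN1 =
    "\xC0\xC1\xC2\xC3\xC4\xC5\xE0\xE1\xE2\xE3\xE4\xE5" .
    "\xC8\xC9\xCA\xCB\xE8\xE9\xEA\xEB" .
    "\xCC\xCD\xCE\xCF\xEC\xED\xEE\xEF" .
    "\xD2\xD3\xD4\xD5\xD6\xF2\xF3\xF4\xF5\xF6" .
    "\xD9\xDA\xDB\xDC\xF9\xFA\xFB\xFC" .
    "\xD1\xF1";
const ACCENTS_ASCII =
    "aaaaaaaaaaaa" . "eeeeeeee" . "iiiiiiii" .
    "oooooooooo" . "uuuuuuuu" . "nn";

// Hypothetical helper: maps each accented letter, as a complete UTF-8
// sequence, to its unaccented replacement.
function build_accent_map(): array
{
    $from = ACCENTS_LATIN1;
    $to = ACCENTS_ASCII;
    $map = [];
    for ($i = 0, $n = strlen($from); $i < $n; $i++) {
        // One Latin-1 byte becomes a two-byte UTF-8 sequence.
        $utf8 = mb_convert_encoding($from[$i], 'UTF-8', 'ISO-8859-1');
        $map[$utf8] = $to[$i];
    }
    return $map;
}

$aero = "a\xC3\xA9ro"; // "aéro" as UTF-8 bytes

// Byte-wise translation: the 96-byte "from" string is cut to the 48 bytes
// of "to", and the two bytes of "é" are translated independently.
$utf8From = mb_convert_encoding(ACCENTS_LATIN1, 'UTF-8', 'ISO-8859-1');
$broken = strtr($aero, $utf8From, ACCENTS_ASCII);

// Substring translation: the two bytes of "é" are replaced as one unit.
$fixed = strtr($aero, build_accent_map());

echo $broken, "\n"; // anoro
echo $fixed, "\n";  // aero
```

In the UTF-8 branch of hd_dir_safe(), the strtr() call could then take build_accent_map() as its second argument, while the ISO branch keeps the three-argument form.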

 
OK, I've found a solution here:


Code:
function wd_remove_accents($str, $charset='utf-8')
{
    $str = htmlentities($str, ENT_NOQUOTES, $charset);
    
    // Replace accented-letter entities (e.g. '&eacute;') with the bare letter
    $str = preg_replace('#\&([A-Za-z])(?:acute|cedil|circ|grave|ring|tilde|uml)\;#', '\1', $str);
    $str = preg_replace('#\&([A-Za-z]{2})(?:lig)\;#', '\1', $str); // for ligatures, e.g. '&oelig;'
    $str = preg_replace('#\&[^;]+\;#', '', $str); // remove any remaining entities
    
    return $str;
}
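
A quick usage check, written as a sketch with test strings that are not from the thread. The function is repeated (with the character class written as [A-Za-z]) only so the snippet runs on its own. One side effect worth knowing: htmlentities() also encodes "&", "<" and ">", so the last preg_replace() removes those characters from the input as well.

```php
<?php
function wd_remove_accents($str, $charset = 'utf-8')
{
    // Turn accented letters into named entities, e.g. "é" into "&eacute;".
    $str = htmlentities($str, ENT_NOQUOTES, $charset);

    // Keep only the base letter of each accent entity.
    $str = preg_replace('#\&([A-Za-z])(?:acute|cedil|circ|grave|ring|tilde|uml)\;#', '\1', $str);
    // Keep the two letters of a ligature entity such as "&oelig;".
    $str = preg_replace('#\&([A-Za-z]{2})(?:lig)\;#', '\1', $str);
    // Drop every other entity, including "&amp;", "&lt;" and "&gt;".
    $str = preg_replace('#\&[^;]+\;#', '', $str);

    return $str;
}

// Inputs are UTF-8, written as byte escapes so the file encoding is irrelevant.
$plain = wd_remove_accents("a\xC3\xA9ro");    // "aéro"
$ligature = wd_remove_accents("c\xC5\x93ur"); // "cœur"
$ampersand = wd_remove_accents("R&D");        // the "&" is lost

echo $plain, "\n";     // aero
echo $ligature, "\n";  // coeur
echo $ampersand, "\n"; // RD
```

Since hd_dir_safe() already blanks out punctuation, losing "&", "<" and ">" is probably acceptable there, but it matters if the function is reused elsewhere.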
 
Would this also work?
Code:
$text = iconv('UTF-8', 'US-ASCII//TRANSLIT', $text);
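
It may, depending on the platform. As far as I know, the result of //TRANSLIT depends on the iconv implementation and, with glibc, on the current locale, so an accented letter can come back as a plain letter, as "?", or with extra punctuation, and some implementations reject the suffix altogether. A defensive sketch, where the ascii_translit() wrapper is a made-up name and the iconv extension is assumed to be loaded:

```php
<?php
// Hypothetical wrapper: returns null when iconv() cannot do the conversion.
function ascii_translit(string $text): ?string
{
    // "@" hides the notice iconv() raises on failure; the return value is
    // checked instead.
    $out = @iconv('UTF-8', 'US-ASCII//TRANSLIT', $text);
    return $out === false ? null : $out;
}

$result = ascii_translit("a\xC3\xA9ro"); // "aéro" as UTF-8 bytes

if ($result === null) {
    echo "Transliteration is not available on this system\n";
} else {
    // Whatever comes back contains ASCII bytes only, but not necessarily "aero".
    echo $result, "\n";
}
```

Because the output is not guaranteed to be the same everywhere, it is worth testing on the actual server before relying on it for directory names.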
 