How To Replace Accented Characters With Regular Latin Characters

How To Replace Accented Characters With Regular latin Characters ?

We got a request from a client, to convert the accented characters of multiple Excel files to plain Latin.

Excel VBA is great and really handy but there is no method to perform such a task using VB code. On the other hand C# inherits all beautiful .Net libraries, and it’s so easy to solve the above problem. Why not bring the .Net world close to the “vintage” world of VBA?

So here is what we did. We prepared a method in C# (added also other languages samples)  that fixes the problem and then made it public to the VBA IDE:

				
					C#
------------------------
string accentedStr = "Staré Město";
byte[] tempBytes;
tempBytes = Encoding.GetEncoding("ISO-8859-8").GetBytes(accentedStr);
string asciiStr = Encoding.UTF8.GetString(tempBytes);
  
   
            
PHP
------------------------
$accentedStr = "Staré Město";
$tempBytes = iconv("UTF-8", "ISO-8859-8", $accentedStr);
$asciiStr = iconv("ISO-8859-8", "UTF-8", $tempBytes);

echo $asciiStr;




Javascript#1
------------------------
const accentedStr = "Staré Město";

// Encoding the accented string to ISO-8859-8
const utf8Encoder = new TextEncoder();
const utf8Buffer = utf8Encoder.encode(accentedStr);
const latin1Decoder = new TextDecoder("ISO-8859-8", {fatal: true});
const latin1Str = latin1Decoder.decode(utf8Buffer);

// Decoding the ISO-8859-8 string back to UTF-8
const latin1Encoder = new TextEncoder("ISO-8859-8", {NONSTANDARD_allowLegacyEncoding: true});
const latin1Buffer = latin1Encoder.encode(latin1Str);
const utf8Decoder = new TextDecoder();
const asciiStr = utf8Decoder.decode(latin1Buffer);

console.log(asciiStr);



Javascript#2
------------------------
function removeAccents(str) {
		  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
		}


python
------------------------
accented_str = "Staré Město"
temp_bytes = accented_str.encode("ISO-8859-8", errors="replace")
ascii_str = temp_bytes.decode("ISO-8859-8")

print(ascii_str)



Powershell
------------------------
$accentedStr = "Staré Město"

$sourceEncoding = [System.Text.Encoding]::UTF8
$targetEncoding = [System.Text.Encoding]::GetEncoding("ISO-8859-8")

$byteArray = $sourceEncoding.GetBytes($accentedStr)
$tempBytes = $targetEncoding.GetString($byteArray)
$asciiStr = $sourceEncoding.GetString($targetEncoding.GetBytes($tempBytes))

Write-Host $asciiStr




MySQL
------------------------
-- Create a table to store the original and converted strings
CREATE TABLE string_conversion (
    id INT AUTO_INCREMENT PRIMARY KEY,
    original_str VARCHAR(255),
    converted_str VARCHAR(255)
);

-- Insert the accented string into the table
INSERT INTO string_conversion (original_str)
VALUES ('Staré Město');

-- Update the table, converting the original string from utf8mb4 (default charset in MySQL) to latin1 (ISO-8859-1)
-- Note that ISO-8859-8 is not supported in MySQL, so we'll use ISO-8859-1 as an example
UPDATE string_conversion
SET converted_str = CONVERT(CONVERT(original_str USING latin1) USING utf8mb4)
WHERE id = 1;

-- Select the converted string from the table
SELECT converted_str FROM string_conversion WHERE id = 1;




Java
------------------------
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        String accentedStr = "Staré Město";

        byte[] tempBytes;
        try {
            tempBytes = accentedStr.getBytes("ISO-8859-8");
            String asciiStr = new String(tempBytes, StandardCharsets.UTF_8);
            System.out.println(asciiStr);
        } catch (java.io.UnsupportedEncodingException e) {
            System.err.println("Unsupported encoding: " + e.getMessage());
        }
    }
}


				
			

And from the Excel VBA IDE:
We registered the C# Library from Tools | References, and we added the following method:

				
					'
' VBA
'
Public Sub Convert()
                    
    Application.Cursor = xlWait
    
    'Create Object from C#
    Dim obj As CSharpTools.CSharpTools
    Set obj = New CSharpTools.CSharpTools
        
    Dim CountRows As Long
    Dim ActiveCol
    ActiveCol = ActiveCell.Column
    Cells(1, ActiveCol).Select
    
    'Get the last row in the current region
    CountRows = Cells(1, ActiveCol).End(xlDown).Row
    
    Dim i As Long
    For i = 1 To CountRows
    
        'Convert each cell to Latin characters
        Cells(i, ActiveCol).Value = obj.ConvertFromUTF8ToLatin(Cells(i, ActiveCol).Value)
    
    Next i
    
    Application.Cursor = xlDefault
    
End Sub

'''''
''''' With the above code we can convert a whole Excel column at no time at all !
'''''
				
			

The code below is a complete tested C# console app using .Net Framewok 4.8.
You can run this app from the command line.

Give it an input text file with your own text. Leave a space and add the name of the output file and it will auto convert to latin for you at no time at all.

				
					using System;
using System.IO;
using System.Text;

namespace AccentedStringConverter
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            if (args.Length != 2)
            {
                Console.WriteLine("Usage: AccentedStringConverter.exe <inputFilePath> <outputFilePath>");
                return;
            }

            string inputFilePath = args[0];
            string outputFilePath = args[1];

            if (!File.Exists(inputFilePath))
            {
                Console.WriteLine("Error: Input file does not exist.");
                return;
            }

            string inputString = File.ReadAllText(inputFilePath, Encoding.UTF8);
            Console.WriteLine("Original string: " + inputString);

            string asciiStr = ConvertToASCII(inputString);
            Console.WriteLine("Converted string: " + asciiStr);

            File.WriteAllText(outputFilePath, asciiStr, Encoding.UTF8);
            Console.WriteLine("Converted string saved to: " + outputFilePath);
        }

        private static string ConvertToASCII(string accentedStr)
        {
            byte[] tempBytes;
            tempBytes = Encoding.GetEncoding("ISO-8859-8").GetBytes(accentedStr);
            string asciiStr = Encoding.UTF8.GetString(tempBytes);
            return asciiStr;
        }
    }
}
				
			


replace your accented characters using the form below:

Replace Accented Characters

WE also BUILT A STAND ALONE QUICK TOOL TO LET YOU CONVERT THOSE ACCENTED CHARS:

Synergy-Replace-Accented-To-Latin
Download your free CONVERSION TOOL

latest posts

Are you moving away from VMware due to the new pricing method ? – Why not test Microsoft free Hyper-V as a VMware alternative

Since 2010, we've been using Hyper-V, starting from its 2008 version. Hyper-V essentially operates as a Windows Server Core, without a Graphics User Interface, and we deploy it as a...

Seven + 1 Effective Measures for Cybersecurity in Businesses

Cyberattacks are a big risk for all businesses, big and small. They can disrupt how things run and may cause a loss of money and customers. If a company's cybersecurity...

Unveiling Hidden Dangers: The Perils of UPnP and Protecting Your Digital Homestead – Apple Time Capsule

The Dangers of Default Settings: A Narrative on UPnP and Data Vulnerability During one of my routine client visits, I encountered a situation that underscored the criticality of cybersecurity in...

Ransomware attacks in 2022 – Protection Measures

The majority of ransomware attacks in 2022 began with the exploitation of public-facing applications, data retrieval from compromised user accounts, and malicious emails, a new report by Kaspersky reveals.According to...

OTHER PRODUCTS