How To Replace Accented Characters With Regular Latin Characters

How To Replace Accented Characters With Regular latin Characters ?

We got a request from a client, to convert the accented characters of multiple Excel files to plain Latin. Excel VBA is great and really handy but there is no method to perform such a task using VB code. On the other hand C# inherits all beautiful .Net libraries, and it’s so easy to solve the above problem. Why not bring the .Net world close to the “vintage” world of VBA?

So here is what we did. We prepared a method in C# (added also other languages samples)  that fixes the problem and then made it public to the VBA IDE:

				
					C#
------------------------
string accentedStr = "Staré Město";
byte[] tempBytes;
tempBytes = Encoding.GetEncoding("ISO-8859-8").GetBytes(accentedStr);
string asciiStr = Encoding.UTF8.GetString(tempBytes);
  
   
            
PHP
------------------------
$accentedStr = "Staré Město";
$tempBytes = iconv("UTF-8", "ISO-8859-8", $accentedStr);
$asciiStr = iconv("ISO-8859-8", "UTF-8", $tempBytes);

echo $asciiStr;




Javascript#1
------------------------
const accentedStr = "Staré Město";

// Encoding the accented string to ISO-8859-8
const utf8Encoder = new TextEncoder();
const utf8Buffer = utf8Encoder.encode(accentedStr);
const latin1Decoder = new TextDecoder("ISO-8859-8", {fatal: true});
const latin1Str = latin1Decoder.decode(utf8Buffer);

// Decoding the ISO-8859-8 string back to UTF-8
const latin1Encoder = new TextEncoder("ISO-8859-8", {NONSTANDARD_allowLegacyEncoding: true});
const latin1Buffer = latin1Encoder.encode(latin1Str);
const utf8Decoder = new TextDecoder();
const asciiStr = utf8Decoder.decode(latin1Buffer);

console.log(asciiStr);



Javascript#2
------------------------
function removeAccents(str) {
		  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
		}


python
------------------------
accented_str = "Staré Město"
temp_bytes = accented_str.encode("ISO-8859-8", errors="replace")
ascii_str = temp_bytes.decode("ISO-8859-8")

print(ascii_str)



Powershell
------------------------
$accentedStr = "Staré Město"

$sourceEncoding = [System.Text.Encoding]::UTF8
$targetEncoding = [System.Text.Encoding]::GetEncoding("ISO-8859-8")

$byteArray = $sourceEncoding.GetBytes($accentedStr)
$tempBytes = $targetEncoding.GetString($byteArray)
$asciiStr = $sourceEncoding.GetString($targetEncoding.GetBytes($tempBytes))

Write-Host $asciiStr




MySQL
------------------------
-- Create a table to store the original and converted strings
CREATE TABLE string_conversion (
    id INT AUTO_INCREMENT PRIMARY KEY,
    original_str VARCHAR(255),
    converted_str VARCHAR(255)
);

-- Insert the accented string into the table
INSERT INTO string_conversion (original_str)
VALUES ('Staré Město');

-- Update the table, converting the original string from utf8mb4 (default charset in MySQL) to latin1 (ISO-8859-1)
-- Note that ISO-8859-8 is not supported in MySQL, so we'll use ISO-8859-1 as an example
UPDATE string_conversion
SET converted_str = CONVERT(CONVERT(original_str USING latin1) USING utf8mb4)
WHERE id = 1;

-- Select the converted string from the table
SELECT converted_str FROM string_conversion WHERE id = 1;




Java
------------------------
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        String accentedStr = "Staré Město";

        byte[] tempBytes;
        try {
            tempBytes = accentedStr.getBytes("ISO-8859-8");
            String asciiStr = new String(tempBytes, StandardCharsets.UTF_8);
            System.out.println(asciiStr);
        } catch (java.io.UnsupportedEncodingException e) {
            System.err.println("Unsupported encoding: " + e.getMessage());
        }
    }
}


				
			

And from the Excel VBA IDE:
We registered the C# Library from Tools | References, and we added the following method:

				
					'
' VBA
'
Public Sub Convert()
                    
    Application.Cursor = xlWait
    
    'Create Object from C#
    Dim obj As CSharpTools.CSharpTools
    Set obj = New CSharpTools.CSharpTools
        
    Dim CountRows As Long
    Dim ActiveCol
    ActiveCol = ActiveCell.Column
    Cells(1, ActiveCol).Select
    
    'Get the last row in the current region
    CountRows = Cells(1, ActiveCol).End(xlDown).Row
    
    Dim i As Long
    For i = 1 To CountRows
    
        'Convert each cell to Latin characters
        Cells(i, ActiveCol).Value = obj.ConvertFromUTF8ToLatin(Cells(i, ActiveCol).Value)
    
    Next i
    
    Application.Cursor = xlDefault
    
End Sub
				
			

With the above code we can convert a whole Excel column at no time at all !

 

WE also BUILT A STAND ALONE QUICK TOOL TO LET YOU CONVERT THOSE ACCENTED CHARS:

Synergy-Replace-Accented-To-Latin
Download your free CONVERSION TOOL

 

The code below is a complete tested C# console app using .Net Framewok 4.8.
You can run this app from the command line.

Give it an input text file with your own text. Leave a space and add the name of the output file and it will auto convert to latin for you at no time at all.

				
					using System;
using System.IO;
using System.Text;

namespace AccentedStringConverter
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            if (args.Length != 2)
            {
                Console.WriteLine("Usage: AccentedStringConverter.exe <inputFilePath> <outputFilePath>");
                return;
            }

            string inputFilePath = args[0];
            string outputFilePath = args[1];

            if (!File.Exists(inputFilePath))
            {
                Console.WriteLine("Error: Input file does not exist.");
                return;
            }

            string inputString = File.ReadAllText(inputFilePath, Encoding.UTF8);
            Console.WriteLine("Original string: " + inputString);

            string asciiStr = ConvertToASCII(inputString);
            Console.WriteLine("Converted string: " + asciiStr);

            File.WriteAllText(outputFilePath, asciiStr, Encoding.UTF8);
            Console.WriteLine("Converted string saved to: " + outputFilePath);
        }

        private static string ConvertToASCII(string accentedStr)
        {
            byte[] tempBytes;
            tempBytes = Encoding.GetEncoding("ISO-8859-8").GetBytes(accentedStr);
            string asciiStr = Encoding.UTF8.GetString(tempBytes);
            return asciiStr;
        }
    }
}
				
			


replace your accented characters using the form below:

Replace Accented Characters

latest posts

Ransomware attacks in 2022 – Protection Measures

The majority of ransomware attacks in 2022 began with the exploitation of public-facing applications, data retrieval from compromised user accounts, and malicious emails, a new report by Kaspersky reveals.According to...

GPT-3 vs GPT-4 improvements, differences, thoughts and examples

GPT-4 builds upon the success of its predecessor, GPT-3, by offering several improvements in various aspects. While both models are based on the Transformer architecture, GPT-4 has several advancements over...

Ransomware severs 1,000 ships from on-shore servers

A Norwegian company specializing in maritime risk management faced a ransomware attack on January 7th 2023 that caused its ShipManager software to go offline, resulting in 1,000 ships losing their...

How important is to keep data backups isolated from malware.

Keeping regular backups of your data is essential in today's digital age. There are many reasons why data backups are important, including: Data Loss Prevention: Accidents happen, and hard drives...

OTHER PRODUCTS