Fix Broken Text Encoding

Do you have a text file with broken encoding? Do you want to strip all invalid characters from it?

Here is how to do it:

iconv -c -t ASCII input.txt

The result is printed to stdout. The -c switch does the stripping: characters that cannot be converted are omitted from the output. With -t you can select any target encoding you like.
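As a minimal sketch of the same idea, the snippet below also names the source encoding with -f and redirects the result into a file instead of stdout. The sample text, the scratch directory, and the file names input.txt and clean.txt are assumptions made up for illustration, and the input is assumed to be UTF-8.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Sample input: "España" encoded as UTF-8 (octal escapes for the two bytes of ñ).
printf 'Espa\303\261a\n' > "$workdir/input.txt"

# -f names the source encoding, -t the target, -c drops what cannot be converted.
# Some iconv builds exit non-zero when -c discards something, hence "|| true".
iconv -c -f UTF-8 -t ASCII "$workdir/input.txt" > "$workdir/clean.txt" || true

# Show the stripped result.
cat "$workdir/clean.txt"
```

Because ASCII has no ñ, that character is dropped from clean.txt. Without -f, iconv assumes the encoding of the current locale, so naming the source explicitly is the safer choice when you know it.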



I had some text from a webpage with weird encoding. It showed the typical characters you see when viewing latin1 text as UTF-8, but trying to save the text didn't work.

Finally, I copied and pasted the text into a file and ran:

iconv -c -t latin1 something.txt

That worked except for two characters. I think they were the only ones that were really UTF-8.

I think you English speakers can't imagine how big this problem can be, but even a small text in Spanish usually contains plenty of non-ASCII characters.

The thing is that this post saved my day. Thank you very much.

