How To Solve Unicode Character Error In Ae

I have read through similar questions on Stack Overflow, but none of them solves the Unicode problem I have: 'ascii' codec can't decode byte 0xc3 in position 302.

I have tried: import sys; reload(sys); sys.setdefaultencoding('utf-8')


However, the reload(sys) call raises an error: NameError: name 'reload' is not defined

I am trying to read a file containing the Danish letters æ, ø, å. In return I receive "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 302", and so on. Position 302 and onward contain Danish letters. Is there a way to fix this?

So far I have tried putting a specially formatted comment as the first line of the source code: # -*- coding: <ascii> -*-. This had no effect.

I also tried: f = open(fname, encoding='ascii', errors='surrogateescape'). But instead of reading the file with the characters as they are, for example in the word 'Europæiske', I get 'Europ\udcc3\udca6iske'.

Then I tried a suggestion from a blog (I have lost the link) to 'import unicodedata'; however, it was not well explained where to take it from there.



Nadia S

closed as off-topic by Bhargav Rao♦, aschipfl, Brendan Abel, Rick Smith, Pierre Lafortune, Mar 15 '16 at 19:32

This question appears to be off-topic. The users who voted to close gave this specific reason:

'Questions seeking debugging help ('why isn't this code working?') must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers. See: How to create a Minimal, Reproducible Example.' – Bhargav Rao, aschipfl, Rick Smith, Pierre Lafortune

If this question can be reworded to fit the rules in the help center, please edit the question.

2 Answers

Simply open the file with the correct encoding. You have to know the encoding that the file was saved in. On Western versions of Windows it might be Windows-1252, or perhaps UTF-8. Modules such as chardet can make an educated guess. Also, for the csv module, open with newline='' as well (see the documentation for csv.reader).
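As a rough illustration of the advice above, here is a minimal sketch that assumes the file is UTF-8 (the byte 0xc3 in the error message is consistent with that, but it is still an assumption about the asker's file). The sample rows and the temporary path are made up for the example.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the asker's file.
rows = [["land", "navn"], ["DK", "Europæiske"]]

# Hypothetical location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")

# Write the sample as UTF-8 so there is something to read back.
with open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows(rows)

# Read it back, naming the encoding explicitly instead of relying on
# the platform default. newline="" lets the csv module handle line endings.
with open(path, encoding="utf-8", newline="") as f:
    read_rows = list(csv.reader(f))

print(read_rows[1][1])
```

If the file turns out to have been saved as Windows-1252 instead, only the encoding argument in the second open call would change.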

Mark Tolonen

That # -*- coding: -*- thing is only for what's being used in the program itself, for example if you define a variable or function with Danish characters.

What you're dealing with is I/O, so remember the rule: bytes on the edges, Unicode inside. This means decode when reading in and encode when writing out: bytes.decode and str.encode in Python 3 (str.decode and unicode.encode in Python 2).
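A small Python 3 sketch of that rule, assuming the incoming bytes are UTF-8. The byte string is built in memory here to stand in for data read from a file opened in binary mode; the variable names are only illustrative.

```python
# Bytes as they might arrive from a file opened in binary mode.
raw = "Europæiske".encode("utf-8")

# Decoding with the wrong codec reproduces the error from the question.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Edge in: turn bytes into text once, with the right codec.
text = raw.decode("utf-8")

# Inside: work with text only.
upper = text.upper()

# Edge out: turn text back into bytes when writing.
out = upper.encode("utf-8")

print(len(raw), len(text), upper)
```

The letter æ occupies two bytes (0xc3 0xa6) in UTF-8 but is a single character once decoded, which is why the byte count and the character count differ.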

jcomeau_ictx
