C# - ReadText from file in ANSI encoding


I use the Q42.WinRT library to download an HTML file to the cache. When I call ReadTextAsync, I get an exception:

No mapping for the Unicode character exists in the target multi-byte code page. (Exception from HRESULT: 0x80070459)

My code is simple:

    var parsedPage = await WebDataCache.GetAsync(new Uri(string.Format("http://someurl.here")));
    var parsedStream = await FileIO.ReadTextAsync(parsedPage);

I opened the downloaded file, and it is in ANSI encoding. I think I need to convert it to UTF-8, but I don't know how.

The problem is that the original page is not encoded in Unicode. It's Windows-1251, and the ReadTextAsync function only handles Unicode or UTF-8. The way around this is to read the file as binary and then use Encoding.GetEncoding to interpret the bytes with the 1251 code page and produce a string (which is always Unicode).

For example:

    string parsedStream;
    var parsedPage = await WebDataCache.GetAsync(new Uri(string.Format("http://bash.im")));

    // Read the cached file as raw bytes instead of text.
    var buffer = await FileIO.ReadBufferAsync(parsedPage);
    using (var dr = DataReader.FromBuffer(buffer))
    {
        var bytes1251 = new byte[buffer.Length];
        dr.ReadBytes(bytes1251);

        // Interpret the bytes as Windows-1251 to get a regular (Unicode) string.
        parsedStream = Encoding.GetEncoding("windows-1251").GetString(bytes1251, 0, bytes1251.Length);
    }

The challenge is that you don't know which code page the stored bytes are in, so what works here may not work for other sites. Generally, UTF-8 is what you'll get from the web, but not always. The Content-Type response header of the page shows the code page, but that information isn't stored in the file.
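
If the Content-Type value can be captured at download time, picking the encoding from it might look roughly like the sketch below. This is only an illustration under assumptions: CharsetSketch, CharsetFromContentType and Decode are hypothetical names, not part of Q42.WinRT or the framework, and falling back to UTF-8 when no usable charset is found is a choice made here, not something the cache does for you.

```csharp
using System;
using System.Text;

public static class CharsetSketch
{
    // Hypothetical helper: extract the charset name from a Content-Type value
    // such as "text/html; charset=windows-1251". Returns null if none is given.
    public static string CharsetFromContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;

        foreach (var part in contentType.Split(';'))
        {
            var trimmed = part.Trim();
            if (trimmed.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                // Drop the "charset=" prefix and any surrounding quotes.
                return trimmed.Substring("charset=".Length).Trim('"', '\'', ' ');
            }
        }
        return null;
    }

    // Hypothetical helper: decode raw bytes with the named encoding.
    // Assumption: fall back to UTF-8 when the name is missing or unknown.
    public static string Decode(byte[] bytes, string charsetName)
    {
        Encoding encoding = Encoding.UTF8;
        if (charsetName != null)
        {
            try
            {
                encoding = Encoding.GetEncoding(charsetName);
            }
            catch (ArgumentException)
            {
                // Unknown encoding name: keep the UTF-8 fallback.
            }
            catch (NotSupportedException)
            {
                // Code page not available on this platform: keep the fallback.
            }
        }
        return encoding.GetString(bytes, 0, bytes.Length);
    }
}
```

With the bytes read as in the example above, the call would be something like CharsetSketch.Decode(bytes1251, CharsetSketch.CharsetFromContentType(contentType)), where contentType is whatever header value you managed to save alongside the file. Note that on newer .NET runtimes (.NET Core and later) code pages such as windows-1251 are only available after registering the code pages encoding provider, so the fallback branch may be hit there unless that is done.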

