Python - How can I decode a JSON-like string in Cyrillic?


I'm trying to create a simple spider in Scrapy for an adverts site. The problem is that the adverts are in Cyrillic, and the strings come out like this:

1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430 

Here's the spider's code:

    def parse_advert(self, response):
        x = HtmlXPathSelector(response)

        advert = AdvertItem()

        advert['title'] = x.select("//h1/text()").extract()
        advert['phone'] = "111111111111"
        advert['text'] = "text text text text text text"
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(str(advert['title']))

Is there a way to "translate" the string on the fly?

Thanks.

Use str.decode('unicode-escape') (Python 2, where str is a byte string):

    >>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'
    1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430
    >>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'.decode('unicode-escape')
    1-комнатная квартира
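The session above is Python 2. In Python 3, str has no decode method, so the same idea needs a round trip through bytes. The following is a minimal sketch of that, not part of the original answer; the helper names decode_escapes and decode_via_json are hypothetical, and both helpers assume the input is plain ASCII text containing literal \uXXXX sequences.

```python
import json

# Raw string: the backslashes are literal characters, as in the scraped output.
raw = r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'


def decode_escapes(text):
    """Turn literal \\uXXXX sequences in an ASCII-only str into real characters."""
    # Assumption: text is pure ASCII; other input raises UnicodeEncodeError.
    return text.encode('ascii').decode('unicode_escape')


def decode_via_json(text):
    """Alternative: treat the text as the body of a JSON string literal."""
    # Assumption: text has no unescaped double quotes or control characters.
    return json.loads('"' + text + '"')


decoded = decode_escapes(raw)

# Both approaches should agree on this input.
assert decoded == decode_via_json(raw)
```

Here decoded holds the text 1-комнатная квартира. In the spider itself, the escapes most likely appear because str() is applied to the list returned by extract(), which writes the list's repr. Writing a single element encoded as UTF-8, for example advert['title'][0].encode('utf-8') in Python 2, would probably avoid the need to decode at all.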
