
Key: MBH-70
Type: Improvement
Status: Open
Priority: Normal
Assignee: Unassigned
Reporter: nikki
Votes: 0
Watchers: 2
MusicBrainz Hosting

Don't convert valid UTF-8 lines in the chat logs

Created: 10/Jan/11 11:55 PM   Updated: 10/Jun/12 11:27 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None


Description

copied from http://bugs.musicbrainz.org/ticket/4396

It would be really nice if lines in the chat logs that are already valid UTF-8 were displayed as such. At the moment every line is converted from latin1; it would be better to convert only the lines that aren't valid UTF-8.
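A minimal sketch of the requested behaviour, assuming the log converter is written in Python and handles each line as raw bytes (the ticket does not say how the converter is implemented, and `decode_log_line` is a hypothetical name): try strict UTF-8 first and fall back to latin1 only when that fails.

```python
def decode_log_line(raw: bytes) -> str:
    """Decode one raw chat-log line, preferring UTF-8 over latin1.

    Hypothetical helper for illustration; not taken from the real converter.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin1 maps every byte value to a code point, so this cannot fail.
        return raw.decode("latin-1")
```

Pure ASCII lines come out the same either way, so only lines containing non-ASCII bytes are affected. A latin1 line that happens to form valid UTF-8 would be misread, but such byte sequences are rare in ordinary text.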

Example of UTF-8 displaying as complete gibberish: http://chatlogs.musicbrainz.org/2009/2009-01/2009-01-09.html#T01-01-59-679352



patate12 added a comment - 11/Jan/11 10:38 AM

Hi Nikki,
But this problem is already solved now, isn't it?
I don't see any more garbage text in the chat logs. Do you?


nikki added a comment - 23/Jan/11 12:21 AM

Looking at the Trac ticket, it seems to have been left open because the archives didn't get converted. If we're not going to convert the archives, then it's fixed. If we are, then it's not.


Ian McEwen added a comment - 10/Jun/12 11:27 PM

Re the Trac ticket: "If anyone has any suggestions for how to auto-decode byte-soup using the correct encoding (and thence to utf-8), then this can be fixed more fully." Mozilla's chardet might do this for us.

Also, a cleanup script should definitely be run on the archives. If someone can point me at where to look for this I can look into it.
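One way such a cleanup could work, as a rough sketch only: it assumes the archived lines are UTF-8 text that was mis-decoded as latin1 (ISO-8859-1, not cp1252) and then stored as text, and that the script is in Python. `repair_line` is a hypothetical name, not an existing tool. The idea is to reverse the bad conversion and keep the original line whenever the result is not valid UTF-8.

```python
def repair_line(text: str) -> str:
    """Undo a latin1 mis-decoding of UTF-8 text, if that is what happened.

    Hypothetical helper for illustration; returns the input unchanged when
    the line does not look like mis-decoded UTF-8.
    """
    try:
        # Recover the original bytes, then decode them strictly as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the line has characters outside latin1 (already proper
        # Unicode) or the bytes are not valid UTF-8 (genuine latin1 text).
        return text
```

Lines that were genuinely latin1 are left alone because their bytes fail strict UTF-8 decoding, and ASCII-only lines round-trip unchanged. It would still be worth running this on a copy of the archives and reviewing the diff before replacing anything.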