Backing up the Yahoo Groups #archive


Aranel Took
 

The Yahoo Groups are backed up, though it's a bit of a mess. I used a Chrome plug-in to grab the messages in mbox format, a plain-text email format that stores all of a mailbox's messages in a single file. So each Group is just one great big text file. I am still looking for a better way to preserve the forums, but we at least have these info dumps as a last-resort backup while I take the time to find a better solution.
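For anyone who wants to peek inside one of these mbox dumps, here is a minimal sketch using Python's standard mailbox module. It is not how the backup was made, just one way to read the result. The function name list_subjects is made up for illustration, and it assumes the file is a standard mbox.

```python
import mailbox


def list_subjects(mbox_path):
    """Return (date, sender, subject) for every message in an mbox file.

    Hypothetical helper: it assumes mbox_path points at a standard mbox
    file such as the ones exported from the Groups.
    """
    box = mailbox.mbox(mbox_path)
    try:
        return [
            (msg.get("Date", ""), msg.get("From", ""), msg.get("Subject", ""))
            for msg in box
        ]
    finally:
        box.close()
```

Each tuple gives the Date, From and Subject headers of one message, which is enough to check that a dump is complete before doing anything fancier with it.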

I have also copied all of the photos and files from the Groups. I'm adding the date and the poster's username to each file.

So everything from the Groups is now backed up. It's not necessarily in the best form at this point, but it won't be gone forever if Yahoo decides to pull the plug tomorrow.


Aranel Took
 

Success! I used PG Offline to import the MEFA Yahoo Groups into MySQL. PGO did a great job, apart from a few incorrectly escaped characters that caused import errors, which a find/replace fixed. The file was ginormous, so I had to chop it up into 20 individual SQL files, each under 8MB, to import it into the database. But it worked! I can read the posts on my test site!
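The chopping-up step could be scripted rather than done by hand. Below is a rough sketch in Python, not the method actually used here. The function name split_sql is hypothetical, and it assumes every SQL statement ends with a semicolon at the end of a line, which is typical of dump files but not guaranteed.

```python
def split_sql(lines, max_bytes=8 * 1024 * 1024):
    """Group the lines of a SQL dump into chunks of at most max_bytes.

    A chunk is only closed after a line ending in ';', so no statement is
    cut in half. A single statement bigger than max_bytes still goes into
    one oversized chunk. Assumption: a ';' at the end of a line always
    marks the end of a statement.
    """
    chunks, current, size = [], [], 0
    statement, statement_size = [], 0
    for line in lines:
        statement.append(line)
        statement_size += len(line.encode("utf-8"))
        if line.rstrip().endswith(";"):
            # Start a new chunk if this statement would overflow the current one.
            if current and size + statement_size > max_bytes:
                chunks.append("".join(current))
                current, size = [], 0
            current.extend(statement)
            size += statement_size
            statement, statement_size = [], 0
    current.extend(statement)  # trailing lines with no closing ';'
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned string can then be written to its own numbered .sql file and imported one at a time. A message body that happens to end a line with a semicolon inside a multi-line INSERT would fool this check, so the chunks are worth a quick look before importing.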

Now I need to add pagination, some filters, and make them pretty. I guess this is my push to finally get the MEFA Archive site overhauled (top priority is making it responsive so it's readable on a phone). 


Aranel Took
 

Command line importing of MySQL files means no more breaking up ginormous files!

I had to fight with Yahoo's formatting (if you want to call it that) to make all of the messages readable. Some have HTML tags, some do not. Some use <p></p> for paragraphs, some use <br>. Then there are the random <p><span></span></p> tags scattered throughout, which create weird spacing between paragraphs. Yikes! When importing Yahoo messages, preg_replace and str_replace are your friends! There's still some wonky formatting, but the messages are readable.
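To give an idea of the kind of cleanup involved, here is a small sketch in Python, where re.sub plays the role that preg_replace plays in PHP. These are not the actual patterns used on the site. The function name clean_message is made up, and it makes the simplifying assumption that any run of <br> tags should count as a paragraph break.

```python
import re


def clean_message(html):
    """Normalize a message body to plain <p>...</p> paragraphs.

    Illustrative only: assumes any run of <br> tags is a paragraph break.
    """
    # Drop empty <span></span> pairs, then any empty <p></p> left behind.
    html = re.sub(r"<span[^>]*>\s*</span>", "", html, flags=re.I)
    html = re.sub(r"<p(?:\s[^>]*)?>\s*</p>", "", html, flags=re.I)
    # Turn <br> runs and </p><p> boundaries into blank lines.
    html = re.sub(r"(?:<br\s*/?>\s*)+", "\n\n", html, flags=re.I)
    html = re.sub(r"</p>\s*<p(?:\s[^>]*)?>", "\n\n", html, flags=re.I)
    # Remove any remaining <p> or </p> tags.
    html = re.sub(r"</?p(?:\s[^>]*)?>", "", html, flags=re.I)
    # Plain-text messages already separate paragraphs with blank lines.
    parts = [part.strip() for part in re.split(r"\n\s*\n", html)]
    return "".join(f"<p>{part}</p>" for part in parts if part)
```

With this, a message that uses <p> tags, one that uses <br><br>, and one that is plain text with blank lines all come out in the same shape, so the page only needs one set of paragraph styles.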

I'm trying to decide whether to use jQuery for pagination or just write it myself. jQuery is quick and easy, but I'm not sure it will do what I need it to do. 

I'm planning to filter by Thread and Author. I think I'm going to copy Yahoo's "calendar" method for filtering by dates. The subjects and messages will also be searchable by keyword. 

Have some things I need to do this week, then I'll get back to working on getting the Groups up on MEFAwards.org.


Aranel Took
 

Got distracted by MORE THINGS this month, so I won't be able to get the groups onto the website until later this month. Or January. :-\


Aranel Took
 

The MEFA Yahoo Group has been successfully added to my test site! For now, messages can be selected by month (I used the same calendar method that Yahoo had) and by Topic ID from individual posts. I decided against filtering by author and keyword, at least for now. If it's something people really want, I can add it later, but for now the ability to filter by date and topic makes the forum very readable. 
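For the curious, the two filters boil down to two simple queries. Here is a sketch using Python's built-in sqlite3 module as a stand-in for the site's MySQL database. The table name posts and the columns id, topic_id, subject and posted_at are all assumptions for illustration, not the real schema, and posted_at is assumed to be stored as text starting with YYYY-MM.

```python
import sqlite3


def posts_for_month(conn, year, month):
    """Return (id, topic_id, subject) rows for one month, oldest first.

    Assumes a hypothetical posts table whose posted_at column is text
    in 'YYYY-MM-DD HH:MM:SS' form.
    """
    prefix = f"{year:04d}-{month:02d}"
    return conn.execute(
        "SELECT id, topic_id, subject FROM posts "
        "WHERE substr(posted_at, 1, 7) = ? ORDER BY posted_at, id",
        (prefix,),
    ).fetchall()


def posts_for_topic(conn, topic_id):
    """Return (id, topic_id, subject) rows for one topic thread, oldest first."""
    return conn.execute(
        "SELECT id, topic_id, subject FROM posts "
        "WHERE topic_id = ? ORDER BY posted_at, id",
        (topic_id,),
    ).fetchall()
```

Both functions use query parameters rather than building the SQL string by hand, which matters once the month or Topic ID comes from a link someone clicked.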

I also decided against pagination. It's just too much work for something that probably isn't going to be looked at much. Loading an entire month or Topic at a time will also allow you to easily search by keywords using your web browser. 

The Yahoo Archive will be added to the live site with the next update, hopefully this week. 


Aranel Took
 

The MEFA Yahoo Group archive is now available on the MEFA Archive website. http://localhost:8888/mefawards/forum.php

You can read posts by month or by topic thread.