Archive for November 4th, 2007

akrish.net revamped with wordpress

Sunday, November 4th, 2007

I said I would be moving akrish over to wordpress, and here it is. I’ve imported all of my old articles and have categories that represent the feeds I used to have. Naturally I’m going to have to mess around with settings and configurations a bit, but ultimately there won’t be many other changes to this blog. Please let me know if anything doesn’t seem to be working as you’d expect and I’ll take a look at it.

October Music Recap

Sunday, November 4th, 2007

I didn’t do this for September but my music hasn’t been changing that frequently (mostly due to school). Here’s the recap:

Albums

Chiodos – Bone Palace Ballet

Daphne Loves Derby – Good Night, Witness Light

Brand New – The Devil and God are Raging Inside Me

Brand New – Deja Entendu

Brand New – Your Favorite Weapon

Scary Kids Scaring Kids – Scary Kids Scaring Kids

Nural – Weight of the World

What I’m playing

Coheed and Cambria – Always and Never

Dashboard Confessional – The Good Fight

And some other songs that I can’t think of off the top of my head. I’ve also been spending a lot more time just free-styling rather than playing songs. It’s a good way for me to be creative and it’s also moving me in the right direction toward writing my own songs.

Unfortunately I haven’t had too much time to play guitar, but I’ve started using music as a way to relax, which I’m really happy about.

WordPress Automated Tagging

Sunday, November 4th, 2007

First of all, I’m going to be switching this blog over to wordpress very soon. My own framework is just too much maintenance for me, and adding the features I want takes more time than it should. With wordpress everything will look a lot nicer and work a lot better, and it’ll just make things easier for me. I’ve gotten to the point where I’d rather focus on writing than on site maintenance and debugging.

The other main reason for doing this is that I’d like to make an add-on for wordpress that tries to automatically generate reasonable tags for the writer’s entry. I’m switching over so that I can test this on wordpress. I would do it on my own framework, except I haven’t really finished my tagging system and I don’t plan to, so I have nowhere to test. It’ll be a lot easier to just build on a more robust system like wordpress and not have to worry about all of the details that they’ve taken care of for me, so that I can focus on the project at hand.

I came up with the idea for this project after talking to a company called Metaweb at an internship fair on campus. They’ve built an online query-able database of the world’s information (or some of the world’s information) so that developers can build applications on top of it and take advantage of all of the nicely structured data. The database is Freebase. It’s fun to play around with, but I also see it as pretty useful for building applications that require information from external sources (like the automated tagger).

So I was playing around with Freebase, and I thought that it would be really cool if I could automate tag generation on my blog (mostly because I usually forget to do it anyway). With this structured data I thought that maybe I could take advantage of the tagging that takes place on Freebase, and all I’d have to do is find out which tags to take from there. I could just look for keywords in the post, send queries to Freebase, look at the significant tags, and suggest some of the tags for the post. Sounds pretty straightforward.

The general plan of attack is: keep a postings file (just a word -> entry relationship) of all the significant words in all the posts of this user. Then when a new post is published, look at the significant words and find the ones that seem like good candidates for tags (via some term-weighting scheme like TF-IDF). Then for each of these candidates, look up its Freebase entry and pull out the tags attached to it there. Suggest these tags (or some subset of them) as tags for the article, and let the user choose which of the suggested tags he/she likes.
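To make the first two steps of that plan concrete, here is a minimal sketch in Python of a postings file and a TF-IDF ranking of candidate words. Everything in it is an assumption for illustration: the stop-word list, the function names, the smoothed IDF formula, and the choice of Python at all (a real wordpress add-on would be written in PHP). The Freebase lookup step is deliberately left out.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real plugin would need a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it",
              "i", "my", "on", "for", "with"}

def tokenize(text):
    """Lowercase the text and keep runs of letters that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOP_WORDS]

def build_postings(posts):
    """Map each word to the set of post ids that contain it (the postings file)."""
    postings = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            postings[word].add(post_id)
    return postings

def candidate_words(new_text, postings, num_posts, top_n=5):
    """Rank the words of a new post by TF-IDF against the existing posts."""
    counts = Counter(tokenize(new_text))
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        tf = count / total
        # Smoothed IDF (an assumption): words never seen before score highest
        # and nothing divides by zero.
        doc_freq = len(postings.get(word, ()))
        idf = math.log((1 + num_posts) / (1 + doc_freq)) + 1
        scores[word] = tf * idf
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
```

Words that show up often in the new post but rarely in earlier posts float to the top, which is roughly what "significant" means here. The words returned would then be the ones to look up in an external source such as Freebase.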

Obviously this is a pretty high-level plan of attack and I’ve been told that it’s not going to be that easy, but I certainly think it would be cool. Also, it occurred to me that it doesn’t have to be Freebase that I’m querying. Maybe I could just look at previous posts from this blogger that contain a similar set of candidate words. There are definitely variations of this that may end up being better solutions, but I’ll play around with all of that when I actually start working on it.
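The variation that skips Freebase could look something like the sketch below: compare the new post’s candidate words against those of earlier posts and borrow tags from the most similar ones. The Jaccard overlap measure, the `min_similarity` threshold, and the shape of the `history` list are all assumptions chosen for illustration, not a settled design.

```python
def jaccard(a, b):
    """Overlap between two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def suggest_from_history(candidates, history, min_similarity=0.2):
    """Suggest tags from earlier posts whose candidate words overlap the new post.

    `history` is a list of (candidate_words, tags) pairs, one per earlier post.
    Each tag is weighted by the summed similarity of the posts it came from.
    """
    votes = {}
    for old_words, old_tags in history:
        sim = jaccard(candidates, old_words)
        if sim >= min_similarity:
            for tag in old_tags:
                votes[tag] = votes.get(tag, 0.0) + sim
    # Strongest suggestions first; ties are broken alphabetically.
    return sorted(votes, key=lambda t: (-votes[t], t))
```

One nice side effect of this variation is that every suggestion is a tag the blogger has already used, so the suggestions stay in the blog’s own vocabulary.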

I see this as useful because it not only allows my tags to be thought up for me (which is very convenient), but it could also keep my tags consistent with each other (I won’t have tags for both “photos” and “photography”, for example). That would improve the readability of a blog, as there would be fewer tags and stories would be categorized better. In these respects I think it’s a pretty worthwhile project to take on.
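The consistency part can be sketched on its own: before accepting a new tag, check whether it closely resembles a tag that already exists and reuse that one instead. The sketch below uses `difflib.get_close_matches` from the Python standard library; the function name `canonical_tag` and the 0.55 cutoff are assumptions picked so that the “photos” / “photography” case merges, and a real system would need to tune that cutoff to avoid merging unrelated tags.

```python
import difflib

def canonical_tag(new_tag, existing_tags, cutoff=0.55):
    """Return an existing tag that closely resembles new_tag, or new_tag itself."""
    new_tag = new_tag.strip().lower()
    # get_close_matches returns at most n existing tags whose similarity
    # ratio to new_tag is at least `cutoff`, best match first.
    matches = difflib.get_close_matches(new_tag, existing_tags, n=1, cutoff=cutoff)
    return matches[0] if matches else new_tag
```

For example, with existing tags `["photography", "music"]`, a suggested tag of “Photos” would be mapped onto “photography”, while “guitar” would pass through unchanged as a new tag.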

Also, I think it’s more challenging than the stuff that I usually do. It deals with text search algorithms, efficient and appropriate data storage, and a fair amount of artificial intelligence (as in “is this word significant based on what this user has previously written about and what he’s writing about in this article?”). Fortunately, I’m currently taking a databases course where I’ve already learned about text search and of course data storage. I’m also planning to take an AI course next semester where I may learn some concepts that I can put to use in this project. Again, I haven’t really spent too much time on the project yet, apart from downloading wordpress and going through some of the code, and I may never get around to it if something more important comes up, but I think it’s a pretty cool project that I would enjoy working on. Hopefully I’ll find some time to get it done sometime soon.