Posts Tagged ‘blogging’

Back in action!

Friday, April 24th, 2009

Yup it’s almost finals time and you know what that means… productivity is going to plummet and I’m going to start blogging again.

Maybe a few of you noticed but my blog has been down for quite some time. I was originally hosting on a free 1-year Facebook accelerator from Joyent, but the problem was, naturally, that I only got it for a year. Anyway, my contract expired, and I’ve been too busy with school and stuff to set my blog up again on my older hosting service. This last week, I’ve somehow had a bunch of free time (or I’ve been really good at not working on important things) so I decided to set things up again. So yup we’re back. I think I will probably be blogging a lot over the summer (as I did last summer), mostly because I hope to have a bit more free time. I should also start posting lots of pictures.

I guess I’ll start things off with an interesting article I found a couple of days ago. These researchers at Yale came up with a new way to analyze the running time of algorithms. The technique is called Smoothed Analysis and the article is here. The actual paper is ridiculously long and I don’t have enough statistics background (or patience) to read the whole thing, but the basic idea was really cool. Also, we just learned about the Simplex method in my algorithms class, and we talked about how in the worst case the algorithm is exponential, but in practice it’s quite efficient. Smoothed analysis explains why: once you add a little random noise to any input, the expected running time comes out polynomial.
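For anyone who wants the idea in symbols, here is a rough paraphrase (not a quote from the paper). The names are assumptions for illustration: T(x) is the running time on input x, g is random Gaussian noise, and sigma is how big the noise is.

```latex
\[
\text{worst case: } \max_{x} T(x), \qquad
\text{average case: } \mathbb{E}_{x}\,[\,T(x)\,], \qquad
\text{smoothed: } \max_{x} \; \mathbb{E}_{g}\,[\,T(x + \sigma g)\,]
\]
```

So the smoothed measure still lets an adversary pick the worst input x, but the input then gets jiggled a little before the algorithm runs. With sigma close to zero this is just worst-case analysis, and as sigma grows it looks more and more like average-case analysis.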

Etnanine

Wednesday, November 28th, 2007

For those who don’t know, I live in a house with some of my friends this year. A couple of weeks ago we decided to start a house blog that all of us would co-author. Since then, some of us have been writing articles on there and we’ve gotten a decent number of hits (not anything like the abundance of readers I get on here though). At any rate, I thought I’d advertise for that site here to maybe pipeline some of my traffic over there as well.

Personally, I think it’s a good site. We have some pretty interesting scientific articles, and of course a lot of life updates from everyone in the house. I guess it’s similar to my blog except it gets updated more frequently and you would get a couple more perspectives on some different things. I recommend that you check it out; I promise you won’t waste too much of your time.

Etnanine

I also just wrote an article on there about multitasking which I kind of wanted to put on here. I guess there’s a conflict of interest in writing for two blogs with so few ideas and so little time for writing, but I’ll figure it out.

WordPress Automated Tagging

Sunday, November 4th, 2007

First of all, I’m going to be switching this blog into a WordPress blog very soon. Keeping up my own framework is just too much maintenance for me and it takes me unnecessary time to add features that I want. With WordPress everything will look a lot nicer and work a lot better and it’ll just make things easier for me. I’ve gotten to the point where I’d rather focus on writing than doing site maintenance and debugging.

The other main reason for doing this is that I’d like to make an add-on for WordPress that tries to automatically generate reasonable tags for the writer’s entry. I’m switching over so that I can test this on WordPress. I would do it on my own framework except I haven’t really finished my tagging system and I don’t plan to, so I have nowhere to test. It’ll be a lot easier to just build off of a more robust system like WordPress and not have to worry about all of the details that they’ve taken care of for me, so that I can focus on the project at hand.

I came up with the idea for this project after talking to a company called Metaweb at an internship fair on campus. They’ve built an online query-able database of the world’s information (or some of the world’s information) so that developers can build applications on top of it and take advantage of all of the nicely structured data. The database is Freebase. It’s fun to play around with, but I see it as pretty useful for building applications that require information from external sources (like the automated tagger).

So I was playing around with Freebase, and I thought that it would be really cool if I could automate tag generation on my blog (mostly because I usually forget to do it anyway). With this structured data I thought that maybe I can take advantage of the tagging that takes place on Freebase, and all I’d have to do is find out which tags to take from there. I could just look for keywords in the post, send queries to Freebase, look at the significant tags, and suggest some of the tags for the post. Sounds pretty straightforward.

The general plan of attack is: keep a postings file (just a word -> entry relationship) of all the significant words in all the posts of this user. Then when a new post is published, look at the significant words and find the ones that seem like good candidates for tags (via some text search weighting scheme like TF-IDF). Then for each of these, look at the Freebase entry for them and pull out the tags on Freebase for these words. Suggest these tags (or some subset of these tags) as tags for the article. Then allow the user to choose which of the suggested tags he/she likes.
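To make the first half of that plan concrete, here is a minimal Python sketch of the postings file and the TF-IDF ranking. It is only an illustration under assumptions: the stopword list, the function names, and the smoothed IDF formula are all made up for this example, and the Freebase lookup step is left out entirely.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real one would be much longer).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "it", "is",
             "my", "on", "for", "with", "from", "more"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_postings(posts):
    """Build the postings file: each word maps to the set of post ids containing it."""
    postings = {}
    for post_id, text in posts.items():
        for word in set(tokenize(text)):
            postings.setdefault(word, set()).add(post_id)
    return postings


def candidate_tags(new_post, postings, num_posts, top_n=3):
    """Rank the words of a new post by TF-IDF against the earlier posts."""
    counts = Counter(tokenize(new_post))
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        tf = count / total
        df = len(postings.get(word, ()))
        # One common smoothed IDF variant; other formulas would work too.
        idf = math.log((1 + num_posts) / (1 + df)) + 1
        scores[word] = tf * idf
    # Highest score first; ties are broken alphabetically to keep the order stable.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
```

Words that show up in nearly every earlier post score low, so they drop out of the candidate list. The words that survive are the ones that would then be looked up on Freebase, or compared against earlier posts.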

Obviously this is a pretty high-level plan of attack and I’ve been told that it’s not going to be that easy, but I certainly think it would be cool. Also, it occurred to me that it didn’t have to be Freebase that I’m querying. Maybe I could just look at previous posts from this blogger that contain a similar set of candidate words. There are definitely variations of this that may end up being better solutions, but I’ll play around with all of that when I actually start working on it.

I see this as useful because it not only allows my tags to be thought up for me (which is very convenient), but it could also keep my tags consistent with each other (I won’t have tags for “photos” and “photography” for example). That would drastically improve the readability of a blog as there would be fewer tags and stories would be categorized better. In these respects I think it’s a pretty worthwhile project to take on.

Also I think it’s more challenging than the stuff that I usually do. It deals with text search algorithms, efficient and appropriate data storage, and a fair amount of artificial intelligence (as in “is this word significant based on what this user has previously written about and what he’s writing about in this article?”). Fortunately, I’m currently taking a databases course where I’ve already learned about text search and of course data storage. I’m also planning to take an AI course next semester where I may learn some concepts that I can put to use in this project. Again, I haven’t really spent too much time on the project yet, apart from downloading WordPress and going through some of the code, and I may never get around to it if something more important comes up, but I think it’s a pretty cool project that I would enjoy working on. Hopefully I’ll find some time to get it done sometime soon.

On akrish.net’s backend

Tuesday, July 24th, 2007

I’ve already briefly highlighted akrish.net’s structure in my previous post, but in this one I’d like to talk about some of the design strategies that I’ve used. Since I’ve already talked about how I read files, I’ll spend some time to briefly talk about how I’m writing into my xml files.

The first good design tactic I used was object oriented programming. It’s really easy to use OOP in Java and some other languages, but in PHP it’s not the first thing that most people think about (at least I don’t). Yet it’s still really useful, because it allows you to compartmentalize all of your code into neat little objects that you can pass around and use really easily. So I have two classes: a Parser class and a Writer class. Both of them handle interactions between the web site and the xml documents that I’m using for feeds. The Parser object is pretty much what I explained in the previous post, except that it’s now an object. The Writer object has a function that writes a blog post into a specified xml feed, and that’s pretty much all it does right now.

OOP is a lot better than a pile of standalone functions in this case for a couple of reasons. First off, in the Parser object, all of the functions deal with an xml_parser object and so require an initialized xml_parser. This is easily solved by just initializing one in the Parser constructor so that I’m only initializing the xml_parser once regardless of how many times I need to use it. Similarly, parser functions need to know what feed to parse, but with OOP I can specify the feed in the constructor, rather than individually in each function. It prevents a lot of repetitive code. Another reason is that I can serialize objects and maintain all of their data as the user moves from page to page. I haven’t taken advantage of this yet, but I’ll need it to allow a blogger to preview his/her new blog post. Anyway, using objects makes the code a lot easier to read/change.
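Here is a rough sketch of the constructor idea. It is written in Python rather than the site’s PHP, so it is an illustration and not the actual Parser class. The element names (entry, date, title, content) and the method names are assumptions.

```python
import xml.etree.ElementTree as ET


class Parser:
    """Reads entries from one XML feed. The feed is chosen once, in the constructor."""

    def __init__(self, feed_path):
        # The file is parsed a single time here, so the methods below
        # never need to be told which feed to read or to set up a parser.
        self.feed_path = feed_path
        self._root = ET.parse(feed_path).getroot()

    def titles(self):
        """Return the title of every entry, in document order."""
        return [e.findtext("title", default="") for e in self._root.iter("entry")]

    def latest(self, count=1):
        """Return the first `count` entries as dictionaries."""
        entries = list(self._root.iter("entry"))[:count]
        return [
            {
                "date": e.findtext("date", default=""),
                "title": e.findtext("title", default=""),
                "content": e.findtext("content", default=""),
            }
            for e in entries
        ]
```

Every method reuses the same parsed feed, so the feed name and the parser setup live in exactly one place.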

I also made the site highly modularized and functional. In addition to the objects, I also have a lot of functions that spit out content onto the pages. For example I have helper functions that spit out the html for the header, footer, and the navigation bars for all of the pages. These are really helpful, because I only have to make changes in one place, and the changes are propagated to the entire site. Furthermore, there’s a lot less code and a lot of my files end up having very little code, making everything easier for me to read later on.
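A small sketch of what those helper functions might look like, again in Python rather than PHP. The function names and the navigation links are hypothetical and only meant to show the shape of the idea.

```python
from html import escape

# Hypothetical navigation links; the real site's pages are not listed in the post.
NAV_LINKS = [("Home", "index.php"), ("Blog", "index.php?feed=blog")]


def render_header(title):
    """Opening HTML shared by every page."""
    return f"<html><head><title>{escape(title)}</title></head><body>"


def render_nav(links=NAV_LINKS):
    """Navigation bar built from one shared list of (label, href) pairs."""
    items = "".join(
        f'<li><a href="{escape(href)}">{escape(label)}</a></li>'
        for label, href in links
    )
    return f"<ul>{items}</ul>"


def render_footer():
    """Closing HTML shared by every page."""
    return "</body></html>"


def render_page(title, body_html):
    """Assemble a full page; body_html is assumed to be trusted markup."""
    return render_header(title) + render_nav() + body_html + render_footer()
```

Changing NAV_LINKS or render_header in one place changes every page that calls render_page.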

I’ve also made a change to how I show each feed, after realizing that I’d made a huge mistake. I used to have a separate file to show each feed (not the parser object, but each of these files calls the parser object and shows the content of the feed). This turned out to be a huge mistake because each of these files was the same except for which file name it passed to the parser. Anyway, making changes got to be really tedious, because I’d have to make the same change in each of the files. I fixed this by making my index.php display all of the feeds. It selects a feed based on a parameter sent by HTTP GET. This is a lot better design wise for reasons that I’ve already mentioned.
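The single-entry-point idea can be sketched like this (in Python, not the real index.php). The parameter name feed, the feed names, and the file paths are all assumptions. The lookup table is a design choice added for this sketch: it keeps a user-supplied value from being turned directly into a file path.

```python
from urllib.parse import parse_qs

# Hypothetical feed names and file locations.
FEEDS = {
    "blog": "feeds/blog.xml",
    "projects": "feeds/projects.xml",
}
DEFAULT_FEED = "blog"


def select_feed(query_string):
    """Pick the feed file for a request from its GET query string."""
    params = parse_qs(query_string)
    name = params.get("feed", [DEFAULT_FEED])[0]
    # Only names listed in FEEDS are accepted; anything else falls back
    # to the default feed instead of being used as a path.
    if name not in FEEDS:
        name = DEFAULT_FEED
    return FEEDS[name]
```

For example, select_feed("feed=projects") returns "feeds/projects.xml", while a missing or unknown feed name returns the default blog feed.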

Finally, the Writer object is pretty straightforward. I set up an HTML form that’s password protected, with fields for feed, title, and body of the post. When I submit the form, I pass all of these parameters into the Writer. The Writer uses some simple regexes to split up the text and add the proper xml tags. It then reads the old feed, and inserts the new xml into the old feed in the right place. Finally it re-writes the entire feed. I don’t think this is ultimately the best way to do it, because a lot of stuff stays the same in the file, and if the feeds get really long (which they may), this process may end up taking a while. For now it’s ok but I’m going to think about a better way to write xml files.
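Roughly, the read-insert-rewrite cycle looks like the Python sketch below. It is not the PHP Writer itself. The element names, the blank-line paragraph rule, and putting the newest entry first are all assumptions.

```python
import re
import xml.etree.ElementTree as ET


def add_entry(feed_path, date, title, body):
    """Insert a new entry at the top of an XML feed and rewrite the file."""
    tree = ET.parse(feed_path)          # read the old feed
    root = tree.getroot()

    entry = ET.Element("entry")
    ET.SubElement(entry, "date").text = date
    ET.SubElement(entry, "title").text = title
    content = ET.SubElement(entry, "content")

    # Split the body on blank lines and wrap each paragraph in its own tag.
    for paragraph in re.split(r"\n\s*\n", body.strip()):
        if paragraph:
            ET.SubElement(content, "p").text = paragraph

    root.insert(0, entry)               # newest entry first (an assumption)
    # The whole file is written out again, which is the cost the post worries about.
    tree.write(feed_path, encoding="utf-8", xml_declaration=True)
```

Building the entry as elements rather than pasting strings together means characters like & and < in the post body get escaped automatically when the file is written.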

So that’s pretty much how this site is written. I do plan to add support for more tags eventually, for example Image tags, and Code tags so that I can style code differently. I don’t expect the rest of the back-end to change much so this’ll probably be the last big post about akrish in here. Most of the other posts will probably be about wenote or any other ideas that I decide to execute. If you have any feedback about my design or if you’d like to see some features added, please let me know by commenting. Thanks.

A brief overview of akrish.net’s structure

Wednesday, July 18th, 2007

Since about 4PM yesterday when I actually started working on this site, I’ve been thinking a lot about how this site should be designed and built. The initial idea that maybe a lot of newer coders have is to just hard-code everything. This would mean writing my entries in HTML documents, manually moving entries from page to page etc. It didn’t seem like a very productive way of doing things, which got me thinking about some other, easier ways.

I know that there are a lot of blog-hosting sites out there, so I started looking at them for examples. Obviously they don’t let their bloggers write their entries into the HTML. That just restricts the number of users they have to mostly coders. I realized they must have some sort of web form that allows users to make posts and that they probably have some backend script that reads in the data from the web form and stores it somewhere. This seemed like a pretty reasonable idea to me, not only because it would let me leave my code alone, but also because it would allow me to make posts at any computer using any web browser, whereas in the other model I’d only be able to make posts when I have ftp access to the files on my server. It’s also a lot simpler in that I can write all the parsing tools now, and then I don’t have to manually do anything except write my entry.

I decided that this would be a good way to go, but how would I store the data? I immediately thought about MySQL because I’m pretty familiar with it and it wouldn’t be too difficult to create a bunch of tables (one per category) and then store the date, title, and content of each story in those tables. I was confident that this would work, but I also thought about storing my data in XML. The XML way would involve one xml file per category, each with ‘story’ or ‘entry’ tags that contain the date, title, and content of the entry. I settled on the XML approach because I’ve been working a lot with XML on wenote (my other site that’s currently being designed) and at work, so I’m pretty familiar with it, and I know that it’s very easy to use. Another minor advantage is that I can easily write articles in here before I create my web form (described above) for making posts. For example, I wrote this article directly in my xml document, and even though I don’t intend to keep doing this, it does make it easier for me to get the ball rolling on this site.
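To give a feel for the layout, here is a hypothetical feed file and a few lines of Python that read it. The exact tag and attribute names are guesses based on the description above (the real files may differ), and the entry text is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one file per category, one <entry> per post.
SAMPLE_FEED = """\
<feed category="blog">
  <entry>
    <date>2007-07-18</date>
    <title>A brief overview of akrish.net's structure</title>
    <content>Entry text goes here.</content>
  </entry>
</feed>
"""


def read_entries(xml_text):
    """Return (date, title, content) tuples for every entry in a feed."""
    root = ET.fromstring(xml_text)
    return [
        (e.findtext("date"), e.findtext("title"), e.findtext("content"))
        for e in root.findall("entry")
    ]
```

A file like this is easy to edit by hand, which is what makes it possible to write the first few posts before any web form exists.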

Seeing how this is my first blogging site, I don’t really know if this is a good design plan, but it seems like it’ll work. I’m pretty happy that I’m getting a lot better at thinking of alternative ways to design things, but I do need to think more about pros and cons of each idea before just jumping into one of them. I’ll probably take a look at some of the more popular blogging companies in the next few days and compare my design with anything I can see about theirs. I’ve got a pretty basic, prototype implementation up and running but I’m changing it all the time, so I’ll describe that once I’ve settled on it.