Archive for the ‘projects’ Category

Research: Joe-e project

Wednesday, November 21st, 2007

I haven’t had much time to work on any personal projects in the past couple of weeks so I figured I would write about the research project that I’ve been working on during this semester. I’m working with one grad student and another undergraduate on a new language that’s supposed to make it easier for developers to reason about security issues in their code. The other undergraduate and I have so far been working on a documentation system for the new language and we’re just about done with that and starting up a much more challenging project.

The language (called Joe-e) is a subset of Java that gives the developer less power in dealing with the system. Joe-e code adheres to the principle of least authority, which means that objects should by default be given no power (authority) and should be granted authority only as needed. Compared with the norm, where objects typically have access to everything and must be explicitly restricted, applying the principle of least authority makes it substantially easier for a developer to reason about which objects have access to other objects. Developers can therefore determine more about each of their classes, and they have a better idea of what would happen to the rest of their code if something went wrong in one of those classes. As I haven’t explicitly worked on the security features yet, I don’t know much more about the language itself, but if you’re interested, feel free to read more about it on the language website.
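To make the idea a bit more concrete, here is a rough sketch of the principle in Python rather than Joe-e. Joe-e enforces these restrictions on Java code; Python enforces nothing, so this only shows the style of passing authority in explicitly. The names `Logger` and `make_report` are made up for illustration and have nothing to do with the Joe-e library.

```python
import io


class Logger:
    """Holds exactly one capability: the stream it was handed."""

    def __init__(self, stream):
        self._stream = stream

    def log(self, message):
        self._stream.write(message + "\n")


def make_report(items, logger):
    """Can only write through the logger it was given.

    It never opens files or reaches for global state on its own, so a
    reader can see its whole authority from its parameter list.
    """
    for item in items:
        logger.log("processed " + item)
    return len(items)


# The caller decides how much authority to grant: here, an in-memory buffer.
buffer = io.StringIO()
count = make_report(["a", "b"], Logger(buffer))
```

Because `make_report` receives its only output channel as an argument, swapping the buffer for a real file (or for nothing at all) is entirely the caller’s decision.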

My partner (the other undergraduate in the group) and I have until now been working on an automated documentation-generation system for the language. Since the language is a subset of Java, it makes sense to use javadoc (Java’s documentation system), but we had to augment it to include some of the other features that we required. We made our documentation look very similar to Java’s API documentation, so that Java developers don’t have a hard time switching from Java to Joe-e. Everything that we’ve done has just been added to the standard Java API look in a nice and clean way.

In Joe-e, a lot of the Java library classes have been restricted in accordance with the security principles of Joe-e, so one of the main things that our documentation does is mark which methods, constructors and fields have been suppressed and why. Another feature of Joe-e is that library classes can have honorary implementations of certain interfaces that provide information about the class’s authority. We’ve added a section in the documentation to indicate which of the Joe-e interfaces each class honorarily implements as well. Other than that, there aren’t many differences between the Joe-e API and the Java API.

When we started working on this project, both my partner and I thought it would be really easy, but it’s taken us almost three months to finish. We’ve been working on it pretty regularly and dedicatedly, but we’ve run into tons of problems with javadoc’s source code that were serious roadblocks to our progress. For example, javadoc’s API has been locked, making it “impossible” to extend the standard javadoc classes and create your own documentation generation system. We ended up spending a lot of time looking at different ways to bypass this lock without re-writing a lot of code ourselves and finally came up with a pretty good solution. Another big problem was that on Sun’s website, the information about customizing javadoc applies to Java 1.3, and the javadoc tool has completely changed since then. As a result we had to look through tons of source code in order to figure out all the details of javadoc when all we really needed was an overview of it. There were a couple of other stupid problems that we had to deal with, which I guess were pretty educational because I got to see how larger software projects work, but overall I found them to be very annoying.

When we got down to writing our own code, it was very straightforward. We ended up extending all the classes in the javadoc source code and just modified a couple of methods in some of the classes to provide the behavior that we wanted. It really wasn’t that challenging of a project; it was just that we had to deal with all of these random problems that made the project take so much longer than I expected.

So I hope to finish up the project this week and put it up on the Joe-e website on Monday (which will be really cool). My partner and I have already started work on our next project, which sounds a lot more challenging and interesting. Overall, I’m really enjoying my research this semester, and am excited to stay with the team next semester and for the future.

WordPress Automated Tagging

Sunday, November 4th, 2007

First of all, I’m going to be switching this blog over to WordPress very soon. My own framework is just too much maintenance for me, and it takes me unnecessary time to add features that I want. With WordPress everything will look a lot nicer and work a lot better, and it’ll just make things easier for me. I’ve gotten to the point where I’d rather focus on writing than on site maintenance and debugging.

The other main reason for doing this is that I’d like to make an add-on for WordPress that tries to automatically generate reasonable tags for the writer’s entry. I’m switching over so that I can test this on WordPress. I would do it on my own framework, except I haven’t really finished my tagging system and I don’t plan to, so I have nowhere to test. It’ll be a lot easier to just build off of a more robust system like WordPress and not have to worry about all of the details that they’ve taken care of for me, so that I can focus on the project at hand.

I came up with the idea for this project after talking to a company called Metaweb at an internship fair on campus. They’ve built an online query-able database of the world’s information (or some of the world’s information) so that developers can build applications on top of it and take advantage of all of the nicely structured data. The database is Freebase. It’s fun to play around with, but I also see it as pretty useful for building applications that require information from external sources (like the automated tagger).

So I was playing around with Freebase, and I thought that it would be really cool if I could automate tag generation on my blog (mostly because I usually forget to do it anyway). With this structured data I thought that maybe I could take advantage of the tagging that takes place on Freebase, and all I’d have to do is find out which tags to take from there. I could just look for keywords in the post, send queries to Freebase, look at the significant tags, and suggest some of the tags for the post. Sounds pretty straightforward.

The general plan of attack is: keep a postings file (just a word -> entry relationship) of all the significant words in all the posts of this user. Then when a new post is published, look at the significant words and find the ones that seem like good candidates for tags (via some text search algorithm like TF-IDF). Then for each of these, look at the Freebase entry for them and pull out the tags on Freebase for these words. Suggest these tags (or some subset of these tags) as tags for the article. Then allow the user to choose which of the suggested tags he/she likes.
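The first two steps of that plan could look roughly like the sketch below. It is not tied to WordPress or Freebase at all: the function names (`tokenize`, `build_postings`, `tag_candidates`) and the particular TF-IDF weighting are assumptions made for illustration, and a real version would at least need stop-word removal.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def build_postings(posts):
    """Map each word to the set of post ids (list positions) containing it."""
    postings = defaultdict(set)
    for post_id, text in enumerate(posts):
        for word in tokenize(text):
            postings[word].add(post_id)
    return postings


def tag_candidates(new_post, postings, num_posts, top_n=3):
    """Rank the words of new_post by TF-IDF against the earlier posts."""
    counts = Counter(tokenize(new_post))
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        # Count the new post itself so the ratio below is never below 1.
        doc_freq = len(postings.get(word, ())) + 1
        idf = math.log((num_posts + 1) / doc_freq)
        scores[word] = (count / total) * idf
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


old_posts = ["the cat sat on the mat", "the dog ate the bone"]
postings = build_postings(old_posts)
candidates = tag_candidates(
    "the widget plays the widget music", postings, len(old_posts)
)
```

Here “the” appears in every post, so its score drops to zero, while “widget” is both frequent in the new post and absent from the old ones, so it ranks first. Each candidate would then be looked up on Freebase, which this sketch leaves out.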

Obviously this is a pretty high-level plan of attack, and I’ve been told that it’s not going to be that easy, but I certainly think it would be cool. Also, it occurred to me that it doesn’t have to be Freebase that I’m querying. Maybe I could just look at previous posts from this blogger that contain a similar set of candidate words. There are definitely variations of this that may end up being better solutions, but I’ll play around with all of that when I actually start working on it.

I see this as useful because it not only allows my tags to be thought up for me (which is very convenient), but it could also keep my tags consistent with each other (I won’t have separate tags for “photos” and “photography,” for example). That would drastically improve the readability of a blog, as there would be fewer tags and stories would be categorized better. In these respects I think it’s a pretty worthwhile project to take on.

Also I think it’s more challenging than the stuff that I usually do. It deals with text search algorithms, efficient and appropriate data storage, and a fair amount of artificial intelligence (as in “is this word significant based on what this user has previously written about and what he’s writing about in this article?”). Fortunately, I’m currently taking a databases course where I’ve already learned about text search and of course data storage. I’m also planning to take an AI course next semester where I may learn some concepts that I can put to use in this project. Again, I haven’t really spent too much time on the project yet, apart from downloading WordPress and going through some of the code, and I may never get around to it if something more important comes up, but I think it’s a pretty cool project that I would enjoy working on. Hopefully I’ll find some time to get it done sometime soon.

voiTunes Released!

Friday, November 2nd, 2007

I’ve been working on voiTunes for a couple of months on and off and I’m finally done! I’m proud to say that it has impressed quite a few of my friends and I’m ready to release it to the rest of the world (well, just the Mac users). You can download it here. It will only work on Mac OS X, and possibly on Leopard (I have yet to test it on Leopard). If you have Windows, I’m sorry but you’ll have to wait until I port it (if I ever get around to it).

A few notes: I really enjoyed working on the widget. It turned out to be not too difficult, but I wasn’t able to add all the functionality that I wanted, mostly because I wanted to have something concrete released this week. I have almost-working code that adds a couple more features, but it’ll take me several more hours to integrate that nicely into the working widget. As a result, functionality is limited, but it still works and works decently well. Recognition isn’t that great, but that isn’t exactly my fault, as I’m using a recognition engine from Carnegie Mellon.

Please post comments, criticisms (be harsh!), or any other feedback.

Yahoo Hack-U

Sunday, October 28th, 2007

Last week, Yahoo! held a Hack Day at my school, so my friend and I decided to participate/compete (there were some cool prizes). We made a little maps application that is accessible from both the web and via text messaging, but more on that later. First, what exactly is a Hack Day? Yahoo! has been holding these events for a year (or a couple, I’m not really sure) and essentially they consist of one twenty-four hour session where you form teams and build something cool and useful. Yahoo! always provides food, entertainment and some help. In general they are really fun events for hackers. Most of the time the hacks are exactly that: very few of them turn into successful businesses.

University Hack Days are exactly the same thing. Yahoo! came to our school, had a couple of talks about their APIs and about PHP, and then they let us form teams and hack away. We only had about 5 hours though, so in terms of technical complexity, we couldn’t do anything spectacular.

My friend and I are pretty entrepreneurial-minded, and we always share ideas with each other. The application that we created is one of the ideas that we discussed some time in the spring, but we just never got around to working on it. In that sense the Hack Day was awesome, because it gave us time to crunch out a prototype, and it gave us more incentive to keep working on it. Apart from that, both of us think the idea is pretty useful, and I know for a fact that people have been wanting an app like this to be built.

The application that we built was a meeting place locator that finds locations closest to the middle of the parties involved. Currently it only supports two parties (inputted as addresses), but we plan to expand it to several. The user experience is as follows: the user inputs his/her address and his/her friend’s address and selects a type of place that they’d like to meet, and a map pops up with markers indicating locations in between the two addresses that match the search query. From the text-messaging interface, the user sends us a text message with the two addresses and location type and we send them back the address and phone number of the best result. It’s a pretty simple idea, but it’s something that we feel people would want to use in a variety of situations.

Technologically it was pretty simple. We used Yahoo!’s geocoding API to get latitude and longitude data and find the midpoint. Then we send a query to Yahoo! with the midpoint and the location type, and we get back data about the results that best match this query. Finally we plot all of this on a map. Naturally there are plenty of improvements that we can make here, but we didn’t have that much time, so we decided to keep it simple.
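The midpoint step is the only real computation in there. A minimal sketch, assuming the midpoint is just the plain average of the two latitude/longitude pairs (the function name `midpoint` is made up, and the geocoding and search calls to Yahoo! are left out):

```python
def midpoint(a, b):
    """Average two (latitude, longitude) pairs given in degrees.

    This flat average is a reasonable approximation when the two
    addresses are in the same city or region. It is not a true
    great-circle midpoint and it breaks down near the 180th meridian.
    """
    (lat1, lon1), (lat2, lon2) = a, b
    return ((lat1 + lat2) / 2, (lon1 + lon2) / 2)


# Made-up coordinates, just to show the call.
middle = midpoint((10.0, 20.0), (20.0, 40.0))
```

The resulting pair is what would be sent along with the location type in the search query.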

The text-messaging interface was really hack-y. Essentially, the text message is sent to a Gmail account, which gets routed to Apple’s Mail.app on my local machine. On my computer, I have an AppleScript script that checks my Mail.app inbox for new mail and parses out the necessary data from it. I also have another AppleScript script that takes in the data and sends an e-mail from Mail.app. (Both of these are fairly simple to code in AppleScript, which is pretty awesome.) Both of these scripts are wrapped in a PHP script that loops forever. The PHP script takes the info from the incoming e-mail and fetches data from Yahoo! (similar to the web interface). Then it calls the outgoing script. Clearly this implementation is pretty bad and obviously doesn’t scale, but it worked in our demos and that was really all that mattered.

The web interface is up and running and can be viewed at Ingapo. It’s called Ingapo, which in Tamil (both my and my friend’s native language) means “go there.” Since the text messaging interface runs locally on my machine, I don’t keep it running. Also I’m pretty sure things will break if I receive an e-mail that’s not intended to be parsed by the scripts.

On the whole though, Hack Day was fun and really worthwhile. I hope to attend some of the other ones (there’s one every year at Yahoo!’s headquarters) and in fact, the two of us may end up heading to Sunnyvale in the spring to participate there. It was also interesting to see some of the other teams’ ideas, which were very cool as well.

What do you think of our idea? Do you think it would be useful?

Voice Recognition Media Library

Thursday, September 20th, 2007

It’s time to finally let the cat out of the bag. I announced about a month ago that I was working on a really cool project and now I’m finally going to talk about it. From the title, it’s pretty obvious that the project involves voice recognition and media libraries. Well, here we go!

I worked at a company called Tellme this summer, and all of their technology revolves around voice recognition. Anyway, after working on voice applications for them, I thought it’d be really cool to extend iTunes to take voice commands. I talked to some of my fellow interns about third party voice recognition software that I could use, and got started.

First, let me talk about some of the software I’m using. My voice recognition tool is called Sphinx. It’s an open source Java package provided by Carnegie Mellon that’s really easy to extend. Sphinx was a bit difficult to install, but once that’s done, they provide a lot of demos that you can just look at and add on to. I just took their “hello world” demo, and added a bunch of stuff to it to get my media library recognition to work. If you’re at all interested (and because you’re still reading, I take it that you are) I strongly recommend checking out their software, playing with it and extending it to build your own applications.

On the other side of the application, I’m using a Perl module to control iTunes. The module just has functions that output some AppleScript, so unfortunately this project can only be installed on top of OS X right now, but it’s a really simple, easy-to-use interface to AppleScript and therefore iTunes. This is what I really love about Perl; there are so many libraries for it that you can do pretty much whatever you want just by installing a couple of modules onto your computer. I’m also using Perl for XML parsing and other file handling, while I’m using Java for the user interface and the interaction with Sphinx.

Ok, so what exactly does my prototype do? Essentially, I have a Java program that starts up the Sphinx recognizer, waits for the user to say something like “itunes play”, “itunes pause”, “itunes next”, or “itunes select”, and then processes the command and sends a command to iTunes. “Play”, “pause”, and “next” are self-explanatory, but “select” is a little more intricate. When the user says “itunes select,” the program prompts you to say an artist name, and then a track name by that artist, and then it searches your iTunes library and plays that specific track. That’s the program from a high level.

Looking deeper down, the Java code is just a loop that waits for recognition, but I use different grammars each time. The main procedure just has a grammar that accepts the “iTunes” commands. Each command outputs a different string, which I send as the argument to a perl script. The perl script then uses the applescript library to send a command to iTunes.
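The hand-off from a recognized phrase to an iTunes command is basically a lookup table. Here is a rough sketch of that idea in Python rather than the Java and Perl described above. The table `COMMANDS` and the function `to_applescript` are made up for illustration, and the sketch only builds the AppleScript text without running it.

```python
# Recognized phrase -> the AppleScript line that would be sent to iTunes.
COMMANDS = {
    "itunes play": 'tell application "iTunes" to play',
    "itunes pause": 'tell application "iTunes" to pause',
    "itunes next": 'tell application "iTunes" to next track',
}


def to_applescript(phrase):
    """Return the AppleScript line for a recognized phrase, or None if unknown."""
    return COMMANDS.get(phrase.strip().lower())


script = to_applescript("iTunes Play")
```

On a Mac, a line like this can be run by hand with `osascript -e`, which is a handy way to check each command before wiring it up to the recognizer.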

Prior to loading the recognizer, I use Perl’s XML libraries to parse “iTunes Music Library.xml,” the file that stores all of iTunes’ track information. Then when the user says, “itunes select,” I dynamically generate a grammar composed of a list of all the artists in your music library. Then when you say an artist name, I dynamically generate a grammar of all the song titles by that artist. Once I have the artist name and the song name, I pass the information into the Perl script, which again sends the commands to iTunes. On the whole, the project doesn’t seem too complicated, but there’s quite a bit of code involved, and it’s definitely more challenging than any of the other projects I’ve worked on.
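Generating a grammar on the fly mostly means turning a list of names into one big list of alternatives. The sketch below assumes the grammar is written in JSGF, the text format that Sphinx-4 grammar files use; the function name `jsgf_grammar` and the cleanup rules are made up for illustration and are not the code from the prototype.

```python
import re


def jsgf_grammar(name, phrases):
    """Build a JSGF grammar whose single public rule accepts any one phrase."""
    alternatives = []
    for phrase in phrases:
        # Keep only letters and digits so punctuation cannot break the grammar.
        words = re.findall(r"[a-z0-9]+", phrase.lower())
        alternative = " ".join(words)
        if alternative and alternative not in alternatives:
            alternatives.append(alternative)
    if not alternatives:
        raise ValueError("need at least one usable phrase")
    return "#JSGF V1.0;\n\ngrammar {0};\n\npublic <{0}> = {1};\n".format(
        name, " | ".join(alternatives)
    )


artists = ["The Beatles", "Radiohead", "the beatles"]
grammar_text = jsgf_grammar("artist", artists)
```

The same function would work for the second step by passing in only the track titles of the chosen artist. Every word in the grammar still has to be in the recognizer’s dictionary, so names that aren’t ordinary English words won’t be recognized this way.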

My current prototype works fully as I just described, but it’s certainly not complete. Before I actually publicize and maybe distribute my software, I want to fully integrate it into iTunes as a plug-in. I also need to improve recognition, because my ultimate goal is to just leave the plug-in on, but it should know when it should be listening and when it shouldn’t. Finally, I also want to let the system learn artist and track names, because as of now it only understands legitimate English words, and a lot of artists’ names aren’t English words. I want to build a feature that allows you to train the system when you first run it, so that it learns these names and recognizes them in later runs.

Of course all of my aspirations and additions are a lot more challenging than just getting my prototype running, but I really think this project is cool and that it’s worthwhile for me to spend my time on. It would be amazing to show off to my friends, peers, and, of course, recruiters. What’s more important, though, is that I’ll learn a lot about building a stable, user-friendly product that integrates a lot of different technologies. I hope to make this a project that I take from start to finish, spending time on all aspects of product development. It’s a challenge, but it’s been fun so far, and it’ll be so worth it when it’s complete.