Archive for the ‘Uncategorized’ Category

Some Interesting Articles

Saturday, January 21st, 2012

This week I found several interesting online articles that I’d like to share here.

The first relates to SOPA and PIPA, the two legislative acts about fighting online copyright infringement. I personally don’t know too much about them, but in an effort to learn about these pieces of legislation, I found a report published by the ACM (the Association for Computing Machinery), an organization that has a decent amount of influence, especially with scientists and technical people. The article is a bit technical, but the thesis is that the costs associated with blocking traffic (i.e. blocking DNS lookups and search engine hits) are quite high and that it cannot be done simply by tampering with US-based DNS servers. Therefore the article proposes that any person or entity that requests a court order under these acts needs to financially compensate the other party for carrying out the court order. This would mean that anyone who would like to censor a website would need to pay for that website to remove itself from DNS servers and search engines. The ACM article states that this is a non-trivial task, meaning that the prosecuting party would have to pay a substantial amount. If these bills were to pass, I hope this would mean that corporations would not be willing to actually carry out these court orders.

In somewhat related news, this NY Times article about academic publishing looks at alternatives to the traditional (read: antiquated) publishing system. It covers the same ground as this article I wrote earlier, but it is both better written and better informed. There are several attempts to circumvent traditional academic journals; one mentioned in the article is ResearchGate, which is more or less an academic social network. I’m really happy to see that people are working on this and I hope that some of these catch on, despite the fact that academia is fairly conservative about adopting change. Another interesting facet of the NY Times article is that they managed to talk to spokespeople for Elsevier and Science, who gracefully toed the party line, saying that the costs of maintaining curated records of publication justify the exorbitant prices for journal subscriptions.

Cathy O’Neil mentioned this article and wrote about one of her own horror stories of dealing with publishers. In fact, her experience with publishers partly contributed to her leaving academia. In a nutshell, the publication process is atrociously slow, and this really slows innovation and also makes impatient people incredibly annoyed. Alternative forms of publication and recognition could almost certainly speed up the dissemination of knowledge and foster more rapid innovation. I can see how this would really annoy me, but since machine learning is a field where top publication venues are mostly conferences, I haven’t noticed this much. Conferences are a great way to spread ideas quickly and efficiently, but in many fields they are regarded as second or third tier publication venues, so technical content is often lower quality. Maybe one quick fix in other fields is to convince people that conferences are a reasonable way to publish, thereby increasing their impact factor while simultaneously promoting more rapid innovation.

Timothy Gowers, a famous mathematician (a Fields Medalist) and blogger, also wrote here specifically about the bad practices of Elsevier, one of the big academic publishing companies. In his article he publicly declared that he would boycott Elsevier in every way, refusing to peer-review for, publish in, or in any other way serve Elsevier journals. He also considers both top-down and bottom-up approaches for changing how these companies operate, and his boycott is a step in the bottom-up direction, an individual act rather than a more coordinated effort from academics. Either way, I’m glad to see that academics are starting to take a stand against publishers.

Dire Wolf at Sectionals

Tuesday, September 13th, 2011

My first trip to the club series (on a real club team) started this weekend as Dire Wolf travelled to New Jersey for the sectional tournament. Just before leaving on Friday afternoon, I saw a physical therapist and was diagnosed with shoulder impingement, so I didn’t end up playing, which was unfortunate. Nevertheless, Dire Wolf took care of business, finishing 3rd overall and qualifying for regionals.

We came into the tournament seeded second in a 5-team pool, behind Oakland. We were really looking forward to games against Oakland (Pittsburgh rivals), Southpaw (Philly elite team) and Bear Proof, who beat us in a close game at NY invite. Southpaw and Bear Proof were both in the other pool, so on Saturday we took care of business with the lower seeded teams (beating Clockwork Orange 13-4, Misogyny 13-6, and Hypnotoad 13-4). For the last round of the day we faced Oakland for the fourth time this season.

Oakland (Game 1): Dire Wolf pulled to start, and immediately broke to take a quick 1-0 lead. We traded points to 4s, where Oakland eventually broke back, and finally took half 7-5. In the second half, Oakland opened with a string of breaks and eventually beat us convincingly, with a score of 13-6. Wolf played really well in the first half, forcing errant throws and a lot of 50-50 balls from Oakland, but we had a hard time earning turnovers off of these (they had at least two forced throws that could have easily resulted in turnovers). Even so, we generated turnovers, but failed to convert once we got the disc. After Oakland ironed out their kinks, they played a much more consistent game, and we started making more mistakes on offense, allowing them to run away with the game. With this loss, we finished second in our pool and qualified for semifinals on Sunday.

Southpaw: Again we came out strong, earning a quick break and taking a 2-1 lead. Southpaw broke back after a couple of traded points to tie the game at 4s with Dire Wolf on D. Stifling dump defense by Greeno and Skylar resulted in a high-stall forced throw and a turnover, and we broke again with a big huck from Tad to Karl for the score. Southpaw came back strong and a couple of miscues by our offense resulted in two breaks for Southpaw, to take half 8-6. After half, though, we didn’t roll over like we did in the Oakland game, and we managed to get back on serve at 9s by capitalizing on some of their miscues. Unfortunately, that’s where the journey ends, as Southpaw got their act together and broke several times to win 15-10.

Southpaw is supposed to be one of the stronger teams in our region, and while we knew we could compete with them, I think we were all pretty excited about this result. I hope we play them again at regionals.

Bear Proof: We dropped into the third place game where we played Bear Proof, from central PA. I don’t have too much to say about this game. Hungry for revenge from NY invite, we came out really strong and went up 5-1 right away. We took half 8-2 and eventually took the game to 13-4, game to 15. At this point we knew we had pretty much won the game but it would be nice to bring it home convincingly. Unfortunately, we let Bear Proof run off a string of breaks and bring the game to 13-7, with the soft cap on. The last point was epically long with a slew of turnovers by both teams that ended in a Bear Proof score, but the hard cap gave us the win 13-8. Again we came out really strong, but slowed down in the second half, and if the game had been closer to begin with, we would have let Bear Proof back into the game towards the end. Going from 13-4 to 13-8 is pretty much unacceptable, and a better team would have punished us for slipping in the second half. At any rate, with this victory, we proceeded to the second place game, a rematch against Oakland Ultimate.

Oakland (Game 2): The theme of the weekend for Dire Wolf was a strong start to close games. The same is true of this game, with us earning a break and making it 2-1 right away. We traded to 4-3, when Oakland ran off a string of breaks to take half 8-5. Unlike the last time, we didn’t roll over after half, scoring our O point and the next two D points to make it 9-8. However, once Oakland converted their O point, they rolled off a couple of breaks and eventually won 15-11. This game got pretty out of hand, with a lot of arguing, aggressive marks and fouls, and a lot of disc spiking. However, it was also really exciting to watch, as it stayed pretty close throughout, and the lead shifted back and forth a couple of times. Both teams played well, and if we can stay focused for the entire game, I think we could come away with a win against these guys.

Closing Thoughts: I think the theme of this weekend is focus and consistency. In all of the games I talked about, we gave up a lot of points (and often the game) really late in the game. We’ve proven that we can play with some of the best teams in our region, but we need to be able to do that for the whole game, not just for the first half or first three quarters. I don’t believe this is an endurance issue, but rather a lack of mental focus towards the end of the game, as a lot of our turnovers seem to come from bad decisions and mistakes rather than fatigue. If we can figure out how to stay consistent for the duration of a game (and really the entire tournament) then I truly believe that we have a shot at nationals. It’s going to be tough, but I definitely think it’s possible.

NP-Completeness in the workplace

Tuesday, July 7th, 2009

My algorithms professor used to tell his students (including me) this story to motivate studying NP-complete problems and reductions. The story goes something like this: say you’re working as a software developer and your boss gives you this project where you need to come up with an efficient way to do some computation. You go to your desk and think about it for a while and then suspect that there is no efficient way to do the computation. You can’t just go back to your boss and say, “I think this problem is NP-Hard, so I give up”; you need to show your boss that it’s NP-Hard, and this motivates the study of reductions. It further motivates the study of approximation algorithms and other techniques to cope with NP-Completeness.

I’ve heard this story a couple of times from my professor, but I never really felt like this is something that would happen. How often is it that you are given a problem that doesn’t look intractable immediately, and how much convincing will your boss really need? For that matter, how often do you encounter intractable problems at a typical development job? Now I haven’t spent too much time in industry, and in my time there I never ran into NP-Hard problems, so I don’t really know. Although the story was motivational for an algorithms course, the situation used to seem implausible to me. That is, until I was challenged to do this myself.

This summer, I’m working in a computational genomics lab refining some algorithms for analyzing gene expression data. The core problem we are working on is NP-Hard, but in my refinements, I ran into another NP-Hard problem that I did not immediately see as intractable. I guess this story may be a little bit skewed because I am basically doing algorithms research but I still thought it was pretty cool.

My work deals with finding connected components in a large gene network (an undirected graph). To simplify things, we have a scoring function and want to find connected components with good scores. Because of the scoring function this turns out to be NP-Hard; read this for the exact problem and why it’s NP-Hard (the exact problem is a generalization of set cover). Anyway, the basic way we do this is to pick a vertex with a “good” score (called the “seed”) and then expand the module from that node.

In our experimentation, we decided to see what would happen if we allowed seeds to be multiple disconnected nodes and grew our modules from these multi-node seeds. Basically, here you choose k nodes to be your seed and you grow the module out from these seeds, but it is not a requirement that the result is a connected component; in this case it is ok if the resulting module consists of k connected components. The motivation was that from a biological perspective, a disease may affect two or more gene pathways and we wanted our algorithms to be able to account for that, hence allowing multiple connected components.

I set out to implement this feature and was thinking about ways to create these multi-node seeds. I started out by filtering away all of the “bad” vertices and was left with a set of 30-50 good vertices but it was (and still is) not clear what’s a good way to combine these single seeds into multi-node seeds.

My first approach was to simply enumerate all of the combinations and then choose the ones where the individual vertices had good scores. This approach, besides having exponential time complexity, was bad in that the seeds we selected were all very similar. This was because the individual nodes that were very good were in all of the multi-node seeds we selected. Thus this approach was not satisfactory. Another problem with this approach was that we typically did not see disjoint connected components in the resulting solutions.
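To make the problem concrete, here is a minimal Python sketch of this enumerate-and-rank idea. The scores, the vertex names, and the `top_seeds` helper are all made up for illustration (this is not the lab’s code), and I’m assuming that a higher score is better and that a seed is ranked by the sum of its vertices’ scores.

```python
from itertools import combinations

def top_seeds(scores, k, how_many):
    """Rank every k-vertex combination by the sum of its vertices' scores.

    Assumption: higher scores are better. The number of combinations
    grows very quickly with k, which is why this does not scale.
    """
    ranked = sorted(
        combinations(sorted(scores), k),
        key=lambda combo: sum(scores[v] for v in combo),
        reverse=True,
    )
    return ranked[:how_many]

# Hypothetical scores: one vertex ("g1") is much better than the rest.
scores = {"g1": 20, "g2": 8, "g3": 5, "g4": 3, "g5": 1}
best = top_seeds(scores, k=2, how_many=3)
print(best)
```

In this toy example one vertex dominates, so every top-ranked seed contains it, which is the same similarity problem described above.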

To remedy the last problem, we decided on a new approach where we select multi-node seeds where each individual vertex is far from the rest. We compute this by looking at the neighborhoods (typically only 1 level of breadth first search) around each seed and computing the intersection of these neighborhoods. So this seemed like a good approach and we were hopeful this would produce disjoint connected components in the results so I set out to implement this.
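As a rough sketch of that overlap computation in Python: the adjacency-list graph, the `neighborhood` and `overlap` helpers, and the choice to count the seed itself as part of its neighborhood are all assumptions of mine for illustration, not the actual implementation.

```python
from itertools import combinations

def neighborhood(graph, seed):
    """The seed plus its direct neighbors (one level of breadth first search)."""
    return {seed} | set(graph[seed])

def overlap(graph, seeds):
    """Number of vertices that appear in at least two seed neighborhoods."""
    shared = set()
    for a, b in combinations(seeds, 2):
        shared |= neighborhood(graph, a) & neighborhood(graph, b)
    return len(shared)

# Hypothetical toy network: a path-like graph a - b, a - c - d - e - f.
graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

close_pair = overlap(graph, ["a", "d"])  # both neighborhoods contain "c"
far_pair = overlap(graph, ["a", "f"])    # the neighborhoods do not touch
print(close_pair, far_pair)
```

A multi-node seed with a smaller overlap is one whose vertices are farther apart, which is what we were hoping would lead to disjoint components.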

Let’s restate the problem: we are given a set of vertices in a graph, where each vertex has an associated set of elements from some universe U. The goal is to find k vertices that minimize the pairwise intersections of their sets, ideally such that no two of these sets share an element.

I thought about this problem for a while and was quite confident that it was intractable. Before talking to the grad student I’m working with about it, I wanted to have at least a sketch of a reduction in mind. My first reaction was that I should be able to reduce set cover to this problem, possibly by “negating” all of the sets or something. I messed with this for a while and I didn’t really get anywhere.

I mentioned this problem to some other people in my lab and we all thought about it for a while. Eventually one guy found that he could reduce independent set to this problem quite easily. If you instead go back to the gene network graph and consider each of the “elements of the universe” as neighbors of the vertices in the graph, then you immediately see that this is the independent set problem. If you treat each neighborhood as a node in a graph, and there is an edge between two neighborhoods if they share an element in common, then you can solve this problem by running an independent set solver on this graph. Conversely, given a graph, you can define a universe of elements and assign a set to each vertex such that vertices that are not connected share no elements and vertices that are connected share some elements. Then you can look for multi-node seeds and solve independent set. The way I explained the problem above, and the way that I thought about it, made it difficult to see that it was independent set, but if you go back to the original problem, it’s easier to see.
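Here is a small Python sketch of the converse direction from the paragraph above, going from a graph to sets. One way to build the sets (my choice for illustration; the paragraph does not pin down a construction) is to give each vertex the set of edges that touch it, so two vertices share an element exactly when they are adjacent. The helper names and the brute-force search are hypothetical and only meant for tiny inputs.

```python
from itertools import combinations

def sets_from_graph(vertices, edges):
    """Assign each vertex the set of edges touching it.

    Adjacent vertices share their common edge; non-adjacent vertices
    share nothing.
    """
    sets = {v: set() for v in vertices}
    for u, v in edges:
        edge = frozenset((u, v))
        sets[u].add(edge)
        sets[v].add(edge)
    return sets

def find_disjoint_seed(sets, k):
    """Brute force: k vertices whose sets are pairwise disjoint, or None."""
    for combo in combinations(sorted(sets), k):
        if all(sets[a].isdisjoint(sets[b]) for a, b in combinations(combo, 2)):
            return combo
    return None

# Toy graph: the path a - b - c - d, whose largest independent set has size 2.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
sets = sets_from_graph(vertices, edges)

pair = find_disjoint_seed(sets, 2)
triple = find_disjoint_seed(sets, 3)
print(pair, triple)
```

A multi-node seed with pairwise disjoint sets is exactly an independent set in the original graph, so anything that solves the seed problem exactly also solves independent set.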

Now that I knew the problem was NP-Hard, I started thinking about ways to get good enough multi-node seeds and eventually settled on a greedy algorithm that grows the seed sets by adding the node with minimal intersection. The resulting seeds are decent and we did start seeing solutions with multiple disjoint components, which is exactly what we were looking for. Despite being faced with a “challenging” problem, we did end up finding a good working solution that gave us the results that we wanted.
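A minimal Python sketch of a greedy heuristic in this spirit follows. The details are my assumptions rather than the actual code: the seed starts from a given vertex, “minimal intersection” means the fewest elements shared with everything already in the seed, and ties go to the alphabetically first vertex.

```python
def greedy_seed(sets, start, k):
    """Grow a seed of up to k vertices, always adding the vertex whose
    set overlaps least with the sets already in the seed."""
    seed = [start]
    covered = set(sets[start])
    while len(seed) < k:
        candidates = [v for v in sorted(sets) if v not in seed]
        if not candidates:
            break  # fewer than k vertices available
        best = min(candidates, key=lambda v: len(sets[v] & covered))
        seed.append(best)
        covered |= sets[best]
    return seed

# Hypothetical neighborhoods for five candidate vertices.
sets = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {2, 3},
    "d": {5, 6},
    "e": {7, 8},
}
seed = greedy_seed(sets, start="a", k=3)
print(seed)
```

Being greedy, this gives no guarantee of finding a seed with empty overlap even when one exists, but it only looks at each remaining candidate once per step instead of enumerating every combination.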

This little anecdote, although quite silly in retrospect (because the reduction was pretty straightforward), really showed me that what I learned about reductions and NP-Completeness is important from a practical perspective. It also showed me that my professor’s story was actually quite plausible. Plus it was another situation where I got to apply things that I learned, and I always find that really cool.

The Language Barrier

Thursday, June 11th, 2009

Everyone speaks Hebrew as their primary language here in Israel, and in interacting with people, I’ve noticed a couple of interesting things about languages. Almost everyone CAN speak English, but it isn’t their natural language (it’s kind of like how after only 6 years of studying Spanish I CAN speak Spanish, but in speaking Spanish with Spanish people I’ve met here, I’m not very effective at communicating in the language). I don’t have a hard time getting things done here because most people do speak English pretty well, and if they don’t, then there are always people around that can translate. However, in an environment where very few people naturally speak English, it is much harder to connect with people.

Given that I do spend some time in difficult interactions, I’ve been thinking a lot about languages in a variety of lights. Politically, a language can be a uniting factor (like it is in Israel), or conversely it can hinder unification attempts (like in India). Socially, speaking a different language from everyone else does have negative consequences on your relationships and interactions. And lastly, technically, the language that you think or operate in may close your mind to new ideas.

Language as a political tool
When you think about it, Israel is not incredibly different from India. Yeah, India is a much bigger country geographically and population-wise, but both were under British rule until the mid-20th century, both gained their independence around that time, and both are now relatively modern democratic nations (Israel more so and probably India less so). Further, Hinduism and Judaism are two of the oldest religions and both countries have rich ancient histories. Ok, so there are a lot of differences, but one I found interesting is about language, and how it affects the political environment.

There are tons of languages in India and although Hindi is the official language, unification did not come easy. Now, my view is that most kids in India, while knowing Hindi and their native language, are also very good (almost proficient) at English. I haven’t been to India in years so I could easily be wrong, but a lot of Indians come to the US and speak well enough for me to think this. I found some sources that counter this claim, but my friend Vivek, who lived in India for a couple of years recently, supports me (but he went to an international school so…). And of course all of my Indian-American friends pretty much “know” just one language, and if they aren’t from an historically Hindi speaking area, it usually isn’t Hindi.

In contrast, in Israel, EVERYTHING is done in Hebrew (Ok that’s not entirely true, a lot of people speak Arabic and you do see street signs in Arabic). Since Israel was founded as a Jewish nation, there weren’t any real problems with making Hebrew the official language (except for the Arabs that were living here). All of the Arabs that I’ve met speak Hebrew fluently now, so here, everyone who calls themselves Israeli is fluent in Hebrew.

In Israel, practically everyone operates in Hebrew, and as a result, there’s more of a national sense of pride here. In India, I feel like this pride is lacking and the diversity in languages seems to be correlated. The fact that it is much harder to settle on a national language in India is evidence that India is really diverse, and this diversity leads to less national pride. In Israel, not only does everyone speak Hebrew, but it is the only country where people speak Hebrew. If I were Israeli, hearing someone speak Hebrew would give us an immediate connection, just because we are both Israeli. One of my lab-mates was traveling in Europe with his family and another Israeli group overheard them speaking in Hebrew and the two groups started talking, simply because they shared this language. When he told me the story, he used the words “sense of national pride,” hopefully supporting my point.

Aside: While I’m here, I do the same thing with people speaking English. If I hear someone speaking English with an American accent, it’s an immediate connection.

So it’s pretty obvious that language is an indicator of how diverse a country is, but I never really thought that it could contribute to national pride.

Language as a Social Barrier
So even though I don’t speak Hebrew, I can communicate well enough to get things done here. However, I’ve noticed that I do miss out on a lot of things. As an example, I play Ultimate here and everyone that plays speaks English really well (in fact, many of the players spent considerable time in the US), but they naturally speak in Hebrew. So one time, there was a foul call that led to an argument (as it is guaranteed to do in Ultimate), but this time the argument took place in Hebrew. I didn’t see what happened during the foul, and I couldn’t even figure it out by listening in. I could only decipher what happened by listening for tone and interpreting body language, from which I only learned a bit about the incident. After the uproar had died down, I asked what happened and was given a good explanation, but in the heat of the moment I could not participate.

Also, I went to a party this weekend and I found it really hard to interact with people. Of course everyone spoke English pretty well, but over the din of the music and in that kind of a setting, most of the people I talked to seemed reluctant to talk to me. Basically, people don’t want to have to think really hard to speak to someone at a party, so conversations are short and I didn’t really meet that many people. The party was still fun, but I definitely felt that I was at a social disadvantage by not speaking Hebrew.

In both of these situations, I felt left out of an experience because I don’t speak the native language here. Of course, if two people don’t speak the same language at all they are unable to connect, but here it’s hard (though not impossible) to connect with people even if they are quite familiar with English. You can have a conversation and build relationships, but it’s hard to share a lot of experiences without a common primary language.

Language as a mental prison
Ok that heading sounds a lot worse than what I’m going to get at. For my first week here, my parents and I sublet an apartment from this guy in Tel Aviv. When we met him, he was really nice and helpful, and in talking to him, I noticed that he used the participle verb form a lot, and in places that I (or other English speakers) would not use it. For example, he said something like: “When I am taking my bike to go somewhere, I usually am not leaving it for long, because bikes get stolen here.” A native English speaker would probably have said: “When I take my bike to go somewhere, I usually do not leave it for long …”, instead of using the participle form.

I noticed him say it a couple times and I’ve noticed a lot of Israelis use the participle in unconventional ways since then. A couple of days ago, I asked a friend about it and she said that it’s because in Hebrew they don’t have a participle form, they just have present, past and future. So, when people think in Hebrew but speak in English, it’s hard for them to figure out when to use the vanilla present tense and when to use the participle, resulting in unconventional uses.

So, I started thinking about how language affects how you think, and here’s where the article takes a technical turn. I think in English, so I’m sure my mind is constrained in certain ways that would not exist if I thought in a different language. Obviously, since I don’t think in another language I don’t know how that would be, right? And similarly, people who think in Hebrew are constrained in different ways than I am, like in how they are not sure about participles.

I think this is true for programming languages too. Over the past year, I’ve spent a lot of time programming in Java and Python, and as Java was my first language, it took me a while to start using some of the more dynamic features in Python. For example, I don’t immediately see uses for dynamically adding a method to a class, and I think that’s largely because I think in a statically typed language. And recently I built a compiler in C++, and when I write in C++, I don’t think to use features like multiple inheritance, because I’m not used to them existing. Basically, the language that you think in tends to restrict how you use other languages, and it may result in you using a paradigm that works well in one language but that is horrible in another.
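For anyone who hasn’t seen it, here is roughly what dynamically adding a method to a class looks like in Python. The `Gene` class and the `describe` function are made-up names for illustration.

```python
class Gene:
    def __init__(self, name, score):
        self.name = name
        self.score = score

# An instance created before the method exists.
gene = Gene("geneA", 7)

def describe(self):
    """A plain function that expects an instance as its first argument."""
    return f"{self.name} (score {self.score})"

# Attach the function to the class after the class was defined.
# Existing and future instances can now call it as a method.
Gene.describe = describe

description = gene.describe()
print(description)
```

There is no direct equivalent of this in Java without reaching for something like subclasses or wrappers, which is probably why it doesn’t come to mind for me.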

I’ve been reading a lot about functional programming and have spent a bit (not a lot) of time with Haskell. Everyone says Haskell is “hard” to learn if you’re used to imperative programming languages because you have to change how you think about programming. From this perspective, I completely buy that. It’s hard to get myself to think purely functionally because I’m used to methods having side effects and all of the stuff that isn’t “purely functional.” Since my first programming languages were all imperative, I’m constrained to think in a certain way, and it’s harder for me to think in a different way.
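To illustrate what I mean by side effects, here is a small sketch written in Python rather than Haskell, so it only approximates the functional style. Both function names are hypothetical. The first function changes the list it is given, while the second leaves its input alone and returns a new list, which is the habit a purely functional language forces on you.

```python
def add_score_in_place(scores, value):
    """Imperative style: mutates the caller's list (a side effect)."""
    scores.append(value)
    return scores

def add_score_pure(scores, value):
    """Functional style: builds and returns a new list, input untouched."""
    return scores + [value]

original = [1, 2]

pure_result = add_score_pure(original, 3)
after_pure = list(original)      # snapshot: still [1, 2]

mutated = add_score_in_place(original, 3)
after_mutation = list(original)  # snapshot: the original list has changed

print(pure_result, after_pure, after_mutation)
```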

Of course you can break those constraints, but it takes a lot of hard work in a new environment. With programming languages, I’m sure that if I spend a lot of time with Haskell, I’ll be able to think in the functional way. From observation it seems that the same is true for spoken languages. Of the Israelis that I’ve met, the ones that have lived in the States speak English like natives.

And so…
After spending time here, I’ve begun to understand how important language is from a variety of perspectives. I find it quite interesting and it makes me a lot more excited to finish reading “The Language Instinct” by Steven Pinker (but I’ve been “reading” it for like a year so we’ll see if that actually happens). I’m starting to think that traveling is really cool because you get to observe these kinds of things only when you dive into a new environment.

Introducing photos

Friday, May 29th, 2009

I’ve been in Tel Aviv for almost a week now, and will be here for 9 more. I’ll write about what I’ve done this week when I get a bit more free time; I need to organize my thoughts a bit more. Anyway, I’ve been taking a lot of pictures and I wanted a place to share them so I spent a couple of hours adding a photos section to this blog.

sunset from Tel Aviv

I did a bit of research on what photo-sharing tool to use. Notable options were various photo-blogging tools, Picasa, and Flickr. Photo-blogging tools were unsatisfactory because each image is uploaded as its own post, and that serves a much more artsy purpose than I’m looking for; I want to be able to bulk upload photos. Picasa seemed pretty cool but there’s a fixed disk space quota that I’ll most likely exceed very quickly, and I don’t want to have to pay for the service. So I settled on Flickr. Flickr so far has been pretty cool; there’s a 100MB a month quota, which I’ve already exceeded for May, but you get effectively unlimited space if you’re patient.

So I uploaded a bunch of pictures and then went looking for a good way to add them to akrish.net. I ran across this WordPress plugin that pretty much served exactly the purpose that I wanted. I installed the plugin, ran into a bunch of difficulties that took way too long to fix, and eventually got the plugin running. Then I updated my WordPress sidebar so that you can get to my albums from any of the pages on my blog. Since I ran into a couple of problems, getting this to work made me pretty happy.

Long story short, links to my albums are on the right. There’s only one right now and it contains some of the pictures of my trip to Israel, including a tour of Jerusalem and some of Tel Aviv. As I mentioned, I’m over my quota for the month so more pictures will be up in June.

By the way, let me know if there are any bugs as I did have to do some stuff by myself.