Computational Learning Theory

July 4th, 2012

Authors: Michael Kearns and Umesh Vazirani

Avrim Blum taught a course on Machine Learning Theory last semester but as I was busy taking some other courses I didn’t have the chance to attend. I however am quite interested in the computer scientist’s perspective of machine learning and decided to read Kearns’ and Vazirani’s short textbook as an introduction to the subject. I really had very little exposure to computational learning theory (despite working in statistical learning theory), but I found the textbook to be quite a reasonable introduction to the field. Kearns and Vazirani highlight some of the main definitions and concepts and go over some key ideas but omit many other important results in the field, either for lack of space or because the field is quite active.

At the onset, Kearns and Vazirani must first define what learning means, and for this (and throughout the field) they use Valiant’s Probably-Approximately-Correct (PAC) learning framework. This is the most fundamental model of learning in the learning theory community, because it captures the essential aspects of the learning problem while abstracting away everything else. Numerous variants and extensions have been proposed and heavily studied, capturing additional aspects of learning including active learning, learning in the presence of noise, and the bayesian persepective.

A brief summary of the book follows: The PAC-learning is developed in chapter one via a series of insufficient definitions and subsequent refinements. Along the way, we see examples of some learning problems, algorithms, and proofs. Chapter 2 explores the principle of Occam’s razor and show thats algorithms that output simple explanations of the data are PAC-learning algorithms. Chapter 3 unifies all learnability results with the beautiful Vapnik-Chervonenkis (VC) theory, which gives both upper and lower bounds on sample complexity for PAC-learning a concept class in terms of
a quantity called VC dimension, which captures the complexity of class. Chapter 4 discusses the classical and practically popular idea of boosting, whereby “weak” learning algorithms can be grouped together intelligently into a PAC-learning algorithm. Chapter 5 introduces the notion of noise, the statistical query (SQ) model, and shows the connection between learning in SQ and learning in the presence of classification noise. Chapter 6 and 7 focus on unlearnability, first establishing concept classes that cannot be learned in polynomial time (under complexity-theoretic assumptions) and then showing how reductions can be used to prove more unlearnability results. Finally, Chapter 8 shows how active learning, where the learner can construct instances and ask for their labels, can be used to overcome this inherent unlearnability.

One of the reasons I read Kearns and Vazirani is that (in my opinion) in statistics, it is hard to capture notions of computational efficiency but it seems to be essential in many estimation problems. The large body of work from the optimization community (many estimators are solutions to optimization problems), has developed a nice, although different from the CS-theorist’s, characterization of computational efficiency, but when studying limits of estimation (i.e. lower bounds), statisticians have no way to capture computational constraints. We typically study the optimality of estimators via information theory, which ignores computational considerations, but allows one to derive fundamental limits on estimation. In many problems this is sufficient, since we can give efficient algorithms whose performance matches the information-theoretic limit, but in others we are not sure if we can. A natural question in this latter setting is if computational considerations makes the problem harder, so that no efficient algorithm can match the information theoretic limit. The fact that computational learning theory works with a good model of computation and can develop such unlearnability results makes me think that bridging the gap between statistical and computational learning can help answer these questions.

So, from that perspective I think it was a very worthwhile read. The book has gotten me excited about a new perspective on learning and acquainted me with a new community of researchers and I think it is a promising direction for answering questions about computational and statistical tradeoffs in learning. I will say that the book is certainly not comprehensive, but it is a good, short introduction and I found it fairly easy to read. I would recommend to anyone interested in the field, and suggest Professor Blum’s course notes for further reading.

Edinburgh and Glasgow

July 3rd, 2012

For the past week and a half I have been in Scotland for ICML 2012 (which I hope to write about soon). Some friends and I arrived in Edinburgh a couple of days early so that we could sight-see and be tourists before things got hectic with the conference. Unfortunately, we didn’t get to see all that much of Scotland but we did thoroughly explore Edinburgh and also managed to fit in a day trip to Glasgow. Both cities are beautiful, and in exploring both of these cities, I had the wonderful opportunity to learn about incredibly rich political and cultural history of Scotland.

Edinburgh is the political heart of Scotland, with its majestic castle, the Palace of Holyrood House (where the Queen stays on her visits), and the Scottish Parliament, hosted in the joyfully quirky parliament building. To complement the royal sights, there is St. Giles Cathedral, the National Museum of Scotland, with its detailed exhibition on Scottish history, the Scottish National Gallery, two lush gardens (Princes Gardens and the Meadows), and so much more. I also climbed Arthur’s Seat (just a twenty minute hike up a hill by the city), for some spectacular views of the city and its surroundings.

In contrast, Glasgow is the cultural heart. It feels like a much more lived-in city than Edinburgh, which feels too touristy to be livable. Glasgow boasts another beautiful cathedral, and huge necropolis where many famous Scots are buried, and probably the most amazing university campus I have ever seen, among many museums and other things that we did not have time to explore. The university is a set of buildings that sits on a hill on the outskirts of the city, with a majestic gothic-style building looking over the city. It seems like an amazing place to go to school. Given that we were only in Glasgow for a few hours, we did not get to explore as much of the city as I would have liked, but I did enjoy the time we spent there and feel like I got a good taste of the city.

Scotland has a very interesting history, starting from the “indigenous” Picts and Scots, through the Roman struggles and the subsequent wars with England, to the religious struggles of the last millenia. Coupled with their rich and unique cultural, from the food to their Scotch whiskey, to their famous thinkers and artists (David Hume, Sir Walter Scott, J.K. Rowling, and Thomas Smith are either Scottish or spent significant amounts of their lives in Scotland), it is a wonderful place to visit and explore. I will caution that the weather is quite finicky, and even in the middle of summer it was raining on and off almost every day, although it was never more than a nuisance. I was disappointed that we didn’t get to visit the highlands, but I suppose it leaves something to look forward too if I ever find myself back in Scotland again.

Game of Thrones

June 22nd, 2012

Author: George R. R. Martin

After watching and thoroughly enjoying Season 1 of Game of Thrones on TV, and after hearing so many of my friends say the liked the books more than the series, I decided it would be worthwile to read the Song of Fire and Ice books. Actually, I have been meaning to do this for over a year now, but every time I went to the library I found that “Game of Thrones” was checked out, and only recently did I get ahold of the book. Once I did finally get ahold of it, I immediately read it, and sort of got hooked; I found myself reading it at work, spending large amounts of my free time reading, and losing track of time as I got lost in Martin’s intricately designed fantasy world. Given that all of this happened, it is pretty apparent that I really enjoyed the book.

Growing up, I loved reading fantasy novels; I was so impressed at how vividly and painstakingly authors like Tolkien and Rowling described these worlds that they had created. Martin has done exactly that with Game of Thrones; painting a picture of his world and developing a complete history that is revealed in bits and pieces over the course of the novel. Every detail is carefully thought out, every character plays his or her integral role in the story, and the consequences of every event play out in critical ways. Tolkien’s Fellowship of the Ring series is incredibly famous in part because of his attention to these details. Without having finished the series yet, I believe it is reasonable to compare Martin’s Song of Fire and Ice series to the Fellowhip of the Ring is this regard.

One of the things that I recently found to be disappointing about a lot of fantasy novels is that the plot-lines are very clear-cut battles between good and evil, characters are clearly aligned with one of the two sides, and conflicts and resolutions are often incredibly obvious. This is definitely not the case in Game of Thrones and it is one aspect I find refreshing that I think separates the series from other fantasy novels I have read. While there are some characters that are clearly good or clearly evil, it is hard to pinpoint the nature of many of the characters. Indeed many of the characters are truly looking out for themselves, shifting allegiances to whoever can aid their agenda the most. I really liked this political/intrigue aspect of the novel and as I mentioned, I think it distinguish the novel and probably the series.

The one issue that irked me while reading the novel is the fact that I had already seen the TV series and the latter follows the former almost exactly. While reading, I not only knew how events would pan out, eliminating all of the suspense and conflict, but I also had mental imagery of the scenes and of Martin’s world. I was disappointed that I lost the opportunity to be creative in envisioning the world and in retrospect, I wish I had read the novels before watching the series. Obviously this isn’t a criticism of the novel. Fortunately I haven’t started watching season two, and I plan on reading “A Clash of Kings” before watching that. I would caution against watching the series before reading the novels, if you’re into that sort of thing.

Having grown up with fantasy novels but not read one for several years now, I found Game of Thrones warmly nostalgic, and really enjoyed the novel. I am excited to get started on the second book, but, as usual, my reading list is growing faster than I can read, so I’m probably going to switch gears and read some non-fiction before “A Clash of Kings.”

A Random Walk Down Wall Street

June 3rd, 2012

By: Burton G. Malkiel

“A Random Walk” is a whirlwind tour of personal investing terminology and strategies, targeted toward the average new investor. Malkiel’s glaringly obvious thesis is that there is no sure-fire path to beating the market, but that a simple buy-and-hold strategy with diversification will follow the market to almost sure-fire capital gains. With appropriate asset allocation, taking into account one’s capacity and tolerance for bearing risk, Malkiel claims that this strategy will allow an individual to meet or exceed their investing goals. Along the way, we meet investment strategy after investment strategy (almost all of which fail after some point), and we also acquaint ourselves with several investment options, and technical jargon, all of which seems useful for a new investor.

Malkiel presents three main schools of thought: technical analysis, the “art” of finding trends in past stock history to predict future stock history and make investment decisions; fundamental analysis, where true value of a stock is assessed and used for investment decisions; and modern portfolio theory, where risk of a stock is evaluated and higher (systematic) risk generally yields higher expected returns, but also with higher variance. Modern portfolio theory seems to be the most reasonable and successful of the three schools of thought, but it too is faced with the challenge that estimating risk is challenging. The traditional measure of beta is not only hard to compute, but it also does not reflect all of the risk associated with any security. That being said, MPT seems to be the theory that holds the most water to this day.

Malkiel spends on chapter discussing behavioral finance and concluding that markets can actually have systematic biases from well-documented psychological factors such as loss aversion and overconfidence. I’ve been planning on reading “Thinking Fast and Slow,” which is devoted to behavioral finance/economics so more on this later.

Investment options discussed range in risk (and consequently expected return) from “higher risk” index funds and mutual funds to treasury bills and certificates of deposit. There are a wide variety of investment choices and Malkiel covers the pros and cons of many of them. One thing that I found somewhat poorly explained is that in many cases the tax advantages associated with one investment choice make it particularly appealing, but the specifics of these advantages are not covered. I, for one, was curious as to why these tax benefits exist in the first place and how they play such a prominent role in investment decisions. At any rate, tax benefits play a critical role and one should choose investment options with this in mind.

The last section of the book serves as an investment guide with Malkiel’s suggestions on asset allocations for various age-brackets (corresponding to capacity to bear risk) along with hints and tips for successful investing. Some of the big takeaways for me include:

  • Re-balancing one’s portfolio on a yearly basis not only keeps risk at the desired level but can also lead to higher returns as it naturally results in a sell-high, buy-low policy.
  • Dollar cost averaging, or slowly investing in monthly, quarterly, or bi-yearly increments, can reduce risk because it minimizes the timing affects of your investment. Not only that, it can lead to increased profits because your investment during downswings gives you more shares than if you had made the same investment at another time.
  • Diversification reduces unsystematic risk and the risk associated with a well-diversified portfolio will be almost entirely systematic, so you will reap the rewards associated with bearing that risk. Things like index funds and REITs are good vehicles for diversification.

Another theme echoed over and over is that any strategy that exposes some inefficiency in the market for profits will become widely adopted and consequently corrected. In particular, any fixed strategy that is successful is self-destructive in that the popularity of any strategy makes it less successful. Even the buy-and-hold strategy is self-destructive in the sense that the S&P 500 stocks have become artificially inflated by the popularity of index funds that track this group.

I still have a lot to learn about investing but I do think this book is a very good primer. Definitely worth a read, and also probably worth revisiting every couple of years to reacquaint yourself with terminology and investment strategies.

Mr. Yuk at Chicago Invite

April 8th, 2012

Last weekend, Mr. Yuk headed west to Chicago for our last tournament before the series. We were on a six-game losing streak after a terrible Sunday at Queen City Tune-up and a possibly worse Saturday at Easterns Qualifier, but for the first time since the fall, we had a full roster, which suggested we might have a more positive weekend. Results-wise, we certainly did; we finished in 9th place (out of 64 open teams) going 6-1 on the weekend, with our only loss coming to Michigan State (the eventual finalist) on universe point.

Unfortunately, the tournament format wasn’t all that ideal. We were pretty horribly under-seeded after poor performances against good competition earlier in the spring and this meant that to get into the championship bracket (and finish higher than 9th), we pretty much had to be undefeated on Saturday. Moreover, our cross-over game on Saturday evening would be against a two-seed from a power pool, in this case Michigan State. If we were seeded slightly better or slightly worse, our cross-over would have been against a three or four seed, which would have been a much more winnable game. Our loss to MSU meant we could get 9th at best, and by winning out on Sunday, we did exactly that.

First the break-down, then some comments.

  1. Michigan B: As is usually the case with Mr. Yuk, we started out pretty slow trading points early on and going down a break before we really got our act together. They ran a poach-y defense that our O-line had a pretty hard time dealing with, resulting in a lot of miscues and easy blocks for them. Our D-line really picked up the intensity, with much more tenacious man-defense that I’m used to seeing from our team, and also with some nice zone-looks that generated turnovers. We started to run away with the game in the second half and eventually won 13-8.
  2. Wright State: With some momentum from our win against Michigan B, we brought the pain on Wright State, generate break after break. The O-line didn’t need to play all that much, but when they did, were cool and consistent. The D-line worked hard to generate turnovers (actually earning blocks, not just capitalizing on miscues) and then converted time after time. CMU wins 13-3.
  3. Dayton: Building on our earlier rounds, we came out strong right away, generating three breaks to start the game and taking half 7-2. Dayton was plagued with injuries and consequently played pretty short-handed but our defense was pretty stifling. Again O-line did their job. CMU wins 13-6.
  4. Michigan State: As our cross-over game, this was a must-win situation for either team to get into the championship bracket on Sunday. At Easterns Qualifier we played MSU very close, losing 14-16, and I looked forward to a competitive, exciting game. As it turns out that is exactly what happened. We started out trading points, and eventually MSU earned two breaks in the first half. The second half was a different story. We earned three breaks to take the lead 12-11, generated another block with the opportunity to win but a costly turnover swung momentum in favor of MSU. They scored to tie the game at 12s, and we traded to 13s. On universe point, both teams were very impatient, resulting in a marathon of a point with the possession exchanging several times. A costly end-zone turnover on our part enable MSU to work their end-zone offense and score for the universe-point win. MSU wins 14-13.
  5. Dayton: Relegated to the 9th place bracket, we faced Dayton again first thing on Sunday morning. If it’s at all possible, they had incurred even more injuries but as usual, we came out incredibly slow, going down 1-4 right away. Once we got our act together both lines started clicking and we earned our breaks back to win 13-10.
  6. Western Michigan: Another fairly straightforward win for us. We started out trading and then broke a couple of times to take half 7-4. Deep defense was huge for us this game as they looked to huck early and often but we prevented them from completing the majority of these attempts. The second half was went smoothly and we came out ahead. CMU wins 13-8.
  7. Northern Iowa: Penn State was on the other side of the bracket after a surprising loss in their crossover game on Saturday, and we were expecting to see them in the 9th place game. However, they got upset by Northern Iowa, who we ended up facing. Again we came out strong right away, taking a quick 3-1 lead. They broke back once in the first half but we got a couple of more to take half 7-4. In what was a fairly uneventful second half, we ended up winning 12-9 after the game got capped. Thus we ended up in 9th place.

Some comments:

  • We clearly play up or down to our opponents. We played some teams that we should have crushed this weekend, yet many of our games ended pretty close. This meant that we had to keep our starters playing, so that in the games that mattered we were fairly fatigued. We need to get much better at putting away bad teams. On the plus side, we really stepped up to MSU and showed that we can compete with some very good, potentially nationals caliber, teams.
  • Another mental thing, it takes us way too long to get going in the mornings. Come regionals, we can’t dig ourselves into a hole on Saturday or Sunday morning, and really all it means is that we focus and take our warmup seriously. This is purely mental, but, to me, it is critical for our success in the series.
  • The last mental thing, we lost our composure towards the end of the MSU game and I think that cost us the game. We had a couple of unforced turnovers and bad decisions, especially on universe point, where we repeatedly gave them back the disc. Our offense was not clicking on that point and I think that in part it was because we’re not used to such high-pressure situations and lost our cool. We’ll only improve on that with experience but it is something to note.
  • Strategically, our O-line really struggled with the poach-y defense that Michigan B threw at us. Our cutters were very hesitant about moving through the poaches, making it very hard for handlers to find open upfield targets, and resulting in several turnovers. I’m not sure if a lot of teams in our region rely on poach-y defenses but it is something we should work on in the next couple of weeks.
  • Defensively, the one thing that really hurt us in the MSU game was our marks. We started out marking honestly, but MSU’s handlers repeatedly got off around breaks, making it near impossible for upfield defenders to get blocks. We adjusted by having our marks really take away the around option, but this came at the expense of opening up the inside looks, which MSU exploited, again at the cost of our upfield defense. Better marks would ideally challenge both options simultaneously, making it much easier for us to generate turnovers.
  • On a positive note, both our offense and defense clicked much better than it has in past tournaments. On the defensive side, missing two of our starters at past tournaments definitely contributed to our relative success this weekend. Offensively things also seemed to work very well for the most part (apart from a couple of miscues and the poach-defense thing). Defensively I was especially happy to see more and more people laying out and actually contesting both underneath and deep throws.

All in all, we had a solid weekend, winning games we should win, and with a really close, competitive game against a very good team in MSU. However, there are definitely things to work on in the next couple of weeks leading up to regionals. With our region only earning two bids to nationals, we’ll have to iron out all of these kinks if we want to make a push for one of them.

Conferences is next weekend so look for another post in a week or so.