My Podcast Interview on GLIMPSE

A few weeks ago, Alex Albanese interviewed me on the MIT post-doc GLIMPSE Podcast. We discussed populations, cakes, toilets, Google and many more topics. The podcast was released yesterday.

http://glimpsepod.scripts.mit.edu/home/2015/12/02/episode-4-le-nguyen-hoang/

It was an awesome experience. Alex is a great interviewer. And, even though he is a biologist (I guess I can forgive that ;)), he was extremely sharp and quickly saw deep connections between the various mathematical topics I discussed. It was a real pleasure to interact with him, and I think that the podcast shows this very well. In fact, I highly recommend that you check out his other GLIMPSE podcasts.

I gave a talk at Harvard yesterday!

And I loved it. It was part of the still-figuring-itself-out joint Harvard-MIT Postdoc science sharing seminars. My talk lasted only 8 minutes, and was about the online optimization problem I’m researching. But there was hardly any mathematician, computer scientist, statistician or electrical engineer in the audience. And this might be why I really enjoyed it.

It’s a bit of a paradox though. 8 minutes, that’s very short. And giving an 8-minute talk to non-specialists is a huge challenge and a frustrating endeavour. I mean, there was not even enough time to describe my model! Nor could I explain how it generalizes a bunch of problems, why my approach to the problem (based on tracking dual variables using primal indicators) could give a unified view on a bunch of things, and why I was stuck in my proof attempt. These are the truly exciting things in my research. But there was no time for any of that.

But that’s okay. Because my goal was not to show what I have successfully achieved, as is so often the case in talks for specialists. Nor was it to teach the audience something they will be tested on. Instead, as always when I do science and math popularization, I wanted to show why what I do is so exciting, or, at least, how excited I am to do what I do. And the only way I know to do so is to make sure that the audience and I all have a great time. And we did (beer helped!). At least I did.

I don’t have much of a career plan but I do know that this is exactly what I like most. I love sharing exciting mathematical moments with an audience…

My First Months in Boston

I’ve been lucky enough to start a PostDoctoral position at MIT. It’s been over a month. And it’s been rough.

It’s not that MIT is a bad environment, although hanging around with top scientists is definitely intimidating. It’s not even the huge amount of snow that’s been falling on Boston (I thought my Montreal cold days were over…). It’s been rough because I’ve entered a new area of research (online optimization), I’ve had a lot to read and learn about, and I wanted to do it well. So I’ve worked hard. Very hard.

What is both intimidating and exciting is that the field of online optimization is exploding. Within the last two years (if not the last six months), many ground-breaking papers have come out, with surprising results and hugely technical proofs. It’s intimidating, because it was hard for me to imagine that I’d be able to contribute even an iota to the wonderful results that have already been proved. But it’s exciting, because it also means that many important questions have not been answered yet, and that there’s a greater chance that the contributions I’ll make will have a significant impact. That’s what you want, isn’t it?

In any case, those were ideal conditions to compel me to work hard. And I did. I’m glad I did. After all, that’s why I’m here.

Inkscape on Mac

If you’re a Science4All reader, you might wonder how I made the (beautiful :P) figures in the articles. The answer is: Inkscape. Inkscape is a brilliant vector drawing program. It’s like a high-quality version of Microsoft Paint. It’s easy to use, and yet, you can be very accurate in your drawings. Plus, using extensions like textext, you can easily include LaTeX formulas.

Until now, though, Inkscape was only available on Mac through an X11 server like XQuartz. This has several annoying drawbacks. Yet, today, I’ve found out that a native version has come out! I’ve been using it a little bit and it appears to be working very well! Here’s a download page:

https://www.dropbox.com/s/n2l4nvht8umf83m/Inkscape-r0.48.4-r9943-10.8%2B-x86_64_RC5.dmg

You can follow the instructions on Nicolas’ PhD blog:

https://nicolasamiot.wordpress.com/2013/07/25/inkscape-textext-on-mac-os-x-mountain-lion/

I’ve actually had trouble installing EggBot, so my textext extension is not working, but I can drag and drop LaTeX formulas from LaTeXiT, and that’s good enough!

If you want beautiful figures (especially for math!), I strongly recommend going with Inkscape!

World Cup Predictions

The FIFA World Cup is approaching. I am excited.

A friend of mine referred me to this beautiful chart, by Andrew Yuan. According to his simulations, Brazil is the heavy favorite, with three times the chance of winning of the second-best candidates. I want to congratulate him. He’s done a wonderful job. I wish more people could (or tried to) share their mathematical and computational efforts in such an elegant and entertaining way.

However, in science, one question needs to be systematically raised. How trustworthy are the results?

To find out, we need to look at his model. In essence, he explains the levels of football teams with two factors. The first is the FIFA ranking. This ranking is derived from recent game results, with points attributed to wins and draws that depend on teams’ opponents. It’s pretty messy, so I’m not going to attempt an explanation of this ranking here. The second is the Home/Away factor. As any football fan knows, it’s an advantage to play at home, and the history of the World Cup definitely backs up this assertion. Indeed, winning home teams include Uruguay (1930), Italy (1934), England (1966), West Germany (1974), Argentina (1978) and France (1998). The following figure yields a powerful visual representation of this phenomenon.

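[Figure: probability that team A beats team B as a function of the difference in FIFA ranking, for home, away and neutral games]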

Yuan then goes on to estimate the probability that team A beats team B given these two factors, by looking at the history of games since 1993. Once again, this corresponds to the figure above. It is rather obvious from the figure that better-ranked teams are more likely to win. But Yuan went further and drew the underlying curve that best fits these data. This curve is what predicts exactly, in Yuan’s model, how FIFA ranking + Home/Away/Neutral affect A’s probability of beating B.
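To make the idea of such a curve concrete, here is a minimal sketch. It assumes a logistic curve, which is one common choice and not necessarily the one Yuan used. The function name win_probability, the coefficient RANK_COEF and the VENUE_BONUS values are all made up for illustration; they are not fitted to any data. Draws are ignored.

```python
import math

# Hypothetical coefficients, chosen for illustration only.
# They are NOT Yuan's fitted values and are not fitted to any data.
RANK_COEF = 0.02  # effect of one place of ranking difference
VENUE_BONUS = {"home": 0.5, "neutral": 0.0, "away": -0.5}  # venue from A's side


def win_probability(rank_a, rank_b, venue="neutral"):
    """Probability that team A beats team B under an assumed logistic model.

    rank_a and rank_b are FIFA ranking positions (1 is best), so the
    difference rank_b - rank_a is positive when A is the better-ranked team.
    """
    score = RANK_COEF * (rank_b - rank_a) + VENUE_BONUS[venue]
    return 1.0 / (1.0 + math.exp(-score))


# Two equally ranked teams on neutral ground: a coin flip.
even_game = win_probability(10, 10, "neutral")

# The same pairing, seen from the home side and from the away side.
at_home = win_probability(10, 10, "home")
away = win_probability(10, 10, "away")
```

In a real fit, the two coefficients would be estimated from past game results rather than picked by hand.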

So, what’s wrong with Yuan’s model? I’d say that it’s a wonderful attempt at modeling football games, which already requires a huge amount of work. However, it may still not be detailed enough. Some important factors may be missing. For instance, it’d be interesting to look at how the number of missing players affects a team’s probability of winning. If Argentina has to play without Messi, it’s not going to be the same Argentina (even though the factors FIFA ranking + Home/Away/Neutral are unchanged!). Conversely, we have to be careful not to add too many factors, as an over-fitted model may pick up meaningless patterns.

One way to avoid increasing the number of factors too much is to question those we already have. Once again, the figure above is particularly good at illustrating what I mean. We clearly see that the FIFA ranking alone is not good enough to explain team A’s probability of beating B. This is particularly true when you consider teams that are separated by fewer than 20 places, which will be the case in most games of the World Cup. In fact, as you can see in the figure, the Home/Away/Neutral effect is, in this case, way more relevant than the FIFA ranking. Crucially, the FIFA ranking may simply not be reliable enough. This blatant fact is what explains the huge difference between Yuan’s prediction of Switzerland’s chances of winning and the bookmakers’. In fact, I’ve already criticized the FIFA ranking here, where I point out that it’s not based on any solid mathematical ground and sounds much more like some sort of obscure machinery. Because of that, Yuan’s prediction is sort of like predicting the future of 4-year-old kids based solely on their ability to count to 10. It may yield some indication… But can’t we do better?

I think so. In fact, 8 years ago, I made my own predictions for the 2006 World Cup. Results were (misleadingly) amazingly good. Find out how I did it by reading this article I wrote. Lately, I’ve pondered the use of a more robust Bayesian approach to this modeling. Maybe my statistician days are not over… For 2014 though, I won’t have time to run simulations!

My Experiment on Truth-Telling in Shift Scheduling with Employees’ Preferences

I have been very fortunate. In my PhD research, I have had the chance to venture through diverse, seemingly disconnected areas of applied mathematics. I have been combining operations research and game theory, as well as theoretical proofs and computer simulations. My bibliography spans papers from economics, computer science and mathematics, while the connections with philosophy, biology and industry seem clearly within reach. Yet, lately, I have found myself doing even more unexpected things.

Yesterday, I gave a talk in which I invited 20 other students to join an experiment that tests the theoretical and computational outputs of my research. In brief, my research consists of a shift scheduling algorithm which includes employees’ preferences. One big question I raised concerns the incentive-compatibility of this algorithm. More specifically, will employees have incentives to reveal their preferences truthfully?

Quite often, I’m asked: “Why would an employee not want to reveal his preferences truthfully? How could it not be in his interest to do so?” After all, the shift scheduling algorithm aims at optimizing employees’ satisfactions. So, how can it yield him better shifts if he reports untruthful preferences rather than his actual preferences? Well, let’s take a cake to figure it out. Suppose the cake is half vanilla, half chocolate, and that there are three contenders. Now, imagine you like vanilla and chocolate equally, but that the two others have strong but different opinions about vanilla and chocolate. One loves vanilla. The other loves chocolate. A cake-cutting algorithm aiming at maximizing the sum of satisfactions would then give all the vanilla to the vanilla-lover, and all the chocolate to the chocolate-lover. This leaves you, chocolate-and-vanilla-lover, with nothing. That’s because the cake-cutting algorithm doesn’t maximize all contenders’ satisfactions (this actually doesn’t make any sense); it merely maximizes the sum of the satisfactions.
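The cake example can be played out in a few lines of code. This is only a toy sketch, not my shift scheduling algorithm: the names allocate and satisfaction and all the numbers are made up for illustration, and I assume that each contender’s reported values for the two halves must sum to 1. Maximizing the sum of reported satisfactions then simply gives each flavor to whoever reports the highest value for it.

```python
FLAVORS = ["vanilla", "chocolate"]


def allocate(reports):
    """Give each flavor to the contender(s) reporting the highest value for it.

    This maximizes the sum of reported satisfactions. Ties are split equally.
    """
    shares = {name: {flavor: 0.0 for flavor in FLAVORS} for name in reports}
    for flavor in FLAVORS:
        best = max(report[flavor] for report in reports.values())
        winners = [name for name, report in reports.items() if report[flavor] == best]
        for name in winners:
            shares[name][flavor] = 1.0 / len(winners)
    return shares


def satisfaction(true_values, share):
    """Satisfaction measured with the contender's true values, not the reported ones."""
    return sum(true_values[flavor] * share[flavor] for flavor in FLAVORS)


# Illustrative numbers only (an assumption of this sketch).
my_true_values = {"vanilla": 0.5, "chocolate": 0.5}
others = {
    "vanilla_lover": {"vanilla": 0.9, "chocolate": 0.1},
    "chocolate_lover": {"vanilla": 0.1, "chocolate": 0.9},
}

# Truthful report: both flavors go to the two others.
truthful = allocate({"you": my_true_values, **others})
truthful_payoff = satisfaction(my_true_values, truthful["you"])

# Untruthful report: pretend to care only about vanilla.
lie = {"vanilla": 1.0, "chocolate": 0.0}
lying = allocate({"you": lie, **others})
lying_payoff = satisfaction(my_true_values, lying["you"])

print(truthful_payoff, lying_payoff)
```

With these numbers, the truthful report leaves you with nothing, while pretending to love only vanilla wins you the whole vanilla half, which you truly value at 0.5.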

With this insight, the more natural question that comes to mind is rather: “Among all preference revelations (and there are lots of them!), how on earth could it be that the best revelation is always the truthful one?” It seems unlikely. Worse, it seems impossible to design a shift scheduling algorithm that would guarantee truthful revelations to be optimal. In particular, the nicely optimized shift scheduling algorithm I have been developing over the last year seems unlikely to yield truthful revelations as employees’ optimal strategies. I want to make sure of that.

Sure, I could have run computer simulations to search for optimal strategies (and, for cake-cutting procedures, I did!). But wouldn’t it be more convincing if, in addition to formal proofs, I had feedback from users that points to this weakness of my shift scheduling algorithm? This is why my professor convinced me to organize an experiment with human subjects playing the roles of strategic employees. Hopefully, they’ll be smart enough to figure out optimal untruthful strategies, thereby pointing out a major flaw in my shift scheduling algorithm.

At this point, you might wonder why I’m so eager to have a well-founded case against my own shift scheduling algorithm. It’s because I have a way to fix that. This fix is based on heavy mathematics and computer code. I’ve been trying to sell it. But buyers have been having trouble seeing why such an (intellectually) expensive fix should be worth purchasing. Probably because they don’t see what flaws the fix fixes.

But does the fix work? Interestingly, the second part of the experiment will help us find out how effective this fix is, as I’ll add it to the shift scheduling algorithm. Hopefully, if my fix is good enough, truthful strategies should suddenly become optimal, or, at least, near-optimal. Hopefully.

To be perfectly honest, I have my own doubts about the performance of this fix (mainly due to computational limitations). It’s a huge challenge that my work and I will be facing here, and, frankly, I’m quite worried about it. I’ve got used to working silently and alone on my research ideas, testing them with myself and my supervisors as the only judges, and exposing them only once I consider them mature enough. Yet, in this experiment, I feel like I’m working in a very exposed manner. It adds pressure. It adds stress. It adds anxiety.

But it’s exciting.