Some books I read this year

It's been a good year for reading. Here, in no order, are a few books I enjoyed:

(1) My Struggle (vol. 1), Karl Ove Knausgaard.

Breathtaking. There's a good chance you've read glowing reviews elsewhere by now, so here are a few reactions that are more personal. Among other things, the book is a work of applied epistemology, and it is masterful in this respect. I enjoyed its discussions of Western culture, and of how things like masculinity get expressed in that culture, which are not as heavily influenced by American culture as most of what I read. The book is structured remarkably well: it's easy to judge novels by the quality of their set pieces or most interesting passages, and Knausgaard scores very well on such metrics, but the book is also wonderfully composed on a macro level.

(2) The Hard Thing About Hard Things, Ben Horowitz.

Informative, insightful, and fun. It reads like an unevenly polished brain dump, which I don't mean as a criticism. I'll pay for as many of these as Ben Horowitz wants to dump and unevenly polish. It's a constant stream of good advice presented plainly and clearly.

Horowitz made the interesting choice to address the reader as a CEO--that is, to make a startup CEO the formal audience of the book. Surely the vast majority of the book's readers are not CEOs. Of course, the information is useful for anyone, and especially anyone at a startup, but perhaps the reader absorbs the book's lessons better after assuming the CEO role for the sake of the narrative. The line between fiction and nonfiction sometimes blurs.

(3) The Namesake, Jhumpa Lahiri.

It's funny; I agree with almost every criticism I've ever heard of Lahiri (William Deresiewicz's comments are the best of the Lahiri-bashing genre), but I still read her with engagement and interest. This book was no different; its Boston setting probably helped.

(4) Addiction by Design, Natasha Dow Schüll.

It's a book about slot machines, discussing the relevant psychology, history, economics, and much more in great detail. It's thorough and well-researched, but always interesting. Schüll should be commended for never letting fact-recitation distract her from the book's big questions: What makes people play slot machines? How does it feel to play one, and to play one as an addict? (And how different is the addict's experience?) And so on.

Fantasy baseball: how good is PFM at using PECOTA?

Although I’m a baseball fan first and a fantasy baseball player second, I do play fantasy vigorously. As I’ve transitioned into being a professional computer/data guy, I’ve used baseball as a way to learn tools and practice “data science” skills. One nice feature for a beginner is that there’s enough data-cleaning work to give me a small taste of what it’s like, but not nearly as much as in other domains. With all the excellent data sets out there, I can get to the fun stuff quickly without completely neglecting data collection.

Valuing fantasy baseball players divides, as a first approximation, into two projects: predicting players’ outputs and using those outputs to generate fantasy values. I know many serious fantasy players do their own projections, but at least for now I’m guessing I can help myself more by working on the second step than on the first. I’ll make some obvious adjustments for injuries and similar news, but that’s about it for now.

That second step, generating fantasy values from projected statistics, is complicated, but my league’s (fairly common) structure lets me make certain simplifications. It’s a head-to-head league in which each of twelve categories is worth a point, and whoever gets the most points wins.

The suppositions I’ve chosen to make so far are:

(1)    That a player’s fantasy value is, for each category, proportional to the sum of the probabilities that his contributions to a given category cause a fantasy team to win that category.

(2)    That this value is proportional to the standard deviation of the distribution of team scores in that category in the fantasy league.

(3)    That a player’s output in a category can be represented by his average projected weekly output in that category.

(4)    That the means and standard deviations I’ve derived from my league’s last two years of results are a fair guide to how the league will play this year.

One could criticize these suppositions in many ways. The last one, for example, is particularly suspect. It might be better to create a model based on 2014 projections and estimate the values that way, but there are some advantages to my method. One advantage is that it was great fun to scrape all the results from ESPN.com. (Many complications arose here, but BeautifulSoup and Requests were enough for the bulk of the work.)

(2) and (3) are similarly approximate. Supposition (2) might cause me to overvalue Billy Hamilton, for example, who will get many of his stolen bases when his fantasy teams have already locked up the category for the week. Using supposition (3) leads me to ignore differences in variance of outcomes between players, which is probably an oversimplification. And criticisms like these could be multiplied.

All that said, I think that a very useful first approximation results from this kind of strategy: use the projections to guess at what a player will contribute every week, use standard deviations to estimate the number of category wins that those contributions will add, and add those category-wins up.
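Here is a minimal sketch of that strategy in Python. It is an illustration under assumptions, not my actual code: I assume team category totals are roughly normal and independent, I assume each projection has already been expressed as a weekly edge over some baseline player, and every name and number below is made up. It also handles only counting categories where more is better.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_wins_added(weekly_edge, team_sd):
    """Extra probability of winning one category in one week.

    Assumption: two otherwise-even teams have independent, normally
    distributed category totals with standard deviation team_sd, so the
    gap between them has standard deviation team_sd * sqrt(2).
    """
    return normal_cdf(weekly_edge / (team_sd * math.sqrt(2.0))) - 0.5


def player_value(weekly_edges, team_sds):
    """Add up category-wins added across the player's projected categories."""
    return sum(
        category_wins_added(edge, team_sds[category])
        for category, edge in weekly_edges.items()
    )


# Hypothetical league standard deviations and weekly edges over a baseline.
team_sds = {"HR": 3.0, "SB": 2.5, "R": 8.0}
slugger = {"HR": 1.0, "SB": -0.2, "R": 1.5}
speedster = {"HR": -0.5, "SB": 2.0, "R": 1.0}

for name, edges in [("slugger", slugger), ("speedster", speedster)]:
    print(name, round(player_value(edges, team_sds), 3))
```

Dividing by the league's standard deviation is what puts the categories on a common scale: one extra stolen base moves a tight category more than one extra run moves a noisy one.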

Nothing I’ve described is statistically or technologically difficult, and I’ve chosen a similarly simple-but-reasonable method for adjusting player values at each position. (Estimate how many players n will be drafted at each position and set the n+1th-best player at that position’s value to zero, adjusting everyone else accordingly.)
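The positional adjustment in that parenthesis can be sketched as follows. This is one reading of it, with a hypothetical function name and made-up players: I take "adjusting everyone else accordingly" to mean subtracting the value of the (n+1)th-best player at a position from everyone at that position.

```python
def adjust_for_position(raw_values, drafted_per_position):
    """Shift values so the (n+1)th-best player at each position is worth zero.

    raw_values maps player name -> (position, raw value).
    drafted_per_position maps position -> n, the estimated number drafted.
    """
    values_by_position = {}
    for position, value in raw_values.values():
        values_by_position.setdefault(position, []).append(value)

    baselines = {}
    for position, values in values_by_position.items():
        ranked = sorted(values, reverse=True)
        n = drafted_per_position[position]
        # Index n is the (n+1)th-best player; if the pool is smaller than
        # that, fall back to the worst player listed (an assumption).
        baselines[position] = ranked[min(n, len(ranked) - 1)]

    return {
        name: value - baselines[position]
        for name, (position, value) in raw_values.items()
    }


# Made-up players and values, purely for illustration.
raw = {
    "Shortstop A": ("SS", 3.0),
    "Shortstop B": ("SS", 2.0),
    "Shortstop C": ("SS", 1.5),
    "Second Baseman D": ("2B", 2.5),
    "Second Baseman E": ("2B", 1.0),
}
adjusted = adjust_for_position(raw, {"SS": 2, "2B": 1})
print(adjusted)
```

With two shortstops expected to be drafted, the third-best shortstop becomes the zero point, so a player's adjusted value measures how much better he is than the best freely available alternative at his position.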

My results are already surprising: they differ significantly from Baseball Prospectus’ Player Forecast Manager. What’s surprising about this is that I’m using the same PECOTA projections that BP’s tool is, so that any differences in our fantasy valuations are not due to differences in projections, only to differences in deriving fantasy value from projections. (I’ve also configured the tool to use the same categories that my league uses.) Even differences in determining positional value can’t account for too much of the differences, because even filtering each set of results by position leaves wide disparities. (I can’t give many examples here, because PECOTA and PFM data are subscriber-only, but two of the many claims that result from my method are that BP is--for my fantasy league--overvaluing Starlin Castro and undervaluing Jason Kipnis.)

Perhaps metarationality requires me to defer to BP completely here; it seems clear that I at least ought to partially adjust my beliefs once I see that a company of very smart people, plenty of whom get hired away by MLB teams, disagree with me. However I decide to weight my own results and PFM's come draft day, right now I’m left with a disjunctive conclusion. I think one of these must be true:

(1) BP’s calculations are way better than mine. This would be interesting as an example of a situation where a sensible basic method diverges enormously from the best method.

(2) The only major feature of my league that PFM does not explicitly account for--that it is head-to-head--makes a big difference to player values. This would be interesting as a fact about fantasy baseball.

(3) BP’s calculations are not way better than mine. This would be interesting for many reasons. Although I’m not at all sure that my algorithm is better than or even roughly as good as theirs, this is a possibility worth at least considering. If it were true, it might mean that my league is idiosyncratic and that I’m doing a good job of adjusting to it; it might also mean that I got lucky to choose a different and better basic approach from PFM’s.

There’s much more to say here, but I’ll say it better after I work harder to find patterns in the differences between my results and PFM’s, and after I work to improve my algorithms. (The answer to the title question is currently: I don't know.) For now I’ll simply suggest that serious fantasy players might do better to spend less time projecting players and more time valuing players based on projections.

Getting Things Done: two months later

Two months ago I wrote a glowing review of Getting Things Done. I’m still happily using the system (which, for whatever it’s worth, is little more than a commitment to keeping a list of your projects and the next sub-projects you need to do for each of them). Its most surprising benefit is that “switching costs” are much lower than they used to be. Knowing that I’m not going to forget an important task makes it easy for me to focus on what I’m doing now even if I just started doing it.

Some of the book’s advice, though, is either incomplete or outdated. Here is a small addendum to the system:

(1) Not all projects cleave neatly into next actions, on the one hand, and everything else, on the other. Much of my work is coding and writing. It is not convenient, and usually not helpful, to put “write the next paragraph” on a list and then replace it with a similar item when the paragraph is done. Code is similarly difficult and inconvenient to plan like this.

Rather than abandon the system for these projects, I just try to list all the project-parts that could be done next, updating and deleting as the project progresses. It’s not perfect, but here metaphysics simply doesn’t agree with the system.

(2) David Allen recommends sorting tasks into “contexts” in which sets of them will be appropriate to do: e.g., one for tasks that require a phone. Many of us today, though, have few different physical contexts for work: we work at computers from which almost everything we do is accessible. Now it’s probably better to optimize your next-action-grouping for fast search, not for context-appropriateness. I group mine by rough similarity and relatedness of parent projects; I also have “writing” and “coding” lists for times when I’ve set aside time for those activities or just find myself in the appropriate frame of mind.

(3) Many of the tools Allen recommends are amusingly out-of-date. (The technical resources he mentions would make for a nice round of trivia.) I find that Trello is a great way to manage my tasks; it can do the job of most of the technology Allen recommends.

Stuff I wrote elsewhere in 2013

2013: What I've been reading

I'm not the sort of person who produces particularly good "best books of the year" lists. I simply don't read enough new stuff. Also, my new job has caused me to read more about biology and computers, and to do more of my book-consumption in the subway or via audiobook as I commute. That, in turn, has shifted some of my reading from fiction to nonfiction, which feels less distorted by the audiobook format. The best stuff I read this year was King Lear, Pride and Prejudice, and The Great Gatsby. But that doesn't make for much of an end-of-the-year list. (If you want a new discovery, I quite like Pierce Gleeson, who posts a bunch of his stories here.)

Predictably, much of my 2013 nonfiction consumption was not of books I read front to back. There were lots of blog posts, textbook chapters, and MOOC lectures--I generally use MOOCs like libraries and not like courses.

Even so, here are ten nonfiction books that were new to me and that I found valuable this year, in no order:

  • Average is Over. The most thoughtful work of economics and the most exciting "futurist" thing I read this year.
  • It turns out that Web design is less exciting to me than most other kinds of coding, but I still think Jennifer Robbins' book on the subject is good.
  • I am currently devouring The Structure and Interpretation of Computer Programs, available for free here.
  • Another old, great computer book that was new to me: The C Programming Language.
  • Among poker books, PLO Quick Pro was probably the best one I read (no link in case you don't want the video that plays automatically on the book's Web site). Among strategy books that do not cost $400, I thought Playing the Player was the best I read (though I didn't read as many new poker books as in years past).
  • I learned lots from the information-dense and usefully opinionated Engineering Long-Lasting Software.
  • David Allen's Getting Things Done has made me significantly more productive. I reviewed the book here.
  • How To Cook Everything Vegetarian made my food life better.
  • So did Classic Indian Cooking.
  • In my dissertation work I read a lot of excellent work on Plato's Theaetetus. I gained new respect for David Sedley's book on the dialogue, which I already liked.

Contemporary philosophy and coding

The other week I was talking to a recently graduated, poker-playing philosophy major who is trying to learn about computers as he looks for a job. (We were introduced by a mutual acquaintance who, I gather, had a very easy time deciding the two of us would find something to talk about.) The conversation turned to the ways in which philosophy can prepare one (or not) for work with computers. He thought my remarks on the subject were more interesting than I thought they were, which prompted me to think a bit longer about it. I had surprised my interlocutor by claiming that philosophy was quite directly relevant to coding; further reflection not only reinforced that belief but revealed a further, surprising aspect of that relevance.

Though I wouldn't claim that studying philosophy helps you learn to code more than studying computer science or math does, I'm very confident that the connections are real and deep, and go beyond the basic observation that the clear thinking demanded by good philosophy pays off in any subject. Studying the mechanisms of language is now a central and unavoidable task for a student of philosophy. I've spent long hours thinking about sense vs. reference, the relationship between utterances and states of affairs, what makes something a complete statement, and so on. I imagine anyone who has done this will have an easy time understanding, say, the distinction between passing by value and by reference.

I first learned the lambda calculus (from this book) in a philosophy course, and that's directly applicable to coding even if you aren't doing heavy-duty functional programming. Distinguishing various functions of verbs of being prepares one to use distinct symbols to check for equality and to make things equal. Logic courses, required in many philosophy programs and encouraged in many more, are of obvious importance. And, of course, UX is largely just a subfield of applied epistemology, and various ethical subjects inform every part of a coder's work...

...But what surprised me at the end of this train of thought is that it's not just the cognitive strengths trained by philosophy or even specific healthy patterns of thought that help a coder. Even the discipline's more problematic features have helped me. If anything, contemporary philosophy is too focused on language, but that focus has been a great benefit to my career (see above). I've also long suspected that academic philosophers have started to value logical consistency too much, and that we might be better off if we asked philosophers to spend a little less time worrying about the ways their definitions interconnect and a little more time thinking about whether those definitions are worth worrying about in the first place. Again, though, this is plausibly a benefit to those of us who now spend lots of time thinking about the properties of unusual data structures. And reading technical documentation is easier when you've read lots of journal articles that are optimized, even over-optimized, for proving and communicating logical features of a theory.

Some of those remarks are quite controversial (though I'm far from alone in believing them, and would be willing to defend them at length). Notice, though, that even if I'm wrong to think that some of these are problematic features or emphases of the discipline, those features and emphases would still be useful training for coding. Of course, I do believe that philosophy is a wonderful and noble thing, that academic philosophy is in many ways flourishing and at its historical peak of quality, and that philosophy's best features help its students in many careers, including mine. But the value of philosophy to computer work seems to me slightly more independent of the quality of the state of the discipline than I would have expected.

"I wouldn't have learned this anyway!"

I work in genomics and bioinformatics now. As an undergrad I was a math major and took plenty of writing and philosophy classes on the side. I haven't taken a (non-MOOC) biology or computer science class since high school. An obvious question to ask myself is whether I regret not having devoted more of my undergraduate career to stuff I'm working on now. It would be nice to have had more of a head start in the various subjects that intersect in bioinformatics--but I was an undergrad from 2001 to 2005. Back then, next-generation sequencing did not exist (or barely existed, depending on one's definition). So much of modern bioinformatics is an attempt to solve the questions set by this technology, and that fact puts a hard limit on how much I could possibly have learned about my work back then.

As for the computer stuff: yes, some extra work in algorithms and extra practice debugging would have helped. But, as I've written elsewhere, computer science and programming is probably the best subject for self-study today (I wrote a little bit about my experience here). And--crucially--some of the most important computer skills are not things you learn as a CS major. Working in a shared repository; reading and debugging others' code; finding the right libraries to use in your own project and making sure the license permits you to do so... there are, as far as I know, not so many places in which a CS student spends a lot of time doing these things, and very few in which these are requirements of a CS major. A good friend of mine was a diligent, smart CS major who still uses his computer skills hourly in his work. He does not have a GitHub account.

So, looking back, those poetry and math and philosophy courses look pretty good. This is not only because they constituted a liberal arts education that is good to have regardless of its precise relationship to one's career, but also because they--at least arguably--have been at least as relevant to my computational biology career as either computer science or biology classes would have been.

One year later

It was about this time a year ago that I started Project Euler as a way to brush up on some math and perhaps learn something about computers along the way. This was a perfect complement to my dissertation-writing. These were small projects that I could solve any way I wanted--some hardly even required a computer--and for which the costs of failure were small. Computer work and philosophy remain ideal twin pursuits, but the balance between them has shifted: I'm now a full-time worker at a genomics startup. Like everyone at a new-ish startup, I do too many things for a job title to communicate much about my job, but there's a lot of coding and other technical stuff involved. Somehow my whirlwind tour of Ruby, Python, SaaS, Ruby on Rails, Django, algorithms, C, C++, SQL, and assorted front-end stuff (and various small study projects I'm forgetting) ended in bioinformatics... which is perfect for me. I constantly need to learn new tools and technology, which I think is great fun, and the underlying subject matter is fascinating. Genomes behave in stunning ways.

I can't imagine a better topic than coding (broadly understood) for Internet self-study. Free and cheap resources are everywhere, from books to auto-gradable projects. Even if I hadn't wound up benefiting professionally from my studies, my life would still have become more efficient, more informed, and more fun.

I've been through a (virtual) mountain of books, tutorials, podcasts, blog posts, essays, and Tweets this past year. Perhaps there is some future benefit, to me or other people, to my listing a few resources that have been most helpful.

Engineering Software as a Service. The introductory chapters are one of the best introductions to, and defenses of, agile development. More than any other book or blog post, it has helped me to think about SaaS accurately and effectively. (Not that I don't have a long way to go to really understand SaaS.) I've also watched a bunch of Armando Fox's freely available class videos, and his lecturing style is excellent: blunt, content-rich, and serious, but with a sense of fun.

The Zipfian Academy's blog post on data science resources may or may not be comprehensive, but it's led me to a bunch of the most useful online courses, books, and other resources. And the structure of the post itself contains some useful lessons about the structure of the discipline.

Lynda.com has been an incredible bargain for me. Kevin Skoglund's courses are particularly good. It's been pleasantly surprising how useful the videos are not just as courses but as quick references. Not infrequently, remembering and returning to where I learned something on lynda.com is the fastest way for me to access some solution, technique, or even command. (And I'm pretty good at Googling.)

Kernighan and Ritchie's seminal book on C is even better than its reputation led me to expect--not least because the prose is remarkably clean, effective, and readable. They do the reader the service of allowing one to think about C, not about untangling sloppy sentences.

Bill Howe's Coursera course on data science is the most useful one I've tried on the subject. The lectures might charitably be called "minimum viable" from a production perspective, but Howe has obviously thought carefully about how to teach the material, and the viewer frequently benefits from the breadth of his knowledge of the subject. He frequently discusses how a subject or tool fits in to the whole current landscape: whether there's something better for a given task, what prompted its development, what misconceptions might be causing people to misuse or misunderstand it, and so on. Also, the exercises are much better than average, and well worth doing.

WSOP 2012: indirect trip report

I enjoy writing trip reports of my time in Las Vegas. I never got around to finishing my writeup when I went out to play the World Series of Poker last year, though. The notes I took are pretty funny when I look back at them now:

"Bally's 7/11 Freedomfest"

"Jerry's Art-a-rama"

Those were events listed on a big board at Bally's--possibly the schedules for various of the hotel conference rooms.

"Woman as if shushing spilled water."

This, if I recall correctly, comes from video poker at Bally's. (I learned enough VP so that I can get coffee more cheaply by playing VP and getting it comped than by buying it directly.) A very drunk woman spilled a glass of water on the bar and her first reaction appeared to be to shush the water.

"Tanning pseudoscience."

No idea.

"Paul Mullins--rugby charity poker?"

A note from when I was scouting my day 2 table draw.

"Old-school VP font."

I thought it was interesting how VP machines quite often have a retro aesthetic. (It seems that I used my VP/coffee intervals as times for serious reflection and note-taking.)

"Tiny stakes at Sahara--$40 + $2 phone. Open-air tournament at O'Shea's. Late night at Aladdin. Off-menu cheap steak @ San Remo. 20-40 LHE @ Mirage when that was a thing. Caesars and Wynn when they were supposed to be the center of the poker world. Old room near nightclub at MGM. Upstairs @ Foxwoods where you could play 5-10 LHE w/ full kill & watch the gathering snow go red in a CT sunset."

Looks like I was recalling poker-related experiences that are now, in various senses, no longer possible. Seems like the sort of thing I would write during a VP session.

"Scooter W bath"

During one of the first bathroom breaks of the Main Event, I was exiting the men's room and saw a big guy on a scooter speeding out of the women's room. Another bystander and I made eye contact. I furrowed my brow to express confusion. He said: "what the fuck?"

"33 unbluff image / r UTG 525 euro call SD call / 832r bet 850 call call / Q bdfd bet 1750 fold r 6500 / riv 2o ch bet 11k of 20k fold."

The record of a hand I played on day 1. The first-ever strategy hand of the Thinking Poker Podcast. I raised under-the-gun with 33 and what I believed to be an unbluffable image. A European pro called and so did a guy in a San Diego Padres cap. The flop came 832 rainbow; I bet 850 and they both called. The turn was a Q, putting up a backdoor flush draw. I bet 1750, the European pro folded, and the other opponent raised to 6500. I called. The river came an offsuit deuce. I checked, the opponent bet 11000 of his remaining 20000, and I folded.

"Gareth, killer impersonation, you know it's accurate w/o having ever seen the target."

I met Gareth Chantler for the first time. We had Indian food on a dinner break. I was impressed at his impersonation of an acquaintance of his and it prompted some reflections on epistemology and authenticity.

"Appleman quote?"

Probably this refers to something in The Biggest Game in Town, my favorite nonfiction nonstrategy book about poker: if I recall correctly, Mickey Appleman is quoted as saying that poker would not be worth playing (for him) if it were not a spiritual endeavor. Unsure what prompted me to recall this.

"Plays poker as expression of his moral outlook. Our Commitment to Excellence. O/U 2.5, # of companies in which he is a big [illegible] decisionmaker."

My impression of an opponent at a table I was moved to partway through Day Two of the Main Event.

"I trade, on avg., 5 to 600K a day. I'm what they call a swing trader. --Giants"

On Day One I played with a guy in a Giants uniform who said this. I believed and still believe it's true.

"LAD cap gesturing w/ beer, spilled on ~50yo woman."

A guy in a Dodgers cap couldn't resist talking with his hands, I guess. I remember this being really funny.

"Pushing guy raises to 1100 of 8k, next 3b to 2325 (!!!) He prob has a hand but I can't fold AK (reasons). Call button, raiser calls. Flop QJx hh. Ch, bet 6k (!!), I fold (I have Ah), fold. He show QQQ."

A hand from day 1. Someone obviously trying to gain chips raised to 1100. The next player to act reraised to 2325, and I thought the bet sizing was very strong, but I also thought that AK and position was too much to fold. Not wanting to four-bet, I called on the button. The original raiser also called. The flop came QJx with two hearts. It checked to the guy who'd made it 2325, who bet 6k, which is a very big bet for this situation. I folded even though I had two overcards, a gutshot straight draw, and the Ace of hearts. The other guy also folded and the bettor showed that he'd flopped top set.

"'I'm like, 'Excuse me. I'm in business school. I'm a real person.' (Aria buffet, unironic.)"

Overheard. The Aria buffet is quite good, FWIW.

"Battle rap?"

This was happening near the escalators next to Bally's on the Strip, right where Elvis impersonators often hang out. Two guys with a boombox and a cheap microphone. One was quite good, the other terrible.

"Nose moving chips around b/c getting head massage."

This was what an opponent of mine was doing in a cash game. I was equal parts amused and grossed out. This note was written on an Aria napkin, which makes me think this was a 2-5 NL game.

"AK r, Ali rr 2600, call -- K52r / 2 / 9 -- 3k/6k/11.5k AQs"

A hand where I called down a bluff. A woman named Allison reraised my early-position raise and then fired all three streets on a K52/2/9 board with no flush draws possible. I checked and called all the way down. This was uncomfortable: that kind of pressure represents that one pair is no good. But I thought she'd be prone to pot-control or bet less with a hand like AA, and I thought she wouldn't reraise with 55 or 22, and I had a blocker to KK, and anyway she was giving off a lot of aggressive signals, which often means weakness. Also from scouting her on Twitter I suspected that she hadn't played this tournament (or at least made it to Day Two, when this hand took place) before--and such players are often over-prone to chase after money they've invested in a pot by launching bad bluffs. I decided to call with my AK and she was bluffing with AQ that had totally whiffed.

Review: Liars and Outliers

Without ever slamming the book down in disgust, I seem to have stopped reading Bruce Schneier's new book halfway through. He's a smart guy and I love his blog. But this book is--by design--short on security details. Instead it is concerned to give a unified theory of trust, society, and indeed human motivation.

As social theorists go, Schneier is far from the worst. He's an insightful guy and a clear writer, if a bit too ready to deploy Humean assumptions without argument. My guess, however, is that Schneier's best readers enjoy his work for its attention to detail and focus on the smaller-scale mechanisms of security. Indeed, it is Schneier himself who largely trained me into wanting, in a treatise on security, less grand theorizing and more discussion of smaller contexts.

I regret neither buying the book nor failing to finish it.