The Tragedy of Cut Features

Having to cut features from a product can feel like this:

This week I made the difficult decision to cut Influence’s bill annotation feature, which would have allowed people to mark up the full text of bills and share their notes with others through social media.

The nail in the coffin came on Tuesday: sometime over the weekend the Government Printing Office added bill text search to its robots.txt file, preventing YQL from querying the data.

Image

The final “Disallow” line was added over the weekend, preventing us from using YQL to query for the full text of congressional bills

Without YQL our app would encounter CORS issues trying to query the data directly. We could have set up a server to get the text or bulk downloaded all bills and served them ourselves, but this problem popped up on our last day of feature development. I had four hours to scrap or salvage the feature.

I decided to cut Annotator. It was a heavy library that required jQuery Migrate to work with jQuery 1.9 and above, which threw constant warnings that Annotator’s own code would need to be migrated to the new jQuery functions. Besides that, Annotator doesn’t play nicely with dynamic content or AngularJS — two things that were absolutely non-negotiable for us at any stage in development.

Cutting that feature allowed me to refocus on delivering full-text bills to our users. We had discussed loading the bill text in an iframe as a last resort option, but I wanted to make it look a little nicer. Within a few minutes I had converted the bill feature to show the full text of bills as a PDF inside of an embedded Google Docs PDF viewer.

Image

New Bill View after cutting features

It was a disappointing technical setback to scrap annotations, but it was out of our control. Who could have known that the federal government would block our queries? It wasn’t worth getting frustrated when we had so many other cool features to finish, and we were able to replicate most of the feature’s original functionality with a simple fix.

We’re getting excited to wrap active development at the end of this week. Stay tuned for our launch!

Influence Update: Word Search

I’ve been having a lot of fun with a new site feature I’ve been building: select a legislator and see their most frequently used words or enter a word and see how often it has been said in Congress since 1996. Some surprising finds:

PRIORITIES: “war on drugs” (1751) vs. “war on poverty” (329)

PUBLIC HEALTH: “marijuana” (3990) vs. “heroin” (3305) vs. “cocaine” (4634) vs. “alcoholism” (727)

BIG ISSUES: “raise taxes” (5346) vs. “cut taxes” (3408) vs. “football” (12,516)

SWEAR WORDS: “Shit” has been uttered 7 times in Congress. That’s still less than ass (35) and damn (245).

We hope to have a deployed version next week that everyone can play around with, but until then here’s a preview shot of basic functionality:

Image

So You Want to Write CoffeeScript?

Our open-source project Inspiration is written in CoffeeScript, which has given me some time to consider what I like and don’t like about the language.

In general, I think it’s important for software engineers to be flexible and occasionally agnostic about the particulars of different stacks. I try to be a team player who can adapt quickly and start contributing to others’ projects, which becomes really difficult if you’re putting your foot down about which language, library, or framework is being used.

I can work with CoffeeScript, but I’m having trouble imagining a scenario in which I would pick CoffeeScript over JavaScript if I could choose either one for a project.

I just finished watching a presentation on Grunt to a group of JavaScript beginners, which featured a CoffeeScript app that compiles automatically using Grunt. “Isn’t that amazing!” the students said.

Amazing… and kind of painful! It makes my spleen hurt to see how much group effort has gone into setting up Grunt for compiling CoffeeScript instead of writing features and shipping code. Hours of work were squandered on tasks that would have no impact on the end user’s experience with our app.

Yes, CoffeeScript looks very pretty most of the time, but why would you want to add another layer of complexity to an already-complex project? And why would you want to save time on writing lines of JavaScript only to waste that time compiling over and over from CoffeeScript back into JavaScript? And even more than that, why would you want to put yourself through debugging CoffeeScript converted into JavaScript, a bizarro-world file that you didn’t even write?

Here’s an analogy: I want to write a novel but it really bothers me that English has ugly irregular verbs. Even though I’m an excellent English speaker and writer — and it maybe takes me 15 seconds to Google whether I should use “lead” or “led” in my sentence — I decide to spend a few months learning Spanish so I can save those few seconds every couple pages by using Spanish’s beautifully regular verbs.

Because my audience doesn’t speak Spanish, I need to pay someone to translate my Spanish back into English. While the translator is compiling my novel, I’m going to hop on The Onion and kill my productivity for a while. But I’m so glad I saved a few seconds not having to Google for irregular verbs!

My translator finally finishes my novel and hands it back to me. But guess what? I used the wrong verb in Spanish and now it’s not translating correctly into English. I have to look at the error in English, find the corresponding section in the Spanish I wrote, and fix the error — in Spanish. Then I get to wait again for my translator to recompile my novel.

When I was learning JavaScript, a mentor made fun of the programmers who cache things like “var _len = myArray.length” so their for loops are 2 milliseconds faster. It’s like trying to wax your grandma’s Toyota Camry so the decreased wind resistance helps her get to the grocery store a split second faster. I get compiling in languages like C++ or Java — it makes a meaningful speed difference that actually improves performance and lets you do things you can’t pull off in an interpreted language. My problem is not with compiling — it’s with compiling when it’s unnecessary and may actually cost you in lost productivity and human error.

At a certain point, trying to save a little bit of time can end up costing you enormous amounts of time — learning CoffeeScript syntax, setting up Grunt, waiting for your CoffeeScript to compile while you get knocked out of the zone, debugging your CoffeeScript. I think “a certain point” probably includes anyone whose coding bottleneck is knowledge about how to solve a problem, not the speed with which you can get your idea about how to solve a problem written into code and tested.

I’d love if someone could prove me wrong — convince me to love CoffeeScript! But until then, I’m happy to spend a few seconds adding parentheses, curly braces, and semicolons. I would gladly trade the temporary slowdown for huge gains in productivity and major reductions in code/system complexity.

Influence: Exploring Aaron Swartz’s “Watchdog”

I’m working on an open-source project with a group of friends that visualizes campaign contribution data. I follow politics pretty closely so I’ve had the frustrating experience of trying to parse mounds of dense, poorly-organized data trying to find something meaningful. My friends were frustrated by the same thing, so we’ve teamed up on Influence: a web app to explore politicians’ campaign contributors and their influence on legislation.

We’ve been looking at Aaron Swartz’s codebase from his Watchdog project, which has been abandoned since his death in January. While we’re creating a new project with different goals, his code has been helpful in framing our objectives and selecting the most relevant data to display about politicians.

There are a ton of websites that report on public data, but they are all crippled by the same problem: because there is so much data out there, these sites seem to believe they need to make it ALL available in full detail. That’s a really overwhelming experience. Follow The Money is a good example:

Image

Data, data everywhere… but why should you care?

It’s like I jumped into an Excel spreadsheet and time-traveled back to 2003. Yuck. OpenSecrets makes their data a bit more curated and visually appealing, but we’re still stuck in PHPLand in the mid-2000s:

Image

Getting better! But still dense and intimidating for non-pros

Our team has tried to focus on the core user experience, dropping exhaustively comprehensive data in favor of visualizing a smaller, curated set of key facts. We’re using D3.js to animate data and create sleek graphics. We also believe that most users want to share what they find with other people, so we’re making every graphic easily shareable through social media.

Finally, we’re making the full text of bills available so users can read, comment, and share their annotations with friends and other users. Our MVP looks like this:

Image

Select a bill from the dropdown menu and the title/text will change almost instantly, thanks to bidirectional data binding from AngularJS

On load the page makes an HTTP request to the Sunlight Foundation’s API for a list of the twenty most recent bills. The response object contains URLs for the full bill text provided by the Government Printing Office. Selecting a specific bill from the dropdown menu — created using AngularJS’s ngOptions attribute — sends a second HTTP request to the GPO for a bill in plain-text formatting. We do some formatting ourselves — removing underscores and hard-coded carriage returns that Angular escapes and renders as actual text — then the text is dynamically inserted into the page.
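The clean-up step could be sketched roughly as follows. The function name cleanBillText and the exact patterns are assumptions for illustration only; the real GPO text may need different rules.

```javascript
// Hypothetical helper: tidy plain-text bill content before inserting it into the page.
// The patterns below are assumptions, not the project's actual rules.
function cleanBillText(rawText) {
  return rawText
    .replace(/\r\n?/g, '\n')    // normalize hard-coded carriage returns to plain newlines
    .replace(/_{2,}/g, '')      // drop runs of underscores used as rules or underlining
    .replace(/\n{3,}/g, '\n\n') // collapse long runs of blank lines
    .trim();
}
```

Keeping the clean-up in one small pure function makes it easy to adjust the rules as odd formatting turns up in other bills.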

From here, our next milestone is to add annotation functionality. I really like Annotator, which provides a highly-scalable solution for annotating text. You can host the files yourself or use their CDN, and you can store the annotations on your own database or use their free storage. We’ll probably start with the CDN/free storage option as we’re building our MVP, then host the files ourselves and save them to our PostgreSQL database.

Image

Some early design mockups for our legislator view. The bottom-right pie chart will be replaced with a list of bills, which leads to the bill view described in this post.

Eventually we’ll have a legislator view that shows top contributors by name and industry, as well as a list of bills authored or cosponsored by that legislator that affect those industries. Clicking on one of those bills will take the user to the bill view described above.

If you’d like to check out our code or fork it yourself, visit the Influence repo on GitHub. You can also send me questions via Twitter at @eastbayjake. I’m happy to answer questions and would love to see more people involved in government transparency!

Case Study: Integrating Your Website into a Facebook App

I recently did some contract work for Wedgies, a Las Vegas startup that’s trying to make online polling a little less painful. The company has a great sense of humor — did the name give it away? — and their product adds tangible value to users’ social media experience. USA Today is using Wedgies to poll readers about their stories, while non-enterprise users are exploring less serious topics.

Wedgies’ cofounder Jimmy Jacobson graciously agreed to let me do a post about my experience working with their codebase. The engineering problem was straightforward: Wedgies focuses on social media polling, but their Facebook sharing functionality only generated an external link back to the Wedgies website that users could post onto their profiles. But external links have some of the lowest engagement levels on Facebook while photos and apps perform significantly better. Wedgies wanted to integrate their existing website functionality into a Facebook Canvas app so users could enjoy their polling services without leaving Facebook.

Facebook Canvas apps are simply iframes inside of a Facebook page. They point to a URL on an external website and load the content inside of a Facebook window. Facebook Canvas apps don’t involve uploading any code or content to Facebook: everything is hosted by you but displayed within a Facebook window. You can create a simple Facebook Canvas app in 30 seconds by setting the Canvas URL field to any HTTPS website. (Note: If you only have an HTTP server, you can only use the app in “sandbox mode.”)

Image

Enter your website’s URL into the Canvas URL field

Image

We’ve got a fully functional Facebook Canvas app!

In the example above, note that I don’t have access to FitBit’s servers — I didn’t write any server code, I didn’t modify their page, I’m simply displaying their website’s content within an iframe running inside of a Facebook page.

The engineering challenge came from Wedgies’ desire to have some pages not display the blue header and contact/about/blog footer from their webpage. I knew we would have to add some dynamic templating to their existing Mustache templates. This would be really easy if the iframe could read the parent page’s window.location.href, but a cross-origin iframe can’t! We can’t just use a regex to check whether the surrounding page’s URL includes apps.facebook.com, because that value is not accessible from inside the iframe.

I dealt with this by creating a new /fb route in Express that duplicated the functionality of the website’s /question route. The Canvas URL would point to wedgies.com/fb — you can try it for yourself and see that it’s exactly the same content that loads inside of an iframe in Wedgies’ Facebook app.

Now we can check to see if the iframe’s URL ends with /fb (catching all instances where a user navigates directly to the page) or the previous URL ended with /fb (catching users who click through to display the results).

Image

The Wedgies website, not yet integrated into a Facebook app

A quick note about Wedgies’ architecture: it’s a stateless web app that acts a lot like a traditional webpage. The “next” button is a URL link to another page in the app with a route that loads a random question. It throws out the page’s state and reloads the content again. Sessions are being tracked through Twitter or Facebook login, but almost all content is being rendered server-side and not dynamically altered client-side.

We needed to make the Next button load wedgies.com/fb instead of wedgies.com/question, but only if the page was being shown inside the Facebook app. Changing it server-side would be an odd choice because it’s difficult (impossible?) to distinguish between an iframe running inside of a Facebook app and any other browser loading the content inside of a full window. I chose to dynamically alter the Next button’s URL on the client side if the window’s URL or previous URL was wedgies.com/fb. That would catch cases where the user loaded a random question but wanted to skip to the next one (current URL is /fb) or where the user answered a question, saw the results, and had so much fun that they wanted to load a new question (previous URL is /fb). This was done with a few lines of jQuery that check the current and previous URLs to see if they were wedgies.com/fb, then update the DOM element if either case was true.
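A minimal sketch of that check, assuming helper names (endsWithFb, nextButtonPath) and a '#next' selector that are mine rather than Wedgies’ actual code:

```javascript
// Returns true when a URL's path ends with /fb
// (ignoring a trailing slash, query string, or hash).
function endsWithFb(url) {
  if (!url) return false;
  var path = url.split(/[?#]/)[0].replace(/\/+$/, '');
  return /\/fb$/.test(path);
}

// The Next button should stay inside the Canvas app if either URL is the /fb route.
function nextButtonPath(currentUrl, previousUrl) {
  return endsWithFb(currentUrl) || endsWithFb(previousUrl) ? '/fb' : '/question';
}

// In the browser, the jQuery wiring might look like this
// ('#next' is a hypothetical selector):
//   $('#next').attr('href', nextButtonPath(window.location.href, document.referrer));
```

Splitting the decision from the DOM update keeps the URL logic testable without a browser.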

The easy step was to add some dynamic templating in Mustache. A simple example:

{{^fb}}

  {{> header}}

{{/fb}}

The code above uses a Facebook parameter (fb) that’s passed into the server’s template rendering function. I wrote a ternary that checks whether the pathname ends with /fb, passing a true or false value to the render function. The {{^fb}} section wraps the header template in an inverted conditional: unless fb is true, render the header. The page will now omit the header whenever the previous or current URL ends with /fb, and render it everywhere else.
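On the server, the value handed to the template might be built along these lines. This is only a sketch: buildViewData, the 'question' view name, and the route wiring in the comments are assumptions, not Wedgies’ code, and it covers only the current-pathname check.

```javascript
// Hypothetical helper: builds the data object passed to the Mustache template.
function buildViewData(pathname, question) {
  // The ternary described above: true only when the path ends with /fb.
  var isFacebook = /\/fb\/?$/.test(pathname) ? true : false;
  return { fb: isFacebook, question: question };
}

// In an Express route this might be used as follows
// (the view name and the question variable are assumptions):
//   app.get('/fb', function (req, res) {
//     res.render('question', buildViewData(req.path, question));
//   });
```

With fb set to true, the {{^fb}} section is skipped and the header partial is never rendered.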

Image

The fully-deployed Facebook Canvas app

The Facebook app is now live. I had a great time working on the project and it’s really cool to see clients like USA Today using the code to ask their readers questions. It was also great for improving my Express skill set and giving me more exposure to the MEAN stack.

Thanks to Jimmy and Porter for letting me write about the experience! Go forth and give all your friends a Wedgie, then share it on Facebook. If you’d like advice or have follow-up questions about creating a Facebook Canvas app, shoot me a tweet.

PS – Wedgies has the weirdest and most awesome domain name acquisition story ever. (It involves an ex-con and Ron Paul supporters. Go read it.)

HTML5 Form Validation

I spend a few nights each week tutoring new JavaScript learners and students in introductory computer science courses. One of my students showed me the following prompt this evening:

Use JavaScript to perform client-side validation on at least three fields with the following specifications: (a) Validate a field for minimum length, (b) Validate another field for both minimum length and numeric or alpha characters only, (c) Validate an email address that must be in a valid email format. The form should not be permitted to submit unless all fields pass validation.

I puzzled over this for a minute. It’s easy to validate forms with jQuery, and it’s even easier to do it with HTML5 — but this college TA explicitly asked for JavaScript. Could that include HTML5? After discussing the prompt for a few minutes, we decided to do HTML5 form validation. (It’s not JavaScript, but it’s done in-browser. I wasn’t expecting a Ph.D. candidate in computer science to have significant web development exposure… easy mistake.)

I really like this Dive Into HTML5 guide to form validation. It covers the most common uses in a beginner-friendly format, emphasizing how simple HTML5 has made validating forms.

Want to make a field required and only accept numeric inputs, such as an age or order number? It’s easy:

<input type="number" required>

Want to validate an email address? Not only is it a snap, but it’s 100% backward compatible! Older browsers like IE6 will treat this input as type="text":

<input type="email">

The Dive Into HTML5 guide helped my student upgrade from lifeless, text-only forms to interactive self-validating forms, all with a bit of HTML5. Best of all, we were able to cut his code from 60 lines down to just 15!
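For comparison, hand-written JavaScript checks for the three requirements in the prompt might look roughly like the sketch below. The function names and the email pattern are assumptions (the student’s original version isn’t shown here), and the pattern is deliberately loose rather than a full email grammar.

```javascript
// (a) Minimum length check.
function hasMinLength(value, min) {
  return typeof value === 'string' && value.length >= min;
}

// (b) Minimum length plus letters and digits only.
function isAlphanumericWithMinLength(value, min) {
  return hasMinLength(value, min) && /^[a-z0-9]+$/i.test(value);
}

// (c) Loose email shape: something@something.something, with no spaces.
function looksLikeEmail(value) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// The form may submit only when every field passes.
function formIsValid(fields) {
  return hasMinLength(fields.name, 2) &&
         isAlphanumericWithMinLength(fields.username, 4) &&
         looksLikeEmail(fields.email);
}
```

Even this stripped-down version needs wiring to the form’s submit event and error messages, which is the work the HTML5 attributes do for free.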

Make Your Own Mashup / Ad Lib Generator

I love making people laugh. The first program I ever wrote was an Ayn Rand chatbot that would “talk” to my friends. The algorithm waits a few seconds after you submit a response, then replies with a real Ayn Rand quote appended with “, you (adjective) (noun)!” Try it out!

These generators are fun and incredibly simple to make. This week I came up with an idea for a “Tech Startup Idea Generator” and had it coded and online within an hour. To create your own, follow the steps below or fork my repo on GitHub!

I started by creating four arrays: company prefixes, company suffixes, real internet startups, and users. It looks like this:

Image

The generator will follow a simple format: a product idea (prefix+suffix) followed by “It’s (company) for (customer)”. We can get the data out of the arrays and into our HTML using jQuery’s .html() function:

Image

Each line does basically the same thing: for the HTML element with a given id, set the HTML value of that tag to a randomly-chosen item from the corresponding array. The random number is generated using JavaScript’s Math library: Math.random() creates a random decimal from 0 up to (but not including) 1, which when multiplied by the array’s length gives us a number from 0 up to (but not including) the array’s length. Math.floor() then rounds the number down to the nearest integer, which produces a number from 0 to length-1, exactly what we want for accessing an element from an array with an index of 0 to length-1. (Why can’t we use the Math.ceil() function? Because we would get numbers from 1 to array.length, which would mean almost never using the array[0] word at the beginning of each array and occasionally getting undefined by trying to access an element one place off the end of the array.)
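The random selection can be sketched without jQuery as below. The array contents are placeholders I made up, and randomItem and generatePitch are hypothetical names rather than the ones in the repo.

```javascript
// Pick a random element: Math.random() is in [0, 1), so the floored
// product is always an integer from 0 to items.length - 1.
function randomItem(items) {
  return items[Math.floor(Math.random() * items.length)];
}

// Placeholder data; the original arrays are not reproduced here.
var prefixes = ['Cloud', 'Pet', 'Snack'];
var suffixes = ['ly', 'ify', 'Hub'];
var companies = ['Airbnb', 'Uber', 'Dropbox'];
var customers = ['dog walkers', 'grandparents', 'dentists'];

// Build one product name and one pitch line.
function generatePitch() {
  return {
    name: randomItem(prefixes) + randomItem(suffixes),
    pitch: "It's " + randomItem(companies) + ' for ' + randomItem(customers)
  };
}
```

In the page itself, each string would then be written into its element with jQuery’s .html() function.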

Now let’s use jQuery to call these functions when the page loads, then again whenever a button on the page is clicked:

Image

If you are new to jQuery, let’s break down what each line does. Line 14 tells jQuery not to run these functions until the page has finished loading. (We would get an error if we tried to change an HTML element on our page before the element has even been created!) Line 15 runs the functions from above. Line 16 makes our button a jQuery object that does something when clicked, and Line 17 tells it what to do: run the random array selection again when the button is pushed. Why go through the work of creating a generate function instead of just repeating the code twice? This code follows DRY (Don’t Repeat Yourself) web development practice. If someday I change my HTML id to #user instead of #customer, I now only have to change it in one place rather than two. That’s going to make this code easier to maintain and harder to break.

We’ve got all of the JavaScript/jQuery for the generator in only 19 lines of code! What does the HTML look like? Pretty simple:

Image

We’ve got a headline, a company line, and a product pitch line. The HTML in this file is purely semantic, laying out where the dynamically-generated elements from the JavaScript file will be placed. Again, if you are new to JavaScript and jQuery, two things to notice: this file requires jQuery, so a script tag linking to it must be included, as well as another script tag linking to the actual JavaScript file where the arrays/functions/etc. from above are stored. If either of these tags is missing, this file will not run! (Another note: put script tags at the bottom of the body so they load after the page’s HTML elements. A script in the head runs before the elements it modifies exist, so any jQuery code that isn’t wrapped in a document-ready handler will silently do nothing. Watch out!)

Finally, let’s add a tiny bit of CSS so this isn’t just plain black text crammed into the upper left corner of the page:

Image

I normally include CSS as a separate file using a link tag, but the amount of styling is small enough that it’s not worth incurring an extra HTTP request for another file — it would be much faster (and just as easily maintainable) to keep it in the HTML file for an app this small.

And there you have it! A silly startup pitch generator that uses only 19 lines of JavaScript, 12 lines of HTML, and 16 lines of CSS. And now we get fun ideas like this:

Image

You can play with my version here. Please feel free to fork my repo and tweet me your results if you try your own version!

King of N-Queens

This problem is a computer science classic, and I was totally excited to tackle it using JavaScript. For the uninitiated, n-Queens asks how many different ways n queens can be arranged on a board of size n-by-n without any of the queens threatening any other queens. The question was first posed in 1848, the first solutions for an 8-by-8 board were published in 1850, and the problem endures today with a race toward ever-higher computations of n. The current record — calculating for 26 queens on a 26-by-26 board — might be broken with an algorithm from Hack Reactor alumni two classes ahead of us!

Here is a single solution for n=8, one of 92 possible solutions:

Image

One of the first things students notice is that correct solutions require each queen to occupy a different row and column, and every row and column needs to have a queen for the solution to be valid. (Think about it: if queens can attack along rows or columns, then each queen needs to have her own row and column, and every row/column on the board needs a queen if you’re going to get to n!)

This realization led to our team’s first breakthrough in implementing a solution. Placing a queen in a single square in the first row eliminates a vast swath of options, including the entire top row. You can continue building solutions by moving to the next row, where a tree of solutions begins to emerge: each space where you can still place a queen after placing one in the top row is now a “child” solution to the problem.

Realizing that the solutions resembled a tree caused our team to think about recursive traversal of trees. It seems like a lot of engineers are hesitant to use recursive solutions when there are ways to create iterative solutions, but I find that odd. Using a recursive solution forces engineers to scale down the solution to discrete, repeated tasks that can be well-understood.

In the case of n-Queens, we repeated one childishly simple process over and over: place a queen, look at the next row, place a queen in the first available spot, look at the next row, place a queen in the first available spot, look at the next row… you get it. Here’s a snippet of our recursive solution:

Image

Our remaining problem was about remembering which solutions we’d tried. We thought about copying the entire board and reversing steps when we hit an invalid solution (nowhere to place a queen on the next row) or storing the solutions we’d tried in memory.

Eventually we stumbled onto the concept of “backtracking” — the fundamental principle that most computer science professors are trying to teach when they assign this problem to students. We needed to iterate across each of the n squares at the top of the chess board, but each branch of solutions stemming from that top square had invalid solutions that could be “pruned” from the tree and considered no further. If we hit a row where we couldn’t place a queen, we needed to backtrack to the last row and try placing a queen at the next available square. If there were no more available, we needed to go back another row and try the next available square… and so on.

If our recursive function crawled to the bottom of the tree and placed 8 queens, we needed to increment a counter that recorded how many solutions we had found. Here’s a table of the number of solutions for each value n:
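The backtracking-and-counting idea can be sketched as follows. This is not our team’s original code: the function name and the lookup tables for columns and diagonals are my own choices for illustration.

```javascript
// Count n-Queens solutions by placing one queen per row and backtracking.
function countNQueens(n) {
  var count = 0;
  var usedCols = {};   // columns that already hold a queen
  var usedMajor = {};  // diagonals where row - col is constant
  var usedMinor = {};  // diagonals where row + col is constant

  function placeInRow(row) {
    if (row === n) {   // every row has a queen: one full solution
      count++;
      return;
    }
    for (var col = 0; col < n; col++) {
      var major = row - col;
      var minor = row + col;
      if (usedCols[col] || usedMajor[major] || usedMinor[minor]) continue; // pruned
      usedCols[col] = usedMajor[major] = usedMinor[minor] = true;  // place the queen
      placeInRow(row + 1);
      usedCols[col] = usedMajor[major] = usedMinor[minor] = false; // backtrack
    }
  }

  placeInRow(0);
  return count;
}
```

Calling countNQueens(8) returns 92, matching the count quoted earlier. If a row has no free square, the loop simply finishes and control returns to the previous row, which is the backtracking step.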

Image

Our function was able to find 92 solutions to 8-queens in less than 1 second. We started to experience lags at n=11 (4 seconds), n=12 (15 seconds), and n=13 (>90 seconds).

One of our classmates tried to refactor his code to use bit-shifting to evaluate attacks along diagonals, easily the most time-consuming step of our code. He claimed that his implementation with bit-shifting could find the solutions for n=15 (over 2.2 million solutions) in one second! His primary motivation, however, was that the refactored code, using bit-shifting and recursive search, can fit into a tweet. Here’s a tweet-sized implementation from one of the HR guys behind the world-record n-Queens solution:

Image
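The tweet is an image and isn’t reproduced here. As a rough, longer illustration of the bit-shifting idea (a common formulation, not the code from the tweet or my classmate’s implementation), each row’s attacked squares can be tracked as three bitmasks:

```javascript
// Count n-Queens solutions with bitmasks (valid for n up to 30 with 32-bit integers).
function countQueensBitwise(n) {
  var all = (1 << n) - 1; // n low bits set: one bit per column
  var count = 0;

  function place(cols, leftDiags, rightDiags) {
    if (cols === all) { // every column is occupied, so every row has a queen
      count++;
      return;
    }
    // Bits set in `open` are columns not attacked in the current row.
    var open = ~(cols | leftDiags | rightDiags) & all;
    while (open) {
      var bit = open & -open; // lowest open column
      open -= bit;
      place(
        cols | bit,
        ((leftDiags | bit) << 1) & all, // diagonal attacks shift one column left per row
        (rightDiags | bit) >> 1         // and one column right per row
      );
    }
  }

  place(0, 0, 0);
  return count;
}
```

Shifting the diagonal masks once per row replaces the per-square diagonal scans, which is where the speed-up comes from.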

I really enjoyed n-Queens. It feels amazing to solve a challenging problem like this in only two days — especially one that computer science students might take a few weeks in college to solve. Hack Reactor pushes students to tackle big problems and get used to pushing through the feeling that you don’t know enough about a subject to get started. If you’re willing to approach unfamiliarity with an open and curious mind, you can do big things quickly.

Onward to web security and foiling XSS/CSRF attacks!

D3 the Easy Way

We’ve started to dive into using the D3 library for animations, which culminated in a clone of this game that relies heavily on D3 manipulation of DOM nodes. As I was trying to find my way around this jQuery-style library, I got my bearings with this excellent tutorial. There is a lot of bad documentation out there about D3, but its compact size and powerful functions make animating accessible and immediately impressive.

What’s so great about this tutorial? Lots of D3 material jumps in at a conceptual level then immediately goes deep into the weeds on implementation. Experienced software engineers talk about D3’s “steep learning curve” — you quickly go from the enter-update-exit model for rendering graphics to nitty-gritty details about appending data to elements and working with SVG.

If you want to bridge that transition, the tutorial is a good place to start. Learning how to create an object, have it display on screen, then make it do some simple motions is about 60% of the learning curve on D3. (One trip-up point: getting straight that the SVG’s dimensions are the “canvas” on which shapes’ dimensions are rendered. Making the SVG and rendered elements of similar sizes may cut off parts of shapes or obscure them entirely!) Once those stumbling blocks are out of the way, it’s easier to engage with implementation details in a thoughtful, sophisticated way.
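The trip-up about SVG dimensions can be made concrete with a small check. The helper name circleFitsInSvg is hypothetical, and the D3 calls in the comments are only a sketch of how the drawing might be set up in a browser.

```javascript
// Returns true when a circle lies entirely inside an SVG of the given size.
// Anything outside the SVG's width and height is clipped and never shown.
function circleFitsInSvg(svgWidth, svgHeight, circle) {
  return circle.cx - circle.r >= 0 &&
         circle.cy - circle.r >= 0 &&
         circle.cx + circle.r <= svgWidth &&
         circle.cy + circle.r <= svgHeight;
}

// With D3 loaded in a browser, the drawing itself might look like:
//   var svg = d3.select('body').append('svg')
//     .attr('width', 200).attr('height', 200);
//   svg.append('circle')
//     .attr('cx', 100).attr('cy', 100).attr('r', 50);
```

A circle of radius 50 centered in a 200-by-200 SVG fits comfortably; the same circle centered at (90, 50) in a 100-by-100 SVG would be cut off on two sides.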