RSS Pick: Dion Almaer

dion almaer photo

Photo by marcosfernandez

I think that the RSS reader is such a fantastic invention. It lets me monitor many bloggers and news sites, and see new content.  This lets you have an eye on lots of writers, including some that haven’t written for a long time.  I’m going to be highlighting blogs that I follow, one per month.

The first is Dion Almaer’s, who, unfortunately, has moved most of his writing to Medium.  But Dion is a great technologist.  He currently is employed at WalmartLabs Mobile.  He’s written such gems as:

Your coding voice:

When people ask me about Java and why I don’t often write applications in it, my answer is not that I think “Java sucks”. I think the JVM is amazing technology, and there are a ton of fantastic APIs. Using Java is a great answer for many situations. However, the least amount of fun that I have had programming has been when using the Java language. It isn’t just that it feels frustratingly verbose, although that is part of it.

and Browsers are Finally Catching Up (in 2009):

But, the browsers are finally changing. The new crop come with technologies that show that the browser vendors are thinking about building a platform for desktop quality applications. The Chrome comic book was full of this.

Remember the Chrome Comic Book?

Dion, thanks for sharing your knowledge, please resurrect your blog!  (Dion, I know this is an old photo–feel free to send me a new one and I’ll update this post.)


Avoiding “Module ‘XXX’ is not available” Errors in AngularJS

headbang photo

Photo by Nirazilla

As I continue to build applications with AngularJS, I see how strong the ecosystem is.  While there aren’t quite as many plugins for Angular as there are for JQuery, many of the JQuery plugins have been wrapped up as Angular directives.

One of the issues I banged my head against a couple of times when using a new directive was how many places I had to modify code, and the totally non intuitive error messages that were displayed.

Here are the places you need to modify if you want to use a directive:

  • a module definition (your app or a controller).  You need to add the directive to the list of dependencies: var controllers = angular.module("controllers", [ "localytics.directives",'ngModal' ]);.
  • the index.html file. You need to add links to the javascript files and css that the directive uses. If you are using bower to manage these components, you can find the files under the directory managed by bower.
  • the karma.conf.js file. This file sets up the environment for your unit tests. You want to set up the files attribute to point to the javascript files you added to the index.html page above.

If you don’t add the correct module name to your dependency list, misspell it, or don’t add the javascript to the index.html or your karma configuration file, you will see this error message in your console:

Uncaught Error: [$injector:modulerr] Failed to instantiate module app due to:
Error: [$injector:modulerr] Failed to instantiate module controllers due to:
Error: [$injector:modulerr] Failed to instantiate module ngModall due to:
Error: [$injector:nomod] Module ‘ngModall’ is not available! You either misspelled the module name or forgot to load it. If registering a module ensure that you specify the dependencies as the second argument.

http://errors.angularjs.org/1.2.25/$injector/nomod?p0=ngModall

at http://192.168.0.200:8000/app/bower_components/angular/angular.js:78:12

And the app won’t start.

Hope this helps someone else avoid some head banging.



Cassandra Dev Day Denver

data photo

Photo by justgrimes

Most of my datastore experience is with RDBMS like mysql, oracle and postgresql (though I did work with some key value stores like berkleydb back in the day). So when a full day, free intro to Cassandra was offered, I jumped on it, even though it was in Centennial. You can view the schedule, speakers and talk synopses for the day. There were two tracks, beginner and advanced. Since I didn’t know anything about Cassandra, I followed the beginner track.

First, though, it was amazing how many people were there. The two main companies behind the talk, Pearson Education and DataStax, a vendor providing a commercial, supported version of Cassandra, ended up having to provide two overflow rooms, and it was still standing room only for some of the talks. Quite a nice turnout, and I think the sponsors were pleasantly shocked. I was also surprised by the number of folks from Boulder. I happened to sit next to two folks from Westminster and Superior, and ended up having a common friend or colleague with each. Small world.

I learned a ton about Cassandra, from its internals, to its topology (the ring’s the thing) to abstractions that let you query it (CQL, which is a subset of SQL) to data modeling to using the java driver, which makes accessing Cassandra almost as easy as using JDBC. While there are some SQL concepts that appear to map fairly well to Cassandra, I put quotes around them below to remind myself of the fact that a Cassandra ‘table’ isn’t the same as a RDBMS table, ditto for ‘row’, ‘primary key’ and other important concepts.

I think the biggest takeaway for me was that Cassandra is a “write many, read once” system. Because you can only query efficiently on one or two keys, if you have multiple queries, you want to write the data multiple times in a denormalized system, one ‘table’ for each query. Because of this, Cassandra shines in use cases where you are doing a lot of inserts, have known queries, and need speed and availability (sensor data was mentioned several times).

How does this actually work? Here’s an example (as best I understand it–here are some others from people who actually have experience using this technology):

If you have click stream data, in standard apache format, and you want to be able to have it stored in a database and highly available for a few specific queries, Cassandra might be a good choice. Here’s a line of my clickstream, from my blog:

157.55.39.40 - - [15/Oct/2014:08:03:30 -0600] "GET /wordpress/archives/date/2007/07 HTTP/1.1" 200 66258 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

This is time based data, and has some valuable information, and some not so valuable information. What things might I be interested in querying on? Well, I might care about the user agent, request time, ip address, status code, or URL path requested. I probably don’t care about the HTTP method, HTTP version or the bytes served. But, for the sake of this example, let’s say my application needs to show the location of the most recent 1000 users for a given country, for a fancy widget for my website. I will use maxmind or a similar service for mapping ip addresses to country. That’s all I care about. (Yes, I know this is contrived–I had to revise this example a couple of times to make it fit with Cassandra’s usage model.) I would set up in this ‘table’ in Cassandra.


CREATE TABLE IF NOT EXISTS location (
country text,
time timestamp,
user_agent text,
status_code int,
path text,
PRIMARY KEY (text, time)
)

In this table, country is the partition key, and time is the clustering key. That means that this query: select location from location where country = 'USA' limit 1000; will be screaming fast. If I wanted to look at paths requested by time, I would not add an index to this table, but instead create another whole table, say request_path. Then my insertion code would write to both location and request_path. And then clients that wanted path information would use the specific table. Yup, denormalization is the name of the game.

This means that Cassandra has certain specific use cases, and that trying to use it as a general purpose data storage and query engine is foolish. Several presenters mentioned that Cassandra plus Apache Spark for general queries was a good solution.

Denormalization as standard operating procedure isn’t the only mind bending facet of Cassandra. Others:

  • the presenters also talked about the high availability and replication of Cassandra–you can actually configure it to be data center aware so that it automatically replicates data across different data centers.
  • For each keyspace (a set of tables, so similar to a schema), you specify how a replication factor–how many times each piece of data is stored.
  • For every read or write, you specify how many nodes must either agree or accept the data, respectively.
  • Adding nodes is the preferable way to deal with scale. You can add a node easily, and Cassandra will auto partition data and spread existing data across the new node.
  • The biggest Cassandra setup they mentioned was 75K nodes.
  • Each ‘row’ can have up to 2 billion records. If a row stretches across nodes, you’ll kill performance.
  • There’s a process called ‘compaction’ which is similar to Java’s garbage collection, and just like GC, you have to pay attention to how compaction works, because it will affect performance.

All in all, very interesting day, and I appreciated the experience. One more (interesting) tool to add to the toolbox.


Parents In Tech Interview

baby photo

Photo by paparutzi

A few months ago I was contacted by Morgan who read a comment I’d made on Hacker News about reshuffling my work life balance.  He was starting a site for parents who work in technology, and was looking to interview such people for tips on parenting.  After a flurry of emails, we finally found a time that worked for both of us and were able to skype for an half hour.

My interview is up here.  Morgan doesn’t do a ton of editing, so it is a little rough, but you get a sense of my thought process:

M: Has having a Baby changed your worldview, beliefs, or how you treat other people? How so?

 

D: Sometimes I wonder how my parents can take me seriously, given that they saw me as an infant. You put it nicely, getting some empathy, starting out as something that just cries, poops and sleeps.

Full post here.

If you are a parent who works in technology and want to chat with Morgan, let me know and I’ll do an intro.



#TBT: Precision and Accuracy in Software

I originally wrote this in Dec of 2004. I still think that having someone who can answer engineers’ questions authoritatively increases productivity (of the engineer). However, now I’d emphasize that developers need to spend some time learning their domain to gain some intuition, and truly great business software engineers will learn when to bump a question up to a business person and when their intuition can be trusted.

————————————————————-

Back in college, when I took first year physics lab, there was a section of the course that focused on teaching the difference between precision and accuracy in measurement. This distinction was crucial in experimental physics, since measurement is the bedrock of such experimentation. Basically, precision is how many digits of a measurement actually mean something. If I’m measuring the length of a room with my stride (and found it to be 30 feet long), the precision is less than if I were to measure the length of the room with a tape measure (and found it to be 33 feet, 6 and ¾ inches long). However, it’s possible that the stride measurement is more accurate than the length found with the tape measure, that is, it reflects how long the room actually is. (Perhaps there’s clothing on the floor which adds tape measurement, but which I stride over.)

These concepts aren’t just valid in physics; I think they’re also useful in software. When building a piece of software, I am precise if I build what I say I am going to build, and I am accurate if what I build actually meets the client’s business needs, that is, it solves the business problem. Almost every development tool either makes development more precise or more accurate.

The concept of precision lends itself easily to automation. For example, unit testing is rapidly gaining credence as a useful software technique. With unit testing, a developer writes test cases for each part of their code (often at the method level). The running of these tests ensures that code is actually doing what the developer thinks it is doing. I like writing unit tests; it gives me comfort to know that corner cases are taken care of and that changes to code can be fairly easily regression tested. Other techniques besides unit testing that help ensure precision include:

Round tripping: using a tool like TogetherJ, I can ensure that the model (often described in UML) and the code are in sync. This makes it easier for me to verify my mental model against the code.

Specification writing: The more precise a spec is, the easier it is to translate into code.

Compilers: the checking that occurs at compilation time can be very helpful in ensuring that the code is doing what I think it is doing–at a very low level. Obviously, this technique depends on the language used.

Now, precision is needed, because if I am not confident that I understand what the code is doing, then I’m in real trouble. However, accuracy is much more important. Having a customer onsite is a great example of a technique to ensure accuracy: you have a business domain expert available all the time for developers’ questions. In this situation, when a developer stumbles across a part of the business problem that they don’t quite understand, the don’t do what developers normally do (in order of decreasing accuracy):

  1. Ask another developer, which works great if the target audience is developers, but not so well otherwise.
  2. 2Best approximation (read: guess at the correct answer).
  3. Ignore the issue. (‘I’ve got a lot more code to write before I can go home today, and we’re shipping in two weeks. We’ll just let the customer discover it and deal with it as a bug.’)

Instead, they have a real live business person, to whom this software really matters (hopefully), who they can ask. Doing this makes it much more likely that the final solution will actually solve the business problem. Other techniques to help improve accuracy include:

Issue tracking software (I use Bugzilla): Having a place where questions and conversations are recorded is truly helpful in making sure the mental model of the business user and the programmer are in sync. Using a web based tool means that non-technical users can participate and contribute.

Specification writing: A well written spec allows both the business user and developer to have a sense of what is being built, which means that the business user can correct invalid notions at an early stage. However, if a spec is too detailed, it can be used to justify precision at the cost of accuracy (‘hey, the code does exactly what’s specified’ is the excuse you’ll hear).

Spring and other dependency injection tools, as well as IDEs: These tools help accuracy by decreasing the costs of changing code.

Precision and accuracy are both important in software engineering. Perhaps the best way to characterize the two concepts is that precision is the mapping of the programmer’s model of the problem to the computer’s model, whereas accuracy is the mapping of the business’ needs to the programmer’s model. However, though both are needed, accuracy is much harder to obtain. Knowing that I’m building precisely what I think I’m building is beneficial only insofar as what I think I’m building is actually what the customer needs.


The Power of the Internet to Create Low Friction Marketplaces

 

RipeNearMe is an online marketplace for locally grown food.

Ever had a peach, apple or lemon tree go bananas with a bumper crop and not know what to do with it? Or maybe you’ve seen your neighbours’ trees overloaded, left to the birds, falling to the floor and going to waste? We have too, and that’s why we started RipeNearMe: a web app that connects people through the food we each grow ourselves.

As we near the end of harvest season here in Colorado, it is great to see this kind of marketplace for free or low cost food.  Maybe the bags of zucchini at the office are on their way out?

Regardless, it would be impossible to have this kind of market if the internet (and specifically the HTTP protocol) wasn’t powering it.  The transaction costs are simply too high and the value of the goods too low.

Any other examples of low friction marketplaces that the internet enables?


Be an Informal Recruiter

link photo

Photo by elcovs

As I kick start my consulting business, I’m talking to many people–everyone in my network, anyone that I am referred to, and random people from Hacker News.  Employment is far less fungible than other purchases (even housing), so it behooves anyone selling labor to cast a wide net.

These introductions and conversations have not just given me the chance to talk about my skills and knowledge, but also to learn about other needs of the company.  Even if they don’t have need for a senior developer who can talk business and learn new languages, they may have a need for someone else in my network.

I’ve been about to do a few such intros since early August.  It actually is quite fun to do this, and it is good for karma.  It’s also a great way to stand out from the pack–if I an helpful to the organization before I take a job/contract with it, imagine how helpful I will be when I am engaged day to day.

During initial contact with the interesting organization, I talk about my skills and how that might fill the organization’s needs–after all, they are interested in meeting and learning about me.  But I also note any other needs, either via postings on their websites, needs they imply or mention in the conversation, or by simply asking them: “do you have any other needs at this time?  I like to help and would be happy to ping my network”.  I take notes.

Once I know some needs, I consider who in my network might help fulfill them.  Then I reach out to the members of my network and see if they can help. Typically, I send an email with the details of the need, and ask if they know anyone who might be a fit for the company.  Network members don’t have to be looking to make a move.  They will probably know of folks who can be a good fit.  For example, a marketing will typically know far more marketers than I will.  Therefore, if an organization I am interested in helping needs a marketing assistant, reaching out to marketers in my network and asking if they know anyone looking will be helpful. This interaction is useful to my network contacts–it lets them reach out to their network, opens a conversation with me, informs them of labor market conditions with minimal work on their part, and could end up in a new job if the fit is right.

This technique also lets me have a soft touch point with the prospect in a week or so.  I say something like “I reached out to my network about the position X you posted, and haven’t heard back from anyone, just wanted to let you know.”

If I don’t have a specific person who could help fill this role (either themselves or via their contacts), there are other ways to add value.  I’ve passed on recruiting tips (or interested tech articles), helpful employment sites, or general labor market advice such as “based on what I’ve seen, you might to have a hard time finding someone expert in tech X for pay Y”, or “in my experience, rates for tech X are $Y/hour”.  All of these add value to the interaction at very little cost to me.

Connecting people with openings is great for hiring managers on the other side of the hiring/contracting process.  If I am casting a wide net looking for contracts, I have recent data as well as a network and perspective worth sharing.  Since this is low cost to me and has benefits for me, the organization I am interacting with and members of my network, it is worth the extra effort to be mindful of needs and to send that intro email.




© Moore Consulting, 2003-2014 +