Excited to be speaking at Develop Denver

Coffee break at a conferenceI’m excited to be speaking at Develop Denver. This is a local conference with a wide variety of topics of interest to developers, designers and in general anyone who works in the interactive industry. From their website, they want to:

[bring] together developers, designers, strategists, and those looking to dive deeper into the interactive world for two days of hands on code & design talks.

I’ll be doing two presentations. The first is my talk on Amazon Machine Learning, which I’ve presented previously. The second is a lightning talk on the awk programming language. I’m excited to be presenting, but I’m also looking forward to interesting talks from other speakers, covering topics such as IoT, software development for the developing world, web scraping, APIs, oauth, software development, and hiring practices. (That list is tilted toward my interest in development–there’s plenty for everyone.)

If you’re able to join, it’s happening in about two weeks in downtown Denver (Oct 18-19 in the RiNo district). Here’s the link for tickets, and here’s the agenda.


Fifteen years

Birthday cakeWow. 15 years ago today I wrote my first post. Thank you for reading!

I’ve been blogging for longer than I’ve done any other activity in my life. Longer than any job, any romantic relationship, any hobby.

And I can’t see stopping. Sure, I may use other outlets, but I’m trying to keep the streak of no month going by without a post going (darn you, November 2011!). I think that writing is an essential part of software development and it’s been an essential part of my reflective process. I primarily blog for myself. But it is still fun when folks reach out to me and let me know my writing has helped them in some way. It’s also fun when I google a problem and realize that I’ve already written up an answer.

I encourage everyone I meet to blog. It is a fantastic way to accumulate credibility and expertise. (You never learn something as well as when you are trying to explain it to others.) It also gives me a chance to expound on some of my thoughts around jobs, business, and software development. I’m lucky that my work almost always is something interesting enough to write about, at least once a month. Maybe one of these days I’ll put together a ‘best of’ and make a PDF.

I also think it is important to blog at a place you control. Yes, Medium or Facebook make it easier to gain a following. But what they giveth, they can taketh away. I may only have limited engagement here on this blog, but what I write is mine and will never be anyone else’s.

Hard to believe that my blog will be able to drive next year. Again, thanks for reading.


What makes you a better developer, working at an early stage startup or working in a team?

Bike courierI gave a presentation at Boulder.rb last night about my experience being the technical co-founder of a startup for two years. After the presentation, someone asked a really interesting question. Between working as a solo co-founder of a startup or working in a larger team at an established company, which experience makes you a better developer?

First, a digression. I often commute around town by bike. There are many benefits to doing so, but one of the ones I think about a lot is being on a bike gives you the ability to move through streets more freely. Specifically, you can switch between acting like a car (riding on the road) and acting like a pedestrian (riding on the sidewalk). Used judiciously, this ability can get you places faster than either (hence, bike messengers).

In my mind, a true developer is like that. They can bounce between the world of software (and across the domains within it) and the world of business to solve problems in an efficient manner.

Back to the question at hand. I think that the answer is based on what you mean by better. Are you looking to gain:

  • customer empathy
  • ability to get stuff done through barriers of ignorance and resource constraint
  • a wide set of experience across a lot of different software related domains (security, operations, ux, data modelling, requirements gathering, planning, bug fixing, etc)

If getting these skills make you into the better kind of developer you want to be, you will be well served by being a technical co-founder or founding engineer (more thoughts about the distinction here).

If on the other hand, you are looking for:

  • deep knowledge of a smaller subset of the software world
  • the ability to design software for long term maintainability and performance
  • experience working with a team of stakeholders, each with a different perspective on the problem you are solving

then you are seeking a place on a team, with process, code reviews, conference attendance and free snacks (most likely).

Who is a better developer? The person with experience working with (possibly leading) a team and deep knowledge of a subset of technology? Or the person who can be a jack of all trades and take a product from an idea to something customers will pay for?

I’m going to leave you with the canonical consultant’s answer: “It depends.” The former I’d call an expert programmer and the latter a true developer. They are both extremely valuable, but are good in different companies situations.


Amazon Alexa

I had a lot of fun working on a one day ‘hackfest’ project with Amazon Alexa. I learned a lot about voice UX and Alexa implementation details.It’s an interesting platform, especially if you have broad brand recognition and can deliver high level valuable information via short chunks of text.

From my blog post on the Culture Foundry site:

The multi step interaction is a bit clunky, but I think it’s a great way to avoid collisions between different skills. Basically, the user calls out an ‘invocation’ like ‘open color picker’. Interactions with Alexa after that are send directly to that particular skill until an end point is reached in the interaction tree. Each of these interactions is triggered by a different voice command, and is handled by something called an ‘intent’. Intents can have multiple triggering commands (‘what is my favorite color’ vs ‘what is my color’, for example). There’s also a lightweight, session level storage while the entire invocation is occurring, which means you can easily pass data between intents without reaching out to a more persistent data storage.

You can read the whole post over there.


Aborted Adventures with Amazon Athena and US PTO data

GoddessesI was playing around recently with some data (from the US Patent and Trademark Office), trying to import it into S3 and then to Athena. Athena is a serverless big data query engine with schema on read semantics. The data was not available on the AWS public dataset repo. Things didn’t go as well as planned. Here’s how I wanted them to go:

  1. download some data
  2. transform it into CSV (because Athena doesn’t support currently XML and I didn’t want to go full EMR, even though hive supports XML)
  3. upload it to s3 bucket
  4. create a table based on the data
  5. run some interesting queries using Athena
  6. possibly pull some of the data in Amazon Machine Learning to do some predictions
  7. possibly put some of the data in an s3 bucket as JSON and use datatables to create a nice user interface

Like pretty much every development project I’ve ever been part of, there were surprises. What was different is that I had a fixed amount of time since this was an exploratory project, I set a timebox. I didn’t complete much of what I wanted to get done, but wanted to document what I did.

I was able to get through step 5 with a small portion of data (13k rows). I ended up working a lot on windows because I didn’t want to boot up a vagrant box. I spent a lot of time re-learning XSLT in order to pull the data I wanted out of the XML. I used a tool called xmlstarlet for this, which worked pretty well with the small dataset. Here’s the command I ran to pull out some of the attributes of the XML dataset (you can see that I also learned about batch file arguments):

xml sel -T -t -m //case-file -v "concat(serial-number,',',registration-number,',',case-file-header/registration-date,'\n

,',case-file-header/status-code,',',case-file-header/attorney-name)" -n %filename% > %outfile%

And here’s the Athena schema I created:


CREATE EXTERNAL TABLE trademark_csv (
serialnumber STRING,
registrationnumber STRING,
registrationdate STRING,
statuscode INT,
attorneyname STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
LOCATION 's3://aml-mooreds/athena/trademark/';

After I had done the quick prototype, I foolishly moved on to downloading the full dataset. This caused some issues with disk storage and ended up taking a long time (the full dataset was ~300 files from 500M to 2GB in size, each containing about 150k records). I learned that I should have pulled down one large file and worked it through my process rather than making automating each step as I went. For one, xmlstarlet hasn’t been updated for years, and I couldn’t find a linux package. When tried to compile it, it was looking for libxml, which was installed on my ec2 instance already. I didn’t bother to head further down this path. But I ran into a different issue. When I ran xmlstarlet against a 500MB uncompressed XML file, it completed. But any of the larger files caused it to give an ‘out of memory’ error. I saw one reference in the bugtracker, but it didn’t seem to apply.

So, back to the drawing board. Luckily, many languages have support for event based parsing of XML. I was hoping to find a command line tool that could run XSLT in order to reuse some of my logic, but it doesn’t appear to exist (found this interesting discussion and this one). python seemed like it might work well.

Then I ran out of time. Oh well, maybe some other time. It is fun to think about how I can automate all of this. I was definitely seeing where lambda functions and some other AWS features could have fit in nicely. I also think that using RDS might have made more sense than Athena, given the rate of update and the amount of data.

Lessons learned:

  • what works for 13k records won’t necessarily work when you have 10x, let along 100x, that number
  • work through the entire pipeline with real world data before automating any part of it
  • use EC2 whenever you need to download a lot of data
  • make sure your buckets and athena are in the same region. I wasn’t, and there was no warning. That’s fine with small data, but could have hurt from a financial viewpoint if I’d been successful at loading the whole dataset
  • it can be fun to play around with this type of stuff, but having a timebox keeps you from going down the rabbit hole too far

Atoms and Bits

Atom

I found this post by Fred Wilson about atoms and bits to be a nice counterpoint to this post from a few years back by Chris Dixon. There are so many advantages that a software product has over a business that delivers physical goods, whether that is inventory, startup costs, or distribution. (Yes, if you build a mobile app, you have to pay the vig to Apple/Google, but you also can distribute at zero incremental cost to hundreds of millions of people.) From Fred’s post:

we should … understand that the timelines [for working on businesses that focus on atoms] will be longer and the road to adoption will be more challenging.

The fact is that atoms are just harder to work with than bits. However, that very difficulty can provide a moat around your business (“where there’s muck there’s brass”). If your business is bits, find another moat (branding, network effects, niche domination).


Going to GopherCon and Thoughts on Golang

GopherI am excited to go to GopherCon this year. I’ve been maintaining a couple of codebases written in Go/golang. Some are smaller webhook and automation programs, but a couple are larger data processing systems–take this data from these two sources, munge it a bit, put it over here. (Unfortunately, this is all custom, not using an ETL toolkit like, say, ratchet).

I’ve found Go to be an interesting challenge. It’s C based, but there are a few wrinkles/idioms that I’ve enjoyed figuring out (and more that I’m learning).

Things that surprised me:

  • GOPATH and the need for a domain name in paths.
  • That you have to search on golang to find anything useful.
  • The fact that any file in a package can add functions to any struct (I think I have the terms correct, please forgive me if I don’t)
  • The lack of an editor that can do reference searching (“show me all places this function is called”). I think VS.Code can do this, but have downloaded it and the Golang extension and can’t seem to figure out it. (This is likely my failing, not golangs, but I was looking forward to coding in a static language for just this reason. Well, this and safe refactoring.)
  • The strictness. I’m actually pleasantly surprised by it (no unused variables seems like such a no-brainer!) but golang is quite opinionated in terms of language syntax.

I unfortunately haven’t had as much time to write golang as I planned when signing up for this conference, but I’m looking forward to meeting some other folks and the excitement that always happens when you attend a conference. In particular, I’m looking forward to “Go says WAT?” which is patterned after the famous WAT video and “From prototype to production: Lessons from Reddit’s ad platform”. Hope to see you there.


Easily extracting conversations from a slack group

Man slack liningSlack is an amazing productivity tool when used correctly. One of the primary uses I’ve seen is for open source projects to provide support (Craft CMS, OG-AWS) or for communities to be built (Techfriends, Denver Devs). If you don’t have the luxury of the owner of your slack being Slack’s VP of engineering, the costs of $x/month/user can cause these types of slacks to remain on the free plan.

Which means that you are limited to the last 10k of messages.

And that’s fine for the vast majority of messages. Sometimes, however a discussion is so good that it deserves to be indexed and shared, which means it needs to be pulled out of the Slack walled garden and onto the web (I also wrote about how to do this with the Facebook Group walled garden last year). Sometimes you might just want to save it beyond the 10k message limit for your own selfish reasons.

You can of course do this extraction manually (I did so here and here). But that’s a lot of work.

Another option is to use Zapier. The slack integration is trivial to set up, and has a number of options. From there you can push to a google spreadsheet (if you want to do further reification) or directly to WordPress (or any of the other integrations).

The nice part about this is that the Zapier slack integration is that you have a variety of options that can trigger the publishing of a message to a spreadsheet:

  • a post of a public message in a specific channel
  • a post of a public message in any channel
  • starring of a message by you
  • attachment of a certain reaction emoji (I picked a floppy disk) to a message, no matter who adds the emoji

I’ve just started doing this but am excited to have a low friction way to pull high value conversations out of slack. Slack is great for synchronous communication and easy discussion. When real knowledge drops, it should be shared with the future and anyone who can type into a search box. Do make sure to let folks know because there may be some expectation of privacy that you’ll want to respect.


Where I’m Writing

It’s been a bit quiet around here, but since I joined Culture Foundry I’ve been writing over on their site more. I’ve written on a number of topics, but this one about how to move files between different servers in order to scale a traditional CMS application horizontally, is one of my favs:

Compared to the other methods of syncing files, this works well with non cloud native applications. It also has the virtue of using old tested tools. You can use this system on prem or in the cloud, anywhere with SSH and rsync. This system works well with large numbers of small objects because rsync can be configured to only push new objects.

I’ll be splitting my blogging time between this site and the Culture Foundry site in the future. See you there.

 


Startup COOs: What do they do?

I really enjoyed this post about what startupo COOs do. It was interesting because it wasn’t just opinion, there was also data. I particularly enjoyed the evolution of the author’s role over time, and the percentage of COOs that owned various responsibilities. To be fair, it was a small set of respondents in one geographic area, but still interesting.

However, the money quote was:

I actually have [simple way] of explaining what I do, and I would sum it up this way: taking things off my CEO’s plate, and figuring out how to thoughtfully scale the company.

Of course every company’s different, but I’ve seen the pattern of a visionary and an operator in a couple of companies, and it’s powerful.



© Moore Consulting, 2003-2017 +