

Boulder Facebook Developer Garage Notes

So, on Thursday I went to the Facebook developers garage, kindly arranged by Kevin Cawley. It was held at ‘the bunker’, secret home of TechStars (we didn’t even get the precise address until the afternoon of the event. I could tell you where it is, but then I’d have to kill you). It was an interesting evening; it started off, as per usual with tech events, with free pizza and pop and beer (folks! if you want a job where you can get free pizza regularly, consider software). Then there were six presentations by local startups. A number of the companies were ones I’d seen before at the New Tech Meetup, and the garage felt much like one of those meetups: rather than a deep look at some of the technical or business problems that Facebook applications face, it was a quick overview of how the platform was being used by existing startups. Then there was a video conference call with Dave Morin, the platform development manager for Facebook.

As far as demographics, the garage had about 40-50 people there, of which 4-5 were women. One of the presenters asked how many people had apps running on Facebook; I’d say it was about 40%.

While I’ve succeeded in writing a “hello world” application for Facebook, I am by no means a Facebook developer. I attended the garage because the platform is exciting new technology. There are a number of interesting applications that have a chicken and egg problem: the application is cool if there are a lot of people using it, but people will only use it if it is cool. For example, any recommendation engine or classified listing service (incidentally, I started off the night talking to the Needzilla folks, who are hoping to build a next generation Craigslist). Facebook, by making finding and adding applications delightfully easy, helps developers jump the chicken and egg problem. Of course, it’s just as easy to uninstall an application, so it needs to be pretty compelling. And that is why I went to the garage (no, free pizza had nothing to do with it!). Also, note that this meeting was streamed on Ustream and will hopefully be available on one of the video sharing sites sometime soon. I’ve asked to be notified and will post the link when it is available.

The first speaker was Charlie Henderson of MapBuzz. They’re a site that aims to make it easy for users to build maps and to create content and community around them. They partner with organizations such as the Colorado Mountain Club and NTEN to build maps and communities. They’re building a number of Facebook apps, mostly using iframes rather than FBML. They had some difficulties porting their existing application to work within Facebook, especially around bi-directional communication (if you add a marker or a note to a map within Facebook, that content will appear on the MapBuzz site, and vice versa). They also had some difficulties with FBJS. I couldn’t find their widget on Facebook as of this writing.

Lijit is a widget you can drop in your blog that does search better, as it draws on more structured information, including trusted sources. They use the Facebook API as another source of information to search that is fairly structured. They insisted they complied with the Facebook TOS and are starting to look at OpenSocial as a data source as well.

Fuser showed a widget that displays Myspace notifications on Facebook. Again, they used the iframe technology, and actually used a Java applet! (I’m happy that someone is using a Java applet for something more complex than a rent vs buy calculator.) Since Myspace doesn’t have an API, they ended up having that applet do some screen scraping.

Useful Networks had the most technically impressive presentation, though not because of its Facebook integration. This company, part of Liberty Media, is looking to become a platform for cell phone applications with location services (a “location services aggregator”). Eventually, they want to help developers onboard applications to the carriers, but right now they’re just trying to build some compelling applications of their own. Useful Networks has built an application called sniff (live in the Nordic countries, almost live in the USA) which lets you find friends via their mobile phones and plots locations on a map in Facebook. They had a number of notification options (‘tell me when someone finds me’, etc) and an option to go invisible–the emphasis on letting people know about their choices surrounding sensitive information like location was admirable. The presenter said that they had been whittling away features to just the minimum for a while. Most importantly, they had a business model–each ‘sniff’ costs 50 cents (I’m sure some of that goes to the carrier). 80% of the people using sniff did it via their mobile phone, while the other 20% did it from the web. They used GPS when possible, but often cell [tower] id was close enough. And the US beta was live on two carriers. Someone suggested integrating with Fire Eagle for another source of geo information.

Stan James, founder of Lijit, then demoed a Facebook app. I’ve been pretty religious about not installing Facebook applications, but this one seemed cool enough to install. It’s called SongTales and it lets you share songs with friends and attach a small story to each one. As Stan said, we all like songs that are pretty objectively awful because of the experiences they can evoke. As far as technical difficulties, he mentioned that it was harder to get music to play than to just embed a YouTube video (a probable violation of YouTube’s TOS).

David Henderson spoke about Socialeyes. Actually, first he spoke about Social Media, a company he worked for in Palo Alto that aimed to create AdSense for social networks (watch out, Google!); he knows Facebook applications well, having created a number of them. One of those was Appsaholic, which lets a developer track usage of their app. He showed some pretty graphs of application uptake on Facebook, and made some comments about how the rapid uptake created viral customer interaction that was a dream for marketers. His new venture is an attempt to build a Nielsen-style ratings service for social networks, measuring engagement and social influence of users as well as application distribution.

After the presenters, Dave Morin skyped in. Pretty amazing display of cheap teleconferencing (though people did have to ask their questions to someone in Boulder, who typed them to Dave). He said some nice things about Boulder, and then talked a bit about how the Facebook platform was going to improve. Actually, I think he just said that it will improve, and that Facebook loves developers (sorry, no yelling). Dave shared an anecdote about a girl who built a Facebook application that gained 300K users, and sold it within three weeks. Then people asked questions.

One person asked about how to make applications stickier (so you don’t have 50K users add your app, then 50K users delete it the next week). No good answer there (that I caught).

Another asked about the SEO announcement, which Dave didn’t know about because he had been in a roundtable with developers all day (and which I couldn’t find any reference to on the internet–sorry).

Someone else asked about changes to the Facebook TOS, in particular letting users grant greater data storage rights to applications (apparently apps right now can only store 24 hours of data). Dave answered that what Facebook currently allowed was in line with the market standards, and that as those changed, Facebook would.

Somebody asked if authenticating mobile Facebook apps was coming, but I didn’t catch/understand the answer.

Someone else asked about the ‘daily user interactions’, and what that meant. After some hemming and hawing about the proprietary nature of that calculation (to prevent gaming the numbers), Dave answered that the number is not interactions per user per day, but the number of unique users who interact with your application over the last day. He also talked about what went into the ‘interaction’ calculation.

Someone asked about monetization solutions. Dave mentioned that the Facebook team was working on an e-commerce API, which he hoped would spawn an entire new set of applications. It will be released in the next couple of quarters. (This doesn’t really help the people building apps now, though.)

Another person asked about the pluses and minuses of FBML versus the iframe. Dave said that FBML was “super native and super fast”, whereas the iframe solution lets you run the application on your own servers, and is more standard.

Yet another person asked about whether they should release an app in beta, or get it a bit more polished. Dave said that his experience was that the best option was always to get new features into the hands of users as soon as possible, and that Facebook was still releasing new code once a week.

Someone asked about Facebook creating a ‘preferred applications’ repository and Dave said that there were no plans to do that.

Somebody asked if Facebook was interested in increasing integration between FBJS and Flash, but Dave didn’t quite catch the question, and answered that of course Facebook was happy to have applications developed in Flash.

Dave then signed off. It was nice to be able to talk to someone so in touch with the Facebook platform, but I felt that he danced around some questions. And having someone have to type the questions meant that it was hard to refine your question or have him clarify what you were asking. FYI, I may not have understood all the questions and answers that happened in Dave’s talk because I’m so new to Facebook and the jargon of the platform.

After that, things were wrapping up. Kevin plugged allfacebook. People stood up and announced whether they were looking for work or looking for Facebook developers. By my count, 5 folks–including the developer of the skipass application and mezmo.com–said they were looking for developers; no developers said they were looking for work (hey look, kids! Facebook app dev is the new Ruby on Rails!).

Was this worth my time? Yes, definitely. Good friendly vibe and people, and it opened up my eyes to the wide variety of apps being built for Facebook. The presentations also drove home how you could build a really professional application on top of the API.

Would I go again? Probably not. I would prefer a more BJUG-style presentation, where one can really dive into some of the technical issues (scaling, bi-directional communication, iframe vs FBML vs Flash) that were only briefly touched upon.

[tags]facebook[/tags]

On writing a book

On the BJUG mailing list, I saw this great post on the nuts and bolts of writing a book. Well worth a read.

I sympathize in particular with his ‘writing schedule’ comment. I started to put together a book about software contracting with some friends, and it was hard to keep things moving. We ended up not moving forward with the book and placing the content on a blog (which has proven hard to update as well).

[tags]authorship,inertia[/tags]

Finding bugs from software history: a talk I attended at CU

I went to an absolutely fascinating talk today at CU. Sunghun Kim gave a talk titled “Predicting Bugs by Analyzing Software History”. The basic premise: if you look at historical unstructured information (emails, bug reports, check-in comments) and can identify the bugs related to that information, you can use it to find other bugs.

He talked about two different methods to ‘find other bugs’. The first is change classification. Based on a large number of factors–including attributes of the program text, complexity metrics, and source control management metadata like the time of checkin (don’t check code in on Friday!) and the committing developer–he was able to identify whether or not a given checkin introduced a bug. (A question was asked about looking at changes at the token level, and he said that would be an interesting place for further research.) This was 94% precise (if the system said a bug was introduced, there was a 94% chance it was) and had 70% recall (it missed 30% of real bugs introduced, but got 70% of them).

They [he collaborates a lot] were able to judge the probability that a future change would introduce a bug by feeding all the attributes I mention above, for known bugs, to a machine learning program. Kim said there were ways of automating some of the collection of this data, but I can imagine that the quality of bug prediction depends massively on the ability to find which bugs were introduced when, and to tie known bugs to those attributes. The numbers I quote above were based on research from a number of open source projects like Apache and Mozilla, and varied quite a bit. Someone asked about the difference in numbers, and he said that commit habits were a large cause of that variation. Basically, the more targeted the commits were (one file instead of five, etc), the higher the precision that could be attained. Kim also mentioned that this type of change classification is similar to using a GPS for directions around a city–the more unfamiliar you are with the code, the more useful it would be. He mentioned that Apple and Yahoo! were using this system in their software development.
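As a concrete sketch of what feeding those attributes to a machine learning program might look like, here’s one hypothetical way to encode a checkin as a feature vector. The names and encoding are mine, not Kim’s:

import java.util.Calendar;
import java.util.Date;

public class CheckinFeatures {
    /** Encode one checkin's metadata as a numeric feature vector. */
    double[] toFeatureVector(Date checkinTime, String author, int filesChanged,
                             int linesAdded, int linesDeleted,
                             double cyclomaticComplexity) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(checkinTime);
        return new double[] {
            cal.get(Calendar.DAY_OF_WEEK),      // don't check code in on Friday!
            cal.get(Calendar.HOUR_OF_DAY),      // time of checkin
            Math.abs(author.hashCode() % 1000), // crude committing-developer encoding
            filesChanged,                       // targeted commits predict better
            linesAdded,                         // program text attributes
            linesDeleted,
            cyclomaticComplexity                // complexity metric
        };
    }
}

Each vector, labeled with whether the checkin is known to have introduced a bug, would become one training example for the classifier.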

The other interesting concept he talked about was a bug cache. If you’ve developed for any length of time on a given project, you know there are places developers fear to tread. Whether it is complicated logic, fragile interfaces with legacy systems or just plain fugly code, there are sections of any codebase where change is a bit scary. Kim talked about the Windows Server 2003 team maintaining a list of such modules, so that anytime anyone changed something on that list, more review than normal would take place. This list is what he’s trying to replicate in an automated fashion.

If you place files in a cache when they are identified as having a bug, and also place other files that are close in checkin time to each of those files, you can build a cache of files to review closely. For the 200-file Apache project, after about 50-100 files had passed through it, a cache of 20 files (10% of the project) contained a significant portion of future bugs. Across several open source projects, the range of bugs contained in the cache was 73-95%. He also talked about using this at the method level as opposed to the file level.
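Here’s a minimal sketch of the caching idea as I understood it–this is my reconstruction from the talk, not Kim’s actual algorithm, and the 10% sizing and LRU eviction policy are assumptions:

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BugCache {
    private final int capacity;
    // An access-ordered map gives us LRU eviction: files that haven't been
    // implicated in a bug recently fall out when the cache is full.
    private final LinkedHashMap<String, Boolean> cache;

    public BugCache(int projectFileCount) {
        capacity = Math.max(1, projectFileCount / 10); // e.g. 20 of Apache's 200 files
        cache = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** A bug was found in buggyFile; cache it and files committed near it in time. */
    public void recordBug(String buggyFile, List<String> coCommittedFiles) {
        cache.put(buggyFile, Boolean.TRUE);
        for (String file : coCommittedFiles) {
            cache.put(file, Boolean.TRUE);
        }
    }

    /** The files that currently deserve extra review. */
    public Set<String> filesToReview() {
        return cache.keySet();
    }
}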

In both these cases, machine learning that happens on one project is not very useful for others. (When they trained on the Mozilla codebase and then turned the system on the Eclipse codebase, it wasn’t good at finding bugs.) Kim speculated that this was due to project and personal coding styles (some people are from Mars, others write buffer overflow bugs); supporting that theory, the machine trained on Apache 1.3 was OK at finding bugs in the Apache 2.0 codebase.

Kim talked about several other interesting projects that he has been part of, including the ‘Memory of Similar Bugs’, which found that up to 40% of bugs are recurring patterns, and ReCrash, a probe that monitors an application for crash conditions and, when it finds one, automatically writes a unit test that can reproduce the crash. How cool is that? The cost, of course, is the high overhead (a 13-64% increase) ReCrash imposes to do the monitoring.

This was so fascinating to me because anything we can do to automate the bug finding process lets us build better software. There are always data input problems (GIGO still rules), but it seemed like Kim had found some ways around that, especially when the developers were good about comments on checkin. I’m all for spending more time building cool features and better business logic. All in all, a great talk about an interesting topic.

[tags]the bug in the machine, commit early and often[/tags]

My experience at the MySQL Performance Coding Webinar

Last Tuesday, I attended a MySQL webinar. I registered on the MySQL website (with a site-specific email address, of course) and am periodically invited to these webinars. I’d tried to attend in the past, but something else (usually billable) always interfered. Not this time!

The talk was titled “Performance Coding for MySQL” and the presenter, Jay Pipes, did a fantastic job. (He is also the co-author of Pro MySQL.) The slides from the presentation are up, and he also answered questions sent to him during the presentation in some detail. His presentation, about an hour in length, covered basics, like normalizing first (slide 4) and thinking in sets rather than iterators (slides 20-23)–basic, but not intuitive–as well as under-the-hood intricacies, like thinking about the size of your primary keys and your record size (slide 6), avoiding deletes with MyISAM (slide 27), and vertically partitioning to take advantage of the query cache (slides 9-11). He also pointed to a script that he wrote to find useless indices.
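To make the ‘think in sets’ idea concrete, here’s a hedged sketch in Java/JDBC; the orders table, its columns, and the connection details are all made up:

import java.sql.*;

public class SetBasedUpdate {
    public static void main(String[] args) throws Exception {
        Class.forName("com.mysql.jdbc.Driver"); // pre-JDBC4 driver loading
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/test", "user", "password");

        // Iterative approach: select each pending order, then issue one
        // UPDATE per row--N+1 round trips to the server.
        Statement select = conn.createStatement();
        ResultSet rs = select.executeQuery(
                "SELECT id FROM orders WHERE status = 'pending'");
        PreparedStatement update = conn.prepareStatement(
                "UPDATE orders SET status = 'stale' WHERE id = ?");
        while (rs.next()) {
            update.setLong(1, rs.getLong("id"));
            update.executeUpdate();
        }

        // Set-based approach: one statement, one round trip; the server
        // does the whole operation in a single pass.
        Statement setBased = conn.createStatement();
        setBased.executeUpdate(
                "UPDATE orders SET status = 'stale' WHERE status = 'pending'");

        conn.close();
    }
}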

Well worth my time. Thanks MySQL and Jay, for making a resource like this available and free! There’s a whole lot more, so I’d recommend downloading the slides and giving them a run through, if you interact with MySQL as a DBA or a developer (or, as is often the case, both).

(On that note, I’d like to recommend the MySQL DBA blog for your perusal–apparently recently renamed the ‘Senior MySQL DBA’ blog, heh.)

[tags]mysql dba, think in sets, webinar[/tags]

Using ‘tasks’ in Eclipse

Update, 11/13/2009:  If you are looking for help with tasks using Mylyn (integrated into later versions of eclipse), you don’t want this post.  Instead, you’ll want to read and watch the resources here.  This post is all about simple text based code markers, not Mylyn’s implementation.

In Eclipse (version 3.2–one revision behind the current latest), I use a feature called ‘Tasks’. Using this feature, I can, anywhere in a file managed by Eclipse, put in a tag like ‘XXX’ and write a note to myself. It’s very handy because when I’m developing, I’ll often think of a problem or situation that I’d like my code to handle, but not have time to deal with it just then. I could add it to a bug tracker, or an Excel spreadsheet, or write it down, but I find adding it to the source code works just as well. Then, I can use the ‘Tasks’ view in Eclipse to gather all of these notes at a later time, and deal with them one by one. I add it using this type of comment (for Java–if I add the task in an XML file, I use XML comments):

//XXX need to revisit this class.
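And the same kind of note in an XML file (the note text here is just an example):

<!-- XXX need to revisit this mapping -->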

The way you include the Tasks view in the Java perspective is: go to the Java perspective, choose ‘Window’ from the menu bar, then ‘Show View’, then ‘Tasks’. According to the help, Tasks are usually only shown in the ‘Resources’ perspective.

My one gripe with tasks is that it seems to be easy to create them, but darn hard to delete them. You can add new task tags via this path: ‘Window’ / ‘Preferences’ / ‘Web and XML’ / ‘Task Tags’, and use the ‘clean and redetect task tags’ button. This does appear to pick up new tasks (or tasks marked with tags that you’ve added), but doesn’t seem to remove tasks that are no longer marked in the source code (whether they are no longer marked because you removed them from the source code, or because you removed that task [TODO] tag from the list of task tags).

If you add a task via the mouse, you can remove it by right-clicking on the checkbox. However, that doesn’t remove the task comment from the source file. Also, if you add a task via a comment, you cannot mark it done in the ‘Tasks’ view.

What I’d like is some synchronization between the source file and the view. If I add a task in the source file, it should show up in the ‘Tasks’ view. If I then delete that line from the source file, I’d like the view to reflect that. If I mark it as completed in the view, then choose ‘Delete Completed Tasks’ from the context menu, the line in the source file should be removed as well.

Am I missing something here? Am I using tasks incorrectly?

I looked through the Eclipse help, on Google, and in the Eclipse newsgroups, but did not find anything that helped. There is no mention of this issue in the release notes for version 3.3. Browsing around the bug list didn’t turn up anything that applied to what I want to do (via a quick scan).

I’ll probably file a bug sometime soon, but should really review all the entered bugs to see if someone else has my issue. In the meantime, I’ll just bleat a bit here.

[tags]eclipse,tasks view[/tags]

The ant jar task and duplicate files can cause bizarre behavior and missing/incorrect files when unzipping

I just ran into some bizarre behavior. I’m building a web application on Windows XP SP2, using Ant 1.6.1. The application worked fine. Then, I added one more copy instruction, to move a GWT component I’d just added to the appropriate place in the war file. Suddenly, the application ceased to work.

After a few hours of tracking down the issue, I found that it didn’t matter whether it was the new GWT component or an old one; it didn’t matter whether the copy went to the same directory or a new one–simply adding the new set of files caused the issue. Then I noticed that the unzipped version of the application differed. In particular, the configuration files differed. That explained why the application wasn’t working–it wasn’t configured correctly.

But, why were the configuration files different?

I examined the generated jar file. When I unjarred it, the configuration files were correct. I was using the jar command, whereas the ant script was using unzip. I made sure the jar file was copied correctly. I made sure the old directory was deleted, and that the ant unzip task would overwrite existing files. Still, no fix–I was seeing the incorrect configuration files.

Then, this part of the jar task documentation jumped out at me:

Please note that the zip format allows multiple files of the same fully-qualified name to exist within a single archive. This has been documented as causing various problems for unsuspecting users. If you wish to avoid this behavior you must set the duplicate attribute to a value other than its default, “add”.

The documentation lists “fail” and “preserve” as the other possible values for the duplicate attribute of the jar task, but it doesn’t explain what they actually do. “fail” causes the jar task to fail when duplicate files are encountered; this seems like sane default behavior, and I’m not sure why it isn’t the default. “preserve” seems to preserve the first file added, and doesn’t add duplicates, but doesn’t tell you that duplicates exist.

Update, 2:09:   “preserve” does tell you that duplicates exist, in this form: WEB-INF/web.xml already added, skipping

I had, for a variety of reasons, a jar task that was adding two sets of the configuration files, with the same names and paths, to the war file. Something about adding a few extra files seemed to flip a switch and change existing behavior. Whereas before the unzip task had picked the correct configuration file, now it picked the incorrect one. I don’t know more than that, because I didn’t dig down into the source.

The answer is to move the correct files to the top of the jar task, and to change the “duplicate” attribute to “preserve”.
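For reference, here’s a sketch of what the fixed jar task might look like; the file names and directory layout are hypothetical:

<jar destfile="dist/myapp.war" duplicate="preserve">
    <!-- The correct configuration files come first: with duplicate="preserve",
         later copies of the same path are skipped (with an
         "already added, skipping" warning). -->
    <fileset dir="build/correct-config"/>
    <fileset dir="build/everything-else"/>
</jar>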

I hope this post saves someone a few hours of banging their head.

[tags]ant, jar files, duplicate files, headbanging[/tags]

FogBugz world tour, Boulder edition

I went and saw Joel Spolsky’s talk about FogBugz6 tonight. It seems to be quite the powerful software development tool. But I’m afraid that it suffers like every tool does–it forces you into certain methods of development. For example, there’s no way to ensure that every bug entered is viewed by QA. Now, that isn’t a problem for the teams I currently work on, but I can see it being a problem for teams I have worked on. Joel mentioned very valid reasons for doing this, but they only seem valid for the subset of development teams that FogBugz targets.

In fact, as I left, almost every conversation I heard was about the product, and how people could fit it into their process, rather than use the process it gives you. Because FogBugz really is more than a bug tracking system–it now goes from documentation/requirements gathering all the way to estimation to bug tracking to customer support. FogBugz appears to be a tool that is used in almost the entire software development life cycle–hey look, it’s RUP lite.

But I’ve never used version 6, and I’m sure there are significant wins. My other concern is that the software estimation parts sound like they’re 1.0 features (just from the words he used–at best 1.1, since they used FogBugz6 to develop FogBugz6); I’d rather wait until those features are more settled. I’m sure you could use just the bug tracking system, and they’ve certainly taken the ‘Web 2.0’/instant response/make-it-feel-like-a-desktop-application ideas to heart. The cost is another concern; while minimal, it is greater than $0. On many projects I’m on, just using any bug tracker, let alone an entire software development tool, is difficult, and you can’t beat stealth bug tracker installs. (I’m on record as saying “I have to say that I think the open source solutions (Bugzilla and PHPBT) are going to eat the commercial solutions’ lunch for small projects, because they are a cheaper substitute with all the required attributes”, just as an FYI.)

One thing that really surprised me at the talk is how many folks were there evaluating FogBugz as opposed to seeing Joel speak. Around one third of the audience had used or was using FogBugz. Joel opened up the floor to questions, and every single one except one of mine was about features or flaws in FogBugz. I mean, this is the guy who wrote the Joel Test, and no one took the opportunity to ask him general development questions, even though he said he’d field them. I don’t know what the deal was.

Will I give FogBugz a try? Not right now. But I’ll keep an eye on what they’re doing.
[tags]software development tools, bug tracking, fogbugz, joelonsoftware[/tags]

Choosing new technology, or tail chasing

Robert Hanson, who built the very useful GWT Widget Library, has an interesting post where he asks:

Let’s say that you are a developer, and you have been spending the past year or so really getting to know a given technology. Now you are being told that the technology you are using is inferior to this “other” technology. You take a look and realize that it might be best to switch. A year later you finally have a good understanding of the tool, and use it with great skill. Then someone tells you about this “other” technology.

How many of us built our own MVC frameworks only to move to Struts, then maybe on to Spring MVC. Sure, there are some improvements made in each technological step, but since you are spending most of your time really getting to know a product you often spend little time getting the most out of it. This is compounded by the fact that you often use several of these products at the same time, adding to what you need to learn.

So what is a dog to do? Although you are moving forward, you never quite catch the tail. Should you just stop moving forward, or run faster or slower?

Personally, I think that there is a middle ground. As a developer, you need to keep up on broad trends and tools, because they can make you so much more productive. The problem arises because you don’t know how much more productive you will be until you’ve used the technology or tool for a while….

However, just because there is a new tool around, that doesn’t mean you have to use it. In fact, if you have an existing technology that does the job, you should not abandon it just to move to the new technology. There’s always a cost analysis, because learning a new technology is not free. Your time is worth something.

This cost analysis is something that developers should learn to do and appreciate, because that process is exactly what most companies need to go through before they decide to implement or build new software. Just like a developer, most companies think that a new technology or system will help them, but are unsure how much it will help and how much it will cost. Just as for a company, a developer’s decision to learn and use a new technology is not solely a technology decision.

There are many ways to minimize the risk of learning a new technology–prototype, read documentation, be conservative, and consult someone who’s an expert in the new technology (which means they’ve already made some of the mistakes). Each of these has benefits and detriments. Prototyping takes more time than the others. Reading documentation is great if there is documentation, and if the documentation is accurate, but might not teach you as many lessons as using the technology. Being conservative means that you’ll probably miss out on some productivity improvements, just as you’ll avoid some time sinks. Consulting an expert is great, if you have access and know what questions to ask.

I think the answer to Robert’s final question is intensely context sensitive. It depends on the following five considerations, among others:

  • how crucial a new technology is to your productivity (i.e., if you are a Java business developer, learning GWT might be lower on the list than learning Spring)
  • how easy you think it will be to learn
  • whether you can be paid to learn it
  • how much spare time you have
  • whether you have a project to use the new technology on

[tags]tail chasing,technology[/tags]

PHP form generation

I just wanted to say: if you are building an application in PHP and you need to edit or search data from a relational database, HTML_QuickForm, DB_DataObject, and, occasionally, DB_DataObject_FormBuilder, can be very useful for prototyping and, depending on your client’s needs, building.  The tools are well worth a look if you’re planning to write any custom PHP database manipulation code.
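For a flavor of what HTML_QuickForm buys you, here’s a minimal sketch; the form name, field, and processing logic are hypothetical:

<?php
require_once 'HTML/QuickForm.php';

// Build a form with one required text field.
$form = new HTML_QuickForm('contact', 'post');
$form->addElement('text', 'email', 'Email:');
$form->addRule('email', 'Email is required', 'required');

if ($form->validate()) {
    // Process the submitted values--perhaps save them via DB_DataObject.
    $values = $form->exportValues();
} else {
    // Render the form, including any validation messages.
    $form->display();
}
?>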

Announcement: FRUGOS GeoSummit 2007

One of my clients is helping out with this unconference. If you’re into GIS, it seems like it’d be worth going. I certainly had fun at the last unconference I went to.  I am planning to attend; hope I see you there.
———————–

FRUGOS (Front Range Users of Geospatial Open Source) is holding its
first GeoSummit on Saturday, June 16th at Churchill Navigation–100
Arapahoe–in Boulder.

This will be a unique gathering of a variety of folks interested in
Place–geo-types, hackers, academics, artists, amateur enthusiasts,
etc. While there certainly will be representation from the GIS and open
source worlds, we encourage all who are fascinated about the
intersections of technology and engagement with the world around us to
participate.

Also, we’ll be structuring the day around the “un-conference” model (see
http://www.barcamp.org), so, for starters, you
can expect:
No Pitches
No PowerPoint
No Passivity (unless you’re a little sleepy after lunch)

Bring your laptop (we’ll have wireless), and a project or enthusiasm
you’d like to talk about with the group, get feedback, and collaborate
on fresh solutions: the agenda of the day will be structured during
the morning registration/sign-up/socializing period.

If interested–
1) RSVP by joining the Google Groups set up for this event–

http://groups.google.com/group/geosummit

2) Bring a laptop (and cellphone/GPS if your enthusiasms tilt that
way), your idea/project, and willingness to collaborate

3) Spread the word

Tentative Schedule

9:30-10:30AM Registration, refreshment, socializing
10:30-12ish Sessions
12ish-2 Lunch (there’s a grill, beverages, and hiking trails)
2-? Sessions

This promises to be a great combination of creativity, intellectual
engagement, eating and drinking, and socializing.

————————

[tags]barcamp,gis,unconference[/tags]