"Open Source" Posts
Posted in Open Source
Matt Todd
Most database-backed applications, especially Ruby applications written in Rails, do fine with a SQL database like MySQL. Adam Wiggins of Heroku does a great job explaining how SQL Databases are an Overapplied Solution. There are definitely a few cases we’ve seen where a NoSQL solution like Redis can really shine.
At Highgroove, we’ve got several projects utilizing the Redis key-value store. Here are a few reasons you might want to look into Redis:
1) “Data is frequently written, infrequently read” – if you are making tons of writes and MySQL can’t keep up, Redis has been clocked at 110,000 SETs (the INSERT equivalent) per second!
2) “Data can be expired” – if you have data that can be expired on a regular basis, over time, or explicitly (stats, logs, and session data, for example), Redis can expire whole keys or trim large lists of data in no time.
3) “Data is a collection of COUNTs or SUMs of other data” – if you have ever written an UPDATE statement that adds +1 to a record, you probably know that it is quite expensive and could possibly produce the wrong result. Redis has built-in support for quickly incrementing and decrementing values.
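The three cases above map onto a handful of Redis commands. Here is a minimal sketch, assuming a client object that responds to the redis-rb methods `incr`, `set`, `expire`, `lpush`, and `ltrim`. The `Stats` module, key names, and TTL are hypothetical and chosen only for illustration.

```ruby
# Hypothetical helpers for illustration only. `redis` is assumed to be a
# redis-rb style client, e.g. `Redis.new` from the redis gem talking to a
# running server.
module Stats
  SESSION_TTL = 30 * 60 # seconds; an arbitrary example value

  # Case 3: counters. INCR creates the key at 0 if needed and adds 1.
  def self.count_view(redis, page)
    redis.incr("views:#{page}")
  end

  # Case 2: data that can expire. The key disappears after SESSION_TTL.
  def self.touch_session(redis, session_id, payload)
    key = "session:#{session_id}"
    redis.set(key, payload)
    redis.expire(key, SESSION_TTL)
  end

  # Cases 1 and 2: write-heavy logs, trimmed to the newest `keep` entries.
  def self.log_event(redis, message, keep: 1000)
    redis.lpush("log:recent", message)
    redis.ltrim("log:recent", 0, keep - 1)
  end
end
```

Passing the client in as an argument keeps the helpers independent of how the connection is configured, which also makes them easy to exercise with a stand-in object.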
We love SQL Databases, so you won’t see us abandoning them any time soon, but we enjoy adding tools like Redis to our toolbelt.
More Information on Redis:
Redis Project Page – http://code.google.com/p/redis/
Redis (wikipedia entry) – http://en.wikipedia.org/wiki/Redis_)
A Collection of Redis Use Cases – http://www.paperplanes.de/2010/2/16/a_collection_of_redis_use_cases.html
Posted in Open Source
CBQ
A new client of ours had a big problem. The site they built was getting too many searches (a very good problem to have). The searches all used Andre’s geokit gem and the geokit-rails plugin to provide local results.
Even with the library’s multi-geocoder support (Google, with failover to Yahoo), the site was hitting the limits imposed by both services every day!
So we quickly implemented a query caching mechanism that stores geocoding results that rarely change, saving the site from making all those API calls. The cool part is that we added this functionality to the open-source library itself, and the client’s application now benefits from it directly.
Several things will now happen because we contributed back to the open-source library instead of keeping this “addition” to ourselves (and the client):
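To give a feel for the idea, here is a minimal sketch of a geocoding cache. It is not the code that went into geokit: the `CachingGeocoder` class, its parameters, and the in-memory hash are all assumptions made for the example, and `lookup` stands in for whatever call actually hits the geocoding service.

```ruby
# Hypothetical wrapper, not geokit's API. `lookup` is any callable that takes
# an address string and returns a result, or nil when the lookup fails.
class CachingGeocoder
  def initialize(lookup, ttl: 24 * 60 * 60, clock: -> { Time.now })
    @lookup = lookup
    @ttl = ttl       # seconds a cached result stays valid
    @clock = clock   # injectable so expiry can be tested without sleeping
    @cache = {}
  end

  def geocode(address)
    key = address.strip.downcase
    entry = @cache[key]
    return entry[:result] if entry && @clock.call - entry[:stored_at] < @ttl

    result = @lookup.call(address)
    # Cache only successful lookups so a transient failure is retried.
    @cache[key] = { result: result, stored_at: @clock.call } unless result.nil?
    result
  end
end

# Usage with a stub in place of a real geocoding service:
calls = 0
geocoder = CachingGeocoder.new(->(_address) { calls += 1; [33.749, -84.388] })
geocoder.geocode("Atlanta, GA")
geocoder.geocode("atlanta, ga ") # normalized to the same key, served from cache
```

A production version would more likely keep results in a shared store (a database table or a key-value store) so that every application process benefits from the same cache.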
- other developers facing this same problem can now leverage our code when they use geokit
- other developers can now submit even more functionality, fixes or enhancements to our code, and even better support for problems we may have one day, meaning that we will eventually benefit from this too
- our client now knows that the code they relied on us to develop now has even more developers eyeing it, making sure it works
- our client (through us) is contributing to open-source, and can feel good about using open-source technologies, having fulfilled their part of the agreement they implicitly made, by leveraging the gains (and price) of implementing a solution based on open-source software
Truly leveraging open-source technology, when done right, can be a huge win for everyone: the client, vendor, and community.
Posted in Open Source, What We Wrote, Presentations
Matt Todd
Trying to handle image manipulation, creating PDFs, or in-memory caching in pure Ruby is like trying to win the Tour de France on your hipster single-speed bike. The single-speed works 90% of the time, but when you have demanding performance requirements, it’s not good enough. Many popular Ruby libraries, such as the MySQL and PostgreSQL bindings, RMagick, and most of the webservers Ruby applications are deployed on (like Passenger, Mongrel, and Thin), harness the blazing speed of the C language and its libraries to handle the heavy lifting and performance-intensive business that Ruby can’t keep up with on its own.
In some of my recent work, I had the opportunity to delve into and expand on a Ruby extension written in C for looking up geographic information based on IPs. This library was vital to one of our client’s projects that has immense performance requirements without the possibility of full request caching. By utilizing the existing GeoIP C library for accessing their special in-memory binary database, we were able to keep up with the demand the application would be seeing.
As is common at Highgroove Studios, along with making sure our contributions to the library were open sourced, I took the lessons and experience gained from this unusual endeavor and presented them to our local Ruby User Group here in Atlanta. I focused more on exposing the bridge between the Ruby and the C environments and understanding the internals of the Ruby language from a C standpoint. However, armed with this knowledge, any Rubyist is able to open up most any Ruby extension or even the Ruby language implementation itself and understand what’s going on. My goal was to get the developers over the initial hurdle of being able to read the code and understand it enough to investigate further.
Personally, I gained from this experience a better appreciation for the real beauty of the Ruby language and the effort required to make it as fluid and dynamic as it is, as well as a more thorough understanding of the internal workings of the language. Working this close to the language core has also made a difference in my Ruby style, both in fighting the language less and in using it more efficiently and effectively.
For more information, check out the presentation slides [1] and some of the C examples I wrote for the presentation [2]. Also check out the GeoIP library I contributed to, which inspired this whole adventure [3].
[1] http://www.slideshare.net/maraby/writing-ruby-extensions
[2] http://github.com/mtodd/ruby-c
[3] http://github.com/mtodd/geoip
Posted in Open Source
CBQ
Both James Edward Gray II and Matt Todd were quoted in “20+ Rubyists are using Sinatra – Do You?”, a poll by Satish Talim of Ruby Learning.
Sinatra is a Ruby framework for quickly creating web applications with minimal effort—a DSL for the web. We use Sinatra for several client projects, and it is also an integral part of Scout.
Posted in Open Source
James
I had an unusual Christmas Eve. While my wife was entertaining the family, I was in my office programming until nearly midnight. (Have I mentioned how much I adore my wife?!)
What had me so tied down at such an odd hour? One of the open source libraries Highgroove maintains: FasterCSV.
The Ruby core team gave me the nod to replace the standard CSV library with FasterCSV just before the Ruby 1.9.0 release. While a little more warning would have been nice, I was happy to do it.
I think it’s important to discuss why Highgroove not only allows but actually prefers that I spend some of my time maintaining our open source libraries, like FasterCSV.
The reason we created the library is nice and simple: we needed it. We import or export spreadsheet data at some point for most of our applications. We work with large data sets and we needed that to be fast. We also frequently use headers in the CSV files to map directly to database fields so a broader feature set to support such things was important to us.
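As a rough illustration of the header-to-field mapping described above, here is a minimal sketch using the csv library that ships with Ruby 1.9 and later (the FasterCSV code under the standard library name). The column names and the `rows_as_attributes` helper are made up for the example.

```ruby
require "csv"

# Hypothetical helper: turn CSV text with a header row into an array of
# hashes whose keys could be used as database field names.
def rows_as_attributes(csv_text)
  CSV.parse(
    csv_text,
    headers: true,               # treat the first row as column names
    header_converters: :symbol,  # "signup_count" becomes :signup_count
    converters: :numeric         # "3" becomes 3; non-numeric fields stay strings
  ).map(&:to_h)
end

sample = <<~ROWS
  name,email,signup_count
  Ada,ada@example.com,3
  Grace,grace@example.com,5
ROWS

rows_as_attributes(sample).each do |attributes|
  # In a Rails app this is where something like Model.create(attributes)
  # might go; here the hash is just printed.
  p attributes
end
```

For large files, `CSV.foreach(path, headers: true)` yields one row at a time instead of loading the whole file into memory.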
Now we could have built it and just used it internally, but we decided to share. Did we do it because we are just really nice guys? Not really. (We are though, of course!) We did it because we wanted it to be great.
There are many subtle variations and edge cases for CSV. We debugged the library the best we could, but we’re human and we missed things. That meant our library would have trouble with some inputs. We (hopefully) would have noticed those eventually, but we’re just one set of users, and by releasing it to the world we multiplied the user base by thousands. Needless to say, early users found problems for us much faster than we could have alone. Better than that, they sometimes even fixed them for us! Even when they don’t, they usually send in an example that leads us straight to the problem, which is almost as helpful.
That’s not all. Remember all of those features I told you we love? I wish I could brag and tell you I invented all of them, but some of my favorite features in FasterCSV were sent in by the users. I got the ball rolling with what I could see us needing, but the users ran with the ideas and made my library better than I knew it could be.
Highgroove isn’t losing hours on me applying a few patches here and there. They are gaining the help of many extra employees. We love that.
To top it all off, Matz gives us the ultimate Christmas present. By blessing our work, he saves us the trouble of even installing it for new projects. Very soon we will be able to count on our code being available in all modern Ruby installs. That’s just one less thing we need to worry about.
With open source everybody wins.
Posted in Open Source, What We Wrote
James
If you are following the Capistrano preview releases, you may have noticed a new dependency. Capistrano now depends on HighLine, an open source input library by yours truly.
The reason for the switch is that Capistrano needed a reliable way to grab passwords in a cross-platform way. That turns out to be a lot harder than you might guess. On Unix, termios can make short work of such challenges, but that’s an extra C extension install and it doesn’t work on Windows.
HighLine combines the knowledge of several platform gurus to use the right solution in the right place. Even with all that knowledge as an advantage, Capistrano’s maintainer, Jamis Buck, still had concerns. termios can’t be made a HighLine dependency, since we want to stay cross-platform, and when defaulting to stty, HighLine was a little flaky for the way Capistrano users might need it. Jamis and I discussed these concerns, and HighLine was patched with better support for Capistrano’s needs. Jamis later added the dependency, and HighLine benefited from another round of expert knowledge.
It still impresses me how much we can accomplish with the super friendly open source model of development. Thanks for the input, Jamis!