Solr sends JSON as text/plain

Yet another reason not to use Solr. The discussion in this Jira issue is interesting.

The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser.

Is there perhaps a more general feature we could turn this into? An expert level ability or parameter to set a custom content-type?

The problem right now is in the current class hierarchy of the response writers.

The NamedList is a weird data structure for those who are not so used to Solr. You don’t know what is included in it unless you do an instanceof. Most users are happy to write out the documents.

To handle this problem I would use ‘wt=json&wt.mime-type=application/json’.
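
Until something like that exists, a client can simply ignore the declared content type and parse the body as JSON anyway. A minimal sketch in Ruby, assuming the response has already been fetched; `parse_solr_response`, `JSON_ISH_TYPES`, and the sample body are made up for illustration and are not part of Solr or any client library:

```ruby
require 'json'

# Content types we are willing to treat as JSON. text/plain is included
# because that is what Solr sends for wt=json.
JSON_ISH_TYPES = %w[application/json text/plain].freeze

# Hypothetical helper: parse a response body as JSON when the declared
# content type is JSON or plain text, ignoring any charset parameter.
def parse_solr_response(body, content_type)
  mime = content_type.to_s.split(';').first.to_s.strip.downcase
  unless JSON_ISH_TYPES.include?(mime)
    raise ArgumentError, "unexpected content type: #{content_type.inspect}"
  end
  JSON.parse(body)
end

# Illustrative body, shaped loosely like a Solr select response.
body = '{"responseHeader":{"status":0},"response":{"numFound":1,"docs":[{"id":"1"}]}}'
result = parse_solr_response(body, 'text/plain; charset=UTF-8')
puts result['response']['numFound']
```

Whitelisting the two types, rather than parsing anything, keeps an HTML error page from being fed to the JSON parser.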

Phrase analysis and expansion with Ruby

The idea is to take a phrase and analyze it for use in Information Retrieval. We need to tokenize it into words, possibly transmute some of the tokens, possibly expand some tokens into subphrases. This class lets you register lambdas to perform transformations, substitutions, and expansions. Expansions can take a numerical value representing the cost of the operation; this is intended for raising or lowering the scores of matches in the theoretical IR application.

Given the phrase “joe’s sushi & bait-shop shack”, assume I want to tokenize on whitespace, replace the ampersand with the word “and”, and create word variants for the hyphenized and apostrophized words. See the last spec for an example of the Ruby data structure this class generates.

class Analyzer
  def initialize
    @expansions = []
    @transformations = []
    @substitutions = {}
    @tokenizer = lambda { |string| string.split }
  end

  def tokenizer(&proc)
    @tokenizer = proc
  end

  def expansion(cost=0.0, &proc)
    @expansions << [cost, proc]
  end

  def substitution(input, output)
    @substitutions[input] = output
  end
  alias_method :sub, :substitution

  def transformation(&proc)
    @transformations << proc
  end

  def tokenize(string)
    @tokenizer.call(string)
  end

  # Order matters: transformations run first, then substitutions,
  # then expansions against the final token.
  def process_token(token)
    @transformations.each do |proc|
      token = proc.call(token)
    end
    if out = @substitutions[token]
      token = out
    end
    variants = {}
    @expansions.each do |cost, proc|
      if variant = proc.call(token)
        variants[variant] = cost
      end
    end
    variants.size > 0 ? [token, variants] : token
  end

  def analyze(string)
    tokenize(string).map { |token| process_token(token) }
  end
end

describe "An Analyzer" do
  before do
    @analyzer = Analyzer.new
  end

  it "can take a custom tokenizer" do
    @analyzer.tokenizer { |string| string.split(/\s+/) }
    @analyzer.tokenize("three blind mice").should == %w{three blind mice}
    @analyzer.tokenizer { |string| string.scan(/[\w']+/) }
    @analyzer.tokenize("joe's bait-shop").should == %w{joe's bait shop}
  end

  it "can perform weighted term expansions" do
    @analyzer.expansion(0.5) { |word| word.gsub("'", "") if word =~ /'/ }
    @analyzer.expansion(0.5) { |word| word.chomp("'s") if word =~ /'s$/ }
    @analyzer.process_token("joe's").should == ["joe's", {"joe" => 0.5, "joes" => 0.5}]
    @analyzer.process_token("boring").should == "boring"
  end

  it "can transform terms" do
    @analyzer.transformation { |word| word.reverse }
    @analyzer.process_token("123").should == "321"
  end

  it "can substitute terms" do
    @analyzer.substitution("&", "and")
    @analyzer.process_token("&").should == "and"
  end

  it "expands terms after substitutions" do
    @analyzer.expansion { |word| "ampersand" if word == "and" }
    @analyzer.substitution("&", "and")
    @analyzer.process_token("&").should == ["and", {"ampersand" => 0.0}]
  end

  it "substitutes after transformations" do
    @analyzer.substitution("joe", "joseph")
    @analyzer.transformation { |word| word.gsub('m', 'j') }
    @analyzer.process_token("moe").should == "joseph"
  end

  it "does phrases, if you know how to Enumerable#map" do
    @analyzer.sub("&", "and")
    @analyzer.expansion(0.5) { |word| word.gsub("'", "") if word =~ /'/ }
    @analyzer.expansion(0.5) { |word| word.chomp("'s") if word =~ /'s$/ }
    @analyzer.expansion(3.0) { |word| word.split('-') if word =~ /-/ }
    @analyzer.expansion(0.1) { |word| word.gsub('-', '') if word =~ /-/ }
    orig = "joe's sushi & bait-shop shack"
    analyzed = [
      ["joe's", {"joe" => 0.5, "joes" => 0.5}],
      "sushi",
      "and",
      ["bait-shop", {"baitshop" => 0.1, ["bait", "shop"] => 3.0}],
      "shack"
    ]
    @analyzer.analyze(orig).should == analyzed
  end
end
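
One way the costs might feed into scoring is to flatten the analyzed structure into term/weight pairs, giving original tokens full weight and discounting each variant by its cost. This is only a sketch: `weighted_terms` and the `1.0 / (1.0 + cost)` weighting are assumptions made for illustration and are not part of the class above.

```ruby
# Hypothetical helper: turn Analyzer-style output into [term, weight] pairs.
# Plain tokens get weight 1.0; each variant gets 1.0 / (1.0 + cost), so a
# higher cost means a lower weight. The weighting formula is an assumption.
def weighted_terms(analyzed)
  analyzed.flat_map do |entry|
    # An entry is either a bare token or a [token, {variant => cost}] pair.
    token, variants = entry.is_a?(Array) ? entry : [entry, {}]
    pairs = [[token, 1.0]]
    variants.each do |variant, cost|
      # A variant may be a single word or an array of words (a subphrase).
      term = variant.is_a?(Array) ? variant.join(' ') : variant
      pairs << [term, 1.0 / (1.0 + cost)]
    end
    pairs
  end
end

analyzed = [
  ["joe's", {"joe" => 0.5, "joes" => 0.5}],
  "sushi",
  "and",
  ["bait-shop", {"baitshop" => 0.1, ["bait", "shop"] => 3.0}],
  "shack"
]

weighted_terms(analyzed).each do |term, weight|
  puts format('%-10s %.2f', term, weight)
end
```

A real IR application would probably hand these pairs to its query builder as boosts rather than print them.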


Deletes, Transposes, Replaces, Inserts

Very simplistic rudiments of a spell checker in Ruby. Based on Norvig’s article.

# useful for things like
module Edits
  DICT = { "cap" => 1, "carp" => 1, "clap" => 1, "cramp" => 1 }

  def deletes
    map_transforms { |word, i| word.delete_at(i) }
  end

  def transposes
    map_transforms { |word, i| word[i], word[i+1] = word[i+1], word[i] }
  end

  def replaces
    ("a".."z").map do |c|
      map_transforms { |word, i| word[i] = c }
    end.flatten
  end

  def inserts
    ("a".."z").map do |c|
      r = map_transforms { |word, i| word.insert(i, c) }
      # map_transforms never inserts after the last character, so try that too
      terminal = "#{self}#{c}"
      r << terminal if terminal.score
      r
    end.flatten
  end

  # Yields a fresh array of characters for each index, then keeps the
  # resulting word if it is in the dictionary and differs from the original.
  def map_transforms
    out = []
    chars = self.split('')
    self.size.times do |i|
      yield(word = chars.dup, i)
      word = word.join
      out << word if word.score && word != self
    end
    out
  end

  def score
    DICT[self]
  end
end

String.send(:include, Edits)

describe "a String, imbued with Edits" do
  it "works" do
    r = "crap".deletes
    r.should == %w{ cap }
    r = "crap".transposes
    r.should == %w{ carp }
    r = "crap".replaces
    r.should == %w{ clap }
    r = "crap".inserts
    r.should == %w{ cramp }
  end
end
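
For comparison, Norvig's article combines the four edit types into a single candidate set and picks the known word with the highest count. Here is a standalone sketch of that idea which does not touch `String`. The names `WORD_COUNTS`, `LETTERS`, `edits1`, and `correct` are illustrative, the counts are made up, and the second round of edits from the article is left out.

```ruby
require 'set'

# Made-up word counts standing in for a real corpus-derived dictionary.
WORD_COUNTS = { "cap" => 3, "carp" => 1, "clap" => 5, "cramp" => 2 }.freeze
LETTERS = ("a".."z").to_a.freeze

# All strings one edit away from word: deletes, transposes, replaces, inserts.
def edits1(word)
  # Every way of cutting the word into a head and a tail.
  splits = (0..word.size).map { |i| [word[0...i], word[i..-1]] }
  deletes    = splits.reject { |_, b| b.empty? }.map { |a, b| a + b[1..-1] }
  transposes = splits.select { |_, b| b.size > 1 }
                     .map { |a, b| a + b[1] + b[0] + b[2..-1] }
  replaces   = splits.reject { |_, b| b.empty? }
                     .flat_map { |a, b| LETTERS.map { |c| a + c + b[1..-1] } }
  inserts    = splits.flat_map { |a, b| LETTERS.map { |c| a + c + b } }
  Set.new(deletes + transposes + replaces + inserts)
end

# Return the word itself if known, else the most frequent known word one
# edit away, else the word unchanged.
def correct(word, counts = WORD_COUNTS)
  return word if counts.key?(word)
  known = edits1(word).select { |w| counts.key?(w) }
  known.max_by { |w| counts[w] } || word
end

puts correct("crap")
```

Working from head/tail splits means insertion at the end of the word needs no special case, unlike the `terminal` handling in `inserts` above.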


Instant.rake: Compile and run individual Java classes using Rake

Sometimes, when forced to work with Java, you just want to copy and paste some code and fiddle with it. A real project build system is overkill. Try Instant.rake:

Improved object wrapper for JRuby Embed

New in JRuby 1.4 is JRuby Embed, which lets you eval Ruby from Java classes. It works, appears to be well-written, and needs some sugar. Here’s a class that limits your options in a helpful way.

King’s Third Rule of Software Development

Any software project not written in Java will clearly state on its homepage the implementation language.

The Need of a Study of Anatomy (also Swans)

“In our initial sketches for compositions, when memory has to take the place of the living model, we rely to a great extent on our anatomical knowledge for the suggestion of action and form generally. And again it adds materially to our faculties for self-criticism, which, like a sense of humour, is often, nearly always, our salvation.”
Solomon J. Solomon, The practice of oil painting and of drawing as associated with it

Knowledge of your tools is necessary, but not sufficient. The choices you make when planning the structure of software depend on your knowledge of the problem domain. A project is limited (sometimes crippled) by your comprehension of the form and motion and constraints of the body before you.

“It looks like it was made with, you know… longing. Made by a person really longed to see a swan”
Kaylee, Firefly

Self-criticism and a sense of humor are ineluctably linked, I find. Those who have not the capacity to criticize their own efforts lack most of the capacity to laugh at their own failings. If you don’t think it’s funny when you spend two hours failing to find a mindless bug in a simple depth-first traversal function, then you’re not me.

The Beginning of the End for Rubyforge

Jamis Buck is abandoning development of SQLite/Ruby, SQLite3/Ruby, Net::SSH and Capistrano. I do not say this derogatorily; Jamis owes us Capistrano like George R. R. Martin owes us A Dance with Dragons.

In the comments to that post, Dr Nic asked,

… were there ever “core contributors” who could be all added to the rubyforge project’s admin so they can start releasing new versions? Or did you ask all of them and no one said they’d take over the project?

Jamis replied:

“[T]here are no other core contributors. I tried once to create something like that, but no one else seemed to have the “passion” or “vision”. Lots of people submitting patches (many of them quite good!), but no one demonstrating a real, general desire to dig into the internals. That’s kind of why I left it like I did—there really wasn’t any heir-apparent that the keys could be left to.

“That said, if someone steps forward and seems to be getting community support (for any of the projects) behind them, I’ll be happy to give them admin access to the appropriate rubyforge pages.”

Rubyforge served a purpose for several years, and served it well. But Rubyforge is a bottleneck in the distribution of code, and this is exacerbated by the Ruby community’s reliance not only on RubyGems, but on the idea of the canonical, official version of a project. The increased popularity of distributed version control releases some of the pressure. GitHub has substantially reduced the friction involved in collaboration. Even so, the idea still holds that once a line of work is ready, you release it on Rubyforge, so that it’s official.

Good coders, even those not afflicted with a love of novelty, will eventually grow bored with their projects. The distribution model represented by Rubyforge cannot, or at least should not, long survive this human tendency.

John Galt, meet Paul Graham

From the Arc tutorial, line 372

arc> (is 'a 'a)

(For those who don’t get the joke, don’t worry, it wasn’t that funny. Neither was Ayn Rand.)

"Tables are the [lisp] lists of html"

Paul Graham:

Tables are the lists of html. The W3C doesn’t like you to use tables to do more than display tabular data because then it’s unclear what a table cell means. But this sort of ambiguity is not always an error.

Zed Shaw:

I may never do another CSS only layout again. I’m starting to wonder how … we got sucked into that crap, especially if the only way to really get a good looking layout with CSS and div tags is with mountains of stylesheets, html, and sometimes some damn javascript.

I’m not kidding about the javascript. I’ve seen people desperately trying to force their square-peg 3 column layout through the CSS round hole resort to javascript tricks to force the columns in the right spots.…

It’s not gauche to do what’s easiest and nobody’s going run you out of Designer Town (population 100) with sharpened pitchforks and blazing torches.

Insincere apologies to Zed for the bowdlerization.

Arc, at last

Arc, a long-awaited dialect of Lisp, is finally available to the public for testing. As the man says:

“Arc is still fluid and future releases are guaranteed to break all your code. In fact, it was mainly to aid the evolution of the language that we even released it.”

Graham has been influential in encouraging the curious to try out a lisp of one sort or another. Since I first read his articles about Common Lisp and began tracking the infuriatingly undocumented progress of Arc, several new lisps have appeared. Here is wealth, and may Arc enrichen us further.
