Solr sends JSON as text/plain

Yet another reason not to use Solr. The discussion in this Jira issue is interesting.

The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser.

Is there perhaps a more general feature we could turn this into? An expert level ability or parameter to set a custom content-type?

The problem right now is in the current class hierarchy of the response writers.

The NamedList is a weird data structure for those who are not used to Solr. You don’t know what is included in it unless you do an instanceof. Most users are happy just to write out the documents.

To handle this problem I would use ‘wt=json&wt.mime-type=application/json’.

Phrase analysis and expansion with Ruby

The idea is to take a phrase and analyze it for use in Information Retrieval. We need to tokenize it into words, possibly transmute some of the tokens, possibly expand some tokens into subphrases. This class lets you register lambdas to perform transformations, substitutions, and expansions. Expansions can take a numerical value representing the cost of the operation; this is intended for raising or lowering the scores of matches in the theoretical IR application.

Given the phrase “joe’s sushi & bait-shop shack”, assume I want to tokenize on whitespace, replace the ampersand with the word “and”, and create word variants for the hyphenized and apostrophized words. See the last spec for an example of the Ruby data structure this class generates.
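The class itself is not reproduced in this excerpt, so what follows is only a rough sketch of the idea under stated assumptions. The class name `PhraseAnalyzer`, the methods `substitute`, `expand`, and `analyze`, and the shape of the returned structure are all hypothetical and chosen for illustration; the original API and its output format may differ.

```ruby
# Hypothetical sketch: names and output shape are assumptions, not the original class.
class PhraseAnalyzer
  def initialize(&tokenizer)
    # Default tokenizer splits on whitespace.
    @tokenizer = tokenizer || ->(phrase) { phrase.split }
    @substitutions = []
    @expansions = []
  end

  # Register a lambda that maps a token to its replacement, or nil to leave it alone.
  def substitute(&block)
    @substitutions << block
    self
  end

  # Register a lambda that maps a token to an array of variants (or nil).
  # Every variant it produces carries the given cost.
  def expand(cost: 1.0, &block)
    @expansions << [cost, block]
    self
  end

  # Returns one array per token: the token itself at cost 0.0, then any variants.
  def analyze(phrase)
    @tokenizer.call(phrase).map do |token|
      token = @substitutions.reduce(token) { |t, sub| sub.call(t) || t }
      variants = @expansions.flat_map do |cost, exp|
        Array(exp.call(token)).map { |v| { text: v, cost: cost } }
      end
      [{ text: token, cost: 0.0 }] + variants
    end
  end
end

analyzer = PhraseAnalyzer.new
analyzer.substitute { |t| 'and' if t == '&' }
analyzer.expand(cost: 0.5) { |t| [t.delete('-'), t.tr('-', ' ')] if t.include?('-') }
analyzer.expand(cost: 0.5) { |t| [t.delete("'")] if t.include?("'") }

result = analyzer.analyze("joe's sushi & bait-shop shack")
```

In this sketch the cost is just a number attached to each variant; how it raises or lowers a match score is left to the IR application.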

Deletes, Transposes, Replaces, Inserts

Very simplistic rudiments of a spell checker in Ruby. Based on Norvig’s article.
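The post's code is not included in this excerpt. As a minimal sketch of the four operations in the title, assuming lowercase ASCII words and following the candidate-generation step described in Norvig's article, something like this would do (the name `edits1` is borrowed from that article; the Ruby details are my own):

```ruby
# Every string one delete, transpose, replace, or insert away from the word.
# Assumes a lowercase a-z alphabet.
LETTERS = ('a'..'z').to_a.freeze

def edits1(word)
  # All ways of splitting the word into a left part and a right part.
  splits = (0..word.length).map { |i| [word[0, i], word[i..-1]] }

  # Drop the first character of the right part.
  deletes = splits.reject { |_, r| r.empty? }
                  .map { |l, r| l + r[1..-1] }

  # Swap the first two characters of the right part.
  transposes = splits.select { |_, r| r.length > 1 }
                     .map { |l, r| l + r[1] + r[0] + r[2..-1] }

  # Replace the first character of the right part with each letter.
  replaces = splits.reject { |_, r| r.empty? }
                   .flat_map { |l, r| LETTERS.map { |c| l + c + r[1..-1] } }

  # Insert each letter between the left and right parts.
  inserts = splits.flat_map { |l, r| LETTERS.map { |c| l + c + r } }

  (deletes + transposes + replaces + inserts).uniq
end
```

A real spell checker would then filter these candidates against a dictionary and rank them by word frequency; that part is omitted here.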

Even Solr sucks less when you add Rake

Instant.rake: Compile and run individual Java classes using Rake

Sometimes, when forced to work with Java, you just want to copy and paste some code and fiddle with it. A real project build system is overkill. Try Instant.rake:

Improved object wrapper for JRuby Embed

New in JRuby 1.4 is JRuby Embed, which lets you eval Ruby from Java classes. It works, appears to be well-written, and needs some sugar. Here’s a class that limits your options in a helpful way.

King’s Third Rule of Software Development

Any software project not written in Java will clearly state on its homepage the implementation language.

N-grams and N-logs with Ruby

N-grams are useful in spell-checking, for example. I’ve been working on a project where I need to extract the word-level equivalent of n-grams from phrases. Lacking a better name, I call them n-logs. As one might expect, Ruby makes this easy. Here is a pipe-friendly script:

Modifying this script to create n-grams instead is trivial.
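The script is missing from this excerpt, so here is a minimal sketch of the idea rather than the original. It assumes whitespace tokenization, and the method names `nlogs` and `ngrams` are my own. `Enumerable#each_cons` does the sliding-window work in both cases.

```ruby
# Word-level n-grams ("n-logs"): every run of n consecutive words in a phrase.
def nlogs(phrase, n)
  phrase.split.each_cons(n).map { |words| words.join(' ') }
end

# Character-level n-grams, for comparison: the same window over characters.
def ngrams(word, n)
  word.chars.each_cons(n).map(&:join)
end
```

To make it pipe-friendly, one option is to read phrases line by line, for example `$stdin.each_line { |line| puts nlogs(line, 2) }`, so that each n-log lands on its own output line.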

The Need of a Study of Anatomy (also Swans)

“In our initial sketches for compositions, when memory has to take the place of the living model, we rely to a great extent on our anatomical knowledge for the suggestion of action and form generally. And again it adds materially to our faculties for self-criticism, which, like a sense of humour, is often, nearly always, our salvation.”
Solomon J. Solomon, The practice of oil painting and of drawing as associated with it

Knowledge of your tools is necessary, but not sufficient. The choices you make when planning the structure of software depend on your knowledge of the problem domain. A project is limited (sometimes crippled) by your comprehension of the form and motion and constraints of the body before you.

“It looks like it was made with, you know… longing. Made by a person really longed to see a swan”
Kaylee, Firefly

Self-criticism and a sense of humor are ineluctably linked, I find. Those who have not the capacity to criticize their own efforts lack most of the capacity to laugh at their own failings. If you don’t think it’s funny when you spend two hours failing to find a mindless bug in a simple depth-first traversal function, then you’re not me.

Determine local (not inherited or mixed in) methods in Ruby

I’ve been finding this useful for exploration of other people’s code.
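The snippet itself is not part of this excerpt. One way to get a similar result, offered as a sketch and not necessarily the approach the post used, relies on `Module#instance_methods(false)`, which skips methods from superclasses and included modules. The helper name `local_methods` and the example classes are hypothetical.

```ruby
module Greeter
  def greet; 'hello'; end
end

class Base
  def inherited_method; end
end

class Widget < Base
  include Greeter

  def spin; end
  def stop; end
end

# Hypothetical helper: public and protected instance methods defined directly
# on the object's class, excluding anything inherited or mixed in.
def local_methods(obj)
  obj.class.instance_methods(false).sort
end

widget_methods = local_methods(Widget.new)
```

For methods defined on a single object, `obj.singleton_methods(false)` plays the same role.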
