Archive for the ‘programming’ Category

Music Information Retrieval

Tuesday, March 18th, 2008

Over the weekend, I really got into music information retrieval (MIR). It’s basically extracting meta-information from an audio file by analyzing its waveform. This type of information is really valuable, especially for a music company (e.g., Grooveshark). If I ever have time, this would be a really fun side project. A really good source of information about this topic is this bibliography page (too bad it hasn’t been updated since August 2007). A list of up-and-running MIR systems can be found here.
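To give a rough feel for what "analyzing the waveform" can mean, here is a minimal sketch of two simple low-level features often computed from raw audio: RMS energy and zero-crossing rate. The function names and the input format (an array of decoded samples between -1 and 1) are assumptions made for illustration; a real MIR system would go far beyond this.

```javascript
// Minimal sketch: two simple low-level features computed from raw samples.
// `samples` is assumed to be an array of numbers in [-1, 1] (decoded PCM audio).

// Root-mean-square energy: a rough proxy for loudness.
function rmsEnergy(samples) {
  if (samples.length === 0) return 0;
  var sumSquares = 0;
  for (var i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
// Noisy or percussive signals tend to score higher than smooth tonal ones.
function zeroCrossingRate(samples) {
  if (samples.length < 2) return 0;
  var crossings = 0;
  for (var i = 1; i < samples.length; i++) {
    if ((samples[i - 1] >= 0) !== (samples[i] >= 0)) {
      crossings++;
    }
  }
  return crossings / (samples.length - 1);
}
```

Numbers like these, computed over short windows of a song, are the sort of raw material that could later feed recommendations or categorization.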

What makes MIR systems so important is that for music sites, they can generate a lot of useful data without anyone having to enter it by hand. For iTunes, this is not a problem because labels give them all the information they need, but for sites where song files can come from anywhere and anyone, there’s no way you can handle the variability in data quality and availability. By having a system that could automatically fetch the required info, within certain bounds of error, you create a vast collection of information that you can use to generate recommendations, provide more accurate searches, and create better categorization of all that music.

The problem with MIR systems is that they require large amounts of storage space and processing power. The costs of both are dropping every day, which is great for the future of MIR systems. Processing power is the biggest limiting factor, especially when you try to analyze millions of songs. The only companies that could probably do a project like this on a large scale would be Google, Amazon, and their ilk. Still, I’m very hopeful that a startup with the right mix of programmers, hardware, and music can compete with the big boys ;)

JavaScript Tip: getElementsById

Monday, March 17th, 2008

Browsers do a lousy job of providing an interface to the HTML document. The DOM is supposed to be that interface, but it is horribly slow and clunky. Traversing the DOM tree extensively is one of the sure-fire ways to slow down your site. In an effort to help out JavaScript coders, the DOM does have functions like getElementsByTagName, getElementsByClassName, and getElementsByName, but they do not all work across all browsers. This is why you should create a function called getElementsById:

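// Cache of node groups, keyed by the shared id.
var groupCache = {};
function getElementsById(id){
  if(!groupCache[id]){
    groupCache[id] = [];
  }
  var nodes = groupCache[id];
  // Drop cached nodes that have since been given a new id.
  for(var x=0; x<nodes.length; x++){
    if(nodes[x].id != ""){
      nodes.splice(x, 1);
      x--;
    }
  }
  // getElementById returns one match at a time, so clear each match's id
  // to expose the next element that shares it.
  var tmpNode = document.getElementById(id);
  while(tmpNode){
    nodes.push(tmpNode);
    tmpNode.id = "";
    tmpNode = document.getElementById(id);
  }
  return nodes;
}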

Now whenever you want a collection of DOM objects, just give all of them the same id and call this function to grab an array of the objects you want. This is not the ideal way, and it’s actually a pretty big hack. But sometimes, speed is more important than form.
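As a rough usage sketch, the snippet below hides and then re-shows a group of elements collected this way. The helper name `setGroupDisplay` and the `id="row"` markup are made up for illustration, and the browser-only part assumes the getElementsById function from this post has already been loaded on the page.

```javascript
// Hypothetical helper: apply one display value to every node in a group.
function setGroupDisplay(nodes, value) {
  for (var i = 0; i < nodes.length; i++) {
    nodes[i].style.display = value;
  }
  return nodes.length; // number of nodes touched
}

// Browser-only usage; assumes getElementsById from the post is loaded and
// the page has several elements with id="row" (hypothetical markup).
if (typeof document !== "undefined" && typeof getElementsById === "function") {
  setGroupDisplay(getElementsById("row"), "none");
  // A second call reuses the cached array, so the same rows come back
  // even though their id attributes are now empty.
  setGroupDisplay(getElementsById("row"), "");
}
```

One side effect worth keeping in mind: because the function blanks each element's id, any CSS rule or other script that targets `#row` will stop matching those elements after the first call.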

OpenLaszlo

Saturday, March 15th, 2008

I just discovered this really cool application called OpenLaszlo. It is an “open-source platform for the development and delivery of rich Internet applications on the World Wide Web.” One of their claims is “write once, run anywhere” (I think we’ve heard this before - Java).

To me, their syntax and coding style are very similar to Flex. What’s most impressive is that when you “compile” your page/site, you have the option of compiling to Flash or DHTML. As someone who works with HTML/CSS all day, I think they must have a pretty impressive algorithm to ensure browser compatibility, especially on the CSS side.

OpenLaszlo has a lot of promise and is really neat technology. If they really take off and become popular, they will effectively put me out of a job. In the end, they have automated the entire process of building a Web 2.0 AJAX-powered website and provide another way to produce Flash applications that makes sense to traditional web developers.

Using jQuery at Grooveshark

Monday, March 3rd, 2008

For the last year, I have used jQuery daily at work (Grooveshark) to handle different DOM and animation methods. I started out using jQuery because I was new to the browser world and JavaScript in particular. jQuery made it really easy to come right in and start creating usable modules.

As I got more comfortable with JavaScript, I started moving away from jQuery and building my own methods or extending jQuery. One problem with browsers in general is accessing the DOM, and any JavaScript-heavy site will always hit this bump. At Grooveshark, because everything is so heavily based on accessing/modifying song/artist/album information, we already have a unique identifier from the MySQL database. By using a contextual name-based system, we can get really fast DOM lookups very cheaply:

HTML:

<table class="songs">
	<tr id="song_123"><td>A Song</td></tr>
	<tr id="song_456"><td>My Song</td></tr>
		...	MORE SONGS ...
	<tr id="song_789"><td>Your Song</td></tr>
</table>

Javascript:

function song_click() {
	$("table.songs tr").click(function() {
		// The row id has the form "song_<database id>"
		var domid = $(this).attr("id");
		var songid = domid.split("_")[1];
		// possibilities endless...
	});
}

This is not exactly what we do (we currently use inline event handlers - I know, not very web2.0-ish of me), but this is the basic idea. The HTML elements themselves contain a reference to the data they represent, so whenever a user interacts with an element, it’s very easy to identify what it is.
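The same idea can be sketched without jQuery, using a single delegated listener on the table instead of one handler per row. This is only an illustration, not what Grooveshark runs: the function names `parseEntityId` and `findSongId` are hypothetical, and the markup is assumed to match the songs table above.

```javascript
// Extract the numeric database id from a DOM id such as "song_123".
// Returns null when the id does not follow the "<type>_<number>" pattern.
function parseEntityId(domId, type) {
  var match = /^([a-z]+)_(\d+)$/.exec(domId || "");
  if (!match || match[1] !== type) return null;
  return parseInt(match[2], 10);
}

// Walk up from the clicked node to the nearest ancestor carrying a song id.
function findSongId(node) {
  while (node) {
    var id = parseEntityId(node.id, "song");
    if (id !== null) return id;
    node = node.parentNode;
  }
  return null;
}

// One delegated listener on the table instead of one handler per row.
// Browser-only; assumes a <table class="songs"> like the one above.
if (typeof document !== "undefined") {
  var table = document.querySelector("table.songs");
  if (table) {
    table.addEventListener("click", function (event) {
      var songId = findSongId(event.target);
      if (songId !== null) {
        console.log("clicked song", songId);
      }
    });
  }
}
```

Keeping the id parsing in its own function means the same naming scheme could be reused for artists or albums by passing a different type.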

Google Maps: why are you so difficult?

Monday, March 3rd, 2008

So recently a friend asked me if I could map a list of IP addresses to physical real-world locations. I thought this would be a fairly simple endeavor, but little did I know what the universe had in store for me.

My first step was seeing how to map IP addresses to a location. This was the easiest part, using Host.info. This site provides an easy-to-use API and fairly accurate results. It’s always nice having open source solutions for problems like this.

The next step was taking these locations and putting them on a map. This is where Google and its almost infinite tools come into play. I had used their search API before (which has really sucked since they moved to the AJAX model), and I mistakenly thought the examples would give me everything I needed. The basic setup is the same, but the data formats Google used in the examples (KML or XML) did not provide a basic structure that the Google Maps API would render correctly. Looking around their API and searching the internet didn’t get me anywhere either.

At one point, I found someone who had developed their own solution using Google’s XML JavaScript object. This made it really easy to get any information I needed and allowed me to use any accepted XML standard, applying a location/marker tag and displaying the result along with any contextual information. I’m glad Google provides this functionality, but they could do a little better in providing a standard for their data formats. That, or I am a total dunce and totally missed that information.