IDAT204 – Global Record Collection

The overall aim of this system is to create a decentralised database containing information on all music, built on semantic web principles.

The original proposal outlines a standardised format for this information and a way to navigate through it that could potentially be a ‘killer application’ for the semantic web. Discovering new music is not easily done with traditional web searches, which require you to know, to an extent, what you are looking for, and which return cluttered and often unhelpful information. A Global Record Collection (GRC) would work on the simple idea that individuals who like one thing are statistically likely to share a liking for another.

Operation

At first the software presents the user with data from a GRCML document, which is either their own, one belonging to a specific person or entity, or one belonging to a community. This gives them a number of directions to go in; suppose they select ‘Track A’, which they already know. A query is then sent to a server with access to a collated copy of the entire GRC, along with a user ID and a session key to prevent repeated information being returned. The server generates a list of the most popular music that people who have ‘Track A’ also have. The user ID allows the server to look up what the client already has, and the session key allows it to keep track of what not to show again. A GRCML document is then generated and returned to the software, which displays the new information for the user to explore. The user can then select another piece of music from the results and repeat the process indefinitely.
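The ranking step described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the system's actual implementation: the `recommend` function, the in-memory `collections` dictionary and the `already_shown` set standing in for the session key are all hypothetical names.

```python
from collections import Counter


def recommend(track_id, collections, user_id, already_shown, limit=10):
    """Rank tracks that other owners of track_id also have, most popular first."""
    owned = collections.get(user_id, set())
    counts = Counter()
    for other_id, tracks in collections.items():
        if other_id == user_id or track_id not in tracks:
            continue
        # Count every other track this user holds alongside the selected one.
        counts.update(tracks - {track_id})
    # Leave out what the client already has and what this session has shown.
    excluded = owned | already_shown
    results = [(t, n) for t, n in counts.most_common() if t not in excluded][:limit]
    # Remember what was returned so it is not shown again in this session.
    already_shown.update(t for t, _ in results)
    return results


# Hypothetical data: user ID -> set of track IDs in that user's collection.
collections = {
    "u1": {"track_a", "track_b"},
    "u2": {"track_a", "track_c", "track_d"},
    "u3": {"track_a", "track_c"},
    "u4": {"track_e"},
}
session_seen = set()
first = recommend("track_a", collections, "u1", session_seen)
second = recommend("track_a", collections, "u1", session_seen)
```

Here the first call ranks `track_c` above `track_d`, because two other owners of `track_a` have it, and the second call returns nothing new because the session has already seen both.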

On the servers

Systems managing this data would need to crawl the web for GRCML documents and metadata. They would then need to combine this data, removing duplicates and verifying its integrity using some form of reliability ranking system and a mechanism to correct errors such as spelling mistakes. Each user and each track would be assigned a unique ID, allowing a relational database to be formed. Queries would be sent to the server through URLs containing the unique ID of the selected item. The database would be searched and would return the unique IDs of all, or a random sample of, the users with the same item. Each of these users' collections would then be searched, and the results merged and ranked by popularity. From this a new GRCML document is generated and sent to the user. A generated example of the response is available here.
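As a rough illustration of the duplicate-removal step, the sketch below merges crawled records whose artist and title differ only in case, punctuation or spacing. It is a simplification under assumptions: the function names are hypothetical, deriving the ID from a hash of the normalised text is my own choice rather than part of the proposal, and it does not attempt spelling correction or reliability ranking.

```python
import hashlib
import re


def normalise(text):
    """Lower-case the text, strip punctuation and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def make_track_id(artist, title):
    """Derive a stable ID from the normalised artist and title (an assumption)."""
    key = normalise(artist) + "|" + normalise(title)
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:15]


def merge_records(records):
    """Keep one record per track ID; the first record seen wins."""
    merged = {}
    for record in records:
        merged.setdefault(make_track_id(record["artist"], record["title"]), record)
    return merged


# Hypothetical crawled records; the first two describe the same track.
crawled = [
    {"artist": "Pendulum", "title": "Granite"},
    {"artist": " pendulum ", "title": "GRANITE!"},
    {"artist": "Pendulum", "title": "Showdown"},
]
merged = merge_records(crawled)
```

A real system would need something more forgiving than exact matching after normalisation, for example fuzzy matching weighted by how reliable each source has been.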

An individual item with a unique ID.

<song id="fz54n0aw2nw576d">
	<artist>Pendulum</artist>
	<title>Granite</title>
	<track>9</track>
	<album>In Silico</album>
	<art>In Silico.jpg</art>
	<bitrate>227300</bitrate>
	<length>4:42</length>
</song>
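For anyone wanting to read such an item in code, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The `parse_song` function and the `length_seconds` field are hypothetical names of my own, and the sketch assumes the length is always written as minutes:seconds, as in the example above.

```python
import xml.etree.ElementTree as ET

SONG_XML = """<song id="fz54n0aw2nw576d">
    <artist>Pendulum</artist>
    <title>Granite</title>
    <track>9</track>
    <album>In Silico</album>
    <art>In Silico.jpg</art>
    <bitrate>227300</bitrate>
    <length>4:42</length>
</song>"""


def parse_song(xml_text):
    """Turn one <song> element into a plain dictionary."""
    song = ET.fromstring(xml_text)
    # Assumes a minutes:seconds length, as in the example item.
    minutes, seconds = song.findtext("length").split(":")
    return {
        "id": song.get("id"),
        "artist": song.findtext("artist"),
        "title": song.findtext("title"),
        "track": int(song.findtext("track")),
        "album": song.findtext("album"),
        "art": song.findtext("art"),
        "bitrate": int(song.findtext("bitrate")),
        "length_seconds": int(minutes) * 60 + int(seconds),
    }


song = parse_song(SONG_XML)
```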

Interface design

There is wide scope for different interfaces for navigating a GRC. I opted for a design that would be intuitive on touch-screen handheld devices.

Demo download

A copy of the interface with a mock GRCML dataset is available here:
Windows
OSX

Twitter adaptation

The same structure could be used for searching other related data where users do not know exactly what they are looking for but are interested in branching out from what they already know. I adapted my interface to use data from Twitter. This allows the user to browse through all the people a person follows. They can then select a person of interest and see who that person follows.

Currently it can only start from my Twitter account and is slow to load new data. Navigate with the arrow keys and select with the Enter key.

Download:
Windows
OSX

Processing.js version
