This page permanently redirects to gemini://gemini.techrights.org/2010/07/09/speaking-with-code/.
Posted in Free/Libre Software, Patents at 4:51 pm by Dr. Roy Schestowitz
Summary: This post looks at patent bullying against Free software and it calls for the spreading of source code which Shazam unlawfully tries to remove from the Internet
EARLIER TODAY we wrote about NetApp's threats against ZFS distributors. As one blogger put it:
=> we wrote | NetApp's threats against ZFS distributors | ↺ put it
Enterprise Strategy Group senior analyst Terri McClure wonders why NetApp didn’t hit Nexenta with the same letter since Nexenta supplies its ZFS software to multiple storage vendors.
“If NetApp did it would make sense – stop a number of vendors instead of just one. It certainly makes you wonder why they would single out Coraid, people could read into this that NetApp sees Coraid as a threat. Coraid’s NAS product is pretty new but the underlying platform has been on the market a while and is solid, at a really aggressive price point,” said McClure.
“[NetApp] just spent a couple of hundred dollars in lawyer’s fees and took a competitor out of the market. Quick and easy, but a little disappointing, too. At the end of the day, ZFS is open source, and while there is no way to predict how the settlement talks between Oracle and NetApp will turn out, you can’t really un-open source ZFS,” she said.
There’s still no word from NetApp on the matter.
The “patent troll, NTP, is back, buoyed dosh from RIM,” says Glyn Moody, who found this new article.
NTP, a patent-holding company best known for prying a settlement of more than $600 million from the maker of the BlackBerry, is now suing the other big names in the smartphone industry: Apple, Google, Microsoft, HTC, LG and Motorola, writes The New York Times’s Steve Lohr.
The suits, filed late Thursday afternoon in federal district court in Richmond, Va., charge that the cellphone e-mail systems of those companies are illegally using NTP’s patented technology.
We mentioned NTP before and so did Patent Troll Tracker. Speaking of trolls, earlier today we wrote about Shazam's patent bullying. That previous post gave just the gist of it and the discussion at Slashdot ought to say more. From the summary:
=> mentioned NTP before | ↺ so did Patent Troll Tracker | Shazam's patent bullying | ↺ the discussion at Slashdot
“The code wasn’t even released, and yet Roy van Rijn, a Music & Free Software enthusiast received a C&D from Landmark Digital Services, owners of Shazam, a music service that allows you to find a song, by listening to a part of it. And if that wasn’t enough, they want him to take down his blog post (Google Cache) explaining how he did it because it ‘may be viewed internationally. As a result, [it] may contribute to someone infringing our patents in any part of the world.’”
Jan Wildeboer calls it “Patent Infringement Madness”, and of another post Wildeboer says: “is (a) a blog entry or (b) patent infringement? I say (a) Shazam says (b)”.
=> ↺ it | ↺ another post
Two readers urged us to make a mirror just in case (other people ought to mirror this too, in order to ensure that Shazam will lose hope of successfully censoring perfectly legal Dutch code).
Patents are supposed to encourage publication of ideas, not to suppress them. The following code is not in any way infringing Shazam copyrights. █
A couple of days ago I encountered this article: How Shazam Works
This got me interested in how a program like Shazam works… And more importantly, how hard is it to program something similar in Java?
About Shazam
Shazam is an application which you can use to analyse/match music. When you install it on your phone, and hold the microphone to some music for about 20 to 30 seconds, it will tell you which song it is.
When I first used it it gave me a magical feeling. “How did it do that!?”. And even today, after using it a lot, it still has a bit of magical feel to it.
Wouldn’t it be great if we can program something of our own that gives that same feeling? That was my goal for the past weekend.
Listen up..!
First things first: to get the music sample to analyse, we first need to listen to the microphone in our Java application…! This is something I hadn’t done yet in Java, so I had no idea how hard this was going to be.
But it turned out it was very easy:
```java
// Requires: import javax.sound.sampled.*;
final AudioFormat format = getFormat(); //Fill AudioFormat with the wanted settings
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
final TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format);
line.start();
```
Now we can read the data from the TargetDataLine just like a normal InputStream:
```java
// Requires: import java.io.*;
// In another thread I start:

// Declared as ByteArrayOutputStream so that toByteArray() can be called on it later:
ByteArrayOutputStream out = new ByteArrayOutputStream();
running = true;

try {
    while (running) {
        int count = line.read(buffer, 0, buffer.length);
        if (count > 0) {
            out.write(buffer, 0, count);
        }
    }
    out.close();
} catch (IOException e) {
    System.err.println("I/O problems: " + e);
    System.exit(-1);
}
```
Using this method it is easy to open the microphone and record all the sounds!
The AudioFormat I’m currently using is:
```java
private AudioFormat getFormat() {
    float sampleRate = 44100;
    int sampleSizeInBits = 8;
    int channels = 1; //mono
    boolean signed = true;
    boolean bigEndian = true;
    return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
}
```
So, now we have the recorded data in a ByteArrayOutputStream, great! Step 1 complete.
Microphone data
The next challenge is analyzing the data. When I outputted the data I received in my byte array I got a long list of numbers, like this:
```
0
0
1
2
4
7
6
3
-1
-2
-4
-2
-5
-7
-8
(etc)
```
Erhm… yes? This is sound?
To see if the data could be visualized I took the output and placed it in Open Office to generate a line graph:
Ah yes! This kind of looks like ‘sound’. It looks like what you see when using for example Windows Sound Recorder.
This data is actually known as time domain. But these numbers are currently basically useless to us… if you read the above article on how Shazam works you’ll read that they use a spectrum analysis instead of direct time domain data.
So the next big question is: How do we transform the current data into a spectrum analysis?
Discrete Fourier transform
To turn our data into usable data we need to apply the so called Discrete Fourier Transformation. This turns the data from time domain into frequency domain.
There is just one problem: if you transform the data into the frequency domain you lose every bit of information regarding time. So you’ll know what the magnitude of all the frequencies are, but you have no idea when they appear.
To solve this we need a sliding window. We take chunks of data (in my case 4096 bytes of data) and transform just this bit of information. Then we know the magnitude of all frequencies that occur during just these 4096 bytes.
Implementing this
Instead of worrying about the Fourier Transformation I googled a bit and found code for the so called FFT (Fast Fourier Transformation).
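The chunking idea from the sliding-window paragraph can be sketched on its own. This is a minimal sketch and not the author's code: it swaps in a naive O(n²) DFT for the FFT code the author found, it assumes the window moves in consecutive, non-overlapping chunks, and the class name SlidingWindowDft and its methods are hypothetical names chosen for illustration.

```java
/** Illustrative sketch only: naive DFT per chunk, standing in for a real FFT. */
class SlidingWindowDft {

    /** Magnitude of every frequency bin for one chunk of signed 8-bit samples. */
    public static double[] magnitudes(byte[] audio, int offset, int length) {
        double[] mags = new double[length];
        for (int k = 0; k < length; k++) {
            double re = 0, im = 0;
            for (int n = 0; n < length; n++) {
                double angle = -2 * Math.PI * k * n / length;
                re += audio[offset + n] * Math.cos(angle);
                im += audio[offset + n] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Cuts the recording into consecutive chunks (assumption) and transforms each. */
    public static double[][] spectrogram(byte[] audio, int chunkSize) {
        int chunks = audio.length / chunkSize; // a trailing partial chunk is dropped
        double[][] result = new double[chunks][];
        for (int c = 0; c < chunks; c++) {
            result[c] = magnitudes(audio, c * chunkSize, chunkSize);
        }
        return result;
    }

    /** Index of the strongest bin in the lower half of the spectrum. */
    public static int peakBin(double[] mags) {
        int best = 0;
        for (int k = 1; k <= mags.length / 2; k++) {
            if (mags[k] > mags[best]) best = k;
        }
        return best;
    }

    public static void main(String[] args) {
        // Two chunks of 8 samples: one cycle per chunk, then two cycles per chunk.
        byte[] audio = new byte[16];
        for (int n = 0; n < 8; n++) {
            audio[n] = (byte) Math.round(100 * Math.cos(2 * Math.PI * n / 8));
            audio[8 + n] = (byte) Math.round(100 * Math.cos(2 * Math.PI * 2 * n / 8));
        }
        double[][] spec = spectrogram(audio, 8);
        for (int c = 0; c < spec.length; c++) {
            System.out.println("chunk " + c + " peaks at bin " + peakBin(spec[c]));
        }
    }
}
```

Each row of the result describes one chunk, so the time information survives as the row index. For 4096-sample chunks the naive transform is far slower than an FFT, which is presumably why the post reaches for existing FFT code instead.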
I’m calling this code with the chunks:
```java
byte audio[] = out.toByteArray();

final int totalSize = audio.length;

int amountPossible = totalSize/Harvester.CHUNK_SIZE;

//When turning into frequency domain we'll need complex numbers:
Complex[][] results = new Complex[amountPossible][];

//For all the chunks:
for(int times = 0;times [...]
        [...] 1) {
            freq += (int) (Math.log10(line) * Math.log10(line));
        } else {
            freq++;
        }
    }
}
```
Introducing, Aphex Twin
This seems a bit of OT (off-topic), but I’d like to tell you about an electronic musician called Aphex Twin (Richard David James). He makes crazy electronic music… but some songs have an interesting feature. His biggest hit for example, Windowlicker has a spectrogram image in it.
If you look at the song as spectral image it shows a nice spiral. Another song, called ‘Mathematical Equation’ shows the face of Twin! More information can be found here: Bastwood – Aphex Twin’s face.
When running this song against my spectral analyzer I get the following result:
Not perfect, but it seems to be Twin’s face!
Determining the key music points
The next step in Shazam’s algorithm is to determine some key points in the song, save those points as a hash and then try to match on them against their database of over 8 million songs. This is done for speed, the lookup of a hash is O(1) speed. That explains a lot of the awesome performance of Shazam!
Because I wanted to have everything working in one weekend (this is my maximum attention span sadly enough, then I need a new project to work on) I kept my algorithm as simple as possible. And to my surprise it worked.
For each line in the spectrum analysis I take the points with the highest magnitude from certain ranges.
In my case: 40-80, 80-120, 120-180, 180-300.
```java
//For every line of data:

for (int freq = LOWER_LIMIT; freq [...]
    [...] highscores[index]) {
        highscores[index] = mag;
        recordPoints[index] = freq;
    }
}

//Write the points to a file:
for (int i = 0; i [...]
```
[...]
- List<String> (List index is Song-ID, String is songname)
- Database of hashes: Map<Long, List<DataPoint>>
The long in the database of hashes represents the hash itself, and it has a bucket of DataPoints.
A DataPoint looks like:
```java
private class DataPoint {

    private int time;
    private int songId;

    public DataPoint(int songId, int time) {
        this.songId = songId;
        this.time = time;
    }

    public int getTime() {
        return time;
    }
    public int getSongId() {
        return songId;
    }
}
```
Now we already have everything in place to do a lookup. First I read all the songs and generate hashes for each point of data. This is put into the hash-database.
The second step is reading the data of the song we need to match. These hashes are retrieved and we look at the matching datapoints.
There is just one problem, for each hash there are some hits, but how do we determine which song is the correct song..? Looking at the amount of matches? No, this doesn’t work…
The most important thing is timing. We must overlap the timing…! But how can we do this if we don’t know where we are in the song? After all, we could just as easily have recorded the final chords of the song.
By looking at the data I discovered something interesting, because we have the following data:
- A hash of the recording
- A matching hash of the possible match
- A song ID of the possible match
- The current time in our own recording
- The time of the hash in the possible match
Now we can subtract the current time in our recording (for example, line 34) from the time of the hash-match (for example, line 1352). This difference is stored together with the song ID.
Because this offset, this difference, tells us where we possibly could be in the song.
When we have gone through all the hashes from our recording we are left with a lot of song id’s and offsets. The cool thing is, if you have a lot of hashes with matching offsets, you’ve found your song.
The results
For example, when listening to The Kooks – Match Box for just 20 seconds, this is the output of my program:
```
Done loading: 2921 songs

Start matching song...

Top 20 matches:

01: 08_the_kooks_-match_box.mp3 with 16 matches.
02: 04 Racoon - Smoothly.mp3 with 8 matches.
03: 05 Röyksopp - Poor Leno.mp3 with 7 matches.
04: 07_athlete-_yesterday_threw_everyting_a_me.mp3 with 7 matches.
05: Flogging Molly - WMH - Dont Let Me Dia Still Wonderin.mp3 with 7 matches.
06: coldplay - 04 - sparks.mp3 with 7 matches.
07: Coldplay - Help Is Round The Corner (yellow b-side).mp3 with 7 matches.
08: the arcade fire - 09 - rebellion (lies).mp3 with 7 matches.
09: 01-coldplay-clocks.mp3 with 6 matches.
10: 02 Scared Tonight.mp3 with 6 matches.
11: 02-radiohead-pyramid_song-ksi.mp3 with 6 matches.
12: 03 Shadows Fall.mp3 with 6 matches.
13: 04 Röyksopp - In Space.mp3 with 6 matches.
14: 04 Track04.mp3 with 6 matches.
15: 05 - Dress Up In You.mp3 with 6 matches.
16: 05 Supergrass - Can't Get Up.mp3 with 6 matches.
17: 05 Track05.mp3 with 6 matches.
18: 05The Fox In The Snow.mp3 with 6 matches.
19: 05_athlete-wires.mp3 with 6 matches.
20: 06 Racoon - Feel Like Flying.mp3 with 6 matches.

Matching took: 259 ms

Final prediction: 08_the_kooks-_match_box.mp3.song with 16 matches.
```
It works!!
Listening for 20 seconds it can match almost all the songs I have. And even this live recording of the Editors could be matched to the correct song after listening 40 seconds!
Again it feels like magic!
Currently, the code isn’t in a releasable state and it doesn’t work perfectly.
It has been a pure weekend-hack, more like a proof-of-concept / algorithm exploration.
Maybe, if enough people ask about it, I’ll clean it up and release it somewhere. Or turn it into a huge online empire like Shazam… who knows!
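The offset-matching step described in the mirrored post can be illustrated with a short sketch. It is not the author's unreleased code: the class name OffsetMatcher, its methods, and the idea of passing in one precomputed hash per spectrum line are all assumptions made for illustration, and how the hash is computed from the key points is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: votes for (song, offset) pairs as the post describes. */
class OffsetMatcher {

    /** One occurrence of a hash: which song, and at which line (time) in it. */
    static final class DataPoint {
        final int songId;
        final int time;

        DataPoint(int songId, int time) {
            this.songId = songId;
            this.time = time;
        }
    }

    // Database of hashes: each hash maps to a bucket of DataPoints.
    private final Map<Long, List<DataPoint>> index = new HashMap<>();

    /** Indexes a song; hashes[t] is assumed to be the hash of line t. */
    public void addSong(int songId, long[] hashes) {
        for (int t = 0; t < hashes.length; t++) {
            index.computeIfAbsent(hashes[t], h -> new ArrayList<>())
                 .add(new DataPoint(songId, t));
        }
    }

    /** Returns the song ID with the most hashes agreeing on one offset, or -1. */
    public int match(long[] recording) {
        // Key packs (songId, offset); value counts hashes that agree on it.
        Map<Long, Integer> votes = new HashMap<>();
        int bestSong = -1;
        int bestVotes = 0;
        for (int t = 0; t < recording.length; t++) {
            List<DataPoint> bucket = index.get(recording[t]);
            if (bucket == null) continue;
            for (DataPoint p : bucket) {
                int offset = p.time - t; // where the recording would start in that song
                long key = ((long) p.songId << 32) | (offset & 0xFFFFFFFFL);
                int count = votes.merge(key, 1, Integer::sum);
                if (count > bestVotes) {
                    bestVotes = count;
                    bestSong = p.songId;
                }
            }
        }
        return bestSong;
    }

    public static void main(String[] args) {
        OffsetMatcher matcher = new OffsetMatcher();
        matcher.addSong(0, new long[] {1, 2, 3, 4, 5, 6, 7, 8});
        matcher.addSong(1, new long[] {3, 3, 3, 3, 2, 2, 2, 2, 1, 1});
        System.out.println("Best match: song " + matcher.match(new long[] {1, 2, 3}));
    }
}
```

In the example in main, song 1 produces more raw hash hits than song 0, but no two of them agree on an offset, whereas all three hits for song 0 share one offset. That is the reason the post gives for counting matching offsets instead of the plain number of matches.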
=> Techrights
➮ Sharing is caring. Content is available under CC-BY-SA.