Music Recognition

YASH VADARIA
Nov 4, 2020

Abstract - Have you ever heard a song you recognize but could not remember its name? If so, here is a solution. In this project I developed my own music recognition web application in React JS. The project is divided into two main parts: the frontend of the web app and the backend of the application.

I. INTRODUCTION

Have you ever heard a song playing at a neighbor’s place and wondered what it was called or who the artist was? If so, this project is a way to solve that problem. Most smartphones can already do this with the help of Google Sound Search. But that only helps smartphone users: what if you are on a laptop or desktop, or an older device that does not support this kind of search? On the technical side, I have divided the project into two main sections. The first is the frontend, which includes the UI of the web application, collecting MP3 clips, and so on. The second, and main, section covers creating the fingerprint for ACRCloud, the Spotify API, the OAuth token, and so on.

II. ACRCloud

Automatic content recognition (ACR) is an identification technology that recognizes content played on a media device or present in a media file. It lets users quickly obtain detailed information about the content they have just experienced without any text-based input or search effort. ACR can help users deal with multimedia more effectively and make applications more intelligent. Now let’s see how ACR works. The image below shows an example from Shazam, one of the best-known song-search applications today.

III. AUDIO FINGERPRINT

Audio fingerprinting (also called acoustic fingerprinting) is one of the most stable and effective ACR techniques and is widely used in many applications.

An audio fingerprint is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database. The following picture gives an intuitive understanding of an audio fingerprint: the black lines and points can be taken as the fingerprints.

Practical uses of audio fingerprinting include identifying songs, melodies, tunes, or advertisements; sound effect library management; and video file identification. Media identification using acoustic fingerprints can be used to monitor the use of specific musical works and performances on radio broadcast, records, CDs and peer-to-peer networks. This identification has been used in copyright compliance, licensing, and other monetization schemes.
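To make the idea of locating a sample in an audio database concrete, here is a minimal sketch. It assumes fingerprints are stored as fixed-length bit strings and compared by Hamming distance; the track titles, the bit patterns, and the `maxDistance` threshold are all made up for illustration, and this is not how ACRCloud necessarily implements its lookup.

```javascript
// Hypothetical reference database: track title plus a 16-bit fingerprint.
const referenceDb = [
  { title: "Track A", bits: "1010110011110000" },
  { title: "Track B", bits: "0101001100001111" },
  { title: "Track C", bits: "1111000010101100" },
];

// Number of positions at which two equal-length bit strings differ.
function hammingDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("fingerprints must have equal length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}

// Return the closest reference entry, or null when nothing is close enough.
function findBestMatch(queryBits, db, maxDistance = 3) {
  let best = null;
  for (const entry of db) {
    const distance = hammingDistance(queryBits, entry.bits);
    if (distance <= maxDistance && (best === null || distance < best.distance)) {
      best = { title: entry.title, distance };
    }
  }
  return best;
}

// A query that differs from "Track A" in two bit positions still matches it.
console.log(findBestMatch("1010110011110011", referenceDb));
```

A linear scan like this is only reasonable for a tiny database. Production systems typically index fingerprints so that candidates can be found without comparing against every track.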

The standard workflow of audio fingerprinting is shown below.

A robust acoustic fingerprint algorithm must take into account the perceptual characteristics of the audio. If two files sound alike to the human ear, their acoustic fingerprints should match, even if their binary representations are quite different. Acoustic fingerprints are not bitwise fingerprints, which must be sensitive to any small changes in the data. Acoustic fingerprints are more analogous to human fingerprints where small variations that are insignificant to the features the fingerprint uses are tolerated. One can imagine the case of a smeared human fingerprint impression which can accurately be matched to another fingerprint sample in a reference database; acoustic fingerprints work in a similar way.
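The difference between a bitwise fingerprint and an acoustic one can be illustrated with a toy experiment. The sketch below is only an assumption-laden illustration, not the algorithm used by ACRCloud or Shazam: it builds a synthetic tone, adds a tiny disturbance, and compares a SHA-256 digest (bitwise) with a very coarse signature that records whether the energy rises or falls from one frame to the next. All names and constants are invented for the example.

```javascript
const { createHash } = require("node:crypto");

// Toy "recording": a 440 Hz tone whose loudness changes from frame to frame.
const SAMPLE_RATE = 8000;
const FRAME_SIZE = 400;
const LOUDNESS = [0.2, 0.8, 0.5, 0.9, 0.1, 0.6, 0.3, 0.7];

function makeSignal(noiseLevel) {
  const samples = new Float32Array(FRAME_SIZE * LOUDNESS.length);
  for (let i = 0; i < samples.length; i++) {
    const amp = LOUDNESS[Math.floor(i / FRAME_SIZE)];
    const tone = amp * Math.sin((2 * Math.PI * 440 * i) / SAMPLE_RATE);
    // Deterministic pseudo-noise stands in for compression artifacts.
    const noise = noiseLevel * Math.sin(i * 12.9898);
    samples[i] = tone + noise;
  }
  return samples;
}

// Bitwise fingerprint: any changed byte produces a different digest.
function bitwiseHash(samples) {
  return createHash("sha256").update(Buffer.from(samples.buffer)).digest("hex");
}

// Toy acoustic signature: one bit per pair of adjacent frames,
// set to 1 when the frame energy rises and 0 otherwise.
function coarseSignature(samples) {
  const energies = [];
  for (let start = 0; start < samples.length; start += FRAME_SIZE) {
    let energy = 0;
    for (let i = start; i < start + FRAME_SIZE; i++) {
      energy += samples[i] * samples[i];
    }
    energies.push(energy);
  }
  return energies
    .slice(1)
    .map((energy, k) => (energy > energies[k] ? 1 : 0))
    .join("");
}

const clean = makeSignal(0);
const disturbed = makeSignal(0.001);

console.log("same SHA-256 digest:  ", bitwiseHash(clean) === bitwiseHash(disturbed));
console.log("same coarse signature:", coarseSignature(clean) === coarseSignature(disturbed));
```

The two signals sound essentially identical, yet their digests differ because the bytes differ, while the coarse signature is unchanged. Real fingerprinting algorithms use far richer features, but the tolerance to small variations is the same idea.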

Perceptual characteristics often exploited by audio fingerprints include average zero crossing rate, estimated tempo, average spectrum, spectral flatness, prominent tones across a set of frequency bands, and bandwidth.
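Two of these characteristics are simple enough to compute by hand. The following sketch shows one possible way to calculate the zero crossing rate and the spectral flatness of a short frame of samples. It uses a naive discrete Fourier transform for clarity rather than speed, and the function names are my own, not part of any library mentioned in this article.

```javascript
// Fraction of adjacent sample pairs whose sign differs.
function zeroCrossingRate(samples) {
  let crossings = 0;
  for (let i = 1; i < samples.length; i++) {
    if ((samples[i - 1] >= 0) !== (samples[i] >= 0)) crossings++;
  }
  return crossings / (samples.length - 1);
}

// Power spectrum via a naive DFT (O(n^2), acceptable for one short frame).
// The DC bin (k = 0) is skipped.
function powerSpectrum(samples) {
  const n = samples.length;
  const bins = [];
  for (let k = 1; k < n / 2; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    bins.push((re * re + im * im) / n);
  }
  return bins;
}

// Spectral flatness: geometric mean divided by arithmetic mean of the
// power spectrum. Close to 0 for a pure tone, higher for noise-like sound.
function spectralFlatness(samples) {
  const power = powerSpectrum(samples).map((p) => p + 1e-12); // avoid log(0)
  const logMean = power.reduce((sum, p) => sum + Math.log(p), 0) / power.length;
  const mean = power.reduce((sum, p) => sum + p, 0) / power.length;
  return Math.exp(logMean) / mean;
}

// Example frame: a pure tone with 16 full cycles in 256 samples.
const tone = Array.from({ length: 256 }, (_, t) => Math.sin((2 * Math.PI * 16 * t) / 256));
console.log("zero crossing rate:", zeroCrossingRate(tone));
console.log("spectral flatness: ", spectralFlatness(tone));
```

A fingerprinting system would compute features like these over many overlapping frames and combine them, rather than relying on a single number per clip.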

Most audio compression techniques (AAC, MP3, WMA, Vorbis) will make radical changes to the binary encoding of an audio file, without radically affecting the way it is perceived by the human ear. A robust acoustic fingerprint will allow a recording to be identified after it has gone through such compression, even if the audio quality has been reduced significantly. For use in radio broadcast monitoring, acoustic fingerprints should also be insensitive to analog transmission artifacts.

On the other hand, a good acoustic fingerprint algorithm must be able to identify a particular master recording among all the productions of an artist or group. For use as evidence in a court of law, an acoustic fingerprint method must be forensic in its accuracy.

IV. MP3 Sampling

For input, we provide a small clip of an MP3 file and convert it into a blob. Blob stands for binary large object, a data type that stores binary data. The clip’s size and length are among the main factors that affect the result. The ideal length of the MP3 clip is 6 to 8 seconds, with a size of no more than 5 to 10 MB. In this project, I used a library called mic-recorder-to-mp3 to record the clip.
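The recording step could be sketched roughly as follows. This is not the project’s actual code: `recordClip` and `estimateMp3Bytes` are hypothetical helpers, and the recorder is passed in as a parameter so the sketch does not depend on a browser. It assumes the recorder behaves as the mic-recorder-to-mp3 README describes, where `start()` returns a promise and `stop().getMp3()` resolves to a `[buffer, blob]` pair.

```javascript
// Rough size of a constant-bit-rate MP3 clip:
// kilobits per second * 1000 * seconds / 8 bits per byte.
function estimateMp3Bytes(bitRateKbps, seconds) {
  return (bitRateKbps * 1000 * seconds) / 8;
}

// Record for `durationMs` milliseconds and resolve with the MP3 Blob.
// `recorder` is expected to behave like a mic-recorder-to-mp3 instance.
async function recordClip(recorder, durationMs) {
  await recorder.start();
  await new Promise((resolve) => setTimeout(resolve, durationMs));
  const [, blob] = await recorder.stop().getMp3();
  return blob;
}

// In the browser this might be wired up as follows (not executed here):
//   import MicRecorder from "mic-recorder-to-mp3";
//   const recorder = new MicRecorder({ bitRate: 128 });
//   const blob = await recordClip(recorder, 8000);

// An 8-second clip at 128 kbps is about 128,000 bytes.
console.log(estimateMp3Bytes(128, 8));
```

At that bit rate an 8-second clip stays far below the size limit mentioned above, so clip length is usually the setting that matters most.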

V. React and Libraries

React is one of the fastest-growing frameworks for building web applications. Facebook, Instagram, WhatsApp, and Dropbox are some of the popular examples that use ReactJS. Based on this research, I started building various projects on React. In this project, I used Material UI, one of the most popular libraries for React. It provides various built-in components such as buttons, sliders, etc. Now let’s look at some other libraries. The first is axios, a package used to make requests to an API, such as POST and GET. I have also created a list of the top music in the world, which is fully dynamic: it changes as songs trend. The second is CryptoJS, which is widely used to perform various encoding operations on requests and JSON objects.
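As an illustration of the kind of encoding involved, the sketch below signs an identification request. It is written under assumptions: the string-to-sign layout and the field names follow my reading of ACRCloud’s identification API (signature version 1) and should be checked against the official documentation [3]; the credentials are placeholders; and Node’s built-in `crypto` module is used so the example runs on its own, whereas in the browser the same HMAC-SHA1 could be computed with CryptoJS (`CryptoJS.HmacSHA1` and `CryptoJS.enc.Base64.stringify`).

```javascript
const { createHmac } = require("node:crypto");

// Build the signed form fields for an identify request.
// Assumption: layout and field names as documented for ACRCloud's
// identification API, signature version 1.
function buildIdentifyFields({ accessKey, accessSecret, sampleBytes, timestamp }) {
  const httpMethod = "POST";
  const httpUri = "/v1/identify";
  const dataType = "audio";
  const signatureVersion = "1";

  // The fields are joined with newlines before signing.
  const stringToSign = [
    httpMethod,
    httpUri,
    accessKey,
    dataType,
    signatureVersion,
    timestamp,
  ].join("\n");

  // HMAC-SHA1 keyed with the access secret, then Base64-encoded.
  const signature = createHmac("sha1", accessSecret)
    .update(stringToSign)
    .digest("base64");

  return {
    access_key: accessKey,
    data_type: dataType,
    signature_version: signatureVersion,
    signature,
    sample_bytes: sampleBytes,
    timestamp,
  };
}

// Placeholder credentials, for illustration only.
const fields = buildIdentifyFields({
  accessKey: "my-access-key",
  accessSecret: "my-access-secret",
  sampleBytes: 128000,
  timestamp: 1604448000,
});
console.log(fields);
```

In the web app, these fields would be appended to a `FormData` object together with the recorded blob (as the `sample` field) and sent with `axios.post` to the identify endpoint of the host assigned to your ACRCloud project.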

REFERENCES

[1] https://developer.spotify.com/console/

[2] https://reactjs.org/docs/getting-started.html

[3] https://docs.acrcloud.com/docs/acrcloud

[4] https://material-ui.com/

GitHub link: https://github.com/yashfrost1410/music-recognition
