Just out of curiosity I wanted to play around with Google Cloud Platform. They give $300 free credit for a 12-month trial period, so I thought this would be a good chance to try it out.

The APIs I wanted to try were speech recognition and translation.

Setting Up SDK

I followed the quick start guide, which is a step-by-step process, so it was quite helpful for getting acquainted with the basics.

To be able to follow the instructions I downloaded and installed the GCloud SDK. On Mac it's quite easy:

            curl | bash
            exec -l $SHELL
            gcloud init

Once it's complete, it requires you to log in to your account and grant access to the SDK.

Testing the API

After the initial setup I tried the sample request and it worked just fine:
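The sample request itself is not reproduced in this post. As a rough sketch only, this is how the JSON body of such a request could be put together. The assumptions are the v1 REST endpoint `https://speech.googleapis.com/v1/speech:recognize`, a hypothetical `gs://my-bucket/sample.flac` file, and a 16000 Hz sample rate. The helper name `build_recognize_request` is made up for illustration.

```python
import json

# Assumption: the v1 REST endpoint for synchronous recognition.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_uri, language_code="en-US",
                            encoding="FLAC", sample_rate_hertz=16000):
    """Return the request body as a dict (hypothetical helper)."""
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # The audio is referenced by URI instead of being embedded.
        "audio": {"uri": audio_uri},
    }


if __name__ == "__main__":
    # gs://my-bucket/sample.flac is a placeholder, not a real file.
    body = build_recognize_request("gs://my-bucket/sample.flac")
    print("POST", RECOGNIZE_URL)
    print(json.dumps(body, indent=2))
```

The printed body can then be sent with any HTTP client together with a bearer token.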

The example worked but also raised a few questions in my mind:

  1. The sample uses the gs protocol. First off, what does it mean?
  2. Can I use good ol' http instead of it and point to any publicly accessible audio file?
  3. Can I use MP3 as encoding or does it need to be FLAC?

As I learned from this SO thread, gs is used for Google Cloud Storage and "" translates to "gs://".

So the http version of the test file is "". I was able to verify the file actually exists, but when I replaced the original value with it I got this error:

            {
              "error": {
                "code": 400,
                "message": "Request contains an invalid argument.",
                "status": "INVALID_ARGUMENT"
              }
            }

This also answered my second question. According to the documentation it only supports Google Cloud Storage currently:

            uri contains a URI pointing to the audio content. Currently, this field must contain a Google Cloud Storage URI (of format gs://bucket-name/path_to_audio_file).
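Since a plain http URL is rejected with a fairly generic error, it can help to check the URI locally before calling the API. The snippet below is a minimal sketch of such a check against the `gs://bucket-name/path_to_audio_file` format quoted above. `parse_gs_uri` is a hypothetical helper, not part of any Google library.

```python
def parse_gs_uri(uri):
    """Split 'gs://bucket-name/path_to_audio_file' into (bucket, path).

    Raises ValueError for anything that does not match that shape,
    for example an http:// URL.
    """
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError("only gs:// URIs are accepted: %r" % uri)
    bucket, separator, path = uri[len(prefix):].partition("/")
    if not bucket or not separator or not path:
        raise ValueError(
            "expected gs://bucket-name/path_to_audio_file: %r" % uri)
    return bucket, path


if __name__ == "__main__":
    # Placeholder bucket and object names.
    print(parse_gs_uri("gs://my-bucket/audio/sample.flac"))
```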

The answer to my third question wasn't very promising either. Apparently only the types listed below are supported:

If the authorization token expires, you can generate a new one by using the following commands:

            export GOOGLE_APPLICATION_CREDENTIALS="/Path/To/Credentials/Json/File"
            gcloud auth application-default print-access-token
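If you prefer to script this, the token can be picked up from the same gcloud command and turned into request headers. This is a sketch under a few assumptions: `gcloud` is on the PATH, `GOOGLE_APPLICATION_CREDENTIALS` is already exported, and the API accepts the token as a standard `Authorization: Bearer` header. Both function names are made up for illustration.

```python
import subprocess


def fetch_access_token():
    """Run the gcloud command from this post and return the printed token."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True, capture_output=True, text=True)
    return result.stdout.strip()


def auth_headers(token):
    """Build HTTP headers for a JSON request authorised with the token."""
    return {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # Requires a working gcloud installation and valid credentials.
    print(auth_headers(fetch_access_token()))
```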

So no way of uploading a random MP3 and getting text out of it. But I'll of course try anyway :-)

Test Case: Get lyrics for a Rammstein song and translate

OK, now that I have a free trial at my disposal and have everything set up, let's create some storage, upload some files and put it to a real test.

Step 01: Get some media

My goal is to extract the lyrics of a Rammstein song and translate them into English. For that I chose the song Du Hast. Since I couldn't find a way to download a FLAC version of the song, I decided to download the official video from Rammstein's YouTube channel.

This is just for experimental purposes and I deleted the video after I was done testing, so it should be fine I guess. To download videos from YouTube you can refer to this TechAdvisor article.

I simply used VLC to open the YouTube video. In Window -> Media Information dialog it shows the full path of the raw video file and I copied that path into a browser and downloaded the video.

Step 02: Prepare the media to process

Since all I need is the audio, I extracted it from the video file using VLC. It can probably be done in a number of ways, but VLC makes it quite straightforward:

Click File -> Convert & Stream, drag and drop the video

In the Choose Profile section, select Audio - FLAC.

The important bit here is that by default VLC converts to stereo audio with 2 channels, but Google doesn't support it, as explained in this documentation:

            All encodings support only 1 channel (mono) audio

So make sure to customize it and enter 1 as channel count:
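Before uploading, the channel count of the converted file can be double-checked without any extra tools, because FLAC stores it in the STREAMINFO block at the start of the file. The following is a sketch based on my reading of the FLAC format (a `fLaC` marker, a 4-byte block header, then STREAMINFO with sample rate, channels and bit depth packed into 8 bytes). The function names are made up, and it assumes the file starts directly with the FLAC marker (no ID3 tag in front).

```python
def read_flac_stream_info(data):
    """Parse sample rate, channels, bit depth and length from FLAC bytes.

    `data` must hold at least the first 26 bytes of the file.
    """
    if len(data) < 26 or data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    # Low 7 bits of the block header give the block type; 0 = STREAMINFO.
    if data[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # File bytes 18..25: 20 bits sample rate, 3 bits (channels - 1),
    # 5 bits (bits per sample - 1), 36 bits total samples.
    packed = int.from_bytes(data[18:26], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x07) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def is_mono(path):
    """Return True if the FLAC file at `path` has exactly one channel."""
    with open(path, "rb") as handle:
        return read_flac_stream_info(handle.read(26))["channels"] == 1
```

Dividing `total_samples` by `sample_rate` also gives the duration in seconds, which is handy for checking the length of the audio before calling the API.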

Step 03: Call the API

Now I was ready to call the API with my shiny single-channel FLAC file. I uploaded it to the Google Storage bucket I created, gave public access to it and tried the API.

Apparently, the speech:recognize endpoint only supports audio up to a minute. This is the error I got after posting a 03:55 audio file.

"Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter."

The solution is using the speech:longrunningrecognize endpoint, which simply returns a JSON with 1 value: name. This is a unique identifier assigned by Google to the job they created for us.
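The choice between the two endpoints can be made up front from the audio length. Here is a small sketch of that rule. The `https://speech.googleapis.com/v1/` base URL is an assumption, the 60 second cut-off is taken from the error message above, and both helper names are hypothetical.

```python
# Assumption: base URL of the v1 REST API.
SPEECH_BASE_URL = "https://speech.googleapis.com/v1/"
# Taken from the "longer than 1 min" error message.
SYNC_LIMIT_SECONDS = 60


def parse_duration(mm_ss):
    """Convert a 'MM:SS' string such as '03:55' into seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def choose_endpoint(duration_seconds):
    """Pick the synchronous or the long-running endpoint by audio length."""
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        method = "speech:recognize"
    else:
        method = "speech:longrunningrecognize"
    return SPEECH_BASE_URL + method


if __name__ == "__main__":
    print(choose_endpoint(parse_duration("03:55")))
```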

Once we have this id we can query the result of the process by calling the GET operations endpoint.
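As a sketch of that second step: the helpers below build the URL to poll and pull the recognised text out of the returned JSON. The `https://speech.googleapis.com/v1/operations/` URL and the `done` / `response.results[].alternatives[].transcript` layout are my understanding of the v1 API, so treat them as assumptions. The operation name in the sample data is invented.

```python
# Assumption: v1 URL prefix for querying long-running operations.
OPERATIONS_URL = "https://speech.googleapis.com/v1/operations/"


def operation_url(name):
    """Return the URL to GET for the job id returned by the API."""
    return OPERATIONS_URL + name


def extract_transcripts(operation):
    """Return the top transcript of every result, or None if not done yet."""
    if not operation.get("done"):
        return None
    results = operation.get("response", {}).get("results", [])
    return [result["alternatives"][0]["transcript"]
            for result in results if result.get("alternatives")]


if __name__ == "__main__":
    # Hand-written sample shaped like a finished operation.
    sample = {
        "name": "1234567890",
        "done": True,
        "response": {"results": [
            {"alternatives": [{"transcript": "du hast"}]},
            {"alternatives": [{"transcript": "du hast recht"}]},
        ]},
    }
    print(operation_url(sample["name"]))
    print(extract_transcripts(sample))
```

A job that is still running returns `None` here, so the caller can simply wait and query again.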

Fantastic! Some results. It's utterly disappointing of course, as we only got a few words out of it, but still something (I guess!).

Step 04: Compare the results:

Now the following are the actual lyrics of the song:

            Du du hast du hast mich du hast mich gefragt du hast mich gefragt, und ich hab nichts gesagt
            Willst du bis der Tod euch scheidet treu ihr sein für alle Tage
            Nein
            Willst du bis zum Tod, der scheide sie lieben auch in schlechten Tagen
            Nein

and this is what I got back from Google:

            du hast
            du hast recht
            du hast
            du hast mich
            du hast mich
            du du hast
            du hast mich
            du hast mich belogen
            du hast
            du hast mich blockiert

It missed most of the lyrics. Maybe it was headbanging so hard that it couldn't catch those parts!

Test Case: Slow German Podcast

Since my idea of translating German industrial metal lyrics on the fly failed miserably, I decided to try with cleaner audio where there is no music. I found a nice-looking podcast called Slow German. The nice thing about it is that it provides transcripts as well, so I can compare the Speech API results with them.

I obtained a random episode from their site and followed the steps above.

The first 4 paragraphs of the actual transcript of the podcast are as follows (the full transcript can be found here):

            "Denk ich an Deutschland in der Nacht, dann bin ich um den Schlaf gebracht." Habt Ihr diesen Satz schon einmal gehört? Er wird immer dann zitiert, wenn es Probleme in Deutschland gibt. Der Satz stammt von Heinrich Heine. Er war einer der wichtigsten deutschen Dichter. Aber keine Angst: Auch wenn er am 13. Dezember 1797 geboren wurde, sind seine Texte sehr aktuell und relativ leicht zu lesen. Ihr werdet ihn mögen!

            Harry Heine wuchs in einem jüdischen Haushalt auf. Er war 13 Jahre alt, als Napoleon in Düsseldorf einzog. Schon als Schüler begann er, Gedichte zu schreiben. Beruflich sollte er eigentlich im Bankgeschäft arbeiten, aber dafür hatte er kein Talent. Also versuchte er es erst mit einem eigenen Geschäft für Stoffe, das aber bald pleite war. Dann begann er zu studieren. Er probierte es mit Jura und mit Geschichte, besuchte verschiedene Vorlesungen.

            Mit 25 Jahren veröffentlichte er erste Gedichte. Es war eine aufregende Zeit für ihn. Er wechselte die Städte und die Universitäten, er beendete sein Jura-Studium und wurde promoviert. Um seine Chancen als Anwalt zu verbessern, ließ er sich protestantisch taufen, er kehrte also dem Judentum den Rücken und wurde Christ. Daher auch der neue Name: Christian Johann Heinrich Heine. Später hat er die Taufe oft bereut.

            Wenn Ihr Heines Werke lest werdet Ihr merken, dass sie etwas Besonderes sind. Sie sind oft kritisch, sehr oft aber auch ironisch und humorvoll. Er spielt mit der Sprache. Er kann aber auch sehr böse sein und herablassend über Menschen schreiben. Seine Kritik auch an politischen Ereignissen und die Zensur, mit der er in Deutschland leben musste, führten Heinrich Heine nach Paris. Er wanderte nach Frankreich aus.

And this is the result I got from Google (trimmed to match the above):

            denk ich an Deutschland in der Nacht dann bin ich um den Schlaf gebracht habt ihr diesen Satz schon einmal gehört er wird immer dann zitiert wenn es Probleme in Deutschland gibt der Satz stammt von Heinrich Heine er war einer der wichtigsten deutschen Dichter aber keine Angst auch wenn er am 13. Dezember 1797 geboren wurde sind seine Texte sehr aktuell und relativ leicht zu lesen ihr werdet ihn mögen Harry Heine wuchs in einem jüdischen Haushalt auf er war 13 Jahre alt als Nappo
            hier in Düsseldorf einen Zoo schon als Schüler begann er Gedichte zu schreiben beruflich sollte er eigentlich im Bankgeschäft arbeiten aber dafür hatte er kein Talent also versuchte er es erst mit einem eigenen Geschäft für Stoffe das aber bald pleite war dann begann er zu studieren er probierte es mit Jura und mit Geschichte besuchte verschiedene Vorlesungen mit 25 Jahren veröffentlichte er erste Gedichte es war eine aufregende Zeit für ihn er wechselte die Städte und die Universitäten er beendete sein Jurastudium und wurde Promo
            auch an politischen Ereignissen und die Zensur mit der er in Deutschland leben musste führten Heinrich Heine nach Paris er wanderte nach Frankreich aus
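One rough way to put a number on how close the two texts are, without speaking German, is a word-level similarity score. This is not something I ran for this post. It is a sketch using Python's standard `difflib`, with casing and punctuation stripped first because the API output has neither. The function names are made up.

```python
import difflib
import re


def words(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def word_similarity(reference, hypothesis):
    """Return a 0..1 similarity ratio between two texts, compared by word."""
    matcher = difflib.SequenceMatcher(
        None, words(reference), words(hypothesis), autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # One sentence from the transcript and the matching API output.
    reference = "Er war 13 Jahre alt, als Napoleon in Düsseldorf einzog."
    hypothesis = "er war 13 Jahre alt als Nappo hier in Düsseldorf einen Zoo"
    print(round(word_similarity(reference, hypothesis), 2))
```

A ratio of 1.0 means the word sequences are identical. The score only measures word overlap in order, so it says nothing about whether the meaning survived.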

Comparing the translations

Since I don't speak German I cannot judge how well it did. Clearly it didn't capture all the words, but I wanted to see if what it returned makes any sense anyway. So I put both in Google Translate and this is how they compare:

Translation of the original transcript:

            When I think of Germany at night, I'm about to go to sleep. "Have you ever heard that phrase before? He is always quoted when there are issues in Germany. The sentence is by Heinrich Heine. He was one of the most important German poets. But do not worry: even if he was born on December 13, 1797, his lyrics are very up to date and relatively easy to read. You will like him!

            Harry Heine grew up in a Jewish household. He was 13 years old when Napoleon moved in Dusseldorf. Even as a student, he began writing poetry. Professionally, he was supposed to work in banking, but he had no talent for that. So he first tried his own business for fabrics, which was soon bankrupt. Then he began to study. He tried law and history, attended various lectures.

            At the age of 25 he published his first poems. It was an exciting time for him. He changed cities and universities, he completed his law studies and received his doctorate. To improve his chances as a lawyer, he was baptized Protestant, so he turned his back on Judaism and became a Christian. Hence the new name: Christian Johann Heinrich Heine. Later he often regretted baptism.

            When you read Heine's works, you will observe that they are special. They are often critical, but often also ironic and humorous. He plays with the language. But he can also be very angry and condescending to write about people. His criticism also of political events and the censorship with which he had to live in Germany led Heinrich Heine to Paris. He emigrated to France.

Translation of Google's results:

            I think of Germany in the night then I'm about to sleep Did you ever hear this sentence He is always quoted when there are problems in Germany The sentence comes from Heinrich Heine He was one of the most important German poets but do not be afraid he was born on December 13, 1797 his lyrics are very up to date and relatively easy to read you will like him Harry Heine grew up in a Jewish household he was 13 years old as Nappo
            Here in Dusseldorf a zoo as a student he began to write poetry professionally he should really work in the banking business but for that he had no talent then he first tried his own business for fabrics but soon bankrupt and then began to study he tried it with Jura and with history attended various lectures at age 25 he published his first poems it was an exciting time for him he changed the cities and the universities he finished his law studies and became promo
            also in political events and the censorship with which he had to live in Germany led Heinrich Heine to Paris he emigrated to France

The translations of the podcast are very close, especially the first part. It missed some sentences, but when you read the API output you can at least get a general understanding of what the text is about. It's maybe not a good read, and it's not good if you're interested in details, but it's probably good enough.


Speech to text can be very useful when backed with automated real-time translations. Google Speech API supports real-time speech recognition as well, so it may be interesting to put the Translation API to use too and develop a tool to get real-time translations, but that's for another blog post.


  • Official Google API Page
  • Speech API Quick Start
  • GCloud SDK
  • StackOverflow: What does GS Protocol Mean
  • Slow German podcast


