python - How can I extract the audio embeddings (features) from Google’s AudioSet?


I'm talking about the audio features dataset available at https://research.google.com/audioset/download.html, a tar.gz archive consisting of frame-level audio TFRecords.

Extracting everything else from the TFRecord files works fine (I extract the keys video_id, start_time_seconds, end_time_seconds, and labels), but the actual embeddings needed for training do not seem to be there at all. When I iterate over the contents of a TFRecord file from the dataset, only the 4 keys video_id, start_time_seconds, end_time_seconds, and labels are printed.

This is the code I'm using:

import tensorflow as tf
import numpy as np

def readTfRecordSamples(tfrecords_filename):
    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)
    for string_record in record_iterator:
        example = tf.train.Example()
        example.ParseFromString(string_record)
        print(example)  # prints the abovementioned 4 keys but not audio_embedding

        # the first label can be parsed like this:
        label = example.features.feature['labels'].int64_list.value[0]
        print('label 1: ' + str(label))

        # this, however, does not work:
        # audio_embedding = example.features.feature['audio_embedding'].bytes_list.value[0]

readTfRecordSamples('embeddings/01.tfrecord')

Is there a trick to extracting the 128-dimensional embeddings? Or are they not in this dataset at all?

Solved it: the TFRecord files need to be read as SequenceExamples, not Examples. The above code works if the line

example = tf.train.Example()

is replaced by

example = tf.train.SequenceExample()

The embeddings and the other content can then be viewed by running

print(example) 
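
For completeness, here is a minimal sketch of how the per-second embeddings can be pulled out and decoded into NumPy arrays once each record is parsed as a SequenceExample. It assumes the AudioSet frame-level layout, where the metadata keys live in the context features and the feature_lists entry 'audio_embedding' holds one 128-byte (8-bit quantized) string per second of audio; the helper name readEmbeddings is just for illustration.

import numpy as np
import tensorflow as tf

def readEmbeddings(tfrecords_filename):
    # Yields (video_id, labels, embeddings) per record; embeddings is a
    # (num_seconds, 128) uint8 array of quantized frame-level features.
    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)
    for string_record in record_iterator:
        example = tf.train.SequenceExample()
        example.ParseFromString(string_record)

        # Context features hold the 4 metadata keys seen before.
        video_id = example.context.feature['video_id'].bytes_list.value[0]
        labels = list(example.context.feature['labels'].int64_list.value)

        # The embeddings live in the feature_lists: one bytes feature per second.
        frames = example.feature_lists.feature_list['audio_embedding'].feature
        embeddings = np.stack([
            np.frombuffer(f.bytes_list.value[0], dtype=np.uint8)
            for f in frames
        ])
        yield video_id, labels, embeddings

for video_id, labels, emb in readEmbeddings('embeddings/01.tfrecord'):
    print(video_id, labels, emb.shape)  # e.g. (10, 128) for a 10-second clip

If the raw uint8 values are needed as floats for training, they can be cast and rescaled (e.g. emb.astype(np.float32) / 255.0), though the exact dequantization scheme you use is up to your pipeline.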
