python - How can I extract the audio embeddings (features) from Google's AudioSet?
I'm talking about the audio features dataset available at https://research.google.com/audioset/download.html as a tar.gz archive consisting of frame-level audio TFRecords.
Extracting everything else from the TFRecord files works fine (I can extract the keys video_id, start_time_seconds, end_time_seconds, labels), but the actual embeddings needed for training do not seem to be there at all. When I iterate over the contents of any TFRecord file from the dataset, only the four keys video_id, start_time_seconds, end_time_seconds, and labels are printed.
This is the code I'm using:
import tensorflow as tf
import numpy as np

def readTfRecordSamples(tfrecords_filename):

    record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

    for string_record in record_iterator:
        example = tf.train.Example()
        example.ParseFromString(string_record)
        print(example)  # prints the above-mentioned 4 keys but NOT audio_embeddings

        # the first label can be parsed like this:
        label = (example.features.feature['labels'].int64_list.value[0])
        print('label 1: ' + str(label))

        # this, however, does not work:
        #audio_embedding = (example.features.feature['audio_embedding'].bytes_list.value[0])

readTfRecordSamples('embeddings/01.tfrecord')

Is there a trick to extracting the 128-dimensional embeddings? Or are they not in this dataset?
Solved it: the TFRecord files need to be read as sequence examples, not as examples. The code above works if the line

    example = tf.train.Example()

is replaced by

    example = tf.train.SequenceExample()

The embeddings and all other content can then be viewed by running

    print(example)
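To go one step further than printing, here is a minimal sketch of how the embeddings could be pulled out as a NumPy array once each record is parsed as a SequenceExample. It assumes the layout described on the AudioSet download page: the four keys live in the context, and the frames sit in a feature list named 'audio_embedding', one 128-byte string of 8-bit quantized values per frame. The helper names decode_embeddings and read_audioset_embeddings are made up for this example, and tf.compat.v1.io.tf_record_iterator is used on the assumption that a TensorFlow version providing the compat.v1 module is installed (on older 1.x releases, tf.python_io.tf_record_iterator from the question is the equivalent call).

```python
import numpy as np


def decode_embeddings(sequence_example):
    """Return the frame-level embeddings as a (num_frames, 128) uint8 array.

    Assumes the frames are stored in the 'audio_embedding' feature list,
    each frame as a single byte string of 128 quantized values.
    """
    frames = sequence_example.feature_lists.feature_list['audio_embedding'].feature
    if len(frames) == 0:
        return np.empty((0, 128), dtype=np.uint8)
    # Each frame holds one byte string; reinterpret its bytes as uint8 values.
    return np.stack([np.frombuffer(frame.bytes_list.value[0], dtype=np.uint8)
                     for frame in frames])


def read_audioset_embeddings(tfrecords_filename):
    """Yield (video_id, labels, embeddings) for every record in the file."""
    # Imported here so decode_embeddings can be used without TensorFlow.
    import tensorflow as tf

    for string_record in tf.compat.v1.io.tf_record_iterator(path=tfrecords_filename):
        example = tf.train.SequenceExample()
        example.ParseFromString(string_record)

        # The four keys from the question are stored in the context.
        context = example.context.feature
        video_id = context['video_id'].bytes_list.value[0].decode('utf-8')
        labels = list(context['labels'].int64_list.value)

        yield video_id, labels, decode_embeddings(example)
```

With the file from the question, usage would look something like this:

```python
for video_id, labels, embeddings in read_audioset_embeddings('embeddings/01.tfrecord'):
    print(video_id, labels, embeddings.shape)
```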