How to apply a 'moving window' to analyse chunks of text sequentially in Python?


I want to calculate a simple moving-window average type/token ratio (TTR) of a text sample. I know how to calculate the TTR of the whole text, or how to select the first 50 words and calculate the TTR of that. I think I need to create a loop that iterates over 50 words at a time, with the start moving +1 each time so that the window moves through the text, appending the resulting TTR of each window to a list that I can then average. It's the looping/chunking/+1 part I'm stuck on.

This is (I think) what I want in the loop. The text has already been lowercased etc.:

    window = text[0:50]  # first 50 words; text is a list of words
    wordcount = collections.Counter(window)
    uniquewords = list(wordcount.keys())
    ttr = len(uniquewords) / len(window)
    windowsttr.append(ttr)

I have read other answers here, and the documentation for enumerate and itertools.islice, but I still can't seem to solve the problem. Any help gratefully received; I'm new to Python.

Parametrize the loop body according to the start position, then iterate through all possible start positions.

    import collections

    window_width = 50
    last_index = len(text) - window_width  # last valid start position
    windowsttr = []
    for start in range(last_index + 1):    # +1 so the final full window is included
        window = text[start:start + window_width]
        wordcount = collections.Counter(window)
        uniquewords = list(wordcount.keys())
        ttr = len(uniquewords) / len(window)
        windowsttr.append(ttr)
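Since the goal in the question is an average over all windows, the loop can be wrapped in a function that returns the mean directly. This is only a minimal sketch: the function name `moving_average_ttr` and the sample sentence are made up for illustration, `tokens` is assumed to be a list of already-lowercased words, and `set` is used here in place of `collections.Counter` because only the number of distinct words is needed.

```python
from statistics import mean


def moving_average_ttr(tokens, window_width=50):
    """Mean type/token ratio over every full window of `window_width` tokens."""
    if len(tokens) < window_width:
        raise ValueError("text is shorter than one window")
    ratios = []
    # Start positions run from 0 up to and including len(tokens) - window_width.
    for start in range(len(tokens) - window_width + 1):
        window = tokens[start:start + window_width]
        # Distinct words in the window divided by total words in the window.
        ratios.append(len(set(window)) / len(window))
    return mean(ratios)


# Hypothetical sample text, split on whitespace into a list of words.
sample = "the cat sat on the mat and the dog sat too".split()
print(moving_average_ttr(sample, window_width=4))
```

A text shorter than one window raises an error here rather than returning a ratio for a partial window; whether that is the right behaviour depends on the analysis.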

If you need to take larger steps through the text, parametrize that as well:

    window_width = 50
    last_index = len(text) - window_width
    step = 4    # shift 4 positions at a time
    for start in range(0, last_index + 1, step):
        ...     # same loop body as before
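To see which windows a given step actually visits, it can help to list the start positions on their own. The helper name `window_starts` below is hypothetical, and the numbers are just an illustration of how `range` behaves with a step.

```python
def window_starts(n_tokens, window_width, step=1):
    """Start indices of every full window when advancing `step` tokens at a time."""
    return list(range(0, n_tokens - window_width + 1, step))


# 60 tokens, windows of 50, moving 4 tokens each time.
print(window_starts(60, 50, step=4))  # [0, 4, 8]
```

In this example the last window starts at index 8 and covers tokens 8 to 57, so the final two tokens never fall inside any window. With a step larger than 1 the tail of the text can be skipped like this, which may or may not matter for the average.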
