python - download remote gz files that reside in tree-like directories does not work


I have been scratching my head for more than 2 days and still cannot figure out how to do the following. I want to download the GEO data sets from ftp://ftp.ncbi.nlm.nih.gov and, for each data set, see whether it contains the keywords I am interested in. I was able to manually download one of the data sets and checked the file for the desired keywords. However, since the number of data sets is huge, I cannot do this manually, so I want to write a program to do it for me. As a first step, I tried to see whether I can download them. The structure is as follows:

    host -> /geo/ -> datasets/ -> GDS1nnn/ ... all the way through GDS6nnn

Each of them contains more than 600 directories, ordered by number, e.g. GDS1001. In each of these directories:

    ---> soft/

Inside that folder there are 2 files, named like this: folder name (GDS1001) + _full.soft.gz
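The layout described above can be turned into a small path-building helper. This is only a sketch: the function name `soft_gz_path` is hypothetical, and the rule that the series folder is the dataset ID with its last three digits replaced by `nnn` is an assumption inferred from the GDS1nnn / GDS1001 example.

```python
def soft_gz_path(gds_id, root="/geo/datasets"):
    """Build the remote path of <gds_id>_full.soft.gz (hypothetical helper)."""
    # Assumption: the series folder replaces the last three digits with "nnn",
    # e.g. GDS1001 -> GDS1nnn.
    series = gds_id[:-3] + "nnn"
    return "{}/{}/{}/soft/{}_full.soft.gz".format(root, series, gds_id, gds_id)


print(soft_gz_path("GDS1001"))
# /geo/datasets/GDS1nnn/GDS1001/soft/GDS1001_full.soft.gz
```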

This is the file I think I need to download, to see whether the keywords I am looking for are inside it.
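Once a file has been downloaded, the keyword check itself does not require unpacking it to disk first. A minimal sketch, assuming the `.soft.gz` file is gzip-compressed text and that a case-insensitive substring match is good enough; the function name `find_keywords` is hypothetical:

```python
import gzip


def find_keywords(gz_path, keywords):
    """Return the (lowercased) keywords that occur in a gzip-compressed text file."""
    remaining = {k.lower() for k in keywords}
    found = set()
    # errors="replace" keeps the scan going if a byte is not valid UTF-8.
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            lowered = line.lower()
            hits = {k for k in remaining if k in lowered}
            found |= hits
            remaining -= hits
            if not remaining:
                break  # every keyword has been seen; stop reading early
    return found
```

Reading line by line keeps memory use low even for large data sets, at the cost of missing a keyword phrase that is split across two lines.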

Here is my code:

    import os
    from ftplib import FTP

    # remember: you need to provide the host name, not the complete address!
    ftp = FTP('ftp.ncbi.nlm.nih.gov')
    ftp.login()
    #ftp.retrlines('LIST')
    ftp.cwd("/geo/datasets/GDS1nnn/")
    ftp.retrlines('LIST')
    filenames = ftp.nlst()

    count = len(filenames)
    curr = 0
    print("found {} files".format(count))
    for filename in filenames:
        first_path = filename + "/soft/"
        second_path = first_path + filename + "_full.soft.gz"
        #print(second_path)

        # placeholder for the full path of the folder I created
        local_filename = os.path.join(r'full path folder created')
        file = open(local_filename, 'wb')
        ftp.retrbinary('RETR ' + second_path, file.write)
        file.close()
    ftp.quit()

Output:

    file = open(local_filename, 'wb')
    PermissionError: [Errno 13] Permission denied: 'full path folder created'

However, I have both read and write permission on the folder. Any help would be appreciated.
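The traceback is consistent with `local_filename` naming the folder itself rather than a file inside it: `open()` cannot open a directory for writing, and on Windows that failure is reported as `PermissionError` even when the folder is writable. A minimal sketch that reproduces this without any FTP access (the temporary folder and the `GDS1001_full.soft.gz` file name are illustrative assumptions):

```python
import os
import tempfile

folder = tempfile.mkdtemp()  # stands in for the real target folder

try:
    # Opening the folder itself fails: PermissionError on Windows,
    # IsADirectoryError on Linux/macOS (both are subclasses of OSError).
    open(folder, "wb")
    opened_directory = True
except OSError as exc:
    opened_directory = False
    print("could not open folder as a file:", type(exc).__name__)

# Joining a file name onto the folder gives a path that can be written.
local_filename = os.path.join(folder, "GDS1001_full.soft.gz")
with open(local_filename, "wb") as handle:
    handle.write(b"placeholder bytes")
```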

The following code shows how you can create a folder for each dataset and save the content in that folder.

    import os
    from ftplib import FTP

    # Placeholder: replace with the path of the folder called "output".
    output_dir = 'output'

    ftp = FTP('ftp.ncbi.nlm.nih.gov')
    ftp.login()
    ftp.cwd("/geo/datasets/GDS1nnn/")
    ftp.retrlines('LIST')

    filenames = ftp.nlst()
    print("found {} files".format(len(filenames)))

    for filename in filenames:
        # Remote layout: GDS1001/soft/GDS1001_full.soft.gz
        first_path = filename + "/soft/"
        second_path = first_path + filename + "_full.soft.gz"
        print(second_path)

        # Mirror the remote folder structure locally before writing the file.
        local_dir = os.path.join(output_dir, filename, "soft")
        os.makedirs(local_dir, exist_ok=True)
        local_filename = os.path.join(local_dir, filename + "_full.soft.gz")
        with open(local_filename, 'wb') as file:
            ftp.retrbinary('RETR ' + second_path, file.write)

    # Close the connection only after every dataset has been downloaded.
    ftp.quit()
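If some dataset directory turns out not to contain a `_full.soft.gz` file, the failed `RETR` would stop the whole loop. A hedged sketch of one way to skip such cases: the helper name `download_or_skip` is hypothetical, and it assumes the server signals a missing file with a permanent (5xx) reply, which `ftplib` raises as `ftplib.error_perm`.

```python
import ftplib
import os


def download_or_skip(ftp, remote_path, local_path):
    """Download one file; return False (and leave no partial file) if the
    server rejects the request. Hypothetical helper, not part of ftplib."""
    os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
    try:
        with open(local_path, "wb") as handle:
            ftp.retrbinary("RETR " + remote_path, handle.write)
    except ftplib.error_perm as exc:
        # Assumption: a missing file shows up as a 5xx reply such as "550 ...".
        print("skipping {}: {}".format(remote_path, exc))
        os.remove(local_path)  # drop the empty file created by open()
        return False
    return True
```

Inside the loop, the `open` / `retrbinary` pair could then be replaced by a call such as `download_or_skip(ftp, second_path, local_filename)`.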
