PCA - Need to perform principal component analysis on a DataFrame collection in Python using NumPy or sklearn
I have a 'DataFrame collection', df, shown in the data below. I am trying to perform principal component analysis (PCA) on this DataFrame collection using sklearn, but I am getting a TypeError.
from sklearn.decomposition import PCA
df  # DataFrame collection
pca = PCA(n_components=5)
pca.fit(x)
How do I convert the DataFrame collection to an array, matrix, or sequence? I think that if I convert it to an array or matrix, I will be able to run PCA.
Data:
{'ussp2 cmpn curncy': 0 0.297453
1 0.320505
2 0.345978
3 0.427871
name: (ussp2 cmpn curncy, px_last), length: 1747, dtype: float64,
'margdebt index': 0 0.095478
1 0.167469
2 0.186317
3 0.203729
name: (margdebt index, px_last), length: 79, dtype: float64,
'sl% smt% index': 0 0.163636
1 0.000000
2 0.000000
3 0.363636
name: (sl% smt% index, px_last), dtype: float64,
'ffsraiws index': 0 0.157234
1 0.278174
2 0.530603
3 0.526519
name: (ffsraiws index, px_last), dtype: float64,
'usphnsa index': 0 0.107330
1 0.213351
2 0.544503
3 0.460733
name: (usphnsa index, px_last), length: 79, dtype: float64]
Can anyone help me run PCA on this DataFrame collection? Thanks!
Your DataFrame collection is a dictionary (dict) of DataFrame objects.
To perform the analysis, you need an array of data to work with. The first step is to convert your data into a single DataFrame. pandas natively supports concatenating a dictionary of DataFrames, e.g.
import pandas as pd
df = {
    'currency1': pd.DataFrame([[0.297453, 0.5]]),
    'currency2': pd.DataFrame([[0.297453, 0.5]])
}
x = pd.concat(df)
You can then perform PCA on the values of the DataFrame, e.g.
from sklearn.decomposition import PCA
pca = PCA(n_components=5)  # must not exceed min(n_samples, n_features) of x
pca.fit(x.values)
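The data printed in the question looks like a dict of Series with different lengths, so here is a minimal sketch of the same idea for that case. It assumes each Series should become one column (one feature) and that rows missing a value in any column can be dropped. Whether that is appropriate depends on your data. The names `collection`, `series_a`, `series_b` and `series_c` and the random values are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical stand-in for the question's collection: a dict of Series
# with different lengths (names and values are made up).
rng = np.random.default_rng(0)
collection = {
    "series_a": pd.Series(rng.random(10)),
    "series_b": pd.Series(rng.random(8)),
    "series_c": pd.Series(rng.random(10)),
}

# axis=1 aligns the Series on their index and makes each one a column.
frame = pd.concat(collection, axis=1)

# PCA cannot handle NaN, so keep only rows where every column has a value.
frame = frame.dropna()

# n_components may not exceed min(n_samples, n_features).
n_components = min(2, *frame.shape)
pca = PCA(n_components=n_components)
scores = pca.fit_transform(frame.to_numpy())

print(scores.shape)
print(pca.explained_variance_ratio_)
```

Dropping rows is the simplest way to get a rectangular array. If the Series are sampled at different frequencies, resampling or filling the gaps before the PCA step may suit the data better.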