PCA - Need to perform principal component analysis on a dataframe collection in Python using numpy or sklearn
I have a 'dataframe collection' df with the data shown below. I am trying to perform principal component analysis (PCA) on the dataframe collection using sklearn, but I am getting a TypeError.
from sklearn.decomposition import PCA

df  # dataframe collection
pca = PCA(n_components=5)
pca.fit(x)
How do I convert the dataframe collection to an array, matrix, or sequence? I think that if I convert it to an array or matrix, I will be able to run PCA.
Data:
{'ussp2 cmpn curncy':
0    0.297453
1    0.320505
2    0.345978
3    0.427871
name: (ussp2 cmpn curncy, px_last), length: 1747, dtype: float64,
 'margdebt index':
0    0.095478
1    0.167469
2    0.186317
3    0.203729
name: (margdebt index, px_last), length: 79, dtype: float64,
 'sl% smt% index':
0    0.163636
1    0.000000
2    0.000000
3    0.363636
name: (sl% smt% index, px_last), dtype: float64,
 'ffsraiws index':
0    0.157234
1    0.278174
2    0.530603
3    0.526519
name: (ffsraiws index, px_last), dtype: float64,
 'usphnsa index':
0    0.107330
1    0.213351
2    0.544503
3    0.460733
name: (usphnsa index, px_last), length: 79, dtype: float64}
Can I run PCA on a dataframe collection? Thanks!
Your dataframe collection is a dictionary (dict) of DataFrame objects.
To perform the analysis you need a single array of data to work with. The first step is to convert your data into a single DataFrame. pandas natively supports concatenating a dictionary of DataFrames, e.g.
import pandas as pd

df = {
    'currency1': pd.DataFrame([[0.297453, 0.5]]),
    'currency2': pd.DataFrame([[0.297453, 0.5]]),
}
x = pd.concat(df)
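The data printed in the question looks like a dict of Series with different lengths rather than DataFrames. If that is the case, a sketch along these lines may be closer to what is needed: it concatenates along the columns so that each key becomes one feature, then drops the rows that are not present in every Series. The names `collection` and `wide` and all of the values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the collection in the question: a dict of Series
# with different lengths (values are invented for illustration).
collection = {
    "series_a": pd.Series([0.30, 0.32, 0.35, 0.43, 0.41]),
    "series_b": pd.Series([0.10, 0.17, 0.19, 0.20]),
    "series_c": pd.Series([0.16, 0.00, 0.00, 0.36, 0.25]),
}

# axis=1 aligns the Series on their index and makes one column per dict key.
wide = pd.concat(collection, axis=1)

# Shorter Series leave NaN in the rows they do not cover; PCA cannot handle
# NaN, so keep only the rows where every column has a value.
wide = wide.dropna()

print(wide.shape)
```

Dropping rows is only one option. Depending on the data, filling the gaps or resampling the series to a common frequency may lose less information.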
You can then perform PCA on the values of the DataFrame, e.g.
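from sklearn.decomposition import PCA

# n_components must not exceed min(n_samples, n_features) of x.values
pca = PCA(n_components=5)
pca.fit(x.values)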
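Putting the steps together, here is a minimal end-to-end sketch. It assumes a dict of five equally long Series filled with random numbers (the names `collection` and `scores` and the data are hypothetical) and caps `n_components` at the size of the data, because `PCA` raises an error if more components are requested than the data can support.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data: five named Series with 50 random observations each.
rng = np.random.default_rng(0)
collection = {f"series_{i}": pd.Series(rng.random(50)) for i in range(5)}

# One column per dict key; drop any rows with missing values.
x = pd.concat(collection, axis=1).dropna()

# Never ask for more components than min(n_samples, n_features).
n_components = min(5, *x.shape)

pca = PCA(n_components=n_components)
scores = pca.fit_transform(x.values)  # rows projected onto the components

print(scores.shape)
print(pca.explained_variance_ratio_)
```

`explained_variance_ratio_` shows how much of the total variance each component captures, which can help decide whether all five components are actually needed.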