Factor analysis using MINRES or ML, with optional rotation using Varimax or Promax. The main exploratory factor analysis class. The type of rotation
to perform after fitting the factor analysis model. If set to None, no rotation will be performed. Possible values include: 'varimax' (orthogonal rotation), 'promax' (oblique rotation), 'oblimin' (oblique rotation), 'oblimax' (orthogonal rotation), 'quartimin' (oblique rotation), 'quartimax' (orthogonal rotation), and 'equamax' (orthogonal rotation). Defaults
to 'promax'. loadings_ ¶The factor loadings matrix. corr_ ¶The original correlation matrix. rotation_matrix_ ¶The rotation matrix, if a rotation has been performed. structure_ ¶The structure loading matrix. This only exists if the rotation is 'promax'. phi_ ¶The factor correlations matrix. This only exists if an oblique rotation has been performed. Notes This code was partly derived from the excellent R package psych. References [1] https://github.com/cran/psych/blob/master/R/fa.R Examples fit (X,
y=None)[source]¶Fit the factor analysis model using either MINRES, ML, or principal factor analysis. By default, use SMC as starting guesses.
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> fa.loadings_ array([[-0.12991218, 0.16398154, 0.73823498], [ 0.03899558, 0.04658425, 0.01150343], [ 0.34874135, 0.61452341, -0.07255667], [ 0.45318006, 0.71926681, -0.07546472], [ 0.36688794, 0.44377343, -0.01737067], [ 0.74141382, -0.15008235, 0.29977512], [ 0.741675 , -0.16123009, -0.20744495], [ 0.82910167, -0.20519428, 0.04930817], [ 0.76041819, -0.23768727, -0.1206858 ], [ 0.81533404, -0.12494695, 0.17639683]]) get_communalities ()[source]¶Calculate the communalities, given the factor loading matrix.
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> fa.get_communalities() array([0.588758 , 0.00382308, 0.50452402, 0.72841183, 0.33184336, 0.66208428, 0.61911036, 0.73194557, 0.64929612, 0.71149718]) get_eigenvalues ()[source]¶Calculate the eigenvalues, given the factor correlation matrix.
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> fa.get_eigenvalues() (array([ 3.51018854, 1.28371018, 0.73739507, 0.1334704 , 0.03445558, 0.0102918 , -0.00740013, -0.03694786, -0.05959139, -0.07428112]), array([ 3.51018905, 1.2837105 , 0.73739508, 0.13347082, 0.03445601, 0.01029184, -0.0074 , -0.03694834, -0.05959057, -0.07428059])) get_factor_variance ()[source]¶Calculate factor variance information. The factor variance information including the variance, proportional variance, and cumulative variance for each factor.
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> # 1. Sum of squared loadings (variance) ... # 2. Proportional variance ... # 3. Cumulative variance >>> fa.get_factor_variance() (array([3.51018854, 1.28371018, 0.73739507]), array([0.35101885, 0.12837102, 0.07373951]), array([0.35101885, 0.47938987, 0.55312938])) get_uniquenesses ()[source]¶Calculate the uniquenesses, given the factor loading matrix.
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> fa.get_uniquenesses() array([0.411242 , 0.99617692, 0.49547598, 0.27158817, 0.66815664, 0.33791572, 0.38088964, 0.26805443, 0.35070388, 0.28850282]) transform (X)[source]¶Get factor scores for a new data set.
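Communalities and uniquenesses are complementary: each communality is the row-wise sum of squared loadings, and the corresponding uniqueness is one minus that value. A minimal numpy sketch (not the package's internal code) using the unrotated loadings from the fit example above:

```python
import numpy as np

# Unrotated loadings from the fit example above (10 variables, 3 factors).
loadings = np.array([
    [-0.12991218,  0.16398154,  0.73823498],
    [ 0.03899558,  0.04658425,  0.01150343],
    [ 0.34874135,  0.61452341, -0.07255667],
    [ 0.45318006,  0.71926681, -0.07546472],
    [ 0.36688794,  0.44377343, -0.01737067],
    [ 0.74141382, -0.15008235,  0.29977512],
    [ 0.741675  , -0.16123009, -0.20744495],
    [ 0.82910167, -0.20519428,  0.04930817],
    [ 0.76041819, -0.23768727, -0.1206858 ],
    [ 0.81533404, -0.12494695,  0.17639683],
])

# Communality: variance of each variable explained by the common factors.
communalities = (loadings ** 2).sum(axis=1)

# Uniqueness: variance left unexplained by the common factors.
uniquenesses = 1.0 - communalities

print(communalities[0])   # ≈ 0.5888, matching get_communalities()
print(uniquenesses[0])    # ≈ 0.4112, matching get_uniquenesses()
```

Note that the `get_uniquenesses()` values above are exactly one minus the `get_communalities()` values.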
Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer >>> df_features = pd.read_csv('tests/data/test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) FactorAnalyzer(bounds=(0.005, 1), impute='median', is_corr_matrix=False, method='minres', n_factors=3, rotation=None, rotation_kwargs={}, use_smc=True) >>> fa.transform(df_features) array([[-1.05141425, 0.57687826, 0.1658788 ], [-1.59940101, 0.89632125, 0.03824552], [-1.21768164, -1.16319406, 0.57135189], ..., [ 0.13601554, 0.03601086, 0.28813877], [ 1.86904519, -0.3532394 , -0.68170573], [ 0.86133386, 0.18280695, -0.79170903]]) factor_analyzer.factor_analyzer. calculate_bartlett_sphericity (x)[source]¶Compute the Bartlett sphericity test. H0: The matrix of population correlations is equal to I. H1: The matrix of population correlations is not equal to I. The formula for Bartlett’s Sphericity test is: \[-1 * (n - 1 - ((2p + 5) / 6)) * ln(det(R))\] where n is the number of observations, p is the number of variables, R is the correlation matrix, and det(R) is its determinant.
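The formula above can be sketched directly with numpy and scipy; the statistic is compared against a chi-square distribution with p(p−1)/2 degrees of freedom. This is a sketch of the formula, not the package's exact implementation:

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(x):
    """Bartlett's test that the population correlation matrix is the identity.

    x is an (n, p) data array; returns (statistic, p_value).
    """
    n, p = x.shape
    r = np.corrcoef(x, rowvar=False)                  # correlation matrix R
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(r))
    dof = p * (p - 1) / 2                             # degrees of freedom
    p_value = chi2.sf(statistic, dof)
    return statistic, p_value
```

For strongly correlated variables det(R) is near zero, so the statistic is large and the null hypothesis of an identity correlation matrix is rejected.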
factor_analyzer.factor_analyzer. calculate_kmo (x)[source]¶
Calculate the Kaiser-Meyer-Olkin criterion for items and overall. This statistic represents the degree to which each observed variable is predicted, without error, by the other variables in the dataset. In general, a KMO < 0.6 is considered inadequate.
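The KMO statistic compares zero-order correlations to partial correlations: partial correlations are obtained from the inverse correlation matrix, and the overall measure is Σr² / (Σr² + Σq²) over off-diagonal entries, where q denotes the partial correlations. A numpy-only sketch of the overall statistic (not the package's exact implementation, which also returns per-item values):

```python
import numpy as np

def kmo_overall(x):
    """Overall Kaiser-Meyer-Olkin measure for an (n, p) data array."""
    r = np.corrcoef(x, rowvar=False)            # zero-order correlations
    prec = np.linalg.inv(r)                     # precision matrix
    d = np.sqrt(np.outer(np.diag(prec), np.diag(prec)))
    q = -prec / d                               # partial (anti-image) correlations
    mask = ~np.eye(r.shape[0], dtype=bool)      # off-diagonal entries only
    r2 = (r[mask] ** 2).sum()
    q2 = (q[mask] ** 2).sum()
    return r2 / (r2 + q2)
```

Values near 1 indicate strong shared variance among the variables; values below 0.6 are typically considered inadequate for factor analysis.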
factor_analyzer.confirmatory_factor_analyzer Module¶Confirmatory factor analysis using machine learning methods.
factor_analyzer.confirmatory_factor_analyzer. ConfirmatoryFactorAnalyzer (specification=None, n_obs=None, is_cov_matrix=False, bounds=None, max_iter=200, tol=None, impute='median',
disp=True)[source]¶Fit a confirmatory factor analysis model using maximum likelihood.
model ¶The model specification object.
loadings_ ¶The factor loadings matrix.
error_vars_ ¶The error variance matrix.
factor_varcovs_ ¶The factor covariance matrix.
log_likelihood_ ¶The log likelihood from the optimization routine.
aic_ ¶The Akaike information criterion.
bic_ ¶The Bayesian information criterion.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) >>> cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False) >>> cfa.fit(X.values) >>> cfa.loadings_ array([[0.99131285, 0. ], [0.46074919, 0. ], [0.3502267 , 0. ], [0.58331488, 0. ], [0. , 0.98621042], [0. , 0.73389239], [0. , 0.37602988], [0. , 0.50049507]]) >>> cfa.factor_varcovs_ array([[1. , 0.17385704], [0.17385704, 1. ]]) >>> cfa.get_standard_errors() (array([[0.06779949, 0. ], [0.04369956, 0. ], [0.04153113, 0. ], [0.04766645, 0. ], [0. , 0.06025341], [0. , 0.04913149], [0. , 0.0406604 ], [0. , 0.04351208]]), array([0.11929873, 0.05043616, 0.04645803, 0.05803088, 0.10176889, 0.06607524, 0.04742321, 0.05373646])) >>> cfa.transform(X.values) array([[-0.46852166, -1.08708035], [ 2.59025301, 1.20227783], [-0.47215977, 2.65697245], ..., [-1.5930886 , -0.91804114], [ 0.19430887, 0.88174818], [-0.27863554, -0.7695101 ]]) fit (X,
y=None)[source]¶Perform confirmatory factor analysis.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) >>> cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False) >>> cfa.fit(X.values) >>> cfa.loadings_ array([[0.99131285, 0. ], [0.46074919, 0. ], [0.3502267 , 0. ], [0.58331488, 0. ], [0. , 0.98621042], [0. , 0.73389239], [0. , 0.37602988], [0. , 0.50049507]]) get_model_implied_cov ()[source]¶Get the model-implied covariance matrix (sigma) for an estimated model.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) >>> cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False) >>> cfa.fit(X.values) >>> cfa.get_model_implied_cov() array([[2.07938612, 0.45674659, 0.34718423, 0.57824753, 0.16997013, 0.12648394, 0.06480751, 0.08625868], [0.45674659, 1.16703337, 0.16136667, 0.26876186, 0.07899988, 0.05878807, 0.03012168, 0.0400919 ], [0.34718423, 0.16136667, 1.07364855, 0.20429245, 0.06004974, 0.04468625, 0.02289622, 0.03047483], [0.57824753, 0.26876186, 0.20429245, 1.28809317, 0.10001495, 0.07442652, 0.03813447, 0.05075691], [0.16997013, 0.07899988, 0.06004974, 0.10001495, 2.0364391 , 0.72377232, 0.37084458, 0.49359346], [0.12648394, 0.05878807, 0.04468625, 0.07442652, 0.72377232, 1.48080077, 0.27596546, 0.36730952], [0.06480751, 0.03012168, 0.02289622, 0.03813447, 0.37084458, 0.27596546, 1.11761918, 0.1882011 ], [0.08625868, 0.0400919 , 0.03047483, 0.05075691, 0.49359346, 0.36730952, 0.1882011 , 1.28888233]]) get_standard_errors ()[source]¶
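Under the CFA model the implied covariance is Σ = Λ Φ Λᵀ + Ψ, where Λ is the loading matrix, Φ the factor covariance matrix, and Ψ the diagonal error-variance matrix. A sketch of that computation with small, made-up illustrative values (not the estimates from the example above):

```python
import numpy as np

# Hypothetical two-factor model with four observed variables.
lam = np.array([[0.9, 0.0],
                [0.6, 0.0],
                [0.0, 0.8],
                [0.0, 0.7]])                 # factor loadings (Lambda)
phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])                 # factor covariances (Phi)
psi = np.diag([0.19, 0.64, 0.36, 0.51])      # error variances (Psi)

sigma = lam @ phi @ lam.T + psi              # model-implied covariance
print(np.diag(sigma))                        # unit variances by construction
```

The error variances here were chosen so that each implied variance is exactly 1, mimicking standardized data.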
Get standard errors from the implied covariance matrix and implied means.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) >>> cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False) >>> cfa.fit(X.values) >>> cfa.get_standard_errors() (array([[0.06779949, 0. ], [0.04369956, 0. ], [0.04153113, 0. ], [0.04766645, 0. ], [0. , 0.06025341], [0. , 0.04913149], [0. , 0.0406604 ], [0. , 0.04351208]]), array([0.11929873, 0.05043616, 0.04645803, 0.05803088, 0.10176889, 0.06607524, 0.04742321, 0.05373646])) transform (X)[source]¶Get the factor scores for a new data set.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) >>> cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False) >>> cfa.fit(X.values) >>> cfa.transform(X.values) array([[-0.46852166, -1.08708035], [ 2.59025301, 1.20227783], [-0.47215977, 2.65697245], ..., [-1.5930886 , -0.91804114], [ 0.19430887, 0.88174818], [-0.27863554, -0.7695101 ]]) References https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157408/ class factor_analyzer.confirmatory_factor_analyzer. ModelSpecification (loadings, n_factors, n_variables, factor_names=None,
variable_names=None)[source]¶
Encapsulate the model specification for CFA. This class contains a number of specification properties that are used in the CFA procedure.
copy ()[source]¶Return a copy of the model specification. error_vars ¶Get the error variance specification. error_vars_free ¶Get the indices of “free” error variance parameters. factor_covs ¶Get the factor covariance specification. factor_covs_free ¶Get the indices of “free” factor covariance parameters. factor_names ¶Get list of factor names, if available. get_model_specification_as_dict ()[source]¶Get the model specification as a dictionary.
loadings ¶Get the factor loadings specification. loadings_free ¶Get the indices of “free” factor loading parameters. n_factors ¶Get the number of factors. n_lower_diag ¶Get the lower diagonal of the factor covariance matrix. n_variables ¶Get the number of variables. variable_names ¶Get list of variable names, if available. class factor_analyzer.confirmatory_factor_analyzer. ModelSpecificationParser [source]¶
Generate the model specification for CFA. This class includes two static methods that generate a ModelSpecification object, either from a numpy array or from a dictionary. static parse_model_specification_from_array (X,
specification=None)[source]¶Generate the model specification from a numpy array. The columns should correspond to the factors, and the rows should correspond to the variables.
Examples >>> import pandas as pd >>> import numpy as np >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_array = np.array([[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]) >>> model_spec = ModelSpecificationParser.parse_model_specification_from_array(X, ... model_array) static parse_model_specification_from_dict (X,
specification=None)[source]¶Generate the model specification from a dictionary. The keys in the dictionary should be the factor names, and the values should be the feature names.
Examples >>> import pandas as pd >>> from factor_analyzer import (ConfirmatoryFactorAnalyzer, ... ModelSpecificationParser) >>> X = pd.read_csv('tests/data/test11.csv') >>> model_dict = {"F1": ["V1", "V2", "V3", "V4"], ... "F2": ["V5", "V6", "V7", "V8"]} >>> model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict) factor_analyzer.rotator Module¶Class to perform various rotations of factor loading matrices.
factor_analyzer.rotator. Rotator (method='varimax', normalize=True, power=4, kappa=0, gamma=0, delta=0.01, max_iter=500,
tol=1e-05)[source]¶Perform rotations on an unrotated factor loading matrix. The Rotator class takes an (unrotated) factor loading matrix and performs one of several rotations.
loadings_ ¶The loadings matrix.
rotation_ ¶The rotation matrix.
phi_ ¶The factor correlations matrix. This only exists if the rotation is oblique.
Notes Most of the rotations in this class are ported from R’s GPArotation package. References [1] https://cran.r-project.org/web/packages/GPArotation/index.html Examples >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer, Rotator >>> df_features = pd.read_csv('test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) >>> rotator = Rotator() >>> rotator.fit_transform(fa.loadings_) array([[-0.07693215, 0.04499572, 0.76211208], [ 0.01842035, 0.05757874, 0.01297908], [ 0.06067925, 0.70692662, -0.03311798], [ 0.11314343, 0.84525117, -0.03407129], [ 0.15307233, 0.5553474 , -0.00121802], [ 0.77450832, 0.1474666 , 0.20118338], [ 0.7063001 , 0.17229555, -0.30093981], [ 0.83990851, 0.15058874, -0.06182469], [ 0.76620579, 0.1045194 , -0.22649615], [ 0.81372945, 0.20915845, 0.07479506]]) fit (X,
y=None)[source]¶Compute the factor rotation.
Example >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer, Rotator >>> df_features = pd.read_csv('test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) >>> rotator = Rotator() >>> rotator.fit(fa.loadings_) fit_transform (X,
y=None)[source]¶Compute the factor rotation, and return the new loading matrix.
Example >>> import pandas as pd >>> from factor_analyzer import FactorAnalyzer, Rotator >>> df_features = pd.read_csv('test02.csv') >>> fa = FactorAnalyzer(rotation=None) >>> fa.fit(df_features) >>> rotator = Rotator() >>> rotator.fit_transform(fa.loadings_) array([[-0.07693215, 0.04499572, 0.76211208], [ 0.01842035, 0.05757874, 0.01297908], [ 0.06067925, 0.70692662, -0.03311798], [ 0.11314343, 0.84525117, -0.03407129], [ 0.15307233, 0.5553474 , -0.00121802], [ 0.77450832, 0.1474666 , 0.20118338], [ 0.7063001 , 0.17229555, -0.30093981], [ 0.83990851, 0.15058874, -0.06182469], [ 0.76620579, 0.1045194 , -0.22649615], [ 0.81372945, 0.20915845, 0.07479506]]) factor_analyzer.utils Module¶Utility functions, used primarily by the confirmatory factor analysis module.
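For reference, the classic SVD-based varimax iteration (maximizing Kaiser's criterion) can be sketched in a few lines of numpy. This is a generic textbook version, not the Rotator class's exact implementation:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a (p, k) loading matrix to maximize the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        lam = loadings @ rotation
        # Gradient of the varimax criterion, solved via SVD.
        u, s, vt = np.linalg.svd(
            loadings.T @ (lam ** 3 - (gamma / p) * lam * (lam ** 2).sum(axis=0))
        )
        rotation = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):       # stop once the criterion plateaus
            break
        d = d_new
    return loadings @ rotation, rotation
```

Because the rotation matrix is orthogonal, the communalities (row sums of squared loadings) are unchanged by the rotation.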
factor_analyzer.utils. apply_impute_nan (x, how='mean')[source]¶Apply a function to impute np.nan values in the given array, using the given strategy ('mean' or 'median').
factor_analyzer.utils. commutation_matrix (p, q)[source]¶
Calculate the commutation matrix. This matrix transforms the vectorized form of the matrix into the vectorized form of its transpose.
References https://en.wikipedia.org/wiki/Commutation_matrix factor_analyzer.utils. corr (x)[source]¶Calculate the correlation matrix.
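The defining property of the commutation matrix described above, K @ vec(A) = vec(Aᵀ) with column-major vectorization, can be verified with a direct numpy construction (a sketch, not the package's code):

```python
import numpy as np

def commutation_matrix(p, q):
    """(pq x pq) permutation matrix K with K @ vec(A) = vec(A.T)."""
    k = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            # vec(A)[j*p + i] = A[i, j]; vec(A.T)[i*q + j] = A[i, j]
            k[i * q + j, j * p + i] = 1.0
    return k

a = np.arange(6.0).reshape(2, 3)
k = commutation_matrix(2, 3)
vec = lambda m: m.flatten(order="F")          # column-major vectorization
print(np.allclose(k @ vec(a), vec(a.T)))      # True
```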
factor_analyzer.utils. cov (x, ddof=0)[source]¶Calculate the covariance matrix.
factor_analyzer.utils. covariance_to_correlation (m)[source]¶
Compute cross-correlations from the given covariance matrix. This is a port of R’s cov2cor() function.
factor_analyzer.utils. duplication_matrix (n=1)[source]¶Calculate the duplication matrix. A function to create the duplication matrix (Dn), which is the unique n² × n(n+1)/2 matrix which, for any n × n symmetric matrix A, transforms vech(A) into vec(A), as in Dn vech(A) = vec(A).
References https://en.wikipedia.org/wiki/Duplication_and_elimination_matrices factor_analyzer.utils. duplication_matrix_pre_post (x)[source]¶Transform given input symmetric matrix using pre-post duplication.
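The duplication-matrix identity Dn vech(A) = vec(A) described above can be checked with a direct numpy construction (a sketch, not the package's code):

```python
import numpy as np

def duplication_matrix(n):
    """The unique (n^2, n(n+1)/2) matrix D_n with D_n @ vech(A) = vec(A)."""
    d = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):                   # vech stacks the lower-triangular columns
        for i in range(j, n):
            d[j * n + i, col] = 1.0      # A[i, j] sits at vec index j*n + i
            d[i * n + j, col] = 1.0      # A[j, i] sits at vec index i*n + j
            col += 1
    return d

def vech(a):
    """Column-wise half-vectorization of a symmetric matrix."""
    return np.concatenate([a[j:, j] for j in range(a.shape[0])])

a = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 4.0]])          # symmetric test matrix
print(np.allclose(duplication_matrix(3) @ vech(a), a.flatten(order="F")))  # True
```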
factor_analyzer.utils. fill_lower_diag (x)[source]¶Fill the lower diagonal of a square matrix, given a 1-D input array.
References [1] https://stackoverflow.com/questions/51439271/convert-1d-array-to-lower-triangular-matrix factor_analyzer.utils. get_first_idxs_from_values (x, eq=1,
use_columns=True)[source]¶Get the indexes for a given value.
factor_analyzer.utils. get_free_parameter_idxs (x,
eq=1)[source]¶Get the free parameter indices from the flattened matrix.
factor_analyzer.utils. get_symmetric_lower_idxs (n=1,
diag=True)[source]¶Get the indices for the lower triangle of a symmetric matrix.
factor_analyzer.utils. get_symmetric_upper_idxs (n=1,
diag=True)[source]¶Get the indices for the upper triangle of a symmetric matrix.
factor_analyzer.utils. impute_values (x, how='mean')[source]¶Impute np.nan values in the given array, using the given strategy ('mean' or 'median').
factor_analyzer.utils. inv_chol (x, logdet=False)[source]¶Calculate matrix inverse using Cholesky decomposition. Optionally, calculate the log determinant of the Cholesky.
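For a symmetric positive-definite matrix, both the inverse and the log-determinant fall out of the Cholesky factor: if x = L Lᵀ then x⁻¹ = L⁻ᵀ L⁻¹ and log det(x) = 2 Σ log L_ii. A numpy sketch of the idea (not the package's exact implementation):

```python
import numpy as np

def inv_chol(x, logdet=False):
    """Invert a symmetric positive-definite matrix via its Cholesky factor."""
    chol = np.linalg.cholesky(x)                        # x = chol @ chol.T
    chol_inv = np.linalg.solve(chol, np.eye(x.shape[0]))
    xinv = chol_inv.T @ chol_inv                        # (L L^T)^-1 = L^-T L^-1
    if logdet:
        # log det(x) = 2 * sum(log(diag(L)))
        return xinv, 2.0 * np.log(np.diag(chol)).sum()
    return xinv
```

This is both faster and numerically better behaved than a general-purpose inverse when the input is known to be positive definite, as covariance matrices are.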
factor_analyzer.utils. merge_variance_covariance (variances,
covariances=None)[source]¶Merge variances and covariances into a single variance-covariance matrix.
factor_analyzer.utils. partial_correlations (x)[source]¶Compute partial correlations between variable pairs. This is a Python port of the pcor() function from the R package ppcor.
factor_analyzer.utils. smc (corr_mtx, sort=False)[source]¶Calculate the squared multiple correlations. This is equivalent to regressing each variable on all others and calculating the r-squared values.
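The squared multiple correlations have a closed form via the inverse correlation matrix: SMC_i = 1 − 1 / (R⁻¹)_ii, which equals the R² from regressing variable i on all the others. A numpy sketch (not the package's exact implementation):

```python
import numpy as np

def smc(corr_mtx):
    """Squared multiple correlations from a correlation matrix."""
    # 1 - 1/diag(R^-1) is the R^2 of each variable regressed on the rest.
    return 1.0 - 1.0 / np.diag(np.linalg.inv(corr_mtx))
```

For an equicorrelation matrix with r = 0.5 and three variables, each SMC is 1/3, matching the R² obtained from the explicit two-predictor regression.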
factor_analyzer.utils. unique_elements (seq)[source]¶Get first unique instance of every list element, while maintaining order.