=References=
''StatWiki page by Jgaskin; last edited 2019-03-17.''<br />
<hr />
<div>'''Here are some helpful references for structural equation modeling (in no particular order - I just keep adding to the list as they come).''' <br />
<br />
'''To search for a specific term, in Windows hit CTRL+F, on a Mac hit COMMAND+F.''' <br />
<br />
==Constructs and Validity==<br />
*Devellis, R. F. (2003). Scale Development: Theory and Applications Second Edition (Applied Social Research Methods).<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2016). Recommendations for creating better concept definitions in the organizational, behavioral, and social sciences. Organizational Research Methods, 19(2), 159-203.<br />
*Churchill Jr, G. A. (1979). A paradigm for developing better measures of marketing constructs. Journal of marketing research, 64-73.<br />
*Yaniv, E. (2011). Construct clarity in theories of management and organization. Academy of Management Review, 36(3), 590-592.<br />
*Law, K. S., Wong, C. S., & Mobley, W. M. (1998). Toward a taxonomy of multidimensional constructs. Academy of management review, 23(4), 741-755.<br />
*Shaffer, J. A., DeGeest, D., & Li, A. (2016). Tackling the problem of construct proliferation: A guide to assessing the discriminant validity of conceptually related constructs. Organizational Research Methods, 19(1), 80-110.<br />
*Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806-838.<br />
*Krosnick, J. A. (1999). Survey research. Annual review of psychology, 50(1), 537-567.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293-334.<br />
*Bolton, R. N. (1993). Pretesting questionnaires: content analyses of respondents' concurrent verbal protocols. Marketing science, 12(3), 280-303.<br />
*Podsakoff, N. P., Podsakoff, P. M., MacKenzie, S. B., & Klinger, R. L. (2013). Are we really measuring what we say we're measuring? Using video techniques to supplement traditional construct validation procedures. Journal of Applied Psychology, 98(1), 99.<br />
*Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. S. (2002). The Q-sort method: assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal of Modern Applied Statistical Methods, 1(1), 15.<br />
*Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of consumer research, 30(2), 199-218.<br />
*MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323-326.<br />
*Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social indicators research, 46(2), 137-155.<br />
*Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. Structural equation modeling: Present and future, 195-216. (discusses MaxR(H))<br />
<br />
==Measurement Models==<br />
===Exploratory Factor Analysis===<br />
*Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological methods, 4(3), 272.<br />
*Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most From Your Analysis. Practical Assessment, Research & Evaluation, 10(7), 1-9.<br />
*Reio Jr, T. G., & Shuck, B. (2015). Exploratory factor analysis: Implications for theory, research, and practice. Advances in Developing Human Resources, 17(1), 12-25.<br />
*Treiblmaier, H., & Filzmoser, P. (2010). Exploratory factor analysis revisited: How robust methods support the detection of hidden multivariate data structures in IS research. Information & management, 47(4), 197-207.<br />
*Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A users’ guide. International Journal of Selection and Assessment, 1(2), 84-94.<br />
<br />
===Confirmatory Factor Analysis===<br />
*Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational research methods, 3(1), 4-70.<br />
*Byrne, B. M. (2008). Testing for multigroup equivalence of a measuring instrument: A walk through the process. Psicothema, 20(4), 872-882.<br />
*Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling, 11(2), 272-300.<br />
*Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: Review of practice and implications. Human Resource Management Review, 18(4), 210-222.<br />
*Brown, T. A. (2014). Confirmatory factor analysis for applied research (2nd ed.). Guilford Publications.<br />
*Matsunaga, M. (2010). How to factor-analyze your data right: do’s, don’ts, and how-to’s. International Journal of Psychological Research, 3(1), 97-110.<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
*Hermida, R. (2015). The problem of allowing correlated errors in structural equation modeling: Concerns and considerations. Computational Methods in Social Sciences, 3(1), 5.<br />
====Method Bias, Response Bias, Specific Bias====<br />
*Fuller, C. M., Simmering, M. J., Atinc, G., Atinc, Y., & Babin, B. J. (2016). Common methods variance detection in business research. Journal of Business Research, 69(8), 3192-3198. (suggests Harman's single factor test is useful under certain circumstances)<br />
*Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. Journal of applied psychology, 88(5), 879.<br />
*MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542-555.<br />
*Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual review of psychology, 63, 539-569. <br />
*Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods.<br />
*Doty, D. H., & Glick, W. H. (1998). Common methods bias: does common methods variance really bias results?. Organizational research methods, 1(4), 374-406.<br />
*Estabrook, R., & Neale, M. (2013). A comparison of factor score estimation methods in the presence of missing data: Reliability and an application to nicotine dependence. Multivariate Behavioral Research, 48(1), 1-27.<br />
*Arbuckle, J. L. (2006). Amos 7.0 user's guide. Chicago, IL: SPSS.<br />
*Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology, 28, 97-104.<br />
*Lawley, D. N., & Maxwell, M. A. (1971). Factor analysis as a statistical method (2nd ed.). London, UK: Butterworths.<br />
*Horn, J. L., McArdle, J. J., & Mason, R. (1983). When invariance is not invariant: A practical scientist's view of the ethereal concept of factorial invariance. The Southern Psychologist, 1, 179-188.<br />
*Muthén, L., & Muthén, B. (1998-2007). Mplus user's guide (5th ed.). Los Angeles, CA: Author.<br />
<br />
===Other===<br />
*Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological methods, 5(2), 155.<br />
*Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological assessment, 7(3), 286.<br />
*Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of marketing research, 186-192.<br />
*Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and social psychology bulletin, 28(12), 1629-1646.<br />
*Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of marketing research, 39-50.<br />
*Bagozzi, R. P. (2011). Measurement and meaning in information systems and organizational research: Methodological and philosophical foundations. Mis Quarterly, 261-292.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90(4), 710.<br />
*Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of business research, 61(12), 1203-1218.<br />
<br />
==Mediation, Moderation, and Moderated Mediation==<br />
===Mediation===<br />
*Mathieu, J. E., & Taylor, S. R. (2006). Clarifying conditions and decision points for mediational type inferences in organizational behavior. Journal of Organizational Behavior, 27(8), 1031-1056.<br />
*Mathieu, J. E., DeShon, R. P., & Bergh, D. D. (2008). Mediational inferences in organizational research: Then, now, and beyond. Organizational Research Methods, 11(2), 203-223.<br />
*MacKinnon, D. P., Coxe, S., & Baraldi, A. N. (2012). Guidelines for the investigation of mediating variables in business research. Journal of Business and Psychology, 27(1), 1-14.<br />
*MacKinnon, D. P., & Pirlott, A. G. (2015). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30-43.<br />
*Preacher, K. J. (2015). Advances in mediation analysis: A survey and synthesis of new developments. Annual Review of Psychology, 66, 825-852.<br />
*Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of consumer research, 37(2), 197-206.<br />
*Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.<br />
<br />
===Moderation and Multigroup===<br />
*Byrne, B. M., & Stewart, S. M. (2006). Teacher's corner: The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling, 13(2), 287-321.<br />
*Schumacker, R. E., & Marcoulides, G. A. (1998). Interaction and nonlinear effects in structural equation modeling. Lawrence Erlbaum Associates Publishers.<br />
*Li, F., Harmer, P., Duncan, T. E., Duncan, S. C., Acock, A., & Boles, S. (1998). Approaches to testing interaction effects using structural equation modeling methodology. Multivariate Behavioral Research, 33(1), 1-39.<br />
*Floh, A., & Treiblmaier, H. (2006). What keeps the e-banking customer loyal? A multigroup analysis of the moderating role of consumer characteristics on e-loyalty in the financial service industry.<br />
<br />
===Both or Other===<br />
*Aguinis, H., Edwards, J. R., & Bradley, K. J. (2016). Improving our understanding of moderation and mediation in strategic management research. Organizational Research Methods, 1094428115627498.<br />
*Sardeshmukh, S. R., & Vandenberg, R. J. (2016). Integrating moderation and mediation: A structural equation modeling approach. Organizational Research Methods, 1094428115621609.<br />
*Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate behavioral research, 42(1), 185-227.<br />
*Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.<br />
<br />
==Partial Least Squares==<br />
*Becker, J. M., Klein, K., and Wetzels, M. (2012). Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models. Long Range Planning, 45(5), 359-394.<br />
*Becker, J.-M., Rai, A., Ringle, C. M., and Völckner, F. (2013). Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats. MIS Quarterly, 37 (3), 665-694.<br />
*Gefen, D., & Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: Tutorial and annotated example. Communications of the Association for Information systems, 16(1), 5.<br />
*Hair, J. F., C. M. Ringle, and M. Sarstedt (2011). PLS-SEM: Indeed a silver bullet, Journal of Marketing Theory & Practice, 19(2), 139-151. <br />
*Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40 (3), 414-433. <br />
*Hair, J. F., M. Sarstedt, T. Pieper, and C. M. Ringle (2012). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340. <br />
*Hair, J. F., Ringle, C. M., & Sarstedt, M. (2013). Editorial: Partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance. Long Range Planning, 46(1-2), 1-12.<br />
*Hair, J., Sarstedt, M., Hopkins, L., & Kuppelwieser, V. G. (2014). Partial least squares structural equation modeling (PLS-SEM): An emerging tool in business research. European Business Review, 26(2), 106-121.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling, Journal of the Academy of Marketing Science, 43 (1), 115–135.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2016). Testing Measurement Invariance of Composites Using Partial Least Squares, International Marketing Review, 33 (3), 405-431.<br />
*Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M., and Calantone, R.J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013). Organizational Research Methods, 17(2), 182-209. <br />
*Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. In New challenges to international marketing (pp. 277-319). Emerald Group Publishing Limited.<br />
*Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.<br />
*Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it. IEEE Transactions on Professional Communication, 57(2), 123-146.<br />
*McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17(2), 210-251.<br />
*Monge, C., Cruz, J., & López, F. (2014). Manufacturing and continuous improvement areas using partial least squares path modeling with multiple regression comparison. In Proceedings of CBU International Conference on Innovation, Technology Transfer and Education (2014), February (pp. 3-5).<br />
*Rigdon, E. E. (2014). Rethinking partial least squares path modeling: breaking chains and forging ahead. Long Range Planning, 47(3), 161-167.<br />
*Ringle, C. M., M. Sarstedt, and D. W. Straub (2012). A Critical look at the Use of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.<br />
*Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. In Measurement and research methods in international marketing (pp. 195-218). Emerald Group Publishing Limited.<br />
*Wong, K. K. K. (2013). Partial least squares structural equation modeling (PLS-SEM) techniques using SmartPLS. Marketing Bulletin, 24(1), 1-32.<br />
<br />
==General Topics==<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
*Tabachnick & Fidell (2014). Using Multivariate Statistics (6th ed), Chapter 14: Structural Equation Modeling. Pp. 731-836.<br />
*Urdan, T. C. 2011. Statistics in Plain English. Routledge.<br />
*Newbold, P., Carlson, W., and Thorne, B. 2012. Statistics for Business and Economics. Pearson.<br />
*Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage learning.<br />
*Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin, 103(3), 411.<br />
*Suits, D. B. (1957). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548-551.<br />
*Gefen, D., Rigdon, E. E., & Straub, D. (2011). Editor's comments: an update and extension to SEM guidelines for administrative and social science research. MIS Quarterly, iii-xiv.<br />
*Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.<br />
*Blunch, N. (2013). Introduction to structural equation modeling using IBM SPSS statistics and AMOS (2nd ed.). Los Angeles, CA: Sage.<br />
*Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford publications.<br />
*Argyrous, G. (2011). Statistics for research: with a guide to SPSS (3rd ed.). Thousand Oaks, CA: Sage Publications.<br />
*Byrne, B. M. (2009). Structural equation modeling with AMOS: basic concepts, applications, and programming (2nd ed.). Abingdon-on-Thames: Routledge.<br />
*Williams, L. J., Vandenberg, R. J., & Edwards, J. R. (2009). Structural equation modeling in management research: A guide for improved analysis. The Academy of Management Annals, 3 (1), 543-604.<br />
<br />
===Model Fit===<br />
*Kenny, D. A. (2012). Measuring Model Fit. http://davidakenny.net/cm/fit.htm<br />
*Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural equation modeling: a multidisciplinary journal, 6(1), 1-55.<br />
*Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.<br />
*Hooper, D., Coughlan, J., & Mullen, M. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6(1), 53-60.<br />
<br />
==Miscellaneous==<br />
*Kolenikov, S., and Bollen, K. A. 2012. "Testing Negative Error Variances: Is a Heywood Case a Symptom of Misspecification?," Sociological Methods & Research (41:1), pp. 124-167.<br />
*Khalilzadeh, J., & Tasci, A. D. A. (2017). Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research. Tourism Management, 62, 89-96. http://www.sciencedirect.com/science/article/pii/S026151771730078X<br />
*Green, J. P., Tonidandel, S., & Cortina, J. M. (2016). Getting through the gate: Statistical and methodological issues raised in the reviewing process. Organizational Research Methods, 19(3), 402-432.<br />
*Malhotra, N. K. (2008). Marketing research: An applied orientation (5th ed.). Pearson Education India.<br />
*Gravetter, F. J., & Wallnau, L. B. (2016). Statistics for the behavioral sciences (2nd ed.). Los Angeles: SAGE Publications, Inc.<br />
*Blair, J., Czaja, R. F., & Blair, E. A. (2014). Designing surveys: A guide to decisions and procedures (3rd ed.). Sage Publications.<br />
*Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability. Journal of Applied Psychology, 98(1), 194-198.<br />
*Kenny, D. A. (2011). Respecification of Latent Variable Models. http://davidakenny.net/cm/respec.htm<br />
*Bagozzi, R. P., & Yi, Y. (2012). Specification, evaluation, and interpretation of structural equation models. Journal of the academy of marketing science, 40(1), 8-34.<br />
*Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270-301. (for Cook's distance)<br />
*Winklhofer, H. M., & Diamantopoulos, A. (2002). Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach. International Journal of Research in Marketing, 19(2), 151-166.<br />
*Thomas, D. M., & Watson, R. T. (2002). Q-sorting and MIS research: A primer. Communications of the Association for Information Systems, 8(1), 9.<br />
*Osborne, J. W. (2012). Power and Planning for Data Collection. In Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage Publications.<br />
*Steenkamp, J. B. E., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199-214.<br />
*Bacharach, S. B. (1989). Organizational theories: Some criteria for evaluation. Academy of management review, 14(4), 496-515.<br />
*Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274-289.<br />
*Dietz, W. H., & Gortmaker, S. L. (1985). Do we fatten our children at the television set? Obesity and television viewing in children and adolescents. Pediatrics, 75(5), 807-812.<br />
*Peterson, C., Park, N., & Seligman, M. E. (2005). Orientations to happiness and life satisfaction: The full life versus the empty life. Journal of happiness studies, 6(1), 25-41.<br />
*Sposito, V. A., Hand, M. L., & Skarpness, B. (1983). On the efficiency of using the sample kurtosis in selecting optimal lpestimators. Communications in Statistics-simulation and Computation, 12(3), 265-272.<br />
*McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of mathematical and statistical Psychology, 34(1), 100-117.<br />
*Trochim, W. M., & Donnelly, J. P. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH: Atomic Dog.<br />
*Gravetter, F., & Wallnau, L. (2014). Essentials of statistics for the behavioral sciences (8th ed.). Belmont, CA: Wadsworth.<br />
*Field, A. (2000). Discovering statistics using SPSS for Windows. London-Thousand Oaks-New Delhi: Sage Publications.<br />
*Field, A. (2009). Discovering statistics using SPSS. London: SAGE.</div>

=Plugins=
''StatWiki page by Jgaskin; last edited 2019-03-07.''
<hr />
<div>==Overview==<br />
AMOS does not do everything I want it to do, so with the help of some research assistants, we have created some plugins and estimands to make up the difference. A plugin is a macro that can be used to automate AMOS. An estimand is a custom function that can add calculations and output to the AMOS analysis. We hope you find these useful. If they do not work for you, please refer to the troubleshooting section below. '''Here is a link to the Google Drive folder containing the plugins and estimands: [https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing Plugins and Estimands].''' <br />
Here are a few YouTube videos explaining how to use them:<br />
*[[File:YouTube.png]] [https://youtu.be/sLtMOFcojZY '''Installing Plugins for AMOS v23- (plus a demo of EFA->CFA)''']<br />
*[[File:YouTube.png]] [https://youtu.be/nf6fzpmnpDc '''Installing Plugins for AMOS v24+''']<br />
*[[File:YouTube.png]] [https://youtu.be/ICnh3s2FG14 '''Example of Using Estimands''']<br />
*[[File:YouTube.png]] [https://www.youtube.com/user/Gaskination/search?query=plugin '''List of all Plugins videos on Gaskination''']<br />
<br />
==Plugins==<br />
===Installation Instructions:===<br />
#Download the plugin or estimand to your computer (on the Windows side, if you run Windows in a virtual machine or via Boot Camp).<br />
#Right-click the file, select Properties, and on the General tab look for a button captioned "Unblock" at the bottom. If it appears, click it; if not, no action is needed.<br />
#If it is a plugin, place the file in the folder appropriate for your AMOS version: <br />
*if using AMOS version 23 or lower:<br />
**C:\Program Files (x86)\IBM\SPSS\Amos\23\Plugins<br />
**In this case, 23 is the AMOS version number.<br />
*if using AMOS version 24 or higher:<br />
**C:\Users\{username}\AppData\Local\AmosDevelopment\Amos\{AmosVersion}\Plugins<br />
**Make sure to replace {username} and {AmosVersion} with your own Windows username and AMOS version number.<br />
<br />
===List of Plugins===<br />
'''CleanEstimatesTable'''<br />
*This plugin creates a new table that includes the IV, the DV, and the standardized regression weights (with p-value significance indicated). This helps because AMOS reports these pieces separately and lists the DV before the IV. <br />
'''CLF24'''<br />
*This is the old plugin for testing method bias. I discourage you from using this one, as I've updated it with the ModelBias plugin described below. I leave this one up for now so that users won't email me asking where it is after having viewed my video about it.<br />
'''EraseAll'''<br />
*AMOS does not provide a way to clear the canvas while keeping the data file linked. This plugin erases all objects on the canvas but retains the link to your dataset. <br />
'''EraseSelected'''<br />
*This plugin erases just the objects that you've selected (highlighted in blue). This is slightly faster than deleting each object individually with the X tool. <br />
'''Magiclean'''<br />
*This plugin centers your model on the page, resizes it to fit the page, and adjusts line angles and entry points to make them appear more symmetric. <br />
'''MasterValidity'''<br />
*This plugin produces an HTML file with a correlation table of constructs, including the square root of the AVE on the diagonal, the CR and the AVE, as well as the less used MSV and MaxR. It also provides some interpretation and indication of validity issues. When validity issues occur, it also provides some recommendations. References for validity thresholds are provided.<br />
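For readers who want to check these numbers by hand: CR and AVE are simple functions of the standardized factor loadings (see Fornell & Larcker, 1981, and Hair et al., 2010, in the References). Below is a minimal Python sketch of the formulas, not the plugin's own code (the plugin runs inside AMOS), and the loading values are hypothetical.<br />

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each error variance is 1 - loading^2 for standardized loadings."""
    s = sum(loadings)
    errors = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + errors)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings for four indicators of one factor
loadings = [0.80, 0.75, 0.85, 0.70]
print(f"CR = {composite_reliability(loadings):.3f}")
print(f"AVE = {average_variance_extracted(loadings):.3f}")
```

By convention, CR above 0.7 and AVE above 0.5 suggest adequate reliability and convergent validity, and the square root of the AVE should exceed the construct's correlations with other constructs for discriminant validity.<br />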
'''ModelBias'''<br />
*This plugin automates the tedious job of testing a model for specific bias or common method bias by running multiple constrained and unconstrained models through chi-square difference tests. The output is an HTML file that includes a table of the results, as well as interpretation, recommendations, and a reference. <br />
'''ModelFit'''<br />
*This plugin creates an HTML file with all the relevant model fit measures, their thresholds, and an interpretation, as well as references for the suggested thresholds.<br />
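As a rough illustration of the kind of threshold check the plugin performs (the cutoffs below follow Hu & Bentler, 1999, cited in the References; the fit values and function are hypothetical, not the plugin's internals):<br />

```python
# Cutoffs follow Hu & Bentler (1999); other sources use slightly different values.
THRESHOLDS = {
    "CFI":   lambda v: v >= 0.95,   # comparative fit index: higher is better
    "RMSEA": lambda v: v <= 0.06,   # root mean square error of approximation: lower is better
    "SRMR":  lambda v: v <= 0.08,   # standardized root mean square residual: lower is better
}

def check_fit(measures):
    """Return {index: True/False} for each fit measure against its cutoff."""
    return {name: rule(measures[name]) for name, rule in THRESHOLDS.items()}

fit = {"CFI": 0.962, "RMSEA": 0.048, "SRMR": 0.041}  # hypothetical AMOS output
for index, ok in check_fit(fit).items():
    print(f"{index}: {'acceptable' if ok else 'review'}")
```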
'''Multigroup'''<br />
*This plugin conducts a multigroup analysis on a causal path model (no latent variables allowed). It conducts multiple chi-square difference tests to determine whether there are path-wise differences between groups.<br />
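The underlying chi-square difference test compares a model with a path constrained equal across groups against an unconstrained model; the difference in chi-square is itself tested against a chi-square distribution with df equal to the difference in model df. A standalone sketch of that test using SciPy (the fit values are hypothetical, not output from this plugin):<br />

```python
from scipy.stats import chi2

def chi_square_difference(chi_uncon, df_uncon, chi_con, df_con):
    """p-value for a chi-square difference test between nested models.

    The constrained model (paths held equal across groups) has the larger
    chi-square and df; a significant result means the constraint worsens
    fit, i.e., the path differs between groups."""
    delta_chi = chi_con - chi_uncon
    delta_df = df_con - df_uncon
    return delta_chi, delta_df, chi2.sf(delta_chi, delta_df)

# Hypothetical fit values: constraining one path raises chi-square by 6.2 for 1 df
d_chi, d_df, p = chi_square_difference(chi_uncon=142.3, df_uncon=80,
                                       chi_con=148.5, df_con=81)
print(f"delta chi-square = {d_chi:.1f}, delta df = {d_df}, p = {p:.4f}")
```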
'''PatternMatrixBuilder'''<br />
*This plugin automates the tedious job of creating a CFA from a pattern matrix. You can paste a pattern matrix from SPSS into the plugin window and it will automatically generate your model for you. All you have to do after that is to rename the latent factors appropriately.<br />
'''IndirectEffects'''<br />
*This plugin automatically estimates all possible indirect effects in the model. It currently only works for models without latent variables. It also must be used in conjunction with the "SpecificIndirectEffects" estimand (not to be confused with the "MyIndirectEffects" estimand...).<br />
<br />
==Estimands==<br />
===GENERAL INSTRUCTIONS:===<br />
#Click on the bottom left part of AMOS where it says "not estimating any user defined estimand".<br />
#Then click 'select estimand', which will let you go find the estimand.<br />
#Make sure to include bootstrapping in the analysis (Analysis Properties, Bootstrap, Perform Bootstrap).<br />
<br />
===LIST OF ESTIMANDS:===<br />
'''ABCindirectEffect'''<br />
*DESCRIPTION: This is for specific serial mediation where there are two mediators in a row. <br />
*INSTRUCTION: Name the first path A, second path B, third path C (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
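Conceptually, the estimand returns the product A×B×C with a bootstrap confidence interval. The same idea can be sketched outside AMOS on simulated data (Python/NumPy; simple bivariate slopes stand in for the model's path estimates here, and all values are illustrative, not the estimand's code):<br />

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Simulate a serial mediation chain: X -> M1 -> M2 -> Y (true indirect = 0.5 * 0.4 * 0.3)
X = rng.normal(size=n)
M1 = 0.5 * X + rng.normal(size=n)
M2 = 0.4 * M1 + rng.normal(size=n)
Y = 0.3 * M2 + rng.normal(size=n)

def slope(x, y):
    """Ordinary least squares slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def serial_indirect(idx):
    a = slope(X[idx], M1[idx])   # path A: X -> M1
    b = slope(M1[idx], M2[idx])  # path B: M1 -> M2
    c = slope(M2[idx], Y[idx])   # path C: M2 -> Y
    return a * b * c

point = serial_indirect(np.arange(n))
boots = [serial_indirect(rng.integers(0, n, n)) for _ in range(2000)]
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"indirect effect = {point:.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```

If the bootstrap interval excludes zero, the serial indirect effect is considered significant.<br />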
'''MyGroupDifferences'''<br />
*DESCRIPTION: This is for testing the difference between two regression coefficients. I made it specifically for comparing paths across multiple groups, but it can be used for comparing any two regression weights, even within groups.<br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. If doing this for the same path, but for different groups, then make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
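What the estimand does, in essence, is bootstrap the difference A − B and check whether its confidence interval excludes zero. A standalone sketch of that idea on simulated two-group data (Python/NumPy; illustrative only, not the estimand's code):<br />

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
# Group 1: slope of Y on X is about 0.6 (path A); Group 2: about 0.2 (path B)
x1 = rng.normal(size=n)
y1 = 0.6 * x1 + rng.normal(size=n)
x2 = rng.normal(size=n)
y2 = 0.2 * x2 + rng.normal(size=n)

def slope(x, y):
    """Ordinary least squares slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

diffs = []
for _ in range(2000):
    i1 = rng.integers(0, n, n)   # resample each group independently
    i2 = rng.integers(0, n, n)
    diffs.append(slope(x1[i1], y1[i1]) - slope(x2[i2], y2[i2]))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"A - B: 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
# An interval excluding zero indicates the paths differ between groups
```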
'''MyIndirectEffects'''<br />
*DESCRIPTION: This is for specific mediation, where you want to isolate the indirect effect of a specific mediator when there are multiple mediators. This works for one indirect effect at a time.<br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
'''SpecificIndirectEffects'''<br />
*DESCRIPTION: This is for estimating ALL specific indirect effects automatically along with the IndirectEffects Plugin. <br />
*INSTRUCTION: Just make sure this one is being estimated when using the IndirectEffects plugin. The IndirectEffects plugin will not work with the "MyIndirectEffects" estimand.<br />
'''MyModMed'''<br />
*DESCRIPTION: This is for moderated mediation, where the mediation occurs in two different groups. This estimand can also be used to compare mediation within the same group, if there are multiple indirect paths. <br />
*INSTRUCTION: For Group 1, name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. For Group 2, name the first path C and the second path D. This should be the same paths as A and B, but for the second group. So, make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
'''PathComparison'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
'''PlayingAround'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
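The mediation estimands above all compute the same kind of quantity: an indirect effect is the product of the named path coefficients (A×B for simple mediation, A×B×C for serial mediation), tested with a bootstrap confidence interval. As a rough sketch of that arithmetic only, using simulated data and made-up coefficients (real estimands run inside AMOS; this Python sketch just illustrates the logic):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated serial mediation X -> M1 -> M2 -> Y (hypothetical true paths)
n = 500
x = rng.normal(size=n)
m1 = 0.5 * x + rng.normal(size=n)
m2 = 0.4 * m1 + rng.normal(size=n)
y = 0.3 * m2 + rng.normal(size=n)

def path(dep, pred):
    """OLS slope of dep on a single predictor."""
    return np.polyfit(pred, dep, 1)[0]

# Point estimate of the serial indirect effect A*B*C
a, b, c = path(m1, x), path(m2, m1), path(y, m2)
indirect = a * b * c

# Percentile bootstrap CI for the product term
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(path(m1[idx], x[idx]) * path(m2[idx], m1[idx]) * path(y[idx], m2[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(round(indirect, 3), round(lo, 3), round(hi, 3))
```

If the bootstrap interval excludes zero, the specific indirect effect is significant, which is exactly what the estimand output tells you.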
<br />
==Troubleshooting==<br />
Estimands and plugins should work every time as long as you follow the instructions provided above. Sometimes, however, they don't: they might not show up, they may display with a question mark in front of them, or they may throw an error. Here are some common errors and how to fix them.
<br />
'''Question Mark in front of Plugin name'''<br />
*This happens when you haven't unblocked the plugin. When you download library files (.dll) from the internet, Windows security settings sometimes block the file from being active. To fix this, right-click the plugin file, select Properties from the menu, then, in the General tab, check the box at the bottom that says "Unblock". If no Unblock box appears, then this is not the problem. <br />
*This might also happen if you are not the administrator of your computer. In this case, make sure to run AMOS as administrator. You can do this by closing all AMOS windows, then right-clicking the AMOS Graphics icon and selecting "Run as administrator". <br />
*In rare cases, this can be due to an extension misdirection. Here is a solution for that: [https://drive.google.com/drive/u/1/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c Extension Misdirection]<br />
'''Plugin does not appear in plugins menu'''<br />
*This happens when you put the plugin in the wrong folder. Please make sure you have followed the correct installation instructions based on which version of AMOS you are running. If you are not sure which version of AMOS you are running, click on the Help menu in AMOS and select About. This will pop up a window showing your version number. <br />
'''Plugin appears in plugin menu correctly, and runs, but fails'''<br />
*This can happen if your model is not specified correctly (e.g., violates some modeling assumption). Make sure to covary all your exogenous variables. Make sure to name your variables appropriately (no spaces or hard returns). <br />
*This can also happen if you don't follow the instructions in the video demonstrating how to use the plugin. A link to these videos is provided at the top of this page.<br />
*Specific to the PatternMatrixBuilder, the error can occur if you are using comma notation instead of decimal notation, or if you are using variable labels in SPSS (instead of variable names). Here is how to fix this specific issue of names and labels: <br />
**[[File:YouTube.png]] [https://youtu.be/3bAPwFern_4 '''SPSS Names and Labels Issue''']<br />
'''Specific Indirect Effects plugin shows syntax error'''<br />
*The specific indirect effects plugin uses underscores as part of its process. So, if you have underscores in your variable names, this will break the plugin. Just remove the underscores from your variable names (replace them with nothing). For example, if your variable used to be "Var_Fun", change it to "VarFun" (no underscore).</div>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to decide how to word your findings, how much space to devote to them, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me, unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai in MIS Quarterly titled "Avoiding Type III Errors: Formulating Research Problems that Matter." It is written for the information systems field but generalizes to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the affected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development articles on your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with between five and eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read out loud each item and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, even existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time – though they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, or poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are now ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically studies of organizations regarding performance and employee dispositions and intentions are simple and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to return abandoned, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
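The pilot-study step above mentions Cronbach's alpha as the key reliability score. If you want to sanity-check what SPSS reports, the statistic is simple to compute yourself; a minimal sketch, using made-up 5-point Likert responses (all data below are hypothetical):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an n_respondents x k_items response matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical pilot data: 6 respondents x 4 reflective items
pilot = [[4, 5, 4, 4],
         [2, 2, 3, 2],
         [5, 5, 5, 4],
         [3, 3, 2, 3],
         [4, 4, 5, 4],
         [1, 2, 1, 2]]
print(round(cronbach_alpha(pilot), 2))  # these items move together, so alpha is high
```

A common rule of thumb is alpha above 0.7 for acceptable reliability; if your pilot falls below that, revise the weakest items before the full data collection.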
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” if possible. These get coded as either 0 or 6 or 8, etc. but the number is completely invalid. However, when you’re doing statistics on it, your statistics software doesn’t know that those numbers are invalid, so it uses them as actual datapoints. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end, so they rarely capture the trait the way you intend. When I design new surveys, I nearly always re-reverse the reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. More subtle such measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
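Point 10 above, re-reversing reverse-coded items, is a one-line transform at analysis time: on a k-point scale, recode a response x as (k + 1) − x, so 6 − x on a 5-point scale. A minimal sketch (the item and response values are hypothetical):

```python
def rereverse(responses, scale_max=5):
    """Flip reverse-coded Likert responses so higher = more of the trait."""
    return [scale_max + 1 - r for r in responses]

boring = [1, 2, 5, 4, 3]       # "Using the VR was boring" (reverse coded)
print(rereverse(boring))        # -> [5, 4, 1, 2, 3]
```

Do this recoding before any factor analysis or reliability checks, so all items in a construct point the same direction.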
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run'''] <br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin] (if method bias was detected, remove the CLF or whatever variable is affecting all observed variables, while conducting this final validity check. You would then put it back in before imputing factor scores if there is bias.)<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
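The case-screening step at the top of this list (missing data, unengaged responses) is often done in Excel or SPSS, but the logic is simple enough to sketch in a few lines. The thresholds and data below are hypothetical illustrations, not recommendations:

```python
from statistics import pstdev

def screen_cases(rows, max_missing=0.1, min_sd=0.3):
    """Drop cases with too much missing data or straight-lined responses."""
    kept = []
    for row in rows:
        answered = [v for v in row if v is not None]
        missing_rate = 1 - len(answered) / len(row)
        if missing_rate > max_missing:
            continue                    # too much missing data in this row
        if pstdev(answered) < min_sd:
            continue                    # near-zero variance: unengaged respondent
        kept.append(row)
    return kept

responses = [
    [4, 5, 3, 4, 2],                    # engaged respondent, kept
    [3, 3, 3, 3, 3],                    # straight-liner, dropped
    [4, None, None, 5, 2],              # 40% missing, dropped
]
print(len(screen_cases(responses)))     # -> 1
```

Whatever tool you use, record how many cases each rule removed so you can report it in the analysis section.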
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages) ''The main purpose of the introduction is to convince the reader that this study is needed.''<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages) ''The main purpose of the literature review section is to establish who else has addressed a similar research question, and how your study will extend or clarify these. This helps both for positioning your contribution within extant literature, and for motivating why your study is needed (beyond these extant studies).''<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included. <br />
<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious) ''The main purpose of this section is to justify your theory (provide sound rationale for the relationships you are proposing).''<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template) ''The main purpose of this section is to convince the reader that you chose the right method and that you did it correctly.''<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don't forget to discuss response rate (number of responses as a percentage of number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section) ''The main purpose of this section is to convince the reader that your analysis was done correctly and your hypotheses were tested appropriately.''<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach’s alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages) ''The main purpose of this section is to report the results of your hypothesis tests.''<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages) ''The main purpose of this section is to convince the reader of your contributions, and to expand their understanding of your findings.''<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs) ''The main purpose of this section is to motivate the reader to use your work in their own work.''<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
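A side note on the CR and AVE values called for in the analysis section: both are simple functions of the standardized factor loadings, so they are easy to sanity-check by hand. Below is a minimal Python sketch; the loadings are hypothetical and the function names are my own, not from any statistics package.<br />

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability (CR), treating each item's error variance
    as 1 - loading^2 (standardized solution)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

loads = [0.82, 0.79, 0.85, 0.74]  # hypothetical standardized loadings
print(round(ave(loads), 3))                    # 0.642
print(round(composite_reliability(loads), 3))  # 0.877
```

The usual rules of thumb are AVE > .50 for convergent validity and CR > .70 for reliability, with AVE exceeding the squared inter-construct correlations for discriminant validity.<br />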
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Guidelines&diff=2527243Guidelines2019-03-04T21:47:46Z<p>Jgaskin: /* Structuring a Quantitative Paper */</p>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the effected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development of your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises include sitting down with between five and eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult to access population, such as CEOs, you can probably get away with doing talk-alouds with upper level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read out loud each item and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, even existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time – although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, or poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are now ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically studies of organizations regarding performance and employee dispositions and intentions are simple and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to return abandoned, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
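For step 6 (the pilot study), Cronbach’s alpha is straightforward to compute yourself if you want to double-check your statistics package. Here is a minimal Python sketch, using made-up pilot responses (n=12) to five hypothetical Enjoyment items like those above:<br />

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for one construct.
    `items` is a list of k columns; each column holds the n responses
    (e.g., 1-5 Likert scores) to one item."""
    k = len(items)
    item_variance = sum(pvariance(col) for col in items)
    totals = [sum(person) for person in zip(*items)]
    return (k / (k - 1)) * (1 - item_variance / pvariance(totals))

# Made-up pilot data: 5 Enjoyment items x 12 respondents
enjoy = [
    [4, 5, 3, 4, 5, 2, 4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 5, 2, 4, 4, 3, 4, 3, 5],
    [5, 5, 2, 4, 4, 2, 3, 5, 3, 4, 2, 4],
    [4, 5, 3, 4, 5, 1, 4, 5, 2, 4, 2, 5],
    [4, 4, 3, 5, 4, 2, 4, 4, 3, 5, 3, 5],
]
print(round(cronbach_alpha(enjoy), 2))  # 0.95 -> items hang together well
```

Alphas above roughly .70 are conventionally acceptable at the pilot stage; if yours is lower, look for the item whose removal raises alpha and revise or drop it.<br />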
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” if possible. These get coded as a number (0, 6, 8, etc.), but that number is completely invalid. However, when you’re doing statistics on it, your statistics software doesn’t know that those numbers are invalid, so it uses them as actual datapoints. <br />
#Despite literature stating the contrary, I’ve found reverse coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end of the scale. So they rarely actually capture the trait the way you intend. When I design new surveys, I nearly always re-reverse reverse coded questions so that they are in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious examples include: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. Subtler measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but will still result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
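Points 6, 10, and 15 above reduce to a couple of lines of code during data preparation. Here is a minimal Python sketch of re-reversing a reverse-coded item and checking an attention trap; all item names and responses are hypothetical.<br />

```python
def reverse_code(value, scale_min=1, scale_max=5):
    """Flip a reverse-coded Likert response so that a higher number
    always means more of the trait (point 6 above)."""
    return scale_max + scale_min - value

# Hypothetical respondent: Enjoyment items plus one attention trap
row = {"enjoy1": 5, "enjoy2": 4, "enjoy_boring": 2, "trap": 2}

# "Using the VR was boring" is reverse coded: a 2 becomes a 4
row["enjoy_boring"] = reverse_code(row["enjoy_boring"])

# The trap said "please respond with somewhat disagree" (coded 2)
passed_trap = row["trap"] == 2
print(row["enjoy_boring"], passed_trap)  # 4 True
```

Respondents who fail the trap should be removed before any factor analysis; keeping them inflates noise and deflates reliability.<br />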
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]] '''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run'''] <br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin] (if method bias was detected, remove the CLF or whatever variable is affecting all observed variables, while conducting this final validity check. You would then put it back in before imputing factor scores if there is bias.)<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
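Two of the screening checks above, unengaged (straight-lining) responses and skewness, reduce to a few lines of code. Here is a minimal Python sketch; the 0.5 and 2.0 cutoffs are common rules of thumb, not fixed standards.<br />

```python
from statistics import mean, pstdev

def is_unengaged(row, min_sd=0.5):
    """Flag straight-lining: a respondent whose Likert answers barely
    vary across the whole row is probably not reading the items."""
    return pstdev(row) < min_sd

def skewness(values):
    """Population skewness; |skew| greater than ~2 is a common
    screening threshold for continuous variables."""
    m, sd, n = mean(values), pstdev(values), len(values)
    return sum((x - m) ** 3 for x in values) / (n * sd ** 3)

print(is_unengaged([3, 3, 3, 3, 3, 3]))  # True: straight-liner
print(is_unengaged([4, 2, 5, 1, 4, 3]))  # False: engaged
print(round(skewness([1, 1, 1, 5]), 2))  # 1.15: right-skewed
```

Run the unengaged check on each row before the EFA, and the skewness check on each continuous column; severely skewed variables may need transformation or bootstrapped estimates.<br />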
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included. <br />
**''The main purpose of the literature review section is to establish who else has addressed a similar research question, and how your study will extend or clarify these. This helps both for positioning your contribution within extant literature, and for motivating why your study is needed (beyond these extant studies).''<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.) and sample size; don't forget to discuss response rate (the number of responses as a percentage of the number of people invited to take the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report the pattern matrix and Cronbach's alphas in an appendix) – mention if any items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention your CMB approach and results, and any actions taken (e.g., if you found CMB and had to retain the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
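The CR and AVE figures mentioned above can be computed directly from standardized factor loadings. Here is a minimal Python sketch of the standard formulas; the loadings are made-up illustration values, not from any real dataset.<br />

```python
# Composite Reliability (CR) and Average Variance Extracted (AVE)
# from standardized factor loadings.
# The loadings below are hypothetical illustration values.

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_l = sum(loadings)
    error_var = sum(1 - l ** 2 for l in loadings)  # standardized error variances
    return sum_l ** 2 / (sum_l ** 2 + error_var)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

loadings = [0.82, 0.78, 0.75, 0.71]  # hypothetical construct with 4 items
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
# Common thresholds: CR > 0.70 and AVE > 0.50 suggest adequate
# reliability and convergent validity.
```

The square root of each construct's AVE can then be compared against its correlations with the other constructs to check discriminant validity.<br />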
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model details, that is necessary for validating, understanding, or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Main_Page&diff=2493125Main Page2019-02-23T16:24:47Z<p>Jgaskin: </p>
<hr />
<div>== Welcome to Gaskination's StatWiki! ==<br />
''''' Supported by the Doctor of Management Program at Case Western Reserve University and by Brigham Young University'''''<br />
<br />
This wiki has been created to provide you with all sorts of statistics tutorials to guide you through the standard statistical analyses common to hypothesis testing in the social sciences. Examples are geared toward organizational, business, and management fields. AMOS, SPSS, Excel, SmartPLS, and PLS-graph are used to perform all analyses provided on this wiki. This wiki is not exhaustive, or even very comprehensive. I provide ''brief'' explanations of concepts, rather than full-length instruction. My main focus is on providing guidance on ''how to perform'' the statistics. This is very much a mechanically oriented resource. For more comprehensive instruction on the methods demonstrated in this wiki, please refer to Hair et al. (2010) (''Multivariate Data Analysis''), as well as to the PowerPoint presentations offered for most of the topics. I hope you find the resources here useful. I will likely update them from time to time. <br />
<br />
This teaching material has been developed as part of a quantitative social science method sequence aimed at preparing [http://weatherhead.case.edu/degrees/doctor-management Doctor of Management] students for their quantitative research project. These students are working executives who carry out a rigorous quantitative project as part of their research stream. Examples of these projects and examples of how to report results of quantitative research projects in academic papers can be found in the [http://weatherhead.case.edu/degrees/doctor-management/dm-research DM Research Library]. <br />
<br />
'''Acknowledgments'''<br />
<br />
The materials and teaching approach adopted here have been developed by a team of teachers consisting of Jagdip Singh, Toni Somers, Kalle Lyytinen, Nick Berente, Shyam Giridharadas, and me over the last several years. Although I have developed and refined much of the material and the resources in this Wiki, I am not the sole contributor. I greatly appreciate the work done by Kalle Lyytinen (Case Western Reserve University), Toni Somers (Wayne State University), Nick Berente (University of Georgia), Shyam Giridharadas (University of Washington), and Jagdip Singh (Case Western Reserve University), who selected and identified much of the literature underlying the materials and also originally developed many of the PowerPoint slides. They also helped me refine these materials by providing useful feedback on the slides and videos. I also appreciate the contribution and help of Jagdip Singh (Case Western Reserve University), who owns the Sohana and Bencare datasets used in the examples and made available below. I also acknowledge the continued support of the [http://weatherhead.case.edu/degrees/doctor-management Doctor of Management Program] at the [http://weatherhead.case.edu Weatherhead School of Management] at [http://www.case.edu Case Western Reserve University], Cleveland, Ohio, for their involvement, support, and sponsorship of this wiki, as well as Brigham Young University for encouraging me in all my SEM-related endeavors.<br />
<br />
''Please report any problems with the wiki to james.eric.gaskin@gmail.com'' [mailto:james.eric.gaskin@gmail.com]<br />
*If you are having trouble and cannot figure out what to do, even after using the resources on this wiki or on Gaskination, then you might benefit from the archive of support emails I have received and responded to over the past years: [http://www.kolobkreations.com/StatsHelpArchive.pdf Stats Help Archive].<br />
*[[File:Excelicon.jpg]]You may find this set of Excel tools useful (even necessary) for many of the analyses you will learn about in this wiki: [http://www.kolobkreations.com/Stats%20Tools%20Package.xlsm '''''Stats Tools Package'''''] Please note that this is the most recently updated version; it no longer includes a variance column in the Validity Master sheet, because it was a mistake to include variances when working with standardized estimates. <br />
*You may also find this basics tutorial for AMOS and SPSS useful as a starter.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=efC81f-Z22Q '''Basic Analysis in AMOS and SPSS'''] <br />
<br />
'''Datasets'''<br />
<br />
Here are some links to the datasets, and related resources, I use in many of the video tutorials. <br />
*[http://www.kolobkreations.com/YouTube%20SEM%20Series.sav YouTube SEM Series] (this data goes along with this YouTube playlist: [https://www.youtube.com/playlist?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series 2016'''])<br />
*[http://www.kolobkreations.com/Sohana.zip Sohana]<br />
*[http://www.kolobkreations.com/Bencare.zip Bencare]<br />
*[http://www.kolobkreations.com/SalesPerformance.sav Sales Performance]<br />
*[https://drive.google.com/open?id=1k-4v8qFGqKi3m-mrQtyct6j2AxkJ4R-h Example Models]<br />
*[http://www.kolobkreations.com/BurgersOriginal.sav Burgers]<br />
<br />
===How to cite Gaskination resources===<br />
''[http://www.kolobkreations.com/PLSIEEETPC2014.pdf IEEE TPC PLS article]'': <br />
*Paul Benjamin Lowry & James Gaskin (2014). “Partial Least Squares (PLS) Structural Equation Modeling (SEM) for Building and Testing Behavioral Causal Theory: When to Choose It and How to Use It,” IEEE TPC (57:2), pp. 123-146.<br />
<br />
''Wiki'':<br />
*Gaskin, J., (2016), "Name of section", Gaskination's StatWiki. http://statwiki.kolobkreations.com<br />
<br />
''YouTube videos'':<br />
*Gaskin, J., (Year video uploaded), "Name of video", Gaskination's Statistics. http://youtube.com/Gaskination<br />
<br />
''Stats Tools Package'':<br />
*Gaskin, J., (2016), "Name of tab", Stats Tools Package. http://statwiki.kolobkreations.com<br />
<br />
''Plugin or Estimand'':<br />
*Gaskin, J., (2016), "Name of Plugin or Estimand", Gaskination's Statistics. http://statwiki.kolobkreations.com<br />
<br />
== StatWiki Contents ==<br />
1. [[Data screening]]<br />
<br />
*[[Data screening#Missing Data|Missing Data]]<br />
*[[Data screening#Outliers|Outliers]]<br />
*[[Data screening#Normality|Normality]]<br />
*[[Data screening#Linearity|Linearity]]<br />
*[[Data screening#Homoscedasticity|Homoscedasticity]]<br />
*[[Data screening#Multicollinearity|Multicollinearity]] <br />
<br />
2. [[Exploratory Factor Analysis]] (EFA)<br />
<br />
*[[Exploratory Factor Analysis#Rotation types|Rotation types]]<br />
*[[Exploratory Factor Analysis#Factoring methods|Factoring methods]]<br />
*[[Exploratory Factor Analysis#Appropriateness of data|Appropriateness of data]]<br />
*[[Exploratory Factor Analysis#Communalities|Communalities]]<br />
*[[Exploratory Factor Analysis#Dimensionality|Dimensionality]]<br />
*[[Exploratory Factor Analysis#Factor Structure|Factor Structure]]<br />
*[[Exploratory Factor Analysis#Convergent validity|Convergent validity]]<br />
*[[Exploratory Factor Analysis#Discriminant validity|Discriminant validity]]<br />
*[[Exploratory Factor Analysis#Face validity|Face validity]]<br />
*[[Exploratory Factor Analysis#Reliability|Reliability]]<br />
*[[Exploratory Factor Analysis#Formative vs. Reflective|Formative vs. Reflective]]<br />
<br />
3. [[Confirmatory Factor Analysis]] (CFA)<br />
<br />
*[[Confirmatory Factor Analysis#Model Fit|Model Fit]]<br />
*[[Confirmatory Factor Analysis#Validity and Reliability|Validity and Reliability]]<br />
*[[Confirmatory Factor Analysis#Common Method Bias (CMB)|Common Method Bias (CMB)]]<br />
*[[Confirmatory Factor Analysis#Measurement_Model_Invariance|Invariance]]<br />
*[[Confirmatory Factor Analysis#2nd Order Factors|2nd Order Factors]]<br />
<br />
4. [[Structural Equation Modeling]] (SEM)<br />
<br />
*[[Structural Equation Modeling#Hypotheses|Hypotheses]]<br />
*[[Structural Equation Modeling#Controls|Controls]]<br />
*[[Structural Equation Modeling#Mediation|Mediation]]<br />
*[[Structural Equation Modeling#Interaction|Interaction]]<br />
*[[Structural Equation Modeling#Model fit again|Model fit again]]<br />
*[[Structural Equation Modeling#Multi-group|Multi-group]]<br />
*[[Structural Equation Modeling#From Measurement Model to Structural Model|From Measurement Model to Structural Model]]<br />
*[[Structural Equation Modeling#Creating Composites from Latent Factors|Creating Composites from Latent Factors]]<br />
<br />
5. [[PLS]] (Partial Least Squares)<br />
*[[PLS#Installing PLS-graph|Installing PLS-graph]]<br />
*[[PLS#Troubleshooting|Troubleshooting]]<br />
*[[PLS#Sample Size Rule|Sample Size Rule]]<br />
*[[PLS#Factor Analysis|Factor Analysis]]<br />
*[[PLS#Testing Causal Models|Testing Causal Models]]<br />
*[[PLS#Testing Group Differences|Testing Group Differences]]<br />
*[[PLS#Handling Missing Data|Handling Missing Data]]<br />
*[[PLS#Convergent and Discriminant Validity|Convergent and Discriminant Validity]]<br />
*[[PLS#Common Method Bias|Common Method Bias]]<br />
*[[PLS#Interaction|Interaction]]<br />
*[[PLS#SmartPLS|SmartPLS]]<br />
<br />
6. [[Guidelines|General Guidelines]]<br />
<br />
*[[Guidelines#Example Analysis|Example Analysis]]<br />
*[[Guidelines#Ten Steps|Ten Steps to Building a Good Quant Model]]<br />
*[[Guidelines#Order of Operations|Order of Operations]]<br />
*[[Guidelines#Structuring a Quantitative Paper|General Guidelines to Writing a Quant Paper]]<br />
<br />
7. [[Cluster Analysis|Cluster Analysis]]<br />
*Just a bunch of videos here</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Plugins&diff=2364530Plugins2019-01-26T15:37:58Z<p>Jgaskin: /* Estimands */</p>
<hr />
<div>==Overview==<br />
AMOS does not do everything I want it to do, so with the help of some research assistants, we have created some plugins and estimands to make up the difference. A plugin is a macro that can be used to automate AMOS. An estimand is a custom function that can add calculations and output to the AMOS analysis. We hope you find these useful. If they do not work for you, please refer to the troubleshooting section below. '''Here is a link to the Google Drive folder containing the plugins and estimands: [https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing Plugins and Estimands].''' <br />
Here are a few YouTube videos explaining how to use them:<br />
*[[File:YouTube.png]] [https://youtu.be/sLtMOFcojZY '''Installing Plugins for AMOS v23- (plus a demo of EFA->CFA)''']<br />
*[[File:YouTube.png]] [https://youtu.be/nf6fzpmnpDc '''Installing Plugins for AMOS v24+''']<br />
*[[File:YouTube.png]] [https://youtu.be/ICnh3s2FG14 '''Example of Using Estimands''']<br />
*[[File:YouTube.png]] [https://www.youtube.com/user/Gaskination/search?query=plugin '''List of all Plugins videos on Gaskination''']<br />
<br />
==Plugins==<br />
===Installation Instructions:===<br />
#Download the plugin or estimand to your own computer on your Windows side.<br />
#Right click, go to properties, and then on the general tab, at the bottom, if there is a button captioned "unblock", click on this button. If not, no worries.<br />
#If a plugin, place the file into the following folder: <br />
*if using AMOS version 23 or lower:<br />
**C:\Program Files (x86)\IBM\SPSS\Amos\23\Plugins<br />
**In this case, 23 is the AMOS version number.<br />
*if using AMOS version 24 or higher:<br />
**C:\Users\{username}\AppData\Local\AmosDevelopment\Amos\{AmosVersion}\Plugins<br />
**Make sure to replace username and AmosVersion with your own local directories.<br />
<br />
===List of Plugins===<br />
'''CleanEstimatesTable'''<br />
*This plugin creates a new table that includes the IV, DV, and standardized regression weights (with p-value significance indication). This helps because AMOS reports these in separate tables and lists the DV before the IV. <br />
'''CLF24'''<br />
*This is the old plugin for testing method bias. I discourage you from using this one, as I've updated it with the ModelBias plugin described below. I leave this one up for now so that users won't email me asking where it is after having viewed my video about it.<br />
'''EraseAll'''<br />
*AMOS does not provide a way to clear the canvas, but keep the datafile linked. So, this plugin will erase all objects on the canvas, but will retain the link to your dataset. <br />
'''EraseSelected'''<br />
*This plugin erases just the objects that you've selected (highlighted in blue). This is slightly faster than deleting each object individually with the X tool. <br />
'''Magiclean'''<br />
*This plugin centers your model on the page, resizes it to fit the page, and adjusts line angles and entry points to make them appear more symmetric. <br />
'''MasterValidity'''<br />
*This plugin produces an HTML file with a correlation table of constructs, including the square root of the AVE on the diagonal, the CR and the AVE, as well as the less used MSV and MaxR. It also provides some interpretation and indication of validity issues. When validity issues occur, it also provides some recommendations. References for validity thresholds are provided.<br />
'''ModelBias'''<br />
*This plugin automates the tedious job of testing a model for specific bias or common method bias by running multiple constrained and unconstrained models through chi-square difference tests. The output is an HTML file that includes a table of the results, as well as interpretation, recommendations, and a reference. <br />
'''ModelFit'''<br />
*This plugin creates an HTML file with all the relevant model fit measures, their thresholds, and an interpretation, as well as references for the suggested thresholds.<br />
'''Multigroup'''<br />
*This plugin conducts a multigroup analysis on a causal path model (no latent variables allowed). It conducts multiple chi-square difference tests to determine whether there are path-wise differences between groups.<br />
'''PatternMatrixBuilder'''<br />
*This plugin automates the tedious job of creating a CFA from a pattern matrix. You can paste a pattern matrix from SPSS into the plugin window and it will automatically generate your model for you. All you have to do after that is to rename the latent factors appropriately.<br />
'''IndirectEffects'''<br />
*This plugin automatically estimates all possible indirect effects in the model. It currently only works for models without latent variables. It also must be used in conjunction with the "SpecificIndirectEffects" estimand (not to be confused with the "MyIndirectEffects" estimand...).<br />
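The ModelBias and Multigroup plugins described above both rest on chi-square difference tests between constrained and unconstrained models. For a difference of one degree of freedom (a single constrained path), the p-value can be sketched with just the Python standard library, since a chi-square variate with 1 df is the square of a standard normal; for larger df differences you would use a statistics library (e.g., scipy.stats.chi2.sf). The fit statistics below are hypothetical.<br />

```python
import math

def chisq_diff_p_df1(chisq_constrained, chisq_unconstrained):
    """p-value of a chi-square difference test with 1 degree of freedom.

    A chi-square variate with 1 df is the square of a standard normal,
    so P(X > d) = erfc(sqrt(d) / sqrt(2)) exactly.
    """
    d = chisq_constrained - chisq_unconstrained
    return math.erfc(math.sqrt(d) / math.sqrt(2))

# Hypothetical fit statistics: constraining one path to be equal
# across groups raises the model chi-square from 210.3 to 216.8.
d = 216.8 - 210.3
p = chisq_diff_p_df1(216.8, 210.3)
print(f"delta chi-square = {d:.2f}, df = 1, p = {p:.4f}")
# p < 0.05 indicates the constrained model fits significantly worse,
# i.e., the groups differ on that path.
```
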
<br />
==Estimands==<br />
===GENERAL INSTRUCTIONS:===<br />
*Click on the bottom left part of AMOS where it says "not estimating any user defined estimand".<br />
*Then click 'select estimand', which will let you go find the estimand.<br />
*Make sure to include bootstrapping in the analysis (Analysis Properties, Bootstrap, Perform Bootstrap).<br />
<br />
===LIST OF ESTIMANDS:===<br />
'''ABCindirectEffect'''<br />
*DESCRIPTION: This is for specific serial mediation where there are two mediators in a row. <br />
*INSTRUCTION: Name the first path A, second path B, third path C (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
'''MyGroupDifferences'''<br />
*DESCRIPTION: This is for testing the difference between two regression coefficients. I made it specifically for comparing paths across multiple groups, but it can be used for comparing any two regression weights, even within groups.<br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. If doing this for the same path, but for different groups, then make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
'''MyIndirectEffects'''<br />
*DESCRIPTION: This is for specific mediation, where you want to isolate the indirect effect of a specific mediator when there are multiple mediators. This works for one indirect effect at a time.<br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
'''SpecificIndirectEffects'''<br />
*DESCRIPTION: This is for estimating ALL specific indirect effects automatically along with the IndirectEffects Plugin. <br />
*INSTRUCTION: Just make sure this one is being estimated when using the IndirectEffects plugin. The IndirectEffects plugin will not work with the "MyIndirectEffects" estimand.<br />
'''MyModMed'''<br />
*DESCRIPTION: This is for moderated mediation, where the mediation occurs in two different groups. This estimand can also be used to compare mediation within the same group, if there are multiple indirect paths. <br />
*INSTRUCTION: For Group 1, name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. For Group 2, name the first path C and the second path D. This should be the same paths as A and B, but for the second group. So, make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
'''PathComparison'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
'''PlayingAround'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
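The A and B path names above correspond to the familiar product-of-paths indirect effect: the estimand computes A*B, and AMOS's bootstrap builds a confidence interval from the resampled products. As a rough standard-library sketch of the same idea (not the AMOS implementation), here is a percentile bootstrap of a*b on simulated data; all values and variable names are invented for illustration.<br />

```python
import random
import statistics

def ab_paths(x, m, y):
    """OLS paths for the mediation model X -> M -> Y.

    a: slope of M on X; b: slope of Y on M, controlling for X.
    """
    mx, mm, my = statistics.fmean(x), statistics.fmean(m), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)  # normal equations
    return a, b

# Simulated data with a known indirect effect of 0.5 * 0.5 = 0.25.
rng = random.Random(42)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 0.5) for xi in x]
y = [0.5 * mi + rng.gauss(0, 0.5) for mi in m]

# Percentile bootstrap of the indirect effect a*b.
boot = []
for _ in range(1000):
    idx = [rng.randrange(n) for _ in range(n)]
    a, b = ab_paths([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
    boot.append(a * b)
boot.sort()
lo, hi = boot[24], boot[974]  # approximate 95% percentile CI
print(f"indirect effect 95% CI: [{lo:.3f}, {hi:.3f}]")
# A CI that excludes zero indicates a significant indirect effect.
```
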
<br />
==Troubleshooting==<br />
Estimands should work every time as long as you follow the instructions provided above. However, sometimes the plugins don't work. They might not show up, or they may display with a question mark in front of them, or they may throw an error. Here are some common errors and how to fix them.<br />
<br />
'''Question Mark in front of Plugin name'''<br />
*This happens when you haven't unblocked the plugin. Sometimes when you download library files (.dll) from the internet, your security protocols prevent the file from being active. To fix this, right-click the plugin file, select Properties from the menu, then, in the General tab, check the box at the bottom that says "Unblock". If no Unblock box appears, then this is not the problem. <br />
*This might also happen if you are not the administrator of your laptop. In this case, make sure to run AMOS as administrator. You can do this by closing all AMOS windows, then right clicking the AMOS Graphics icon and selecting "run as administrator". <br />
*In rare cases, this can be due to an extension misdirection. Here is a solution for that: [https://drive.google.com/drive/u/1/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c Extension Misdirection]<br />
'''Plugin does not appear in plugins menu'''<br />
*This happens when you stick the plugin in the wrong folder. Please make sure you have followed the correct installation instructions based on which version of AMOS you are running. If you are not sure which version of AMOS you are running, click on the Help menu in AMOS, and then select About. This will popup your version number. <br />
'''Plugin appears in plugin menu correctly, and runs, but fails'''<br />
*This can happen if your model is not specified correctly (e.g., violates some modeling assumption). Make sure to covary all your exogenous variables. Make sure to name your variables appropriately (no spaces or hard returns). <br />
*This can also happen if you don't follow the instructions in the video demonstrating how to use the plugin. A link to these videos is provided at the top of this page.<br />
*Specific to the PatternMatrixBuilder, the error can occur if you are using comma notation instead of decimal notation, or if you are using variable labels in SPSS (instead of variable names). Here is how to fix this specific issue of names and labels: <br />
**[[File:YouTube.png]] [https://youtu.be/3bAPwFern_4 '''SPSS Names and Labels Issue''']</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Plugins&diff=2364520Plugins2019-01-26T15:35:43Z<p>Jgaskin: /* List of Plugins */</p>
<hr />
<div>==Overview==<br />
AMOS does not do everything I want it to do, so with the help of some research assistants, we have created some plugins and estimands to make up the difference. A plugin is a macro that can be used to automate AMOS. An estimand is a custom function that can add calculations and output to the AMOS analysis. We hope you find these useful. If they do not work for you, please refer to the troubleshooting section below. '''Here is a link to the Google Drive folder containing the plugins and estimands: [https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing Plugins and Estimands].''' <br />
Here are a few YouTube videos explaining how to use them:<br />
*[[File:YouTube.png]] [https://youtu.be/sLtMOFcojZY '''Installing Plugins for AMOS v23- (plus a demo of EFA->CFA)''']<br />
*[[File:YouTube.png]] [https://youtu.be/nf6fzpmnpDc '''Installing Plugins for AMOS v24+''']<br />
*[[File:YouTube.png]] [https://youtu.be/ICnh3s2FG14 '''Example of Using Estimands''']<br />
*[[File:YouTube.png]] [https://www.youtube.com/user/Gaskination/search?query=plugin '''List of all Plugins videos on Gaskination''']<br />
<br />
==Plugins==<br />
===Installation Instructions:===<br />
#Download the plugin or estimand to your own computer on your Windows side.<br />
#Right click, go to properties, and then on the general tab, at the bottom, if there is a button captioned "unblock", click on this button. If not, no worries.<br />
#If a plugin, place the file into the following folder: <br />
*if using AMOS version 23 or lower:<br />
**C:\Program Files (x86)\IBM\SPSS\Amos\23\Plugins<br />
**In this case, 23 is the AMOS version number.<br />
*if using AMOS version 24 or higher:<br />
**C:\Users\{username}\AppData\Local\AmosDevelopment\Amos\{AmosVersion}\Plugins<br />
**Make sure to replace username and AmosVersion with your own local directories.<br />
<br />
===List of Plugins===<br />
'''CleanEstimatesTable'''<br />
*This plugin creates a new table that includes the IV, DV, and standardized regression weights (with p-value significance indication). This helps because AMOS makes these separate, and puts the DV before the IV. <br />
'''CLF24'''<br />
*This is the old plugin for testing method bias. I discourage you from using this one, as I've updated it with the ModelBias plugin described below. I leave this one up for now so that users won't email me asking where it is after having viewed my video about it.<br />
'''EraseAll'''<br />
*AMOS does not provide a way to clear the canvas, but keep the datafile linked. So, this plugin will erase all objects on the canvas, but will retain the link to your dataset. <br />
'''EraseSelected'''<br />
*This plugin erases just the objects that you've selected (highlighted in blue). This is slightly faster than deleting each object individually with the X tool. <br />
'''Magiclean'''<br />
*This plugin centers your model on the page, resizes it to fit the page, and adjusts line angles and entry points to make them appear more symmetric. <br />
'''MasterValidity'''<br />
*This plugin produces an HTML file with a correlation table of constructs, including the square root of the AVE on the diagonal, the CR and the AVE, as well as the less used MSV and MaxR. It also provides some interpretation and indication of validity issues. When validity issues occur, it also provides some recommendations. References for validity thresholds are provided.<br />
'''ModelBias'''<br />
*This plugin automates the tedious job of testing the a model for specific bias or common method bias by running multiple contrained and unconstrained models through chi-square difference tests. The output is an HTML file that includes a table of the results, as well as interpretation, recommendations, and a reference. <br />
'''ModelFit'''<br />
*This plugin creates an HTML file with all the relevant model fit measures, their thresholds, and an interpretation, as well as references for the suggested thresholds.<br />
'''Multigroup'''<br />
*This plugin conducts a multigroup analysis on a causal path model (no latent variables allowed). It conducts multiple chi-square difference tests to determine whether there are path-wise differences between groups.<br />
'''PatternMatrixBuilder'''<br />
*This plugin automates the tedious job of creating a CFA from a pattern matrix. You can paste a pattern matrix from SPSS into the plugin window and it will automatically generate your model for you. All you have to do after that is to rename the latent factors appropriately.<br />
'''IndirectEffects'''<br />
*This plugin automatically estimates all possible indirect effects in the model. It currently only works for models without latent variables. It also must be used in conjunction with the "SpecificIndirectEffects" estimand (not to be confused with the "MyIndirectEffects" estimand...).<br />
<br />
==Estimands==<br />
===GENERAL INSTRUCTIONS:===<br />
#Click on the bottom left part of AMOS where it says "not estimating any user defined estimand".<br />
#Then click 'select estimand', which will let you browse to the estimand file.<br />
#Make sure to include bootstrapping in the analysis (Analysis Properties, Bootstrap, Perform Bootstrap).<br />
<br />
===LIST OF ESTIMANDS:===<br />
'''ABCindirectEffect'''<br />
*DESCRIPTION: This is for specific serial mediation where there are two mediators in a row. <br />
*INSTRUCTION: Name the first path A, second path B, third path C (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
'''MyGroupDifferences'''<br />
*DESCRIPTION: This is for testing the difference between two regression coefficients. I made it specifically for comparing paths across multiple groups, but it can be used for comparing any two regression weights, even within groups.<br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. If doing this for the same path, but for different groups, then make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
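I don't reproduce the estimand's code here, but the percentile-CI logic this kind of test relies on can be sketched in Python with simulated bootstrap draws (all numbers are made up):<br />

```python
import random

random.seed(1)
# Pretend bootstrap estimates of paths A and B (one pair per resample).
draws_a = [0.45 + random.gauss(0, 0.05) for _ in range(2000)]
draws_b = [0.20 + random.gauss(0, 0.05) for _ in range(2000)]
diffs = sorted(a - b for a, b in zip(draws_a, draws_b))

lower = diffs[50]    # 2.5th percentile of 2000 sorted draws
upper = diffs[1949]  # 97.5th percentile
significant = not (lower <= 0.0 <= upper)  # CI excluding 0 => A differs from B
```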
'''MyIndirectEffects'''<br />
*DESCRIPTION: This is for specific mediation, where you want to isolate the indirect effect of a specific mediator when there are multiple mediators. <br />
*INSTRUCTION: Name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. <br />
'''MyModMed'''<br />
*DESCRIPTION: This is for moderated mediation, where the mediation occurs in two different groups. This estimand can also be used to compare mediation within the same group, if there are multiple indirect paths. <br />
*INSTRUCTION: For Group 1, name the first path A and the second path B (caps matter). You can name a path by double-clicking it, going to the parameters tab of the object properties, and then typing the name in the regression weights box. For Group 2, name the first path C and the second path D. This should be the same paths as A and B, but for the second group. So, make sure to FIRST uncheck the box that is 'all groups' in the object properties. <br />
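In other words, the estimand compares the Group 1 indirect effect (A×B) with the Group 2 indirect effect (C×D), and AMOS's bootstrap supplies a confidence interval for the difference. A toy sketch with made-up path values:<br />

```python
# Moderated mediation as a difference of indirect effects. Path values are
# illustrative only; in AMOS they come from the bootstrapped estimates.
a, b = 0.50, 0.40  # Group 1: IV -> mediator, mediator -> DV
c, d = 0.20, 0.35  # Group 2: the same two paths

indirect_g1 = a * b
indirect_g2 = c * d
diff = indirect_g1 - indirect_g2
# Mediation is moderated if the bootstrap CI for diff excludes zero.
```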
'''PathComparison'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
'''PlayingAround'''<br />
*DESCRIPTION: This is an unfinished estimand with no purpose or functionality so far. <br />
*INSTRUCTION: DON'T USE THIS ONE :)<br />
==Troubleshooting==<br />
Estimands should work every time as long as you follow the instructions provided above. However, sometimes the plugins don't work. They might not show up, or they may display with a question mark in front of them, or they may throw an error. Here are some common errors and how to fix them.<br />
<br />
'''Question Mark in front of Plugin name'''<br />
*This happens when you haven't unblocked the plugin. When you download library files (.dll) from the internet, Windows security settings may block the file from being active. To fix this, right-click the plugin file, select Properties from the menu, then, in the General tab, check the box at the bottom that says "Unblock". If no Unblock box appears, then this is not the problem. <br />
*This might also happen if you are not the administrator of your laptop. In this case, make sure to run AMOS as administrator. You can do this by closing all AMOS windows, then right clicking the AMOS Graphics icon and selecting "run as administrator". <br />
*In rare cases, this can be due to an extension misdirection. Here is a solution for that: [https://drive.google.com/drive/u/1/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c Extension Misdirection]<br />
'''Plugin does not appear in plugins menu'''<br />
*This happens when you stick the plugin in the wrong folder. Please make sure you have followed the correct installation instructions based on which version of AMOS you are running. If you are not sure which version of AMOS you are running, click on the Help menu in AMOS, and then select About. This will pop up a window showing your version number. <br />
'''Plugin appears in plugin menu correctly, and runs, but fails'''<br />
*This can happen if your model is not specified correctly (e.g., violates some modeling assumption). Make sure to covary all your exogenous variables. Make sure to name your variables appropriately (no spaces or hard returns). <br />
*This can also happen if you don't follow the instructions in the video demonstrating how to use the plugin. A link to these videos is provided at the top of this page.<br />
*Specific to the PatternMatrixBuilder, the error can occur if you are using comma notation instead of decimal notation, or if you are using variable labels in SPSS (instead of variable names). Here is how to fix this specific issue of names and labels: <br />
**[[File:YouTube.png]] [https://youtu.be/3bAPwFern_4 '''SPSS Names and Labels Issue''']</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Plugins&diff=1727098Plugins2018-09-19T16:27:38Z<p>Jgaskin: /* List of Plugins */</p>
<hr />
<div>==Overview==<br />
AMOS does not do everything I want it to do, so with the help of some research assistants, we have created some plugins and estimands to make up the difference. A plugin is a macro that can be used to automate AMOS. An estimand is a custom function that can add calculations and output to the AMOS analysis. We hope you find these useful. If they do not work for you, please refer to the troubleshooting section below. '''Here is a link to the Google Drive folder containing the plugins and estimands: [https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing Plugins and Estimands].''' <br />
Here are a few YouTube videos explaining how to use them:<br />
*[[File:YouTube.png]] [https://youtu.be/sLtMOFcojZY '''Installing Plugins for AMOS v23- (plus a demo of EFA->CFA)''']<br />
*[[File:YouTube.png]] [https://youtu.be/nf6fzpmnpDc '''Installing Plugins for AMOS v24+''']<br />
*[[File:YouTube.png]] [https://youtu.be/ICnh3s2FG14 '''Example of Using Estimands''']<br />
*[[File:YouTube.png]] [https://www.youtube.com/user/Gaskination/search?query=plugin '''List of all Plugins videos on Gaskination''']<br />
<br />
==Plugins==<br />
===Installation Instructions:===<br />
#Download the plugin or estimand to your own computer on your Windows side.<br />
#Right click, go to properties, and then on the general tab, at the bottom, if there is a button captioned "unblock", click on this button. If not, no worries.<br />
#If a plugin, place the file into the following folder: <br />
*if using AMOS version 23 or lower:<br />
**C:\Program Files (x86)\IBM\SPSS\Amos\23\Plugins<br />
**In this case, 23 is the AMOS version number.<br />
*if using AMOS version 24 or higher:<br />
**C:\Users\{username}\AppData\Local\AmosDevelopment\Amos\{AmosVersion}\Plugins<br />
**Make sure to replace username and AmosVersion with your own local directories.<br />
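The folder logic above can be summarized in a small Python helper. The username and version are placeholders, and the paths are taken from the instructions above rather than verified against every AMOS release:<br />

```python
# Returns the plugin folder for a given AMOS version, per the installation
# instructions above. 'username' is a placeholder for your Windows account.
def plugin_folder(version, username="YOURNAME"):
    if version <= 23:
        return r"C:\Program Files (x86)\IBM\SPSS\Amos\{0}\Plugins".format(version)
    return r"C:\Users\{0}\AppData\Local\AmosDevelopment\Amos\{1}\Plugins".format(
        username, version)
```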
<br />
===List of Plugins===<br />
'''CleanEstimatesTable'''<br />
*This plugin creates a new table that includes the IV, DV, and standardized regression weights (with p-value significance indication). This helps because AMOS makes these separate, and puts the DV before the IV. <br />
'''CLF24'''<br />
*This is the old plugin for testing method bias. I discourage you from using this one, as I've updated it with the ModelBias plugin described below. I leave this one up for now so that users won't email me asking where it is after having viewed my video about it.<br />
</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1696909Structural Equation Modeling2018-09-11T14:26:43Z<p>Jgaskin: /* Mediation */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.” http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively effect weight loss"<br />
===Mediated effects===<br />
<br />
"Exercise mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise strengthens the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time dampens the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"The relationship between X and Y is stronger for Group A."<br />
<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, for eastern (asia) diets, the effect is positive and strong"<br />
<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time and diet''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and diet''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
In order for a hypothesis to be supported, many criteria must be met. These criteria can be classified as global or local tests. The local test must pass for the hypothesis to be supported, but the local test only has meaning if all global tests pass first. Global tests of model fit are the first necessity: if a hypothesized relationship has a significant p-value but the model has poor fit, we cannot have confidence in that p-value. Next is the global test of variance explained, or R-squared. We might observe significant p-values and good model fit, but if R-squared is only 0.025, then the relationships we are testing are not very meaningful because they do not explain sufficient variance in the dependent variable. The figure below illustrates the precedence of global and local tests. Lastly, and almost needless to explain, if a regression weight is significant but in the wrong direction, our hypothesis is not supported; instead, there is counter-evidence. For example, if we theorized that exercise would increase weight loss, but instead exercise decreased weight loss, we would have counter-evidence.<br />
<br />
[[File:globallocal.png]]<br />
<br />
== Controls ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Controls.pptx '''Controls''']<br />
Controls are potentially confounding variables that we need to account for, but that don't drive our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.<br />
<br />
[[File:controlsIQ.png]]<br />
<br />
As a cautionary note, you should nearly always include some controls; however, control variables still count against your sample size calculations, so the more controls you have, the larger your sample needs to be. Each added control also raises R-squared, but with increasingly smaller gains. Sometimes you may even find that adding a control "drowns out" all the effects of the IVs; in such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, account for only a small amount of the variance in the DV). With that in mind, you can't and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.<br />
<br />
Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don't have arrows going into them), and draw regression paths from them to whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control with regards to valLong. And I have LoyRepeat as a control on LoyLong. I've also covaried the controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than as default. However, there are different schools of thought on this. The downside of covarying with all exogenous variables is that you gain no degrees of freedom. If you are in need of degrees of freedom, then try removing the non-significant covariances with controls.<br />
<br />
[[File:controlsAMOS.png]]<br />
<br />
When reporting the model, you '''''do''''' need to include the controls in '''''all''''' your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don't get any crazy ideas, you would not test for mediation between a control and a dependent variable. However, you may report how a control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are other schools of thought on this issue).<br />
<br />
== Mediation ==<br />
*[[File:books.jpg]]'''''Lesson:''''' [http://www.kolobkreations.com/Mediation%20Step%20by%20Step%20with%20Bootstrapping.pptx '''Testing Mediation using Bootstrapping''']<br />
*[[File:YouTube.png]] '''''Video Lecture:''''' [http://youtu.be/j_yufPUjkwk?hd=1 '''A Simpler Guide to Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/ICnh3s2FG14 '''Mediation in AMOS''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/41XgTZc66ko '''Specific Indirect Effects''']<br />
*'''''Hair et al.:''''' ''pp. 751-755''<br />
<br />
=== Concept ===<br />
<br />
Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, ''work effectiveness'' may be a good mediator. We would say that work effectiveness mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is ''better'' explained through the mediator of work effectiveness. The logic is: intelligent workers tend to perform better '''because''' they work more effectively. Thus, when intelligence leads to working smarter, we observe greater performance. <br />
<br />
[[File:mediation.png]]<br />
<br />
<br />
We used to theorize three main types of mediation based on the Baron and Kenny approach; namely: 1) partial, 2) full, and 3) indirect. However, recent literature suggests that mediation is less nuanced than this -- that simply, if a significant indirect effect exists, then mediation is present.<br />
<br />
Here is another useful site for mediation: https://msu.edu/~falkcarl/mediation.html<br />
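The bootstrap test of an indirect effect described in the lesson above can be sketched in plain Python. The data are simulated (X -> M -> Y with both paths at 0.5), and for brevity the b path is estimated by a simple regression of Y on M, which is adequate here only because Y is generated from M alone; a real analysis would use the partial coefficient controlling for X.<br />

```python
import random

random.seed(42)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + random.gauss(0, 1) for xi in x]  # a path ~ 0.5
y = [0.5 * mi + random.gauss(0, 1) for mi in m]  # b path ~ 0.5

def slope(pred, out):
    # Simple OLS slope of 'out' on 'pred'.
    mp, mo = sum(pred) / len(pred), sum(out) / len(out)
    num = sum((p - mp) * (o - mo) for p, o in zip(pred, out))
    return num / sum((p - mp) ** 2 for p in pred)

boots = []
for _ in range(1000):  # resample cases with replacement, re-estimate a*b
    idx = [random.randrange(n) for _ in range(n)]
    xs = [x[i] for i in idx]
    ms = [m[i] for i in idx]
    ys = [y[i] for i in idx]
    boots.append(slope(xs, ms) * slope(ms, ys))

boots.sort()
lower, upper = boots[25], boots[974]    # 95% percentile CI of 1000 draws
mediated = not (lower <= 0.0 <= upper)  # CI excluding 0 => indirect effect
```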
<br />
== Interaction ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=K34sF_AmWio '''Testing Interaction Effects''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Interaction.pptx '''Interaction Effects''']<br />
===Concept===<br />
In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. <br />
This is another form of moderation (along with multi-grouping) – i.e., the X to Y relationship changes form (gets stronger, weaker, changes signs) depending on the value of another explanatory variable (the moderator). So, for example<br />
*you lose 1 pound of weight for every hour you exercise<br />
*you lose 1 pound of weight for every 500 calories you cut back from your regular diet<br />
*but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus, in total, you lose three pounds<br />
So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:<br />
*Chocolate is yummy<br />
*Cheese is yummy<br />
*but combining chocolate and cheese is yucky!<br />
<br />
The following figure is an example of a simple interaction model.<br />
<br />
[[File:interaction.png]]<br />
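The diet/exercise numbers from the bullets above can be written as a regression with an interaction (product) term; the coefficients below simply encode that made-up example.<br />

```python
# weight_loss = b1*exercise + b2*diet + b3*(exercise * diet)
# b1: 1 lb per hour of exercise; b2: 1 lb per 500-calorie cut;
# b3: the extra pound lost per 500-calorie cut when also exercising.
def weight_loss(exercise_hours, diet_units):
    # diet_units = number of 500-calorie cuts from the regular diet
    b1, b2, b3 = 1.0, 1.0, 1.0  # b3 is the interaction coefficient
    return b1 * exercise_hours + b2 * diet_units + b3 * exercise_hours * diet_units

# Exercising one hour while cutting 500 calories: 1 + 1 + 1 = 3 pounds,
# more than the 2 pounds the two main effects alone would predict.
total = weight_loss(1, 1)
```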
<br />
===Types===<br />
Interactions enable more precise explanation of causal effects by providing a method for explaining not only ''how'' X affects Y, but also ''under what circumstances'' the effect of X changes depending on the moderating variable of Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically. <br />
<br />
[[File:interactionTypes.png]]<br />
<br />
== Model fit again ==<br />
You already did model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. Every time the model changes and a hypothesis is tested, model fit must be assessed. If multiple hypotheses are tested on the same model, model fit will not change, so it only needs to be addressed once for that set of hypotheses. The method for assessing model fit in a causal model is the same as for a measurement model: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms. '''Also, a warning: some argue there is never an appropriate justification for covarying error terms.''' (I tend to agree that they should not be covaried.)<br />
*If the correlated variables are ''not'' logically '''causally''' correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.<br />
**e.g., burnout from customers is highly correlated with burnout from management<br />
**We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.<br />
*If the correlated variables are logically '''causally''' correlated, then simply add a regression line.<br />
**e.g., burnout from customers is highly correlated with satisfaction with customers<br />
**We expect burnC to predict satC, so ''not'' accounting for it is negligent.<br />
<br />
Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
<br />
== Multi-group ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mirI5ETQRTA '''Testing Multi-group Moderation using Chi-square difference test'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/w5ikoIgTIc0?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Testing Multi-group differences using AMOS's multigroup function''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Mediation%20and%20Multi-group%20Moderation.pptx '''Mediation versus Moderation''']<br />
Multi-group comparisons are a special form of moderation in which a dataset is split along values of a grouping variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. Multi-group comparisons are used to determine whether relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for example. A multi-group analysis would answer the question: does dieting affect weight loss differently for males than for females?<br />
In the videos above, you will learn how to set up a multigroup analysis in AMOS and test it using chi-square differences and AMOS's built-in multigroup function. For those who have seen my video on the critical ratios approach, be warned that the chi-square approach is currently the most widely accepted, because the critical ratios approach doesn't take into account the family-wise error that affects a model when testing multiple hypotheses simultaneously. For now, I recommend using the chi-square approach. The AMOS built-in multigroup function uses the chi-square approach as well.<br />
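The chi-square difference test itself is simple arithmetic on the AMOS output. Here is a minimal Python sketch using hypothetical fit statistics (scipy supplies the chi-square survival function):

```python
from scipy.stats import chi2

def chisq_difference_test(chisq_constrained, df_constrained,
                          chisq_unconstrained, df_unconstrained):
    """Chi-square difference test for multi-group moderation.
    Constrained model: the path(s) of interest forced equal across groups.
    A significant p-value means the groups differ (moderation)."""
    d_chisq = chisq_constrained - chisq_unconstrained
    d_df = df_constrained - df_unconstrained
    p = chi2.sf(d_chisq, d_df)
    return d_chisq, d_df, p

# Hypothetical fit statistics taken from AMOS output
d, df, p = chisq_difference_test(312.4, 121, 301.9, 120)
# d = 10.5 on 1 df: significant, so the constrained path differs across groups
```

The same arithmetic applies whether you constrain one path at a time or the whole set of measurement weights.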
<br />
==From Measurement Model to Structural Model ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n-ULF6BGVw0 '''From CFA to SEM in AMOS''']<br />
Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.<br />
<br />
==Creating Factor Scores from Latent Factors==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=dsOS9tQjxW8 '''Imputing Factor Scores in AMOS''']<br />
If you would like to create factor scores (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:<br />
*You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data). <br />
*Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").<br />
After those two caveats are addressed, then you can simply go to the ''Analyze'' menu, and select ''Data Imputation''. Select ''Regression Imputation'', and then click on the ''Impute'' button. This will create a new SPSS dataset with the same name as the current dataset except it will be followed by an "_C". This can be found in the same folder as your current dataset.<br />
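Conceptually, regression imputation predicts each missing value from the other variables using the complete rows. The sketch below is a simplified single-pass Python illustration with made-up data, not a reproduction of AMOS's exact algorithm:

```python
import numpy as np

def regression_impute(X):
    """Fill missing values (np.nan) in each column by regressing that column
    on the other columns, fitting on complete rows only."""
    X = np.array(X, dtype=float)
    filled = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        if not miss.any():
            continue
        others = [k for k in range(X.shape[1]) if k != j]
        complete = ~np.isnan(X).any(axis=1)
        A = np.column_stack([np.ones(complete.sum()), X[complete][:, others]])
        beta, *_ = np.linalg.lstsq(A, X[complete, j], rcond=None)
        rows = miss & ~np.isnan(X[:, others]).any(axis=1)
        B = np.column_stack([np.ones(rows.sum()), X[rows][:, others]])
        filled[rows, j] = B @ beta
    return filled

# Made-up data: column 1 is exactly twice column 0, with one missing cell
data = np.array([[1.0,  2.0, 5.0],
                 [2.0,  4.0, 3.0],
                 [3.0,  6.0, 8.0],
                 [4.0, np.nan, 6.0],
                 [5.0, 10.0, 7.0]])
imputed = regression_impute(data)   # the missing cell becomes 8.0
```

Real implementations iterate or use maximum likelihood; this one-shot version just shows the idea behind the "Regression Imputation" option.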
<br />
==Need more degrees of freedom==<br />
Did you run your model and observe that DF = 0 or CFI = 1.000? If so, you likely need more degrees of freedom. There are a few ways to get them:<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
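The degrees-of-freedom arithmetic behind these suggestions can be sketched as follows. The counts of observed variables and free parameters below are hypothetical; in practice AMOS reports both in its "Notes for Model" output:

```python
def model_df(n_observed, n_free_params, means_modeled=False):
    """Degrees of freedom for a covariance-structure model:
    distinct sample moments minus freely estimated parameters."""
    p = n_observed
    moments = p * (p + 1) // 2 + (p if means_modeled else 0)
    return moments - n_free_params

# e.g., 3 composites with every path drawn: 6 moments, 6 parameters
df_saturated = model_df(3, 6)    # 0 -> fit indices are meaningless
df_restricted = model_df(3, 5)   # 1 -> dropping one path frees one df
```

Each omitted path or fixed parameter reduces `n_free_params` by one, which is why trimming non-significant paths raises the degrees of freedom.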
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>

Citing Claims (2018-09-05)
<hr />
<div>I find that there are specific claims in SEM research that float around, but where they come from is often forgotten. So, I've made a list of these claims with some quotes and explanations below. Of course, I have also included a citation if the claim can be substantiated. If you have heard of a claim and know its source, feel free to email me and I'll determine if it should be added here. If you would like to cite this page in addition to the sources provided below, here is the recommended citation:<br />
*Gaskin, J. (2018) "Citing Claims", Gaskination's StatWiki, http://statwiki.kolobkreations.com/. <br />
<br />
<br />
== Four Indicators Per Factor ==<br />
===Claim===<br />
Have you heard the one about the "optimal number of indicators" per factor? I have heard it a few times, and I know I have read it in multiple places. I include one of those sources below.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 678: <br />
<br />
"In summary, when specifying the number of indicators per construct, the following is recommended: <br />
*Use four indicators whenever possible. <br />
*Having three indicators per construct is acceptable, particularly when other constructs have more than three.<br />
*Constructs with fewer than three indicators should be avoided."<br />
<br />
===Rationale===<br />
Joe's logic is that a minimum of three indicators is needed for identification, but four is a safer and more reliable configuration. More than four may result in a failure of unidimensionality (i.e., there may be multiple dimensions being captured). He also suggests four is the optimal number of indicators because it balances parsimony (the simplest solution) with requisite reliability (all else equal, reliability increases as the number of indicators increases). <br />
== Covarying Error Terms ==<br />
===Claim===<br />
Some claim you can covary error terms in a measurement model (CFA) under certain conditions in order to improve model fit. Others say you should always avoid it. In the past, I have taken both stances, with logic to support my decisions. However, as I've grown in understanding of SEM, I am more inclined to avoid covarying error terms if at all possible.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 675:<br />
<br />
"You also should not run CFA models that include covariances between error terms... Allowing these paths to be estimated (freeing them) will reduce the chi-square, but at the same time ''seriously question the construct validity of the construct''."<br />
===Rationale===<br />
Including a covariance arrow between errors implies that there is some relationship between the items of these variables that you are not accounting for properly in your model. Allowing their errors to covary essentially ignores the problem, much like putting a light bandage over a bullet wound without removing the bullet. It covers up the issue on the surface, but does nothing to address the underlying concerns. <br />
<br />
== More to come ==<br />
===Claim===<br />
===Source===<br />
===Rationale===</div>

Confirmatory Factor Analysis (2018-08-28)
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/wV6UudZSBCA '''Model Fit Thresholds''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
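As a quick illustration, the commonly cited Hu and Bentler (1999) cutoffs can be checked programmatically. The index values below are hypothetical, and the cutoffs should be treated as guidelines, not hard rules:

```python
def check_fit(cfi, rmsea, srmr):
    """Flag fit indices against the commonly cited Hu & Bentler (1999)
    cutoffs: CFI >= 0.95, RMSEA <= 0.06, SRMR <= 0.08."""
    return {
        "CFI":   cfi >= 0.95,
        "RMSEA": rmsea <= 0.06,
        "SRMR":  srmr <= 0.08,
    }

# Hypothetical output from a CFA run
result = check_fit(cfi=0.962, rmsea=0.051, srmr=0.043)   # all pass
```

Remember that acceptable values shift with sample size and model complexity, which is why the contextualized thresholds in Hair et al. (2010, Table 12-4) are often preferable.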
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms ('''however, some argue that there are never appropriate reasons to covary errors'''), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
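Scanning the residuals matrix for the |2.58| cutoff can be sketched in a few lines of Python (the matrix values and item names below are hypothetical):

```python
import numpy as np

def flag_significant_srcs(src_matrix, labels, cutoff=2.58):
    """Return item pairs whose standardized residual covariance
    exceeds the cutoff in absolute value (lower triangle only)."""
    src = np.asarray(src_matrix, dtype=float)
    flagged = []
    for i in range(src.shape[0]):
        for j in range(i):
            if abs(src[i, j]) > cutoff:
                flagged.append((labels[i], labels[j], src[i, j]))
    return flagged

# Hypothetical lower-triangular SRC matrix for three items
src = [[0.00, 0.00, 0.00],
       [1.10, 0.00, 0.00],
       [2.91, 0.42, 0.00]]
pairs = flag_significant_srcs(src, ["q1", "q2", "q3"])
# flags the q3/q1 pair (2.91 > 2.58)
```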
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e, the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor), than by its own observed variables.<br />
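For reference, CR and AVE can be computed directly from standardized factor loadings using the standard Fornell-and-Larcker-style formulas; the loadings below are hypothetical:

```python
def cr_and_ave(loadings):
    """Composite Reliability and Average Variance Extracted from
    standardized factor loadings:
      CR  = (sum L)^2 / ((sum L)^2 + sum(1 - L^2))
      AVE = mean(L^2)"""
    s = sum(loadings)
    err = sum(1 - l ** 2 for l in loadings)
    cr = s ** 2 / (s ** 2 + err)
    ave = sum(l ** 2 for l in loadings) / len(loadings)
    return cr, ave

# Hypothetical standardized loadings for one four-item factor
cr, ave = cr_and_ave([0.82, 0.78, 0.75, 0.80])
# CR ~ 0.87 (> 0.7) and AVE ~ 0.62 (> 0.5): reliability and convergent validity OK
```

For discriminant validity, the square root of each factor's AVE would then be compared against that factor's correlations with the other factors.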
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error" (Malhotra and Dash, 2011, p. 702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to bias in your dataset that is attributable to something external to the measures themselves; something external to the question may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias you can do a few different tests, each described below. For a step by step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues), then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
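A PCA-based approximation of this check can be run directly on a correlation matrix: if the first unrotated component explains more than about 50% of the total variance, CMB is a concern. The matrix below is hypothetical (two weakly related blocks of items):

```python
import numpy as np

def first_factor_variance(corr):
    """Share of total variance captured by the first unrotated component
    of a correlation matrix - a PCA approximation of Harman's test."""
    eigvals = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    return eigvals[-1] / eigvals.sum()

corr = np.array([[1.0, 0.6, 0.1, 0.1],
                 [0.6, 1.0, 0.1, 0.1],
                 [0.1, 0.1, 1.0, 0.6],
                 [0.1, 0.1, 0.6, 1.0]])
share = first_factor_variance(corr)
# 0.45: first component explains 45% (< 50%, so no single-factor dominance)
```

Note this uses principal components rather than the common-factor extraction in SPSS, so treat it as an approximation of the procedure described above.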
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (e.g., greater than 0.200) then you will want to retain the CLF as you either impute composites from factor scores, or as you move in to the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
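The loading-comparison step amounts to a simple table diff; the item names and loadings below are hypothetical:

```python
def clf_retention_check(loadings_without_clf, loadings_with_clf, cutoff=0.200):
    """Compare standardized regression weights with and without the CLF.
    Differences beyond the cutoff (~0.200) suggest retaining the CLF."""
    flagged = {}
    for item, base in loadings_without_clf.items():
        diff = base - loadings_with_clf[item]
        if abs(diff) > cutoff:
            flagged[item] = round(diff, 3)
    return flagged

# Hypothetical standardized loadings from the two CFA runs
without_clf = {"q1": 0.81, "q2": 0.78, "q3": 0.74}
with_clf    = {"q1": 0.55, "q2": 0.72, "q3": 0.70}
flagged = clf_retention_check(without_clf, with_clf)   # only q1 is flagged
```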
<br />
=== Marker Variable ===<br />
This method is simply an extended, and more accurate way to do the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**'''Option 1:''' Impute factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model.<br />
**'''Option 2:''' Disconnect the SB construct from all your observed variables, covary it with all your latent variables, and then impute factor scores. With this approach the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
**In either case, if you are also able to retain the CLF (i.e., it does not break your model), then you keep it while imputing. If you have only connected the CLF to the observed variables (and not the SB construct), then make sure to use the SB construct as a control variable in the causal model.<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance (using Name Parameters tool)''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/4_ZvpU8wu3Q?t=1h57m19s '''Measurement Model Invariance (using MGA Manager)''']<br />
<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and address those covariances appropriately for both groups. When deleting an item, it does it for both groups. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or one or more items load better on a factor other than their own for one or more groups. To address the first issue, look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if the invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group, rd3 and q5 have high standardized residual covariances with sw1. We could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5; if that does not fix things, return to this matrix after rerunning the analysis and look for any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (a two-item factor just tends to be unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and move on.<br />
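Step 2 above (eyeballing group differences in loadings) can be made a little more systematic. Here is a minimal sketch with hypothetical loading values and an arbitrary gap threshold of 0.30 for flagging items whose standardized loadings diverge across groups:

```python
# Hypothetical standardized regression weights copied from the AMOS
# output for each group (illustrative values only).
loadings_male   = {"item1": 0.78, "item2": 0.34, "item3": 0.81}
loadings_female = {"item1": 0.75, "item2": 0.88, "item3": 0.79}

def flag_divergent_items(group1, group2, gap=0.30):
    """Return items whose loadings differ across groups by more than `gap`."""
    return sorted(item for item in group1
                  if abs(group1[item] - group2[item]) > gap)

print(flag_divergent_items(loadings_male, loadings_female))  # ['item2']
```

Here item2 (0.34 vs. 0.88) is the candidate for removal, mirroring the example in step 2.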
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky, and if you don't get it right, the model won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches the iteration limit<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is simply not to constrain the CLF paths to be equal. Instead, perform a chi-square difference test between the unconstrained model (with the CLF, and a marker variable if available) and the same model with all paths from the CLF constrained to zero. This tells us whether the common variance is significantly different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and the EFA worked out well, but it still happens. In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001).<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a deeper measurement issue (such as skewness or kurtosis, too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these underlying issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. This issue usually accompanies a negative error variance, so we can often fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable and one of them is very dominant. First, try moving the latent variable path constraint to a different path. If this doesn't work, then move the constraint up to the latent variable variance (constraining it to 1) AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits, for example, confidence and self-efficacy. These two traits are too similar, so either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose one additional constraint (sometimes more). This is usually caused by drawing the model incorrectly. Check that every latent variable has a single path constrained to 1 (or its variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=References&diff=1635303References2018-08-27T16:53:36Z<p>Jgaskin: /* Method Bias, Response Bias, Specific Bias */</p>
<hr />
<div>'''Here are some helpful references for structural equation modeling (in no particular order - I just keep adding to the list as they come).''' <br />
<br />
'''To search for a specific term, in Windows hit CTRL+F, on a Mac hit COMMAND+F.''' <br />
<br />
==Constructs and Validity==<br />
*DeVellis, R. F. (2003). Scale development: Theory and applications (2nd ed.) (Applied Social Research Methods). Thousand Oaks, CA: Sage Publications.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2016). Recommendations for creating better concept definitions in the organizational, behavioral, and social sciences. Organizational Research Methods, 19(2), 159-203.<br />
*Churchill Jr, G. A. (1979). A paradigm for developing better measures of marketing constructs. Journal of marketing research, 64-73.<br />
*Yaniv, E. (2011). Construct clarity in theories of management and organization. Academy of Management Review, 36(3), 590-592.<br />
*Law, K. S., Wong, C. S., & Mobley, W. M. (1998). Toward a taxonomy of multidimensional constructs. Academy of management review, 23(4), 741-755.<br />
*Shaffer, J. A., DeGeest, D., & Li, A. (2016). Tackling the problem of construct proliferation: A guide to assessing the discriminant validity of conceptually related constructs. Organizational Research Methods, 19(1), 80-110.<br />
*Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806-838.<br />
*Krosnick, J. A. (1999). Survey research. Annual review of psychology, 50(1), 537-567.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293-334.<br />
*Bolton, R. N. (1993). Pretesting questionnaires: content analyses of respondents' concurrent verbal protocols. Marketing science, 12(3), 280-303.<br />
*Podsakoff, N. P., Podsakoff, P. M., MacKenzie, S. B., & Klinger, R. L. (2013). Are we really measuring what we say we're measuring? Using video techniques to supplement traditional construct validation procedures. Journal of Applied Psychology, 98(1), 99.<br />
*Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. S. (2002). The Q-sort method: assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal of Modern Applied Statistical Methods, 1(1), 15.<br />
*Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of consumer research, 30(2), 199-218.<br />
*MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323-326.<br />
*Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social indicators research, 46(2), 137-155.<br />
*Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. Structural equation modeling: Present and future, 195-216. (discusses MaxR(H))<br />
<br />
==Measurement Models==<br />
===Exploratory Factor Analysis===<br />
*Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological methods, 4(3), 272.<br />
*Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most From Your Analysis. Practical Assessment, Research & Evaluation,10(7), 1-9.<br />
*Reio Jr, T. G., & Shuck, B. (2015). Exploratory factor analysis: Implications for theory, research, and practice. Advances in Developing Human Resources, 17(1), 12-25.<br />
*Treiblmaier, H., & Filzmoser, P. (2010). Exploratory factor analysis revisited: How robust methods support the detection of hidden multivariate data structures in IS research. Information & management, 47(4), 197-207.<br />
*Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A users’ guide. International Journal of Selection and Assessment, 1(2), 84-94.<br />
<br />
===Confirmatory Factor Analysis===<br />
*Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational research methods, 3(1), 4-70.<br />
*Byrne, B. M. (2008). Testing for multigroup equivalence of a measuring instrument: A walk through the process. Psicothema, 20(4), 872-882.<br />
*Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling, 11(2), 272-300.<br />
*Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: Review of practice and implications. Human Resource Management Review, 18(4), 210-222.<br />
*Brown, T. A. (2014). Confirmatory factor analysis for applied research (2nd ed.). Guilford Publications.<br />
*Matsunaga, M. (2010). How to factor-analyze your data right: Do’s, don’ts, and how-to’s. International Journal of Psychological Research, 3(1), 97-110.<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
*Hermida, R. (2015). The problem of allowing correlated errors in structural equation modeling: Concerns and considerations. Computational Methods in Social Sciences, 3(1), 5.<br />
====Method Bias, Response Bias, Specific Bias====<br />
*Fuller, C. M., Simmering, M. J., Atinc, G., Atinc, Y., & Babin, B. J. (2016). Common methods variance detection in business research. Journal of Business Research, 69(8), 3192-3198. (suggests Harman's single factor test is useful under certain circumstances)<br />
*Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. Journal of applied psychology, 88(5), 879.<br />
*MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542-555.<br />
*Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual review of psychology, 63, 539-569. <br />
*Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods.<br />
*Doty, D. H., & Glick, W. H. (1998). Common methods bias: does common methods variance really bias results?. Organizational research methods, 1(4), 374-406.<br />
*Estabrook, R., & Neale, M. (2013). A comparison of factor score estimation methods in the presence of missing data: Reliability and an application to nicotine dependence. Multivariate Behavioral Research, 48(1), 1-27. <br />
*Arbuckle, J. L. (2006). Amos 7.0 user’s guide. Chicago, IL: SPSS. <br />
*Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology, 28, 97-104.<br />
*Lawley, D. N., & Maxwell, M. A. (1971). Factor analysis as a statistical method (2nd ed.). London, UK: Butterworths. <br />
*Horn, J. L., McArdle, J. J., & Mason, R. (1983). When invariance is not invariant: A practical scientist’s view of the ethereal concept of factorial invariance. The Southern Psychologist, 1, 179-188.<br />
*Muthén, L. K., & Muthén, B. O. (1998-2007). Mplus user’s guide (5th ed.). Los Angeles, CA: Muthén & Muthén.<br />
<br />
===Other===<br />
*Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological methods, 5(2), 155.<br />
*Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological assessment, 7(3), 286.<br />
*Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of marketing research, 186-192.<br />
*Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and social psychology bulletin, 28(12), 1629-1646.<br />
*Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of marketing research, 39-50.<br />
*Bagozzi, R. P. (2011). Measurement and meaning in information systems and organizational research: Methodological and philosophical foundations. Mis Quarterly, 261-292.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90(4), 710.<br />
*Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of business research, 61(12), 1203-1218.<br />
<br />
==Mediation, Moderation, and Moderated Mediation==<br />
===Mediation===<br />
*Mathieu, J. E., & Taylor, S. R. (2006). Clarifying conditions and decision points for mediational type inferences in organizational behavior. Journal of Organizational Behavior, 27(8), 1031-1056.<br />
*Mathieu, J. E., DeShon, R. P., & Bergh, D. D. (2008). Mediational inferences in organizational research: Then, now, and beyond. Organizational Research Methods, 11(2), 203-223.<br />
*MacKinnon, D. P., Coxe, S., & Baraldi, A. N. (2012). Guidelines for the investigation of mediating variables in business research. Journal of Business and Psychology, 27(1), 1-14.<br />
*MacKinnon, D. P., & Pirlott, A. G. (2015). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30-43.<br />
*Preacher, K. J. (2015). Advances in mediation analysis: A survey and synthesis of new developments. Annual Review of Psychology, 66, 825-852.<br />
*Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of consumer research, 37(2), 197-206.<br />
*Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.<br />
<br />
===Moderation and Multigroup===<br />
*Byrne, B. M., & Stewart, S. M. (2006). Teacher's corner: The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling, 13(2), 287-321.<br />
*Schumacker, R. E., & Marcoulides, G. A. (1998). Interaction and nonlinear effects in structural equation modeling. Lawrence Erlbaum Associates Publishers.<br />
*Li, F., Harmer, P., Duncan, T. E., Duncan, S. C., Acock, A., & Boles, S. (1998). Approaches to testing interaction effects using structural equation modeling methodology. Multivariate Behavioral Research, 33(1), 1-39.<br />
*Floh, A., & Treiblmaier, H. (2006). What keeps the e-banking customer loyal? A multigroup analysis of the moderating role of consumer characteristics on e-loyalty in the financial service industry.<br />
<br />
===Both or Other===<br />
*Aguinis, H., Edwards, J. R., & Bradley, K. J. (2016). Improving our understanding of moderation and mediation in strategic management research. Organizational Research Methods, 1094428115627498.<br />
*Sardeshmukh, S. R., & Vandenberg, R. J. (2016). Integrating Moderation and Mediation A Structural Equation Modeling Approach. Organizational Research Methods, 1094428115621609.<br />
*Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate behavioral research, 42(1), 185-227.<br />
*Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.<br />
<br />
==Partial Least Squares==<br />
*Becker, J. M., Klein, K., and Wetzels, M. (2012). Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models. Long Range Planning, 45(5), 359-394.<br />
*Becker, J.-M., Rai, A., Ringle, C. M., and Völckner, F. (2013). Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats. MIS Quarterly, 37 (3), 665-694.<br />
*Gefen, D., & Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: Tutorial and annotated example. Communications of the Association for Information systems, 16(1), 5.<br />
*Hair, J. F., C. M. Ringle, and M. Sarstedt (2011). PLS-SEM. Indeed a Silver Bullet, Journal of Marketing Theory & Practice, 19 (2), 139-151. <br />
*Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40 (3), 414-433. <br />
*Hair, J. F., M. Sarstedt, T. Pieper, and C. M. Ringle (2012). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340. <br />
*Hair, J. F., Ringle, C. M., & Sarstedt, M. (2013). Editorial-partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance.<br />
*Hair, J., Sarstedt, M., Hopkins, L., & G. Kuppelwieser, V. (2014). Partial least squares structural equation modeling (PLS-SEM) An emerging tool in business research. European Business Review, 26(2), 106-121.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling, Journal of the Academy of Marketing Science, 43 (1), 115–135.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2016). Testing Measurement Invariance of Composites Using Partial Least Squares, International Marketing Review, 33 (3), 405-431.<br />
*Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M., and Calantone, R.J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013). Organizational Research Methods, 17(2), 182-209. <br />
*Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. In New challenges to international marketing (pp. 277-319). Emerald Group Publishing Limited.<br />
*Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.<br />
*Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it. IEEE Transactions on Professional Communication, 57(2), 123-146.<br />
*McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17(2), 210-251.<br />
*Monge, C., Cruz, J., & López, F. (2014). Manufacturing and continuous improvement areas using partial least squares path modeling with multiple regression comparison. In Proceedings of CBU International Conference on Innovation, Technology Transfer and Education (2014), February (pp. 3-5).<br />
*Rigdon, E. E. (2014). Rethinking partial least squares path modeling: breaking chains and forging ahead. Long Range Planning, 47(3), 161-167.<br />
*Ringle, C. M., M. Sarstedt, and D. W. Straub (2012). A Critical look at the Use of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.<br />
*Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. In Measurement and research methods in international marketing (pp. 195-218). Emerald Group Publishing Limited.<br />
*Wong, K. K. K. (2013). Partial least squares structural equation modeling (PLS-SEM) techniques using SmartPLS. Marketing Bulletin, 24(1), 1-32.<br />
<br />
==General Topics==<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
*Urdan, T. C. 2011. Statistics in Plain English. Routledge.<br />
*Newbold, P., Carlson, W., and Thorne, B. 2012. Statistics for Business and Economics. Pearson.<br />
*Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage learning.<br />
*Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin, 103(3), 411.<br />
*Suits, D. B. (1957). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548-551.<br />
*Gefen, D., Rigdon, E. E., & Straub, D. (2011). Editor's comments: an update and extension to SEM guidelines for administrative and social science research. MIS Quarterly, iii-xiv.<br />
*Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.<br />
*Blunch, N. (2013). Introduction to structural equation modeling using IBM SPSS statistics and AMOS (2nd ed.). Los Angeles, CA: Sage.<br />
*Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford publications.<br />
*Argyrous, G. (2011). Statistics for research: with a guide to SPSS (3rd ed.). Thousand Oaks, CA: Sage Publications.<br />
*Byrne, B. M. (2009). Structural equation modeling with AMOS: basic concepts, applications, and programming (2nd ed.). Abingdon-on-Thames: Routledge.<br />
*Williams, L. J., Vandenberg, R. J., & Edwards, J. R. (2009). Structural equation modeling in management research: A guide for improved analysis. The Academy of Management Annals, 3 (1), 543-604.<br />
<br />
===Model Fit===<br />
*Kenny, D. A. (2012). Measuring Model Fit. http://davidakenny.net/cm/fit.htm<br />
*Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural equation modeling: a multidisciplinary journal, 6(1), 1-55.<br />
*Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.<br />
*Hooper, D., Coughlan, J., & Mullen, M. (2008) Structural Equation Modelling: Guidelines for Determining Model Fit. Journal of Business Research, 6(1), 53-60.<br />
<br />
==Miscellaneous==<br />
*Kolenikov, S., & Bollen, K. A. (2012). Testing negative error variances: Is a Heywood case a symptom of misspecification? Sociological Methods & Research, 41(1), 124-167.<br />
*Khalilzadeh, J., & Tasci, A. D. A. (2017). Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research. Tourism Management, 62, 89-96. http://www.sciencedirect.com/science/article/pii/S026151771730078X<br />
*Green, J. P., Tonidandel, S., & Cortina, J. M. (2016). Getting through the gate: Statistical and methodological issues raised in the reviewing process. Organizational Research Methods, 19(3), 402-432.<br />
*Malhotra, N. K. (2008). Marketing research: An applied orientation (5th ed.). Pearson Education India.<br />
*Gravetter, F. J., & Wallnau, L. B. (2016). Statistics for the behavioral sciences (2nd ed.). Los Angeles: SAGE Publications, Inc.<br />
*Blair, J., Czaja, R. F., & Blair, E. A. (2014). Designing surveys: A guide to decisions and procedures (3rd ed.). Sage Publications.<br />
*Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability.<br />
*Kenny, D. A. (2011). Respecification of Latent Variable Models. http://davidakenny.net/cm/respec.htm<br />
*Bagozzi, R. P., & Yi, Y. (2012). Specification, evaluation, and interpretation of structural equation models. Journal of the academy of marketing science, 40(1), 8-34.<br />
*Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270-301. (for Cook's distance)<br />
*Winklhofer, H. M., & Diamantopoulos, A. (2002). Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach. International Journal of Research in Marketing, 19(2), 151-166.<br />
*Thomas, D. M., & Watson, R. T. (2002). Q-sorting and MIS research: A primer. Communications of the Association for Information Systems, 8(1), 9.<br />
*Osborne, J. W. (2012). Power and Planning for Data Collection. In Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage Publications.<br />
*Steenkamp, J. B. E., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199-214.<br />
*Bacharach, S. B. (1989). Organizational theories: Some criteria for evaluation. Academy of management review, 14(4), 496-515.<br />
*Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274-289.<br />
*Dietz, W. H., & Gortmaker, S. L. (1985). Do we fatten our children at the television set? Obesity and television viewing in children and adolescents. Pediatrics, 75(5), 807-812.<br />
*Peterson, C., Park, N., & Seligman, M. E. (2005). Orientations to happiness and life satisfaction: The full life versus the empty life. Journal of happiness studies, 6(1), 25-41.<br />
*Sposito, V. A., Hand, M. L., & Skarpness, B. (1983). On the efficiency of using the sample kurtosis in selecting optimal lpestimators. Communications in Statistics-simulation and Computation, 12(3), 265-272.<br />
*McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of mathematical and statistical Psychology, 34(1), 100-117.<br />
*Trochim, W. M., & Donnelly, J. P. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH:Atomic Dog.<br />
*Gravetter, F., & Wallnau, L. (2014). Essentials of statistics for the behavioral sciences (8th ed.). Belmont, CA: Wadsworth.<br />
*Field, A. (2000). Discovering statistics using spss for windows. London-Thousand Oaks- New Delhi: Sage publications.<br />
*Field, A. (2009). Discovering statistics using SPSS. London: SAGE.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1579556Confirmatory Factor Analysis2018-08-09T13:57:39Z<p>Jgaskin: /* Measurement Model Invariance */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/wV6UudZSBCA '''Model Fit Thresholds''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
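If you want to sanity-check the indices AMOS reports, the common ones can be recomputed from the model and baseline (independence) model chi-square values. Below is a minimal Python sketch using the standard formulas for CFI, TLI, and RMSEA; the chi-square values in the example are hypothetical, not taken from any particular AMOS output.<br />

```python
from math import sqrt

def fit_indices(chisq_m, df_m, chisq_b, df_b, n):
    """CFI, TLI, and RMSEA from the model (m) and baseline (b) chi-squares, sample size n."""
    rmsea = sqrt(max(chisq_m - df_m, 0) / (df_m * (n - 1)))
    cfi = 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)
    tli = ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1)
    return cfi, tli, rmsea

# Hypothetical values: model chi-square 300 on 200 df, baseline 3000 on 231 df, N = 350
cfi, tli, rmsea = fit_indices(300.0, 200, 3000.0, 231, 350)
# CFI ~ 0.964, TLI ~ 0.958, RMSEA ~ 0.038 (all within the usual thresholds)
```

AMOS prints these directly; the formulas are only useful for double-checking output or computing an index it does not report.<br />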
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms ('''however, some argue that there are never appropriate reasons to covary errors'''), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
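If you export the standardized residual covariance matrix (e.g., to a spreadsheet), scanning it for significant entries is easy to automate. Here is a rough Python sketch using the 2.58 cutoff above; the matrix values and item names are hypothetical.<br />

```python
import numpy as np

def flag_residuals(src, labels, cutoff=2.58):
    """List item pairs whose standardized residual covariance exceeds the cutoff.

    src: square lower-triangular (or full) matrix of standardized residual covariances.
    """
    src = np.asarray(src, dtype=float)
    flagged = []
    for i in range(src.shape[0]):
        for j in range(i):  # lower triangle only; diagonal is ignored
            if abs(src[i, j]) > cutoff:
                flagged.append((labels[i], labels[j], src[i, j]))
    return flagged

# Hypothetical 3-item example: only the q1-q3 pair exceeds |2.58|
src = [[0.0, 0.0, 0.0],
       [1.2, 0.0, 0.0],
       [3.1, -0.4, 0.0]]
print(flag_residuals(src, ["q1", "q2", "q3"]))
```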
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
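Given the standardized factor loadings for one factor (from your CFA output), CR and AVE follow from the usual composite reliability formulas (error variance for a standardized item is 1 minus the squared loading). A minimal Python sketch with hypothetical loadings:<br />

```python
import numpy as np

def cr_ave(loadings):
    """Composite Reliability and Average Variance Extracted
    from standardized factor loadings for a single factor."""
    lam = np.asarray(loadings, dtype=float)
    err = 1.0 - lam**2                       # standardized error variances
    cr = lam.sum()**2 / (lam.sum()**2 + err.sum())
    ave = (lam**2).mean()
    return cr, ave

# Hypothetical loadings for a four-item factor
cr, ave = cr_ave([0.70, 0.80, 0.75, 0.72])   # CR ~ 0.83 (> 0.7), AVE ~ 0.55 (> 0.5)
```

For the discriminant validity check, compare the square root of each factor's AVE against that factor's correlations with the other constructs.<br />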
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011, p. 702) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error."<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that inflates or deflates responses. A study has significant common method bias if a single factor can explain the majority of the variance. To test for common method bias, you can run a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To run it, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
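The "majority of the variance" criterion can be approximated outside SPSS as the proportion of total variance carried by the first unrotated factor, i.e., the largest eigenvalue of the item correlation matrix divided by the number of items. A minimal Python sketch with simulated data (all variable names and values are illustrative):<br />

```python
import numpy as np

def single_factor_variance(data):
    """Share of total variance captured by the first unrotated factor,
    approximated by the largest eigenvalue of the item correlation matrix."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)       # returned in ascending order
    return eigvals[-1] / corr.shape[0]

# Simulated example: six items dominated by one common factor
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
items = common + 0.3 * rng.normal(size=(500, 6))
share = single_factor_variance(items)        # well above 0.5 -> CMB would be a concern
```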
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200), then you will want to retain the CLF, either as you impute composites from factor scores or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is an extension of the common latent factor method and yields a more accurate result. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
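The chi-square difference tests in steps 5 and 6 (and the invariance tests later on this page) reduce to the same calculation: the difference in chi-square between nested models is itself chi-square distributed, with df equal to the difference in df. A small Python sketch using SciPy, with hypothetical model values:<br />

```python
from scipy.stats import chi2

def chisq_difference_test(chisq_constrained, df_constrained,
                          chisq_unconstrained, df_unconstrained):
    """p-value for the chi-square difference between nested models."""
    d_chisq = chisq_constrained - chisq_unconstrained
    d_df = df_constrained - df_unconstrained
    return d_chisq, d_df, chi2.sf(d_chisq, d_df)  # sf = 1 - CDF

# Hypothetical: constrained chi-square 325.4 (df 210) vs unconstrained 300.0 (df 200)
d, ddf, p = chisq_difference_test(325.4, 210, 300.0, 200)
# A p-value below 0.05 rejects the null, i.e., the constraints significantly worsen fit
```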
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**There are two options. First, you can impute factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude the imputed SB construct from your causal model. Second, you can disconnect the SB construct from all your observed variables, covary it with all your latent variables, and then impute factor scores. With this latter approach, the SB is not parceled out, so you will need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables. If you are also able to retain the CLF (i.e., it does not break your model), then keep it while imputing. If you have only connected the CLF to the observed variables (and not the SB construct), then make sure to use the SB construct as a control variable in the causal model.<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance (using Name Parameters tool)''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/4_ZvpU8wu3Q?t=1h57m19s '''Measurement Model Invariance (using MGA Manager)''']<br />
<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups, otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (though two-item factors tend to be unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens. In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001).<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=MediaWiki:Sidebar&diff=1520513MediaWiki:Sidebar2018-07-23T17:54:44Z<p>Jgaskin: </p>
<hr />
<div><br />
* Navigation<br />
** mainpage|Home<br />
** http://gaskination.com/forum/|Forum<br />
** Data screening|Data Screening<br />
** Exploratory Factor Analysis|EFA<br />
** Confirmatory Factor Analysis|CFA<br />
** Structural Equation Modeling|Causal SEM<br />
** PLS|PLS<br />
** Plugins|Plugins Info<br />
** Guidelines|General Guidelines<br />
** Cluster Analysis|Cluster Analysis<br />
** References|References<br />
** Citing Claims|Citing Claims<br />
<br />
* Resources<br />
** http://www.kolobkreations.com/Stats%20Tools%20Package.xlsm|Excel StatTools<br />
** http://www.kolobkreations.com/Stats%20Tools%20Package%20OLD.xls|OLD StatTools<br />
** http://www.youtube.com/Gaskination|YouTube Demos<br />
** http://www.kolobkreations.com/StatsHelpArchive.pdf|Stats Help Archive<br />
** https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing|Plugins & Estimands</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Citing_Claims&diff=1520507Citing Claims2018-07-23T17:53:13Z<p>Jgaskin: </p>
<hr />
<div>I find that there are specific claims in SEM research that float around, but where they come from is often forgotten. So, I've made a list of these claims with some quotes and explanations below. Of course, I have also included a citation if the claim can be substantiated. If you have heard of a claim and know its source, feel free to email me and I'll determine if it should be added here. If you would like to cite this page in addition to the sources provided below, here is the recommended citation:<br />
*Gaskin, J. (2018) "Citing Claims", Gaskination's StatWiki, http://statwiki.kolobkreations.com/. <br />
<br />
<br />
== Four Indicators Per Factor ==<br />
===Claim===<br />
Have you heard the one about the "optimal number of indicators" per factor? I have heard it a few times, and I know I have read it in multiple places. I include one of those sources below.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 678: <br />
<br />
"In summary, when specifying the number of indicators per construct, the following is recommended: <br />
*Use four indicators whenever possible. <br />
*Having three indicators per construct is acceptable, particularly when other constructs have more than three.<br />
*Constructs with fewer than three indicators should be avoided."<br />
<br />
===Rationale===<br />
Joe's logic is that a minimum of three indicators are needed for identification, but four is a safer and more reliable configuration. More than four may result in a failure of unidimensionality (i.e., there may be multiple dimensions being captured). He also suggests four is the optimal number of indicators because it balances parsimony (simplest solution) with requisite reliability (all-else-equal: reliability increases as number of indicators increases). <br />
== Covarying Error Terms ==<br />
===Claim===<br />
Some claim you can covary error terms in a measurement model (CFA) under certain conditions in order to improve model fit. Others say you should always avoid it. In the past, I have taken both stances, with logic to support my decisions. However, as I've grown in understanding of SEM, I am more inclined to avoid covarying error terms if at all possible.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 675:<br />
<br />
"You also should not run CFA models that include covariances between error terms... Allowing these paths to be estimated (freeing them) will reduce the chi-square, but at the same time ''seriously question the construct validity of the construct''."<br />
===Rationale===<br />
Including a covariance arrow between errors implies that there is some relationship between the items of these variables that you are not accounting for properly in your model. Allowing their errors to covary essentially ignores the problem, much like putting a light bandage over a bullet wound without removing the bullet. It covers up the issue on the surface, but does nothing to address the underlying concerns. <br />
<br />
== More to come ==<br />
===Claim===<br />
===Source===<br />
===Rationale===</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Citing_Claims&diff=1520504Citing Claims2018-07-23T17:52:39Z<p>Jgaskin: </p>
<hr />
<div>I find that there are specific claims in SEM research that float around, but where they come from is often forgotten. So, I've made a list of these claims with some quotes and explanations below. Of course, I have also included a citation if the claim can be substantiated. If you have heard of a claim and know its source, feel free to email me and I'll determine if it should be added here. If you would like to cite this page in addition to the sources provided below, here is the recommended citation:<br />
*Gaskin, James (2018) "Citing Claims", StatWiki, http://statwiki.kolobkreations.com/. <br />
<br />
<br />
== Four Indicators Per Factor ==<br />
===Claim===<br />
Have you heard the one about the "optimal number of indicators" per factor? I have heard it a few times, and I know I have read it in multiple places. I include one of those sources below.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 678: <br />
<br />
"In summary, when specifying the number of indicators per construct, the following is recommended: <br />
*Use four indicators whenever possible. <br />
*Having three indicators per construct is acceptable, particularly when other constructs have more than three.<br />
*Constructs with fewer than three indicators should be avoided."<br />
<br />
===Rationale===<br />
Joe's logic is that a minimum of three indicators are needed for identification, but four is a safer and more reliable configuration. More than four may result in a failure of unidimensionality (i.e., there may be multiple dimensions being captured). He also suggests four is the optimal number of indicators because it balances parsimony (simplest solution) with requisite reliability (all-else-equal: reliability increases as number of indicators increases). <br />
== Covarying Error Terms ==<br />
===Claim===<br />
Some claim you can covary error terms in a measurement model (CFA) under certain conditions in order to improve model fit. Others say you should always avoid it. In the past, I have taken both stances, with logic to support my decisions. However, as I've grown in understanding of SEM, I am more inclined to avoid covarying error terms if at all possible.<br />
===Source===<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
<br />
On page 675:<br />
<br />
"You also should not run CFA models that include covariances between error terms... Allowing these paths to be estimated (freeing them) will reduce the chi-square, but at the same time ''seriously question the construct validity of the construct''."<br />
===Rationale===<br />
Including a covariance arrow between errors implies that there is some relationship between the items of these variables that you are not accounting for properly in your model. Allowing their errors to covary essentially ignores the problem, much like putting a light bandage over a bullet wound without removing the bullet. It covers up the issue on the surface, but does nothing to address the underlying concerns. <br />
<br />
== More to come ==<br />
===Claim===<br />
===Source===<br />
===Rationale===</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=PLS&diff=1508792PLS2018-07-19T22:24:59Z<p>Jgaskin: /* SmartPLS */</p>
<hr />
<div>Partial Least Squares is another method for testing causal models (in addition to the covariance-based methods used in AMOS and LISREL). This section on PLS is intended to demystify the process of conducting an analysis, start to finish, using PLS-graph. Trust me, it needs demystification! I am not going to get into the deep logic and math behind the methods I outline here. This wiki is simply intended to be used as a "How To" for PLS-graph. For more references and technical explanations, please refer to Wynne Chin's website: http://www.plsgraph.com/. I have also listed several videos for SmartPLS 2.0 at the bottom of this page, and an article about when to choose PLS and how to use it. I've also created an updated playlist on YouTube for [https://www.youtube.com/playlist?list=PLnMJlbz3sefKTL7KGy_JIYTSpFXizxW1X SmartPLS 3]<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
==Installing PLS-graph==<br />
Even just installing PLS-graph is a complex process. Instructions for getting the full version can be found here: http://disc-nt.cba.uh.edu/plsgraph/. ''Or'' you can click on this link: http://www.bauer.uh.edu/plsgraph/build1130.exe for the demo version. The demo version is rather limiting, and only allows you to test models with ten variables or fewer. In order to obtain a more useful license, you will have to contact Wynne Chin directly: wchin@uh.edu. Sadly, I am not allowed to distribute it freely, and requests directed toward me for the full license must be rejected.<br />
*'''''IMPORTANT ''''' Once you have PLS-graph installed (by clicking on that link and running the file), make sure you have a "license.dat" or "license" file in your plsgraph folder, typically found at: '''C:\Program Files\plsgraph''', but sometimes found at C:\Program Files '''(x86)'''\plsgraph if you are running a 64-bit machine.<br />
<br />
==Troubleshooting==<br />
===Opening PLS-graph===<br />
If you get one of the following errors, then you either don't have a valid license from Wynne Chin, or you have not placed the license in the proper directory. See the installation section above for more details.<br />
<br />
[[File:error1.png]][[File:error2.png]]<br />
<br />
===Linking Data===<br />
To "open" or "link" a dataset in PLS-graph, you need to click on ''File --> Links'' '''NOT''' ''File --> Open''. From File --> Links, then browse for your dataset. <br />
*'''Your data must be in .raw format''' or else it will not show up in the browse window.<br />
<br />
To get your data in .raw format, you need to follow the guidelines in this quick tutorial: [http://www.kolobkreations.com/Manually%20creating%20a%20raw%20data%20set%20for%20PLS.pdf Creating .raw files]<br />
<br />
Basically, you need to:<br />
*Save your dataset as "tab delimited" (.dat) from SPSS or Excel, or whatever program you are using to view your data.<br />
*Then change the file extension from .dat to .raw (say yes if an error pops up)<br />
<br />
If you get the following error when linking data, then there is a problem in the dataset:<br />
<br />
[[File:error3.png]]<br />
<br />
The problem is most likely one of the following:<br />
*[1]You have blank or missing values that have not been recoded.<br />
*[2]You have non-numeric values (other than variable names in the first row)<br />
*[3]You have excessively large numbers (e.g., 0.978687677664826355281)<br />
*[4]You have scientific notation (e.g., 3.23E-08 instead of 0.0000000323)<br />
<br />
Fixes for these issues:<br />
*[1]Replace all missing values in your dataset with a constant that is otherwise unused in the dataset (something like -1). You can do this in Excel or SPSS by doing a quick ''Find and Replace'' (Control+H). Or, you can impute those missing values (if appropriate) in SPSS using the ''Replace missing values'' function in the ''Transform'' menu.<br />
*[2]If you have non-numeric data, you need to convert it to numbers (if appropriate). For example, if you have values like "Low" "Medium" "High", instead you need to use something like "1" "2" "3", where 1=Low, etc. This can also be done with a find and replace. You may also need to simply remove some columns from your dataset because they cannot be used in PLS. For example, if you have email addresses or usernames in your dataset, those can simply be removed because they cannot be meaningfully converted into numeric data.<br />
*[3]If your numbers are too large in Excel, then simply decrease the number of decimals using the [[File:decimal.png]] button. If you are using SPSS then you need to do some fancy copy and paste work. Copy the offending columns into Excel, reduce the number of decimals, then copy and paste the new values into those same columns in SPSS. <br />
*[4]If you have scientific notation, this is probably because you were using Excel at some point and the numbers were either formatted explicitly as "Scientific" or were inferred to be Scientific but formatted by default as "General". To fix this, simply change the formatting to "Number". See the picture below for how to access these formats in Excel 2010.<br />
<br />
[[File:numformat.png]]<br />
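If you are scripting the export anyway, fixes [3] and [4] can be handled in one pass by writing every numeric cell in fixed-point notation. A hedged sketch (the ten-decimal default is an arbitrary choice):<br />

```python
def format_cell(value, decimals=10):
    """Render numbers in plain fixed-point notation so the file never contains
    scientific notation or excessively long decimals."""
    try:
        return f"{float(value):.{decimals}f}"
    except ValueError:
        return value  # leave variable names in the header row alone

print(format_cell("3.23E-08"))   # scientific notation becomes fixed-point
print(format_cell("item1"))      # non-numeric headers pass through unchanged
```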
<br />
===Crashes===<br />
PLS-graph tends to crash quite frequently if you are testing a complex model (over 30 variables) and/or have a large sample size (over 500). To fix this, you need to manually increase the amount of memory allocated for running the PLS algorithms. Go to ''Options --> Memory'' and then add a couple zeros to each row. In the picture below, I've added two zeros to each row.<br />
<br />
[[File:memory1.png]][[File:memory2.png]]<br />
<br />
You may also want to just wait for a few seconds after the program runs before hitting the ''Okay'' button. This will give it time to settle, and will result in fewer crashes. <br />
*Above all '''SAVE OFTEN!'''<br />
<br />
==Sample Size Rule==<br />
PLS has a great advantage over covariance-based methods (CBM): it requires fewer data points to accurately estimate loadings. The rule for CBM is 10 times the number of parameters or variables in the model, so if you have 20 variables, then you need 200 usable rows in your dataset. In PLS, the rule is much looser. '''In PLS you need 10 times the number of indicators for the most predicted construct.''' So for example, if you have a latent construct that is predicted by 6 indicators, another predicted by 3, and another predicted by 4, then you would only need 6 times 10, or 60 usable rows. If a construct is also being predicted in a causal model by other latent constructs, then those need to be considered as well. So for example, in the model below, the required sample size would be 90: 70 for the measured indicators and 20 for the latent predictors.<br />
<br />
[[File:samplesize.png]]<br />
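This rule is easy to automate: count the arrows pointing at each construct (indicators plus incoming structural paths) and take ten times the maximum. A small sketch with hypothetical construct names:<br />

```python
def pls_min_sample(incoming_arrows):
    """10-times rule: 10 x the largest number of arrows pointing at any one construct."""
    return 10 * max(incoming_arrows.values())

# In the example model, the most-predicted construct receives 7 indicators
# plus 2 structural paths from other latent constructs = 9 incoming arrows.
arrows = {"A": 6, "B": 3, "C": 7 + 2}
print(pls_min_sample(arrows))  # 90
```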
<br />
==Factor Analysis== <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=fDQbWD51Oec '''Factor Analysis in PLS-graph''']<br />
<br />
By ''factor analysis'' I mean the measurement and estimation of latent constructs, excluding any causal relationships between latent constructs. In PLS, latent constructs can be estimated formatively or reflectively, whereas in CBM all constructs are measured reflectively. The difference between these two types of models is important and should not be disregarded. For more information on reflective versus formative measures and models, please refer to the section on [[Exploratory Factor Analysis#Formative vs. Reflective|Formative vs. Reflective]] models. As for how to conduct a factor analysis in PLS-graph, it is much simpler just to show you, so please see the video above.<br />
<br />
==Testing Causal Models==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=CCGfg3cnMGY '''Testing Causal Models''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=G4r4vL47Tm4 '''Increasing your t-statistic''']<br />
The first video listed above demonstrates the entire process of testing a causal model. The second video explains how to fix your bootstrapping options so that you can increase your t-statistic (and as a result, decrease your p-value).<br />
<br />
The basic steps for testing causal models are as follows:<br />
*After doing a factor analysis in PLS-graph, connect the constructs using the connector tool [[File:connector.png]]<br />
*Change the inner weightings to ''Path'' (instead of ''Factor'') - this is in the ''Options --> Run'' menu.<br />
*Run the model<br />
*Trim weak paths in the model<br />
*Run a bootstrap in order to obtain t-statistics, composite reliabilities, and AVEs.<br />
*Trim indicators based on the t-statistics<br />
*Compute p-values from t-statistic (using Excel's function: ''=T.DIST.2T(x,deg_freedom)'')<br />
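The same two-tailed p-value can be computed outside Excel. A sketch using SciPy (the t-statistic and degrees of freedom below are placeholders):<br />

```python
from scipy import stats

def two_tailed_p(t_stat, df):
    """Equivalent of Excel's =T.DIST.2T(x, deg_freedom): two-tailed p-value
    from a t-statistic and its degrees of freedom."""
    return 2 * stats.t.sf(abs(t_stat), df)

print(round(two_tailed_p(2.13, 150), 4))
```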
<br />
The basic steps for increasing your t-statistic are as follows:<br />
*Go to ''Options --> Resampling''<br />
*Change the ''Number of Samples'' to be a number greater than your sample size<br />
*Change the ''Cases per Sample'' to be a number that is a majority of your sample size. Or, just put a zero there and the bootstrap will use the entire sample size, but include replacements (or estimated values) for removed cases. The latter option here (using zero) will usually give you the highest t-statistic. <br />
*Run the bootstrap now as usual.<br />
===Effect Strength (f-squared)===<br />
The t-statistic produced in PLS-graph and used to calculate p-values is easily inflated when using a large sample size (greater than about 300). So you can run a model with a path coefficient of 0.048 and the t-statistic will still be significant. But a path coefficient of 0.048 is not practically significant, only statistically significant. In cases like these, the best thing to do is to calculate an f-squared to demonstrate the actual strength of the effect. The f-squared relies on the change in the r-squared, rather than on the size or significance of the path coefficient. The f-squared is calculated as follows: <br />
<br />
[[File:f2.png]]<br />
<br />
I have made a quick tool for calculating this in Excel. It is in the ''EffectSize'' tab of the Stats Tools Package. Looks like this:<br />
<br />
[[File:fsquared.png]]<br />
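The same calculation is trivial to script. A sketch of the f-squared formula shown above (the R-squared values are hypothetical):<br />

```python
def f_squared(r2_included, r2_excluded):
    """f2 = (R2_included - R2_excluded) / (1 - R2_included)."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)

# e.g., R-squared rises from 0.30 to 0.36 when the predictor is included:
print(round(f_squared(0.36, 0.30), 3))  # 0.094
```

By Cohen's conventions, f-squared values of roughly 0.02, 0.15, and 0.35 indicate small, medium, and large effects.<br />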
<br />
==Testing Group Differences==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=ksTmE__qzyg '''Testing Group Differences''']<br />
Testing for differences between groups for a given causal model is a big pain in PLS... just as it is in CBM software like AMOS. Hopefully the video tutorial I've made will demystify the process. You will also need the Stats Tools Package Excel workbook referenced on the StatWiki homepage. The basic steps are as follows:<br />
*Use case selection to run the model for one group at a time. This can be found in the ''Options --> Run'' menu. For example, you might use gender as the grouping variable. In your dataset, gender should be indicated by a 1 or 2, where 1 = male and 2 = female. Then in PLS-graph you can select gender as the selection variable and specify a value (such as 1) to test for just one gender at a time. (see the picture below)<br />
*Then obtain the regression weight from running the model<br />
*'''To obtain the standard errors, you need to run a bootstrap using a ''separate'' dataset for each group.''' Bootstrapping in PLS-graph does not take into account a specified case selection (as in the picture below).<br />
*Plug these values, along with the sample size for each group, into the Stats Tools Package ''X2 Threshold'' tab.<br />
*This will calculate a t-statistic and p-value for you. '''''The larger the sample sizes, the stronger the p-value'''''<br />
<br />
[[File:grouping.png]]<br />
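As a sketch of what the spreadsheet does, assuming the common pooled-standard-error formula for PLS multigroup comparisons (the Stats Tools Package may differ in detail), the calculation looks roughly like this; all input values here are hypothetical:<br />

```python
import math
from scipy import stats

def group_diff(b1, se1, n1, b2, se2, n2):
    """t-test for a path coefficient across two groups (df = n1 + n2 - 2),
    pooling the bootstrap standard errors weighted by group size.
    NOTE: this is a sketch of one common formula, not necessarily the
    exact formula in the Stats Tools Package."""
    df = n1 + n2 - 2
    pooled_se = math.sqrt(((n1 - 1) ** 2 * se1 ** 2 + (n2 - 1) ** 2 * se2 ** 2) / df)
    t = (b1 - b2) / (pooled_se * math.sqrt(1 / n1 + 1 / n2))
    p = 2 * stats.t.sf(abs(t), df)
    return t, p

# e.g., males: beta = 0.50 (SE 0.10, n = 100); females: beta = 0.20 (SE 0.10, n = 100)
t, p = group_diff(0.50, 0.10, 100, 0.20, 0.10, 100)
print(round(t, 2), round(p, 3))
```

Notice how the group sample sizes enter the denominator, which is why larger samples strengthen the p-value.<br />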
<br />
==Handling Missing Data==<br />
PLS-graph cannot handle missing values that are left blank. If there are blank portions of your dataset, the data simply will not load. To avoid having to remove or impute all these missing values, you can just recode them using some constant number that is never used elsewhere in the dataset. For example, if your data comes from surveys that used 7-point Likert scales, then you could use the number 8 as the proxy for missing values, or the number -1, or 1,111,111,001, or whatever you wanted, as long as it wasn't a number 1 through 7. Common practice is to use -1. In SPSS or Excel, just hit Control+H and replace all blanks with a -1. '''WARNING''' in SPSS, this will only replace missing values within a specified column, whereas in Excel it will replace missing values for the entire dataset. In SPSS it looks like this, with the ''Find'' value blank, and the ''Replace'' value set to -1: <br />
<br />
[[File:replace.png]]<br />
<br />
Then, in PLS-graph, you need to specify the value for missing data. This is done in the ''Options --> Run'' menu.<br />
<br />
==Reliability and Validity==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=KyFz4rLISbE '''Reliability and Validity''']<br />
So how do you test for reliability and validity in PLS-graph? And what do you do with formative measures? There are different schools of thought, and different approaches. In the video above, I will show you one of these that is usually acceptable (depending on reviewers). The basic guidelines are as follows:<br />
*Reliability: This is demonstrated by Composite Reliability greater than 0.700. <br />
*Convergent Validity: This is demonstrated by loadings greater than 0.700, AVE greater than 0.500, and Communalities greater than 0.500<br />
*Discriminant validity: This is demonstrated by the square root of the AVE being greater than any of the inter-construct correlations. <br />
*Formative Measures: Like I said, different schools of thought. Some say that Reliability and Convergent validity are actually flawed metrics when evaluating formative measures because formative measures do not necessarily have highly correlated indicators. However, the formative measure should have some common theme. Thus, I argue that for formative measures, high loadings and communalities should still be present in order to have a strong construct. Nevertheless, if you don't achieve the recommended thresholds, you can probably argue your case. <br />
<br />
In the end, you want a table that looks something like this:<br />
<br />
[[File:reliability.png]]<br />
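These reliability and convergent validity numbers can also be computed directly from the standardized loadings, assuming the standard composite reliability and AVE formulas (the loadings below are hypothetical):<br />

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of (1 - loading^2))."""
    s = sum(loadings) ** 2
    err = sum(1 - l ** 2 for l in loadings)
    return s / (s + err)

def ave(loadings):
    """Average variance extracted: the mean squared standardized loading."""
    return sum(l ** 2 for l in loadings) / len(loadings)

lam = [0.82, 0.79, 0.85, 0.74]  # standardized loadings for one construct
print(round(composite_reliability(lam), 3), round(ave(lam), 3))
```

For discriminant validity, compare the square root of each construct's AVE against that construct's correlations with the other constructs, as described above.<br />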
<br />
==Common Method Bias==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=pUKT-QvQYhM '''Testing Common Method Bias''']<br />
There are several different methods for testing whether the use of a common method introduced a bias into your data. My preferred method, which is probably the most accurate but also the most stringent, is to use a marker variable to draw out the common variance shared with theoretically unrelated constructs; such shared variance would point to systematic variance explained by an external factor (such as a common method of data collection). To employ a marker variable in PLS-graph, you need to create a latent construct that is theoretically dissimilar to the other constructs in the model. For example, if I am doing a factor analysis with the variables ''Satisfaction, Burnout, Rejection, and Ethical Concerns'', I can choose a marker variable like ''Apathy'' and then look at the correlations between the other constructs and this construct. The correlations should be low (e.g., less than 0.300). Squaring the highest correlation between the marker and another construct will give you the maximum percentage of shared variance. Additionally, you can look at the correlations among the other factors: none of those correlations should be greater than 0.700 (for discriminant validity), and definitely none greater than 0.900 for common method bias.<br />
<br />
So, given the correlation matrix below (from the .lst output), we can say that the maximum shared variance with the Marker variable is less than 1% (.075 squared), and none of the other correlations begin to approach the 0.900 threshold. Thus, there is no evidence that a common method bias exists.<br />
<br />
[[File:CMB.png]]<br />
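The arithmetic here is just squaring the largest marker correlation. A tiny sketch (the correlations are hypothetical):<br />

```python
def max_shared_variance(marker_correlations):
    """Square the largest absolute correlation with the marker variable
    to get the maximum proportion of shared (method) variance."""
    return max(abs(r) for r in marker_correlations) ** 2

# e.g., the marker's correlations with the substantive constructs:
print(round(max_shared_variance([0.075, -0.041, 0.022]), 6))  # 0.005625, i.e., < 1%
```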
<br />
==Interaction==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mIuAvtJn5KE '''Interactions in PLS''']<br />
To perform an interaction in PLS-graph, you need to create an ''Interaction Construct'' that is composed of the '''products''' of the indicators for the IV and the moderating variable. The picture below on the left is the conceptual model we are testing. The picture below on the right is the way we measure it in PLS-graph.<br />
<br />
[[File:InteractionPLS.png]]<br />
<br />
*Standardizing variables before multiplying them for interactions is no longer considered necessary, as the assumed benefit of reducing multicollinearity has been debunked in several recent articles. <br />
<br />
To test the significance of the effect, just do a bootstrap like you would for any other effect, then calculate the p-value from the t-statistic as discussed in the [[PLS#Testing Causal Models|Testing Causal Models]] section.<br />
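Building the interaction construct's indicators is just elementwise multiplication. A sketch (item names and scores are hypothetical) that forms all pairwise products of the IV items and the moderator items for one respondent:<br />

```python
from itertools import product

def product_indicators(iv_items, mod_items):
    """Indicators for the interaction construct: every IV item multiplied
    by every moderator item."""
    return {f"{a}x{b}": iv_items[a] * mod_items[b]
            for a, b in product(iv_items, mod_items)}

iv = {"iv1": 3.0, "iv2": 4.0}   # one respondent's IV item scores
mod = {"m1": 2.0, "m2": 5.0}    # the same respondent's moderator item scores
print(product_indicators(iv, mod))
```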
==SmartPLS==<br />
Updated SmartPLS 3 playlist: <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://www.youtube.com/playlist?list=PLnMJlbz3sefKTL7KGy_JIYTSpFXizxW1X '''SmartPLS 3 Playlist''']<br />
Here are video demonstrations using SmartPLS for most of the analyses above:<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n42EQcqqQ-U '''Getting Started''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=6G9MfgImWCw '''Basic Path Analysis''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=7bqcG0GcgQ8 '''Factor Analysis''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=upEf1brVvXQ '''Moderation - Interaction''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=fvk39T0p2iw '''Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=kPeUTKjMF7o '''Formative 2nd order Constructs''']<br />
<br />
And here is a pretty good set of slides made by Joseph Hair (as in Hair et al. 2010) about SEM and PLS, and he uses SmartPLS:<br />
*[[File:books.jpg]] '''''LESSON:''''' [https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCIQFjAA&url=https%3A%2F%2Fnoppa.lut.fi%2Fnoppa%2Fopintojakso%2Fab40aj200%2Fluennot%2Fslides.ppt&ei=p999UJOjKer3iwK414HwDw&usg=AFQjCNH1Z_KqNpwC-JIY_-H2hzWwxGgGtQ&sig2=WYn5uKs8oEm1IcLfSH1vJA SEM and PLS]<br />
<br />
To cite any of the YouTube videos, refer to our ''[http://www.kolobkreations.com/PLSIEEETPC2014.pdf IEEE TPC PLS article]'': <br />
*Paul Benjamin Lowry and James Gaskin (2014). “Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it,” '''''IEEE Transactions on Professional Communication''''' (accepted 04-Mar-2014).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=References&diff=1457778References2018-07-03T17:38:09Z<p>Jgaskin: /* General Topics */</p>
<hr />
<div>'''Here are some helpful references for structural equation modeling (in no particular order - I just keep adding to the list as they come).''' <br />
<br />
'''To search for a specific term, in Windows hit CTRL+F, on a Mac hit COMMAND+F.''' <br />
<br />
==Constructs and Validity==<br />
*Devellis, R. F. (2003). Scale Development: Theory and Applications Second Edition (Applied Social Research Methods).<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2016). Recommendations for creating better concept definitions in the organizational, behavioral, and social sciences. Organizational Research Methods, 19(2), 159-203.<br />
*Churchill Jr, G. A. (1979). A paradigm for developing better measures of marketing constructs. Journal of Marketing Research, 16(1), 64-73.<br />
*Yaniv, E. (2011). Construct clarity in theories of management and organization. Academy of Management Review, 36(3), 590-592.<br />
*Law, K. S., Wong, C. S., & Mobley, W. M. (1998). Toward a taxonomy of multidimensional constructs. Academy of management review, 23(4), 741-755.<br />
*Shaffer, J. A., DeGeest, D., & Li, A. (2016). Tackling the problem of construct proliferation: A guide to assessing the discriminant validity of conceptually related constructs. Organizational Research Methods, 19(1), 80-110.<br />
*Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806-838.<br />
*Krosnick, J. A. (1999). Survey research. Annual review of psychology, 50(1), 537-567.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293-334.<br />
*Bolton, R. N. (1993). Pretesting questionnaires: content analyses of respondents' concurrent verbal protocols. Marketing science, 12(3), 280-303.<br />
*Podsakoff, N. P., Podsakoff, P. M., MacKenzie, S. B., & Klinger, R. L. (2013). Are we really measuring what we say we're measuring? Using video techniques to supplement traditional construct validation procedures. Journal of Applied Psychology, 98(1), 99.<br />
*Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. S. (2002). The Q-sort method: assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal of Modern Applied Statistical Methods, 1(1), 15.<br />
*Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of consumer research, 30(2), 199-218.<br />
*MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323-326.<br />
*Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social indicators research, 46(2), 137-155.<br />
*Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. Structural equation modeling: Present and future, 195-216. (discusses MaxR(H))<br />
<br />
==Measurement Models==<br />
===Exploratory Factor Analysis===<br />
*Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological methods, 4(3), 272.<br />
*Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most From Your Analysis. Practical Assessment, Research & Evaluation,10(7), 1-9.<br />
*Reio Jr, T. G., & Shuck, B. (2015). Exploratory factor analysis: Implications for theory, research, and practice. Advances in Developing Human Resources, 17(1), 12-25.<br />
*Treiblmaier, H., & Filzmoser, P. (2010). Exploratory factor analysis revisited: How robust methods support the detection of hidden multivariate data structures in IS research. Information & management, 47(4), 197-207.<br />
*Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A users’ guide. International Journal of Selection and Assessment, 1(2), 84-94.<br />
<br />
===Confirmatory Factor Analysis===<br />
*Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational research methods, 3(1), 4-70.<br />
*Byrne, B. M. (2008). Testing for multigroup equivalence of a measuring instrument: A walk through the process. Psicothema, 20(4), 872-882.<br />
*Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling, 11(2), 272-300.<br />
*Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: Review of practice and implications. Human Resource Management Review, 18(4), 210-222.<br />
*Brown, T. A. (2014). Confirmatory factor analysis for applied research (2nd ed.). Guilford Publications.<br />
*Matsunaga, M. (2010). How to factor-analyze your data right: Do's, don'ts, and how-to's. International Journal of Psychological Research, 3(1), 97-110.<br />
*Malhotra, N. K., & Dash, S. (2011). Marketing Research: An Applied Orientation. London: Pearson Publishing.<br />
*Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
====Method Bias, Response Bias, Specific Bias====<br />
*Fuller et al. (2016). Common methods variance detection in business research. Journal of Business Research, 69(8), 3192-3198. (suggests Harman's single factor test is useful under certain circumstances)<br />
*Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. Journal of applied psychology, 88(5), 879.<br />
*MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542-555.<br />
*Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual review of psychology, 63, 539-569. <br />
*Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods.<br />
*Doty, D. H., & Glick, W. H. (1998). Common methods bias: does common methods variance really bias results?. Organizational research methods, 1(4), 374-406.<br />
*Estabrook, Ryne, and Michael Neale. “A Comparison of Factor Score Estimation Methods in the Presence of Missing Data: Reliability and an Application to Nicotine Dependence.” Multivariate behavioral research 48.1 (2013): 1–27. PMC. Web. 1 Nov. 2017. <br />
*Arbuckle JL. Amos 7.0 user’s guide. Chicago, IL: SPSS; 2006. <br />
*Bartlett MS. The statistical conception of mental factors. British Journal of Psychology. 1937;28:97–104.<br />
*Lawley DN, Maxwell AE. Factor analysis as a statistical method (2nd ed.). London, UK: Butterworths; 1971. <br />
*Horn JL, McArdle JJ, Mason R. When invariance is not invariant: A practical scientist’s view of the ethereal concept of factorial invariance. The Southern Psychologist. 1983;1:179–188.<br />
*Muthén L, Muthén B. Mplus user’s guide. 5. Los Angeles, CA: Author; 1998–2007.<br />
<br />
===Other===<br />
*Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological methods, 5(2), 155.<br />
*Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological assessment, 7(3), 286.<br />
*Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of marketing research, 186-192.<br />
*Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and social psychology bulletin, 28(12), 1629-1646.<br />
*Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of marketing research, 39-50.<br />
*Bagozzi, R. P. (2011). Measurement and meaning in information systems and organizational research: Methodological and philosophical foundations. Mis Quarterly, 261-292.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90(4), 710.<br />
*Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of business research, 61(12), 1203-1218.<br />
<br />
==Mediation, Moderation, and Moderated Mediation==<br />
===Mediation===<br />
*Mathieu, J. E., & Taylor, S. R. (2006). Clarifying conditions and decision points for mediational type inferences in organizational behavior. Journal of Organizational Behavior, 27(8), 1031-1056.<br />
*Mathieu, J. E., DeShon, R. P., & Bergh, D. D. (2008). Mediational inferences in organizational research: Then, now, and beyond. Organizational Research Methods, 11(2), 203-223.<br />
*MacKinnon, D. P., Coxe, S., & Baraldi, A. N. (2012). Guidelines for the investigation of mediating variables in business research. Journal of Business and Psychology, 27(1), 1-14.<br />
*MacKinnon, D. P., & Pirlott, A. G. (2015). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30-43.<br />
*Preacher, K. J. (2015). Advances in mediation analysis: A survey and synthesis of new developments. Annual Review of Psychology, 66, 825-852.<br />
*Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of consumer research, 37(2), 197-206.<br />
*Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.<br />
<br />
===Moderation and Multigroup===<br />
*Byrne, B. M., & Stewart, S. M. (2006). Teacher's corner: The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling, 13(2), 287-321.<br />
*Schumacker, R. E., & Marcoulides, G. A. (1998). Interaction and nonlinear effects in structural equation modeling. Lawrence Erlbaum Associates Publishers.<br />
*Li, F., Harmer, P., Duncan, T. E., Duncan, S. C., Acock, A., & Boles, S. (1998). Approaches to testing interaction effects using structural equation modeling methodology. Multivariate Behavioral Research, 33(1), 1-39.<br />
*Floh, A., & Treiblmaier, H. (2006). What keeps the e-banking customer loyal? A multigroup analysis of the moderating role of consumer characteristics on e-loyalty in the financial service industry.<br />
<br />
===Both or Other===<br />
*Aguinis, H., Edwards, J. R., & Bradley, K. J. (2016). Improving our understanding of moderation and mediation in strategic management research. Organizational Research Methods, 1094428115627498.<br />
*Sardeshmukh, S. R., & Vandenberg, R. J. (2016). Integrating Moderation and Mediation A Structural Equation Modeling Approach. Organizational Research Methods, 1094428115621609.<br />
*Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate behavioral research, 42(1), 185-227.<br />
*Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.<br />
<br />
==Partial Least Squares==<br />
*Becker, J. M., Klein, K., and Wetzels, M. (2012). Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models. Long Range Planning, 45(5), 359-394.<br />
*Becker, J.-M., Rai, A., Ringle, C. M., and Völckner, F. (2013). Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats. MIS Quarterly, 37 (3), 665-694.<br />
*Gefen, D., & Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: Tutorial and annotated example. Communications of the Association for Information systems, 16(1), 5.<br />
*Hair, J. F., C. M. Ringle, and M. Sarstedt (2011). PLS-SEM. Indeed a Silver Bullet, Journal of Marketing Theory & Practice, 19 (2), 139-151. <br />
*Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40 (3), 414-433. <br />
*Hair, J. F., M. Sarstedt, T. Pieper, and C. M. Ringle (2012). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340. <br />
*Hair, J. F., Ringle, C. M., & Sarstedt, M. (2013). Editorial-partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance.<br />
*Hair, J., Sarstedt, M., Hopkins, L., & G. Kuppelwieser, V. (2014). Partial least squares structural equation modeling (PLS-SEM) An emerging tool in business research. European Business Review, 26(2), 106-121.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling, Journal of the Academy of Marketing Science, 43 (1), 115–135.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2016). Testing Measurement Invariance of Composites Using Partial Least Squares, International Marketing Review, 33 (3), 405-431.<br />
*Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M., and Calantone, R.J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013). Organizational Research Methods, 17(2), 182-209. <br />
*Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. In New challenges to international marketing (pp. 277-319). Emerald Group Publishing Limited.<br />
*Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.<br />
*Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it. IEEE Transactions on Professional Communication, 57(2), 123-146.<br />
*McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17(2), 210-251.<br />
*Monge, C., Cruz, J., & López, F. (2014). Manufacturing and continuous improvement areas using partial least squares path modeling with multiple regression comparison. In Proceedings of CBU International Conference on Innovation, Technology Transfer and Education (2014), February (pp. 3-5).<br />
*Rigdon, E. E. (2014). Rethinking partial least squares path modeling: breaking chains and forging ahead. Long Range Planning, 47(3), 161-167.<br />
*Ringle, C. M., M. Sarstedt, and D. W. Straub (2012). A Critical look at the Use of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.<br />
*Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. In Measurement and research methods in international marketing (pp. 195-218). Emerald Group Publishing Limited.<br />
*Wong, K. K. K. (2013). Partial least squares structural equation modeling (PLS-SEM) techniques using SmartPLS. Marketing Bulletin, 24(1), 1-32.<br />
<br />
==General Topics==<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
*Urdan, T. C. 2011. Statistics in Plain English. Routledge.<br />
*Newbold, P., Carlson, W., and Thorne, B. 2012. Statistics for Business and Economics. Pearson.<br />
*Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage learning.<br />
*Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin, 103(3), 411.<br />
*Suits, D. B. (1957). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548-551.<br />
*Gefen, D., Rigdon, E. E., & Straub, D. (2011). Editor's comments: an update and extension to SEM guidelines for administrative and social science research. MIS Quarterly, iii-xiv.<br />
*Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.<br />
*Blunch, N. (2013). Introduction to structural equation modeling using IBM SPSS statistics and AMOS (2nd ed.). Los Angeles, CA: Sage.<br />
*Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford publications.<br />
*Argyrous, G. (2011). Statistics for research: with a guide to SPSS (3rd ed.). Thousand Oaks, CA: Sage Publications.<br />
*Byrne, B. M. (2009). Structural equation modeling with AMOS: basic concepts, applications, and programming (2nd ed.). Abingdon-on-Thames: Routledge.<br />
*Williams, L. J., Vandenberg, R. J., & Edwards, J. R. (2009). Structural equation modeling in management research: A guide for improved analysis. The Academy of Management Annals, 3 (1), 543-604.<br />
<br />
===Model Fit===<br />
*Kenny, D. A. (2012). Measuring Model Fit. http://davidakenny.net/cm/fit.htm<br />
*Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural equation modeling: a multidisciplinary journal, 6(1), 1-55.<br />
*Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.<br />
*Hooper, D., Coughlan, J., & Mullen, M. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6(1), 53-60.<br />
<br />
==Miscellaneous==<br />
*Kolenikov, S., and Bollen, K. A. 2012. "Testing Negative Error Variances: Is a Heywood Case a Symptom of Misspecification?," Sociological Methods & Research (41:1), pp. 124-167.<br />
*Jalayer Khalilzadeh, Asli D.A. Tasci, Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research, In Tourism Management, Volume 62, 2017, Pages 89-96, http://www.sciencedirect.com/science/article/pii/S026151771730078X<br />
*Green, J. P., Tonidandel, S., & Cortina, J. M. (2016). Getting through the gate: Statistical and methodological issues raised in the reviewing process. Organizational Research Methods, 19(3), 402-432.<br />
*Malhotra, Naresh K. Marketing research: An applied orientation, 5/e. Pearson Education India, 2008.<br />
*Gravetter, F. J., & Wallnau, L. B. (2016). Statistics for the behavioral sciences (2nd ed.). Los Angeles: SAGE Publications, Inc.<br />
*Blair, J., Czaja, R. F., & Blair, E. A. (2014). Designing surveys: A guide to decisions and procedures (3rd ed.). Sage Publications.<br />
*Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability.<br />
*Kenny, D. A. (2011). Respecification of Latent Variable Models. http://davidakenny.net/cm/respec.htm<br />
*Bagozzi, R. P., & Yi, Y. (2012). Specification, evaluation, and interpretation of structural equation models. Journal of the academy of marketing science, 40(1), 8-34.<br />
*Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270-301. (for Cook's distance)<br />
*Winklhofer, H. M., & Diamantopoulos, A. (2002). Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach. International Journal of Research in Marketing, 19(2), 151-166.<br />
*Thomas, D. M., & Watson, R. T. (2002). Q-sorting and MIS research: A primer. Communications of the Association for Information Systems, 8(1), 9.<br />
*Osborne, J. W. (2012). Power and Planning for Data Collection. In Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage Publications.<br />
*Steenkamp, J. B. E., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199-214.<br />
*Bacharach, S. B. (1989). Organizational theories: Some criteria for evaluation. Academy of management review, 14(4), 496-515.<br />
*Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274-289.<br />
*Dietz, W. H., & Gortmaker, S. L. (1985). Do we fatten our children at the television set? Obesity and television viewing in children and adolescents. Pediatrics, 75(5), 807-812.<br />
*Peterson, C., Park, N., & Seligman, M. E. (2005). Orientations to happiness and life satisfaction: The full life versus the empty life. Journal of happiness studies, 6(1), 25-41.<br />
*Sposito, V. A., Hand, M. L., & Skarpness, B. (1983). On the efficiency of using the sample kurtosis in selecting optimal lpestimators. Communications in Statistics-simulation and Computation, 12(3), 265-272.<br />
*McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of mathematical and statistical Psychology, 34(1), 100-117.<br />
*Trochim, W. M., & Donnelly, J. P. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH:Atomic Dog.<br />
*Gravetter, F., & Wallnau, L. (2014). Essentials of statistics for the behavioral sciences (8th ed.). Belmont, CA: Wadsworth.<br />
*Field, A. (2000). Discovering statistics using spss for windows. London-Thousand Oaks- New Delhi: Sage publications.<br />
*Field, A. (2009). Discovering statistics using SPSS. London: SAGE.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Exploratory_Factor_Analysis&diff=1431158Exploratory Factor Analysis2018-06-24T21:10:11Z<p>Jgaskin: /* Common EFA Problems */</p>
<hr />
<div>Exploratory Factor Analysis (EFA) is a statistical approach for determining the correlation among the variables in a dataset. This type of analysis provides a factor structure (a grouping of variables based on strong correlations). In general, an EFA prepares the variables to be used for cleaner structural equation modeling. An EFA should always be conducted for new datasets. The beauty of an EFA over a CFA (confirmatory) is that no ''a priori'' theory about which items belong to which constructs is applied. This means the EFA will be able to spot problematic variables much more easily than the CFA. '''A critical assumption of the EFA is that it is only appropriate for sets of non-nominal items which theoretically belong to ''reflective latent'' factors.''' Categorical/nominal variables (e.g., marital status, gender) should not be included. Formative measures should not be included. Very rarely should objective (rather than perceptual) variables be included, as objective variables rarely belong to reflective latent factors. For those wondering why I default to Maximum Likelihood and Promax, here is a good explanation: https://jonathantemplin.com/files/sem/sem13psyc948/sem13psyc948_lecture10.pdf<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=jNDD5WSsOXI '''How to do an EFA''']<br />
<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Factor%20Analysis.pptx '''Exploratory Factor Analysis''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Rotation types ==<br />
Rotation causes factor loadings to be more clearly differentiated, which is often necessary to facilitate interpretation. Several types of rotation are available for your use.<br />
<br />
===Orthogonal===<br />
<br />
'''Varimax''' (most common)<br />
<br />
*minimizes number of variables with extreme loadings (high or low) on a factor<br />
*makes it possible to identify a variable with a factor<br />
<br />
'''Quartimax'''<br />
<br />
*minimizes the number of factors needed to explain each variable<br />
*tends to generate a general factor on which most variables load with medium to high values<br />
*not very helpful for research<br />
<br />
'''Equimax'''<br />
<br />
*combination of Varimax and Quartimax<br />
<br />
===Oblique===<br />
The variables are assessed for the unique relationship between each factor and the variables (removing relationships that are shared by multiple factors).<br />
<br />
'''Direct oblimin''' (DO) <br />
<br />
*factors are allowed to be correlated <br />
*diminished interpretability<br />
<br />
'''Promax''' (''Use this one if you're not sure'')<br />
<br />
*computationally faster than DO<br />
*used for large datasets<br />
<br />
== Factoring methods ==<br />
There are three main methods for factor extraction.<br />
===Principal Component Analysis (PCA)===<br />
''Use for a softer solution''<br />
*Considers all of the available variance (common + unique) (places 1’s on diagonal of correlation matrix).<br />
*Seeks a linear combination of variables such that maximum variance is extracted—repeats this step.<br />
*Use when the concern is prediction and parsimony, and you know the specific and error variance are small.<br />
*Results in orthogonal (uncorrelated) factors.<br />
<br />
===Principal Axis Factoring (PAF)===<br />
*Considers only common variance (places communality estimates on diagonal of correlation matrix).<br />
*Seeks least number of factors that can account for the common variance (correlation) of a set of variables. <br />
*PAF is only analyzing common factor variability; removing the uniqueness or unexplained variability from the model.<br />
*PAF is preferred because it accounts for co-variation, whereas PCA accounts for total variance.<br />
<br />
===Maximum Likelihood (ML)===<br />
''Use this method if you are unsure''<br />
*Maximizes differences between factors. Provides Model Fit estimate.<br />
*This is the approach used in AMOS, so if you are going to use AMOS for CFA and structural modeling, you should use this one during the EFA.<br />
<br />
== Appropriateness of data (adequacy) ==<br />
===KMO Statistics===<br />
*Marvelous: .90s<br />
*Meritorious: .80s<br />
*Middling: .70s<br />
*Mediocre: .60s<br />
*Miserable: .50s<br />
*Unacceptable: <.50<br />
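SPSS reports the KMO statistic directly, but the logic behind the number is easy to see in code. Here is a minimal sketch (Python with numpy only; the function name and data are illustrative, not from any particular package): KMO compares squared observed correlations against squared partial correlations, so values near 1 mean the items share variance that is not explained away pairwise.<br />

```python
import numpy as np

def kmo_overall(data):
    """Kaiser-Meyer-Olkin measure of sampling adequacy (overall value).

    data: 2-D array, rows = respondents, columns = items.
    """
    r = np.corrcoef(data, rowvar=False)             # observed correlations
    inv = np.linalg.inv(r)                          # precision matrix
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d                              # partial correlations
    np.fill_diagonal(r, 0.0)                        # keep off-diagonals only
    np.fill_diagonal(partial, 0.0)
    r2, p2 = (r ** 2).sum(), (partial ** 2).sum()
    return r2 / (r2 + p2)
```

The resulting value can then be read against the Marvelous/Meritorious/Middling labels above.<br />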
<br />
===Bartlett’s Test of Sphericity===<br />
Tests hypothesis that correlation matrix is an identity matrix.<br />
*Diagonals are ones<br />
*Off-diagonals are zeros<br />
A significant result (Sig. < 0.05) indicates matrix is not an identity matrix; i.e., the variables do relate to one another enough to run a meaningful EFA.<br />
<br />
[[File:KMO.png]]<br />
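If you want to reproduce Bartlett's test outside SPSS, the statistic has a simple closed form: chi-square = -(n - 1 - (2p + 5)/6) * ln|R|, with p(p-1)/2 degrees of freedom, where R is the item correlation matrix, n the sample size, and p the number of items. A sketch (Python with numpy and scipy; the function name is illustrative):<br />

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(data):
    """Bartlett's test that the correlation matrix is an identity matrix.

    data: 2-D array, rows = respondents, columns = items.
    Returns (chi_square, p_value); a significant result (p < .05)
    supports running a meaningful EFA.
    """
    n, p = data.shape
    r = np.corrcoef(data, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(r))
    df = p * (p - 1) / 2
    return chi2, stats.chi2.sf(chi2, df)
```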
<br />
== Communalities ==<br />
A communality is the extent to which an item correlates with '''all''' other items. Higher communalities are better. If the communality for a particular variable is low (below 0.4), then that variable may struggle to load significantly on any factor. In the table below, you should identify low values in the "Extraction" column. Low values indicate candidates for removal after you examine the pattern matrix.<br />
<br />
[[File:Communalities.png]]<br />
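For an orthogonal solution, the extraction communality of an item is simply the row sum of its squared factor loadings (oblique solutions also involve the factor correlations, so SPSS's reported values are the safer source there). A quick illustration with made-up loadings:<br />

```python
import numpy as np

# hypothetical loadings: rows = items, columns = extracted factors
loadings = np.array([
    [0.82, 0.10],
    [0.78, 0.05],
    [0.15, 0.30],   # a weak item
])
communalities = (loadings ** 2).sum(axis=1)
watch_list = communalities < 0.4   # candidates to monitor, not auto-remove
```

Consistent with the advice above, items flagged this way should be watched through the rest of the EFA rather than deleted outright.<br />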
<br />
== Factor Structure ==<br />
Factor structure refers to the intercorrelations among the variables being tested in the EFA. Using the pattern matrix below as an illustration, we can see that variables group into factors - more precisely, they "load" onto factors. The example below illustrates a very clean factor structure in which convergent and discriminant validity are evident from the high loadings within factors and the absence of major cross-loadings between factors (i.e., each primary loading should be at least 0.200 larger than the secondary loading).<br />
<br />
[[File:Patternmatrix.png]]<br />
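The cross-loading rule above (a primary loading at least 0.200 larger than any secondary loading) is mechanical enough to script once you have the pattern matrix. A minimal sketch (Python/numpy; the function name is illustrative):<br />

```python
import numpy as np

def crossloading_flags(pattern, gap=0.2):
    """Flag items whose primary loading is not at least `gap` larger
    (in absolute value) than their next-highest loading."""
    a = np.sort(np.abs(np.asarray(pattern, dtype=float)), axis=1)
    primary, secondary = a[:, -1], a[:, -2]
    return (primary - secondary) < gap
```

For example, an item loading 0.80/0.10 is clean, while one loading 0.50/0.40 would be flagged as a problematic cross-loading.<br />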
<br />
== Convergent validity ==<br />
Convergent validity means that the variables within a single factor are highly correlated. This is evident by the factor loadings. Sufficient/significant loadings depend on the sample size of your dataset. The table below outlines the thresholds for sufficient/significant factor loadings. Generally, the smaller the sample size, the higher the required loading. We can see that in the pattern matrix above, we would need a sample size of 60-70 at a minimum to achieve significant loadings for variables loyalty1 and loyalty7. Regardless of sample size, it is best to have loadings greater than 0.500 and averaging out to greater than 0.700 for each factor. <br />
<br />
[[File:LoadingsThresholds.png]]<br />
<br />
== Discriminant validity ==<br />
Discriminant validity refers to the extent to which factors are distinct and uncorrelated. The rule is that variables should relate more strongly to their own factor than to another factor. Two primary methods exist for determining discriminant validity during an EFA. The first method is to examine the pattern matrix. Variables should load significantly only on one factor. If "cross-loadings" do exist (variable loads on multiple factors), then the cross-loadings should differ by more than 0.2. The second method is to examine the factor correlation matrix, as shown below. Correlations between factors should not exceed 0.7. A correlation greater than 0.7 indicates a majority of shared variance (0.7 * 0.7 = 49% shared variance). As we can see from the factor correlation matrix below, factor 2 is too highly correlated with factors 1, 3, and 4. <br />
<br />
[[File:FCM.png]]<br />
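The second check - no factor correlation above 0.7 - can be automated the same way. A minimal sketch (Python/numpy; the correlation matrix below is made up for illustration):<br />

```python
import numpy as np

# hypothetical factor correlation matrix from an EFA
factor_corr = np.array([
    [1.00, 0.75, 0.30],
    [0.75, 1.00, 0.25],
    [0.30, 0.25, 1.00],
])
shared_variance = factor_corr ** 2            # squared correlations
# pairs of factors that share a majority of their variance (|r| > 0.7)
rows, cols = np.where(np.triu(np.abs(factor_corr) > 0.7, k=1))
problem_pairs = list(zip(rows.tolist(), cols.tolist()))
```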
<br />
What if you have discriminant validity problems - for example, the items from two theoretically different factors end up loading on the same extracted factor (instead of on separate factors)? I have found the best way to resolve this type of issue is to do a separate EFA with just the items from the offending factors. Work out this smaller EFA (by removing items one at a time that have the worst cross-loadings), then reinsert the remaining items into the full EFA. This will usually resolve the issue. If it doesn't, then consider whether these two factors are actually just two dimensions or manifestations of some higher order factor. If this is the case, then you might consider doing the EFA for this higher order factor separate from all the items belonging to first order factors. Then during the CFA, make sure to model the higher order factor properly by making a 2nd order latent variable.<br />
<br />
== Face validity ==<br />
Face validity is very simple. Do the factors make sense? For example, are variables that are similar in nature loading together on the same factor? If there are exceptions, are they explainable? Factors that demonstrate sufficient face validity should be easy to label. For example, in the pattern matrix above, we could easily label factor 1 "Trust in the Agent" (assuming the variable names are representative of the measure used to collect data for this variable). If all the "Trust" variables in the pattern matrix above loaded onto a single factor, we may have to abstract a bit and call this factor "Trust" rather than "Trust in Agent" and "Trust in Company".<br />
<br />
== Reliability ==<br />
Reliability refers to the consistency of the item-level errors within a single factor. Reliability means just what it sounds like: a "reliable" set of variables will consistently load on the same factor. The way to test reliability in an EFA is to compute Cronbach's alpha for each factor. Cronbach's alpha should be above 0.7; although, ''ceteris paribus'', the value will generally increase for factors with more variables, and decrease for factors with fewer variables. Each factor should aim to have at least 3 variables, although 2 variables is sometimes permissible. <br />
<br />
[[File:CronbachsAlpha.png]]<br />
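SPSS reports Cronbach's alpha for you, but the formula itself is short: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), for k items. A sketch for one factor's items (Python/numpy; the function name is illustrative):<br />

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = the items
    belonging to a single factor."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total score
    return (k / (k - 1)) * (1 - item_vars / total_var)
```

Perfectly redundant items yield alpha = 1; as the text notes, alpha also tends to rise with the number of items, all else equal.<br />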
<br />
== Formative vs. Reflective ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.youtube.com/watch?v=R-jg026t0F8 '''Variables and Factor Analysis, including Specification''']<br />
Specifying formative versus reflective constructs is a critical preliminary step prior to further statistical analysis. Formative constructs should not be expected to properly factor in SPSS, and cannot be modeled appropriately in AMOS. If you need to work with formative factors, either use a Partial Least Squares approach (see PLS section), or create a score (new variable) for each set of formative indicators. This score could be an average or a sum, or some sort of weighted scoring. Here is how you know whether you're working with formative or reflective constructs:<br />
<br />
'''Formative'''<br />
*Direction of causality is from measure to construct<br />
*No reason to expect the measures are correlated<br />
*Indicators are not interchangeable<br />
<br />
'''Reflective'''<br />
*Direction of causality is from construct to measure<br />
*Measures expected to be correlated<br />
*Indicators are interchangeable<br />
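As suggested above, if you must keep formative indicators in an AMOS workflow, one pragmatic option is to collapse each formative set into a single score (an average, a sum, or a weighted score). A sketch (Python/numpy; the indicator values and weights are purely hypothetical):<br />

```python
import numpy as np

# hypothetical formative indicators (rows = respondents, columns = indicators)
indicators = np.array([[3.0, 4.0, 2.0],
                       [5.0, 5.0, 4.0]])

mean_score = indicators.mean(axis=1)       # simple unweighted average
sum_score = indicators.sum(axis=1)         # or a sum
weights = np.array([0.5, 0.3, 0.2])        # or hypothetical a priori weights
weighted_score = indicators @ weights
```

The resulting score becomes a new observed variable that can be modeled directly.<br />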
<br />
An example of formative versus reflective constructs is given in the figure below.<br />
<br />
[[File:Specification.png]]<br />
<br />
== Common EFA Problems ==<br />
1. EFA that results in too many or too few factors (contrary to expected number of factors).<br />
*This happens all the time when you extract based on eigenvalues. I encourage students to use eigenvalues first, but then also to try constraining to the exact number of expected factors. Concerns arise when eigenvalue-based extraction yields fewer factors than expected, because constraining to the expected number then extracts factors with very low eigenvalues (and therefore not very useful factors). <br />
2. EFA with low communalities for some items.<br />
*This is a sign of low correlation and is usually corroborated by a low pattern matrix loading. I tell students not to remove an item just because of a low communality, but to watch it carefully throughout the rest of the EFA. <br />
3. EFA with a 2nd order construct involved, as well as several first order constructs. <br />
*Often when there is a 2nd order factor in an EFA, the subdimensions of that factor will all load together, instead of in separate factors. In such cases, I recommend doing a separate EFA for the items of that 2nd order factor (use Promax and Principal Components Analysis). Then, if that EFA results in removing some items to achieve discriminant validity, you can try putting the EFA back together with the remaining items (although it still might not work). Then, during the CFA, be sure to properly model the 2nd order factor with an additional latent variable connected to its sub-factors.<br />
4. EFA with Heywood cases<br />
*Sometimes loadings are greater than 1.00. I don’t address these until I’ve addressed all other problems. Once I have a good EFA solution, then if the Heywood case is still there (usually it resolves itself), then I try a different rotation method (Varimax will fix it every time).<br />
<br />
== Some Thoughts on Messy EFAs ==<br />
Let us say that you are doing an EFA and your pattern matrix ends up a mess. Let’s say that the items from one or two constructs do not load as expected no matter how you manipulate the EFA. What can you do about it? There is no right answer (this is statistics after all), but you do have a few options:<br />
<br />
1. You can remove those constructs from the model and move forward without them.<br />
*This option is not recommended as it is usually the last course of action to take. You should always do everything in your power to retain constructs that are key to your theory.<br />
2. You can run the EFA using a more exploratory approach without regard to expected loadings. For example, if you expected item foo3 to load with items foo1 and foo2, but instead it loaded with items moo1-3, then you should just let it. Then rename your factors according to what loaded on them. <br />
*This option is acceptable, but will lead you to produce a model that is probably somewhat different from the one you had expected to end up with. <br />
3. You can say to yourself, “Why am I doing an EFA? These are established scales and I already know which items belong to which constructs (theoretically). I do not need to explore the relationships between the items because I already know the relationships. So shouldn’t I be doing a CFA instead – to confirm these expectations?” And then you would simply jump to the CFA first to refine your measurement model '''(but then you return to your EFA after your CFA)'''.<br />
*Surveys are usually built with ''a priori'' constructs and theory in mind – or surveys are built from existing scales that have been validated in previous literature. Thus, we are less inclined to “explore” and more inclined to “confirm” when doing factor analysis. The point of a factor analysis is to show that you have distinct constructs (discriminant validity) that each measures a single thing (convergent validity), and that are reliable (reliability). This can all be achieved in the CFA. '''''However, you should then go back to the EFA and "confirm" the CFA in the EFA by setting up the EFA as your CFA turned out.''''' <br />
<br />
Why do I bring this up? Mainly because your EFAs are nearly always going to run messy, and because you can endlessly mess around with an EFA. If you believe everything your EFA is telling you, you will end up throwing away items and constructs unnecessarily, and thus letting statistics drive your theory instead of letting theory drive your theory. EFAs are exploratory and can be treated as such. We want to retain as much as possible while still producing valid results. I don’t know if this is emphasized enough in our quant courses. <br />
I also bring this up because I ran an EFA recently and got something I could not salvage without hacking out a couple of constructs. However, after running the CFA with the full model (ignoring the EFA), I was able to retain all constructs by removing only a few items (and not the ones I expected based on the EFA!). I now have excellent reliability, convergent validity, and only a minor issue with discriminant validity that I’m willing to justify for the greater good of the model. I can now go back and reconcile my CFA with an EFA.<br />
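To make this concrete, here is a small illustrative sketch of the kind of item-screening logic involved (my own sketch, not from any published source): flag any item whose primary loading is weak or that cross-loads onto a second factor. The item names and the 0.5/0.2 thresholds below are common rules of thumb, not hard standards.<br />

```python
# Flag problem items in an EFA pattern matrix using common rules of thumb:
# a primary loading below 0.5, or a second loading within 0.2 of the primary.
def flag_problem_items(pattern, min_primary=0.5, min_gap=0.2):
    problems = {}
    for item, loadings in pattern.items():
        ranked = sorted((abs(l) for l in loadings), reverse=True)
        primary, second = ranked[0], ranked[1]
        if primary < min_primary:
            problems[item] = "weak primary loading (%.2f)" % primary
        elif primary - second < min_gap:
            problems[item] = "cross-loading (%.2f vs %.2f)" % (primary, second)
    return problems

# Hypothetical two-factor pattern matrix (rows = items, columns = factors).
pattern = {
    "foo1": [0.82, 0.10],
    "foo2": [0.78, 0.05],
    "foo3": [0.45, 0.41],  # loads weakly, and on both factors
    "moo1": [0.12, 0.85],
    "moo2": [0.08, 0.79],
}
print(flag_problem_items(pattern))
```

In this toy matrix only foo3 is flagged; whether to drop it or let it move to the other factor is still a theoretical judgment, as discussed above.<br />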
<br />
For a very rocky but successful demonstration of handling a troublesome EFA, watch my SEM Boot Camp 2014 Day 3 Afternoon Video towards the end. The link below will start you at the right time position. In this video, I take one of the seminar participant's data, which I had never seen before, and with which he had been unable to arrive at a clean EFA, and I struggle through it until we arrive at something valid and usable. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/XYHrmDs68Bg?t=1h22m55s '''Tackling a Difficult EFA''']<br />
<br />
==Constructs and Validity==<br />
*MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293-334.<br />
*Bolton, R. N. (1993). Pretesting questionnaires: content analyses of respondents' concurrent verbal protocols. Marketing science, 12(3), 280-303.<br />
*Podsakoff, N. P., Podsakoff, P. M., MacKenzie, S. B., & Klinger, R. L. (2013). Are we really measuring what we say we're measuring? Using video techniques to supplement traditional construct validation procedures. Journal of Applied Psychology, 98(1), 99.<br />
*Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. S. (2002). The Q-sort method: assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal of Modern Applied Statistical Methods, 1(1), 15.<br />
*Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of consumer research, 30(2), 199-218.<br />
*MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323-326.<br />
*Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social indicators research, 46(2), 137-155.<br />
*Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. Structural equation modeling: Present and future, 195-216. (discusses MaxR(H))<br />
<br />
==Measurement Models==<br />
===Exploratory Factor Analysis===<br />
*Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological methods, 4(3), 272.<br />
*Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most From Your Analysis. Practical Assessment, Research & Evaluation,10(7), 1-9.<br />
*Reio Jr, T. G., & Shuck, B. (2015). Exploratory factor analysis: Implications for theory, research, and practice. Advances in Developing Human Resources, 17(1), 12-25.<br />
*Treiblmaier, H., & Filzmoser, P. (2010). Exploratory factor analysis revisited: How robust methods support the detection of hidden multivariate data structures in IS research. Information & management, 47(4), 197-207.<br />
*Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A users’ guide. International Journal of Selection and Assessment, 1(2), 84-94.<br />
<br />
===Confirmatory Factor Analysis===<br />
*Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational research methods, 3(1), 4-70.<br />
*Byrne, B. M. (2008). Testing for multigroup equivalence of a measuring instrument: A walk through the process. Psicothema, 20(4), 872-882.<br />
*Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling, 11(2), 272-300.<br />
*Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: Review of practice and implications. Human Resource Management Review, 18(4), 210-222.<br />
*Brown, T. A. (2014). Confirmatory factor analysis for applied research (2nd ed.). Guilford Publications.<br />
*Matsunaga, M. (2010). How to factor-analyze your data right: do’s, don’ts, and how-to’s. International Journal of Psychological Research, 3(1), 97-110.<br />
*Malhotra, N. K., & Dash, S. (2011). Marketing research: An applied orientation. London: Pearson Publishing.<br />
*Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
====Method Bias, Response Bias, Specific Bias====<br />
*Fuller et al. (2016). Common methods variance detection in business research. Journal of Business Research, 69(8), 3192-3198. (suggests Harman's single factor test is useful under certain circumstances)<br />
*Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. Journal of applied psychology, 88(5), 879.<br />
*MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542-555.<br />
*Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual review of psychology, 63, 539-569. <br />
*Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods.<br />
*Doty, D. H., & Glick, W. H. (1998). Common methods bias: does common methods variance really bias results?. Organizational research methods, 1(4), 374-406.<br />
*Estabrook, R., & Neale, M. (2013). A comparison of factor score estimation methods in the presence of missing data: Reliability and an application to nicotine dependence. Multivariate Behavioral Research, 48(1), 1-27. <br />
*Arbuckle, J. L. (2006). Amos 7.0 user’s guide. Chicago, IL: SPSS. <br />
*Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology, 28, 97-104.<br />
*Lawley, D. N., & Maxwell, M. A. (1971). Factor analysis as a statistical method (2nd ed.). London, UK: Butterworths. <br />
*Horn, J. L., McArdle, J. J., & Mason, R. (1983). When invariance is not invariant: A practical scientist’s view of the ethereal concept of factorial invariance. The Southern Psychologist, 1, 179-188.<br />
*Muthén, L., & Muthén, B. (1998-2007). Mplus user’s guide (5th ed.). Los Angeles, CA: Author.<br />
<br />
===Other===<br />
*Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological methods, 5(2), 155.<br />
*Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological assessment, 7(3), 286.<br />
*Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of marketing research, 186-192.<br />
*Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and social psychology bulletin, 28(12), 1629-1646.<br />
*Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39-50.<br />
*Bagozzi, R. P. (2011). Measurement and meaning in information systems and organizational research: Methodological and philosophical foundations. Mis Quarterly, 261-292.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90(4), 710.<br />
*Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of business research, 61(12), 1203-1218.<br />
<br />
==Mediation, Moderation, and Moderated Mediation==<br />
===Mediation===<br />
*Mathieu, J. E., & Taylor, S. R. (2006). Clarifying conditions and decision points for mediational type inferences in organizational behavior. Journal of Organizational Behavior, 27(8), 1031-1056.<br />
*Mathieu, J. E., DeShon, R. P., & Bergh, D. D. (2008). Mediational inferences in organizational research: Then, now, and beyond. Organizational Research Methods, 11(2), 203-223.<br />
*MacKinnon, D. P., Coxe, S., & Baraldi, A. N. (2012). Guidelines for the investigation of mediating variables in business research. Journal of Business and Psychology, 27(1), 1-14.<br />
*MacKinnon, D. P., & Pirlott, A. G. (2015). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30-43.<br />
*Preacher, K. J. (2015). Advances in mediation analysis: A survey and synthesis of new developments. Annual Review of Psychology, 66, 825-852.<br />
*Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of consumer research, 37(2), 197-206.<br />
*Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.<br />
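As a rough illustration of the bootstrapped indirect effect (a × b) that the sources above advocate over the classic Baron and Kenny steps, here is a minimal sketch on synthetic data. The variable names, effect sizes, and sample size are invented for illustration; a real analysis should use a dedicated tool such as Hayes's PROCESS macro or an AMOS estimand.<br />

```python
import random
random.seed(1)

def indirect(x, m, y):
    # a-path: slope of M regressed on X; b-path: partial slope of M in Y ~ M + X
    n = len(x)
    mx, mm, my = sum(x)/n, sum(m)/n, sum(y)/n
    cx = [v - mx for v in x]
    cm = [v - mm for v in m]
    cy = [v - my for v in y]
    sxx = sum(v*v for v in cx)
    smm = sum(v*v for v in cm)
    sxm = sum(p*q for p, q in zip(cx, cm))
    sxy = sum(p*q for p, q in zip(cx, cy))
    smy = sum(p*q for p, q in zip(cm, cy))
    a_path = sxm / sxx
    b_path = (sxx*smy - sxm*sxy) / (sxx*smm - sxm**2)
    return a_path * b_path

# Synthetic data with a true indirect path X -> M -> Y of 0.6 * 0.5 = 0.3
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
m = [0.6*xi + random.gauss(0, 1) for xi in x]
y = [0.5*mi + random.gauss(0, 1) for mi in m]

# Percentile bootstrap of the indirect effect (resample rows with replacement)
boots = []
for _ in range(1000):
    rows = [random.randrange(n) for _ in range(n)]
    boots.append(indirect([x[i] for i in rows],
                          [m[i] for i in rows],
                          [y[i] for i in rows]))
boots.sort()
ci_low, ci_high = boots[24], boots[975]  # approximate 95% CI
print("indirect = %.3f, 95%% CI [%.3f, %.3f]" % (indirect(x, m, y), ci_low, ci_high))
```

If the bootstrap confidence interval excludes zero, the indirect effect is supported without assuming its sampling distribution is normal, which is the key point of the bootstrap approach.<br />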
<br />
===Moderation and Multigroup===<br />
*Byrne, B. M., & Stewart, S. M. (2006). Teacher's corner: The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling, 13(2), 287-321.<br />
*Schumacker, R. E., & Marcoulides, G. A. (1998). Interaction and nonlinear effects in structural equation modeling. Lawrence Erlbaum Associates Publishers.<br />
*Li, F., Harmer, P., Duncan, T. E., Duncan, S. C., Acock, A., & Boles, S. (1998). Approaches to testing interaction effects using structural equation modeling methodology. Multivariate Behavioral Research, 33(1), 1-39.<br />
*Floh, A., & Treiblmaier, H. (2006). What keeps the e-banking customer loyal? A multigroup analysis of the moderating role of consumer characteristics on e-loyalty in the financial service industry.<br />
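One common regression-based way to test the moderation these sources discuss is to mean-center the predictor and the moderator and then enter their product term alongside them. A minimal sketch (the variable values below are invented):<br />

```python
# Mean-center X and the moderator W, then form the interaction term X*W.
def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor
w = [2.0, 2.0, 4.0, 4.0, 3.0]   # moderator
xw = [xi * wi for xi, wi in zip(center(x), center(w))]
print(xw)  # this product term enters the regression alongside X and W
```

A significant coefficient on the product term indicates that the X-to-Y relationship depends on the level of W; centering first reduces nonessential collinearity between the product and its components.<br />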
<br />
===Both or Other===<br />
*Aguinis, H., Edwards, J. R., & Bradley, K. J. (2016). Improving our understanding of moderation and mediation in strategic management research. Organizational Research Methods, 1094428115627498.<br />
*Sardeshmukh, S. R., & Vandenberg, R. J. (2016). Integrating Moderation and Mediation A Structural Equation Modeling Approach. Organizational Research Methods, 1094428115621609.<br />
*Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate behavioral research, 42(1), 185-227.<br />
*Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.<br />
<br />
==Partial Least Squares==<br />
*Becker, J. M., Klein, K., and Wetzels, M. (2012). Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models. Long Range Planning, 45(5), 359-394.<br />
*Becker, J.-M., Rai, A., Ringle, C. M., and Völckner, F. (2013). Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats. MIS Quarterly, 37 (3), 665-694.<br />
*Gefen, D., & Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: Tutorial and annotated example. Communications of the Association for Information systems, 16(1), 5.<br />
*Hair, J. F., C. M. Ringle, and M. Sarstedt (2011). PLS-SEM: Indeed a Silver Bullet, Journal of Marketing Theory & Practice, 19 (2), 139-151. <br />
*Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40 (3), 414-433. <br />
*Hair, J. F., M. Sarstedt, T. Pieper, and C. M. Ringle (2012). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340. <br />
*Hair, J. F., Ringle, C. M., & Sarstedt, M. (2013). Editorial-partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance.<br />
*Hair, J., Sarstedt, M., Hopkins, L., & G. Kuppelwieser, V. (2014). Partial least squares structural equation modeling (PLS-SEM) An emerging tool in business research. European Business Review, 26(2), 106-121.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling, Journal of the Academy of Marketing Science, 43 (1), 115–135.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2016). Testing Measurement Invariance of Composites Using Partial Least Squares, International Marketing Review, 33 (3), 405-431.<br />
*Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M., and Calantone, R.J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013). Organizational Research Methods, 17(2), 182-209. <br />
*Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. In New challenges to international marketing (pp. 277-319). Emerald Group Publishing Limited.<br />
*Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.<br />
*Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it. IEEE Transactions on Professional Communication, 57(2), 123-146.<br />
*McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17(2), 210-251.<br />
*Monge, C., Cruz, J., & López, F. (2014). Manufacturing and continuous improvement areas using partial least squares path modeling with multiple regression comparison. In Proceedings of CBU International Conference on Innovation, Technology Transfer and Education (2014), February (pp. 3-5).<br />
*Rigdon, E. E. (2014). Rethinking partial least squares path modeling: breaking chains and forging ahead. Long Range Planning, 47(3), 161-167.<br />
*Ringle, C. M., M. Sarstedt, and D. W. Straub (2012). A Critical look at the Use of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.<br />
*Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. In Measurement and research methods in international marketing (pp. 195-218). Emerald Group Publishing Limited.<br />
*Wong, K. K. K. (2013). Partial least squares structural equation modeling (PLS-SEM) techniques using SmartPLS. Marketing Bulletin, 24(1), 1-32.<br />
<br />
==General Topics==<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
*Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage learning.<br />
*Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin, 103(3), 411.<br />
*Suits, D. B. (1957). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548-551.<br />
*Gefen, D., Rigdon, E. E., & Straub, D. (2011). Editor's comments: an update and extension to SEM guidelines for administrative and social science research. MIS Quarterly, iii-xiv.<br />
*Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.<br />
*Blunch, N. (2013). Introduction to structural equation modeling using IBM SPSS statistics and AMOS (2nd ed.). Los Angeles, CA: Sage.<br />
*Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford publications.<br />
*Argyrous, G. (2011). Statistics for research: with a guide to SPSS (3rd ed.). Thousand Oaks, CA: Sage Publications.<br />
*Byrne, B. M. (2009). Structural equation modeling with AMOS: basic concepts, applications, and programming (2nd ed.). Abingdon-on-Thames: Routledge.<br />
*Williams, L. J., Vandenberg, R. J., & Edwards, J. R. (2009). Structural equation modeling in management research: A guide for improved analysis. The Academy of Management Annals, 3 (1), 543-604.<br />
<br />
===Model Fit===<br />
*Kenny, D. A. (2012). Measuring Model Fit. http://davidakenny.net/cm/fit.htm<br />
*Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural equation modeling: a multidisciplinary journal, 6(1), 1-55.<br />
*Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.<br />
*Hooper, D., Coughlan, J., & Mullen, M. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6(1), 53-60.<br />
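The cutoff discussion in these sources can be summarized in a small sketch. The thresholds below are the conventional values commonly attributed to Hu and Bentler (1999); treat them as guidelines rather than golden rules, as the sources themselves caution.<br />

```python
# Conventional fit cutoffs commonly attributed to Hu & Bentler (1999).
CUTOFFS = {
    "CFI":   ("min", 0.95),   # comparative fit index: higher is better
    "TLI":   ("min", 0.95),   # Tucker-Lewis index: higher is better
    "RMSEA": ("max", 0.06),   # root mean square error of approximation
    "SRMR":  ("max", 0.08),   # standardized root mean square residual
}

def assess_fit(indices):
    verdicts = {}
    for name, value in indices.items():
        direction, cutoff = CUTOFFS[name]
        ok = value >= cutoff if direction == "min" else value <= cutoff
        verdicts[name] = "good" if ok else "poor"
    return verdicts

# Hypothetical indices from a CFA run
print(assess_fit({"CFI": 0.961, "TLI": 0.948, "RMSEA": 0.051, "SRMR": 0.044}))
```

Note that fit indices should be evaluated together and in light of model complexity and sample size, not as isolated pass/fail tests.<br />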
<br />
==Miscellaneous==<br />
*Kolenikov, S., and Bollen, K. A. 2012. "Testing Negative Error Variances: Is a Heywood Case a Symptom of Misspecification?," Sociological Methods & Research (41:1), pp. 124-167.<br />
*Khalilzadeh, J., & Tasci, A. D. A. (2017). Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research. Tourism Management, 62, 89-96. http://www.sciencedirect.com/science/article/pii/S026151771730078X<br />
*Green, J. P., Tonidandel, S., & Cortina, J. M. (2016). Getting through the gate: Statistical and methodological issues raised in the reviewing process. Organizational Research Methods, 19(3), 402-432.<br />
*Malhotra, N. K. (2008). Marketing research: An applied orientation (5th ed.). Pearson Education India.<br />
*Gravetter, F. J., & Wallnau, L. B. (2016). Statistics for the behavioral sciences (2nd ed.). Los Angeles: SAGE Publications, Inc.<br />
*Blair, J., Czaja, R. F., & Blair, E. A. (2014). Designing surveys: A guide to decisions and procedures (3rd ed.). Sage Publications.<br />
*Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability.<br />
*Kenny, D. A. (2011). Respecification of Latent Variable Models. http://davidakenny.net/cm/respec.htm<br />
*Bagozzi, R. P., & Yi, Y. (2012). Specification, evaluation, and interpretation of structural equation models. Journal of the academy of marketing science, 40(1), 8-34.<br />
*Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270-301. (for Cook's distance)<br />
*Winklhofer, H. M., & Diamantopoulos, A. (2002). Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach. International Journal of Research in Marketing, 19(2), 151-166.<br />
*Thomas, D. M., & Watson, R. T. (2002). Q-sorting and MIS research: A primer. Communications of the Association for Information Systems, 8(1), 9.<br />
*Osborne, J. W. (2012). Power and Planning for Data Collection. In Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage Publications.<br />
*Steenkamp, J. B. E., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199-214.<br />
*Bacharach, S. B. (1989). Organizational theories: Some criteria for evaluation. Academy of management review, 14(4), 496-515.<br />
*Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274-289.<br />
*Dietz, W. H., & Gortmaker, S. L. (1985). Do we fatten our children at the television set? Obesity and television viewing in children and adolescents. Pediatrics, 75(5), 807-812.<br />
*Peterson, C., Park, N., & Seligman, M. E. (2005). Orientations to happiness and life satisfaction: The full life versus the empty life. Journal of happiness studies, 6(1), 25-41.<br />
*Sposito, V. A., Hand, M. L., & Skarpness, B. (1983). On the efficiency of using the sample kurtosis in selecting optimal lp estimators. Communications in Statistics - Simulation and Computation, 12(3), 265-272.<br />
*McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of mathematical and statistical Psychology, 34(1), 100-117.<br />
*Trochim, W. M., & Donnelly, J. P. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH: Atomic Dog.<br />
*Gravetter, F., & Wallnau, L. (2014). Essentials of statistics for the behavioral sciences (8th ed.). Belmont, CA: Wadsworth.<br />
*Field, A. (2000). Discovering statistics using spss for windows. London-Thousand Oaks- New Delhi: Sage publications.<br />
*Field, A. (2009). Discovering statistics using SPSS. London: SAGE.</div>
<hr />
<div><br />
* Navigation<br />
** mainpage|Home<br />
** http://gaskination.com/forum/|Forum<br />
** Data screening|Data Screening<br />
** Exploratory Factor Analysis|EFA<br />
** Confirmatory Factor Analysis|CFA<br />
** Structural Equation Modeling|Causal SEM<br />
** PLS|PLS<br />
** Plugins|Plugins Info<br />
** Guidelines|General Guidelines<br />
** Cluster Analysis|Cluster Analysis<br />
** References|References<br />
<br />
* Resources<br />
** http://www.kolobkreations.com/Stats%20Tools%20Package.xlsm|Excel StatTools<br />
** http://www.kolobkreations.com/Stats%20Tools%20Package%20OLD.xls|OLD StatTools<br />
** http://www.youtube.com/Gaskination|YouTube Demos<br />
** http://www.kolobkreations.com/StatsHelpArchive.pdf|Stats Help Archive<br />
** https://drive.google.com/drive/folders/0B3T1TGdHG9aEbFg1eEpqOWtrR3c?usp=sharing|Plugins & Estimands</div>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be tested using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, how much space to devote to them, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the affected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development work on your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with five to eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a harder-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read each item out loud and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends,” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, even existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement among participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often, time constraints or the target population make a pilot study infeasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time, although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency among the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically, studies of organizational performance and employee dispositions and intentions are straightforward and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require an oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to return abandoned, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
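The pilot-study reliability check above is easy to sketch in code. Cronbach's alpha for k items is (k/(k-1)) * (1 - sum of item variances / variance of the summed scores). Below is a minimal, purely illustrative Python computation; the Enjoyment responses are hypothetical, and in practice you would get this number from SPSS or a similar package.<br />

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a reflective scale.

    items: a list of k lists, each holding one item's scores
           across the same n respondents.
    """
    k = len(items)
    item_vars = [pvariance(col) for col in items]  # variance of each item
    totals = [sum(resp) for resp in zip(*items)]   # each respondent's summed score
    return (k / (k - 1)) * (1 - sum(item_vars) / pvariance(totals))

# Hypothetical pilot data (n = 6) for the 5-item Enjoyment scale,
# scored 1-5, with the "boring" item already re-reversed.
enjoyment = [
    [4, 5, 3, 2, 4, 5],  # I enjoyed using the VR
    [4, 4, 3, 2, 5, 5],  # Interacting with the VR was fun
    [5, 4, 3, 1, 4, 4],  # I was happy while using the VR
    [4, 5, 2, 2, 4, 5],  # Using the VR was boring (reverse coded, then re-reversed)
    [5, 5, 3, 1, 4, 4],  # Using the VR was pleasurable
]
alpha = cronbach_alpha(enjoyment)
print(round(alpha, 3))  # high, because the items move together
```

If the pilot alpha falls well below the conventional 0.70 cutoff, the items are not moving together and should be reworded before the full data collection.<br />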
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” if possible. These get coded as 0, 6, 8, etc., but the number is completely invalid. However, when you run statistics on the data, your statistics software doesn’t know that those numbers are invalid, so it treats them as actual data points. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end of the scale. So they rarely capture the trait the way you intend. When I design new surveys, I nearly always re-reverse reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. More subtle such measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
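Two of the guidelines above translate directly into data-cleaning code: re-aligning a reverse-coded item and dropping respondents who fail the attention trap. Here is a minimal sketch in Python; the item names, the trap item, and its correct answer (2 = somewhat disagree) are all hypothetical.<br />

```python
# Re-align a reverse-coded item on a 1..scale_max Likert scale: 1<->5, 2<->4, 3 stays 3.
def reverse_code(score, scale_max=5):
    return scale_max + 1 - score

TRAP_ANSWER = 2  # "...never mind, please respond with somewhat disagree"

# Hypothetical raw responses; "boring_rev" is the reverse-coded item.
responses = [
    {"enjoy1": 5, "boring_rev": 1, "trap": 2},  # engaged, passed the trap
    {"enjoy1": 4, "boring_rev": 2, "trap": 2},  # engaged, passed the trap
    {"enjoy1": 5, "boring_rev": 5, "trap": 5},  # failed the trap: drop
]

cleaned = []
for resp in responses:
    if resp["trap"] != TRAP_ANSWER:
        continue  # respondent was not paying attention
    resp = dict(resp)  # copy before modifying
    # After this, a higher "boring" score means MORE enjoyment,
    # matching the direction of the regular items.
    resp["boring"] = reverse_code(resp.pop("boring_rev"))
    cleaned.append(resp)

print(len(cleaned))  # 2 usable respondents remain
```

Screening like this belongs in the case-screening step, before any factor analysis.<br />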
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]] '''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run'''] <br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin] (if method bias was detected, remove the CLF or whatever variable is affecting all observed variables, while conducting this final validity check. You would then put it back in before imputing factor scores if there is bias.)<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
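The case- and variable-screening steps above (unengaged responses, skewness, and kurtosis) are simple enough to sketch without a stats package. A minimal illustration in Python with made-up data; published cutoffs for acceptable skewness and kurtosis vary by author, so treat the checks here as illustrative only.<br />

```python
from statistics import mean, pstdev

def skewness(xs):
    """Population skewness: third standardized moment."""
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def excess_kurtosis(xs):
    """Population excess kurtosis (normal distribution = 0)."""
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * s ** 4) - 3

def is_unengaged(row):
    """A respondent who straight-lines every Likert item has zero variance."""
    return pstdev(row) == 0

print(is_unengaged([3, 3, 3, 3, 3, 3, 3]))  # True: straight-lined the survey
print(is_unengaged([4, 2, 5, 1, 4, 3, 2]))  # False

ages = [22, 23, 24, 25, 26, 30, 55]  # one much older respondent
print(skewness(ages) > 0)         # True: the outlier drags the right tail out
print(excess_kurtosis(ages) > 0)  # True: the outlier also fattens the tails
```

Note that SPSS and similar packages report sample-adjusted versions of these moments, so expect small numerical differences on small samples.<br />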
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don't forget to discuss response rate (the number of responses as a percentage of the number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach's alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the affected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development articles on your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with five to eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a harder-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read each item out loud and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends”, then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, including existing scales. They then sort these cards into piles based on which construct they think each item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access; doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time, although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are now ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically studies of organizations regarding performance and employee dispositions and intentions are simple and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to obtain only a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to bounce, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
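Step 6 above mentions obtaining Cronbach's alpha during a pilot study. As a rough illustration of what that statistic is, here is a minimal pure-Python sketch; the item names and responses are hypothetical:

```python
# A minimal sketch of Cronbach's alpha, the reliability statistic a pilot
# study is used to obtain. Item names and responses are hypothetical.
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    n = len(items[0])

    def pvar(xs):  # population variance, the conventional choice here
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(pvar(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]  # total per respondent
    return (k / (k - 1)) * (1 - sum_item_vars / pvar(totals))

# Five hypothetical respondents answering three 5-point Likert "Enjoyment" items
enjoy1 = [5, 4, 4, 2, 3]
enjoy2 = [5, 4, 5, 2, 2]
enjoy3 = [4, 5, 4, 1, 3]
alpha = cronbach_alpha([enjoy1, enjoy2, enjoy3])
print(round(alpha, 2))  # 0.92 for these made-up data
```

The conventional rule of thumb is that an alpha of roughly .70 or higher indicates acceptable reliability for a reflective scale, and with 30-100 pilot responses this is quick to check one construct at a time.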
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” options if possible. These get coded as 0, 6, 8, etc., but the number is completely invalid. However, when you’re doing statistics on the data, your statistics software doesn’t know that those numbers are invalid, so it uses them as actual data points. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end, so these items rarely capture the trait the way you intend. When I design new surveys, I nearly always re-reverse any reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. More subtle such measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
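If you do end up with reverse-coded items in already-collected data (per guideline 10), they must be flipped before computing scale scores. A minimal sketch, assuming a 5-point Likert scale; the item wording and responses are hypothetical:

```python
# Flip a reverse-coded item so that higher values mean more of the trait:
# new = (low + high) - old. Item wording and responses are hypothetical.

def recode_reverse(responses, low=1, high=5):
    """Flip a reverse-coded item measured on a low..high Likert scale."""
    return [(low + high) - r for r in responses]

# "Using the VR was boring" (reverse coded): a 5 means very boring, i.e.,
# low Enjoyment, so flip it before combining with the regular items.
boring = [5, 4, 2, 1, 3]
print(recode_reverse(boring))  # [1, 2, 4, 5, 3]
```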
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run'''] <br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin]<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
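The "Interactions" steps above (standardize the constituent variables, then compute the product term) can be sketched in a few lines. The variable names and data here are hypothetical; in practice you would do this in your statistics package on the observed or imputed scores:

```python
# Sketch of creating a moderation (interaction) term: standardize the IV
# and the moderator, then multiply them. Names and data are hypothetical.

def standardize(xs):
    """Z-score a variable (mean 0, sample standard deviation 1)."""
    m = sum(xs) / len(xs)
    sd = (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5
    return [(x - m) / sd for x in xs]

enjoyment = [3, 4, 5, 2, 4, 1]   # hypothetical IV
experience = [1, 2, 5, 3, 4, 2]  # hypothetical moderator

z_enjoy = standardize(enjoyment)
z_exper = standardize(experience)

# The product term is entered into the structural model alongside both
# standardized constituents; plot the interaction if it is significant.
enjoy_x_exper = [a * b for a, b in zip(z_enjoy, z_exper)]
```

Standardizing first keeps the constituent and product terms on comparable scales and reduces nonessential multicollinearity between them.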
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling: some descriptive statistics (e.g., demographics such as education and experience) and the sample size; don't forget to discuss response rate (the number of responses as a percentage of the number of people invited to do the study).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach's alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1329650Confirmatory Factor Analysis2018-05-11T04:18:31Z<p>Jgaskin: /* Zero and Equal Constraints */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/wV6UudZSBCA '''Model Fit Thresholds''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
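AMOS reports these indices directly, but it can be useful to see how they derive from the model and baseline (independence) chi-square values. Here is a minimal sketch in Python; the chi-square values, degrees of freedom, and sample size are hypothetical.<br />

```python
import math

def fit_indices(chi2_m, df_m, chi2_b, df_b, n):
    """Compute CFI, TLI, and RMSEA from model (m) and baseline (b) chi-squares."""
    d_m = max(chi2_m - df_m, 0.0)  # model non-centrality
    d_b = max(chi2_b - df_b, 0.0)  # baseline non-centrality
    cfi = 1.0 - d_m / max(d_b, d_m, 1e-12)
    tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)
    rmsea = math.sqrt(d_m / (df_m * (n - 1)))
    return cfi, tli, rmsea

# Hypothetical: model chi2 = 85.3 on 48 df, baseline chi2 = 950.0 on 66 df, N = 250
cfi, tli, rmsea = fit_indices(85.3, 48, 950.0, 66, 250)
print(round(cfi, 3), round(tli, 3), round(rmsea, 3))
```

Here CFI comes out around 0.958 and RMSEA around 0.056, which would satisfy the Hu and Bentler (1999) guidelines; always check against the contextualized thresholds as well.<br />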
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms ('''however, some argue that there are never appropriate reasons to covary errors'''), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
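Since the cutoff is a simple absolute-value rule, flagging problem pairs in the residuals matrix is easy to script. A minimal sketch in Python; the item names and residual matrix below are hypothetical.<br />

```python
import numpy as np

items = ["q1", "q2", "q3", "q4"]
# Hypothetical standardized residual covariance matrix (symmetric)
src = np.array([
    [0.00,  0.41, -2.71, 0.12],
    [0.41,  0.00,  1.10, 3.05],
    [-2.71, 1.10,  0.00, 0.22],
    [0.12,  3.05,  0.22, 0.00],
])

# Flag item pairs whose absolute SRC exceeds the 2.58 cutoff
rows, cols = np.where(np.triu(np.abs(src) > 2.58))
flagged = [(items[r], items[c]) for r, c in zip(rows, cols)]
print(flagged)  # [('q1', 'q3'), ('q2', 'q4')]
```
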
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
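The CR and AVE formulas are straightforward to compute from the standardized loadings AMOS reports. A minimal sketch, with hypothetical loadings for a single four-item factor:<br />

```python
# Hypothetical standardized loadings for one four-item factor
loadings = [0.78, 0.81, 0.74, 0.69]

sum_l = sum(loadings)                     # sum of loadings
sum_l2 = sum(l**2 for l in loadings)      # sum of squared loadings
sum_err = sum(1 - l**2 for l in loadings) # sum of error variances

cr = sum_l**2 / (sum_l**2 + sum_err)  # Composite Reliability
ave = sum_l2 / len(loadings)          # Average Variance Extracted

print(round(cr, 3), round(ave, 3))  # 0.842 0.572
```

Here CR is about 0.842 and AVE about 0.572, which would clear the thresholds above; repeat this per factor, then compare the square root of each AVE against the inter-construct correlations for discriminant validity.<br />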
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error." (Malhotra and Dash, 2011, p.702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias, you can run a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to be just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
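As a quick approximation of the single-factor share of variance, you can inspect the largest eigenvalue of the item correlation matrix (a principal-component shortcut, not a substitute for the constrained EFA described above). The data below are randomly generated, so the check should come out clean:<br />

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical item responses: 200 respondents x 8 items
data = rng.normal(size=(200, 8))

corr = np.corrcoef(data, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order

# Share of total variance attributable to the first (largest) factor
first_factor_share = eigvals[-1] / eigvals.sum()

# CMB is a concern if one factor explains the majority of the variance
print(first_factor_share > 0.5)
```
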
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardised regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200) then you will want to retain the CLF as you either impute composites from factor scores, or as you move in to the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is simply an extended, and more accurate way to do the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
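A related (and simpler) marker technique, from Lindell and Whitney (2001), partials an estimate of common method variance out of the observed correlations, using the marker's smallest correlation with the substantive constructs as the CMV estimate. This is not the AMOS CLF-plus-marker procedure described above; it is just an illustrative sketch with hypothetical numbers:<br />

```python
def marker_adjusted_r(r_xy, r_marker):
    """Partial an assumed common method variance estimate out of a correlation
    (Lindell & Whitney, 2001).

    r_xy: observed correlation between two substantive constructs
    r_marker: the marker variable's smallest correlation with the
              substantive constructs (the CMV estimate)
    """
    return (r_xy - r_marker) / (1 - r_marker)

# Hypothetical: observed r = 0.45, marker correlation = 0.10
print(round(marker_adjusted_r(0.45, 0.10), 3))  # 0.389
```

If the adjusted correlation remains significant, the relationship is unlikely to be an artifact of common method variance.<br />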
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If the change in AVE is extreme (e.g., >.300), then there is too much shared variance attributable to response bias. This means that construct is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables. If you also are able to retain the CLF (i.e., it does not break your model), then you keep it while imputing. If you have only connected the CLF to the observed variables (and not the SB construct), then make sure to use the SB construct as a control variable in the causal model.<br />
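The zero- and equal-constrained steps above both hinge on a chi-square difference test between nested models. The test itself can be sketched as follows; the chi-square values and degrees of freedom are hypothetical, and SciPy is assumed to be available:<br />

```python
from scipy.stats import chi2

def chi_square_difference(chi2_c, df_c, chi2_u, df_u):
    """Chi-square difference test: constrained (c) vs. unconstrained (u) model."""
    d_chi2 = chi2_c - chi2_u
    d_df = df_c - df_u
    p = chi2.sf(d_chi2, d_df)  # survival function = 1 - CDF
    return d_chi2, d_df, p

# Hypothetical: zero-constrained model chi2 = 312.4 (df = 190)
# vs. unconstrained model chi2 = 301.2 (df = 178)
d, ddf, p = chi_square_difference(312.4, 190, 301.2, 178)
print(f"delta chi2 = {d:.1f}, delta df = {ddf}, p = {p:.3f}")
```

A non-significant p-value means the constrained and unconstrained models are invariant (e.g., for the zero-constrained test, no detectable specific response bias).<br />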
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time, as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1303334Structural Equation Modeling2018-04-27T21:41:46Z<p>Jgaskin: /* Multi-group effects */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.“ http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slides presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively affect weight loss"<br />
===Mediated effects===<br />
<br />
"Exercise mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise strengthens the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time dampens the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"The relationship between X and Y is stronger for Group A."<br />
<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for Western diets the effect is positive and weak, and for Eastern (Asian) diets the effect is positive and strong"<br />
<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and exercise''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
In order for a hypothesis to be supported, many criteria must be met. These criteria can be classified as global or local tests. In order for a hypothesis to be supported, the local test must be met, but in order for a local test to have meaning, all global tests must be met. Global tests of model fit are the first necessity. If a hypothesized relationship has a significant p-value, but the model has poor fit, we cannot have confidence in that p-value. Next is the global test of variance explained or R-squared. We might observe significant p-values and good model fit, but if R-square is only 0.025, then the relationships we are testing are not very meaningful because they do not explain sufficient variance in the dependent variable. The figure below illustrates the precedence of global and local tests. Lastly, and almost needless to explain, if a regression weight is significant, but is in the wrong direction, our hypothesis is not supported. Instead, there is counter-evidence. For example, if we theorized that exercise would increase weight loss, but instead, exercise decreased weight loss, then we would have counter-evidence.<br />
<br />
[[File:globallocal.png]]<br />
<br />
== Controls ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Controls.pptx '''Controls''']<br />
Controls are potentially confounding variables that we need to account for, but that don’t drive our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying, that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.<br />
<br />
[[File:controlsIQ.png]]<br />
<br />
As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the higher your sample size needs to be. Adding controls will also raise the R-squared, but with increasingly smaller gains for each added control. Sometimes you may even find that adding a control “drowns out” all the effects of the IVs; in such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, account for a small amount of the variance in the DV). With that in mind, you can’t and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.<br />
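In regression terms, "including a control" just means adding it as another predictor. A minimal sketch with NumPy on simulated data (the variable names tv_time, iq, and performance are hypothetical, echoing the example above):<br />

```python
import numpy as np

# Hypothetical simulated data: school performance driven by TV time and IQ.
rng = np.random.default_rng(1)
n = 500
tv_time = rng.normal(size=n)
iq = rng.normal(size=n)                      # the control variable
performance = -0.5 * tv_time + 0.3 * iq + rng.normal(scale=0.5, size=n)

# Entering the control simply means adding it as another column of predictors:
X = np.column_stack([np.ones(n), tv_time, iq])
beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
print(f"TV-time effect controlling for IQ: {beta[1]:.2f}")  # near -0.5
```

The TV-time coefficient is now the effect of TV time ''holding IQ constant'', which is what the control is for.<br />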
<br />
Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don’t have arrows going into them), and have them regress on whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control with regard to valLong. And I have LoyRepeat as a control on LoyLong. I’ve also covaried the controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than by default. However, there are different schools of thought on this. The downside of covarying with all exogenous variables is that each estimated covariance costs a degree of freedom. If you are in need of degrees of freedom, then try removing the non-significant covariances with controls.<br />
<br />
[[File:controlsAMOS.png]]<br />
<br />
When reporting the model, you '''''do''''' need to include the controls in '''''all''''' your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don’t get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how a control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are also other schools of thought on this issue).<br />
<br />
== Mediation ==<br />
*[[File:books.jpg]]'''''Lesson:''''' [http://www.kolobkreations.com/Mediation%20Step%20by%20Step%20with%20Bootstrapping.pptx '''Testing Mediation using Bootstrapping''']<br />
*[[File:YouTube.png]] '''''Video Lecture:''''' [http://youtu.be/j_yufPUjkwk?hd=1 '''A Simpler Guide to Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/0artfnxyF_A '''Mediation in AMOS''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/41XgTZc66ko '''Specific Indirect Effects''']<br />
*'''''Hair et al.:''''' ''pp. 751-755''<br />
<br />
=== Concept ===<br />
<br />
Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, ''work effectiveness'' may be a good mediator. We would say that work effectiveness mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is ''better'' explained through the mediator of work effectiveness. The logic is that intelligent workers tend to perform better '''because''' they work more effectively. Thus, when intelligence leads to working smarter, we observe greater performance. <br />
<br />
[[File:mediation.png]]<br />
<br />
<br />
We used to theorize three main types of mediation based on the Baron and Kenny approach; namely: 1) partial, 2) full, and 3) indirect. However, recent literature suggests that mediation is less nuanced than this -- that simply, if a significant indirect effect exists, then mediation is present.<br />
<br />
Here is another useful site for mediation: https://msu.edu/~falkcarl/mediation.html<br />
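The idea that mediation is established by a significant bootstrapped indirect effect can be sketched as follows. This is a simplified illustration on synthetic data, reusing the intelligence/effectiveness/performance example; it uses simple (not partial) slopes for the a and b paths, whereas AMOS estimates b while controlling for the IV:<br />

```python
import numpy as np

# Simulated data for X -> M -> Y (names follow the example above).
rng = np.random.default_rng(2)
n = 300
intelligence = rng.normal(size=n)
effectiveness = 0.5 * intelligence + rng.normal(size=n)   # a path
performance = 0.4 * effectiveness + rng.normal(size=n)    # b path

def slope(x, y):
    # OLS slope of y on x (single predictor)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Bootstrap the indirect effect a*b by resampling rows with replacement.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    a = slope(intelligence[idx], effectiveness[idx])
    b = slope(effectiveness[idx], performance[idx])
    boot.append(a * b)

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"95% bootstrap CI for the indirect effect: [{lo:.3f}, {hi:.3f}]")
# If the interval excludes zero, a significant indirect effect
# (i.e., mediation) is present.
```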
<br />
== Interaction ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=K34sF_AmWio '''Testing Interaction Effects''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Interaction.pptx '''Interaction Effects''']<br />
===Concept===<br />
In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. <br />
This is another form of moderation (along with multi-grouping) – i.e., the X to Y relationship changes form (gets stronger, weaker, or changes sign) depending on the value of another explanatory variable (the moderator). So, for example:<br />
*you lose 1 pound of weight for every hour you exercise<br />
*you lose 1 pound of weight for every 500 calories you cut back from your regular diet<br />
*but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus, in total, you lose three pounds<br />
So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:<br />
*Chocolate is yummy<br />
*Cheese is yummy<br />
*but combining chocolate and cheese is yucky!<br />
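The weight-loss arithmetic above maps onto a regression-style equation with a product (interaction) term. The coefficients come from the example; the units (hours of exercise, 500-calorie diet reductions) are assumptions for illustration:<br />

```python
def pounds_lost(exercise, diet):
    # exercise: hours exercised; diet: 500-calorie reductions (hypothetical units)
    b_exercise, b_diet, b_interaction = 1.0, 1.0, 1.0
    return b_exercise * exercise + b_diet * diet + b_interaction * exercise * diet

print(pounds_lost(1, 0))  # exercise only: 1 pound
print(pounds_lost(0, 1))  # diet only: 1 pound
print(pounds_lost(1, 1))  # both: 3 pounds -- more than the 2-pound sum of main effects
```

The product term is what lets the combined effect exceed (or, as in the chocolate-and-cheese case, fall short of) the sum of the two main effects.<br />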
<br />
The following figure is an example of a simple interaction model.<br />
<br />
[[File:interaction.png]]<br />
<br />
===Types===<br />
Interactions enable more precise explanation of causal effects by providing a method for explaining not only ''how'' X affects Y, but also ''under what circumstances'' the effect of X changes depending on the moderating variable Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically. <br />
<br />
[[File:interactionTypes.png]]<br />
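Plotting an interaction usually comes down to computing simple slopes: the effect of X on Y at low (-1 SD) and high (+1 SD) values of the moderator Z. A sketch with made-up coefficients:<br />

```python
# Simple-slopes calculation for interpreting an interaction.
# b_x is the main effect of X; b_xz is the interaction coefficient.
# All values are hypothetical, chosen only for illustration.
b_x, b_xz = 0.50, 0.25
sd_z = 1.0

slope_low = b_x + b_xz * (-sd_z)   # effect of X when Z is 1 SD below its mean
slope_high = b_x + b_xz * (+sd_z)  # effect of X when Z is 1 SD above its mean
print(slope_low, slope_high)  # 0.25 0.75 -- the X->Y effect strengthens as Z rises
```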
<br />
== Model fit again ==<br />
You already assessed model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. Every time the model changes and a hypothesis is tested, model fit must be assessed. If multiple hypotheses are tested on the same model, model fit will not change, so it only needs to be addressed once for that set of hypotheses. The method for assessing model fit in a causal model is the same as for a measurement model: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms. '''Also, a warning: some argue there is never an appropriate argument for covarying error terms.''' (I tend to agree that they should not be covaried.)<br />
*If the correlated variables are ''not'' logically '''causally''' correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.<br />
**e.g., burnout from customers is highly correlated with burnout from management<br />
**We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.<br />
*If the correlated variables are logically '''causally''' correlated, then simply add a regression line.<br />
**e.g., burnout from customers is highly correlated with satisfaction with customers<br />
**We expect burnC to predict satC, so ''not'' accounting for it is negligent.<br />
<br />
Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
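For reference, CFI and RMSEA can be recomputed by hand from the chi-square values AMOS reports, using the standard formulas. The input numbers below are hypothetical, and note that RMSEA conventions vary between N and N-1 in the denominator, so small discrepancies with software output are possible:<br />

```python
import math

def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    # Comparative Fit Index: 1 minus the ratio of the model's noncentrality
    # to the baseline (independence) model's noncentrality.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model, 0.0)
    return 1.0 - d_model / d_base if d_base > 0 else 1.0

def rmsea(chi2_model, df_model, n):
    # Root Mean Square Error of Approximation (single group, N-1 convention).
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

# Hypothetical fit output, not from any real model:
print(round(cfi(100, 40, 1000, 55), 3))   # 0.937
print(round(rmsea(100, 40, 300), 3))      # 0.071
```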
<br />
== Multi-group ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mirI5ETQRTA '''Testing Multi-group Moderation using Chi-square difference test'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/w5ikoIgTIc0?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Testing Multi-group differences using AMOS's multigroup function''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Mediation%20and%20Multi-group%20Moderation.pptx '''Mediation versus Moderation''']<br />
Multi-group comparisons are a special form of moderation in which a dataset is split along values of a grouping variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. The purpose of multi-group comparisons is to determine whether relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for example. A multi-group analysis would answer the question: does dieting affect weight loss differently for males than for females?<br />
In the videos above, you will learn how to set up a multigroup analysis in AMOS and test it using chi-square differences and AMOS's built-in multigroup function. For those who have seen my video on the critical ratios approach, be warned that currently the chi-square approach is the most widely accepted, because the critical ratios approach doesn't take into account the family-wise error that affects a model when testing multiple hypotheses simultaneously. For now, I recommend using the chi-square approach. The AMOS built-in multigroup function uses the chi-square approach as well.<br />
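The chi-square difference test itself is simple arithmetic: subtract the chi-square and df of the unconstrained model from those of the constrained model, then look up the p-value for that difference. A sketch with hypothetical fit values (SciPy):<br />

```python
from scipy.stats import chi2

# Hypothetical chi-square/df values from a multi-group run: the constrained
# model fixes the path(s) equal across groups; the unconstrained model frees them.
chi2_constrained, df_constrained = 312.4, 150
chi2_unconstrained, df_unconstrained = 306.1, 148

d_chi2 = chi2_constrained - chi2_unconstrained   # chi-square difference
d_df = df_constrained - df_unconstrained         # df difference
p = chi2.sf(d_chi2, d_df)
print(f"chi-square diff = {d_chi2:.1f} on {d_df} df, p = {p:.3f}")
# p < .05 here -> the groups differ significantly on the freed path(s).
```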
<br />
==From Measurement Model to Structural Model ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n-ULF6BGVw0 '''From CFA to SEM in AMOS''']<br />
Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.<br />
<br />
==Creating Factor Scores from Latent Factors==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=dsOS9tQjxW8 '''Imputing Factor Scores in AMOS''']<br />
If you would like to create factor scores (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:<br />
*You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data). <br />
*Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").<br />
After those two caveats are addressed, then you can simply go to the ''Analyze'' menu, and select ''Data Imputation''. Select ''Regression Imputation'', and then click on the ''Impute'' button. This will create a new SPSS dataset with the same name as the current dataset except it will be followed by an "_C". This can be found in the same folder as your current dataset.<br />
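If you prefer to handle the missing-data pre-step in Python rather than SPSS or Excel, a simple mean imputation looks like this (the column names are hypothetical):<br />

```python
import numpy as np
import pandas as pd

# Hypothetical survey items with missing responses; AMOS's Data Imputation
# step requires that these gaps be filled (or the rows removed) beforehand.
df = pd.DataFrame({
    "item1": [4.0, 5.0, np.nan, 3.0],
    "item2": [2.0, np.nan, 4.0, 4.0],
})

# Replace each missing value with its column mean.
df_imputed = df.fillna(df.mean())
print(df_imputed.isna().sum().sum())  # 0 -- no missing values remain
```

Mean imputation is only one option; more principled methods (e.g., regression or multiple imputation) are generally preferred when missingness is substantial.<br />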
<br />
==Need more degrees of freedom==<br />
Did you run your model and observe that DF=0 or CFI=1.000? It sounds like you need more degrees of freedom. There are a few ways to get them:<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
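The bookkeeping behind these tips follows the standard SEM counting rule: degrees of freedom equal the number of distinct sample moments minus the number of freely estimated parameters, so each dropped path or covariance buys one df:<br />

```python
def model_df(n_observed, n_free_params):
    # Distinct variances and covariances among the observed variables,
    # minus the number of freely estimated parameters (add n_observed
    # more moments if means are also modeled).
    return n_observed * (n_observed + 1) // 2 - n_free_params

# Hypothetical model: 6 observed variables, 13 free parameters
print(model_df(6, 13))  # 21 - 13 = 8 degrees of freedom
```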
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1303332Structural Equation Modeling2018-04-27T21:40:59Z<p>Jgaskin: /* Mediated effects */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.“ http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively affect weight loss"<br />
===Mediated effects===<br />
<br />
"Exercise mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise strengthens the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time dampens the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, and for eastern (Asian) diets the effect is positive and strong"<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and exercise''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1303330Structural Equation Modeling2018-04-27T21:40:02Z<p>Jgaskin: /* Interaction effects */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.” http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance on how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively affect weight loss"<br />
===Mediated effects===<br />
For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the degree of the mediation (partial, full, or simply indirect), and the direction of the mediated relationship (positive or negative).<br />
<br />
"Exercise positively and partially mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time positively and fully mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss positively and indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise strengthens the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time dampens the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, while for eastern (Asian) diets the effect is positive and strong"<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and exercise''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
In order for a hypothesis to be supported, many criteria must be met. These criteria can be classified as global or local tests. For a hypothesis to be supported, the local test must be met, but for a local test to have meaning, all global tests must be met. Global tests of model fit are the first necessity. If a hypothesized relationship has a significant p-value but the model has poor fit, we cannot have confidence in that p-value. Next is the global test of variance explained, or R-squared. We might observe significant p-values and good model fit, but if R-squared is only 0.025, then the relationships we are testing are not very meaningful because they do not explain sufficient variance in the dependent variable. Lastly, and almost needless to explain, if a regression weight is significant but in the wrong direction, our hypothesis is not supported; instead, there is counter-evidence. For example, if we theorized that exercise would increase weight loss, but instead exercise decreased weight loss, then we would have counter-evidence. The figure below illustrates the precedence of global and local tests.<br />
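The "variance explained" check described above can be sketched as a plain R-squared computation. This is a generic illustration (not AMOS output); the function name and toy inputs are my own:<br />

```python
def r_squared(y_actual, y_predicted):
    """Proportion of variance in y explained by the model:
    R^2 = 1 - SS_residual / SS_total."""
    mean_y = sum(y_actual) / len(y_actual)
    ss_res = sum((ya - yp) ** 2 for ya, yp in zip(y_actual, y_predicted))
    ss_tot = sum((ya - mean_y) ** 2 for ya in y_actual)
    return 1 - ss_res / ss_tot
```

A model whose predictions equal the observed values yields R-squared of 1.0; predicting only the mean yields 0.0, and values near 0.025 signal a model with little explanatory value even if individual paths are significant.<br />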
<br />
[[File:globallocal.png]]<br />
<br />
== Controls ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Controls.pptx '''Controls''']<br />
Controls are potentially confounding variables that we need to account for, but that don’t drive our theory. For example, in Dietz and Gortmaker (1985), the theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying that, regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.<br />
<br />
[[File:controlsIQ.png]]<br />
<br />
As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the larger your sample needs to be. Adding controls also yields a higher R-squared, but with increasingly smaller gains for each added control. Sometimes you may even find that adding a control “drowns out” all the effects of the IVs; in such a case, you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, account for only a small amount of the variance in the DV). With that in mind, you can’t and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.<br />
<br />
Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don’t have arrows going into them), and have them regress on whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control with regard to valLong. And I have LoyRepeat as a control on LoyLong. I’ve also covaried the controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than by default. However, there are different schools of thought on this. The downside of covarying with all exogenous variables is that you gain no degrees of freedom. If you are in need of degrees of freedom, then try removing the non-significant covariances with controls.<br />
<br />
[[File:controlsAMOS.png]]<br />
<br />
When reporting the model, you '''''do''''' need to include the controls in '''''all''''' your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don’t get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how the control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are also other schools of thought on this issue).<br />
<br />
== Mediation ==<br />
*[[File:books.jpg]]'''''Lesson:''''' [http://www.kolobkreations.com/Mediation%20Step%20by%20Step%20with%20Bootstrapping.pptx '''Testing Mediation using Bootstrapping''']<br />
*[[File:YouTube.png]] '''''Video Lecture:''''' [http://youtu.be/j_yufPUjkwk?hd=1 '''A Simpler Guide to Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/0artfnxyF_A '''Mediation in AMOS''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/41XgTZc66ko '''Specific Indirect Effects''']<br />
*'''''Hair et al.:''''' ''pp. 751-755''<br />
<br />
=== Concept ===<br />
<br />
Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, ''work effectiveness'' may be a good mediator. We would say that work effectiveness mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is ''better'' explained through the mediator of work effectiveness. The logic is: intelligent workers tend to perform better '''because''' they work more effectively. Thus, when intelligence leads to working smarter, we observe greater performance. <br />
<br />
[[File:mediation.png]]<br />
<br />
<br />
We used to theorize three main types of mediation based on the Baron and Kenny approach; namely: 1) partial, 2) full, and 3) indirect. However, recent literature suggests that mediation is less nuanced than this -- simply, if a significant indirect effect exists, then mediation is present.<br />
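The "significant indirect effect" test mentioned above is typically done with a percentile bootstrap (as in the bootstrapping lesson linked at the top of this section). A minimal, generic sketch follows; it is not AMOS code, it uses simple unadjusted slopes for illustration (a full analysis would estimate the b path controlling for X), and all names and data are illustrative:<br />

```python
import random
import statistics

def slope(x, y):
    """OLS slope of y regressed on a single predictor x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

def bootstrap_indirect_effect(x, m, y, n_boot=2000, seed=42):
    """Percentile-bootstrap 95% CI for the indirect effect a*b,
    where a = slope of M on X and b = slope of Y on M.
    If the CI excludes zero, the indirect effect is significant."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample cases (rows) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [x[i] for i in idx]
        bm = [m[i] for i in idx]
        by = [y[i] for i in idx]
        estimates.append(slope(bx, bm) * slope(bm, by))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot)]
```

If the interval returned by `bootstrap_indirect_effect` does not contain zero, you would conclude that M mediates the X-to-Y relationship.<br />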
<br />
Here is another useful site for mediation: https://msu.edu/~falkcarl/mediation.html<br />
<br />
== Interaction ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=K34sF_AmWio '''Testing Interaction Effects''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Interaction.pptx '''Interaction Effects''']<br />
===Concept===<br />
In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. <br />
This is another form of moderation (along with multi-grouping) – i.e., the X to Y relationship changes form (gets stronger, weaker, or changes sign) depending on the value of another explanatory variable (the moderator). For example:<br />
*you lose 1 pound of weight for every hour you exercise<br />
*you lose 1 pound of weight for every 500 calories you cut back from your regular diet<br />
*but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus, in total, you lose three pounds<br />
So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:<br />
*Chocolate is yummy<br />
*Cheese is yummy<br />
*but combining chocolate and cheese is yucky!<br />
<br />
The following figure is an example of a simple interaction model.<br />
<br />
[[File:interaction.png]]<br />
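The diet-and-exercise arithmetic above corresponds to a regression equation with a multiplicative interaction term. Here is a toy sketch (not AMOS output) whose coefficients were chosen purely to reproduce those numbers:<br />

```python
def predicted_weight_loss(hours_exercised, calories_cut):
    """Toy linear model with an interaction term:
    loss = b1*E + b2*C + b3*(E*C).
    Coefficients are illustrative, matching the worked example."""
    b1 = 1.0        # pounds lost per hour of exercise
    b2 = 1.0 / 500  # pounds lost per calorie cut from the diet
    b3 = 1.0 / 500  # extra pounds per (hour x calorie) combination
    E, C = hours_exercised, calories_cut
    return b1 * E + b2 * C + b3 * E * C
```

One hour of exercise alone predicts 1 pound lost, cutting 500 calories alone predicts 1 pound, but doing both predicts 3 pounds -- the extra pound is the interaction (multiplicative) effect over and above the two main (additive) effects.<br />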
<br />
===Types===<br />
Interactions enable more precise explanation of causal effects by providing a method for explaining not only ''how'' X affects Y, but also ''under what circumstances'' the effect of X changes depending on the value of the moderating variable Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically. <br />
<br />
[[File:interactionTypes.png]]<br />
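The plotted lines in interaction figures like these are "simple slopes": the effect of X on Y evaluated at particular moderator values (commonly the mean plus and minus one standard deviation). A generic sketch of the arithmetic (function names and inputs are illustrative, not tied to any particular software):<br />

```python
def simple_slope(b_x, b_interaction, z_value):
    """Slope of Y on X at a given moderator value Z, from the model
    Y = b0 + b_x*X + b_z*Z + b_interaction*(X*Z):
    dY/dX = b_x + b_interaction * Z."""
    return b_x + b_interaction * z_value

def plot_points(b0, b_x, b_z, b_xz, x_vals, z_value):
    """Predicted Y values along x_vals at a fixed moderator value,
    i.e., one line of the interaction plot."""
    return [b0 + b_x * x + b_z * z_value + b_xz * x * z_value
            for x in x_vals]
```

Computing `plot_points` at low and high Z gives the two lines of the plot; comparing the resulting `simple_slope` values tells you whether the moderator strengthens, dampens, or reverses the X-to-Y effect.<br />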
<br />
== Model fit again ==<br />
You already did model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. Every time the model changes and a hypothesis is tested, model fit must be assessed. If multiple hypotheses are tested on the same model, model fit will not change, so it only needs to be addressed once for that set of hypotheses. The method for assessing model fit in a causal model is the same as for a measurement model: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms. '''Also, a warning that some argue there is never an appropriate argument for covarying error terms.''' (I tend to agree that they should not be covaried.)<br />
*If the correlated variables are ''not'' logically '''causally''' correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.<br />
**e.g., burnout from customers is highly correlated with burnout from management<br />
**We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.<br />
*If the correlated variables are logically '''causally''' correlated, then simply add a regression line.<br />
**e.g., burnout from customers is highly correlated with satisfaction with customers<br />
**We expect burnC to predict satC, so ''not'' accounting for it is negligent.<br />
<br />
Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
<br />
== Multi-group ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mirI5ETQRTA '''Testing Multi-group Moderation using Chi-square difference test'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/w5ikoIgTIc0?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Testing Multi-group differences using AMOS's multigroup function''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Mediation%20and%20Multi-group%20Moderation.pptx '''Mediation versus Moderation''']<br />
Multi-group comparisons are a special form of moderation in which a dataset is split along values of a grouping variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. Multi-group comparisons are used to determine whether the relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis, for example. A multi-group analysis would answer the question: does dieting affect weight loss differently for males than for females?<br />
In the videos above, you will learn how to set up a multigroup analysis in AMOS and test it using chi-square differences and AMOS's built-in multigroup function. For those who have seen my video on the critical ratios approach, be warned that currently the chi-square approach is the most widely accepted, because the critical ratios approach doesn't take into account the family-wise error that affects a model when testing multiple hypotheses simultaneously. For now, I recommend using the chi-square approach. The AMOS built-in multigroup function uses the chi-square approach as well.<br />
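The chi-square difference test itself is simple arithmetic: subtract the unconstrained model's chi-square and degrees of freedom from the constrained model's, then compare the difference to a chi-square critical value. A generic sketch (the example numbers and function name are my own; the critical values are standard .05 table values):<br />

```python
# Upper-tail .05 critical values of the chi-square distribution
# for small degrees of freedom (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_square_difference_test(chi2_constrained, df_constrained,
                               chi2_unconstrained, df_unconstrained):
    """Multi-group moderation test: constrained model (paths set
    equal across groups) vs. unconstrained model (paths free).
    Returns (delta_chi2, delta_df, significant). A significant
    result means the groups differ, i.e., moderation is present."""
    d_chi2 = chi2_constrained - chi2_unconstrained
    d_df = df_constrained - df_unconstrained
    significant = d_chi2 > CHI2_CRIT_05[d_df]
    return d_chi2, d_df, significant
```

For example, if constraining one path to be equal across males and females raises chi-square from 105.2 (df = 50) to 110.5 (df = 51), the difference of 5.3 on 1 df exceeds 3.841, so the path differs significantly by gender.<br />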
<br />
==From Measurement Model to Structural Model ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n-ULF6BGVw0 '''From CFA to SEM in AMOS''']<br />
Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.<br />
<br />
==Creating Factor Scores from Latent Factors==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=dsOS9tQjxW8 '''Imputing Factor Scores in AMOS''']<br />
If you would like to create factor scores (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:<br />
*You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data). <br />
*Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").<br />
After those two caveats are addressed, you can simply go to the ''Analyze'' menu and select ''Data Imputation''. Select ''Regression Imputation'', and then click on the ''Impute'' button. This will create a new SPSS dataset with the same name as the current dataset, except followed by "_C". It can be found in the same folder as your current dataset.<br />
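Conceptually, a regression-method factor score is a weighted sum of the (centered) item responses. AMOS derives the weights from the estimated model; the sketch below supplies them directly, so both the weights and the example numbers are purely illustrative:<br />

```python
def impute_factor_score(responses, weights, item_means):
    """Factor score as a weighted sum of centered item responses:
    score = sum_i w_i * (x_i - mean_i).
    Weights would come from the fitted model; here they are given."""
    return sum(w * (x - m)
               for w, x, m in zip(weights, responses, item_means))
```

A respondent who answered (4, 5, 3) on three items with means (3, 3, 3) and weights (0.5, 0.3, 0.2) would receive a factor score of 0.5(1) + 0.3(2) + 0.2(0) = 1.1.<br />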
<br />
==Need more degrees of freedom==<br />
Did you run your model and observe that DF = 0 or CFI = 1.000? It sounds like you need more degrees of freedom. There are a few ways to gain them:<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
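For reference, in covariance-based SEM the model degrees of freedom equal the number of distinct elements in the observed covariance matrix minus the number of freely estimated parameters. A quick sketch of that arithmetic (function name and example counts are illustrative):<br />

```python
def sem_degrees_of_freedom(n_observed_vars, n_estimated_params):
    """df = p(p+1)/2 - q, where p(p+1)/2 counts the distinct
    variances and covariances among p observed variables, and
    q is the number of freely estimated model parameters."""
    p = n_observed_vars
    distinct_moments = p * (p + 1) // 2
    return distinct_moments - n_estimated_params
```

A model with 4 observed variables and 10 estimated parameters has 10 - 10 = 0 degrees of freedom (just-identified), which is why removing unnecessary paths (i.e., reducing q) is what buys you degrees of freedom.<br />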
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Guidelines&diff=1302907Guidelines2018-04-27T15:50:40Z<p>Jgaskin: /* Some general guidelines for the order to conduct each procedure */</p>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the affected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development articles on your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with between five and eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read out loud each item and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, even existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often, time and target population constraints make a pilot study infeasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time – although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are now ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically studies of organizations regarding performance and employee dispositions and intentions are simple and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to return abandoned, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
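The pilot-study step above leans on Cronbach’s alpha as a quick reliability check on a reflective construct. Here is a minimal sketch of the standard alpha formula in Python with pandas; the item columns (enj1–enj5, echoing the Enjoyment example) and the tiny data frame are hypothetical illustration data, not real responses:

```python
# Cronbach's alpha for one reflective construct (pilot-study reliability check).
# Column names enj1..enj5 are hypothetical; substitute your own item columns.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = (k / (k-1)) * (1 - sum(item variances) / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Made-up 5-point Likert responses from six pilot participants.
df = pd.DataFrame({
    "enj1": [4, 5, 3, 4, 5, 2],
    "enj2": [4, 4, 3, 5, 5, 2],
    "enj3": [5, 5, 2, 4, 4, 3],
    "enj4": [4, 5, 3, 4, 5, 2],
    "enj5": [3, 4, 3, 5, 5, 1],
})
print(round(cronbach_alpha(df), 3))
```

By the usual rule of thumb, an alpha above roughly 0.7 suggests acceptable reliability for a pilot; below that, revisit the items as described above.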
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” if possible. These get coded as either 0 or 6 or 8, etc. but the number is completely invalid. However, when you’re doing statistics on it, your statistics software doesn’t know that those numbers are invalid, so it uses them as actual datapoints. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end. So they rarely capture the trait the way you intend. When I design new surveys, I nearly always re-reverse the reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. More subtle such measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way, such as: “My project team often, never mind, please respond with somewhat disagree”.<br />
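Two of the guidelines above – re-reversing reverse-coded items and catching attention-trap failures – reduce to a couple of lines of data cleaning. A sketch in Python with pandas; the column names and the trap’s correct answer are hypothetical examples, not anything prescribed on this page:

```python
# Re-reversing a reverse-coded item and dropping attention-trap failures.
# Column names (enjoy1, boring, trap) are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "enjoy1": [5, 4, 2, 3],  # "I enjoyed using the VR"
    "boring": [1, 2, 4, 1],  # "Using the VR was boring" (reverse coded)
    "trap":   [2, 2, 5, 2],  # "please respond with somewhat disagree" (= 2)
})

SCALE_MAX = 5
# On a 1-5 scale, re-reverse with (max + 1) - value, so a higher number
# always means more of the trait.
df["boring_r"] = (SCALE_MAX + 1) - df["boring"]

# Keep only respondents who passed the attention trap, then drop the trap column.
clean = df[df["trap"] == 2].drop(columns="trap")
print(clean[["enjoy1", "boring_r"]])
```

Here the third respondent answered the trap incorrectly and is removed, and `boring_r` now points in the same direction as the regular items.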
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run'''] <br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin]<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
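The case- and variable-screening steps in the checklist above (unengaged responses, missing data, skewness and kurtosis) can be sketched in Python with pandas. The thresholds used below – row standard deviation under 0.3 as a straight-lining flag, and skew or kurtosis beyond about |2| as a concern – are common rules of thumb, not requirements stated on this page, and the data frame is made-up illustration data:

```python
# Rough sketch of case screening (unengaged responses) and variable
# screening (missing data, skewness, kurtosis) on Likert data.
import pandas as pd

likert = pd.DataFrame({
    "q1": [3, 3, 5, 1, 4],
    "q2": [3, 4, 5, 2, 4],
    "q3": [3, 2, 4, 1, 5],
})

# Case screening: straight-lining shows up as near-zero variance within a row.
row_sd = likert.std(axis=1, ddof=1)
engaged = likert[row_sd >= 0.3]   # 0.3 is a common rule-of-thumb cutoff

# Variable screening: missing data per column, then skewness and kurtosis
# (inspect values against the roughly |2| rule of thumb).
print(likert.isna().sum())   # missing per column
print(likert.skew())         # skewness per column
print(likert.kurtosis())     # excess kurtosis per column
```

In this toy data the first respondent answered 3 to every item (row standard deviation of zero) and would be flagged as unengaged.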
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain the paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don't forget to discuss response rate (number of responses as a percentage of the number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach's alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, how much space to devote to them, or which measures to report and how. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the effected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators.<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1'''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg'''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search google scholar for scale development of your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If an existing item reads “I enjoy using the website”, you’ll want to change it to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with five to eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read each item out loud and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends”, then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, including existing scales. They then sort these cards into piles based on what construct they think each item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access; doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time, although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically, studies of organizations regarding performance and employee dispositions and intentions are straightforward and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require an oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to obtain only a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to bounce, three quarters of the remainder to go unread, and 90% of what is left to be ignored. That leaves only 125 responses, 20% of which may be unusable, for a final count of roughly 100 usable responses from the original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
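The pilot-study reliability check mentioned above (Cronbach's alpha for each reflective factor) can be sketched numerically. This is a minimal illustration with simulated data, not the SPSS workflow used elsewhere on this wiki; the formula is the standard one, and the sample data are made up:<br />

```python
import numpy as np

def cronbachs_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) array:
    alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical pilot data: 30 respondents answering five 5-point Likert items
# that all reflect one latent trait (so alpha should come out high)
rng = np.random.default_rng(42)
trait = rng.normal(3, 1, size=(30, 1))
items = np.clip(np.rint(trait + rng.normal(0, 0.5, size=(30, 5))), 1, 5)

alpha = cronbachs_alpha(items)
print(f"Cronbach's alpha: {alpha:.2f}")  # above roughly .7 is conventionally acceptable
```

A pilot sample of 30 is enough for this kind of piecewise check even though it cannot support the full measurement model.<br />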
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” options if possible. These get coded as some number (e.g., 0, 6, or 8) that is completely invalid. However, when you run statistics on the data, your software doesn’t know that those numbers are invalid, so it uses them as actual datapoints. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end, so these items rarely capture the trait the way you intend. When I design new surveys, I nearly always rewrite reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face-to-face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. More subtle such measures might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
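At the data stage, two of the guidelines above (invalid “N/A” codes, and reverse-coded items that survive into the dataset) reduce to simple recodes. A hedged sketch in Python; the response values, and the code 6 standing for “N/A” on a 5-point scale, are made up for illustration:<br />

```python
import numpy as np

# Hypothetical responses to one 5-point Likert item, where "N/A" was
# (unfortunately) coded as 6 -- an invalid number the software can't recognize
responses = np.array([1, 4, 6, 2, 5, 6, 3], dtype=float)

# 1) Recode the invalid "N/A" value as missing so it never enters the statistics
responses[responses == 6] = np.nan

# 2) Re-reverse a reverse-coded item on a 5-point scale: new = (5 + 1) - old,
#    so 1 <-> 5, 2 <-> 4, and 3 stays put (NaN stays NaN)
re_reversed = (5 + 1) - responses  # values: 5, 2, NaN, 4, 1, NaN, 3
```

The same `(scale maximum + 1) - old` recode generalizes to 7-point scales as `8 - old`.<br />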
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run''']<br />
*The SEM Speed Run does almost everything listed below. However, I've also added below a few more links for the few items that either are not covered in the speed run, or have been updated since the speed run was made.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin]<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
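The case-screening step at the top of this list (flagging unengaged, straight-lining respondents) can be sketched as follows. The data and the 0.5 standard-deviation cutoff are illustrative assumptions, not a prescription, and this stands in for what is normally done in SPSS:<br />

```python
import numpy as np

# Hypothetical survey matrix: rows = respondents, columns = Likert items
data = np.array([
    [4, 5, 4, 3, 4],
    [3, 3, 3, 3, 3],   # straight-liner: answers every item identically
    [2, 1, 2, 5, 1],
    [5, 4, 5, 4, 5],
])

# Unengaged responses show (near-)zero variability across their own answers
row_sd = data.std(axis=1, ddof=1)
unengaged = row_sd < 0.5   # the 0.5 cutoff is a judgment call, not a rule

print("unengaged rows:", np.where(unengaged)[0])  # flags only row 1
```

Rows flagged this way are candidates for removal before the EFA, alongside rows with excessive missing data.<br />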
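Likewise, the interaction steps above (standardize the constituent variables, then compute the product term) reduce to a few lines. The variable names are hypothetical stand-ins for imputed factor scores:<br />

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
enjoyment = rng.normal(3.5, 0.8, n)    # hypothetical IV (imputed factor score)
experience = rng.normal(2.0, 1.0, n)   # hypothetical moderator

def standardize(x):
    return (x - x.mean()) / x.std(ddof=1)

# 1) Standardize the constituent variables first; this eases interpretation and
#    reduces collinearity between the product term and its components
z_enjoy = standardize(enjoyment)
z_exper = standardize(experience)

# 2) Compute the product term that carries the moderation effect
interaction = z_enjoy * z_exper
```

The product term then enters the structural model as a predictor of the DV alongside its two standardized components.<br />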
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don’t forget to discuss response rate (number of responses as a percentage of number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach’s alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Main_Page&diff=1302900Main Page2018-04-27T15:46:21Z<p>Jgaskin: /* Welcome to Gaskination's StatWiki! */</p>
<hr />
<div>== Welcome to Gaskination's StatWiki! ==<br />
''''' Supported by the Doctor of Management Program at Case Western Reserve University and by Brigham Young University'''''<br />
<br />
This wiki has been created to provide you with all sorts of statistics tutorials to guide you through the standard statistical analyses common to hypothesis testing in the social sciences. Examples are geared toward organizational, business, and management fields. AMOS, SPSS, Excel, SmartPLS and PLS-graph are used to perform all analyses provided on this wiki. This wiki is not exhaustive, or even very comprehensive. I provide ''brief'' explanations of concepts, rather than full-length instruction. My main focus is on providing guidance on ''how to perform'' the statistics. This is very much a mechanically oriented resource. For more comprehensive instruction on the methods demonstrated in this wiki, please refer to Hair et al. (2010, ''Multivariate Data Analysis''), as well as to the PowerPoint presentations offered for most of the topics. I hope you find the resources here useful. I will likely update them from time to time. <br />
<br />
This teaching material has been developed as part of a quantitative social science method sequence aimed to prepare [http://weatherhead.case.edu/degrees/doctor-management Doctor of Management] students for their quantitative research project. These students are working executives who carry out a rigorous quantitative project as part of their research stream. Examples of these projects and examples of how to report results of quantitative research projects in academic papers can be found in the [http://weatherhead.case.edu/degrees/doctor-management/dm-research DM Research Library]. <br />
<br />
'''Acknowledgments'''<br />
<br />
The materials and teaching approach adopted in these materials have been developed by a team of teachers consisting of Jagdip Singh, Toni Somers, Kalle Lyytinen, Nick Berente, Shyam Giridharadas and me over the last several years. Although I have developed and refined much of the material and the resources in this Wiki, I am not the sole contributor. I greatly appreciate the work done by Kalle Lyytinen (Case Western Reserve University), Toni Somers (Wayne State University), Nick Berente (University of Georgia), Shyam Giridharadas (University of Washington) and Jagdip Singh (Case Western Reserve University), who selected and identified much of the literature underlying the materials and also originally developed many of the PowerPoint slides. They also helped me refine these materials by providing useful feedback on the slides and videos. I also appreciate the contribution and help of Jagdip Singh (Case Western Reserve University), who is the owner of the Sohana and Bencare datasets used in the examples, which are made available below. I also acknowledge the continued support of the [http://weatherhead.case.edu/degrees/doctor-management Doctor of Management Program] at the [http://weatherhead.case.edu Weatherhead School of Management] at [http://www.case.edu Case Western Reserve University], Cleveland, Ohio for their involvement, support, and sponsoring of this wiki, as well as Brigham Young University for encouraging me in all my SEM-related endeavors.<br />
<br />
''Please report any problems with the wiki to james.eric.gaskin@gmail.com'' [mailto:james.eric.gaskin@gmail.com]<br />
*If you are having trouble and cannot figure out what to do, even after using the resources on this wiki or on Gaskination, then you might benefit from the archive of support emails I have received and responded to over the past years: [http://www.kolobkreations.com/StatsHelpArchive.pdf Stats Help Archive].<br />
*[[File:Excelicon.jpg]] You may find this set of Excel tools useful or even necessary for many of the analyses you will learn about in this wiki: [http://www.kolobkreations.com/Stats%20Tools%20Package.xlsm '''''Stats Tools Package'''''] Please note that this is the most recently updated version, and it does not include a variance column in the Validity Master sheet. This is because it was a mistake to include variances when working with standardized estimates. <br />
*You may also find this basics tutorial for AMOS and SPSS useful as a starter.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=efC81f-Z22Q '''Basic Analysis in AMOS and SPSS'''] <br />
<br />
'''Datasets'''<br />
<br />
Here are some links to the datasets, and related resources, I use in many of the video tutorials. <br />
*[http://www.kolobkreations.com/YouTube%20SEM%20Series.sav YouTube SEM Series] (this data goes along with this YouTube playlist: [https://www.youtube.com/playlist?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series 2016'''])<br />
*[http://www.kolobkreations.com/Sohana.zip Sohana]<br />
*[http://www.kolobkreations.com/Bencare.zip Bencare]<br />
*[http://www.kolobkreations.com/SalesPerformance.sav Sales Performance]<br />
*[https://drive.google.com/open?id=1k-4v8qFGqKi3m-mrQtyct6j2AxkJ4R-h Example Models]<br />
<br />
===How to cite Gaskination resources===<br />
''[http://www.kolobkreations.com/PLSIEEETPC2014.pdf IEEE TPC PLS article]'': <br />
*Paul Benjamin Lowry & James Gaskin (2014). “Partial Least Squares (PLS) Structural Equation Modeling (SEM) for Building and Testing Behavioral Causal Theory: When to Choose It and How to Use It,” IEEE TPC (57:2), pp. 123-146.<br />
<br />
''Wiki'':<br />
*Gaskin, J., (2016), "Name of section", Gaskination's StatWiki. http://statwiki.kolobkreations.com<br />
<br />
''YouTube videos'':<br />
*Gaskin, J., (Year video uploaded), "Name of video", Gaskination's Statistics. http://youtube.com/Gaskination<br />
<br />
''Stats Tools Package'':<br />
*Gaskin, J., (2016), "Name of tab", Stats Tools Package. http://statwiki.kolobkreations.com<br />
<br />
''Plugin or Estimand'':<br />
*Gaskin, J., (2016), "Name of Plugin or Estimand", Gaskination's Statistics. http://statwiki.kolobkreations.com<br />
<br />
== StatWiki Contents ==<br />
1. [[Data screening]]<br />
<br />
*[[Data screening#Missing Data|Missing Data]]<br />
*[[Data screening#Outliers|Outliers]]<br />
*[[Data screening#Normality|Normality]]<br />
*[[Data screening#Linearity|Linearity]]<br />
*[[Data screening#Homoscedasticity|Homoscedasticity]]<br />
*[[Data screening#Multicollinearity|Multicollinearity]] <br />
<br />
2. [[Exploratory Factor Analysis]] (EFA)<br />
<br />
*[[Exploratory Factor Analysis#Rotation types|Rotation types]]<br />
*[[Exploratory Factor Analysis#Factoring methods|Factoring methods]]<br />
*[[Exploratory Factor Analysis#Appropriateness of data|Appropriateness of data]]<br />
*[[Exploratory Factor Analysis#Communalities|Communalities]]<br />
*[[Exploratory Factor Analysis#Dimensionality|Dimensionality]]<br />
*[[Exploratory Factor Analysis#Factor Structure|Factor Structure]]<br />
*[[Exploratory Factor Analysis#Convergent validity|Convergent validity]]<br />
*[[Exploratory Factor Analysis#Discriminant validity|Discriminant validity]]<br />
*[[Exploratory Factor Analysis#Face validity|Face validity]]<br />
*[[Exploratory Factor Analysis#Reliability|Reliability]]<br />
*[[Exploratory Factor Analysis#Formative vs. Reflective|Formative vs. Reflective]]<br />
<br />
3. [[Confirmatory Factor Analysis]] (CFA)<br />
<br />
*[[Confirmatory Factor Analysis#Model Fit|Model Fit]]<br />
*[[Confirmatory Factor Analysis#Validity and Reliability|Validity and Reliability]]<br />
*[[Confirmatory Factor Analysis#Common Method Bias (CMB)|Common Method Bias (CMB)]]<br />
*[[Confirmatory Factor Analysis#Measurement_Model_Invariance|Invariance]]<br />
*[[Confirmatory Factor Analysis#2nd Order Factors|2nd Order Factors]]<br />
<br />
4. [[Structural Equation Modeling]] (SEM)<br />
<br />
*[[Structural Equation Modeling#Hypotheses|Hypotheses]]<br />
*[[Structural Equation Modeling#Controls|Controls]]<br />
*[[Structural Equation Modeling#Mediation|Mediation]]<br />
*[[Structural Equation Modeling#Interaction|Interaction]]<br />
*[[Structural Equation Modeling#Model fit again|Model fit again]]<br />
*[[Structural Equation Modeling#Multi-group|Multi-group]]<br />
*[[Structural Equation Modeling#From Measurement Model to Structural Model|From Measurement Model to Structural Model]]<br />
*[[Structural Equation Modeling#Creating Composites from Latent Factors|Creating Composites from Latent Factors]]<br />
<br />
5. [[PLS]] (Partial Least Squares)<br />
*[[PLS#Installing PLS-graph|Installing PLS-graph]]<br />
*[[PLS#Troubleshooting|Troubleshooting]]<br />
*[[PLS#Sample Size Rule|Sample Size Rule]]<br />
*[[PLS#Factor Analysis|Factor Analysis]]<br />
*[[PLS#Testing Causal Models|Testing Causal Models]]<br />
*[[PLS#Testing Group Differences|Testing Group Differences]]<br />
*[[PLS#Handling Missing Data|Handling Missing Data]]<br />
*[[PLS#Convergent and Discriminant Validity|Convergent and Discriminant Validity]]<br />
*[[PLS#Common Method Bias|Common Method Bias]]<br />
*[[PLS#Interaction|Interaction]]<br />
*[[PLS#SmartPLS|SmartPLS]]<br />
<br />
6. [[Guidelines|General Guidelines]]<br />
<br />
*[[Guidelines#Example Analysis|Example Analysis]]<br />
*[[Guidelines#Ten Steps|Ten Steps to Building a Good Quant Model]]<br />
*[[Guidelines#Order of Operations|Order of Operations]]<br />
*[[Guidelines#Structuring a Quantitative Paper|General Guidelines to Writing a Quant Paper]]<br />
<br />
7. [[Cluster Analysis|Cluster Analysis]]<br />
*Just a bunch of videos here</div>
Jgaskin
http://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1298370
Structural Equation Modeling (2018-04-25T20:10:47Z)
<p>Jgaskin: /* Model fit again */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.” http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance on how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively affect weight loss"<br />
===Mediated effects===<br />
For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the degree of the mediation (partial, full, or simply indirect), and the direction of the mediated relationship (positive or negative).<br />
<br />
"Exercise positively and partially mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time positively and fully mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss positively and indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise positively moderates the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time negatively moderates (dampens) the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, and for eastern (Asian) diets the effect is positive and strong"<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time and age''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and exercise''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
In order for a hypothesis to be supported, many criteria must be met. These criteria can be classified as global or local tests. In order for a hypothesis to be supported, the local test must be met, but in order for a local test to have meaning, all global tests must be met. Global tests of model fit are the first necessity. If a hypothesized relationship has a significant p-value, but the model has poor fit, we cannot have confidence in that p-value. Next is the global test of variance explained or R-squared. We might observe significant p-values and good model fit, but if R-square is only 0.025, then the relationships we are testing are not very meaningful because they do not explain sufficient variance in the dependent variable. The figure below illustrates the precedence of global and local tests. Lastly, and almost needless to explain, if a regression weight is significant, but is in the wrong direction, our hypothesis is not supported. Instead, there is counter-evidence. For example, if we theorized that exercise would increase weight loss, but instead, exercise decreased weight loss, then we would have counter-evidence.<br />
<br />
[[File:globallocal.png]]<br />
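The precedence of these tests can be sketched in code (a minimal illustration, not AMOS output; the cutoffs of CFI > 0.90, RMSEA < 0.08, and R-squared >= 0.10 are placeholders, so use the thresholds appropriate to your field and sample size):<br />

```python
def evaluate_hypothesis(cfi, rmsea, r_squared, p_value, beta, expected_sign):
    """Apply the global tests before the local test, in order of precedence."""
    # Global test 1: model fit. A significant path in a poorly
    # fitting model cannot be trusted.
    if cfi <= 0.90 or rmsea >= 0.08:
        return "inconclusive: poor model fit"
    # Global test 2: variance explained. Significant but trivial
    # effects are not meaningful.
    if r_squared < 0.10:
        return "inconclusive: insufficient variance explained"
    # Local test: significance and direction of the path estimate.
    if p_value >= 0.05:
        return "not supported"
    if (beta > 0) != (expected_sign > 0):
        return "counter-evidence: significant but wrong direction"
    return "supported"

print(evaluate_hypothesis(cfi=0.95, rmsea=0.05, r_squared=0.40,
                          p_value=0.01, beta=0.35, expected_sign=+1))
```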
<br />
== Controls ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Controls.pptx '''Controls''']<br />
Controls are potentially confounding variables that we need to account for, but that don’t drive our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.<br />
<br />
[[File:controlsIQ.png]]<br />
<br />
As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the larger your sample size needs to be. Each added control also raises R-squared, but with increasingly smaller gains. Sometimes you may even find that adding a control “drowns out” all the effects of the IVs; in such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, only account for a small amount of the variance in the DV). With that in mind, you can’t and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.<br />
<br />
Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don’t have arrows going into them), and draw regression paths from them to whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control with regard to valLong. And I have LoyRepeat as a control on LoyLong. I’ve also covaried the controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than by default. However, there are different schools of thought on this. The downside of covarying with all exogenous variables is that you gain no degrees of freedom. If you are in need of degrees of freedom, then try removing the non-significant covariances with controls.<br />
<br />
[[File:controlsAMOS.png]]<br />
<br />
When reporting the model, you '''''do''''' need to include the controls in '''''all''''' your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don’t get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how the control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although, there are also other schools of thought on this issue).<br />
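To see numerically why omitting a confound distorts an estimate, here is a small simulation in the spirit of the TV-time example (the data and effect sizes are entirely invented for illustration; this is ordinary regression in Python, not AMOS):<br />

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (invented): IQ confounds the
# TV time -> school performance relationship.
iq = rng.normal(100, 15, n)
tv_time = 4 - 0.02 * iq + rng.normal(0, 1, n)   # brighter kids watch a bit less TV
performance = 0.05 * iq - 0.30 * tv_time + rng.normal(0, 1, n)

def ols(y, *predictors):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_naive = ols(performance, tv_time)      # IQ omitted: biased TV coefficient
b_ctrl = ols(performance, tv_time, iq)   # IQ controlled: close to the true -0.30
print(f"TV effect without control: {b_naive[1]:.2f}, with control: {b_ctrl[1]:.2f}")
```

The naive estimate absorbs part of the IQ effect, overstating the TV effect; adding the control recovers the true coefficient.<br />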
<br />
== Mediation ==<br />
*[[File:books.jpg]]'''''Lesson:''''' [http://www.kolobkreations.com/Mediation%20Step%20by%20Step%20with%20Bootstrapping.pptx '''Testing Mediation using Bootstrapping''']<br />
*[[File:YouTube.png]] '''''Video Lecture:''''' [http://youtu.be/j_yufPUjkwk?hd=1 '''A Simpler Guide to Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/0artfnxyF_A '''Mediation in AMOS''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/41XgTZc66ko '''Specific Indirect Effects''']<br />
*'''''Hair et al.:''''' ''pp. 751-755''<br />
<br />
=== Concept ===<br />
<br />
Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, ''work effectiveness'' may be a good mediator. We would say that work effectiveness mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is ''better'' explained through the mediator of work effectiveness. The logic is, intelligent workers tend to perform better '''because''' they work more effectively. Thus, when intelligence leads to working smarter, then we observe greater performance. <br />
<br />
[[File:mediation.png]]<br />
<br />
<br />
We used to theorize three main types of mediation based on the Baron and Kenny approach, namely: 1) partial, 2) full, and 3) indirect. However, recent literature suggests that mediation is less nuanced than this -- simply, if a significant indirect effect exists, then mediation is present.<br />
<br />
Here is another useful site for mediation: https://msu.edu/~falkcarl/mediation.html<br />
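The indirect-effect test can be sketched with a percentile bootstrap (the data and path values below are invented for illustration; AMOS can compute bootstrap confidence intervals for you, but the underlying idea is the same):<br />

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Synthetic data (invented): intelligence -> work effectiveness -> performance.
intelligence = rng.normal(0, 1, n)
effectiveness = 0.5 * intelligence + rng.normal(0, 1, n)                      # a path
performance = 0.6 * effectiveness + 0.1 * intelligence + rng.normal(0, 1, n)  # b, c' paths

def slope(y, x, *controls):
    """Partial regression slope of y on x, holding the controls constant."""
    X = np.column_stack([np.ones(len(y)), x, *controls])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def indirect(idx):
    a = slope(effectiveness[idx], intelligence[idx])
    b = slope(performance[idx], effectiveness[idx], intelligence[idx])
    return a * b   # the indirect effect is the product of the a and b paths

# Percentile bootstrap: resample cases, recompute a*b, take 2.5/97.5 percentiles.
boot = [indirect(rng.integers(0, n, n)) for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
point = indirect(np.arange(n))
print(f"indirect effect = {point:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
# Mediation is supported when the confidence interval excludes zero.
```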
<br />
== Interaction ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=K34sF_AmWio '''Testing Interaction Effects''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Interaction.pptx '''Interaction Effects''']<br />
===Concept===<br />
In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. <br />
This is another form of moderation (along with multi-grouping) – i.e., the X to Y relationship changes form (gets stronger, weaker, changes signs) depending on the value of another explanatory variable (the moderator). So, for example<br />
*you lose 1 pound of weight for every hour you exercise<br />
*you lose 1 pound of weight for every 500 calories you cut back from your regular diet<br />
*but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus in total, you lose three pounds<br />
So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:<br />
*Chocolate is yummy<br />
*Cheese is yummy<br />
*but combining chocolate and cheese is yucky!<br />
<br />
The following figure is an example of a simple interaction model.<br />
<br />
[[File:interaction.png]]<br />
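In regression terms, an interaction is tested by adding the product of the two predictors to the model, then examining the ''simple slopes'' at low and high values of the moderator (this is what gets plotted). A minimal sketch with invented data:<br />

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

# Synthetic data (invented): exercise amplifies the diet -> weight loss effect.
diet = rng.normal(0, 1, n)
exercise = rng.normal(0, 1, n)
weight_loss = (1.0 * diet + 1.0 * exercise
               + 0.5 * diet * exercise      # true interaction effect
               + rng.normal(0, 1, n))

# Moderated regression: main effects plus the product term.
X = np.column_stack([np.ones(n), diet, exercise, diet * exercise])
b0, b_diet, b_ex, b_int = np.linalg.lstsq(X, weight_loss, rcond=None)[0]

# Simple slopes: the effect of diet at -1 SD and +1 SD of exercise.
slope_low = b_diet + b_int * (-1.0)
slope_high = b_diet + b_int * (+1.0)
print(f"diet slope at low exercise: {slope_low:.2f}, at high exercise: {slope_high:.2f}")
```

A steeper slope at high exercise than at low exercise is exactly the "amplifies" pattern hypothesized above.<br />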
<br />
===Types===<br />
Interactions enable more precise explanation of causal effects by providing a method for explaining not only ''how'' X affects Y, but also ''under what circumstances'' the effect of X changes depending on the moderating variable of Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically. <br />
<br />
[[File:interactionTypes.png]]<br />
<br />
== Model fit again ==<br />
You already did model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. Every time the model changes and a hypothesis is tested, model fit must be assessed. If multiple hypotheses are tested on the same model, model fit will not change, so it only needs to be addressed once for that set of hypotheses. The method for assessing model fit in a causal model is the same as for a measurement model: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms. '''Also, a warning that some argue there is never an appropriate argument for covarying error terms.''' (I tend to agree that they should not be covaried.)<br />
*If the correlated variables are ''not'' logically '''causally''' correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.<br />
**e.g., burnout from customers is highly correlated with burnout from management<br />
**We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.<br />
*If the correlated variables are logically '''causally''' correlated, then simply add a regression line.<br />
**e.g., burnout from customers is highly correlated with satisfaction with customers<br />
**We expect burnC to predict satC, so ''not'' accounting for it is negligent.<br />
<br />
Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
<br />
== Multi-group ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mirI5ETQRTA '''Testing Multi-group Moderation using Chi-square difference test'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/w5ikoIgTIc0?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Testing Multi-group differences using AMOS's multigroup function''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Mediation%20and%20Multi-group%20Moderation.pptx '''Mediation versus Moderation''']<br />
Multi-group comparisons are a special form of moderation in which a dataset is split along values of a grouping variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. Multi-group comparisons are used to determine whether the relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for example. A multi-group analysis would answer the question: does dieting affect weight loss differently for males than for females?<br />
In the videos above, you will learn how to set up a multigroup analysis in AMOS and test it using chi-square differences and AMOS's built-in multigroup function. For those who have seen my video on the critical ratios approach, be warned that currently the chi-square approach is the most widely accepted, because the critical ratios approach doesn't take into account the family-wise error that affects a model when testing multiple hypotheses simultaneously. For now, I recommend using the chi-square approach. The AMOS built-in multigroup function uses the chi-square approach as well.<br />
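For a single constrained path (Δdf = 1), the arithmetic behind the chi-square difference test is simple enough to verify by hand (the fit statistics below are hypothetical, not from any real model; for Δdf = 1 the chi-square tail probability equals erfc(sqrt(x/2)), so no statistics library is needed):<br />

```python
import math

def chi2_diff_test(chi2_unconstrained, df_unconstrained,
                   chi2_constrained, df_constrained):
    """Chi-square difference test for one constrained path (delta-df = 1).

    The constrained model fixes a path equal across groups; a significant
    worsening of chi-square means the path differs between groups.
    """
    d_chi2 = chi2_constrained - chi2_unconstrained
    d_df = df_constrained - df_unconstrained
    assert d_df == 1, "this helper covers the single-constraint case only"
    # For 1 df, a chi-square variate is a squared standard normal,
    # so P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(d_chi2 / 2.0))
    return d_chi2, p

# Hypothetical fit statistics for a male/female comparison of one path.
d_chi2, p = chi2_diff_test(chi2_unconstrained=210.4, df_unconstrained=118,
                           chi2_constrained=216.1, df_constrained=119)
print(f"delta chi-square = {d_chi2:.1f}, p = {p:.4f}")
# p < 0.05 -> the path is significantly different across the two groups.
```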
<br />
==From Measurement Model to Structural Model ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n-ULF6BGVw0 '''From CFA to SEM in AMOS''']<br />
Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.<br />
<br />
==Creating Factor Scores from Latent Factors==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=dsOS9tQjxW8 '''Imputing Factor Scores in AMOS''']<br />
If you would like to create factor scores (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:<br />
*You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data). <br />
*Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").<br />
After those two caveats are addressed, then you can simply go to the ''Analyze'' menu, and select ''Data Imputation''. Select ''Regression Imputation'', and then click on the ''Impute'' button. This will create a new SPSS dataset with the same name as the current dataset except it will be followed by an "_C". This can be found in the same folder as your current dataset.<br />
<br />
==Need more degrees of freedom==<br />
Did you run your model and observe that DF = 0 or CFI = 1.000? If so, you need more degrees of freedom. There are a few ways to get them:<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
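The bookkeeping behind these suggestions: a model with p observed variables provides p(p+1)/2 distinct variances and covariances, and every freely estimated parameter (loading, path, variance, or covariance) spends one of them. A quick sketch (the counts below are hypothetical):<br />

```python
def sem_degrees_of_freedom(n_observed, n_free_parameters):
    """df = distinct variances/covariances in the data minus free parameters."""
    moments = n_observed * (n_observed + 1) // 2
    return moments - n_free_parameters

# Hypothetical model: 12 indicators, 28 freely estimated parameters.
print(sem_degrees_of_freedom(12, 28))   # 12*13/2 - 28 = 50
# A just-identified model (df = 0) spends every available moment:
print(sem_degrees_of_freedom(6, 21))    # 6*7/2 - 21 = 0
```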
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>
Jgaskin
http://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298360
Confirmatory Factor Analysis (2018-04-25T20:06:53Z)
<p>Jgaskin: /* Modification indices */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/wV6UudZSBCA '''Model Fit Thresholds''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
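As a rough sketch, the commonly reported Hu and Bentler (1999) cutoffs can be checked like this (guideline values only; see the figure above and Hair et al.'s Table 12-4 for thresholds conditioned on sample size and model complexity):<br />

```python
def check_fit(cfi, rmsea, srmr, pclose=None):
    """Compare fit statistics against common guideline cutoffs.

    Guidelines, not hard rules: CFI > 0.95, RMSEA < 0.06, SRMR < 0.08,
    and (if reported) PClose > 0.05.
    """
    results = {
        "CFI": cfi > 0.95,
        "RMSEA": rmsea < 0.06,
        "SRMR": srmr < 0.08,
    }
    if pclose is not None:
        results["PClose"] = pclose > 0.05
    return results

# Hypothetical output from a CFA run.
print(check_fit(cfi=0.962, rmsea=0.048, srmr=0.041, pclose=0.31))
```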
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms ('''however, some argue that there are never appropriate reasons to covary errors'''), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
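CR and AVE can be computed directly from the standardized factor loadings (the Stats Tools Package does this for you; the loadings below are hypothetical):<br />

```python
def composite_reliability(loadings):
    """CR = (sum(L))^2 / ((sum(L))^2 + sum(1 - L^2)), standardized loadings."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)   # standardized error variances
    return s * s / (s * s + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

# Hypothetical standardized loadings for a four-item factor.
loadings = [0.82, 0.78, 0.85, 0.71]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
# Reliability: CR > 0.7; convergent validity: AVE > 0.5; discriminant
# validity also requires sqrt(AVE) to exceed the inter-construct correlations.
```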
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011, p. 702) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error."<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to bias in your dataset due to something external to the measures: something external to the question itself may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that inflates or deflates responses. A study has significant common method bias if a majority of the variance can be explained by a single factor. You can test for common method bias in a few different ways, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that Harman's single-factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single-factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues), then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
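A rough sketch of the arithmetic behind the test, using a hypothetical correlation matrix and the largest eigenvalue of that matrix as a stand-in for the first unrotated factor:<br />

```python
import numpy as np

# Hypothetical correlation matrix: items q1-q3 load on one factor,
# q4-q6 on another, with a modest inter-factor correlation.
within, cross = 0.64, 0.192
R = np.full((6, 6), cross)
R[:3, :3] = within
R[3:, 3:] = within
np.fill_diagonal(R, 1.0)

# Harman's test: share of total variance captured by the first
# unrotated factor (approximated here by the largest eigenvalue).
eigvals = np.linalg.eigvalsh(R)
share = eigvals.max() / eigvals.sum()
print(f"First factor explains {share:.1%} of total variance")
# If this share exceeded 50%, CMB would be flagged as a concern.
```

With this two-factor structure the first factor falls short of 50%, so the (weak) test would not flag CMB.<br />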
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to those of a model without the CLF. If there are large differences (greater than 0.200), then you will want to retain the CLF as you either impute composites from factor scores, or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is simply an extended, more accurate version of the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed variables.<br />
##The steps below assume the CLF will break the model; if it did not, then wherever an instruction says to connect the SB constructs to all observed variables, connect the CLF to all observed variables instead.<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to the indicators of the other constructs (but not the paths to their own indicators) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
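The chi-square difference tests in steps 5 and 6 above reduce to comparing Δχ² against a chi-square distribution with Δdf degrees of freedom. A minimal sketch with hypothetical AMOS fit values (not output from a real model):<br />

```python
from scipy.stats import chi2

# Hypothetical fit statistics for the two nested models
chisq_unconstrained, df_unconstrained = 512.4, 242
chisq_constrained, df_constrained = 561.9, 263   # SB paths fixed to zero

delta_chisq = chisq_constrained - chisq_unconstrained
delta_df = df_constrained - df_unconstrained
p_value = chi2.sf(delta_chisq, delta_df)         # survival function = 1 - CDF

print(f"dChi2 = {delta_chisq:.1f}, ddf = {delta_df}, p = {p_value:.4f}")
# p >= .05: fail to reject -> models invariant -> no detectable bias.
# p <  .05: reject -> bias differs from zero; run the equal-constrained test.
```

With these hypothetical numbers the test is significant, so you would proceed to the equal-constrained test in step 6.<br />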
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: a significant p-value for the chi-square difference test is evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but use the whole dataset when you create composites, not the split dataset). If there is a difference between groups, you will want to find which factors differ (do this one at a time, as demonstrated in the video above). Make sure you place the constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video).<br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group, rd3 and q5 have high standardized residual covariances with sw1. We could remove sw1 and see if that fixes things, but SW has only three items right now, so another option is to remove rd3 or q5; if that does not fix things, return to this matrix after rerunning and look for other issues. Remove items sparingly, and only one at a time, trying to leave at least three items per factor (two will sometimes work if necessary, but tends to be unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups; if so, you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS''']<br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a deeper measurement issue (such as skewness or kurtosis, too much missing data, or a nominal variable). If it cannot be fixed by addressing these deeper issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. This issue usually accompanies a negative error variance, so fixing the negative error variance first usually fixes this as well.<br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298357Confirmatory Factor Analysis2018-04-25T20:04:45Z<p>Jgaskin: /* Model Fit */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis (EFA) to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/wV6UudZSBCA '''Model Fit Thresholds''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
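For illustration, the standard formulas for CFI, TLI, and RMSEA can be computed by hand from the model and baseline (independence) chi-squares; the values below are hypothetical, and one common convention uses N − 1 in the RMSEA denominator:<br />

```python
import math

def fit_indices(chisq_m, df_m, chisq_b, df_b, n):
    """CFI, TLI, and RMSEA from the model (m) and baseline/independence (b)
    chi-square statistics -- standard textbook formulas."""
    d_m = max(chisq_m - df_m, 0.0)            # model noncentrality
    d_b = max(chisq_b - df_b, 0.0)            # baseline noncentrality
    cfi = 1.0 - d_m / max(d_b, d_m, 1e-12)
    tli = ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1.0)
    rmsea = math.sqrt(d_m / (df_m * (n - 1)))
    return cfi, tli, rmsea

# Hypothetical values of the kind an AMOS output might report
cfi, tli, rmsea = fit_indices(chisq_m=310.5, df_m=246,
                              chisq_b=4126.0, df_b=276, n=350)
print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}, RMSEA = {rmsea:.3f}")
```

These hypothetical values would clear the usual Hu and Bentler style cutoffs (CFI/TLI above 0.95, RMSEA below 0.06).<br />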
<br />
[[File:GOFMetrics1.png]]<br />
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups, otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*Here is a video: *[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298350Confirmatory Factor Analysis2018-04-25T20:00:00Z<p>Jgaskin: /* Scalar */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after an exploratory factor analysis (EFA) in determining the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group, based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regard to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit: our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to perform a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
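AMOS reports these indices directly, but if you want to sanity-check its output, the most commonly reported ones can be computed by hand from the model and independence (null) model chi-squares. A minimal sketch using the standard formula definitions; the chi-square values and sample size below are hypothetical: <br />

```python
from math import sqrt

def fit_indices(chi2_m, df_m, chi2_0, df_0, n):
    """Common goodness-of-fit indices computed from the hypothesized
    model's chi-square (chi2_m, df_m), the independence (null) model's
    chi-square (chi2_0, df_0), and the sample size n."""
    cfi = 1 - max(chi2_m - df_m, 0) / max(chi2_0 - df_0, chi2_m - df_m, 0)
    tli = ((chi2_0 / df_0) - (chi2_m / df_m)) / ((chi2_0 / df_0) - 1)
    rmsea = sqrt(max(chi2_m - df_m, 0) / (df_m * (n - 1)))
    return {"cmin/df": chi2_m / df_m, "CFI": cfi, "TLI": tli, "RMSEA": rmsea}

# Hypothetical values: model chi-square 320 on 180 df,
# null model chi-square 2400 on 210 df, N = 350
print(fit_indices(320, 180, 2400, 210, 350))
```

Against the usual guidelines (CMIN/df < 3, CFI > .95, RMSEA < .06), these hypothetical values would indicate acceptable, though not excellent, fit. <br />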
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices, and the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires removing items.<br />
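To illustrate the 2.58 cutoff, here is a small sketch (item names and values invented) that flags offending item pairs in a residuals matrix of the kind AMOS outputs: <br />

```python
import numpy as np

def flag_srcs(src, items, cutoff=2.58):
    """Return item pairs whose absolute standardized residual covariance
    exceeds the cutoff (2.58 corresponds to significance at p < .01)."""
    src = np.asarray(src)
    # Scan the upper triangle only, so each pair is reported once
    rows, cols = np.where(np.triu(np.abs(src) > cutoff, k=1))
    return [(items[i], items[j], src[i, j]) for i, j in zip(rows, cols)]

# Hypothetical standardized residual covariance matrix for items q1..q4
items = ["q1", "q2", "q3", "q4"]
src = [[ 0.00,  0.41, -1.12,  2.95],
       [ 0.41,  0.00,  0.87, -0.33],
       [-1.12,  0.87,  0.00,  2.61],
       [ 2.95, -0.33,  2.61,  0.00]]

for a, b, value in flag_srcs(src, items):
    print(f"{a} <-> {b}: {value:.2f}")
```

In this invented matrix q4 appears in both flagged pairs, so it would be the prime candidate for removal. <br />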
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
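The Stats Tools Package computes these values for you, but CR, AVE, and the discriminant validity comparisons are simple functions of the standardized loadings. A minimal sketch (loadings and inter-construct correlations are hypothetical): <br />

```python
import numpy as np

def cr_ave(loadings):
    """Composite Reliability and Average Variance Extracted from the
    standardized factor loadings of a single construct."""
    lam = np.asarray(loadings)
    errs = 1 - lam ** 2                        # item error variances
    cr = lam.sum() ** 2 / (lam.sum() ** 2 + errs.sum())
    ave = (lam ** 2).mean()
    return cr, ave

# Hypothetical standardized loadings for one factor
cr, ave = cr_ave([0.82, 0.78, 0.74, 0.69])
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")       # want CR > .7 and AVE > .5

# Discriminant validity: sqrt(AVE) must exceed the construct's correlations
# with every other construct; equivalently, MSV (largest squared
# inter-construct correlation) must be less than AVE
corrs = [0.45, 0.61]                           # hypothetical correlations
msv = max(c ** 2 for c in corrs)
print(msv < ave and np.sqrt(ave) > max(corrs))
```

Here the hypothetical factor passes all three checks; a failing factor would need items reworked, dropped, or merged as described above. <br />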
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error" (Malhotra and Dash, 2011, p. 702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures: something external to the question itself may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that inflates or deflates responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias you can run a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues), then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
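The classic test is run by forcing a one-factor EFA in SPSS. As a rough proxy (an assumption for illustration, not the SPSS procedure itself), the share of total variance captured by the first principal component of the item correlation matrix behaves very similarly to the unrotated single-factor solution: <br />

```python
import numpy as np

def first_factor_share(data):
    """Proportion of total variance captured by the first principal
    component of the items' correlation matrix - a rough stand-in for
    the variance explained by a forced one-factor EFA solution."""
    corr = np.corrcoef(np.asarray(data), rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)         # ascending order
    return eigvals[-1] / eigvals.sum()

# Hypothetical survey: 200 respondents x 8 items, with a shared
# (method-like) component deliberately mixed into every item
rng = np.random.default_rng(42)
common = rng.normal(size=(200, 1))
data = 0.5 * common + rng.normal(size=(200, 8))
share = first_factor_share(data)
print(f"First factor explains {share:.1%} of the variance")
```

A share above roughly 50% would signal substantial common method variance; the simulated data here stay well below that. <br />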
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (e.g., greater than 0.200), then you will want to retain the CLF, either as you impute composites from factor scores or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is an extension of, and a more accurate way to perform, the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed variables. <br />
##The steps below assume the CLF will break the model; where the instructions say to connect the SB constructs to all observed variables, connect the CLF instead if the CLF did not break your model.<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain to zero all paths from the SB constructs to the indicators of the other constructs (but do not constrain the paths to their own indicators). Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
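Both the zero-constrained and equal-constrained tests above reduce to a chi-square difference test between two nested models. A minimal sketch of that calculation, using hypothetical chi-square values of the kind AMOS reports for each model (assumes SciPy is available): <br />

```python
from scipy.stats import chi2

def chi_square_difference(chi2_c, df_c, chi2_u, df_u):
    """Chi-square difference test between a constrained model (more df)
    and an unconstrained model. Returns the chi-square difference, the
    df difference, and the p-value."""
    diff = chi2_c - chi2_u
    ddf = df_c - df_u
    return diff, ddf, chi2.sf(diff, ddf)

# Hypothetical: zero-constrained model chi2 = 612.4 on 290 df vs.
# unconstrained model chi2 = 575.1 on 270 df
diff, ddf, p = chi_square_difference(612.4, 290, 575.1, 270)
print(f"delta chi2 = {diff:.1f} on {ddf} df, p = {p:.4f}")
```

With these invented numbers p falls below .05, so the null is rejected (the bias is not zero), and the next step would be the equal-constrained test, retaining the SB construct either way. <br />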
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups; otherwise, your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups, just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, rather than the split dataset). If there is a difference between groups, you will want to find which factors differ (do this one at a time, as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights. Keep constraints the same, but for each factor, for one of the groups, make the variance constraint = 1. This can be done in the ''Manage Models'' section of AMOS.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. For example, for males there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately. When you add them to the model, AMOS adds them for both groups, even if you only needed them for one. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky: if you don't get it right, the model won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach is simply not to constrain the CLF paths to be equal. Instead, do a chi-square difference test between the unconstrained model (with CLF, and marker variable if available) and the same model with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn't work, then move the constraint up to the latent variable variance (constrain it to 1) AND constrain the two paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298346Confirmatory Factor Analysis2018-04-25T19:58:35Z<p>Jgaskin: Undo revision 1298342 by Jgaskin (talk)</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regard to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
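As a quick sanity check, the cutoffs can be encoded in a few lines. The sketch below uses the commonly cited Hu and Bentler (1999) values for CFI, TLI, RMSEA, and SRMR; the index names and example values are hypothetical stand-ins for numbers read off the AMOS output, and the table above may list additional, more contextualized thresholds.<br />

```python
# Quick threshold check against the commonly cited Hu and Bentler (1999)
# cutoffs. Example values are hypothetical stand-ins for AMOS output.
thresholds = {
    "CFI":   ("min", 0.95),  # comparative fit index
    "TLI":   ("min", 0.95),  # Tucker-Lewis index
    "RMSEA": ("max", 0.06),  # root mean square error of approximation
    "SRMR":  ("max", 0.08),  # standardized root mean square residual
}

def check_fit(indices):
    """Return {index: passed?} for each reported index."""
    results = {}
    for name, value in indices.items():
        direction, cutoff = thresholds[name]
        results[name] = (value >= cutoff) if direction == "min" else (value <= cutoff)
    return results

print(check_fit({"CFI": 0.961, "TLI": 0.948, "RMSEA": 0.052, "SRMR": 0.041}))
# → {'CFI': True, 'TLI': False, 'RMSEA': True, 'SRMR': True}
```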
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices; the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
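Since AMOS only reports the matrix, a short script can help you scan it for significant entries. The sketch below flags every pair whose standardized residual covariance exceeds the 2.58 cutoff; the item names and matrix values are hypothetical stand-ins for numbers pasted from the AMOS output.<br />

```python
import numpy as np

# Hypothetical standardized residual covariance matrix pasted from the
# AMOS output (item names and values are made up for illustration).
items = ["sw1", "sw2", "rd3", "q5"]
src = np.array([
    [0.00,  0.41,  2.91,  3.10],
    [0.41,  0.00,  0.22, -0.87],
    [2.91,  0.22,  0.00,  1.05],
    [3.10, -0.87,  1.05,  0.00],
])

# Flag pairs whose standardized residual covariance exceeds |2.58|,
# scanning only the upper triangle to avoid duplicate pairs.
rows, cols = np.where(np.triu(np.abs(src) > 2.58, k=1))
flagged = [(items[i], items[j], src[i, j]) for i, j in zip(rows, cols)]
for a, b, value in flagged:
    print(f"{a} <-> {b}: {value:.2f}")
```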
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor), than by its own observed variables.<br />
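These quantities can also be computed by hand from the standardized loadings and inter-construct correlations. The Python sketch below shows the standard formulas; the loadings and correlations are hypothetical example values.<br />

```python
import numpy as np

# Standard formulas for CR and AVE from standardized loadings; the
# loadings and inter-construct correlations below are hypothetical.
def composite_reliability(loadings):
    lam = np.asarray(loadings)
    return lam.sum() ** 2 / (lam.sum() ** 2 + (1 - lam ** 2).sum())

def average_variance_extracted(loadings):
    lam = np.asarray(loadings)
    return (lam ** 2).mean()

factor_a_loadings = [0.82, 0.78, 0.75, 0.71]
correlations_with_other_factors = [0.46, 0.31]

cr  = composite_reliability(factor_a_loadings)              # want > 0.7
ave = average_variance_extracted(factor_a_loadings)         # want > 0.5
msv = max(r ** 2 for r in correlations_with_other_factors)  # want < AVE
fornell_larcker_ok = np.sqrt(ave) > max(correlations_with_other_factors)

print(f"CR={cr:.3f}, AVE={ave:.3f}, MSV={msv:.3f}, "
      f"sqrt(AVE) > correlations: {fornell_larcker_ok}")
```

On these hypothetical loadings, CR is about 0.85 and AVE about 0.59, comfortably above the thresholds, and MSV (about 0.21) is well below the AVE.<br />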
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011, p. 702) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error."<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to bias in your dataset attributable to something external to the measures: something external to the question itself may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study has significant common method bias when a majority of the variance can be explained by a single factor. To test for common method bias, you can perform a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues), then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
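While the full test is run in your EFA software, the headline number can be approximated directly from the item correlation matrix: the variance explained by a single unrotated factor is roughly the largest eigenvalue divided by the number of items. The sketch below uses a hypothetical equicorrelated matrix (all inter-item correlations = 0.60), for which the first factor accounts for exactly 70% of the variance.<br />

```python
import numpy as np

# Approximation: the variance a single unrotated factor explains is
# roughly the largest eigenvalue of the item correlation matrix divided
# by the number of items. R is a hypothetical equicorrelated matrix
# (all inter-item correlations = 0.60), so the share is exactly 70%.
n, rho = 4, 0.60
R = np.full((n, n), rho) + (1 - rho) * np.eye(n)

first_factor_share = np.linalg.eigvalsh(R).max() / n
print(f"{first_factor_share:.1%} of total variance")  # → 70.0% of total variance
```

A share above 50%, as here, would be read as a CMB concern under this (admittedly outdated) test.<br />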
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200) then you will want to retain the CLF as you either impute composites from factor scores, or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is simply an extended, more accurate version of the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried with the other latent factors, and keep the CLF connected with regression lines to all observed variables. <br />
##The steps below assume the CLF will break the model; where an instruction says to connect the SB constructs to all observed variables, connect the CLF to all observed variables instead (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
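The chi-square difference test used in both the zero-constrained and equal-constrained steps is simple arithmetic plus a chi-square lookup. The sketch below uses hypothetical chi-square and degrees-of-freedom values read off the AMOS output for the unconstrained and zero-constrained models.<br />

```python
from scipy.stats import chi2

# Chi-square difference test with hypothetical chi-square/df values
# read off the AMOS output for the two models.
chisq_unconstrained, df_unconstrained = 312.4, 179
chisq_constrained,  df_constrained   = 336.9, 201  # CLF paths fixed to zero

chisq_diff = chisq_constrained - chisq_unconstrained
df_diff = df_constrained - df_unconstrained
p_value = chi2.sf(chisq_diff, df_diff)  # survival function = 1 - CDF

print(f"chi-square difference = {chisq_diff:.1f} on {df_diff} df, p = {p_value:.3f}")
# p >= .05: fail to reject the null -> no detectable shared method variance
```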
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups, just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time, as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we then need to assess scalar invariance. This can be done as shown in the video above. Essentially, you need to assess whether intercepts and structural covariances are equivalent across groups. The procedure is the same as for metric invariance, except that the test is conducted on intercepts and structural covariances rather than on measurement weights.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. For example, for males there might be a high MI for the covariance between e1 and e2, while for females there might not be. Go ahead and add those covariances where appropriate. When you add a covariance to the model, AMOS applies it to both groups, even if only one group needed it. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure equivalently across groups). The lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if the invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
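The "eyeball" comparison in step 2 can be scripted when there are many items. The sketch below flags items whose standardized loadings differ markedly between groups; the loadings are hypothetical, and the 0.20 flag is a rough illustration, not an established cutoff.<br />

```python
# Quick programmatic version of "eyeballing" group differences in
# standardized regression weights. Loadings below are hypothetical
# values copied from the AMOS output for each group.
male   = {"item1": 0.81, "item2": 0.34, "item3": 0.77}
female = {"item1": 0.79, "item2": 0.88, "item3": 0.72}

# 0.20 is a rough flag for "exceptionally different", not a formal cutoff.
flagged = [item for item in male if abs(male[item] - female[item]) > 0.20]
for item in flagged:
    print(f"{item}: male {male[item]:.2f} vs. female {female[item]:.2f}")
# → item2: male 0.34 vs. female 0.88
```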
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky: if you don't get it right, the model won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach is simply not to constrain the CLF paths to be equal. Instead, do a chi-square difference test between the unconstrained model (with CLF, and marker variable if available) and the same model with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn't work, then move the constraint up to the latent variable variance (constrain it to 1) AND constrain the two paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298342Confirmatory Factor Analysis2018-04-25T19:56:41Z<p>Jgaskin: Reverted edits by Jgaskin (talk) to last revision by 121.52.153.101</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regard to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices; the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. Note, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing SRCs requires the removal of items.<br />
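For illustration, here is a small sketch of scanning a standardized residual covariance matrix for values beyond the 2.58 cutoff (the matrix values and item names below are made up):<br />

```python
# Sketch: flag significant standardized residual covariances (|value| > 2.58).
# 'resid' stands in for the standardized residual covariance matrix reported
# by your SEM software; item names and values are illustrative.

def flag_srcs(resid, names, cutoff=2.58):
    """Return (row, col, value) for each significant off-diagonal residual."""
    flags = []
    for i in range(len(names)):
        for j in range(i):              # lower triangle, off-diagonal only
            if abs(resid[i][j]) > cutoff:
                flags.append((names[i], names[j], resid[i][j]))
    return flags

resid = [[0.0,  0.4,  3.1],
         [0.4,  0.0, -2.7],
         [3.1, -2.7,  0.0]]
print(flag_srcs(resid, ["q1", "q2", "q3"]))
# [('q3', 'q1', 3.1), ('q3', 'q2', -2.7)]
```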
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*ASV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
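For readers who want to compute these values by hand rather than with the Stats Tools Package, the standard formulas can be sketched as follows, using a factor's standardized loadings λ and its correlations with the other latent factors: CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)), AVE = Σλ² / n, MSV is the largest squared inter-construct correlation, and ASV is the average squared inter-construct correlation. The loadings and correlations below are illustrative numbers, not from any real model:<br />

```python
# Sketch of the standard CR/AVE/MSV/ASV formulas for one factor, computed
# from its standardized loadings and its correlations with the other latent
# factors. All numbers below are made up for illustration.

def cr(loadings):
    s = sum(loadings)
    e = sum(1 - l ** 2 for l in loadings)      # item error variances
    return s ** 2 / (s ** 2 + e)

def ave(loadings):
    return sum(l ** 2 for l in loadings) / len(loadings)

def msv(correlations):
    return max(r ** 2 for r in correlations)

def asv(correlations):
    return sum(r ** 2 for r in correlations) / len(correlations)

loadings = [0.82, 0.76, 0.71]   # standardized loadings on one factor
corrs = [0.45, 0.30]            # correlations with the other factors

print(round(cr(loadings), 3))   # 0.808 -- composite reliability, want > 0.7
print(round(ave(loadings), 3))  # 0.585 -- want > 0.5, and greater than MSV
print(round(msv(corrs), 3), round(asv(corrs), 3))
```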
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error" (p. 702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for a common method bias you can do a few different tests. Each will be described below. For a step by step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to be just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
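Conceptually, the check can be approximated outside SPSS as well. The sketch below is our own illustration (using random data) and reads the share of total variance captured by the first principal component of the correlation matrix, which is a rough proxy for the unrotated single-factor variance explained:<br />

```python
import numpy as np

# Rough proxy for Harman's single-factor check: the proportion of total
# variance captured by the first principal component of the correlation
# matrix. (In SPSS you would instead constrain the EFA extraction to one
# factor and read the unrotated variance explained.)

def first_factor_variance(data):
    """data: (n_observations, n_variables) array of survey responses."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)     # returned in ascending order
    return eigenvalues[-1] / corr.shape[0]     # largest eigenvalue / n vars

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))               # illustrative random data
share = first_factor_variance(data)
print(f"First factor explains {share:.1%} of the variance")
# If this share exceeded roughly 50%, CMB would be a concern.
```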
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (greater than 0.200), then you will want to retain the CLF either as you impute composites from factor scores, or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is an extended, more accurate version of the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit.<br />
##Assess and adjust to achieve adequate validity and reliability.<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed variables. <br />
##The steps below assume the CLF will break the model, so where the instructions say to connect the SB constructs to all observed variables, connect the CLF instead (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of the other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If the change in AVE is extreme (e.g., >.300), then there is too much shared variance attributable to a response bias. This means that construct is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to the other constructs' indicators (but not to their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as controls in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject the null; the response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB constructs to the other constructs' indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject the null - that they are equal), then the bias is equally distributed. Make note of this in your report, e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB constructs for subsequent causal analyses. Make note of this in your report, e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then excluding from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
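The chi-square difference test itself is simple arithmetic once AMOS reports each model's chi-square and degrees of freedom. A sketch, using made-up values (this assumes SciPy is available for the chi-square distribution):<br />

```python
from scipy.stats import chi2

# Sketch of the chi-square difference test used for the zero-constrained
# comparison. Plug in the chi-square and degrees of freedom that AMOS
# reports for each model; the values below are made up.

def chi_square_difference(chi_unconstrained, df_unconstrained,
                          chi_constrained, df_constrained):
    delta_chi = chi_constrained - chi_unconstrained
    delta_df = df_constrained - df_unconstrained
    p = chi2.sf(delta_chi, delta_df)    # survival function = 1 - cdf
    return delta_chi, delta_df, p

dchi, ddf, p = chi_square_difference(312.4, 180, 356.9, 205)
print(f"Δχ² = {dchi:.1f}, Δdf = {ddf}, p = {p:.3f}")
# A p-value below .05 would mean the two models differ, i.e., the shared
# method variance is significantly different from zero.
```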
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=6j4_ZrkCxTc '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups, otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset).<br />
<br />
An even simpler and less time-consuming approach to metric invariance is to conduct a multigroup moderation test using critical ratios for differences in AMOS. Below is a video to explain how to do this. The video is about a lot of things in the CFA, but the link below will start you at the time point for testing metric invariance with critical ratios.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''Metric Invariance''']<br />
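Behind the scenes, a critical ratio for differences is essentially a z statistic comparing two unstandardized estimates using their standard errors. The sketch below assumes the standard two-independent-groups form of the formula (no covariance between the two estimates); the numbers are illustrative:<br />

```python
import math

# Sketch of a critical ratio for differences between two groups'
# unstandardized estimates, assuming independent groups (so the covariance
# between the two estimates is zero). Values are illustrative.

def critical_ratio(b1, se1, b2, se2):
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

z = critical_ratio(0.84, 0.09, 0.55, 0.11)   # e.g., male vs. female loading
print(round(z, 2))  # 2.04
# |z| > 1.96 suggests the estimate differs between groups at p < .05.
```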
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When you add a covariance, AMOS applies it to both groups, even if you only needed it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group, rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things; but SW only has three items right now, so another option is to remove rd3 or q5, and if that does not fix things, return to this matrix after rerunning the model to see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues after this, then your groups are exceptionally different. This may be due to a small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
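The "eyeball" comparison of regression weights in step 2 above can be made a little more systematic with a helper like the following. The item names, loadings, and the 0.3 difference threshold are all hypothetical:<br />

```python
# Illustrative helper for comparing standardized loadings across two groups:
# list the items whose loadings differ by more than a chosen threshold,
# largest gap first. Names, values, and threshold are made up.

def loading_gaps(group1, group2, threshold=0.3):
    """Return (item, loading1, loading2) for items differing by more than threshold."""
    return sorted(
        ((item, group1[item], group2[item])
         for item in group1
         if abs(group1[item] - group2[item]) > threshold),
        key=lambda t: -abs(t[1] - t[2]))

males   = {"item1": 0.78, "item2": 0.34, "item3": 0.81}
females = {"item1": 0.74, "item2": 0.88, "item3": 0.79}
print(loading_gaps(males, females))
# [('item2', 0.34, 0.88)]
```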
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens. In such cases, it is acceptable to constrain the error variance to a small positive number (e.g., 0.001).<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a deeper measurement issue (like skewness or kurtosis, too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies a negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work, then move the constraint up to the latent variable's variance (constrain it to 1) AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298334Confirmatory Factor Analysis2018-04-25T19:53:54Z<p>Jgaskin: /* Scalar */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted however, that in practice, I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e, the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor), than by its own observed variables.<br />
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error.” (Malhotra and Dash, 2011, p.702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for a common method bias you can do a few different tests. Each will be described below. For a step by step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
A Harman's single factor test tests to see if the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to be just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardised regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200) then you will want to retain the CLF as you either impute composites from factor scores, or as you move in to the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is simply an extended, and more accurate way to do the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
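AMOS reports the chi-square and degrees of freedom for each model; the difference test itself is then simple arithmetic. Below is a sketch in Python (using scipy) with hypothetical fit values for the zero-constrained comparison; the function and numbers are illustrative, not output from any particular model.

```python
from scipy.stats import chi2

def chi_square_difference(chi2_unconstrained, df_unconstrained,
                          chi2_constrained, df_constrained):
    """Chi-square difference test for two nested models.

    The constrained model always has more degrees of freedom
    (because it estimates fewer free parameters).
    """
    d_chi2 = chi2_constrained - chi2_unconstrained
    d_df = df_constrained - df_unconstrained
    p = chi2.sf(d_chi2, d_df)  # survival function = 1 - CDF
    return d_chi2, d_df, p

# Hypothetical fit values: unconstrained model vs. zero-constrained model
d_chi2, d_df, p = chi_square_difference(512.3, 340, 545.1, 362)
if p < 0.05:
    print("Models differ: the specific response bias is not zero")
else:
    print("Invariant: no detectable specific response bias")
```

The same arithmetic applies to the equal-constrained test; only the pair of models being compared changes.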
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this in one of two ways. The first is to impute factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias), and then exclude the imputed SB construct from your causal model. The second is to disconnect the SB construct from all your observed variables, covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the bias will not be parceled out, so you will need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
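For the AVE comparison described earlier in this section (flagging changes greater than .300), recall that a factor's AVE is just the mean of its squared standardized loadings. A quick sketch with hypothetical loadings, before and after the specific bias construct absorbs shared variance:

```python
def ave(loadings):
    """Average Variance Extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

# Hypothetical standardized loadings for one factor, before and after
# adding the specific bias (SB) construct
before = [0.82, 0.78, 0.85, 0.80]
after = [0.55, 0.50, 0.60, 0.52]

delta = ave(before) - ave(after)
print(round(ave(before), 3), round(ave(after), 3), round(delta, 3))
# A change above .300, as in this made-up case, would flag the factor as compromised
```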
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups, just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if the chi-square difference test returns a significant p-value, you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of the split dataset). If there is a difference between groups, you will want to find which factors differ (do this one at a time, as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. This is done the same as with metric invariance, but with the test being done on intercepts and structural covariances instead of measurement weights.<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or an item or two loads better on a factor other than its own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if the invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group, rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things; however, SW only has three items right now, so another option is to remove rd3 or q5 instead. If that does not help, return to this matrix after rerunning the model and look for any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor (two items will also sometimes work if necessary, but two becomes unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Structural_Equation_Modeling&diff=1298302Structural Equation Modeling2018-04-25T19:37:42Z<p>Jgaskin: /* Mediation */</p>
<hr />
<div>“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.“ http://www.pire.org/<br />
<br />
SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slides presentations are provided in the subsections.<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
<br />
== Hypotheses ==<br />
Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe. <br />
===Direct effects===<br />
"Diet has a positive effect on weight loss"<br />
<br />
"An increase in hours spent watching television will negatively affect weight loss"<br />
===Mediated effects===<br />
For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the degree of the mediation (partial, full, or simply indirect), and the direction of the mediated relationship (positive or negative).<br />
<br />
"Exercise positively and partially mediates the positive relationship between diet and weight loss"<br />
<br />
"Television time positively and fully mediates the positive relationship between diet and weight loss"<br />
<br />
"Diet affects weight loss positively and indirectly through exercise"<br />
<br />
===Interaction effects===<br />
"Exercise positively moderates the positive relationship between diet and weight loss"<br />
<br />
"Exercise amplifies the positive relationship between diet and weight loss"<br />
<br />
"TV time negatively moderates (dampens) the positive relationship between diet and weight loss"<br />
<br />
===Multi-group effects===<br />
"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"<br />
<br />
"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"<br />
<br />
"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, and for eastern (Asian) diets the effect is positive and strong"<br />
===Mediated Moderation===<br />
An example of a mediated moderation hypothesis would be something like: <br />
<br />
“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.” <br />
<br />
In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator): <br />
<br />
“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”<br />
===Handling controls===<br />
When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for...[list control variables here]"<br />
For example:<br />
<br />
"Exercise positively moderates the positive relationship between diet and weight loss ''when controlling for TV time and age''"<br />
<br />
"Diet has a positive effect on weight loss ''when controlling for TV time and exercise''"<br />
<br />
Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.<br />
<br />
=== Logical Support for Hypotheses ===<br />
Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for '''''WHY''''' you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss, for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like: <br />
*Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).<br />
<br />
===Statistical Support for Hypotheses through global and local tests===<br />
In order for a hypothesis to be supported, many criteria must be met. These criteria can be classified as global or local tests. In order for a hypothesis to be supported, the local test must be met, but in order for a local test to have meaning, all global tests must be met. Global tests of model fit are the first necessity. If a hypothesized relationship has a significant p-value, but the model has poor fit, we cannot have confidence in that p-value. Next is the global test of variance explained or R-squared. We might observe significant p-values and good model fit, but if R-square is only 0.025, then the relationships we are testing are not very meaningful because they do not explain sufficient variance in the dependent variable. The figure below illustrates the precedence of global and local tests. Lastly, and almost needless to explain, if a regression weight is significant, but is in the wrong direction, our hypothesis is not supported. Instead, there is counter-evidence. For example, if we theorized that exercise would increase weight loss, but instead, exercise decreased weight loss, then we would have counter-evidence.<br />
<br />
[[File:globallocal.png]]<br />
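To make the global test of variance explained concrete, here is a minimal sketch of how R-squared is computed from observed values and model predictions. The numbers are made up purely for illustration:

```python
def r_squared(y, y_hat):
    """Proportion of variance in y explained by the model's predictions."""
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical observed weight loss vs. model-predicted values
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(round(r_squared(observed, predicted), 3))  # -> 0.95
```

An R-squared near 0.95 would pass the global test easily; one near 0.025, as in the paragraph above, would not, no matter how significant the p-values.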
<br />
== Controls ==<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Controls.pptx '''Controls''']<br />
Controls are potentially confounding variables that we need to account for, but that don’t drive our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying, that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.<br />
<br />
[[File:controlsIQ.png]]<br />
<br />
As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the higher your sample size needs to be. Also, you get a higher R-square, but with increasingly smaller gains for each added control. Sometimes you may even find that adding a control “drowns out” all the effects of the IVs; in such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, only account for a small amount of the variance in the DV). With that in mind, you can’t and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.<br />
<br />
Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don’t have arrows going into them), and have them regress on whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control, with regards to valLong. And I have LoyRepeat as a control on LoyLong. I’ve also covaried the Controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than as default. However, there are different schools of thought on this. The downside of covarying with all exogenous variables is that you gain no degrees of freedom. If you are in need of degrees of freedom, then try removing the non-significant covariances with controls.<br />
<br />
[[File:controlsAMOS.png]]<br />
<br />
When reporting the model, you '''''do''''' need to include the controls in '''''all''''' your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don’t get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how the control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are also other schools of thought on this issue).<br />
<br />
== Mediation ==<br />
*[[File:books.jpg]]'''''Lesson:''''' [http://www.kolobkreations.com/Mediation%20Step%20by%20Step%20with%20Bootstrapping.pptx '''Testing Mediation using Bootstrapping''']<br />
*[[File:YouTube.png]] '''''Video Lecture:''''' [http://youtu.be/j_yufPUjkwk?hd=1 '''A Simpler Guide to Mediation''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/0artfnxyF_A '''Mediation in AMOS''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/41XgTZc66ko '''Specific Indirect Effects''']<br />
*'''''Hair et al.:''''' ''pp. 751-755''<br />
<br />
=== Concept ===<br />
<br />
Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that supplies the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, ''work effectiveness'' may be a good mediator. We would say that work effectiveness mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is ''better'' explained through the mediator of work effectiveness. The logic is, intelligent workers tend to perform better '''because''' they work more effectively. Thus, when intelligence leads to working smarter, we observe greater performance. <br />
<br />
[[File:mediation.png]]<br />
<br />
<br />
We used to theorize three main types of mediation based on the Baron and Kenny approach; namely: 1) partial, 2) full, and 3) indirect. However, recent literature suggests that mediation is less nuanced than this -- that simply, if a significant indirect effect exists, then mediation is present.<br />
<br />
Here is another useful site for mediation: https://msu.edu/~falkcarl/mediation.html<br />
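AMOS computes bootstrapped indirect effects for you (see the bootstrapping lesson above), but the underlying idea is simple: estimate path a (X to M) and path b (M to Y, controlling for X), multiply them, and repeat over resampled data to obtain a confidence interval. The sketch below, in plain Python with simulated data, is purely conceptual and not a substitute for the AMOS procedure:

```python
import random

def slope(x, y):
    """Simple regression slope of y on x (path a: X -> M)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def partial_slope(x, m, y):
    """Slope of m in the regression y ~ x + m (path b, controlling for X)."""
    n = len(x)
    def cov(u, v):
        mu, mv = sum(u) / n, sum(v) / n
        return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    num = cov(m, y) * cov(x, x) - cov(x, y) * cov(x, m)
    den = cov(x, x) * cov(m, m) - cov(x, m) ** 2
    return num / den

def bootstrap_indirect(x, m, y, reps=1000, seed=42):
    """Percentile bootstrap confidence interval for the indirect effect a*b."""
    boot = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [boot.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        draws.append(slope(xs, ms) * partial_slope(xs, ms, ys))
    draws.sort()
    return draws[int(0.025 * reps)], draws[int(0.975 * reps)]

# Simulated data: diet (X) -> exercise (M) -> weight loss (Y)
rng = random.Random(7)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]

indirect = slope(x, m) * partial_slope(x, m, y)
lo, hi = bootstrap_indirect(x, m, y)
print("indirect =", round(indirect, 3), "95% CI:", round(lo, 3), round(hi, 3))
```

If the 95% confidence interval excludes zero, the indirect effect is significant, which per the paragraph above is the modern criterion for mediation.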
<br />
== Interaction ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=K34sF_AmWio '''Testing Interaction Effects''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Interaction.pptx '''Interaction Effects''']<br />
===Concept===<br />
In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. <br />
This is another form of moderation (along with multi-grouping) – i.e., the X to Y relationship changes form (gets stronger, weaker, changes signs) depending on the value of another explanatory variable (the moderator). So, for example<br />
*you lose 1 pound of weight for every hour you exercise<br />
*you lose 1 pound of weight for every 500 calories you cut back from your regular diet<br />
*but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus, in total, you lose three pounds<br />
So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:<br />
*Chocolate is yummy<br />
*Cheese is yummy<br />
*but combining chocolate and cheese is yucky!<br />
<br />
The following figure is an example of a simple interaction model.<br />
<br />
[[File:interaction.png]]<br />
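The diet and exercise bullets above can be written as a regression equation with an interaction term. In the sketch below the coefficients are hypothetical, chosen only to reproduce the numbers in the example (diet in 500-kcal cuts, exercise in hours):

```python
def weight_loss(diet, exercise, b1=1.0, b2=1.0, b3=1.0):
    """Pounds lost: b1 and b2 are the main effects, b3 is the interaction term.

    Coefficients are made up to match the running example above.
    """
    return b1 * diet + b2 * exercise + b3 * diet * exercise

print(weight_loss(1, 0))  # dieting alone -> 1.0 pound
print(weight_loss(0, 1))  # exercising alone -> 1.0 pound
print(weight_loss(1, 1))  # both together -> 3.0 pounds, more than 1 + 1
```

The "simple slope" of diet is b1 + b3 * exercise, so the effect of each 500-calorie cut grows from 1 pound to 2 pounds when exercise moves from 0 to 1 hour; that changing slope is exactly what the moderation hypothesis claims.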
<br />
===Types===<br />
Interactions enable more precise explanation of causal effects by providing a method for explaining not only ''how'' X affects Y, but also ''under what circumstances'' the effect of X changes depending on the moderating variable of Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically. <br />
<br />
[[File:interactionTypes.png]]<br />
<br />
== Model fit again ==<br />
You already did model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. Every time the model changes and a hypothesis is tested, model fit must be assessed. If multiple hypotheses are tested on the same model, model fit will not change, so it only needs to be addressed once for that set of hypotheses. The method for assessing model fit in a causal model is the same as for a measurement model: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms. <br />
*If the correlated variables are ''not'' logically '''causally''' correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.<br />
**e.g., burnout from customers is highly correlated with burnout from management<br />
**We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.<br />
*If the correlated variables are logically '''causally''' correlated, then simply add a regression line.<br />
**e.g., burnout from customers is highly correlated with satisfaction with customers<br />
**We expect burnC to predict satC, so ''not'' accounting for it is negligent.<br />
<br />
Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
<br />
== Multi-group ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=mirI5ETQRTA '''Testing Multi-group Moderation using Chi-square difference test'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/w5ikoIgTIc0?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Testing Multi-group differences using AMOS's multigroup function''']<br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/Mediation%20and%20Multi-group%20Moderation.pptx '''Mediation versus Moderation''']<br />
Multi-group comparisons are a special form of moderation in which a dataset is split along values of a grouping variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. Multi-group comparisons are used to determine whether relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for example. A multi-group analysis would answer the question: does dieting affect weight loss differently for males than for females?<br />
In the videos above, you will learn how to set up a multigroup analysis in AMOS and test it using chi-square differences and AMOS's built-in multigroup function. For those who have seen my video on the critical ratios approach, be warned that the chi-square approach is currently the most widely accepted, because the critical ratios approach doesn't take into account the family-wise error that arises when testing multiple hypotheses simultaneously. For now, I recommend using the chi-square approach. The AMOS built-in multigroup function uses the chi-square approach as well.<br />
<br />
==From Measurement Model to Structural Model ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=n-ULF6BGVw0 '''From CFA to SEM in AMOS''']<br />
Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.<br />
<br />
==Creating Factor Scores from Latent Factors==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=dsOS9tQjxW8 '''Imputing Factor Scores in AMOS''']<br />
If you would like to create factor scores (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:<br />
*You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data). <br />
*Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").<br />
After those two caveats are addressed, then you can simply go to the ''Analyze'' menu, and select ''Data Imputation''. Select ''Regression Imputation'', and then click on the ''Impute'' button. This will create a new SPSS dataset with the same name as the current dataset except it will be followed by an "_C". This can be found in the same folder as your current dataset.<br />
<br />
==Need more degrees of freedom==<br />
Did you run your model and observe that DF = 0 or CFI = 1.000? It sounds like you need more degrees of freedom. There are a few ways to get them:<br />
#If there are opportunities to use latent variables instead of computed variables, use latents.<br />
#If you have control variables, do not link them to every other variable.<br />
#Do not include all paths by default. Just include the ones that make good theoretical sense.<br />
#If a path is not significant, omit it. If you do this, make sure to argue that the reason for doing this was to increase degrees of freedom (and also because the path was not significant).<br />
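These tricks work because model degrees of freedom equal the number of distinct sample moments (the variances and covariances of the observed variables) minus the number of freely estimated parameters, so every path you trim frees up one df. A quick sketch with hypothetical counts:

```python
def sem_df(observed_vars, free_parameters):
    """Model df = distinct sample moments (variances + covariances)
    minus the number of freely estimated parameters."""
    moments = observed_vars * (observed_vars + 1) // 2
    return moments - free_parameters

# Hypothetical model: 10 observed variables, 30 freely estimated parameters
print(sem_df(10, 30))  # -> 25 degrees of freedom
```

Note that AMOS counts means as additional moments when mean structures are estimated, so its reported totals can be higher than this covariance-only sketch.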
Increasing the degrees of freedom allows AMOS to calculate model fit measures. If you have zero degrees of freedom, model fit is irrelevant because you are "perfectly" accounting for all possible relationships in the model.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298294Confirmatory Factor Analysis2018-04-25T19:34:09Z<p>Jgaskin: /* Measurement Model Invariance */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
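If you want to double-check the values AMOS reports against the thresholds in the table, two of these metrics are straightforward to recompute from the chi-square output. The sketch below uses the standard formulas for CFI and RMSEA; the chi-square values, dfs, and sample size are hypothetical:<br />

```python
import math

def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative Fit Index from the model and baseline (null) chi-squares."""
    d_m = max(chi2_m - df_m, 0.0)          # model non-centrality
    d_b = max(chi2_b - df_b, d_m)          # baseline non-centrality
    return 1.0 - d_m / d_b

def rmsea(chi2_m, df_m, n):
    """Root Mean Square Error of Approximation for sample size n."""
    return math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))

# Hypothetical output: model chi2 = 120.5 on df = 80,
# null chi2 = 980.0 on df = 105, N = 300
print(round(cfi(120.5, 80, 980.0, 105), 3))  # above the .95 threshold
print(round(rmsea(120.5, 80, 300), 3))       # below the .06 threshold
```

A model whose chi-square does not exceed its df yields CFI = 1.000, which ties back to the degrees-of-freedom discussion above.<br />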
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.<br />
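These measures can also be computed by hand from the standardized loadings and inter-construct correlations (the same calculation the Stats Tools Package automates). A small Python sketch with hypothetical loadings and a hypothetical largest inter-construct correlation:<br />

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    using standardized loadings."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def ave(loadings):
    """Average Variance Extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings for one factor
lam = [0.82, 0.78, 0.75, 0.70]
print(round(composite_reliability(lam), 3))   # CR, should exceed 0.7
print(round(ave(lam), 3))                     # AVE, should exceed 0.5

# Discriminant validity checks, assuming the factor's largest
# inter-construct correlation is r = 0.55
msv = 0.55 ** 2
print(ave(lam) > msv)            # MSV < AVE
print(ave(lam) ** 0.5 > 0.55)    # sqrt(AVE) > inter-construct correlation
```

The two discriminant checks at the end are equivalent restatements of the thresholds listed above.<br />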
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011, p. 702) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error."<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias, you can perform a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
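Outside SPSS, the spirit of this test can be approximated by checking the share of total variance captured by the first unrotated factor, approximated here by the first eigenvalue of the item correlation matrix. The data below are simulated purely for illustration:<br />

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated responses: 200 cases, 8 items with modest shared method variance
common = rng.normal(size=(200, 1))
items = 0.4 * common + rng.normal(size=(200, 8))

corr = np.corrcoef(items, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]        # eigenvalues, descending
first_factor_share = eigvals[0] / eigvals.sum() # eigenvalues sum to 8 items

print(f"First factor explains {first_factor_share:.1%} of total variance")
# Rule of thumb: above 50% would suggest common method bias is a concern
```

With only modest shared variance built into the simulation, the first factor should fall well short of the 50% warning level.<br />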
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (greater than 0.200, for example), then you will want to retain the CLF as you either impute composites from factor scores or move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
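The loading comparison itself is simple arithmetic once the two sets of standardized weights are exported; a sketch with hypothetical loadings:<br />

```python
# Hypothetical standardized loadings for the same items, keyed by item name,
# from the model without the CLF and the model with the CLF
without_clf = {"q1": 0.81, "q2": 0.77, "q3": 0.74, "q4": 0.69}
with_clf    = {"q1": 0.78, "q2": 0.52, "q3": 0.71, "q4": 0.66}

# Flag items whose loading shifts by more than 0.200 once the CLF is added
flagged = {item: round(without_clf[item] - with_clf[item], 3)
           for item in without_clf
           if abs(without_clf[item] - with_clf[item]) > 0.200}
print(flagged)  # any flagged item suggests retaining the CLF
```

Here only q2 would be flagged; a mostly empty result would suggest the CLF can be dropped.<br />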
<br />
=== Marker Variable ===<br />
This method is an extension of, and more accurate than, the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but not the paths to their own indicators) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
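Both the zero-constrained and equal-constrained tests above come down to a chi-square difference between nested models, which you can verify outside AMOS. A sketch using SciPy; the chi-square values and dfs below are hypothetical:<br />

```python
from scipy.stats import chi2

def chi_square_difference(chi2_unconstrained, df_unconstrained,
                          chi2_constrained, df_constrained):
    """Chi-square difference test between nested models. The constrained
    model has more df and an equal or higher chi-square."""
    delta_chi2 = chi2_constrained - chi2_unconstrained
    delta_df = df_constrained - df_unconstrained
    p = chi2.sf(delta_chi2, delta_df)  # upper-tail probability
    return delta_chi2, delta_df, p

# Hypothetical zero-constrained vs. unconstrained model results
d_chi2, d_df, p = chi_square_difference(312.4, 180, 335.9, 196)
print(f"delta chi2 = {d_chi2:.1f} on {d_df} df, p = {p:.3f}")
# p > .05 -> fail to reject: no detectable specific response bias
```

The same function applies to the measurement model invariance tests later on this page: compare the unconstrained multigroup model to the model with loadings constrained equal across groups.<br />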
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this in one of two ways. Either impute factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias), and then exclude the imputed SB construct from your causal model; or disconnect the SB construct from all your observed variables, covary it with all your latent variables, and then impute factor scores. If you take this latter approach, the SB variance will not be parceled out, so you will need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs?t=54m19s '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural, metric, and scalar invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups, just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, rather than the split dataset). If there is a difference between groups, you'll want to find which factors are different (do this one at a time, as demonstrated in the video above). Make sure you place the factor constraint of 1 on the factor variance, rather than on the indicator paths (as shown in the video). <br />
<br />
=== Scalar ===<br />
If we pass metric invariance, we need to then assess scalar invariance. This can be done as shown in the video above. Essentially you need to assess whether intercepts and structural covariances are equivalent across groups. To do this, you'll need to set each factor variance constraint of one of the groups to equal 1. This can be done in the ''manage models'' section of AMOS. The rest can remain named (as the MGA function in AMOS automatically names them). <br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. For example, for males there might be a high MI for the covariance between e1 and e2, while for females there might not be. Go ahead and add those covariances as appropriate; when you add them to the model, AMOS applies them to both groups, even if you only needed them for one. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1298280Confirmatory Factor Analysis2018-04-25T19:24:19Z<p>Jgaskin: /* Metric */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted however, that in practice, I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e, the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor), than by its own observed variables.<br />
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error.” (Malhotra and Dash, 2011, p.702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures - something other than the question itself may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. There are a few different tests for common method bias, each described below. For a step by step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues), and then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
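The Harman logic above can be sketched numerically. This is an assumption-laden approximation: the single-factor share of variance is taken here as the first eigenvalue of the item correlation matrix (a principal components operationalization; SPSS's constrained one-factor extraction will give a similar figure), and the data are randomly simulated placeholders:<br />

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))  # placeholder: 200 respondents x 8 survey items

R = np.corrcoef(X, rowvar=False)       # 8 x 8 item correlation matrix
eigvals = np.linalg.eigvalsh(R)[::-1]  # eigenvalues, largest first

# Share of total variance captured by a single (first, unrotated) factor.
first_factor_share = eigvals[0] / eigvals.sum()
print(f"Single factor explains {first_factor_share:.1%} of total variance")
# A share above roughly 50% would flag potential common method bias.
```

Random noise like this has no dominant factor, so the share stays low; real survey data with heavy method variance would push it much higher.<br />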
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to those of a model without the CLF. If there are large differences (greater than 0.200, say), then you will want to retain the CLF, either as you impute composites from factor scores or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
<br />
=== Marker Variable ===<br />
This method is an extended, more accurate version of the common latent factor method. For this method, add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors; any common variance is thus likely due to a common method bias, rather than to natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed variables. <br />
##The steps below assume the CLF breaks the model; if it did not, then wherever an instruction says to connect the SB constructs to all observed variables, connect the CLF instead.<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If the change in AVE is extreme (e.g., >.300), then too much of that factor's shared variance is attributable to response bias. That factor is compromised, and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
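Both the zero-constrained and the equal-constrained tests above reduce to a chi-square difference test between nested models. A small Python sketch, assuming you have already run both models in AMOS and copied out their chi-square (CMIN) and df values (the fit statistics below are hypothetical):<br />

```python
from scipy.stats import chi2

def chi_square_difference(chisq_constrained, df_constrained,
                          chisq_unconstrained, df_unconstrained):
    # p-value for the difference between two nested models; the constrained
    # model always has the larger chi-square and the larger df.
    d_chisq = chisq_constrained - chisq_unconstrained
    d_df = df_constrained - df_unconstrained
    return chi2.sf(d_chisq, d_df)  # upper-tail probability

# Hypothetical CMIN/DF values copied from the two AMOS runs:
p = chi_square_difference(chisq_constrained=312.4, df_constrained=168,
                          chisq_unconstrained=280.1, df_unconstrained=150)

if p < 0.05:
    print(f"p = {p:.4f}: reject null -> bias is not zero; run the equal-constrained test")
else:
    print(f"p = {p:.4f}: models invariant -> no detectable specific bias")
```

The same helper works for the metric invariance test later on; only the pair of models being compared changes.<br />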
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this in one of two ways. Either impute factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude the imputed SB construct from your causal model. Or disconnect the SB construct from all your observed variables, covary it with all your latent variables, and then impute factor scores. With this latter approach the SB is not parceled out, so you will need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=6j4_ZrkCxTc '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To do this, simply perform a chi-square difference test on the two groups, just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if the p-value for the chi-square difference test is significant, then you have evidence of differences between groups; otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of the split dataset). If there is a difference between groups, you'll want to find which factors differ (do this one at a time, as demonstrated in the video above).<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. For example, for males there might be a high MI for the covariance between e1 and e2, but not for females. Go ahead and add those covariances where appropriate; note that when you add a covariance in AMOS, it is added for both groups, even if only one group needed it. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group, rd3 and q5 have high standardized residual covariances with sw1. We could remove sw1, but SW only has three items right now, so another option is to remove rd3 or q5 instead; if that does not fix things, return to this matrix after rerunning the model and look for any other issues. Remove items sparingly, only one at a time, and try to leave at least three items with each factor (two items will also sometimes work if necessary, but two becomes unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups; if so, you may have to list that as a limitation and move on.<br />
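The eyeballing of group loadings described in step 2 above can be scripted when you have many items. A minimal sketch with hypothetical loadings (the 0.20 flagging threshold is my own illustrative choice, not a published cutoff):<br />

```python
# Hypothetical standardized regression weights per item, one list per group.
items   = ["sw1", "sw2", "sw3", "rd3", "q5"]
males   = [0.71, 0.68, 0.74, 0.34, 0.80]
females = [0.69, 0.72, 0.70, 0.88, 0.77]

THRESHOLD = 0.20  # illustrative cutoff for an "exceptionally different" loading
flagged = []
for item, m, f in zip(items, males, females):
    gap = abs(m - f)
    note = "  <-- candidate for removal" if gap > THRESHOLD else ""
    if note:
        flagged.append(item)
    print(f"{item}: male={m:.2f}, female={f:.2f}, |diff|={gap:.2f}{note}")
```

Here only rd3 would be flagged, mirroring the item2 example in the text where a loading differs sharply (0.34 vs. 0.88) across groups.<br />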
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is simply not to constrain the CLF paths to be equal. Instead, do a chi-square difference test between the unconstrained model (with CLF, and marker if available) and the same model with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work, then move the constraint up to the latent variable variance AND constrain the two paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=References&diff=1295804References2018-04-24T23:30:19Z<p>Jgaskin: /* Confirmatory Factor Analysis */</p>
<hr />
<div><br />
==Constructs and Validity==<br />
*MacKenzie, S. B., Podsakoff, P. M., & Podsakoff, N. P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293-334.<br />
*Bolton, R. N. (1993). Pretesting questionnaires: content analyses of respondents' concurrent verbal protocols. Marketing science, 12(3), 280-303.<br />
*Podsakoff, N. P., Podsakoff, P. M., MacKenzie, S. B., & Klinger, R. L. (2013). Are we really measuring what we say we're measuring? Using video techniques to supplement traditional construct validation procedures. Journal of Applied Psychology, 98(1), 99.<br />
*Nahm, A. Y., Rao, S. S., Solis-Galvan, L. E., & Ragu-Nathan, T. S. (2002). The Q-sort method: assessing reliability and construct validity of questionnaire items at a pre-testing stage. Journal of Modern Applied Statistical Methods, 1(1), 15.<br />
*Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of consumer research, 30(2), 199-218.<br />
*MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323-326.<br />
*Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social indicators research, 46(2), 137-155.<br />
*Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent variable systems. Structural equation modeling: Present and future, 195-216. (discusses MaxR(H))<br />
<br />
==Measurement Models==<br />
===Exploratory Factor Analysis===<br />
*Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological methods, 4(3), 272.<br />
*Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most From Your Analysis. Practical Assessment, Research & Evaluation,10(7), 1-9.<br />
*Reio Jr, T. G., & Shuck, B. (2015). Exploratory factor analysis: Implications for theory, research, and practice. Advances in Developing Human Resources, 17(1), 12-25.<br />
*Treiblmaier, H., & Filzmoser, P. (2010). Exploratory factor analysis revisited: How robust methods support the detection of hidden multivariate data structures in IS research. Information & management, 47(4), 197-207.<br />
*Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A users’ guide. International Journal of Selection and Assessment, 1(2), 84-94.<br />
<br />
===Confirmatory Factor Analysis===<br />
*Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational research methods, 3(1), 4-70.<br />
*Byrne, B. M. (2008). Testing for multigroup equivalence of a measuring instrument: A walk through the process. Psicothema, 20(4), 872-882.<br />
*Byrne, B. M. (2004). Testing for multigroup invariance using AMOS graphics: A road less traveled. Structural Equation Modeling, 11(2), 272-300.<br />
*Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: Review of practice and implications. Human Resource Management Review, 18(4), 210-222.<br />
*Brown, T. A. (2014). Confirmatory factor analysis for applied research (2nd ed.). Guilford Publications.<br />
*Matsunaga, M. (2010). How to factor-analyze your data right: do’s, don’ts, and how-to’s. International Journal of Psychological Research, 3(1), 97-110.<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
*Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
====Method Bias, Response Bias, Specific Bias====<br />
*Fuller et al. (2016). Common methods variance detection in business research. Journal of Business Research, 69(8), 3192-3198. (suggests Harman's single factor test is useful under certain circumstances)<br />
*Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. Journal of applied psychology, 88(5), 879.<br />
*MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542-555.<br />
*Williams, L. J., Hartman, N., & Cavazotte, F. (2010). Method variance and marker variables: A review and comprehensive CFA marker technique. Organizational Research Methods, 13(3), 477-514.<br />
*Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual review of psychology, 63, 539-569. <br />
*Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods.<br />
*Doty, D. H., & Glick, W. H. (1998). Common methods bias: does common methods variance really bias results?. Organizational research methods, 1(4), 374-406.<br />
*Estabrook, R., & Neale, M. (2013). A comparison of factor score estimation methods in the presence of missing data: Reliability and an application to nicotine dependence. Multivariate Behavioral Research, 48(1), 1-27. <br />
*Arbuckle, J. L. (2006). Amos 7.0 user’s guide. Chicago, IL: SPSS. <br />
*Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology, 28, 97-104.<br />
*Lawley, D. N., & Maxwell, M. A. (1971). Factor analysis as a statistical method (2nd ed.). London, UK: Butterworths. <br />
*Horn, J. L., McArdle, J. J., & Mason, R. (1983). When invariance is not invariant: A practical scientist’s view of the ethereal concept of factorial invariance. The Southern Psychologist, 1, 179-188.<br />
*Muthén, L., & Muthén, B. (1998-2007). Mplus user’s guide (5th ed.). Los Angeles, CA: Author.<br />
<br />
===Other===<br />
*Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological methods, 5(2), 155.<br />
*Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological assessment, 7(3), 286.<br />
*Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of marketing research, 186-192.<br />
*Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and social psychology bulletin, 28(12), 1629-1646.<br />
*Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of marketing research, 39-50.<br />
*Bagozzi, R. P. (2011). Measurement and meaning in information systems and organizational research: Methodological and philosophical foundations. Mis Quarterly, 261-292.<br />
*MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90(4), 710.<br />
*Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of business research, 61(12), 1203-1218.<br />
<br />
==Mediation, Moderation, and Moderated Mediation==<br />
===Mediation===<br />
*Mathieu, J. E., & Taylor, S. R. (2006). Clarifying conditions and decision points for mediational type inferences in organizational behavior. Journal of Organizational Behavior, 27(8), 1031-1056.<br />
*Mathieu, J. E., DeShon, R. P., & Bergh, D. D. (2008). Mediational inferences in organizational research: Then, now, and beyond. Organizational Research Methods, 11(2), 203-223.<br />
*MacKinnon, D. P., Coxe, S., & Baraldi, A. N. (2012). Guidelines for the investigation of mediating variables in business research. Journal of Business and Psychology, 27(1), 1-14.<br />
*MacKinnon, D. P., & Pirlott, A. G. (2015). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30-43.<br />
*Preacher, K. J. (2015). Advances in mediation analysis: A survey and synthesis of new developments. Annual Review of Psychology, 66, 825-852.<br />
*Zhao, X., Lynch, J. G., & Chen, Q. (2010). Reconsidering Baron and Kenny: Myths and truths about mediation analysis. Journal of consumer research, 37(2), 197-206.<br />
*Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication monographs, 76(4), 408-420.<br />
<br />
===Moderation and Multigroup===<br />
*Byrne, B. M., & Stewart, S. M. (2006). Teacher's corner: The MACS approach to testing for multigroup invariance of a second-order structure: A walk through the process. Structural Equation Modeling, 13(2), 287-321.<br />
*Schumacker, R. E., & Marcoulides, G. A. (1998). Interaction and nonlinear effects in structural equation modeling. Lawrence Erlbaum Associates Publishers.<br />
*Li, F., Harmer, P., Duncan, T. E., Duncan, S. C., Acock, A., & Boles, S. (1998). Approaches to testing interaction effects using structural equation modeling methodology. Multivariate Behavioral Research, 33(1), 1-39.<br />
*Floh, A., & Treiblmaier, H. (2006). What keeps the e-banking customer loyal? A multigroup analysis of the moderating role of consumer characteristics on e-loyalty in the financial service industry.<br />
<br />
===Both or Other===<br />
*Aguinis, H., Edwards, J. R., & Bradley, K. J. (2016). Improving our understanding of moderation and mediation in strategic management research. Organizational Research Methods, 1094428115627498.<br />
*Sardeshmukh, S. R., & Vandenberg, R. J. (2016). Integrating Moderation and Mediation A Structural Equation Modeling Approach. Organizational Research Methods, 1094428115621609.<br />
*Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate behavioral research, 42(1), 185-227.<br />
*Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Press.<br />
<br />
==Partial Least Squares==<br />
*Becker, J. M., Klein, K., and Wetzels, M. (2012). Hierarchical Latent Variable Models in PLS-SEM: Guidelines for Using Reflective-Formative Type Models. Long Range Planning, 45(5), 359-394.<br />
*Becker, J.-M., Rai, A., Ringle, C. M., and Völckner, F. (2013). Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats. MIS Quarterly, 37 (3), 665-694.<br />
*Gefen, D., & Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: Tutorial and annotated example. Communications of the Association for Information systems, 16(1), 5.<br />
*Hair, J. F., C. M. Ringle, and M. Sarstedt (2011). PLS-SEM. Indeed a Silver Bullet, Journal of Marketing Theory & Practice, 19 (2), 139-151. <br />
*Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40 (3), 414-433. <br />
*Hair, J. F., M. Sarstedt, T. Pieper, and C. M. Ringle (2012). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340. <br />
*Hair, J. F., Ringle, C. M., & Sarstedt, M. (2013). Editorial-partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance.<br />
*Hair, J., Sarstedt, M., Hopkins, L., & Kuppelwieser, V. G. (2014). Partial least squares structural equation modeling (PLS-SEM): An emerging tool in business research. European Business Review, 26(2), 106-121.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2015). A New Criterion for Assessing Discriminant Validity in Variance-based Structural Equation Modeling, Journal of the Academy of Marketing Science, 43 (1), 115–135.<br />
*Henseler, J., C. M. Ringle, and M. Sarstedt (2016). Testing Measurement Invariance of Composites Using Partial Least Squares, International Marketing Review, 33 (3), 405-431.<br />
*Henseler, J., Dijkstra, T.K., Sarstedt, M., Ringle, C.M., Diamantopoulos, A., Straub, D.W., Ketchen, D.J., Hair, J.F., Hult, G.T.M., and Calantone, R.J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013). Organizational Research Methods, 17(2), 182-209. <br />
*Henseler, J., Ringle, C. M., & Sinkovics, R. R. (2009). The use of partial least squares path modeling in international marketing. In New challenges to international marketing (pp. 277-319). Emerald Group Publishing Limited.<br />
*Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.<br />
*Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation modeling (SEM) for building and testing behavioral causal theory: When to choose it and how to use it. IEEE Transactions on Professional Communication, 57(2), 123-146.<br />
*McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17(2), 210-251.<br />
*Monge, C., Cruz, J., & López, F. (2014). Manufacturing and continuous improvement areas using partial least squares path modeling with multiple regression comparison. In Proceedings of CBU International Conference on Innovation, Technology Transfer and Education (2014), February (pp. 3-5).<br />
*Rigdon, E. E. (2014). Rethinking partial least squares path modeling: breaking chains and forging ahead. Long Range Planning, 47(3), 161-167.<br />
*Ringle, C. M., M. Sarstedt, and D. W. Straub (2012). A Critical look at the Use of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.<br />
*Sarstedt, M., Henseler, J., & Ringle, C. M. (2011). Multigroup analysis in partial least squares (PLS) path modeling: Alternative methods and empirical results. In Measurement and research methods in international marketing (pp. 195-218). Emerald Group Publishing Limited.<br />
*Wong, K. K. K. (2013). Partial least squares structural equation modeling (PLS-SEM) techniques using SmartPLS. Marketing Bulletin, 24(1), 1-32.<br />
<br />
==General Topics==<br />
*Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.<br />
*Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage learning.<br />
*Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin, 103(3), 411.<br />
*Suits, D. B. (1957). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548-551.<br />
*Gefen, D., Rigdon, E. E., & Straub, D. (2011). Editor's comments: an update and extension to SEM guidelines for administrative and social science research. MIS Quarterly, iii-xiv.<br />
*Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.<br />
*Blunch, N. (2013). Introduction to structural equation modeling using IBM SPSS statistics and AMOS (2nd ed.). Los Angeles, CA: Sage.<br />
*Kline, R. B. (2011). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford publications.<br />
*Argyrous, G. (2011). Statistics for research: with a guide to SPSS (3rd ed.). Thousand Oaks, CA: Sage Publications.<br />
*Byrne, B. M. (2009). Structural equation modeling with AMOS: basic concepts, applications, and programming (2nd ed.). Abingdon-on-Thames: Routledge.<br />
*Williams, L. J., Vandenberg, R. J., & Edwards, J. R. (2009). Structural equation modeling in management research: A guide for improved analysis. The Academy of Management Annals, 3 (1), 543-604.<br />
<br />
===Model Fit===<br />
*Kenny, D. A. (2012). Measuring Model Fit. http://davidakenny.net/cm/fit.htm<br />
*Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural equation modeling: a multidisciplinary journal, 6(1), 1-55.<br />
*Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.<br />
*Hooper, D., Coughlan, J., & Mullen, M. (2008). Structural Equation Modelling: Guidelines for Determining Model Fit. Electronic Journal of Business Research Methods, 6(1), 53-60.<br />
<br />
==Miscellaneous==<br />
*Khalilzadeh, J., & Tasci, A. D. A. (2017). Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research. Tourism Management, 62, 89-96. http://www.sciencedirect.com/science/article/pii/S026151771730078X<br />
*Green, J. P., Tonidandel, S., & Cortina, J. M. (2016). Getting through the gate: Statistical and methodological issues raised in the reviewing process. Organizational Research Methods, 19(3), 402-432.<br />
*Malhotra, N. K. (2008). Marketing research: An applied orientation (5th ed.). Pearson Education India.<br />
*Gravetter, F. J., & Wallnau, L. B. (2016). Statistics for the behavioral sciences (2nd ed.). Los Angeles: SAGE Publications, Inc.<br />
*Blair, J., Czaja, R. F., & Blair, E. A. (2014). Designing surveys: A guide to decisions and procedures (3rd ed.). Sage Publications.<br />
*Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability.<br />
*Kenny, D. A. (2011). Respecification of Latent Variable Models. http://davidakenny.net/cm/respec.htm<br />
*Bagozzi, R. P., & Yi, Y. (2012). Specification, evaluation, and interpretation of structural equation models. Journal of the academy of marketing science, 40(1), 8-34.<br />
*Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270-301. (for Cook's distance)<br />
*Winklhofer, H. M., & Diamantopoulos, A. (2002). Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach. International Journal of Research in Marketing, 19(2), 151-166.<br />
*Thomas, D. M., & Watson, R. T. (2002). Q-sorting and MIS research: A primer. Communications of the Association for Information Systems, 8(1), 9.<br />
*Osborne, J. W. (2012). Power and Planning for Data Collection. In Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage Publications.<br />
*Steenkamp, J. B. E., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199-214.<br />
*Bacharach, S. B. (1989). Organizational theories: Some criteria for evaluation. Academy of management review, 14(4), 496-515.<br />
*Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274-289.<br />
*Dietz, W. H., & Gortmaker, S. L. (1985). Do we fatten our children at the television set? Obesity and television viewing in children and adolescents. Pediatrics, 75(5), 807-812.<br />
*Peterson, C., Park, N., & Seligman, M. E. (2005). Orientations to happiness and life satisfaction: The full life versus the empty life. Journal of happiness studies, 6(1), 25-41.<br />
*Sposito, V. A., Hand, M. L., & Skarpness, B. (1983). On the efficiency of using the sample kurtosis in selecting optimal Lp estimators. Communications in Statistics - Simulation and Computation, 12(3), 265-272.<br />
*McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of mathematical and statistical Psychology, 34(1), 100-117.<br />
*Trochim, W. M., & Donnelly, J. P. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH: Atomic Dog.<br />
*Gravetter, F., & Wallnau, L. (2014). Essentials of statistics for the behavioral sciences (8th ed.). Belmont, CA: Wadsworth.<br />
*Field, A. (2000). Discovering statistics using SPSS for Windows. London-Thousand Oaks-New Delhi: Sage Publications.<br />
*Field, A. (2009). Discovering statistics using SPSS. London: SAGE.</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1295802Confirmatory Factor Analysis2018-04-24T23:29:19Z<p>Jgaskin: /* Modification indices */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis (EFA) to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regard to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit: our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to perform a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
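The thresholds above can also be screened programmatically once you have the reported statistics. Below is a minimal Python sketch; the cutoff values follow the Hu and Bentler (1999) style guidelines cited above and should be treated as rules of thumb rather than strict pass/fail criteria.<br />

```python
# A minimal fit-screening helper. The cutoffs below are common rules of thumb
# in the Hu & Bentler (1999) tradition -- adjust for your context, since
# sample size and model complexity both matter.

THRESHOLDS = {
    "cmin_df": ("<=", 3.0),   # chi-square / degrees of freedom
    "cfi":     (">=", 0.95),  # comparative fit index
    "srmr":    ("<=", 0.08),  # standardized root mean square residual
    "rmsea":   ("<=", 0.06),  # root mean square error of approximation
}

def evaluate_fit(stats):
    """Return {metric: True if the reported value meets its cutoff}."""
    results = {}
    for name, value in stats.items():
        op, cutoff = THRESHOLDS[name]
        results[name] = (value <= cutoff) if op == "<=" else (value >= cutoff)
    return results

# Hypothetical statistics reported by AMOS for a CFA:
fit = evaluate_fit({"cmin_df": 2.1, "cfi": 0.961, "srmr": 0.052, "rmsea": 0.049})
```

Any `False` entry in the result simply points you back to the modification indices and residuals for diagnosis.<br />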
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website], or to the very helpful article: Hermida, R. 2015. "The Problem of Allowing Correlated Errors in Structural Equation Modeling: Concerns and Considerations," Computational Methods in Social Sciences (3:1), p. 5.<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices, and the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
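As an illustration of the 2.58 rule, here is a small Python sketch that scans a lower-triangular standardized residual covariance matrix (the layout AMOS reports) and flags significant entries. The item names and values are made up for illustration.<br />

```python
# Flag significant standardized residual covariances (|value| > 2.58) in a
# lower-triangular residual matrix. Items and values here are invented.

def flag_srcs(items, residuals, cutoff=2.58):
    """residuals[i][j] holds the standardized residual covariance for
    items[i] vs items[j] (j <= i). Returns offending (item, item, value) triples."""
    flags = []
    for i, row in enumerate(residuals):
        for j, value in enumerate(row):
            if i != j and abs(value) > cutoff:
                flags.append((items[i], items[j], value))
    return flags

items = ["sw1", "sw2", "rd3"]
residuals = [
    [0.00],
    [0.41, 0.00],
    [2.91, -0.35, 0.00],   # rd3 vs sw1 exceeds 2.58
]
print(flag_srcs(items, residuals))   # [('rd3', 'sw1', 2.91)]
```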
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*ASV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
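For reference, CR and AVE can be computed directly from a factor's standardized loadings using the common Fornell-Larcker style formulas: CR = (Σλ)² / [(Σλ)² + Σ(1 − λ²)] and AVE = Σλ² / n. The sketch below uses illustrative loadings.<br />

```python
# Compute CR and AVE from a factor's standardized loadings (Fornell-Larcker
# style formulas). The four loadings below are invented for illustration.

def composite_reliability(loadings):
    sum_l = sum(loadings)
    error = sum(1 - l**2 for l in loadings)   # error variance per item
    return sum_l**2 / (sum_l**2 + error)

def ave(loadings):
    return sum(l**2 for l in loadings) / len(loadings)

loadings = [0.78, 0.81, 0.72, 0.69]
cr = composite_reliability(loadings)   # about 0.84 -> above the 0.7 cutoff
v = ave(loadings)                      # about 0.56 -> above the 0.5 cutoff
```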
<br />
If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor), than by its own observed variables.<br />
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that “AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error.” (Malhotra and Dash, 2011, p. 702).<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to bias in your dataset that is attributable to something external to the measures: something other than the question itself may have influenced the responses given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias you can run a few different tests, each described below. For a step by step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to be just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
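The single-factor share of variance can be approximated from the item correlation matrix: the proportion explained by the first unrotated factor is roughly the largest eigenvalue divided by the number of items. The Python sketch below uses power iteration so no linear-algebra library is needed; the 3x3 correlation matrix is invented for illustration.<br />

```python
# Approximate Harman's check: share of total variance captured by the first
# principal component = largest eigenvalue of the item correlation matrix
# divided by the number of items. The correlation matrix is made up.

def largest_eigenvalue(matrix, iters=200):
    n = len(matrix)
    v = [1.0] * n
    est = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        est = max(abs(x) for x in w)   # converges to the top eigenvalue
        v = [x / est for x in w]
    return est

corr = [
    [1.00, 0.62, 0.58],
    [0.62, 1.00, 0.55],
    [0.58, 0.55, 1.00],
]
share = largest_eigenvalue(corr) / len(corr)   # about 0.72 -> one factor dominates
```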
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200) then you will want to retain the CLF as you either impute composites from factor scores, or as you move into the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
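The 0.200 comparison can be scripted once you have the two sets of standardized loadings. In the sketch below the item names and loadings are invented; only q2 drops by more than 0.200, which would argue for retaining the CLF.<br />

```python
# Compare standardized loadings from the CFA with and without the common
# latent factor, flagging items whose loading changes by more than the
# 0.200 rule of thumb. Items and loadings here are illustrative.

def clf_deltas(without_clf, with_clf, cutoff=0.200):
    return {item: round(without_clf[item] - with_clf[item], 3)
            for item in without_clf
            if abs(without_clf[item] - with_clf[item]) > cutoff}

without_clf = {"q1": 0.81, "q2": 0.77, "q3": 0.74}
with_clf    = {"q1": 0.79, "q2": 0.52, "q3": 0.70}
print(clf_deltas(without_clf, with_clf))   # {'q2': 0.25} -> retain the CLF
```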
<br />
=== Marker Variable ===<br />
This method is simply an extended and more accurate way to do the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF broke your model; if it did not, then wherever the instructions say to connect the SB constructs to all observed variables, connect the CLF to all observed variables instead.<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
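The zero-constrained and equal-constrained tests above both reduce to a chi-square difference test between nested models. The sketch below computes the p-value with a pure-Python series expansion of the regularized lower incomplete gamma function (so SciPy is not required); the chi-square and df values are made up for illustration.<br />

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with df degrees of freedom,
    via a series expansion of the regularized lower incomplete gamma."""
    a, x = df / 2.0, x / 2.0
    if x <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-12 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    p_lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - p_lower)

def chi2_difference_test(chisq_unconstrained, df_unconstrained,
                         chisq_constrained, df_constrained):
    """Return (delta chi-square, delta df, p-value) for nested models."""
    d_chisq = chisq_constrained - chisq_unconstrained
    d_df = df_constrained - df_unconstrained
    return d_chisq, d_df, chi2_sf(d_chisq, d_df)

# Invented example: zero-constraining the SB/CLF paths added 14 df and raised
# chi-square by 31.2; p here is well below .05, so the bias is not zero.
dchi, ddf, p = chi2_difference_test(412.3, 224, 443.5, 238)
```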
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=6j4_ZrkCxTc '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups, otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset).<br />
<br />
An even simpler and less time-consuming approach to metric invariance is to conduct a multigroup moderation test using critical ratios for differences in AMOS. Below is a video to explain how to do this. The video is about a lot of things in the CFA, but the link below will start you at the time point for testing metric invariance with critical ratios.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''Metric Invariance''']<br />
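The critical-ratios approach amounts to a pairwise z-test on each loading across groups: z = (b1 − b2) / √(se1² + se2²), with |z| > 1.96 suggesting non-invariance at the .05 level. The sketch below illustrates this with invented estimates and standard errors (echoing the item2 example in the Contingency Plans section).<br />

```python
import math

def critical_ratio(b1, se1, b2, se2):
    """z-statistic for the difference between two independent estimates."""
    return (b1 - b2) / math.sqrt(se1**2 + se2**2)

# (estimate, standard error) per group: {item: ((b_group1, se), (b_group2, se))}
loadings = {
    "item2": ((0.34, 0.09), (0.88, 0.10)),   # large cross-group difference
    "item3": ((0.71, 0.08), (0.69, 0.08)),   # essentially equal
}
noninvariant = [item for item, ((b1, s1), (b2, s2)) in loadings.items()
                if abs(critical_ratio(b1, s1, b2, s2)) > 1.96]
print(noninvariant)   # ['item2']
```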
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001)<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work then, move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Confirmatory_Factor_Analysis&diff=1295792Confirmatory Factor Analysis2018-04-24T23:23:53Z<p>Jgaskin: /* Zero and Equal Constraints */</p>
<hr />
<div>Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we ''explore'' the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we ''confirm'' the factor structure we extracted in the EFA. <br />
*[[File:books.jpg]] '''''LESSON:''''' [http://www.kolobkreations.com/CFA.pptx '''Confirmatory Factor Analysis'''] <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''CFA part 1''']<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/hqV9RUSwwdA '''CFA part 2''']<br />
----<br />
[[File:citation.png]]''Do you know of some citations that could be used to support the topics and procedures discussed in this section? Please [mailto:james.eric.gaskin@gmail.com email them to me] with the name of the section, procedure, or subsection that they support. '''Thanks!'''''<br />
----<br />
== Model Fit ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=JkZGWUUjdLg '''Handling Model Fit''']<br />
Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA. <br />
<br />
=== Metrics ===<br />
There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).<br />
<br />
[[File:GOFMetrics1.png]]<br />
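<br />
Once you have the fit statistics from the AMOS output, screening them against the thresholds can be done mechanically. Below is a minimal Python sketch; the cutoffs are the commonly cited values from Hu and Bentler (1999) (CFI and TLI &ge; .95, RMSEA &le; .06, SRMR &le; .08) rather than a reproduction of the table above, and the example fit values are hypothetical.<br />
<br />
```python
# Sketch: screen AMOS fit statistics against commonly cited
# Hu and Bentler (1999) cutoffs. Cutoff values are assumptions drawn
# from that paper; the example fit values below are hypothetical.
HU_BENTLER = {
    "CFI":   (">=", 0.95),
    "TLI":   (">=", 0.95),
    "RMSEA": ("<=", 0.06),
    "SRMR":  ("<=", 0.08),
}

def check_fit(indices):
    """Return {index: True/False} for each reported fit index."""
    results = {}
    for name, value in indices.items():
        op, cutoff = HU_BENTLER[name]
        results[name] = value >= cutoff if op == ">=" else value <= cutoff
    return results

fit = check_fit({"CFI": 0.961, "TLI": 0.948, "RMSEA": 0.055, "SRMR": 0.041})
# fit -> {'CFI': True, 'TLI': False, 'RMSEA': True, 'SRMR': True}
```
<br />
Remember that these are guidelines, not hard rules; as noted above, fit degrades with sample size and model complexity, so borderline values should be interpreted in context.<br />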
<br />
=== Modification indices ===<br />
Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this guideline - however, there are exceptions. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: [http://davidakenny.net/cm/respec.htm David's website]<br />
<br />
[[File:modelfit.png]]<br />
<br />
=== Standardized Residual Covariances ===<br />
Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant ''standardized'' residual covariance is one with an absolute value greater than 2.58. Significant residual covariances substantially decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices; the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.<br />
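<br />
As an illustration, flagging significant SRCs from a residuals matrix copied out of AMOS can be sketched as follows; the variable names and values here are hypothetical.<br />
<br />
```python
# Sketch: flag significant standardized residual covariances
# (|SRC| > 2.58) in a residuals matrix copied out of AMOS.
# Variable names and values are hypothetical.
def flag_srcs(names, matrix, cutoff=2.58):
    flags = []
    for i in range(len(names)):
        for j in range(i):           # lower triangle only
            if abs(matrix[i][j]) > cutoff:
                flags.append((names[i], names[j], matrix[i][j]))
    return flags

names = ["sw1", "rd3", "q5"]
resid = [[0.00,  0.00, 0.00],
         [3.12,  0.00, 0.00],
         [1.40, -2.90, 0.00]]
flag_srcs(names, resid)   # [('rd3', 'sw1', 3.12), ('q5', 'rd3', -2.9)]
```
<br />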
<br />
== Validity and Reliability ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=yk6DVC7Wg7g '''Testing Validity and Reliability in a CFA''']<br />
<br />
It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:<br />
<br />
'''Reliability'''<br />
*CR > 0.7<br />
<br />
'''Convergent Validity'''<br />
*AVE > 0.5<br />
<br />
'''Discriminant Validity'''<br />
*MSV < AVE<br />
*Square root of AVE greater than inter-construct correlations<br />
<br />
If you have convergent validity issues, your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, your variables correlate more highly with variables outside their parent factor than with those within it; i.e., the latent factor is better explained by variables from a different factor than by its own observed variables.<br />
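<br />
For illustration, CR and AVE can be computed from the standardized loadings with the common formulas CR = (&Sigma;&lambda;)&sup2; / ((&Sigma;&lambda;)&sup2; + &Sigma;(1 &minus; &lambda;&sup2;)) and AVE = &Sigma;&lambda;&sup2; / n. This is only a sketch with hypothetical loadings; in practice the Stats Tools Package computes these values from the AMOS output for you.<br />
<br />
```python
# Sketch: CR and AVE from standardized factor loadings, using the
# common formulas (the Stats Tools Package works from the AMOS output
# in a similar way). The loadings below are hypothetical.
def cr_ave(loadings):
    s = sum(loadings)
    sq = [l * l for l in loadings]
    cr = s * s / (s * s + sum(1 - q for q in sq))
    ave = sum(sq) / len(sq)
    return cr, ave

cr, ave = cr_ave([0.82, 0.77, 0.85, 0.70])
# cr  -> ~0.87 (> 0.7: reliable)
# ave -> ~0.62 (> 0.5: convergent validity)
```
<br />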
<br />
If you need to cite these suggested thresholds, please use the following:<br />
<br />
*Hair, J., Black, W., Babin, B., and Anderson, R. (2010). ''Multivariate data analysis'' (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.<br />
<br />
AVE is a strict measure of convergent validity. Malhotra and Dash (2011, p. 702) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error."<br />
<br />
*Malhotra N. K., Dash S. (2011). Marketing Research an Applied Orientation. London: Pearson Publishing.<br />
<br />
Here is an updated video that uses the most recent Stats Tools Package, which includes a more accurate measure of AVE and CR. <br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''SEM Series (2016) 5. Confirmatory Factor Analysis Part 2''']<br />
<br />
== Common Method Bias (CMB) ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/CFBUECZgUuo?list=PLnMJlbz3sefJaVv8rBL2_G85HoUko5I-- '''Zero-constraint approach to CMB''']<br />
*'''REF:''' Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., and Podsakoff, N.P. "Common method biases in behavioral research: a critical review of the literature and recommended remedies," ''Journal of Applied Psychology'' (88:5) 2003, p 879.<br />
<br />
Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for common method bias you can run a few different tests, each described below. For a step-by-step guide, refer to the video tutorials.<br />
<br />
=== Harman’s single factor test ===<br />
*''It should be noted that the Harman's single factor test is no longer widely accepted and is considered an outdated and inferior approach.''<br />
Harman's single factor test examines whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).<br />
<br />
[[File:Harman.png]]<br />
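<br />
A rough numeric sketch of this test: compute the share of variance captured by a single unrotated factor, approximated here by the largest eigenvalue of the item correlation matrix (a principal-components proxy, not the SPSS factor extraction). The data below are randomly generated, so no single factor dominates.<br />
<br />
```python
# Sketch: proportion of variance captured by a single unrotated factor,
# approximated by the first principal component of the item correlation
# matrix (largest eigenvalue / number of items). Data are hypothetical
# (randomly generated), so the single-factor share should be small.
import numpy as np

def single_factor_share(data):
    r = np.corrcoef(data, rowvar=False)      # item correlation matrix
    eigvals = np.linalg.eigvalsh(r)          # ascending order
    return eigvals[-1] / r.shape[0]          # largest eigenvalue's share

rng = np.random.default_rng(0)
items = rng.normal(size=(200, 6))            # 6 uncorrelated "items"
share = single_factor_share(items)           # well under 0.50 here
```
<br />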
<br />
=== Common Latent Factor ===<br />
This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to '''all''' observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (greater than 0.200), then you will want to retain the CLF, either as you impute composites from factor scores or as you move on to the structural model. The CLF video tutorial demonstrates how to do this.<br />
<br />
[[File:CMF.png]]<br />
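<br />
The comparison of standardized loadings with and without the CLF can be sketched as below; the item names, loadings, and 0.200 cutoff follow the guideline above, but the values themselves are hypothetical.<br />
<br />
```python
# Sketch: compare standardized loadings from the CFA with and without
# the CLF; differences above 0.200 suggest retaining the CLF.
# Item names and loadings are hypothetical.
def clf_deltas(without_clf, with_clf, cutoff=0.200):
    return {item: round(without_clf[item] - with_clf[item], 3)
            for item in without_clf
            if abs(without_clf[item] - with_clf[item]) > cutoff}

no_clf  = {"joy1": 0.84, "joy2": 0.79, "use1": 0.88}
yes_clf = {"joy1": 0.58, "joy2": 0.74, "use1": 0.86}
clf_deltas(no_clf, yes_clf)   # {'joy1': 0.26} -> retain the CLF
```
<br />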
<br />
=== Marker Variable ===<br />
This method is simply an extended, more accurate version of the common latent factor method. For this method, add another latent factor to the model (as in the figure below), ''but make sure it is something that you would not expect to correlate with the other latent factors in the model'' (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it finds the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.<br />
<br />
[[File:Marker.png]]<br />
<br />
=== Zero and Equal Constraints ===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/abzt5zTkCxk '''Specific Bias 2017''']<br />
The most current and best approach is outlined below.<br />
#Do an EFA, and make sure to include “marker” or specific bias (SB) constructs.<br />
##Specific bias constructs are just like any other multi-item constructs but measure specific sources of bias that may account for shared variance not due to a causal relationship between key variables in the study. A common one is Social Desirability Bias.<br />
#Do the CFA with SB constructs covaried to other constructs (this looks like a normal CFA).<br />
##Assess and adjust to achieve adequate goodness of fit<br />
##Assess and adjust to achieve adequate validity and reliability<br />
#Add a common latent factor (CLF), sometimes called an unmeasured method factor. <br />
##Make sure the CLF is connected to all observed items, including the items of the SB constructs. <br />
##If this breaks your model, then remove the CLF and proceed with the steps below. <br />
##If it does not break your model, then in the steps below, keep the SB constructs covaried to the other latent factors, and keep the CLF connected with regression lines to all observed factors. <br />
##The steps below assume the CLF will break the model, so some instructions that say to connect the SB to all observed variables should instead be CLF to all observed variables (if the CLF did not break your model).<br />
#Then conduct the CFA with the SB constructs shown to influence ALL indicators of other constructs in the study. Do not correlate the SB constructs with the other constructs of study. If there is more than one SB construct, they follow the same approach and can correlate with each other.<br />
##Retest validity, but be willing to accept lower thresholds. <br />
##If change in AVE is extreme (e.g., >.300) then there is too much shared variance attributable to a response variable. This means that variable is compromised and any subsequent analysis with it may be biased.<br />
##If the majority of factors have extreme changes to their AVE, you might consider rethinking your data collection instrument and how to reduce specific response biases.<br />
#If the validities are still sufficient, then conduct the zero-constrained test. This test determines whether the response bias is any different from zero.<br />
##To do this, constrain all paths from the SB constructs to all indicators (but do not constrain their own) to zero. Then conduct a chi-square difference test between the constrained and unconstrained models.<br />
##If the null hypothesis cannot be rejected (i.e., the constrained and unconstrained models are the same or "invariant"), you have demonstrated that you were unable to detect any specific response bias affecting your model. You can move on to causal modeling, but make sure to retain the SB construct(s) to include as control in the causal model. See the bottom of this subsection for how to do this. <br />
##If you changed your model while testing for specific bias, you should retest validities and model fit with this final (unconstrained) measurement model, as it may have changed.<br />
#If the zero-constrained chi-square difference test resulted in a significant result (i.e., reject null, i.e., response bias is not zero), then you should run an equal-constrained test. This test determines whether the response bias is evenly distributed across factors.<br />
##To do this, constrain all paths from the SB construct to all indicators (not including their own) to be equal. There are multiple ways to do this. One easy way is simply to name them all the same thing (e.g., "aaa").<br />
##If the chi-square difference test between the constrained (to be equal) and unconstrained models indicates invariance (i.e., fail to reject null - that they are equal), then the bias is equally distributed. Make note of this in your report. e.g., "A test of equal specific bias demonstrated evenly distributed bias."<br />
##Move on to causal modeling with the SB constructs retained (keep them).<br />
##If the chi-square test is significant (i.e., unevenly distributed bias), which is more common, you should still retain the SB construct for subsequent causal analyses. Make note of this in your report. e.g., "A test of equal specific bias demonstrated unevenly distributed bias."<br />
<br />
*'''What to do if you have to retain the specific bias factor'''<br />
**You can do this either by imputing factor scores while the SB construct is connected to all observed variables (thereby essentially parceling out the shared bias with the SB construct), and then exclude from your causal model the SB construct that was imputed during your measurement model. Or you can disconnect the SB construct from all your observed variables, but covary it with all your latent variables, and then impute factor scores. If taking this latter approach, the SB will not be parceled out, so you will then need to include the factor score for the SB construct in the causal model as a control variable, connected to all other variables.<br />
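<br />
Both the zero-constrained and equal-constrained tests above come down to a chi-square difference test between nested models. The sketch below implements the chi-square survival function in closed form for even degrees of freedom only (for arbitrary df, use scipy.stats.chi2.sf); the model chi-square values and df are hypothetical.<br />
<br />
```python
# Sketch of the chi-square difference test behind the zero- and
# equal-constrained comparisons. The closed-form survival function
# below only handles even df (kept dependency-free for illustration);
# in practice use scipy.stats.chi2.sf for any df.
# The model chi-square values and df below are hypothetical.
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with even df."""
    assert df % 2 == 0, "closed form shown only for even df"
    k = df // 2
    term, total = 1.0, 1.0
    for i in range(1, k):            # sum_{i=0}^{k-1} (x/2)^i / i!
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

# Unconstrained: chi2 = 842.1, df = 480; zero-constrained: chi2 = 869.4, df = 492
diff, ddf = 869.4 - 842.1, 492 - 480
p = chi2_sf(diff, ddf)   # p ~ .007: reject null -> bias differs from zero
```
<br />
A significant p-value here means the constrained and unconstrained models differ, i.e., the common/specific variance is not zero, and you would proceed to the equal-constrained test as described above.<br />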
<br />
== Measurement Model Invariance ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=6j4_ZrkCxTc '''Measurement Model Invariance''']<br />
Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups). <br />
<br />
=== Configural ===<br />
Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: [[Confirmatory Factor Analysis#Model Fit|Model Fit]]). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.<br />
<br />
=== Metric ===<br />
If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups, otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset).<br />
<br />
An even simpler and less time-consuming approach to metric invariance is to conduct a multigroup moderation test using critical ratios for differences in AMOS. Below is a video to explain how to do this. The video is about a lot of things in the CFA, but the link below will start you at the time point for testing metric invariance with critical ratios.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://youtu.be/MCYmyzRZnIY '''Metric Invariance''']<br />
<br />
=== Contingency Plans ===<br />
<br />
If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.<br />
*1. '''Modification indices:''' Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males, there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When adding them to the model, it does it for both groups, even if you only needed to do it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights. <br />
*2. '''Regression weights:''' You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The cause of the lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or, an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if invariance issues are solved. If not, try addressing the second issue (explained next).<br />
*3. '''Standardized Residual Covariances:''' To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “[http://www.youtube.com/watch?v=JkZGWUUjdLg Model fit during a Confirmatory Factor Analysis (CFA) in AMOS]” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things; but since SW only has three items right now, another option is to remove rd3 or q5 instead, and if that does not work, return to this matrix after rerunning things to see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different. This may be due to a small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.<br />
<br />
== 2nd Order Factors ==<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
Handling 2nd order factors in AMOS is not difficult, but it is tricky: if you don't specify the model correctly, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors and explains how to report them.<br />
<br />
[[File:2ndOrderCFA.png]]<br />
[[File:2ndOrderSEM.png]]<br />
<br />
== Common CFA Problems ==<br />
1. CFA that reaches iteration limit.<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/B7YOv7hSohY '''Iteration limit reached in AMOS'''] <br />
2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)<br />
*The best approach is not to constrain the CLF paths to be equal in the first place. Instead, do a chi-square difference test between the unconstrained model (with CLF, and marker if available) and the same model with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.<br />
3. CFA with negative error variances<br />
*This shouldn’t happen if data screening and the EFA went well, but it still does. In such cases, it is acceptable to constrain the error variance to a small positive number (e.g., 0.001).<br />
4. CFA with negative error covariances (sometimes shows up as “not positive definite”)<br />
*In such cases, there is usually a deeper measurement issue (such as skewness or kurtosis, too much missing data, or a nominal variable). If it cannot be fixed by addressing these deeper issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. This issue usually accompanies a negative error variance, so fixing the negative error variance first will often resolve it. <br />
5. CFA with Heywood cases<br />
*This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work, then move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).<br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/Vx24KFf-rAo '''AMOS Heywood Case'''] <br />
6. CFA with discriminant validity issues<br />
*This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them: <br />
<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [http://www.youtube.com/watch?v=HBQPqj63Y7s '''Handling 2nd Order Factors''']<br />
7. CFA with “missing constraint” error<br />
*Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).</div>Jgaskinhttp://statwiki.kolobkreations.com/index.php?title=Guidelines&diff=1295534Guidelines2018-04-24T21:07:07Z<p>Jgaskin: /* Standard outline for quantitative model building/testing paper */</p>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the effected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section [[#Guidelines_on_Survey_Design]]<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development articles on your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk-aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with between five and eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read each item out loud and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends,” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do a second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, including items from existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time – although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, or poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically, studies of organizational performance and employee dispositions and intentions are straightforward and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to return abandoned, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
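For the pilot-study step above, the reliability check (Cronbach’s alpha) is straightforward to compute by hand. A minimal Python sketch; the data here are simulated stand-ins for a 30-person pilot on the hypothetical five-item Enjoyment scale:<br />

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Simulated pilot data: 30 respondents x 5 Likert items driven by one trait
rng = np.random.default_rng(0)
trait = rng.normal(3, 1, size=(30, 1))
responses = np.clip(np.rint(trait + rng.normal(0, 0.7, size=(30, 5))), 1, 5)

alpha = cronbach_alpha(responses)  # values above ~0.7 are conventionally acceptable
```

With a real pilot, `responses` would simply be the n x k matrix of answers to one construct’s items.<br />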
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually second-order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., the same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” options if possible. These get coded as 0, 6, 8, etc., but the number is meaningless. When you run statistics on the data, your software doesn’t know those numbers are invalid, so it treats them as actual data points. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions to be a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end. So they rarely capture the trait the way you intend. When I design new surveys, I nearly always rewrite reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious examples include: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. Subtler examples include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but will still result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
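The reverse-coding and attention-trap guidelines above translate directly into data cleaning once responses arrive. A sketch with pandas; the column names, sample values, and trap answer are all made up:<br />

```python
import pandas as pd

# Hypothetical responses on a 5-point scale; column names are invented
df = pd.DataFrame({
    "enjoy_1": [5, 4, 2],
    "enjoy_4_rev": [1, 2, 4],  # "Using the VR was boring" (reverse-worded)
    "trap_1": [2, 2, 5],       # attention trap: correct answer is 2 (somewhat disagree)
})

# Re-reverse a reverse-coded item so that higher always means more of the trait
df["enjoy_4"] = 6 - df["enjoy_4_rev"]

# Flag and drop respondents who failed the attention trap
df["failed_trap"] = df["trap_1"] != 2
clean = df[~df["failed_trap"]]
```

The `6 - x` re-coding assumes a 1–5 scale; use `8 - x` for a 7-point scale.<br />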
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run (does everything below)'''], but I've also added below a few more relevant and updated links.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin]<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
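Several of the screening steps above are simple to script before moving into EFA. The sketch below uses simulated data and an invented straight-lining cutoff (justify your own thresholds): it flags unengaged responders by their per-row response variance, and computes per-item skewness and kurtosis for variable screening:<br />

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
likert = rng.integers(1, 6, size=(200, 12)).astype(float)  # 200 cases x 12 items
likert[0] = 3.0  # one simulated unengaged responder (all 3s)

# Case screening: near-zero variance across items suggests straight-lining
row_sd = likert.std(axis=1, ddof=1)
unengaged = row_sd < 0.5

# Variable screening: skewness and (excess) kurtosis per item
skewness = stats.skew(likert, axis=0)
kurtosis = stats.kurtosis(likert, axis=0)
```

Flagged cases should be inspected (and usually dropped) before the EFA, and heavily skewed items reconsidered.<br />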
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project") (1-2 paragraphs)<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don’t forget to discuss response rate (number of responses as a percentage of the number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach’s alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported or counter-evidence (significant in opposite direction) hypotheses.<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
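When reporting CR and AVE as suggested in the Analysis section above, the standard formulas can be computed directly from standardized factor loadings. A sketch with made-up loadings for a single hypothetical construct:<br />

```python
import numpy as np

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    lam = np.asarray(loadings, dtype=float)
    error_var = 1 - lam ** 2  # error variances, assuming standardized loadings
    return lam.sum() ** 2 / (lam.sum() ** 2 + error_var.sum())

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return (lam ** 2).mean()

lam = [0.82, 0.79, 0.85, 0.74]  # hypothetical standardized loadings
cr = composite_reliability(lam)        # ~0.88; conventional cutoff is > 0.70
ave = average_variance_extracted(lam)  # ~0.64; conventional cutoff is > 0.50
```

Run this once per construct and report the results alongside the correlation matrix.<br />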
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>
<hr />
<div>On this wiki page I share my thoughts on various academic topics, including my 10 Steps to building a good quantitative '''variance model''' that can be addressed using a well-designed '''survey''', as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.<br />
----<br />
==Example Analysis - needs to be updated...==<br />
I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.<br />
*[http://www.kolobkreations.com/QuantExampleAnalysis.pdf Click here to access the example analysis.]<br />
<br />
==How to start any (EVERY) Research Project==<br />
In a page or less, using only bullet points, answer these questions (or fill out this outline). Then share it with a trusted advisor (not me unless I am actually your advisor) to get early feedback. This way you don't waste your time on a bad or half-baked idea. You might also consider reviewing the editorial by Arun Rai at MISQ called: "Avoiding Type III Errors: Formulating Research Problems that Matter." This is written for the information systems field, but is generalizable to all fields.<br />
#What is the problem you are seeking to address? (If there is no problem, then there is usually no research required.)<br />
#Why is this an important (not just interesting) contemporary or upcoming problem? (i.e., old problems don't need to be readdressed if they are not still a problem)<br />
#Who else has addressed this problem? (Very rarely is the answer to this: "nobody". Be creative. Someone has studied something related to this problem, even if it isn't the exact same problem. This requires a lit review.)<br />
#In what way are the prior efforts of others incomplete? (i.e., if others have already addressed the problem, what is left to study - what are the "gaps"?)<br />
#How will you go about filling these gaps in prior research? (i.e., study design)<br />
##Why is this an appropriate approach?<br />
#(If applicable) Who is your target population for studying this problem? (Where are you going to get your data?)<br />
##How are you going to get the data you want? (quantity and quality)<br />
<br />
If you would like to use the answers to the above questions as the substance of your introduction section, just add these two points:<br />
#Overall, what did your endeavors discover?<br />
#How is your paper organized to effectively communicate the arguments and contributions you are trying to make?<br />
<br />
== Developing Your Quantitative Model ==<br />
===Ten Steps for Formulating a Decent Quantitative Model===<br />
#Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the effected thing(s) in your research questions. <br />
#Figure out why explaining and predicting these DVs is important.<br />
##Why should we care? <br />
##For whom will it make a difference? <br />
##What can we possibly contribute to knowledge that is not already known?<br />
##If these are all answerable and suggest continuing the study, then go to #3, otherwise, go to #1 and try different DVs. <br />
#Form one or two research questions around explaining and predicting these DVs. <br />
##Scoping your research questions may also require you to identify your population. <br />
#Is there some existing theory that would help explore these research questions? <br />
##If so, then how can we adopt it for specifically exploring these research questions?<br />
##Does that theory also suggest other variables we are not considering?<br />
#What do you think (and what has research said) impacts the DVs we have chosen? <br />
##These become IVs.<br />
#What is it about these IVs that is causing the effect on the DVs?<br />
##These become Mediators.<br />
#Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?<br />
##These become Moderators<br />
#What variables could potentially explain and predict the DVs, but are not directly related to our interests? <br />
##These become control variables. These are often some of those moderators like age and gender, or variables in extant literature. <br />
#Identify your population.<br />
##Do you have access to this population?<br />
##Why is this population appropriate to sample in order to answer the research questions?<br />
#Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.<br />
##If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?<br />
<br />
== From Model Development to Model Testing ==<br />
[[File:YouTube.png]] [http://youtu.be/3uBx-6-62Y0?hd=1 '''''Video explanation of this section''''']<br />
===Critical tasks that happen between model development and model testing===<br />
#'''Develop a decent quantitative model'''<br />
##see previous section<br />
#'''Find existing scales and develop your own if necessary'''<br />
##You need to find ways to measure the constructs you want to include in your model. Usually, this is done through reflective latent measures on a Likert scale. It is conventional and encouraged to leverage existing scales that have already been either proposed or, better yet, validated in extant literature. If you can’t find existing scales that match your construct, then you might need to develop your own. For guidelines on how to design your survey, please see the next section: [[#Guidelines_on_Survey_Design]].<br />
##'''Find existing scales'''<br />
###I’ve made a [[File:YouTube.png]] [https://youtu.be/dro9VE6oQOg '''''VIDEO TUTORIAL'''''] about finding existing scales. The easy way is to go to http://inn.theorizeit.org/ and search their database. You can also search Google Scholar for scale development of your construct. Make sure to note the source for the items, as you will need to report this in your manuscript.<br />
###Once you’ve found the measures you need, you’ll most likely need to adapt them to your context. For example, let’s say you’re studying the construct of Enjoyment in the context of Virtual Reality. If the existing scale was “I enjoy using the website”, you’ll want to change that to “I enjoyed the Virtual Reality experience” (or something like that). The key consideration is to retain the “spirit” or intent of the item and construct. If you do adapt the measures, be sure to report your adaptations in the appendix of any paper that uses these adapted measures. <br />
###Along this idea of adapting, you can also trim the scale as needed. Many established scales are far too large, consisting of more than 10 items. A reflective construct never requires more than 4 or 5 items. Simply pick the 4-5 items that best capture the construct of interest. If the scale is multidimensional, it is likely formative. In this case, you can either:<br />
####Keep the entire scale (this can greatly inflate your survey, but it allows you to use a latent structure)<br />
####Keep only one dimension (just pick the one that best reflects the construct you are interested in)<br />
####Keep one item from each dimension (this allows you to create an aggregate score; i.e., sum, average, or weighted average)<br />
##'''Develop new scales'''<br />
###Developing new scales is a bit trickier, but is perhaps less daunting than many make it out to be. The first thing you ''must'' do before developing your own scales is to precisely define your construct. You cannot develop new measures for a construct if you do not know precisely what it is you are hoping to measure. <br />
###Once you have defined your construct, I strongly recommend developing reflective scales where applicable. These are far easier to handle statistically, and are more amenable to conventional SEM approaches. Formative measures can also be used, but they involve several caveats and considerations during the data analysis stage. <br />
####For '''reflective''' measures, simply create 5 interchangeable statements that can be measured on a 5-point Likert scale of agreement, frequency, or intensity. We develop 5 items so that we have some flexibility in dropping 1 or 2 during the EFA if needed. If the measures are truly reflective, using more than 5 items would be unnecessarily redundant. If we were to create a scale for Enjoyment (defined in our study as the extent to which a user receives joy from interacting with the VR), we might have the following items that the user can answer from strongly disagree to strongly agree:<br />
#####I enjoyed using the VR<br />
#####Interacting with the VR was fun<br />
#####I was happy while using the VR<br />
#####Using the VR was boring (reverse-coded)<br />
#####Using the VR was pleasurable<br />
#'''If developing your own scales, do pretesting (talk aloud, Q-sort)'''<br />
##To ensure the newly developed scales make sense to others and will hopefully measure the construct you think they should measure, you need to do some pretesting. Two very common pretesting exercises are ‘talk-aloud’ and ‘Q-sort’. <br />
###'''Talk-aloud''' exercises involve sitting down with between five and eight individuals who are within, or close to, your target population. For example, if you plan on surveying nurses, then you should do talk-alouds with nurses. If you are surveying a more difficult-to-access population, such as CEOs, you can probably get away with doing talk-alouds with upper-level management instead. The purpose of the talk-aloud is to see if the newly developed items make sense to others. Invite the participant (just one participant at a time) to read out loud each item and respond to it. If they struggle to read it, then it is worded poorly. If they have to think very long about how to answer, then it needs to be more direct. If they are unsure how to answer, then it needs to be clarified. If they say “well, it depends” then it needs to be simplified or made more contextually specific. You get the idea. After the first talk-aloud, revise your items accordingly, and then do the second talk-aloud. Repeat until you stop getting meaningful corrections. <br />
###'''Q-sort''' is an exercise where the participant (ideally from the target population, but not strictly required) has a card (physical or digital) for each item in your survey, even existing scales. They then sort these cards into piles based on what construct they think the item is measuring. To do this, you’ll need to let them know your constructs and the construct definitions. This should be done for formative and reflective constructs, but not for non-latent constructs (e.g., gender, industry, education). Here is a video I’ve made for Q-sorting: [[File:YouTube.png]] [https://youtu.be/ZEqPJoKxo2w '''''Q-sorting in Qualtrics''''']. You should have at least 8 people participate in the Q-sort. If you arrive at consensus (>70% agreement between participants) after the first Q-sort, then move on. If not, identify the items that did not achieve adequate consensus, and then try to reword them to be more conceptually distinct from the construct they mis-loaded on while being more conceptually similar to the construct they should have loaded on. Repeat the Q-sort (with different participants) until you arrive at adequate consensus.<br />
#'''Identify target sample and, if necessary, get approval to contact'''<br />
##Before you can submit your study for IRB approval, you must identify who you will be collecting data from. Obtain approval and confirmation from whoever has stewardship over that population. For example, if you plan to collect data from employees at your current or former organization, you should obtain approval from the proper manager over the group you plan to solicit. If you are going to collect data from students, get approval from their professor(s). <br />
#'''Conduct a Pilot Study'''<br />
##It is exceptionally helpful to conduct a pilot study if time and target population permit. A pilot study is a smaller data collection effort (between 30 and 100 participants) used to obtain reliability scores (like Cronbach’s alpha) for your reflective latent factors, and to confirm the direction of relationships, as well as to do preliminary manipulation checks (where applicable). Usually the sample size of a pilot study will not allow you to test the full model (either measurement or structural) altogether, but it can give you sufficient power to test pieces at a time. For example, you could do an EFA with 20 items at a time, or you could run simple linear regressions between an IV and a DV. <br />
##Often time and target population do not make a pilot study feasible. For example, you would never want to cannibalize your target population if that population is difficult to access and you are concerned about final sample size. Surgeons, for example, are a hard population to access. Doing a pilot study of surgeons will cannibalize your final sample size. Instead, you could do a pilot study of nurses, or possibly resident surgeons. Deadlines are also real, and pilot studies take time, although they may save you time in the end. If the results of the pilot study reveal poor Cronbach’s alphas, poor loadings, or significant cross-loadings, you should revise your items accordingly. Poor Cronbach’s alphas and poor loadings indicate too much conceptual inconsistency between the items within a construct. Significant cross-loadings indicate too much conceptual overlap between items across separate constructs. <br />
#'''Get IRB approval'''<br />
##Once you’ve identified your population and obtained confirmation that you’ll be able to collect data from them, you are now ready to submit your study for approval to your local IRB. You cannot publish any work that includes data collected prior to obtaining IRB approval. This means that if you did a pilot study before obtaining approval, you cannot use that data in the final sample (although you can still say that you did a pilot study). IRB approval can take between 3 days and 6 weeks (or more), depending on the nature of your study and the population you intend to target. Typically, studies of organizations regarding performance and employee dispositions and intentions are simple and do not get held up in IRB review. Studies that involve any form of deception or risk (physical, psychological, or financial) to participants require extra consideration and may require oral defense in front of the IRB. <br />
#'''Collect Data'''<br />
##You’ve made it! Time to collect your data. This could take anywhere between three days and three months, depending on many factors. Be prepared to send reminders. Incentives won’t hurt either. Also be prepared to only obtain a fraction of the responses you expected. For example, if you are targeting an email list of 10,000 brand managers, expect half of the emails to bounce, three quarters of the remainder to go unread, and then 90% of the remainder to go ignored. That leaves us with only 125 responses, 20% of which may be unusable, thus leaving us with only 100 usable responses from our original 10,000. <br />
#'''Test your model'''<br />
##see next section<br />
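The Cronbach’s alpha mentioned in the pilot study step is straightforward to compute yourself. Below is a minimal Python sketch, assuming your pilot responses sit in a pandas DataFrame; the data and column names (enjoy1–enjoy3) are hypothetical, echoing the Enjoyment example above.<br />

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of items measuring one reflective construct."""
    items = items.dropna()
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical pilot data: three Enjoyment items on a 5-point Likert scale
pilot = pd.DataFrame({
    "enjoy1": [5, 4, 4, 5, 3, 4],
    "enjoy2": [5, 4, 5, 5, 3, 4],
    "enjoy3": [4, 4, 4, 5, 2, 4],
})
print(round(cronbach_alpha(pilot), 2))  # 0.94, well above the usual 0.70 threshold
```

An alpha below roughly 0.70 on pilot data would suggest rewording or dropping items before the full data collection.<br />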
<br />
==Guidelines on Survey Design==<br />
#Make sure you are using formative or reflective measures intentionally (i.e., know which ones are which and be consistent). If you are planning on using AMOS, make sure all measures are reflective, or be willing to create calculated scores out of your formative measures.<br />
#If reflective measures are used, make sure they are truly reflective (i.e., that all items must move together). <br />
#If any formative measures are used, make sure that they are not actually 2nd order factors with multiple dimensions. If they are, then make sure there is sufficient and equal representation from each dimension (i.e., same number of items per dimension).<br />
#Make sure you are using the proper scale for each measure. Many scholars will mistakenly use a 5-point Likert scale of agreement (1=strongly disagree, 5=strongly agree) for everything, even when it is not appropriate. For example, if the item is “I have received feedback from my direct supervisor”, a scale of agreement makes no sense. It is a yes/no question. You could perhaps change it to a scale of frequency: 1=never, 5=daily, but a scale of agreement is not correct. <br />
#Along these same lines, make sure your measures are not yes/no, true/false, etc. if they are intended to belong to reflective constructs. <br />
#Make sure scales go from left to right, low to high, negative to positive, absence to presence, and so on. This is so that when you start using statistics on the data, an increase in the value of the response represents an increase in the trait measured. <br />
#Use exact numbers wherever possible, rather than buckets. This allows you much more flexibility to later create buckets of even size if you want to. This also gives you richer data.<br />
#However, make sure to restrict what types of responses can be given for numbers. For example, instead of asking someone’s age with a text box entry, use a slider. This prevents them from giving responses like: “twenty seven”, “twenty-seven”, “twentisven”, “227”, and “none of your business”. <br />
#Avoid including “N/A” and “other” if possible. These get coded as either 0 or 6 or 8, etc., but the number is completely invalid. However, when you’re doing statistics on it, your statistics software doesn’t know that those numbers are invalid, so it uses them as actual data points. <br />
#Despite literature stating the contrary, I’ve found reverse-coded questions to be a perpetual nightmare. They nearly always fail in the factor analysis because some cultures are drawn to the positive end of the scale, while others are drawn to the negative end of the scale. So they rarely actually capture the trait the way you intend. When I design new surveys, I nearly always re-reverse any reverse-coded questions so that they run in the same direction as the regular items. <br />
#Measure only one thing with each item. Don’t ask about two things at once. For example, don’t include items like this: “I prefer face to face communication and don’t like talking via web conferencing.” This asks about two separate things. What if they like both?<br />
#Don’t make assumptions with your measures. For example, this item assumes everyone loses their temper: “When I lose my temper, it is difficult to think long term.”<br />
#Make sure your items are applicable to everyone within your sampled population. For example, don’t include items like this: “My children are a handful.” What if this respondent doesn’t have children? How should they respond?<br />
#Be careful including sensitive questions, or questions that have a socially desirable way to respond. Obvious ones might be like: “I occasionally steal from the office” or “I don’t report all my assets on my tax forms”. Regardless of the actual truth, respondents will enter the more favorable response. Subtler examples might include: “I consider myself a critical thinker” or “sometimes I lose self-control”. These are less obvious, but still will result in biased responses because everyone thinks they are critical thinkers and no one wants to admit that they have anything less than full control over their emotions and self. <br />
#Include an occasional attention trap so that you can catch those who are responding without thinking. Such items should be mixed in with the regular items and should not stand out. For example, if a set of regular items all start with “My project team often…” then make sure to word your attention trap the same way. For example, “My project team often, never mind, please respond with somewhat disagree”.<br />
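Two of the guidelines above (re-reversing reverse-coded items and screening out attention-trap failures) can be illustrated with a short pandas sketch. The column names and responses here are made up for illustration.<br />

```python
import pandas as pd

# Hypothetical responses on a 5-point scale; "boring_r" is a reverse-coded
# item and "trap" is an attention trap ("please respond with somewhat
# disagree", i.e., the correct answer is 2).
df = pd.DataFrame({
    "enjoy1":   [5, 4, 1, 5],
    "boring_r": [1, 2, 5, 1],  # reverse-coded: high = boring
    "trap":     [2, 2, 4, 2],
})

# Re-reverse the reverse-coded item so a higher score always means more
# of the trait (on a 5-point scale, 6 - x flips the direction)
df["boring_r"] = 6 - df["boring_r"]

# Drop respondents who failed the attention trap, then drop the trap column
clean = df[df["trap"] == 2].drop(columns="trap")
print(len(clean))  # 3 — the third respondent failed the trap
```

The same filter logic extends to multiple traps: flag a case if it fails any of them, and screen those cases out before the EFA.<br />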
<br />
==Order of Operations for Testing your Model==<br />
===Some general guidelines for the order to conduct each procedure===<br />
*[[File:YouTube.png]]'''''VIDEO TUTORIAL:''''' [https://youtu.be/-j3LkADfgWs '''SEM Speed Run (does everything below)'''], but I've also added below a few more relevant and updated links.<br />
# Develop a good theoretical model<br />
## See the Ten Steps above<br />
## Develop hypotheses to represent your model: [[Structural_Equation_Modeling#Hypotheses]]<br />
# Case Screening<br />
## Missing data in rows<br />
## Unengaged responses<br />
## Outliers (on continuous variables)<br />
# Variable Screening<br />
## Missing data in columns<br />
## Skewness & Kurtosis<br />
# Exploratory Factor Analysis: [https://youtu.be/oeoTpXiSncc Messy EFA]<br />
## Iterate until you arrive at a clean pattern matrix<br />
## Adequacy<br />
## Convergent validity<br />
## Discriminant validity<br />
## Reliability: [https://youtu.be/xVl6Fg2A9GA Improving Reliability]<br />
# Confirmatory Factor Analysis<br />
## Obtain a roughly decent model quickly (cursory model fit, validity)<br />
## Do configural, metric, and scalar invariance tests (if using grouping variable in causal model)<br />
## Validity and Reliability check: [https://youtu.be/JqySgMU_qMQ Master Validity Plugin]<br />
## Response bias (aka common method bias, use specific bias variable(s) if possible): [https://youtu.be/abzt5zTkCxk Method Bias Plugin]<br />
## Final measurement model fit: [https://youtu.be/wV6UudZSBCA Model Fit Plugin]<br />
## Optionally, impute factor scores: [https://youtu.be/dsOS9tQjxW8 Imputing Factor Scores]<br />
# Structural Models<br />
## Multivariate Assumptions<br />
### Outliers and Influentials<br />
### Multicollinearity<br />
## Include control variables in all of the following analyses<br />
## Mediation<br />
### Test indirect effects using bootstrapping<br />
### If you have multiple indirect paths from same IV to same DV, use AxB estimand or: [https://youtu.be/41XgTZc66ko Specific Indirect Effects Plugin]<br />
## Interactions <br />
### Standardize constituent variables (if not already standardized)<br />
### Compute new product terms<br />
### Plot significant interactions<br />
## Multigroup Comparisons<br />
### Create multiple models<br />
### Assign them the proper group data<br />
### Test significance of moderation via chi-square difference test: [https://youtu.be/Pakg3PlppuY MGA Magic Plugin]<br />
# Report findings in a concise table<br />
## Ensure global and local tests are met<br />
## Include [https://youtu.be/XPIj__vCj5s post-hoc power analyses] for unsupported direct effects hypotheses<br />
# Write paper<br />
## See guidelines below<br />
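A few of the screening and comparison steps above lend themselves to a quick sketch. The following Python snippet, using made-up data, illustrates flagging unengaged responses (near-zero variance across items), computing per-item skewness and kurtosis, and the chi-square difference test used for multigroup moderation; the chi-square and degrees-of-freedom values are hypothetical and would come from your constrained vs. unconstrained models.<br />

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical Likert data: rows = cases, columns = items
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.integers(1, 6, size=(100, 8)),
                  columns=[f"q{i}" for i in range(1, 9)])
df.loc[0] = 3  # an unengaged respondent: the same answer everywhere

# Case screening: flag unengaged responses (near-zero variance across items)
unengaged = df.std(axis=1) < 0.25
df = df[~unengaged]

# Variable screening: skewness and excess kurtosis for each item
screen = pd.DataFrame({"skew": df.apply(stats.skew),
                       "kurtosis": df.apply(stats.kurtosis)})

# Multigroup moderation: chi-square difference test between the constrained
# and unconstrained models (illustrative values)
chi_diff, dof_diff = 12.4, 5
p = stats.chi2.sf(chi_diff, dof_diff)
```

A significant chi-square difference (p < .05) indicates the constrained and unconstrained models differ, i.e., the grouping variable moderates at least one path.<br />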
<br />
== Structuring a Quantitative Paper ==<br />
===Standard outline for quantitative model building/testing paper===<br />
*'''Title''' (something catchy and accurate)<br />
*'''Abstract''' (concise – 150-250 words – to explain paper): roughly one sentence each:<br />
**What is the problem?<br />
**Why does it matter?<br />
**How do you address the problem?<br />
**What did you find?<br />
**How does this change practice (what people in business do), and how does it change research (existing or future)?<br />
*'''Keywords''' (4-10 keywords that capture the contents of the study)<br />
*'''Introduction''' (2-4 pages)<br />
**What is the problem and why does it matter? And what have others done to try to address this problem, and why have their efforts been insufficient (i.e., what is the gap in the literature)? (1-2 paragraphs)<br />
**What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)<br />
**One sentence about sample (e.g., "377 undergraduate university students using Excel").<br />
**How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)<br />
**What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)<br />
**Who else has pursued this research question (or something related), and why were their efforts insufficient? (see section above about "how to start every research project")<br />
**Briefly discuss the primary contributions of this study in general terms without discussing exact findings (i.e., no p-values here). <br />
**How is the rest of the paper organized? (1 paragraph)<br />
*'''Literature review''' (1-3 pages)<br />
**Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information systems, or, Organizations, etc.).<br />
**If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study. <br />
**If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s) or tried to understand related research questions. <br />
**(Optionally) Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, “we are including construct xyz because the theory we are basing our model on includes xyz.” Or, “we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless”. Try to do this without repeating everything you are just going to say in the theory section anyway.<br />
**(Optionally) Briefly discuss control variables and why they are being included.<br />
*'''Theory & Hypotheses''' (take what space you need, but try to be parsimonious)<br />
**Briefly summarize your conceptual model and show it with the Hypotheses labeled (if possible). <br />
**Begin supporting H1 then state H1 formally. Support should include strong causal logic and literature. <br />
**H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc. <br />
*'''Methods''' (keep it brief; many approaches; this is just a common template)<br />
**Construct operationalization (where did you get your measures?)<br />
**Instrument development (if you created your own measures)<br />
**Explanation of study design (e.g., pretest, pilot, and online survey)<br />
**Sampling (some descriptive statistics, like demographics (education, experience, etc.), sample size; don't forget to discuss response rate (number of responses as a percentage of number of people invited to do the study)).<br />
**Mention that IRB exempt status was granted and protocols were followed if applicable.<br />
**Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group comparisons, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (hopefully bootstrapping)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?<br />
*'''Analysis''' (1-3 pages; sometimes combined with methods section)<br />
**Data Screening<br />
**EFA (report pattern matrix and Cronbach's alphas in appendix) – mention if items were dropped.<br />
**CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model. Supporting material can be placed in the Appendices if necessary.<br />
**Mention CMB approach and results and actions taken if any (e.g., if you found CMB and had to keep the CLF).<br />
**Report the correlation matrix, CR and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any. <br />
**Report whether you used the full latent SEM, or if you imputed factor scores for a path model.<br />
**Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).<br />
*'''Findings''' (1-2 pages)<br />
**Report the results for each hypothesis (supported or not, with evidence).<br />
**Point out any unsupported hypotheses and any counter-evidence (effects significant in the opposite direction).<br />
**Provide a table that concisely summarizes your findings.<br />
*'''Discussion''' (2-5 pages)<br />
**Summarize briefly the study and its intent and findings, focusing mainly on the research question(s) (one paragraph).<br />
**What insights did we gain from the study ''that we could not have gained without doing the study''?<br />
**How do these insights change the way practitioners do their work?<br />
**How do these insights shed light on existing literature and shape future research in this area?<br />
**What limitations is our study subject to (e.g., surveying students, just survey rather than experiment, statistical limitations like CMB etc.)?<br />
**What are some opportunities for future research ''based on the insights of this study''?<br />
*'''Conclusion''' (1-2 paragraphs)<br />
**Summarize the insights gained from this study and how they address existing gaps or problems.<br />
**Explain the primary contribution of the study.<br />
**Express your vision for moving forward or how you hope this work will affect the world.<br />
*'''References''' (Please use a reference manager like EndNote)<br />
*'''Appendices''' (Any additional information, like the instrument and measurement model stuff that is necessary for validating or understanding or clarifying content in the main body text.)<br />
**DO NOT pad the appendices with unnecessary statistics tables and illegible statistical models. Everything in the appendix should add value to the manuscript. If it doesn't add value, remove it.<br />
<br />
== My Thoughts on Conference Presentations ==<br />
I've presented at and attended many, many conferences. Over the years, I've seen the good, the bad, and the ugly in terms of presentation structure, content, and delivery. Here are a few of my thoughts on what to include and what to avoid. <br />
===What to include in a conference presentation===<br />
*What’s the problem and why is it important to study? <br />
**Don’t short-change this part. If the audience doesn’t understand the problem, or why it is important, they won’t follow anything else you say.<br />
*Who else has researched this and what did they miss? <br />
**Keep this short; just mention the key studies you’re building off of.<br />
*How did we fill that gap? <br />
**Theoretically and methodologically<br />
*What did we find, what does it mean, and why does it matter? <br />
**Spend most of your time here.<br />
*The end. Short and sweet. <br />
===What to avoid in a conference presentation===<br />
*Long lit review <br />
**Completely unnecessary. You don’t have time for this. Just mention the key pieces you’re building off of.<br />
*Listing all hypotheses and explaining each one <br />
**Just show a model (or some illustrative figure) and point out the most important parts.<br />
*Including big tables of statistics (for quant) or quotes (for qual)<br />
**Just include a model with indications of significance if a quantitative study.<br />
**Just include a couple key quotes (no more than one per slide) if a qualitative study.<br />
*Back story on origination of the idea<br />
**Don’t care unless it’s crazy fascinating and would make a great movie.<br />
*Travel log of methodology<br />
**Again, don’t care. We figure you did the thing right.<br />
*Statistics on model validation and measurement validation.<br />
**Again, we figure you did the thing right. We’ll read the paper if we want to check your measurement model.<br />
*Repeating yourself too much<br />
**The time is short. There is no need to be redundant.<br />
*Using more words than images<br />
**Presentations are short and so are attention spans. Use pictures with a few words only. I can’t read your slide and listen to you at the same time. I bet you’d rather I listen to you than read your slide.<br />
*Reading the entire prepared presentation... <br />
**Yes, that has happened, more than once... cringe...<br />
*Failing to take notes of feedback<br />
**Literally write down on paper the feedback you get, even if it is stupid. This is just respectful.</div>Jgaskin