Transformation to Achieve Perfect Correlation

Authors

DOI:

https://doi.org/10.64060/JASR.v1i3.4

Keywords:

Correlation coefficient, Generalized inverse, Multiple linear regressions, Canonical regression, Normal distribution

Abstract

Correlation and linear regression are common means of evaluating association and empirical relationships between two or more variables. In practice, |r_XY| often departs substantially from unity, and existing transformations that increase correlation fail to achieve perfect correlation. For bivariate data, the paper proposes transforming Y to y = G·‖x‖‖y‖, which gives r_Xy = 1, where G is the generalized inverse (G-inverse) of the matrix A = x·x^T and x, y denote vectors of deviation scores. The concept is extended to perfect linearity between a dependent variable (Y) and a set of independent variables (multiple linear regression), and between a set of dependent variables and a set of independent variables (canonical regression). These extensions avoid the problems of insignificant beta coefficients and outliers in univariate and multivariate regression models. Empirical illustrations of the G-inverse and of the extensions to multiple linear regression and canonical regression are also given. The proposed transformation is a novel method of introducing perfect correlation between two variables, and its extension to multiple linear regression and canonical regression is expected to be useful in empirical research across various branches of science. Future studies may derive the distribution of the proposed perfect correlations and compare the efficacy of the suggested approach with traditional ones using quantitative evidence.
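The abstract gives the transformation only in compressed form, so the following is a minimal sketch of one possible reading, not the paper's confirmed procedure. It assumes that G is the Moore-Penrose inverse of the rank-one matrix A = x·x^T (closed form x·x^T / ‖x‖^4) and that the transformed deviation vector is obtained by applying G to x and scaling by ‖x‖‖y‖. The data and all helper names are made up for illustration.

```python
import math

def deviations(values):
    """Return deviation scores: each value minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in v))

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matvec(P, v):
    """Matrix-vector product."""
    return [sum(p * a for p, a in zip(row, v)) for row in P]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length samples."""
    du, dv = deviations(u), deviations(v)
    return sum(a * b for a, b in zip(du, dv)) / (norm(du) * norm(dv))

# Hypothetical illustrative data (not taken from the paper).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 2.9, 7.5, 6.0, 12.3, 9.8]

x, y = deviations(X), deviations(Y)
A = outer(x, x)

# Assumption: G is the Moore-Penrose inverse of A, which for a
# rank-one matrix x x^T equals x x^T / ||x||^4.
scale = norm(x) ** 4
G = [[a / scale for a in row] for row in A]

# Check the defining g-inverse condition A G A = A.
AGA = matmul(matmul(A, G), A)
max_err = max(abs(AGA[i][j] - A[i][j])
              for i in range(len(A)) for j in range(len(A)))

# Assumed reading of the transformation: (G x) * ||x|| * ||y||.
y_new = [g * norm(x) * norm(y) for g in matvec(G, x)]

r_before = pearson(X, Y)
r_after = pearson(X, y_new)
print("max |AGA - A|:", max_err)
print("r before:", r_before)
print("r after: ", r_after)
```

Under this reading, G·x reduces to x / ‖x‖², so the transformed vector is x rescaled to the length of y, which is why its correlation with X is exactly one up to rounding. The full paper may define the transformation differently.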

Published

2025-10-24

Section

Research Article

How to Cite

Transformation to Achieve Perfect Correlation. (2025). SCOPUA Journal of Applied Statistical Research, 1(3). https://doi.org/10.64060/JASR.v1i3.4
