
heart disease data set

colnames(data) <- c(
  "age",
  "ethnicity",
  "smoker",
  "sex",      # 0 = female, 1 = male
  "cp",       # chest pain
              # 1 = typical angina
              # 2 = atypical angina
              # 3 = non-anginal pain
              # 4 = asymptomatic
  "trestbps", # resting blood pressure (in mm Hg)
  "chol",     # serum cholesterol in mg/dl
  "fbs",      # fasting blood sugar greater than 120 mg/dl, 1 = TRUE, 0 = FALSE
  "restecg",  # resting electrocardiographic results
              # 1 = normal
              # 2 = having ST-T wave abnormality
              # 3 = showing probable or definite left ventricular hypertrophy
  "thalach",  # maximum heart rate achieved
  "exang",    # exercise induced angina, 1 = yes, 0 = no
  "oldpeak",  # ST depression induced by exercise relative to rest
  "slope",    # the slope of the peak exercise ST segment
              # 1 = upsloping
              # 2 = flat
              # 3 = downsloping
  "ca",       # number of major vessels (0-3) colored by fluoroscopy
  "thal",     # short for thallium heart scan
              # 3 = normal (no cold spots)
              # 6 = fixed defect (cold spots during rest and exercise)
              # 7 = reversible defect (cold spots only appear during exercise)
  "hd"        # (the predicted attribute) - diagnosis of heart disease
              # 0 if less than or equal to 50% diameter narrowing
              # 1 if greater than 50% diameter narrowing
)

Dear Sir/Madam,

I am trying to build the data set above, which is a challenge with the information currently available. I will be introducing some regression analysis to indicate relationships between the variables.

This would help to identify relationships between the variables involved in heart disease, and so improve understanding of those relationships.

Releasing this data set would allow such analysis to be conducted, and the relationships it indicates to be discussed and understood, in an open forum.

Mike Wilson
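
As a rough illustration of the kind of regression analysis mentioned in the request, the sketch below fits a logistic regression in R with "hd" as the outcome. It is only a sketch under stated assumptions: the requested data set is not available, so the data frame is filled with randomly generated placeholder values that have no clinical meaning, the choice of predictors is arbitrary, and the coding of "sex" and "smoker" as two-level factors is assumed for illustration.

```r
# Placeholder data only: values are randomly generated because the
# requested data set does not exist yet. They carry no clinical meaning.
set.seed(42)
n <- 200
data <- data.frame(
  age      = round(runif(n, 30, 75)),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  smoker   = factor(sample(c("no", "yes"), n, replace = TRUE)),
  trestbps = round(rnorm(n, mean = 130, sd = 15)),
  chol     = round(rnorm(n, mean = 245, sd = 45))
)

# Simulate the binary outcome from an arbitrary, made-up relationship
# so that the model below has something to fit.
log_odds <- -6 + 0.06 * data$age +
  0.8 * (data$smoker == "yes") + 0.01 * data$trestbps
data$hd <- rbinom(n, size = 1, prob = plogis(log_odds))

# hd is 0/1, so fit a logistic regression (binomial GLM).
model <- glm(hd ~ age + sex + smoker + trestbps + chol,
             data = data, family = binomial)

# Coefficients are on the log-odds scale; exponentiate for odds ratios.
print(summary(model))
odds_ratios <- exp(coef(model))
print(odds_ratios)
```

With real data, the same call would apply once the columns are named as listed above. Categorical columns such as "cp", "ethnicity" and "thal" would need to be converted with factor() before being added to the formula.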

