
        Ghostwriting for MLDS 421: Data Mining

        Date: 2024-02-21  Source: Hefei Net (hfw.cc)  Author: hfw.cc


        Individual Assignment (100 points)

        Instructions:

        • Submit the paper review as a Word or PDF file.

        • Submit code as a Python notebook (.ipynb) file along with the HTML version.

        • Write elegant code with substantial comments. If you have referred to or reused code from a website, add the links as references.

        1. Paper Review – Following the guidelines, review any one of the technical papers from Group 2. (20)

        2. Generate random multidimensional (n=1000, D >= 15) data using sklearn. (20)

        • Build a K-means function from scratch (without using sklearn) and make assumptions to simplify the code as needed.

        • Use the elbow method to find an appropriate value for k.

        • Use the silhouette plot to evaluate your clusters.

        • Re-cluster the data to see if you can improve your results.

        • Perform PCA on the original dataset and retain the most important PCs.

        • Run K-means on the PCA output and compare the results with the original run in terms of cluster quality and time taken.
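The steps in part 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function name `kmeans`, the random-point initialisation, the choice of 4 blob centers, `k = 4`, and the 90% explained-variance cutoff for PCA are all assumptions made for the example. Only K-means itself is written from scratch; data generation, PCA, and the silhouette score come from sklearn.

```python
import time
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def kmeans(X, k, n_iter=100, tol=1e-6, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = np.random.default_rng(seed)
    # Simplifying assumption: start from k distinct randomly chosen points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    # Final assignment and within-cluster sum of squares (inertia).
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    inertia = dists[np.arange(len(X)), labels].sum()
    return labels, centroids, inertia


# n = 1000 points in D = 15 dimensions (4 centers is an arbitrary choice).
X, _ = make_blobs(n_samples=1000, n_features=15, centers=4, random_state=42)

# Elbow method: inertia for each candidate k; look for the bend in this curve.
inertias = {k: kmeans(X, k)[2] for k in range(1, 9)}

k = 4  # assumed to be the elbow for this synthetic data
t0 = time.perf_counter()
labels_raw, _, _ = kmeans(X, k)
t_raw = time.perf_counter() - t0

# Keep enough principal components to explain 90% of the variance.
X_pca = PCA(n_components=0.90).fit_transform(X)
t0 = time.perf_counter()
labels_pca, _, _ = kmeans(X_pca, k)
t_pca = time.perf_counter() - t0

sil_raw = silhouette_score(X, labels_raw)
sil_pca = silhouette_score(X_pca, labels_pca)
print(f"raw: silhouette={sil_raw:.3f}, time={t_raw:.4f}s")
print(f"pca: silhouette={sil_pca:.3f}, time={t_pca:.4f}s, dims={X_pca.shape[1]}")
```

For the silhouette plot itself, `sklearn.metrics.silhouette_samples` returns the per-point values that are usually drawn as one horizontal bar group per cluster. Re-running `kmeans` with different `seed` values is one simple way to re-cluster and check whether a better local optimum exists.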

        3. Using the data from part 2, perform hyperparameter optimization for the following clustering algorithms. (20)

        • Agglomerative hierarchical clustering (number of clusters, linkage criterion)

        • Density-based clustering (DBSCAN) (eps, minPts)

        • Model-based clustering (GMM) (number of clusters)
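One possible way to organise the three searches is a small grid per algorithm, scored with a single internal metric. The sketch below is an assumption-laden example: the grids, the use of the silhouette score for the first two algorithms, and BIC for the GMM are choices made for illustration, and the data is regenerated with the same assumed `make_blobs` settings so the block runs on its own. Note that sklearn calls DBSCAN's minPts parameter `min_samples`.

```python
from itertools import product

from sklearn.cluster import AgglomerativeClustering, DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=1000, n_features=15, centers=4, random_state=42)
X = StandardScaler().fit_transform(X)

# Agglomerative clustering: number of clusters x linkage criterion.
agg_scores = {}
for n_clusters, linkage in product(range(2, 7),
                                   ["ward", "complete", "average", "single"]):
    labels = AgglomerativeClustering(n_clusters=n_clusters,
                                     linkage=linkage).fit_predict(X)
    agg_scores[(n_clusters, linkage)] = silhouette_score(X, labels)
best_agg = max(agg_scores, key=agg_scores.get)

# DBSCAN: eps x min_samples (minPts). Noise points (label -1) are excluded
# before scoring, and settings that yield fewer than 2 clusters are skipped.
db_scores = {}
for eps, min_samples in product([1.0, 2.0, 3.0, 4.0], [5, 10, 20]):
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    clustered = labels != -1
    if len(set(labels[clustered])) >= 2:
        db_scores[(eps, min_samples)] = silhouette_score(X[clustered],
                                                         labels[clustered])
best_db = max(db_scores, key=db_scores.get) if db_scores else None

# GMM: choose the number of components with the lowest BIC.
bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 7)}
best_gmm = min(bic, key=bic.get)

print("agglomerative:", best_agg)
print("dbscan:", best_db)
print("gmm components:", best_gmm)
```

Scoring DBSCAN only on non-noise points can flatter settings that discard many points as noise, so it is worth reporting the noise fraction alongside the score.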

        4. Data mining and Cluster analysis of the following dataset (40)

        https://data.cdc.gov/NCHS/NCHS-Injury-Mortality-United-States/vc9m-u7tv/about_data

        The dataset contains the number of injury deaths per year in the US from 1999 to 2016, broken down by injury intent. The records are grouped by age group, gender, race, and injury intent.

        As a data science consultant, your goal is to mine the dataset and extract meaningful insights for your clients in the health care industry. The course of action is as follows:

        • Review and understand the structure of the data.

        o Columns are year, sex, age group, race, injury mechanism, injury intent, deaths, population, age-specific rate, and statistics for the age-specific rate.

        • Data Transformation

        o For each year, group by age group, sex, or race and summarize data as needed for subsequent analysis.
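A per-year summary of this kind might look like the following pandas sketch. The column names (`year`, `sex`, `age_group`, `deaths`, `population`) are hypothetical stand-ins and the numbers are made up purely so the example runs; the real CSV's headers and values will differ.

```python
import pandas as pd

# Toy rows with hypothetical column names and made-up numbers.
df = pd.DataFrame({
    "year": [1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000],
    "sex": ["Male", "Female", "Male", "Female"] * 2,
    "age_group": ["15-24", "15-24", "25-44", "25-44"] * 2,
    "deaths": [120, 40, 200, 80, 130, 45, 210, 90],
    "population": [10000, 10000, 20000, 20000, 10000, 10000, 20000, 20000],
})

# For each year, group by sex and sum deaths and population across age groups.
by_year_sex = (
    df.groupby(["year", "sex"], as_index=False)
      .agg(deaths=("deaths", "sum"), population=("population", "sum"))
)

# Recompute the rate from the summed counts instead of averaging group rates.
by_year_sex["rate_per_100k"] = (
    by_year_sex["deaths"] / by_year_sex["population"] * 1e5
)
print(by_year_sex)
```

If the file also contains pre-aggregated rows (totals across all ages, sexes, or races), those would need to be filtered out before grouping to avoid double counting.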

        • Exploratory Data Analysis (10)

        o Create statistical summaries.

        o Create boxplots, correlation/pairwise plots.

        o Perform basic outlier analysis.
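As a small illustration of the summary and outlier steps, the sketch below applies the common 1.5 × IQR rule to a single numeric column. The values are made up for the example, and the IQR rule is only one of several reasonable choices for "basic" outlier analysis.

```python
import pandas as pd

# Made-up rates for illustration only.
rates = pd.Series([10.2, 11.0, 9.8, 10.5, 10.9, 35.0, 10.1, 9.9], name="rate")

# Statistical summary: count, mean, std, min, quartiles, max.
summary = rates.describe()

# IQR rule: flag points beyond 1.5 * IQR from the quartiles
# (the same fences a standard boxplot uses for its whiskers).
q1, q3 = rates.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = rates[(rates < lower) | (rates > upper)]

print(summary)
print("outliers:", outliers.tolist())
```

On a full DataFrame, `df.boxplot()` and `pandas.plotting.scatter_matrix(df)` give the boxplots and pairwise plots, and `df.corr(numeric_only=True)` gives the correlation matrix.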

        • Clustering (15)

        o In a few lines create a plan that describes the 3-4 questions that are suitable for cluster analysis.

        o List the clustering algorithm(s) you’d use and why (e.g., K-means, K-medians, K-modes, hierarchical methods, DBSCAN).

        o Apply the above algorithms to the filtered dataset based on your plan.

        o Report on the quality of the clusters, pros/cons, and summarize your findings.

        • Bias/Fairness Questions (15)

        Data

        o In the dataset under study, from a bias/fairness (b/f) perspective, there are 2 sensitive features: race and gender.

        o Analyze the data by a combination (2) of features (sensitive and other). Example features to include in the analysis: location (county, state), and other features you consider relevant. Though these features may not be considered sensitive, they can act as proxies for sensitive features.

        o Determine feature groupings that are relevant for your analysis and explain your choices.

        o Do you detect bias in the data?

        o Present the results visually to show salient insights with respect to bias.

        o Based on the EDA and your project objective, develop a hypothesis about where b/f issues could arise in the modeling (cluster analysis).

        Modeling

        o Based on your hypothesis, assess the fairness of your model/analysis by applying the fairness-related metrics that are available in any of the following tools: Python Fairlearn package, R Fairness/Fairmodels package, or other similar tools.

        o Explain the reasoning for the groups that you selected for the fairness metrics.

        o Compare the fairness metrics for the different groups.

        o If you developed multiple models, compare the fairness metrics for the models.

        o Comment on the results.

        o Suggest how the bias/fairness issues could be mitigated.

        o Present the results visually to show salient insights.
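One way to adapt a group-fairness metric to cluster analysis is to treat membership in a particular cluster as the "selected" outcome and compare selection rates across a sensitive feature. The sketch below computes a demographic parity difference and ratio by hand in pandas. The data, the column names, and the idea of a single "flagged" cluster are all assumptions made for illustration. Fairlearn's `fairlearn.metrics` module provides `MetricFrame` and `demographic_parity_difference` for the same kind of comparison; they are not used here so the example depends only on pandas.

```python
import pandas as pd

# Made-up cluster assignments; "cluster" 1 is the hypothetical flagged cluster.
df = pd.DataFrame({
    "sex": ["Male"] * 4 + ["Female"] * 4,
    "cluster": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Treat membership in the flagged cluster as the binary outcome.
df["in_flagged"] = (df["cluster"] == 1).astype(int)

# Selection rate per group: share of each group assigned to the flagged cluster.
selection_rate = df.groupby("sex")["in_flagged"].mean()

# Demographic parity difference (0 means parity) and ratio (1 means parity).
dp_difference = selection_rate.max() - selection_rate.min()
dp_ratio = selection_rate.min() / selection_rate.max()

print(selection_rate)
print(f"difference={dp_difference:.2f}, ratio={dp_ratio:.2f}")
```

A gap in these rates is a prompt for investigation, not proof of bias: groups can differ in the underlying rates for reasons the data may or may not capture, which is why the hypothesis from the EDA step matters.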

        Note: In the Fall Quarter you attended lectures on Bias/Fairness. Additionally, the following is a useful resource for analyzing b/f in data and modeling: Fairness & Bias Metrics