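[Artificial Intelligence] Machine Learning and Intelligent Data Processing: The PCA Dimensionality-Reduction Algorithm Applied to Handwriting Recognition [Custom Dataset]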


The PCA Dimensionality-Reduction Algorithm Applied to Handwriting Recognition [Custom Dataset]

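Use the PCA algorithm to recognize handwritten digits. Requirements:

1. Reduce the dimensionality of the handwritten-digit dataset.

2. Compare the accuracy of two models: one trained on the original features and one trained on 10 principal components. (The assignment text says "64-dimensional and 10-dimensional"; with the MNIST file used in this article each image has 784 features, so the first model is 784-dimensional here.)

3. Run 10 repetitions of 10-fold cross-validation on each model and plot the two score curves for comparison.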
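To make requirement 1 concrete, here is a minimal NumPy sketch of what PCA does: center the data, take the singular value decomposition, and project onto the leading directions. The function name `pca_reduce` and the toy data are illustrative assumptions, not part of the original article, and the sketch omits the whitening step that the article's `PCA(n_components=10, whiten=True)` call applies.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (plain PCA, no whitening)."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance carried by each component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return X_centered @ components.T, components, explained[:n_components]


# Toy data (hypothetical): 200 samples in 5 dimensions that vary along only 2 directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X_toy = latent @ mixing

X_reduced, components, explained = pca_reduce(X_toy, n_components=2)
print(X_reduced.shape)  # (200, 2)
print(explained.sum())  # close to 1.0, because the toy data has rank 2
```

On real image data the kept components explain only part of the variance, which is why the 10-dimensional model is expected to trade some accuracy for speed.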

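Experiment steps

1. Import the custom dataset

The dataset can be downloaded in advance or fetched over the network when the script runs.
Download address:

http://deeplearning.net/data/mnist/

Once downloaded, the file is stored as data/mnist/mnist.pkl.gz.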

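import gzip
import pickle
from pathlib import Path

import requests

DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"
PATH.mkdir(parents=True, exist_ok=True)
URL = "http://deeplearning.net/data/mnist/"
FILENAME = "mnist.pkl.gz"
# Download the file only if it is not already on disk
if not (PATH / FILENAME).exists():
    response = requests.get(URL + FILENAME)
    response.raise_for_status()  # fail loudly instead of saving an error page
    (PATH / FILENAME).write_bytes(response.content)
# Load the dataset. The pickle holds (train, validation, test) splits;
# the second split is used as the test set here and the third is discarded.
with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_test, y_test), _) = pickle.load(f, encoding="latin-1")
# Keep a subset so training stays fast: 5000 training and 360 test images
x_train = x_train[:5000, :]
y_train = y_train[:5000]
x_test = x_test[:360, :]
y_test = y_test[:360]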

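The remaining steps are the same as in the previous article, "[Handwriting Recognition in AI] Machine Learning and Intelligent Data Processing: The PCA Dimensionality-Reduction Algorithm Applied to Handwriting Recognition".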
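Those remaining steps (standardize, train an SVC, reduce with PCA, train a second SVC, compare accuracy) can be sketched compactly with scikit-learn pipelines. This is a sketch under stated assumptions rather than the article's own code: so that it runs offline, it uses scikit-learn's bundled `load_digits` data (64 features per image, matching the "64-dimensional" model in the requirements) in place of the MNIST file, and the 80/20 split and variable names are illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 1797 8x8 digit images flattened to 64 features.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Model 1: standardize, then an RBF-kernel SVC on all original features.
svc_full = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
# Model 2: the same, with PCA down to 10 whitened components in between.
svc_pca = make_pipeline(
    StandardScaler(), PCA(n_components=10, whiten=True), SVC(kernel="rbf")
)

acc_full = svc_full.fit(X_train, y_train).score(X_test, y_test)
acc_pca = svc_pca.fit(X_train, y_train).score(X_test, y_test)
print(f"SVC on {X.shape[1]} features: {acc_full:.3f}")
print(f"PCA(10) + SVC: {acc_pca:.3f}")
```

Wrapping the scaler and PCA in a pipeline means both are fitted on the training split only, which is the same discipline the full listing follows by calling `fit_transform` on the training data and `transform` on the test data.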

Full code

import matplotlib.pyplot as plt
from pathlib import Path
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import requests
import pickle
import gzip
DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"
PATH.mkdir(parents=True, exist_ok=True)
URL = "http://deeplearning.net/data/mnist/"
FILENAME = "mnist.pkl.gz"
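# Download the file only if it is not already on disk
if not (PATH / FILENAME).exists():
    response = requests.get(URL + FILENAME)
    response.raise_for_status()  # fail loudly instead of saving an error page
    (PATH / FILENAME).write_bytes(response.content)
# Load the dataset (the second split in the pickle is used as the test set)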
with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_test, y_test), _) = pickle.load(f, encoding="latin-1")
x_train = x_train[:5000,:]
y_train = y_train[:5000,]
x_test = x_test[:360,:]
y_test = y_test[:360,]
#################################################################
# Each image is 28 x 28 and is stored as a flattened row of length
# 784 (= 28 x 28). Standardize the features before training; the
# scaler is fitted on the training set only and reused on the test set.
ss = StandardScaler()
x_train = ss.fit_transform(x_train)
x_test = ss.transform(x_test)
svc = SVC(kernel = 'rbf')
svc.fit(x_train,y_train)
y_predict = svc.predict(x_test)
print('The Accuracy of SVC is', svc.score(x_test, y_test))
print("classification report of SVC\n",classification_report(y_test, y_predict))
samples = x_test[:100]
y_pre = y_predict[:100]
plt.figure(figsize=(12,38))
for i in range(100):
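    # Create one cell of the 10 x 10 grid
    plt.subplot(10,10,i+1)
    # Show the image in grayscale (reshape the flat row back to 28 x 28)
    plt.imshow(samples[i].reshape(28,28),cmap='gray')
    # Use the predicted label as the title
    title = str(y_pre[i])
    plt.title(title,color='red')
    # Hide the axes
    plt.axis('off')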
plt.show()
# Reduce the handwritten-digit data to 10 dimensions with PCA
pca = PCA(n_components=10,whiten=True)
pca.fit(x_train)  # PCA is unsupervised, so the labels are not needed
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)
svc = SVC(kernel = 'rbf')
svc.fit(x_train_pca,y_train)
# Compare the accuracy of the two models (784-dimensional vs. 10-dimensional)
y_pre_svc = svc.predict(x_test_pca)
print("The Accuracy of PCA_SVC is ", svc.score(x_test_pca,y_test))
print("classification report of PCA_SVC\n", classification_report(y_test, y_pre_svc))
samples = x_test[:100]
y_pre = y_pre_svc[:100]
plt.figure(figsize=(12,38))
# Show the first 100 test images with the labels predicted by PCA + SVC
for i in range(100):
    plt.subplot(10,10,i+1)
    plt.imshow(samples[i].reshape(28,28),cmap='gray')
    title = str(y_pre[i])
    plt.title(title)
    plt.axis('off')
plt.show()
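The listing above covers requirements 1 and 2 but contains no cross-validation code for requirement 3. One possible sketch uses scikit-learn's `RepeatedStratifiedKFold` with `cross_val_score`. The helper names `repeated_cv_means` and `plot_score_curves` are hypothetical, and a 600-sample slice of scikit-learn's bundled digits data stands in for MNIST so the example runs quickly and offline; to use it on MNIST, pass the raw (unscaled) arrays instead.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def repeated_cv_means(model, X, y, n_splits=10, n_repeats=10, seed=0):
    """Return one mean accuracy per repetition of stratified k-fold CV."""
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    scores = cross_val_score(model, X, y, cv=cv)  # length n_splits * n_repeats
    # Folds are generated repetition by repetition, so each row is one repetition.
    return scores.reshape(n_repeats, n_splits).mean(axis=1)


def plot_score_curves(scores_full, scores_pca):
    """Plot the per-repetition mean accuracy of both models on one axis."""
    import matplotlib.pyplot as plt  # imported here so scoring works without it

    rounds = np.arange(1, len(scores_full) + 1)
    fig, ax = plt.subplots()
    ax.plot(rounds, scores_full, marker="o", label="SVC (all features)")
    ax.plot(rounds, scores_pca, marker="s", label="PCA(10) + SVC")
    ax.set_xlabel("Cross-validation repetition")
    ax.set_ylabel("Mean accuracy")
    ax.legend()
    return fig


# Stand-in data (assumption): a 600-sample slice of scikit-learn's digits set.
X, y = load_digits(return_X_y=True)
X, y = X[:600], y[:600]

svc_full = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svc_pca = make_pipeline(
    StandardScaler(), PCA(n_components=10, whiten=True), SVC(kernel="rbf")
)

scores_full = repeated_cv_means(svc_full, X, y)
scores_pca = repeated_cv_means(svc_pca, X, y)
print("SVC mean accuracy per repetition:      ", np.round(scores_full, 3))
print("PCA + SVC mean accuracy per repetition:", np.round(scores_pca, 3))
```

Calling `plot_score_curves(scores_full, scores_pca)` and then `plt.show()` draws the comparison curve. Keeping the scaler and PCA inside each pipeline matters here: they are refitted on the training folds of every split, so no information from a held-out fold leaks into the preprocessing.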

Results:

[Image]

SVC

[Image]

PCA

[Image]


Compiled by 极客之音. Article link: https://www.bmabk.com/index.php/post/147432.html

Posted by: 飞熊
