Spark机器学习 (Machine Learning with Spark), by Nick Pentreath
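Description:
From this book you can learn to create your first Spark program in Scala, Java, and Python; set up and configure a Spark development environment on your own computer and on Amazon EC2; access public machine learning datasets and use Spark to load, process, clean, and transform data; use Spark's machine learning library to implement programs that apply a variety of well-known machine learning models; and more.
Table of contents: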
Preface
Chapter 1: Getting Up and Running with Spark
Installing and setting up Spark locally
Spark clusters
The Spark programming model
SparkContext and SparkConf
The Spark shell
Resilient Distributed Datasets
Creating RDDs
Spark operations
Caching RDDs
Broadcast variables and accumulators
The first step to a Spark program in Scala
The first step to a Spark program in Java
The first step to a Spark program in Python
Getting Spark running on Amazon EC2
Launching an EC2 Spark cluster
Summary
Chapter 2: Designing a Machine Learning System
Introducing MovieStream
Business use cases for a machine learning system
Personalization
Targeted marketing and customer segmentation
Predictive modeling and analytics
Types of machine learning models
The components of a data-driven machine learning system
Data ingestion and storage
Data cleansing and transformation
Model training and testing loop
Model deployment and integration
Model monitoring and feedback
Batch versus real time
An architecture for a machine learning system
Practical exercise
Summary
Chapter 3: Obtaining, Processing, and Preparing Data with Spark
Accessing publicly available datasets
The MovieLens 100k dataset
Exploring and visualizing your data
Exploring the user dataset
Exploring the movie dataset
Exploring the rating dataset
Processing and transforming your data
Filling in bad or missing data
Extracting useful features from your data
Numerical features
Categorical features
Derived features
Transforming timestamps into categorical features
Text features
Simple text feature extraction
Normalizing features
Using MLlib for feature normalization
Using packages for feature extraction
Summary
Chapter 4: Building a Recommendation Engine with Spark
Types of recommendation models
Content-based filtering
Collaborative filtering
Matrix factorization
Extracting the right features from your data
Extracting features from the MovieLens 100k dataset
Training the recommendation model
Training a model on the MovieLens 100k dataset
Training a model using implicit feedback data
Using the recommendation model
User recommendations
Generating movie recommendations from the MovieLens 100k dataset
Item recommendations
Generating similar movies for the MovieLens 100k dataset
Evaluating the performance of recommendation models
Mean Squared Error
Mean average precision at K
Using MLlib's built-in evaluation functions
RMSE and MSE
MAP
Summary
Chapter 5: Building a Classification Model with Spark
Types of classification models
Linear models
Logistic regression
Linear support vector machines
The naïve Bayes model
Decision trees
Extracting the right features from your data
Extracting features from the Kaggle/StumbleUpon evergreen classification dataset
Training classification models
Training a classification model on the Kaggle/StumbleUpon evergreen classification dataset
Using classification models
Generating predictions for the Kaggle/StumbleUpon evergreen classification dataset
Evaluating the performance of classification models
Accuracy and prediction error
Precision and recall
ROC curve and AUC
Improving model performance and tuning parameters
Feature standardization
Additional features
Using the correct form of data
Tuning model parameters
Linear models
Decision trees
The naïve Bayes model
Cross-validation
Summary
Chapter 6: Building a Regression Model with Spark
Types of regression models
Least squares regression
Decision trees for regression
Extracting the right features from your data
Extracting features from the bike sharing dataset
Creating feature vectors for the linear model
Creating feature vectors for the decision tree
Training and using regression models
Training a regression model on the bike sharing dataset
Evaluating the performance of regression models
Mean Squared Error and Root Mean Squared Error
Mean Absolute Error
Root Mean Squared Log Error
The R-squared coefficient
Computing performance metrics on the bike sharing dataset
Linear model
Decision tree
Improving model performance and tuning parameters
Transforming the target variable
Impact of training on log-transformed targets
Tuning model parameters
Creating training and testing sets to evaluate parameters
The impact of parameter settings for linear models
The impact of parameter settings for the decision tree
Summary
Chapter 7: Building a Clustering Model with Spark
Types of clustering models
K-means clustering
Initialization methods
Variants
Mixture models
Hierarchical clustering
Extracting the right features from your data
Extracting features from the MovieLens dataset
Extracting movie genre labels
Training the recommendation model
Normalization
Training a clustering model
Training a clustering model on the MovieLens dataset
Making predictions using a clustering model
Interpreting cluster predictions on the MovieLens dataset
Interpreting the movie clusters
Evaluating the performance of clustering models
Internal evaluation metrics
External evaluation metrics
Computing performance metrics on the MovieLens dataset
Tuning parameters for clustering models
Selecting K through cross-validation
Summary
Chapter 8: Dimensionality Reduction with Spark
Types of dimensionality reduction
Principal Components Analysis
Singular Value Decomposition
Relationship with matrix factorization
Clustering as dimensionality reduction
Extracting the right features from your data
Extracting features from the LFW dataset
Exploring the face data
Visualizing the face data
Extracting facial images as vectors
Normalization
Training a dimensionality reduction model
Running PCA on the LFW dataset
Visualizing the Eigenfaces
Interpreting the Eigenfaces
Using a dimensionality reduction model
Projecting data using PCA on the LFW dataset
The relationship between PCA and SVD
Evaluating dimensionality reduction models
Evaluating k for SVD on the LFW dataset
Summary
Chapter 9: Advanced Text Processing with Spark
What's so special about text data?
Extracting the right features from your data
Term weighting schemes
Feature hashing
Extracting the TF-IDF features from the 20 Newsgroups dataset
Exploring the 20 Newsgroups data
Applying basic tokenization
Improving our tokenization
Removing stop words
Excluding terms based on frequency
A note about stemming
Training a TF-IDF model
Analyzing the TF-IDF weightings
Using a TF-IDF model
Document similarity with the 20 Newsgroups dataset and TF-IDF features
Training a text classifier on the 20 Newsgroups dataset using TF-IDF
Evaluating the impact of text processing
Comparing raw features with processed TF-IDF features on the 20 Newsgroups dataset
Word2Vec models
Word2Vec on the 20 Newsgroups dataset
Summary
Chapter 10: Real-time Machine Learning with Spark Streaming
Online learning
Stream processing
An introduction to Spark Streaming
Input sources
Transformations
Actions
Window operators
Caching and fault tolerance with Spark Streaming
Creating a Spark Streaming application
The producer application
Creating a basic streaming application
Streaming analytics
Stateful streaming
Online learning with Spark Streaming
Streaming regression
A simple streaming regression program
Creating a streaming data producer
Creating a streaming regression model
Streaming K-means
Online model evaluation
Comparing model performance with Spark Streaming
Summary
Index
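About the author:
Nick Pentreath. If you are a Scala, Java, or Python developer with an interest in machine learning and data analysis, and you are keen to learn how to apply common machine learning techniques to large-scale applications using the Spark framework, this book is written for you. A basic understanding of Spark is helpful, but it is not required.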
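Excerpts:
In information retrieval, precision is commonly used to evaluate the quality of the results, while recall is used to evaluate their completeness.
Precision and recall are usually inversely related: high precision often corresponds to low recall, and vice versa.
Precision and recall are of limited use when measured in isolation, but they are often combined into an aggregate or averaged metric. Both also depend on the threshold chosen for the model.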
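The threshold dependence described in this excerpt can be illustrated with a minimal sketch in plain Python. This is not code from the book: the scores, labels, and function names below are hypothetical, and the F1 score is used here as just one example of an aggregate metric.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall), predicting positive when score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall (one common aggregate metric)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical model scores and true labels (1 = relevant, 0 = not relevant).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.7, 0.35):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} f1={f1(p, r):.2f}")
```

On this toy data, the higher threshold gives precision 1.0 with recall of about 0.67, while the lower threshold gives precision 0.75 with recall 1.0, which is the trade-off the excerpt describes.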
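Modern big-data scenarios include requirements such as: integration with the other components of the system, especially data collection and storage systems, analytics and reporting, and front-end applications; easy scalability and relative independence from other components ...; and, ideally, support for both batch and real-time processing.
Personalization and recommendation are very similar, but recommendation usually refers specifically to explicitly presenting certain products or content to users, whereas personalization is sometimes more implicit. An example is personalizing MovieStream's search function so that the search results change according to that user's data.
After initial preprocessing, the data needs to be transformed into a representation suitable for machine learning models. For many model types, this representation is a vector or matrix containing numerical data.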
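As a rough illustration of turning records into numerical vectors, the sketch below encodes a numeric field directly and a categorical field with one-hot (1-of-k) encoding. It is plain Python rather than Spark code, and the records, field choices, and helper names are all hypothetical.

```python
def build_index(values):
    """Map each distinct category to a column position (sorted for determinism)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def encode(record, occupation_index):
    """Turn an (age, occupation) record into [age, one-hot occupation columns...]."""
    age, occupation = record
    one_hot = [0.0] * len(occupation_index)
    one_hot[occupation_index[occupation]] = 1.0
    return [float(age)] + one_hot


# Hypothetical user records: (age, occupation).
records = [(24, "technician"), (53, "writer"), (23, "writer"), (33, "engineer")]

index = build_index(occ for _, occ in records)
matrix = [encode(r, index) for r in records]  # one numeric row per record

for row in matrix:
    print(row)
```

Each row has one column for age plus one column per distinct occupation, so the list of rows can be treated as a numeric matrix by a downstream model.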
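Other content:
About the book
Apache Spark is a newly developed distributed framework, optimized in particular for low-latency tasks and in-memory data storage. It combines speed, scalability, in-memory processing, and fault tolerance, making it one of the very few frameworks suited to parallel computing, while also being very easy to program, with a flexible, expressive, and powerful API design.
Machine Learning with Spark (reprint edition, in English) guides you through the basics of the Spark API for loading and processing data, and shows how to prepare suitable input data for a variety of machine learning models. Detailed examples and real-world cases help you learn common machine learning models, including recommender systems, classification, regression, clustering, and dimensionality reduction. You will also see advanced topics such as large-scale text processing, methods for online machine learning, and model evaluation using Spark Streaming.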